Brain-inspired learning in artificial neural networks: a review
Samuel Schmidgall1, Jascha Achterberg2,3, Thomas Miconi4, Louis Kirsch5, Rojin Ziaei6, S. Pardis Hajiseyedrazi6, and Jason Eshraghian7
1 Johns Hopkins University
2 University of Cambridge
3 Intel Labs
4 ML Collective
5 The Swiss AI Lab IDSIA
6 University of Maryland, College Park
7 University of California, Santa Cruz
Artificial neural networks (ANNs) have emerged as an essential tool in machine learning, achieving remarkable success across diverse domains, including image and speech generation, game playing, and robotics. However, there exist fundamental differences between ANNs' operating mechanisms and those of the biological brain, particularly concerning learning processes. This paper presents a comprehensive review of current brain-inspired learning representations in artificial neural networks. We investigate the integration of more biologically plausible mechanisms, such as synaptic plasticity, to enhance these networks' capabilities. Moreover, we delve into the potential advantages and challenges accompanying this approach. Ultimately, we pinpoint promising avenues for future research in this rapidly advancing field, which could bring us closer to understanding the essence of intelligence.

Correspondence: sschmi46@jhu.edu

Introduction

The dynamic interrelationship between memory and learning is a fundamental hallmark of intelligent biological systems. It empowers organisms not only to assimilate new knowledge but also to continuously refine their existing abilities, enabling them to respond adeptly to changing environmental conditions. This adaptive characteristic is relevant on various time scales, encompassing both long-term learning and rapid short-term learning via short-term plasticity mechanisms, highlighting the complexity and adaptability of biological neural systems 1-3. The development of artificial systems that draw high-level, hierarchical inspiration from the brain has been a long-standing scientific pursuit spanning several decades. While earlier attempts were met with limited success, the most recent generation of artificial intelligence (AI) algorithms has achieved significant breakthroughs in many challenging tasks. These tasks include, but are not limited to, the generation of images and text from human-provided prompts 4-7, the control of complex robotic systems 8-10, the mastery of strategy games such as Chess and Go 11, and a multimodal amalgamation of these 12.

While ANNs have made significant advancements in various fields, there are still major limitations in their ability to continuously learn and adapt like biological brains 13-15. Unlike current models of machine intelligence, animals can learn throughout their entire lifespan, which is essential for stable adaptation to changing environments. This ability, known as lifelong learning, remains a significant challenge for artificial intelligence, which primarily optimizes problems consisting of fixed labeled datasets, causing it to struggle to generalize to new tasks or to retain information across repeated learning iterations 14. Addressing this challenge is an active area of research, and the potential implications of developing AI with lifelong learning abilities could have far-reaching impacts across multiple domains.

In this paper, we offer a unique review that seeks to identify the mechanisms of the brain that have inspired current artificial intelligence algorithms. To better understand the biological processes underlying natural intelligence, the first section will explore the low-level components that shape neuromodulation, from synaptic plasticity to the role of local and global dynamics that shape neural activity. This will be related back to ANNs in the third section, where we compare and contrast ANNs with biological neural systems, providing a logical basis for why the brain has more to offer AI beyond the inheritance of current artificial models. Following that, we will delve into algorithms of artificial learning that emulate these processes to improve the capabilities of AI systems. Finally, we will discuss various applications of these AI techniques in real-world scenarios, highlighting their potential impact on fields such as robotics, lifelong learning, and neuromorphic computing. By doing so, we aim to provide a comprehensive understanding of the interplay between learning mechanisms in the biological brain and artificial intelligence, highlighting the potential benefits that can arise from this synergistic relationship. We hope our findings will encourage a new generation of brain-inspired learning algorithms.

Processes that support learning in the brain

A grand effort in neuroscience aims at identifying the underlying processes of learning in the brain. Several mechanisms have been proposed to explain the biological basis of
learning at varying levels of granularity, from the synapse to population-level activity. However, the vast majority of biologically plausible models of learning are characterized by plasticity that emerges from the interaction between local and global events 16. Below, we introduce various forms of plasticity and how these processes interact in more detail.

Synaptic plasticity. Plasticity in the brain refers to the capacity of experience to modify the function of neural circuits. The plasticity of synapses specifically refers to the modification of the strength of synaptic transmission based on activity, and is currently the most widely investigated mechanism by which the brain adapts to new information 17,18. There are two broad classes of synaptic plasticity: short- and long-term plasticity. Short-term plasticity acts on the scale of tens of milliseconds to minutes and plays an important role in short-term adaptation to sensory stimuli and short-lasting memory formation 19. Long-term plasticity acts on timescales of minutes and beyond, and is thought to be one of the primary processes underlying long-term behavioral changes and memory storage 20.

Neuromodulation. In addition to the plasticity of synapses, another important mechanism by which the brain adapts to new information is neuromodulation 3,21,22. Neuromodulation refers to the regulation of neural activity by chemical signaling molecules, often referred to as neurotransmitters or hormones. These signaling molecules can alter the excitability of neural circuits and the strength of synapses, and can have both short- and long-term effects on neural function. Different types of neuromodulation have been identified, including acetylcholine, dopamine, and serotonin, which have been linked to various functions such as attention, learning, and emotion 23. Neuromodulation has been suggested to play a role in various forms of plasticity, including short- 19 and long-term plasticity 22.

Metaplasticity. The ability of neurons to modify both their function and structure based on activity is what characterizes synaptic plasticity. These modifications, which occur at the synapse, must be precisely organized so that changes occur at the right time and by the right quantity. This regulation of plasticity is referred to as metaplasticity, or the 'plasticity of synaptic plasticity,' and plays a vital role in safeguarding the constantly changing brain from its own saturation 24-26. Essentially, metaplasticity alters the ability of synapses to generate plasticity by inducing a change in the physiological state of neurons or synapses. Metaplasticity has been proposed as a fundamental mechanism in memory stability, learning, and the regulation of neural excitability. While similar, metaplasticity can be distinguished from neuromodulation, with metaplastic and neuromodulatory events often overlapping in time during the modification of a synapse.

Neurogenesis. The process by which newly formed neurons are integrated into existing neural circuits is referred to as neurogenesis. Neurogenesis is most active during embryonic development, but is also known to occur throughout the adult lifetime, particularly in the subventricular zone of the lateral ventricles 27, the amygdala 28, and the dentate gyrus of the hippocampal formation 29. In adult mice, neurogenesis has been demonstrated to increase when animals live in enriched environments rather than in standard laboratory conditions 30. Additionally, many environmental factors such as exercise 31,32 and stress 33,34 have been demonstrated to change the rate of neurogenesis in the rodent hippocampus. Overall, while the role of neurogenesis in learning is not fully understood, it is believed to play an important role in supporting learning.
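Computationally, the interplay of local activity and global neuromodulatory signals described above is often abstracted as a "three-factor" rule: a local Hebbian eligibility trace records recent co-activity, and a global modulatory signal converts it into a lasting weight change. The following is a schematic sketch of this idea, not a model from the literature; all names and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
w = 0.1           # synaptic weight
e = 0.0           # eligibility trace: a decaying memory of recent co-activity
tau_e, eta = 10.0, 0.05

for t in range(100):
    pre, post = rng.random() < 0.2, rng.random() < 0.2  # local pre/post activity
    e += -e / tau_e + float(pre and post)  # Hebbian co-activity accumulates, then decays
    m = 1.0 if t == 50 else 0.0            # global neuromodulatory pulse (e.g. dopamine)
    w += eta * m * e                       # a lasting change occurs only when modulation arrives
```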
Spike-timing-dependent plasticity (STDP) is a form of local, activity-dependent plasticity commonly modeled as

$$\Delta w_{ij} = \begin{cases} A_+ \, e^{-\Delta t / \tau_+} & \text{if } \Delta t > 0 \\ -A_- \, e^{\Delta t / \tau_-} & \text{if } \Delta t < 0 \end{cases}$$

where ∆wij is the change in the weight between neuron i and neuron j, ∆t is the time difference between the pre- and post-synaptic spikes, A+ and A- are the amplitudes of the potentiation and depression, respectively, and τ+ and τ- are the time constants for the potentiation and depression, respectively. This rule states that the strength of the connection between the two neurons will be increased or decreased depending on the timing of their spikes relative to each other.
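This pair-based rule can be simulated directly. Below is a minimal Python sketch; the parameter values are arbitrary, and we assume the common convention ∆t = t_post - t_pre, so that pre-before-post spiking potentiates the synapse:

```python
import numpy as np

def stdp_dw(delta_t, A_plus=0.01, A_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP: weight change as a function of spike-timing difference.

    delta_t = t_post - t_pre (ms). Pre-before-post (delta_t > 0) potentiates;
    post-before-pre (delta_t < 0) depresses.
    """
    if delta_t > 0:
        return A_plus * np.exp(-delta_t / tau_plus)
    else:
        return -A_minus * np.exp(delta_t / tau_minus)

# Example: a pre-synaptic spike at t = 10 ms followed by a post-synaptic spike at t = 15 ms
w = 0.5
w += stdp_dw(15.0 - 10.0)  # potentiation, since the pre spike preceded the post spike
```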
Processes that support learning in artificial neural networks

There are two primary approaches for weight optimization in artificial neural networks: error-driven global learning and brain-inspired local learning. In the first approach, the network weights are modified by driving a global error to its minimum value. This is achieved by delegating error to each weight and synchronizing modifications between weights. In contrast, brain-inspired local learning algorithms aim to learn in a more biologically plausible manner, by modifying weights from dynamical equations using locally available information. Both optimization approaches have unique benefits and drawbacks. In the following sections we discuss the most utilized form of error-driven global learning, backpropagation, followed by in-depth discussions of brain-inspired local algorithms. It is worth mentioning that these two approaches are not mutually exclusive and are often integrated in order to complement their respective strengths 58-61.

Backpropagation. Backpropagation is a powerful error-driven global learning method which changes the weight of connections between neurons in a neural network to produce a desired target behavior 62. This is accomplished through the use of a quantitative metric (an objective function) that describes the quality of a behavior given sensory information (e.g. visual input, written text, robotic joint positions). The backpropagation algorithm consists of two phases: the forward pass and the backward pass. In the forward pass, the input is propagated through the network and the output is calculated. During the backward pass, the error between the predicted output and the "true" output is calculated, and the gradients of the loss function with respect to the weights of the network are computed by propagating the error backwards through the network. These gradients are then used to update the weights of the network using an optimization algorithm such as stochastic gradient descent. This process is repeated for many iterations until the weights converge to a set of values that minimize the loss function.

Let's take a look at a brief mathematical explanation of backpropagation. First, we define a desired loss function, which is a function of the network's outputs and the true values:

$$L(y, \hat{y}) = \frac{1}{2} \sum_i (y_i - \hat{y}_i)^2$$

where y is the true output and ŷ is the network's output. In this case we are minimizing the squared error, but we could just as well optimize any smooth and differentiable loss function. Next, we use the chain rule to calculate the gradient of the loss with respect to each weight in the network, propagating these gradients backward layer by layer.
Self-referential approaches enable a neural network to modify all of its parameters in a recursive fashion. Thus, the learner can also modify the meta-learner. This in principle allows arbitrary levels of learning, meta-learning, meta-meta-learning, and so on. Some approaches meta-learn the parameter initialization of such a system 102,104. Finding this initialization still requires a hard-wired meta-learner. In other works the network self-modifies in a way that eliminates even this meta-learner 103,105. Sometimes the learning rule to be discovered has structural search-space restrictions which simplify self-improvement, such that a gradient-based optimizer can discover itself 106 or an evolutionary algorithm can optimize itself 107. Despite their differences, both synaptic plasticity and self-referential approaches aim to achieve self-improvement and adaptation in neural networks.
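As a toy illustration of one level of this hierarchy (a hedged sketch, not any of the cited methods), a learner can update its parameter by gradient descent while a meta-learner adapts the learner's learning rate using the gradient of the post-update loss. The quadratic task and all constants below are placeholders:

```python
# Learner: w minimizes f(w) = (w - 3)^2. Meta-learner: adapts the learning rate lr.
w, lr, meta_lr = 0.0, 0.01, 0.001

for step in range(100):
    g = 2 * (w - 3)          # gradient of the inner loss at w
    w_new = w - lr * g       # inner (learner) update
    g_new = 2 * (w_new - 3)  # gradient after the update
    # Meta update: d f(w_new) / d lr = g_new * d(w_new)/d lr = -g_new * g,
    # so gradient descent on lr increases it while larger steps still reduce the loss.
    lr += meta_lr * g_new * g
    w = w_new
```

Stacking another such loop on top of meta_lr would give meta-meta-learning, which is the recursion the self-referential approaches above aim to collapse into a single self-modifying system.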
Generalization of meta-optimized learning rules. The extent to which discovered learning rules generalize to a wide range of tasks is a significant open question: in particular, when should they replace manually derived general-purpose learning rules such as backpropagation? A particular observation that poses a challenge to these methods is that when the search space is large and few restrictions are put on the learning mechanism 92,108,109, generalization becomes more difficult. However, toward amending this, in variable shared meta learning 93 flexible learning rules were parameterized by parameter-shared recurrent neural networks that locally exchange information to implement learning algorithms that generalize across classification problems not seen during meta-optimization. Similar results have also been shown for the discovery of reinforcement learning algorithms 110.
Applications of brain-inspired learning

Neuromorphic Computing. Neuromorphic computing represents a paradigm shift in the design of computing systems, with the goal of creating hardware that mimics the structure and functionality of the biological brain 42,111,112. This approach seeks to develop artificial neural networks that replicate not only the brain's learning capabilities but also its energy efficiency and inherent parallelism. Neuromorphic computing systems often incorporate specialized hardware, such as neuromorphic chips or memristive devices, to enable the efficient execution of brain-inspired learning algorithms 112. These systems have the potential to drastically improve the performance of machine learning applications, particularly in edge computing and real-time processing scenarios.

A key aspect of neuromorphic computing lies in the development of specialized hardware architectures that facilitate the implementation of spiking neural networks (SNNs), which more closely resemble the information processing mechanisms of biological neurons. Neuromorphic systems operate on the principle of brain-inspired local learning, which allows them to achieve the high energy efficiency, low-latency processing, and robustness against noise that are critical for real-world applications 113. The integration of brain-inspired learning techniques with neuromorphic hardware is vital for the successful application of this technology.

In recent years, advances in neuromorphic computing have led to the development of various platforms, such as Intel's Loihi 114, IBM's TrueNorth 115, and SpiNNaker 116, which offer specialized hardware architectures for implementing SNNs and brain-inspired learning algorithms. These platforms provide a foundation for further exploration of neuromorphic computing systems, enabling researchers to design, simulate, and evaluate novel neural network architectures and learning rules. As neuromorphic computing continues to progress, it is expected to play a pivotal role in the future of artificial intelligence, driving innovation and enabling the development of more efficient, versatile, and biologically plausible learning systems.
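To illustrate the kind of event-driven computation these platforms accelerate, consider a leaky integrate-and-fire (LIF) neuron, the basic unit of most SNNs. The sketch below uses illustrative constants not tied to any particular chip:

```python
import numpy as np

# Leaky integrate-and-fire neuron: membrane potential integrates input and leaks.
tau, v_thresh, v_reset, dt = 20.0, 1.0, 0.0, 1.0  # illustrative constants (ms, a.u.)
v, spikes = 0.0, []
inputs = np.random.default_rng(0).uniform(0.0, 0.15, size=200)  # placeholder input current

for t, I in enumerate(inputs):
    v += dt * (-v / tau + I)   # leak toward rest plus input current
    if v >= v_thresh:          # a threshold crossing emits a binary spike event
        spikes.append(t)
        v = v_reset            # membrane potential resets after the spike
```

Because information is carried by sparse binary spike events rather than dense activations, hardware implementing many such units can remain idle between events, which is the source of the energy-efficiency and latency advantages described above.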
Robotic learning. Brain-inspired learning in neural networks has the potential to overcome many of the current challenges present in the field of robotics by enabling robots to learn and adapt to their environments.