[Figure caption fragment] … correlations between these input spikes and the output spike at time t^1. This becomes visible with the aid of the learning window W centered at t^1: the input spikes are too far away in time. The next output spike at t^2, however, is close enough to the previous input spike at t_i^3. The weight J_i is changed by w^out < 0 plus the contribution W(t_i^3 − t^2) > 0, the sum of which is positive (arrowheads). Similarly, the input spike at time t_i^4 leads to a change w^in + W(t_i^4 − t^2) < 0.

ΔJ_i(t) = η ∫_t^{t+T} dt′ [w^in S_i^in(t′) + w^out S^out(t′)] + η ∫_t^{t+T} dt′ ∫_t^{t+T} dt″ W(t″ − t′) S_i^in(t″) S^out(t′)   (1a)

= η [ Σ′_{t_i^f} w^in + Σ′_{t^n} w^out + Σ′_{t_i^f, t^n} W(t_i^f − t^n) ].   (1b)

In Eq. (1b) the prime indicates that only firing times t_i^f and t^n in the time interval [t, t+T] are to be taken into account; cf. Fig. 2.
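The per-trial update (1b) is simple to state operationally. The following is a minimal sketch (not code from the paper), assuming spike times are given as NumPy arrays; the antisymmetric exponential window used here is a placeholder, not the window (B1) of Appendix B.

```python
import numpy as np

def weight_change(t_in, t_out, w_in, w_out, W, t0, T):
    """Per-trial weight change of one synapse, Eq. (1b): w_in per input
    spike, w_out per output spike, and W(t_i^f - t^n) for every pair of
    an input spike t_i^f and an output spike t^n; the primed sums keep
    only spikes inside the trial interval [t0, t0 + T)."""
    t_in = t_in[(t_in >= t0) & (t_in < t0 + T)]
    t_out = t_out[(t_out >= t0) & (t_out < t0 + T)]
    pair_term = W(t_in[:, None] - t_out[None, :]).sum()  # sum over all pairs
    return w_in * t_in.size + w_out * t_out.size + pair_term

# toy usage with a placeholder window and eta = 1e-5
eta = 1e-5
W = lambda s: eta * np.sign(-s) * np.exp(-np.abs(s) / 0.02)
rng = np.random.default_rng(0)
dJ = weight_change(np.sort(rng.uniform(0.0, 1.0, 20)),
                   np.sort(rng.uniform(0.0, 1.0, 15)),
                   w_in=eta, w_out=-eta, W=W, t0=0.0, T=1.0)
print(dJ)
```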
Equation (1) represents a Hebb-type learning rule since it correlates presynaptic and postsynaptic behavior. More precisely, our learning scheme depends on the time sequence of input and output spikes. The parameters w^in, w^out as well as the amplitude of the learning window W may, and in general will, depend on the value of the efficacy J_i. Such a J_i dependence is useful so as to avoid unbounded growth of synaptic weights. Even though we have not emphasized this in our notation, most of the theory developed below is valid for J_i-dependent parameters; cf. Sec. V B.

B. Ensemble average

Given that input spiking is random but partially correlated and that the generation of spikes is in general a complicated dynamic process, an analysis of Eq. (1) is a formidable problem. We therefore simplify it. We have introduced a small parameter η > 0 into Eq. (1) with the idea in mind that the learning process is performed on a much slower time scale than the neuronal dynamics. Thus we expect that only averaged quantities enter the learning dynamics.

Considering averaged quantities may also be useful in order to disregard the influence of noise. In Eq. (1) spikes are discrete events that trigger a discontinuous change of the synaptic weight; cf. Fig. 2 (bottom). If we assume stochastic spike arrival, or a stochastic process for generating output spikes, the change ΔJ_i is a random variable, which exhibits fluctuations around some mean drift. Averaging implies that we focus on the drift and calculate the expected rate of change. Fluctuations are treated in Sec. VI.

1. Self-averaging of learning

Effective learning needs repetition over many trials of length T, each individual trial being independent of the previous ones. Equation (1) tells us that the results of the individual trials are to be summed. According to the (strong) law of large numbers [44], in conjunction with η being "small" [45], we can average the resulting equation, viz., Eq. (1), regardless of the random process. In other words, the learning procedure is self-averaging. Instead of averaging over several trials, we may also consider one single long trial during which input and output characteristics remain constant. Again, if η is sufficiently small, time scales are separated and learning is self-averaging.

The corresponding average over the resulting random process is denoted by angular brackets ⟨ ⟩ and is called an ensemble average, in agreement with physical usage. It is a probability measure on a probability space, which need not be specified explicitly; we simply refer to the literature [44]. Substituting s = t″ − t′ on the right-hand side of Eq. (1a) and dividing both sides by T, we obtain
the ensemble-averaged learning dynamics, in which the learning window W is integrated against the correlation function

C_i(s;t) = ⟨S_i^in(t+s) S^out(t)⟩   (3)

of input and output spike trains.
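In a simulation, the correlation function of Eq. (3) can be estimated as a cross-correlogram. A minimal sketch for stationary trains, assuming spike times in NumPy arrays and ignoring edge effects at the trial boundaries:

```python
import numpy as np

def corr_estimate(trials_in, trials_out, s_bins, duration):
    """Estimate C(s) = <S_in(t+s) S_out(t)> from a list of trials by
    histogramming all pairwise lags t_in - t_out (= s) and normalizing
    by the number of trials, the trial duration, and the bin width."""
    counts = np.zeros(len(s_bins) - 1)
    for t_in, t_out in zip(trials_in, trials_out):
        lags = t_in[:, None] - t_out[None, :]
        counts += np.histogram(lags, bins=s_bins)[0]
    return counts / (len(trials_in) * duration * np.diff(s_bins))

# sanity check: two independent Poisson trains at 10 s^-1 should give a
# flat correlogram, C(s) ~ nu_in * nu_out = 100 s^-2
rng = np.random.default_rng(1)
dur, n_trials, rate = 10.0, 200, 10.0
make = lambda: np.sort(rng.uniform(0, dur, rng.poisson(rate * dur)))
C = corr_estimate([make() for _ in range(n_trials)],
                  [make() for _ in range(n_trials)],
                  np.linspace(-0.1, 0.1, 21), dur)
print(C.round(1))
```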
dJ_i/dt ≡ J̇_i = a_0 + a_1 ν_i^in + a_2 ν^out + a_3 ν_i^in ν^out + a_4 (ν_i^in)² + a_5 (ν^out)².   (5)
…, however, are not generally valid. According to the results of Markram et al. [25], the width 𝒲 of the Hebbian learning window in cortical pyramidal cells is in the range of 100 ms. At retinotectal synapses 𝒲 is also in the range of 100 ms [26].

A mean rate formulation thus requires that all changes of the activity are slow on a time scale of 100 ms. This is not necessarily the case. The existence of oscillatory activity in the cortex in the range of 40 Hz (e.g., [14,15,20,48]) implies activity changes every 25 ms. Retinal ganglion cells fire synchronously on a time scale of about 10 ms [49]; cf. also [50]. Much faster activity changes are found in the auditory system. In the auditory pathway of, e.g., the barn owl, spikes can be phase-locked to frequencies of up to 8 kHz [51–53]. Furthermore, beyond the correlations between instantaneous rates, additional correlations between spikes may exist. For all the above reasons, the learning rule (5) in the simple rate formulation is insufficient to provide a generally valid description. In Secs. IV and V we will therefore study the full spike-based learning equation (4).

In our simple neuron model we assume that output spikes are generated stochastically with a time-dependent rate λ^out(t) that depends on the timing of input spikes. Each input spike arriving at synapse i at time t_i^f increases (or decreases) the instantaneous firing rate λ^out by an amount J_i(t_i^f) ε(t − t_i^f), where ε is a response kernel. The effect of an incoming spike is thus a change in probability density proportional to J_i. Causality is imposed by the requirement ε(s) = 0 for s < 0. In biological terms, the kernel ε may be identified with an excitatory (or inhibitory) postsynaptic potential. Throughout what follows, we assume excitatory couplings, J_i > 0 for all i and ε(s) ≥ 0 for all s. In addition, the response kernel ε(s) is normalized to ∫ ds ε(s) = 1; cf. Fig. 4.

The contributions from all N synapses as measured at the axon hillock are assumed to add up linearly. The result gives rise to a linear inhomogeneous Poisson model with intensity

λ^out(t) = ν_0 + Σ_{i=1}^N Σ_f J_i(t_i^f) ε(t − t_i^f).   (6)
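The model (6) is easy to simulate by binning time and thinning. A sketch, assuming an alpha-function kernel ε(s) = (s/τ_0²) e^{−s/τ_0} with τ_0 = 10 ms (cf. Table I) and a bin width small enough that λ^out Δt ≪ 1:

```python
import numpy as np

rng = np.random.default_rng(2)

def eps(s, tau0=0.010):
    """Causal alpha-function response kernel, normalized to integral 1."""
    return np.where(s > 0, s / tau0**2 * np.exp(-s / tau0), 0.0)

def output_spikes(input_trains, J, nu0, duration, dt=1e-4):
    """Inhomogeneous Poisson output, Eq. (6):
    lambda_out(t) = nu0 + sum_i sum_f J_i eps(t - t_i^f)."""
    t = np.arange(0.0, duration, dt)
    lam = np.full(t.size, nu0)
    for J_i, spikes in zip(J, input_trains):
        for tf in spikes:
            lam += J_i * eps(t - tf)
    return t[rng.random(t.size) < lam * dt]   # Bernoulli thinning per bin

N, dur, nu_in = 50, 5.0, 10.0
trains = [np.sort(rng.uniform(0, dur, rng.poisson(nu_in * dur))) for _ in range(N)]
out = output_spikes(trains, J=np.full(N, 0.02), nu0=0.0, duration=dur)
print(len(out) / dur)   # ~ nu0 + N * J * nu_in = 10 s^-1; cf. Eq. (7)
```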
Here, ν_0 is the spontaneous firing rate and the sums run over all spike arrival times at all synapses. By definition, the spike generation process (6) is independent of previous output spikes. In particular, this Poisson model does not include refractoriness.

In the context of Eq. (4), we are interested in ensemble averages over both the input and the output. Since Eq. (6) is a linear equation, the average can be performed directly and yields

⟨S^out⟩(t) = ν_0 + Σ_{i=1}^N J_i(t) ∫_0^∞ ds ε(s) λ_i^in(t − s).   (7)

In deriving Eq. (7) we have replaced J_i(t_i^f) by J_i(t) because efficacies are assumed to change adiabatically with respect to the width of ε. The ensemble-averaged output rate in Eq. (7) depends on the convolution of ε with the input rates. In what follows we denote

Λ_i^in(t) = ∫_0^∞ ds ε(s) λ_i^in(t − s).   (8)

FIG. 5. Spike-spike correlations. To understand the meaning of Eq. (9) we have sketched ⟨S_i^in(t′) S^out(t)⟩ / λ_i^in(t′) as a function of time t (full line). The dot-dashed line at the bottom of the graph is the contribution J_i(t′) ε(t − t′) of an input spike occurring at time t′. Adding this contribution to the mean rate contribution ν_0 + Σ_j J_j(t) Λ_j^in(t) (dashed line), we obtain the rate inside the square brackets of Eq. (9) (full line). At time t″ > t′ the input spike at time t′ enhances the output firing rate by an amount J_i(t′) ε(t″ − t′) (arrows). Note that in the main text we have taken t″ − t′ = −s.
Equation (7) may suggest that input and output spikes are statistically independent, which is not the case. To show this explicitly, we determine the ensemble-averaged correlation ⟨S_i^in(t+s) S^out(t)⟩ in Eq. (3). Since ⟨S_i^in(t+s) S^out(t)⟩ corresponds to a joint probability, it equals the probability density λ_i^in(t+s) for an input spike at synapse i at time t+s times the conditional probability density of observing an output spike at time t given the above input spike at t+s,

⟨S_i^in(t+s) S^out(t)⟩ = λ_i^in(t+s) [ ν_0 + J_i(t) ε(−s) + Σ_{j=1}^N J_j(t) Λ_j^in(t) ];   (9)

cf. Fig. 5. Temporal averaging of Eq. (9) over the interval [t, t+T] yields the correlation function

C_i(s;t) = Σ_{j=1}^N J_j(t) \overline{λ_i^in(t+s) Λ_j^in(t)} + \overline{λ_i^in}(t+s) [ν_0 + J_i(t) ε(−s)],   (10)

where we have assumed the weights J_j to be constant in the time interval [t, t+T]. Temporal averages are denoted by a bar; cf. Sec. II C. Note that \overline{λ_i^in}(t) = ν_i^in(t).

B. Learning equation

Before inserting the correlation function (10) into the learning rule (4) we define the covariance matrix

q_ij(s;t) := \overline{[λ_i^in(t+s) − ν_i^in(t+s)] [Λ_j^in(t) − ν_j^in(t)]}   (11)

and its convolution with the learning window W,

Q_ij(t) := ∫_{−∞}^{∞} ds W(s) q_ij(s;t).   (12)

Inserting Eq. (10) into Eq. (4) and using Eqs. (11) and (12), we obtain

J̇_i = w^in ν_i^in + w^out [ν_0 + Σ_j J_j ν_j^in] + W̃(0) ν_i^in ν_0 + Σ_j [Q_ij + W̃(0) ν_i^in ν_j^in] J_j + ν_i^in J_i ∫ ds W(s) ε(−s),   (13)

where W̃(0) := ∫ ds W(s). If all synapses have the same constant mean input rate, ν_i^in = ν^in for all i, it is convenient to use the abbreviations

k_1 := w^in ν^in + w^out ν_0 + W̃(0) ν^in ν_0,  k_2 := w^out ν^in + W̃(0) (ν^in)²,  k_3 := ν^in ∫ ds W(s) ε(−s)   (14)

in Eq. (13) and arrive at

J̇_i = k_1 + Σ_j (Q_ij + k_2 + k_3 δ_ij) J_j.   (15)
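Equation (15) is a linear system and can be integrated directly. A sketch with Euler steps and the Table I numbers, assuming for the covariance matrix the block structure that will be introduced in Eq. (17) (a positively correlated group of M_2 synapses):

```python
import numpy as np

N, M2 = 50, 25
k1, k2, k3 = 1e-4, -1e-4, 0.0      # Table I values; k3 = 0 for simplicity
Q = 6.84e-7
q_bound = 0.1                      # hard bounds 0 <= J_i <= q (Sec. V B)

Qmat = np.zeros((N, N))            # Q_ij = Q for i,j in the correlated group
Qmat[-M2:, -M2:] = Q

def simulate(J0, t_end, dt=1.0):
    """Euler integration of Eq. (15): dJ_i/dt = k1 + sum_j (Q_ij + k2
    + k3 delta_ij) J_j, clipped to the allowed weight range."""
    J = J0.copy()
    A = Qmat + k2 + k3 * np.eye(N)
    for _ in range(int(t_end / dt)):
        J += dt * (k1 + A @ J)
        np.clip(J, 0.0, q_bound, out=J)
    return J

rng = np.random.default_rng(3)
J = simulate(rng.uniform(0.0, 0.04, N), t_end=2e4)
print(J.mean())                          # ~ 2e-2: the mean weight is normalized
print(J[-M2:].mean() - J[:-M2].mean())   # > 0: slow structure formation (Sec. V)
```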
TABLE I. Values of the parameters used for the numerical simulations (a) and derived quantities (b).

(a) Parameters
  Learning:            η = 10⁻⁵; w^in = η; w^out = −1.0475 η;
                       A_1 = 1; A_2 = −1; τ_1 = 1 ms; τ_2 = 20 ms; τ_syn = 5 ms
  EPSP:                τ_0 = 10 ms
  Synaptic input:      N = 50; M_1 = 25; M_2 = 25;
                       ν^in = 10 s⁻¹; δν^in = 10 s⁻¹; ω/(2π) = 40 s⁻¹
  Further parameters:  ν_0 = 0; q = 0.1

(b) Derived quantities
  W̃(0) := ∫ ds W(s) = 4.75 × 10⁻⁸ s
  ∫ ds W(s)² = 3.68 × 10⁻¹² s
  ∫ ds W(s) ε(−s) = 7.04 × 10⁻⁶
  Q = 6.84 × 10⁻⁷ s⁻¹
  k_1 = 1 × 10⁻⁴ s⁻¹
  k_2 = −1 × 10⁻⁴ s⁻¹
  k_3 = 7.04 × 10⁻⁵ s⁻¹
  τ_av = 2 × 10² s
  τ_str = 2.93 × 10⁴ s
  τ_noise = 1.62 × 10⁵ s
  J_*^av = 2 × 10⁻²
  D = 2.47 × 10⁻⁹ s⁻¹
  D′ = 1.47 × 10⁻⁹ s⁻¹

The synapses in group M_2 receive an arbitrary time-dependent input, λ_i^in(t) = λ^in(t), with the same mean input rate ν^in = \overline{λ^in}(t) as in group M_1. Without going into details about the dependence of λ^in(t) upon the time t, we assume λ^in(t) to be such that the covariance q_ij(s;t) in Eq. (11) is independent of t. In this case it follows from Eq. (12) that Q_ij(t) ≡ Q for i, j ∈ M_2, regardless of t. For the sake of simplicity we require in addition that Q > 0. In summary, we suppose in the following

Q_ij(t) = { Q > 0  for i, j ∈ M_2,
          { 0      otherwise.   (17)

We recall that Q_ij is a measure of the correlations in the input arriving at synapses i and j; cf. Eqs. (11) and (12). Equation (17) states that at least some of the synapses receive positively correlated input, a rather natural assumption. Three different realizations of Eq. (17) are now discussed in turn.

1. White-noise input

For all synapses in group M_2, let us consider the case of stochastic white-noise input with intensity λ^in(t) and mean firing rate \overline{λ^in}(t) = ν^in(t) ≥ 0. The fluctuations are \overline{[λ^in(t+s) − ν^in(t+s)][λ^in(t) − ν^in(t)]} = σ_0 δ(s). Due to the convolution (8) with ε, Eq. (11) yields q_ij(s;t) = σ_0 ε(−s), independently of t, i, and j. We use Eq. (12) and find Q_ij(t) ≡ Q = σ_0 ∫ ds W(s) ε(−s). We want Q > 0 and therefore arrive at ∫ ds W(s) ε(−s) = k_3/ν^in > 0. We have seen before in Sec. IV D that k_3 > 0 is a natural assumption and in agreement with experiments.

2. Colored-noise input

Let us now consider the case of an instantaneous and memoryless excitation, ε(s) = δ(s). We assume that λ^in − ν^in obeys a stationary Ornstein-Uhlenbeck process [62] with correlation time τ_c. The fluctuations are therefore q_ij(s;t) ∝ exp(−|s|/τ_c), independent of the synaptic indices i and j. Q > 0 implies ∫ ds W(s) exp(−|s|/τ_c) > 0.
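For the white-noise case the only model-dependent number is ∫ ds W(s) ε(−s). A numerical sketch using the learning window (B1) of Appendix B and, as an assumption consistent with Table I, an alpha-function EPSP with τ_0 = 10 ms:

```python
import numpy as np

eta, A1, A2 = 1e-5, 1.0, -1.0
tau_syn, tau1, tau2, tau0 = 5e-3, 1e-3, 20e-3, 10e-3
tt1 = tau_syn * tau1 / (tau_syn + tau1)   # assumed: smooth junction at s = 0
tt2 = tau_syn * tau2 / (tau_syn + tau2)

def W(s):   # learning window, Eq. (B1)
    s = np.asarray(s, dtype=float)
    neg = np.exp(s / tau_syn) * (A1 * (1 - s / tt1) + A2 * (1 - s / tt2))
    pos = A1 * np.exp(-s / tau1) + A2 * np.exp(-s / tau2)
    return eta * np.where(s <= 0, neg, pos)

def eps(s):  # alpha-function EPSP, normalized to 1 (an assumption)
    s = np.asarray(s, dtype=float)
    return np.where(s > 0, s / tau0**2 * np.exp(-s / tau0), 0.0)

s = np.linspace(-0.3, 0.3, 600001)
ds = s[1] - s[0]
overlap = np.sum(W(s) * eps(-s)) * ds
print(overlap)            # ~ 7.04e-6 = k3 / nu_in   (Table I)
sigma0 = 0.1              # illustrative noise intensity, not from the paper
print(sigma0 * overlap)   # Q = sigma_0 * int W(s) eps(-s) ds > 0
```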
3. Periodic input
The average weight

J^av := N⁻¹ Σ_{i=1}^N J_i   (19)

approaches a stable fixed point during learning. Moreover, in this case the mean output rate ν^out is also stabilized, since ν^out = ν_0 + N J^av ν^in; cf. Eq. (7).

As long as the learning parameters do not depend on the J values, the rate of change of the average weight is obtained from Eqs. (15), (19), and k_3 = 0 (Sec. IV C),

J̇^av = k_1 + N k_2 J^av + N⁻¹ Σ_{i,j} Q_ij J_j.   (20)

In the following we consider the situation at the beginning of the learning procedure, where the set of weights {J_i} has not yet picked up any correlations with the set of Poisson intensities {λ_i^in} and therefore is independent of it. We may then replace J_i and Q_ij on the right-hand side of Eq. (20) by their average values J^av and Q^av = N⁻² Σ_{i,j=1}^N Q_ij, respectively. The resulting fixed point is

J_*^av = −k_1 / [N (k_2 + Q^av)],   (21)

and

τ_av = J_*^av / k_1 = −1 / [N (k_2 + Q^av)]   (22)

is the time constant of normalization. The fixed point in Eq. (21) is stable if and only if τ_av > 0.

During learning, weights {J_i} and rates {λ_i^in} may become correlated. In Appendix C we demonstrate that the influence of any interdependence between weights and rates on normalization can be neglected in the case of weak correlations in the input,

0 < Q ≪ −k_2.   (23)

The fixed point J_*^av in Eq. (21) and the time constant τ_av in Eq. (22) are then almost independent of the average correlation Q^av, which is always of the same order as Q.

In Figs. 7 and 8 we show numerical simulations with parameters as given in Appendix B. The average weight J^av always approaches J_*^av, independent of any initial conditions in the distribution of weights.
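The normalization result (20)–(22) can be checked directly with the Table I constants; a small sketch, with the exponential solution of the linear mean-weight equation written out by hand:

```python
import numpy as np

N, M2 = 50, 25
k1, k2, Q = 1e-4, -1e-4, 6.84e-7
Q_av = (M2 / N) ** 2 * Q          # N^-2 sum_ij Q_ij for the block form (17)

J_fix = -k1 / (N * (k2 + Q_av))   # fixed point, Eq. (21)
tau_av = -1.0 / (N * (k2 + Q_av)) # time constant, Eq. (22)
print(J_fix, tau_av)              # ~ 2e-2 and ~ 2e2 s (Table I)

# mean-weight trajectory from Eq. (20) with J_i ~ J_av and Q_ij ~ Q_av:
t = np.linspace(0.0, 1000.0, 11)
J0 = 0.0
J_av = J_fix + (J0 - J_fix) * np.exp(-t / tau_av)
print(J_av.round(4))              # exponential approach to the fixed point
```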
Weight constraints lead to further conditions on learning parameters

We have seen that normalization is possible without a J dependence of the learning parameters. Even if the average weight J^av approaches a fixed point J_*^av, there is no restriction on the size of individual weights, apart from J_i ≥ 0 for excitatory synapses and J_i ≲ N J_*^av. This means that a single weight can at most comprise the total (normalized) weight of all N synapses. The latter case is, however, unphysiological, since almost every neuron holds many synapses with nonvanishing efficacies (weights), and efficacies of biological synapses seem to be limited. We take this into account in our learning rule by introducing a hard upper bound q for each individual weight. As we will demonstrate, a reasonable value of q does not influence normalization in that J_*^av remains unchanged. However, an upper bound q > 0, whatever its value, leads to further constraints on the learning parameters.

To incorporate the restricted range of individual weights into our learning rule (1), we assume that we can treat the learning parameters w^in, w^out, and the amplitude of W as constant in the range 0 ≤ J_i ≤ q. For J_i < 0 or J_i > q, we take w^in = w^out = W = 0. In other words, we use Eq. (15) only between the lower bound 0 and the upper bound q and set dJ_i/dt = 0 if J_i < 0 or J_i > q.

Because of the lower and upper bounds for each synaptic weight, 0 ≤ J_i ≤ q for all i, a realizable fixed point J_*^av has to be within these limits. Otherwise all weights saturate either at the lower or at the upper bound. To avoid this, we first of all need J_*^av > 0. Since τ_av = J_*^av/k_1 in Eq. (22) must be positive for stable fixed points, we also need k_1 > 0. The meaning becomes transparent from Eq. (14) in the case of vanishing spontaneous activity in the output, ν_0 = 0. Then k_1 > 0 reduces to

w^in > 0,   (24)

which corresponds to neurobiological reality [28,56,31].

A second condition for a realizable fixed point arises from the upper bound, q > J_*^av. This requirement leads to k_2 < −k_1/(N q) − Q^av. Exploiting only k_2 < 0, we find from Eq. (14) that w^out + W̃(0) ν^in < 0, which means that postsynaptic spikes on average reduce the total weight of the synapses. This is one of our predictions that can be tested experimentally. Assuming W̃(0) > 0, which seems reasonable — with the benefit of hindsight — in terms of rate-coded learning in the manner of Hebb (Sec. III), we predict

w^out < 0,   (25)

which has not been verified by experiments yet.

Weight constraints do not influence the position of the fixed point J_*^av (as long as it remains realizable) but may enlarge the value of the time constant τ_av of normalization (see details in Appendix D). The time constant τ_av changes because weights saturated at the lower (upper) bound cannot contribute to a decrease (increase) of J^av. If fewer than the total number of weights add to our (subtractive) normalization, then the fixed point is approached more slowly; cf. Fig. 8, dashed line and insets. The factor by which τ_av may be enlarged is, however, of order 1 if we take the upper bound to be q = (1+d) J_*^av, where d > 0 is of order 1, which will be assumed throughout what follows; cf. Appendix D.

C. Structure formation

In our simple model with two groups of input, structure formation can be measured by the difference J^str between the average synaptic strengths in groups M_1 and M_2; cf. Sec. V A. We derive conditions under which this difference increases during learning. In the course of the argument we also show that structure formation takes place on a time scale τ_str considerably slower than the time scale τ_av of normalization.

We start from Eq. (15) with k_3 = 0 and randomly distributed weights. For the moment we assume that normalization has already taken place. Furthermore, we assume small correlations as in Eq. (23), which assures that the fixed point J_*^av ≈ −k_1/(N k_2) is almost constant during learning; cf. Eqs. (21) and (C1). If the formation of any structure in {J_i} is slow as compared to normalization, we are allowed to use J^av = J_*^av during learning. The consistency of this ansatz is checked at the end of this section.

The average weight in each of the two groups M_1 and M_2 is

J^(1) = M_1⁻¹ Σ_{i∈M_1} J_i  and  J^(2) = M_2⁻¹ Σ_{i∈M_2} J_i.   (26)

If lower and upper bounds do not influence the dynamics of each weight, the corresponding rates of change are

J̇^(1) = k_1 + M_1 J^(1) k_2 + M_2 J^(2) k_2,
J̇^(2) = k_1 + M_2 J^(2) (k_2 + Q) + M_1 J^(1) k_2.   (27)

One expects the difference J^str = J^(2) − J^(1) between those average weights to grow during learning because group M_2 receives a stronger reinforcement than M_1. Differentiating J^str with respect to time, using Eq. (27) and the constraint J^av = J_*^av = N⁻¹ (M_1 J^(1) + M_2 J^(2)), we find the rate of growth

J̇^str = (M_1 M_2 / N) Q J^str + M_2 Q J_*^av.   (28)

The first term on the right-hand side gives rise to an exponential increase (Q > 0), while the second term gives rise to a linear growth of J^str. Equation (28) has an unstable fixed point at J_*^str = −(N/M_1) J_*^av. Note that J_*^str is always negative and independent of Q.

We associate the time constant τ_str of structure formation with the time that is necessary for an increase of J^str from a typical initial value to its maximum. The maximum of J^str is of order J_*^av if M_1/M_2 is of order 1 (Sec. V A) and if q = (1+d) J_*^av, where d > 0 is of order 1 (Sec. V B). At the beginning of learning (t = 0) we may take J^str(0) = 0. Using this initial condition, an integration of Eq. (28) leads to J^str(t) = (N/M_1) J_*^av [exp(t M_1 M_2 Q / N) − 1]. With t = τ_str and J^str(τ_str) = J_*^av we obtain τ_str = N/(M_1 M_2 Q) ln(M_1/N + 1).
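The two-group dynamics (27) and the growth law (28) are easily reproduced numerically. A sketch with Table I values, starting from the normalized state J^(1) = J^(2) = J_*^av, so that J^str(0) = 0:

```python
import numpy as np

N, M1, M2 = 50, 25, 25
k1, k2, Q = 1e-4, -1e-4, 6.84e-7
J_fix = -k1 / (N * k2)             # ~ 2e-2, Eq. (21) with Q_av ~ 0

dt, T = 1.0, 3e4
J1 = J2 = J_fix
for _ in range(int(T / dt)):       # Euler steps of Eq. (27)
    d1 = k1 + M1 * J1 * k2 + M2 * J2 * k2
    d2 = k1 + M2 * J2 * (k2 + Q) + M1 * J1 * k2
    J1, J2 = J1 + dt * d1, J2 + dt * d2

J_str = J2 - J1
pred = (N / M1) * J_fix * np.expm1(T * M1 * M2 * Q / N)   # integral of Eq. (28)
print(J_str, pred)                 # both ~ 1e-2 at t = 3e4 s
```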
D = ν^in (w^in)² + ν^out (w^out)² + ν^in ν^out ∫ ds W(s)² + ν^in ν^out W̃(0) [2 (w^in + w^out) + W̃(0) (ν^in + ν^out)].   (31)

[Figure caption fragment] Deviations from the dashed straight line are due to fluctuations. The overall agreement between theory and simulation is good.
Thus, because of Poisson spike arrival and stochastic output firing with disjoint intervals being independent, each weight J_i undergoes a diffusion process with diffusion constant D.

To discuss the dependence of D upon the learning parameters, we restrict our attention to the case ν^in = ν^out in Eq. (31). Since mean input and output rates in biological neurons typically are not too different, this makes sense. Moreover, we do not expect the ratio ν^in/ν^out to be a critical parameter. We recall from Sec. V B that ν^out = −(k_1/k_2) ν^in once the weights are already normalized and if ν_0 = Q^av = 0. With ν^in = ν^out this is equivalent to k_1 = −k_2. Using the definitions of k_1 and k_2 in Eq. (14) we find w^in + w^out = −W̃(0) ν^in. If we insert this into Eq. (31), the final term vanishes. In what remains of Eq. (31) we identify the contributions due to input spikes, ν^in (w^in)², and output spikes, ν^out (w^out)². Weight changes because of correlations between input and output spikes enter Eq. (31) via ν^in ν^out ∫ ds W(s)².

Equation (30) describes the time course of the variance of a single weight. Estimating var J_i is numerically expensive because we have to simulate many independent learning trials. It is much cheaper to compute the variance of the distribution {J_i} of weights in a single learning trial. For the sake of a comparison of theory and numerics in Fig. 11, we plot

var{J_i}(t) := (N − 1)⁻¹ Σ_{i=1}^N [J_i(t) − J^av(t)]²,   (32)

which obeys a diffusion process with

var{J_i}(t) = (t − t_0) D′,   (33)

where D′ is, however, different from D because the weights of a single neuron do not develop independently of each other. Each output spike triggers a change of all N weights by an amount w^out. Therefore, output spikes do not contribute to a change of var{J_i}(t) as long as upper and lower bounds have not been reached. Furthermore, all synapses "see" the same spike train of the postsynaptic neuron they belong to. In contrast to that, input spikes at different synapses are independent. Again we assume that input and output spikes are independent; cf. the second paragraph at the beginning of Sec. VI. Combining the above arguments, we obtain the diffusion constant D′ by simply setting w^out = 0 and disregarding the term [ν^in W̃(0)]² ν^out in Eq. (31), which leads to

D′ = ν^in (w^in)² + ν^in ν^out [∫ ds W(s)² + 2 w^in W̃(0) + W̃(0)² ν^out].

The boundaries of validity of Eq. (33) are illustrated in Fig. 12.

B. Time scale of diffusion

The effects of shot noise in input and output show up on a time scale τ_noise, which may be defined as the time interval necessary for an increase of the variance (30) from var J_i(t_0) = 0 to var J_i(t_0 + τ_noise) = (J_*^av)². We chose J_*^av as a reference value because it represents the available range for each weight. From Eq. (30) we obtain τ_noise = (J_*^av)²/D. We use J_*^av = −k_1/(N k_2) from Eq. (21) and Q^av = 0. This yields

τ_noise = (1/(N² D)) (k_1/k_2)².   (34)

C. Comparison of time scales

We now compare τ_noise in Eq. (34) with the time constant τ_str = 1/(N Q) of structure formation as it appears in Eq. (29). The ratio τ_noise/τ_str should exceed 1 so as to enable structure formation (in the sense of Sec. V C). Otherwise weight diffusion due to noise spreads the weights between the lower bound 0 and the upper bound q and, consequently, destroys any structure.
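With the Table I numbers the comparison of time scales is a one-liner; a quick check:

```python
N = 50
k1, k2, Q = 1e-4, -1e-4, 6.84e-7
D = 2.47e-9                               # diffusion constant, Eq. (31) / Table I

tau_str = 1.0 / (N * Q)                   # structure formation, cf. Eq. (29)
tau_noise = (k1 / k2) ** 2 / (N**2 * D)   # diffusion time scale, Eq. (34)
print(tau_str)                            # ~ 2.9e4 s (Table I)
print(tau_noise)                          # ~ 1.6e5 s (Table I)
print(tau_noise / tau_str)                # ~ 5.5 > 1: structure can emerge
```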
… possible to include the effects of noise directly at the level of the differential equation, as is standard in statistics [62]. Such an approach would then lead to a Fokker-Planck equation for the evolution of weights, as discussed in [63]. All this is in principle straightforward but in practice very cumbersome.

Finally, we emphasize that we have used a crudely oversimplified neuron model, viz., a linear stochastic unit. In particular, there is no spike emission threshold nor reset or spike afterpotential. Poisson firing is not as unrealistic as it may at first seem, though. Large networks of integrate-and-fire neurons with stochastic connectivity exhibit Poisson-like firing [64]. Experimental spike interval distributions are also consistent with Poisson-like firing.

For the ensemble-averaged correlation between input and output we write

⟨S_i^in(t+s) S^out(t)⟩ = ⟨S_i^in(t+s) [f_i(t) + Σ_{j≠i} f_j(t)]⟩,   (A1)

where f_i(t) = J_i(t) Σ_f ε(t − t_i^f), with the upper index f ranging over the firing times t_i^f < t of neuron i, which has an axonal connection to synapse i; here 1 ≤ i ≤ N. Since ε is causal, i.e., ε(s) ≡ 0 for s < 0, we can drop the restriction t_i^f < t. The synapses being independent of each other, the sum over j (≠ i) is independent of S_i^in and thus we obtain

⟨S_i^in(t+s) [Σ_{j≠i} f_j(t)]⟩ = ⟨S_i^in⟩(t+s) Σ_{j≠i} ⟨f_j⟩(t) = λ_i^in(t+s) Σ_{j≠i} J_j(t) ∫_0^∞ dt′ ε(t′) λ_j^in(t − t′).   (A2)

The term ⟨S_i^in(t+s) f_i(t)⟩ in Eq. (A1) has to be handled with more care, as it describes the influence of synapse i on the firing behavior of the postsynaptic neuron,

⟨S_i^in(t+s) f_i(t)⟩ = ⟨ [Σ_{f′} δ(t+s − t_i^{f′})] J_i(t) [Σ_f ε(t − t_i^f)] ⟩.   (A3)

Discretizing time into a partition {t_k} of bins of width Δt, this expectation becomes, in the limit Δt → 0,

J_i(t) (Δt)⁻¹ Σ_k ε(t − t_k) ⟨ 1_{spike in [t+s, t+s+Δt)} 1_{spike in [t_k, t_k+Δt)} ⟩.   (A4)

Without restriction of generality we can choose our partition so that t_k = s + t for some k, say k = l. Singling out k = l, the rest (k ≠ l) can be averaged directly, since events in disjoint intervals are independent. Because ⟨1_{ }⟩ = prob{spike in [t+s, t+s+Δt)} = λ_i^in(t+s) Δt, the result is J_i(t) λ_i^in(t+s) Λ_i^in(t), where we have used Eq. (8). As for the term k = l, we plainly have 1_{ }² = 1_{ }, as an indicator function assumes only two distinct values, 0 and 1. We obtain J_i(t) λ_i^in(t+s) ε(−s).

Collecting terms and incorporating ν_0 ≠ 0, we find Eq. (9).
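Equation (9), whose derivation Appendix A completes, can also be checked by brute force. A Monte Carlo sketch for a single synapse (N = 1) with constant input rate; all numbers are illustrative, and edge effects at the trial boundaries are ignored:

```python
import numpy as np

rng = np.random.default_rng(4)
nu_in, nu0, J, tau0 = 20.0, 5.0, 0.5, 0.010
dur, dt, n_trials = 5.0, 5e-4, 200

def eps(s):
    return np.where(s > 0, s / tau0**2 * np.exp(-s / tau0), 0.0)

bins = np.arange(-0.05, 0.0502, 0.002)
counts = np.zeros(len(bins) - 1)
for _ in range(n_trials):
    t_in = np.sort(rng.uniform(0, dur, rng.poisson(nu_in * dur)))
    t = np.arange(0.0, dur, dt)
    lam = nu0 + J * eps(t[:, None] - t_in[None, :]).sum(axis=1)
    t_out = t[rng.random(t.size) < lam * dt]          # output model, Eq. (6)
    counts += np.histogram(t_in[:, None] - t_out[None, :], bins=bins)[0]

est = counts / (n_trials * dur * np.diff(bins))       # <S_in(t+s) S_out(t)>
s_mid = 0.5 * (bins[:-1] + bins[1:])
pred = nu_in * (nu0 + J * eps(-s_mid) + J * nu_in)    # Eq. (9), constant rate
print(np.abs(est - pred).max() / pred.max())          # small relative error
```

The correlogram is flat at ν^in (ν_0 + J ν^in) except for s < 0, where the input spike transiently raises the output rate through J ε(−s), just as Fig. 5 illustrates.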
APPENDIX B

We discuss the parameter regime of the simulations as shown in Secs. V and VI. Numerical values and important derived quantities are summarized in Table I.

1. Learning window

We use the learning window

W(s) = η · { exp(s/τ_syn) [A_1 (1 − s/τ̃_1) + A_2 (1 − s/τ̃_2)]   for s ≤ 0,
           { A_1 exp(−s/τ_1) + A_2 exp(−s/τ_2)                   for s > 0.   (B1)
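A direct numerical check of Eq. (B1) against the derived quantities in Table I. A sketch, assuming τ̃_{1,2} = τ_syn τ_{1,2}/(τ_syn + τ_{1,2}), the choice that makes W continuously differentiable at s = 0:

```python
import numpy as np

eta, A1, A2 = 1e-5, 1.0, -1.0
tau_syn, tau1, tau2 = 5e-3, 1e-3, 20e-3      # Table I
tt1 = tau_syn * tau1 / (tau_syn + tau1)      # assumed tilde-tau definition
tt2 = tau_syn * tau2 / (tau_syn + tau2)

def W(s):
    """Learning window, Eq. (B1): W > 0 for s < 0 (input before output),
    W < 0 for s > 0 (output before input)."""
    s = np.asarray(s, dtype=float)
    neg = np.exp(s / tau_syn) * (A1 * (1 - s / tt1) + A2 * (1 - s / tt2))
    pos = A1 * np.exp(-s / tau1) + A2 * np.exp(-s / tau2)
    return eta * np.where(s <= 0, neg, pos)

s = np.linspace(-0.2, 0.2, 400001)
ds = s[1] - s[0]
print(np.sum(W(s)) * ds)       # W~(0) ~ +4.75e-8 s       (Table I)
print(np.sum(W(s)**2) * ds)    # int W(s)^2 ds ~ 3.68e-12 s (Table I)
```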
… parameter regime. Fifth, input and output rates are identical for normalized weights, ν^in = ν^out for ν_0 = 0; see Sec. V B.

APPENDIX C: NORMALIZATION AND CORRELATIONS BETWEEN WEIGHTS AND INPUT RATES

… or M_2. To show that even under the condition of interdependence of {J_i} and {λ_i^in} there is a normalization property of Eq. (15) similar to that derived in Sec. V B, we investigate the most extreme case, in which the total mass of synaptic weight is, e.g., in M_2. Taking J_i = 0 for i ∈ M_1 into account, we replace N⁻¹ Σ_{i,j} J_j Q_ij in Eq. (20) by M_2⁻¹ N² J^av Q^av. The fixed point J_*^av is similar to that in Eq. (21) except for a multiplicative prefactor N/M_2 of order 1 preceding Q^av in Eq. (21),

J_*^av = −k_1 / [N (k_2 + Q^av N/M_2)].   (C1)

Since N/M_2 > 1, k_1 > 0, and k_2 + Q^av N/M_2 < 0, J_*^av in Eq. (C1) is larger than J_*^av in Eq. (21), where we assumed independence of {J_i} and {λ_i^in}. Correlations between {J_i} and {λ_i^in} can be neglected, however, if we assume 0 < Q ≈ Q^av ≪ −k_2; cf. Eq. (23). In this case, J_*^av in Eqs. (21) and (C1) are almost identical and independent of Q^av.

APPENDIX D: NORMALIZATION AND WEIGHT CONSTRAINTS

Let us consider the influence of weight constraints (Sec. V B) on the position of the fixed point J_*^av in Eq. (21) and the time constant τ_av of normalization in Eq. (22). We call N_↓ and N_↑ the number of weights at the lower bound 0 and the upper bound q > 0, respectively. By construction we have N_↓ + N_↑ ≤ N, where N is the number of synapses.

For example, if the average weight J^av approaches J_*^av from below, then only N − N_↑ weights can contribute to an increase of J^av. For the remaining N_↑ saturated synapses we have J̇_i = 0. Deriving from Eq. (15) an equation equivalent to Eq. (20), we obtain J̇^av = (1 − N_↑/N)(k_1 + N k_2 J^av + N Q^av J^av). The fixed point J_*^av remains unchanged as compared to Eq. (21), but the time constant τ_av for an approach of J_*^av from below is increased by a factor (1 − N_↑/N)⁻¹ ≥ 1 as compared to Eq. (22). Similarly, τ_av for an approach of J_*^av from above is increased by a factor (1 − N_↓/N)⁻¹ ≥ 1.

The factor by which τ_av is increased is of order 1 if we use the upper bound q = (1+d) J_*^av, where d > 0 is of order 1. If J^av = J_*^av, at most N_↑ = N/(1+d) synapses can saturate at the upper bound comprising the total weight. The remaining N_↓ = N − N/(1+d) synapses are at the lower bound 0. The time constant τ_av is enhanced by at most 1 + 1/d and 1 + d for an approach of the fixed point from below and above, respectively.

APPENDIX E: RANDOM WALK OF SYNAPTIC WEIGHTS

We consider the random walk of a synaptic weight J_i(t) for t > t_0, where J_i(t_0) is some starting value. The time course of J_i(t) follows from Eq. (1),

J_i(t) = J_i(t_0) + ∫_{t_0}^t dt′ [w^in S_i^in(t′) + w^out S^out(t′)] + ∫_{t_0}^t dt′ ∫_{t_0}^t dt″ W(t″ − t′) S_i^in(t″) S^out(t′).   (E1)

For a specific i, the spike trains S_i^in(t′) and S^out(t′) are now assumed to be statistically independent and generated by Poisson processes with constant rates, ν^in for all i and ν^out, respectively; cf. Secs. II A and IV A. Here ν^in can be prescribed, whereas ν^out then follows; cf., for instance, Eq. (7). For large N the independence is an excellent approximation. The learning parameters w^in and w^out can be positive or negative. The learning window W is some quadratically integrable function with a width 𝒲 as defined in Sec. II C. Finally, it may be beneficial to realize that spikes are described by δ functions.

The weight J_i(t) is a stepwise constant function of time; see Fig. 2 (bottom). According to Eq. (E1), an input spike arriving at synapse i at time t changes J_i at that time by a constant amount w^in and a variable amount ∫_{t_0}^t dt′ W(t − t′) S^out(t′), which depends on the sequence of output spikes in the interval [t_0, t]. Similarly, an output spike at time t results in a constant weight change w^out and a variable one that equals ∫_{t_0}^t dt″ W(t″ − t) S_i^in(t″). We obtain a random walk with independent steps but randomly variable step size. Suitable rescaling of this random walk leads to Brownian motion.

As in Sec. II B, we substitute s = t″ − t′ in the second line of Eq. (E1) and extend the integration over the new variable s so as to run from −∞ to ∞. This does not introduce a big error for t − t_0 ≫ 𝒲. The second line of Eq. (E1) then reduces to ∫ ds W(s) ∫_{t_0}^t dt′ S_i^in(t′ + s) S^out(t′).

We denote ensemble averages by angular brackets ⟨ ⟩. The variance then reads

var J_i(t) = ⟨J_i²⟩(t) − ⟨J_i⟩²(t).   (E2)

To simplify the ensuing argument, upper and lower bounds for each weight are not taken into account.

For the calculation of the variance in Eq. (E2), first of all we consider the term ⟨J_i⟩(t). We use the notation ⟨S_i^in⟩(t) = ν^in and ⟨S^out⟩(t) = ν^out because of constant input and output intensities. Stochastic independence of input and output leads to ⟨S_i^in(t′+s) S^out(t′)⟩ = ν^in ν^out. Using Eq. (E1) and ∫ ds W(s) = W̃(0) we then obtain

⟨J_i⟩(t) = J_i(t_0) + (t − t_0) [w^in ν^in + w^out ν^out + ν^in ν^out W̃(0)].   (E3)

Next, we consider the term ⟨J_i²⟩(t) in Eq. (E2). Using Eq. (E1) once again we obtain for t − t_0 ≫ 𝒲
⟨J_i²⟩(t) = −J_i(t_0)² + 2 J_i(t_0) ⟨J_i⟩(t) + ∫_{t_0}^t dt′ ∫_{t_0}^t du′ { ⟨S_i^in(t′) S_i^in(u′)⟩ (w^in)² + ⋯ + ∫ ds ∫ dv W(s) W(v) ⟨S_i^in(t′+s) S_i^in(u′+v) S^out(t′) S^out(u′)⟩ },   (E4)

where the ellipsis stands for the analogous (w^out)² and mixed terms. Since input and output were assumed to be independent, we get

⟨S_i^in S_i^in S^out S^out⟩ = ⟨S_i^in S_i^in⟩ ⟨S^out S^out⟩,
⟨S_i^in S_i^in S^out⟩ = ⟨S_i^in S_i^in⟩ ⟨S^out⟩,
⟨S^out S^out S_i^in⟩ = ⟨S^out S^out⟩ ⟨S_i^in⟩.   (E5)

We note that ⟨S_i^in⟩ = ν^in for all i and ⟨S^out⟩ = ν^out. Input spikes at times t′ and u′ are independent as long as t′ ≠ u′. In this case we therefore have ⟨S_i^in(t′) S_i^in(u′)⟩ = ν^in ν^in. For arbitrary times t′ and u′ we find (cf. Appendix A)

⟨S_i^in(t′) S_i^in(u′)⟩ = ν^in ν^in + ν^in δ(t′ − u′)   (E6)

and, analogously,

⟨S^out(t′) S^out(u′)⟩ = ν^out ν^out + ν^out δ(t′ − u′).   (E7)

Using Eqs. (E5), (E6), and (E7) in Eq. (E4), performing the integrations, and inserting the outcome together with Eq. (E3) into Eq. (E2), we arrive at

var J_i(t) = (t − t_0) D,   (E8)

where

D = ν^in (w^in)² + ν^out (w^out)² + ν^in ν^out ∫ ds W(s)² + ν^in ν^out W̃(0) [2 (w^in + w^out) + W̃(0) (ν^in + ν^out)],   (E9)

as announced.
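The pair (E8), (E9) is easy to validate by direct simulation of the random walk (E1). A toy sketch with independent constant-rate Poisson trains and, for convenience, a symmetric exponential window instead of Eq. (B1); the empirical variance of J_i(t) should match (t − t_0) D within sampling error:

```python
import numpy as np

rng = np.random.default_rng(5)
nu_in, nu_out, T = 10.0, 10.0, 10.0
w_in = w_out = 0.0                   # isolate the window contribution
w0, tau = 1e-3, 0.02                 # toy window parameters (assumptions)

W = lambda s: w0 * np.exp(-np.abs(s) / tau)
W0 = 2 * w0 * tau                    # W~(0) = int W ds, exact for this window
intW2 = w0**2 * tau                  # int W^2 ds, exact for this window

J = np.empty(2000)
for k in range(J.size):              # one independent trial per entry
    t_in = rng.uniform(0, T, rng.poisson(nu_in * T))
    t_out = rng.uniform(0, T, rng.poisson(nu_out * T))
    J[k] = (w_in * t_in.size + w_out * t_out.size
            + W(t_in[:, None] - t_out[None, :]).sum())   # Eq. (E1), t0 = 0

D = (nu_in * w_in**2 + nu_out * w_out**2 + nu_in * nu_out * intW2
     + nu_in * nu_out * W0 * (2 * (w_in + w_out) + W0 * (nu_in + nu_out)))
print(J.var(), D * T)                # ~ 5.2e-5 for both, Eqs. (E8), (E9)
```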
[1] D.O. Hebb, The Organization of Behavior (Wiley, New York, 1949).
[2] C. von der Malsburg, Kybernetik 14, 85 (1973).
[3] R. Linsker, Proc. Natl. Acad. Sci. USA 83, 7508 (1986).
[4] R. Linsker, Proc. Natl. Acad. Sci. USA 83, 8390 (1986).
[5] D.J.C. MacKay and K.D. Miller, Network 1, 257 (1990).
[6] K.D. Miller, Neural Comput. 2, 321 (1990).
[7] K.D. Miller and D.J.C. MacKay, Neural Comput. 6, 100 (1994).
[8] S. Wimbauer, O.G. Wenisch, K.D. Miller, and J.L. van Hemmen, Biol. Cybern. 77, 453 (1997).
[9] S. Wimbauer, O.G. Wenisch, J.L. van Hemmen, and K.D. Miller, Biol. Cybern. 77, 463 (1997).
[10] D.J. Willshaw and C. von der Malsburg, Proc. R. Soc. London, Ser. B 194, 431 (1976).
[11] K. Obermayer, G.G. Blasdel, and K. Schulten, Phys. Rev. A 45, 7568 (1992).
[12] T. Kohonen, Self-Organization and Associative Memory (Springer, Berlin, 1984).
[13] L.M. Optican and B.J. Richmond, J. Neurophysiol. 57, 162 (1987).
[14] R. Eckhorn et al., Biol. Cybern. 60, 121 (1988).
[15] C.M. Gray and W. Singer, Proc. Natl. Acad. Sci. USA 86, 1698 (1989).
[16] C.E. Carr and M. Konishi, J. Neurosci. 10, 3227 (1990).
[17] W. Bialek, F. Rieke, R.R. de Ruyter van Steveninck, and D. Warland, Science 252, 1855 (1991).
[18] A.K. Kreiter and W. Singer, Eur. J. Neurosci. 4, 369 (1992).
[19] M. Abeles, in Models of Neural Networks II, edited by E. Domany, J.L. van Hemmen, and K. Schulten (Springer, New York, 1994), pp. 121–140.
[20] W. Singer, in Models of Neural Networks II, edited by E. Domany, J.L. van Hemmen, and K. Schulten (Springer, New York, 1994), pp. 141–173.
[21] J.J. Hopfield, Nature (London) 376, 33 (1995).
[22] W. Gerstner, R. Kempter, J.L. van Hemmen, and H. Wagner, Nature (London) 383, 76 (1996).
[23] R.C. deCharms and M.M. Merzenich, Nature (London) 381, 610 (1996).
[24] M. Meister, Proc. Natl. Acad. Sci. USA 93, 609 (1996).
[25] H. Markram, J. Lübke, M. Frotscher, and B. Sakmann, Science 275, 213 (1997).
[26] L.I. Zhang et al., Nature (London) 395, 37 (1998).
[27] B. Gustafsson et al., J. Neurosci. 7, 774 (1987).
[28] T.V.P. Bliss and G.L. Collingridge, Nature (London) 361, 31 (1993).
[29] D. Debanne, B.H. Gähwiler, and S.M. Thompson, Proc. Natl. Acad. Sci. USA 91, 1148 (1994).
[30] T.H. Brown and S. Chattarji, in Models of Neural Networks II, edited by E. Domany, J.L. van Hemmen, and K. Schulten (Springer, New York, 1994), pp. 287–314.
[31] C.C. Bell, V.Z. Han, Y. Sugawara, and K. Grant, Nature (London) 387, 278 (1997).
[32] V. Lev-Ram et al., Neuron 18, 1025 (1997).
[33] H.J. Koester and B. Sakmann, Proc. Natl. Acad. Sci. USA 95, 9596 (1998).
[34] L.F. Abbott and K.I. Blum, Cereb. Cortex 6, 406 (1996).
[35] A.V.M. Herz, B. Sulzer, R. Kühn, and J.L. van Hemmen, Europhys. Lett. 7, 663 (1988).
[36] A.V.M. Herz, B. Sulzer, R. Kühn, and J.L. van Hemmen, Biol. Cybern. 60, 457 (1989).
[37] J.L. van Hemmen et al., in Konnektionismus in Artificial Intelligence und Kognitionsforschung, edited by G. Dorffner (Springer, Berlin, 1990), pp. 153–162.
[38] W. Gerstner, R. Ritz, and J.L. van Hemmen, Biol. Cybern. 69, 503 (1993).
[39] R. Kempter, W. Gerstner, J.L. van Hemmen, and H. Wagner, in Advances in Neural Information Processing Systems 8, edited by D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo (MIT Press, Cambridge, MA, 1996), pp. 124–130.
[40] P. Häfliger, M. Mahowald, and L. Watts, in Advances in Neural Information Processing Systems 9, edited by M.C. Mozer, M.I. Jordan, and T. Petsche (MIT Press, Cambridge, MA, 1997), pp. 692–698.
[41] B. Ruf and M. Schmitt, in Proceedings of the 7th International Conference on Artificial Neural Networks (ICANN'97), edited by W. Gerstner, A. Germond, M. Hasler, and J.-D. Nicoud (Springer, Heidelberg, 1997), pp. 361–366.
[42] W. Senn, M. Tsodyks, and H. Markram, in Proceedings of the 7th International Conference on Artificial Neural Networks (ICANN'97), edited by W. Gerstner, A. Germond, M. Hasler, and J.-D. Nicoud (Springer, Heidelberg, 1997), pp. 121–126.
[43] R. Kempter, W. Gerstner, and J.L. van Hemmen, in Advances in Neural Information Processing Systems 11, edited by M. Kearns, S.A. Solla, and D. Cohn (MIT Press, Cambridge, MA, in press).
[44] J.W. Lamperti, Probability, 2nd ed. (Wiley, New York, 1996).
[45] J.A. Sanders and F. Verhulst, Averaging Methods in Nonlinear Dynamical Systems (Springer, New York, 1985).
[46] R. Kempter, W. Gerstner, J.L. van Hemmen, and H. Wagner, Neural Comput. 10, 1987 (1998).
[47] T.J. Sejnowski and G. Tesauro, in Neural Models of Plasticity: Experimental and Theoretical Approaches, edited by J.H. Byrne and W.O. Berry (Academic Press, San Diego, 1989), Chap. 6, pp. 94–103.
[48] R. Ritz and T.J. Sejnowski, Curr. Opin. Neurobiol. 7, 536 (1997).
[49] M. Meister, L. Lagnado, and D.A. Baylor, Science 270, 1207 (1995).
[50] M.J. Berry, D.K. Warland, and M. Meister, Proc. Natl. Acad. Sci. USA 94, 5411 (1997).
[51] W.E. Sullivan and M. Konishi, J. Neurosci. 4, 1787 (1984).
[52] C.E. Carr, Annu. Rev. Neurosci. 16, 223 (1993).
[53] C. Köppl, J. Neurosci. 17, 3312 (1997).
[54] R. Kempter, Hebbsches Lernen zeitlicher Codierung: Theorie der Schallortung im Hörsystem der Schleiereule, Naturwissenschaftliche Reihe, Vol. 17 (DDD, Darmstadt, 1997).
[55] L. Wiskott and T. Sejnowski, Neural Comput. 10, 671 (1998).
[56] P.A. Salin, R.C. Malenka, and R.A. Nicoll, Neuron 16, 797 (1996).
[57] W. Gerstner, R. Kempter, J.L. van Hemmen, and H. Wagner, in Pulsed Neural Nets, edited by W. Maass and C.M. Bishop (MIT Press, Cambridge, MA, 1998), Chap. 14, pp. 353–377.
[58] A. Zador, C. Koch, and T.H. Brown, Proc. Natl. Acad. Sci. USA 87, 6718 (1990).
[59] C.W. Eurich, J.D. Cowan, and J.G. Milton, in Proceedings of the 7th International Conference on Artificial Neural Networks (ICANN'97), edited by W. Gerstner, A. Germond, M. Hasler, and J.-D. Nicoud (Springer, Heidelberg, 1997), pp. 157–162.
[60] C.W. Eurich, U. Ernst, and K. Pawelzik, in Proceedings of the 8th International Conference on Artificial Neural Networks (ICANN'98), edited by L. Niklasson, M. Boden, and T. Ziemke (Springer, Berlin, 1998), pp. 355–360.
[61] H. Hüning, H. Glünder, and G. Palm, Neural Comput. 10, 555 (1998).
[62] N.G. van Kampen, Stochastic Processes in Physics and Chemistry (North-Holland, Amsterdam, 1992).
[63] H. Ritter, T. Martinez, and K. Schulten, Neural Computation and Self-Organizing Maps: An Introduction (Addison-Wesley, Reading, 1992).
[64] C. van Vreeswijk and H. Sompolinsky, Science 274, 1724 (1996).
[65] W. Softky and C. Koch, J. Neurosci. 13, 334 (1993).
[66] F. Rieke, D. Warland, R. de Ruyter van Steveninck, and W. Bialek, Spikes: Exploring the Neural Code (MIT Press, Cambridge, MA, 1997).
[67] M.N. Shadlen and W.T. Newsome, Curr. Opin. Neurobiol. 4, 569 (1994).
[68] Ö. Bernander, R.J. Douglas, K.A.C. Martin, and C. Koch, Proc. Natl. Acad. Sci. USA 88, 11569 (1991).
[69] M. Rapp, Y. Yarom, and I. Segev, Neural Comput. 4, 518 (1992).