net/publication/40758726
All content following this page was uploaded by Sidney R Lehky on 04 June 2014.
Sidney R. Lehky
sidney@salk.edu
Computational Neuroscience Lab, Salk Institute, La Jolla, CA 92037, U.S.A.
1 Introduction
Figure 1: (a) Example of a spike rate function λ(t). (b) Set of 16 Poisson spike trains generated by λ(t). [Axes: (a) spikes/sec versus time (ms); (b) trial versus time (ms).]
[Figure: spikes/sec versus time (ms), with σ = 30 ms.]
Figure 3: The spike rate function λ̂(t) estimated from a set of spike trains depends on the width σ of the gaussian smoothing kernel used. Shown are the actual λ(t) (dotted line), as well as the estimated λ̂(t) generated by four different values of σ. Each estimated λ̂(t) was produced by convolving 100 spike trains, generated by the same λ(t), with a gaussian kernel of the specified σ. The results for all trials were then averaged. [Panels: (a) σ = 4, (b) σ = 16, (c) σ = 64, (d) σ = 256; each panel shows estimated and actual curves; axes: spikes/sec versus time (ms).]
2 Methods
$$ S(t \mid t_i) = \sum_{t_i} \delta(t - t_i) \tag{2.1} $$
$$ K(t \mid t_i, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(t - t_i)^2}{2\sigma^2}}, \tag{2.2} $$
1250 S. Lehky
and the estimate of the instantaneous firing rate given by the convolution of the spike train by the smoothing kernel:
$$ \hat{\lambda}(t \mid t_i, \sigma) = S * K = \sum_{t_i} K(t \mid t_i, \sigma) \tag{2.3} $$
$$ H(\omega \mid \sigma) = e^{-\frac{1}{2}(\sigma\omega/1000)^2}, \tag{2.4} $$
Thus, there is an inverse relation between the width of the kernel in the
time and frequency domains. Broad kernels in the time domain correspond
to narrow-width low-pass filters.
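The kernel estimate of equations 2.2 and 2.3 and the frequency response of equation 2.4 can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names are hypothetical, and it assumes that spike times and σ are in milliseconds, that ω is in radians per second (suggested by the factor of 1000 in equation 2.4), and that the estimate is reported in spikes/sec.

```python
import numpy as np

def gaussian_rate_estimate(spike_times_ms, t_ms, sigma_ms):
    """Sum of unit-area gaussians centred on each spike (eqs. 2.2-2.3), in spikes/sec."""
    spike_times_ms = np.asarray(spike_times_ms, dtype=float)
    t_ms = np.asarray(t_ms, dtype=float)
    # diff[a, b] = t_ms[a] - spike_times_ms[b]
    diff = t_ms[:, None] - spike_times_ms[None, :]
    kernels = np.exp(-diff**2 / (2.0 * sigma_ms**2)) / (sigma_ms * np.sqrt(2.0 * np.pi))
    # Kernels are densities per ms; multiply by 1000 to express the rate per second.
    return 1000.0 * kernels.sum(axis=1)

def frequency_response(omega, sigma_ms):
    """Gaussian low-pass response of eq. 2.4 (assumed: omega in rad/s, sigma in ms)."""
    return np.exp(-0.5 * (sigma_ms * omega / 1000.0) ** 2)

t = np.arange(0.0, 800.0, 1.0)                      # 1 ms grid over an 800 ms trial
estimate = gaussian_rate_estimate([200.0, 400.0, 600.0], t, sigma_ms=10.0)
print(estimate.sum() * 1.0 / 1000.0)                # integral of the rate: close to 3 spikes
```

Because each kernel has unit area, the estimated rate integrates to the spike count whenever the spikes lie well inside the recording window. Widening σ in `frequency_response` lowers the gain at any fixed nonzero ω, which is the inverse time-frequency relation described above.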
The rate function λ(t) was characterized by three parameters: (1) the mean rate,
$$ \langle\lambda(t)\rangle_t = \frac{1}{t_D}\int_0^{t_D} \lambda(t)\,dt, \tag{2.6} $$
where ⟨·⟩_t is the mean value operator with respect to time and t_D is spike train duration; (2) the variance Var(λ(t)); and (3) an exponent α defining a power law Fourier amplitude spectrum for λ(t), which was proportional to 1/f^α.
Decoding Poisson Spike Trains by Gaussian Filtering 1251

Figure 4: Example rate functions λ(t) with different Fourier amplitude spectra. The Fourier spectra were proportional to 1/f^α, with the values of α indicated in each panel. Besides the Fourier spectral exponent α, the other two parameters defining the randomly generated λ(t) were their mean and variance. [Panels: (a) pink noise, α = 1.0; (b) brown noise, α = 2.0; (c) black noise, α = 3.0; axes: spikes/sec versus time (ms).]

The rate function was computed by generating gaussian white noise followed by low-pass filtering in the frequency domain with the appropriate power law 1/f^α. The spectrum was truncated to zero for frequencies above 100 Hz for biophysical reasons. The function was then returned to the time domain by an inverse Fourier transform. At this point, the function was linearly scaled to give the desired mean and variance. This procedure produced rate functions λ(t) that were random functions in the form of colored noise, with α = 1 being pink noise, α = 2 being brown noise, and α = 3 sometimes called black noise (Schroeder, 1991). Some example rate functions are shown in Figure 4 for different values of α. In addition to a smaller high-frequency contribution, larger values of α lead to a broader autocorrelation function and can be considered “burstier.” Intracellular recordings show that neuronal firing rates can show rapid fluctuations similar to those seen in these rate functions, with the firing rates mirroring noisy current inputs to the cell (Carandini, Mechler, Leonard, & Movshon, 1996; Nowak, Sanchez-Vives, & McCormick, 1997).
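The generation procedure just described (white noise, 1/f^α shaping, truncation above 100 Hz, inverse transform, linear rescaling) might be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the function name and arguments are hypothetical, the zero-frequency component is set to zero before rescaling, and nothing here prevents the rate from dipping below zero when the variance is large relative to the mean.

```python
import numpy as np

def make_rate_function(n_samples, dt_ms, mean_rate, variance, alpha,
                       f_cut_hz=100.0, rng=None):
    """Colored-noise rate function with a 1/f**alpha amplitude spectrum (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    white = rng.standard_normal(n_samples)                 # gaussian white noise
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=dt_ms / 1000.0)   # frequencies in Hz
    gain = np.zeros_like(freqs)
    keep = (freqs > 0) & (freqs <= f_cut_hz)               # drop DC and everything above the cutoff
    gain[keep] = freqs[keep] ** (-alpha)                   # power-law amplitude shaping
    shaped = np.fft.irfft(spectrum * gain, n=n_samples)    # back to the time domain
    shaped = (shaped - shaped.mean()) / shaped.std()       # zero mean, unit variance
    return mean_rate + np.sqrt(variance) * shaped          # linear scaling to the target moments

lam = make_rate_function(800, 1.0, mean_rate=16.0, variance=1.6, alpha=2.0,
                         rng=np.random.default_rng(0))
print(lam.mean(), lam.var())
```

The example call uses a variance of 0.1 times the mean rate and α = 2, the combination quoted in the caption of Figure 5; the 800-sample, 1 ms grid is an assumption chosen to match the 800 ms time axes of the figures.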
A simulation “experiment” consisted of p trials, each trial using an identical rate function λ(t) but with a different random spike train derived from that rate function (see Figure 1). This is analogous to the way the same stimulus is presented multiple times in actual experiments. The value of p in simulations ranged from 1 to 256. The estimated rate function for an experiment, ελ̂(t), was calculated by averaging the estimated rate functions for individual trials:
$$ \varepsilon\hat{\lambda}(t) = \frac{1}{p}\sum_{k=1}^{p} \hat{\lambda}_k(t). \tag{2.7} $$
experiment group. The n exemplars of λ(t) within each group were randomly generated using the same triplet of parameters (mean, variance, and spectral exponent α). Thus, for each experimental group, we were interested in estimation errors associated with the statistical characteristics underlying λ(t), not with individual examples of λ(t). Each experimental group was therefore associated with one point in the statistical parameter space defining λ(t), and the parameter space was sampled with simulation data from a set of experimental groups.
The total error function was the mean integrated square error (MISE) between actual and estimated rate functions:
$$ E_T(\sigma) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}\bigl(\lambda_i(t) - \varepsilon\hat{\lambda}_{ij}(t \mid \sigma)\bigr)^2\,dt\right), \tag{2.8} $$
where t_D was the duration of the spike train. Taking advantage of the ergodicity of λ(t), total error in practice was calculated across time rather than across an ensemble at a point in time.
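The inner term of equation 2.8, the integrated square error for one experiment of p trials, can be sketched as below. Several choices are assumptions made for illustration and are not taken from the text: spikes are drawn with a per-bin Bernoulli approximation to a Poisson process, the gaussian kernel is truncated at ±4σ, time is integrated in seconds, and the sinusoidal test rate is hypothetical. All function names are likewise hypothetical.

```python
import numpy as np

def poisson_spike_train(rate_hz, dt_ms, rng):
    """Bernoulli approximation to an inhomogeneous Poisson process (needs rate*dt << 1)."""
    p_spike = np.clip(rate_hz * dt_ms / 1000.0, 0.0, 1.0)
    return (rng.random(rate_hz.shape) < p_spike).astype(float)

def smooth(train, dt_ms, sigma_ms):
    """Convolve a binned spike train with a gaussian kernel; result in spikes/sec."""
    half = int(np.ceil(4.0 * sigma_ms / dt_ms))           # truncate the kernel at +/- 4 sigma
    x = np.arange(-half, half + 1) * dt_ms
    kernel = np.exp(-x**2 / (2.0 * sigma_ms**2)) / (sigma_ms * np.sqrt(2.0 * np.pi))
    full = np.convolve(train, kernel)                     # full convolution, length n + 2*half
    return 1000.0 * full[half:half + train.size]          # samples aligned with the input bins

def experiment_ise(rate_hz, dt_ms, sigma_ms, p, rng):
    """Integrated square error between rate_hz and the average estimate over p trials."""
    trials = [smooth(poisson_spike_train(rate_hz, dt_ms, rng), dt_ms, sigma_ms)
              for _ in range(p)]
    estimate = np.mean(trials, axis=0)                    # trial average, as in eq. 2.7
    return np.sum((rate_hz - estimate) ** 2) * dt_ms / 1000.0

t_ms = np.arange(0.0, 800.0, 1.0)
rate = 16.0 + 4.0 * np.sin(2.0 * np.pi * t_ms / 400.0)    # hypothetical rate function
rng = np.random.default_rng(0)
for sigma in (4.0, 16.0, 64.0, 256.0):
    print(sigma, experiment_ise(rate, 1.0, sigma, p=16, rng=rng))
```

Averaging `experiment_ise` over m spike-train instantiations and n rate functions would give the full E_T(σ) of equation 2.8.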
3 Results
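$$ E_T(\sigma) = E_V(\sigma) + E_B(\sigma). \tag{3.1} $$
$$ E_V(\sigma) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}\mathrm{Var}\bigl(\varepsilon\hat{\lambda}_{ij}(t \mid \sigma)\bigr)\,dt\right), \tag{3.2} $$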
averaged over m instantiations of spike trains from the same λ(t), as well
as n exemplars of different λ(t) functions randomly generated using the
same triplet of parameter values. Variance here refers to variance across an
Figure 5: Error between actual rate function λ(t) and estimated rate function λ̂(t) as a function of the width σ of the gaussian smoothing kernel used. Error was defined as mean integrated square error (MISE), with units of (spikes/sec)². Curves were generated through simulations. Each panel shows error curves for a different mean firing rate of λ(t). Within individual panels, each curve corresponds to a different number of trials per experiment p, with the curve reflecting estimation error after pooling the p replications. In all cases, variance of λ(t) was equal to 0.1 times the mean firing rate, with Fourier exponent α = 2. Dotted lines are asymptotic bias error curves as p → ∞. [Panels: (a) 4 spikes/sec, (b) 16 spikes/sec, (c) 64 spikes/sec; curves for 1, 4, 16, 64, and 256 trials; both axes logarithmic, with error on the vertical axis and the horizontal axis running from 10^0 to 10^3.]
[Figure: three panels (a, b, c); vertical axis: error (logarithmic); horizontal axis: logarithmic, 10^0 to 10^3.]
averaged in the same way, again based on ensemble calculations. For convenience, we shall refer to E_V(σ) and E_B(σ) by the shorthand expressions “variance error” and “bias error,” keeping in mind that we are actually referring to integrated errors. Example E_T(σ) curves decomposed into their E_V(σ) and E_B(σ) components are shown in Figure 6. The variance error is associated with the descending portion of the E_T(σ) curve to the left, while the bias error is associated with the flat or ascending portion of the curve to the right.
An analytic expression for E_V(σ) can be derived. Consider a spike train of duration t_D with a single spike at time t_i that has been smoothed by a gaussian kernel with width σ. The smoothed spike train (estimated rate function λ̂(t)) is then given by
$$ \hat{\lambda}(t \mid \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(t - t_i)^2}{2\sigma^2}}, \qquad t \in [0, t_D]. \tag{3.4} $$
For an experiment (see equation 2.7) averaging data from p one-spike trials with the spike at random t_i for each trial, the resulting ελ̂(t) is the sum of p gaussian curves, each gaussian being attenuated in height by a factor of 1/p (variance of each gaussian attenuated by a factor of 1/p²). That gives
$$ \mathrm{Var}(\varepsilon\hat{\lambda}) \approx p\,\frac{1}{p^2}\,\langle\hat{\lambda}^2\rangle_t = \frac{1}{p}\,\langle\hat{\lambda}^2\rangle_t, \tag{3.7} $$
assuming that the firing rate and kernel width are both small enough such that the p gaussians do not substantially overlap. Inserting this expression into equation 3.2,
$$ E_V(\sigma) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\frac{1}{p}\int_{t=0}^{t_D}\langle\hat{\lambda}_{ij}^2\rangle_t\,dt\right). \tag{3.8} $$
$$ \hat{\lambda}(t \mid \sigma)^2 = \frac{1}{2\pi\sigma^2}\, e^{-\frac{(t - t_i)^2}{\sigma^2}}. \tag{3.9} $$
The value of ⟨λ̂²⟩_t, or mean value of λ̂(t | σ)², is the area under λ̂(t | σ)² divided by the length of the time period t_D:
$$ \langle\hat{\lambda}(t \mid \sigma)^2\rangle_t = \frac{1}{2\pi t_D}\int_{t=0}^{t_D}\frac{1}{\sigma^2}\, e^{-\frac{(t - t_i)^2}{\sigma^2}}\,dt. \tag{3.10} $$
as the error function erf approaches one for σ small relative to t_D − t_i and t_i. If we have a spike train with s spikes instead of one, then equation 3.11 becomes
$$ \langle\hat{\lambda}(t \mid \sigma)^2\rangle_t \approx \frac{s}{2\sqrt{\pi}\,t_D\,\sigma}, \tag{3.12} $$
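assuming the spike rate is low enough that convolution kernels for the spikes do not substantially overlap. Substituting equation 3.12 into equation 3.8 and moving constant terms out of the integral,
$$ E_V(\sigma) = \frac{1}{2\sqrt{\pi}\,p}\,\frac{1}{\sigma}\,\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}\frac{s_{ij}}{t_D}\,dt\right) = \frac{t_D}{2\sqrt{\pi}\,p}\,\frac{1}{\sigma}\,\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\frac{s_{ij}}{t_D}\right). \tag{3.13} $$
The expected value of s_ij/t_D (spikes per unit time) is equal to the mean of the rate function ⟨λ(t)⟩_t, giving us the final result for the variance error curve:
$$ E_V(\sigma) = \frac{t_D\,\langle\lambda(t)\rangle_t}{2\sqrt{\pi}\,p}\,\frac{1}{\sigma}. \tag{3.14} $$
$$ \beta_V = \frac{t_D\,\langle\lambda(t)\rangle_t}{2\sqrt{\pi}\,p}. \tag{3.15} $$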
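As a numerical sanity check on this derivation, the sketch below integrates the squared single-spike gaussian directly and compares its time average with the closed form of equation 3.12 for s = 1, then evaluates the variance error of equation 3.14. It assumes that t_D, t_i, and σ share one time unit (seconds here) and that σ is small relative to t_i and t_D − t_i; the function names are hypothetical.

```python
import numpy as np

def mean_square_single_spike(t_d, t_i, sigma, n=200001):
    """Time average of the squared gaussian estimate for one spike (numerical)."""
    t = np.linspace(0.0, t_d, n)
    lam_hat = np.exp(-(t - t_i) ** 2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    return np.mean(lam_hat**2)            # approximates (1/t_D) * integral of lam_hat**2

def variance_error(t_d, mean_rate, p, sigma):
    """Idealized variance error E_V(sigma) of eq. 3.14."""
    return t_d * mean_rate / (2.0 * np.sqrt(np.pi) * p * sigma)

t_d, t_i, sigma = 0.8, 0.4, 0.01          # seconds
numeric = mean_square_single_spike(t_d, t_i, sigma)
closed_form = 1.0 / (2.0 * np.sqrt(np.pi) * t_d * sigma)    # eq. 3.12 with s = 1
print(numeric, closed_form)

# Doubling sigma halves the idealized variance error (slope -1 on a log-log plot).
print(variance_error(t_d, 16.0, 4, 0.01), variance_error(t_d, 16.0, 4, 0.02))
```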
The inverse relation between variance error and σ seen in equation 3.14 is an idealization that is perturbed by two factors in real situations: edge effects from a finite duration spike train and overlap between kernels as the spike rate increases. The idealization holds for σ small relative to the spike train duration t_D, and for low spike rates such that σ is small relative to the mean interspike interval. In a log-log plot, the ideal E_V(σ) is a straight line with a slope of −1. In the E_V(σ) plots of Figure 6, we see deviations from this ideal for large σ. An empirical expression for the actual E_V(σ) is given by
$$ \hat{E}_V(\sigma) = \beta_V \exp\left(\sum_{i=0}^{4}\eta_i\,\bigl(\log\sigma\bigr)^i\right), \tag{3.16} $$

Table 1: Fitted Parameter Values for the Empirical Variance Error Curve, Equation 3.16, for Different Mean Firing Rates ⟨λ(t)⟩_t.
⟨λ(t)⟩_t   η0   η1   η2   η3   η4
Table 2: Fitted Parameter Values for the Asymptotic Bias Error Curve,
Equation 3.18, for Different Values of the Fourier Spectral Exponent α.
α ν1 ν2 ν3 ν4 ν5
Figure 7: Asymptotic bias error curves, showing how curve shape changes as a function of Fourier spectral exponent α. These curves have been normalized to a height of one. The asymptotic bias error curves indicate the limit of bias error as the number of trials per experiment p → ∞ (see the dotted curves in Figure 5). [Curves for α = 1.0, 1.5, 2.0, 2.5, and 3.0; vertical axis: normalized error, 0 to 1.0; horizontal axis: 0 to 1000.]
where Γ is the gamma function. The parameters ν_1, ..., ν_5 for the curves, found by nonlinear regression of equation 3.18 on the simulations, are given in Table 2. This equation is the weighted sum of the equations for the log-normal cumulative distribution function (cdf) and the gamma cdf. Those cdfs have no probabilistic interpretation here but were empirically chosen for their ability to fit the sigmoid E_asyB(σ) curves that arose through simulation.
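$$ \lim_{\sigma\to\infty} E(\sigma) = \beta_{asyB} = \frac{1}{n}\sum_{i=1}^{n}\left(\int_{t=0}^{t_D}\bigl(\lambda_i(t) - \langle\lambda_i(t)\rangle_t\bigr)^2\,dt\right). \tag{3.20} $$
$$ \beta_{asyB} = \frac{1}{n}\sum_{i=1}^{n}\left(t_D\left(\frac{1}{t_D}\int_{t=0}^{t_D}\bigl(\lambda_i(t) - \langle\lambda_i(t)\rangle_t\bigr)^2\,dt\right)\right). \tag{3.21} $$
The term in the inner parentheses is the mean of (λ_i(t) − ⟨λ_i(t)⟩_t)², so
$$ \beta_{asyB} = \frac{1}{n}\sum_{i=1}^{n}\Bigl(t_D\,\bigl\langle(\lambda_i(t) - \langle\lambda_i(t)\rangle_t)^2\bigr\rangle_t\Bigr). \tag{3.22} $$
$$ \beta_{asyB} = \frac{1}{n}\sum_{i=1}^{n}\bigl(t_D\,\mathrm{Var}(\lambda_i(t))\bigr). \tag{3.23} $$
As all λ_i(t) are defined to have identical variance, we get the final result:
$$ \beta_{asyB} = t_D\,\mathrm{Var}(\lambda(t)). \tag{3.24} $$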
Empirically, we shall treat the bias error curve as having the same shape as the asymptotic bias error curve but translated upward. This amounts to replacing in equation 3.18 the proportionality constant β_asyB by β_B:
$$ \hat{E}_B(\sigma) = \beta_B\left[\frac{\nu_1}{\nu_3\sqrt{2\pi}}\int_{x=-\infty}^{\sigma}\frac{1}{x}\,e^{-\frac{(\ln x - \nu_2)^2}{2\nu_3^2}}\,dx + \frac{1-\nu_1}{\nu_5^{\nu_4}\,\Gamma(\nu_4)}\int_{x=-\infty}^{\sigma}x^{\nu_4-1}\,e^{-x/\nu_5}\,dx\right]. \tag{3.25} $$
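$$ \beta_B = \lim_{\sigma\to\infty}E_B(\sigma) = \lim_{\sigma\to\infty}E_T(\sigma) = \lim_{\sigma\to\infty}\left(\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}\bigl(\lambda_i(t) - \varepsilon\hat{\lambda}_{ij}(t \mid \sigma)\bigr)^2\,dt\right)\right). \tag{3.26} $$
$$ \hat{\lambda}(t \mid t_i, \sigma) = S * K = \langle S(t \mid t_i)\rangle_t = \frac{s_k}{t_D} = r_k. \tag{3.27} $$
$$ \beta_B = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}\bigl(\lambda_i(t) - \varepsilon r_{ij}\bigr)^2\,dt\right). \tag{3.28} $$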
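Adding the term ⟨λ_i(t)⟩_t − ⟨λ_i(t)⟩_t = 0 into equation 3.28 and rearranging gives
$$ \beta_B = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}\bigl[(\lambda_i(t) - \langle\lambda_i(t)\rangle_t) - (\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)\bigr]^2\,dt\right). \tag{3.29} $$
$$ \beta_B = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\left[\int_{t=0}^{t_D}(\lambda_i(t) - \langle\lambda_i(t)\rangle_t)^2\,dt - 2\int_{t=0}^{t_D}(\lambda_i(t) - \langle\lambda_i(t)\rangle_t)(\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)\,dt + \int_{t=0}^{t_D}(\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)^2\,dt\right]\right). \tag{3.30} $$
The εr_ij − ⟨λ_i(t)⟩_t term in the middle integral is a constant, and the λ_i(t) − ⟨λ_i(t)⟩_t term integrates to zero, so the middle integral disappears, leaving
$$ \beta_B = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\left[\int_{t=0}^{t_D}(\lambda_i(t) - \langle\lambda_i(t)\rangle_t)^2\,dt + \int_{t=0}^{t_D}(\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)^2\,dt\right]\right). \tag{3.31} $$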
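The first integral in equation 3.31 is β_asyB (see equations 3.20 and 3.24), so:
$$ \beta_B = \beta_{asyB} + \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}(\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)^2\,dt\right) = t_D\,\mathrm{Var}(\lambda(t)) + \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}\int_{t=0}^{t_D}(\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)^2\,dt\right). \tag{3.32} $$
Both εr_ij and ⟨λ_i(t)⟩_t are constants as a function of time, so they may be taken outside the integral, producing
$$ \beta_B = t_D\,\mathrm{Var}(\lambda(t)) + \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}t_D\,(\varepsilon r_{ij} - \langle\lambda_i(t)\rangle_t)^2\right). \tag{3.33} $$
As the mean rate ⟨λ_i(t)⟩_t is defined as identical for all i, the expected value of ⟨εr⟩_t is ⟨λ_i(t)⟩_t, where ⟨εr⟩_t is the observed mean firing rate over m · n experiments. Therefore:
$$ \beta_B = t_D\,\mathrm{Var}(\lambda(t)) + \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m}\sum_{j=1}^{m}t_D\,(\varepsilon r_{ij} - \langle\varepsilon r\rangle_t)^2\right) = t_D\,\mathrm{Var}(\lambda(t)) + t_D\,\mathrm{Var}(\varepsilon r), \tag{3.34} $$
where Var(εr) is the variance of the mean firing rate among replicated experiments. Since Var(εr) results from averaging over p trials per experiment, in terms of a single-trial r, equation 3.34 becomes
$$ \beta_B = t_D\,\mathrm{Var}(\lambda(t)) + t_D\,\frac{\mathrm{Var}(r)}{p}, \tag{3.35} $$
where Var(r) is the variance of the mean rate between individual trials.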
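$$ F_s = \frac{\mathrm{Var}(s)}{\langle s\rangle_t}. \tag{3.36} $$
$$ F_r = \frac{\mathrm{Var}(r)}{\langle r\rangle_t} = \frac{\mathrm{Var}(s/t_D)}{\langle s/t_D\rangle_t} = \frac{\frac{1}{t_D^2}\,\mathrm{Var}(s)}{\frac{1}{t_D}\,\langle s\rangle_t} = \frac{1}{t_D}\,\frac{\mathrm{Var}(s)}{\langle s\rangle_t} = \frac{F_s}{t_D}. \tag{3.37} $$
$$ \mathrm{Var}(r) = F_r\,\langle r\rangle_t = \frac{F_s}{t_D}\,\langle r\rangle_t. \tag{3.38} $$
$$ \beta_B = t_D\,\mathrm{Var}(\lambda(t)) + \frac{F_s}{p}\,\langle r\rangle_t. \tag{3.39} $$
The expected value of the mean firing rate ⟨r⟩_t estimated over m · n · p trials is equal to the actual mean firing rate ⟨λ(t)⟩_t, giving the final result:
$$ \beta_B = t_D\,\mathrm{Var}(\lambda(t)) + \frac{F_s}{p}\,\langle\lambda(t)\rangle_t. \tag{3.40} $$
$$ \delta = p\cdot\langle\lambda(t)\rangle_t. \tag{3.41} $$
$$ I = \frac{1}{\delta}. \tag{3.42} $$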
Given that bit of nomenclature, we can now turn to the issue of the optimal value of σ, which minimizes the E_T(σ) curve.
Under certain conditions, changing mean rate ⟨λ(t)⟩_t translates the total error curve E_T(σ) vertically while leaving its shape invariant under a logarithmic scale. The significance of maintaining the shape invariance of E_T(σ) is that the value of σ that minimizes E_T(σ), σ_opt, remains constant. For error curve shape invariance to happen, the ratio of the bias error and variance error proportionality constants, β_B/β_V, must be constant as a function of ⟨λ(t)⟩_t. Using the definition of spike density δ (see equation 3.41), the proportionality constants β_V (see equation 3.15) and β_B (see equation 3.40) become
$$ \beta_V = \frac{t_D\,\langle\lambda(t)\rangle_t^2}{2\sqrt{\pi}\,\delta} \tag{3.43} $$
$$ \beta_B = t_D\,\mathrm{Var}(\lambda(t)) + F_s\,\frac{\langle\lambda(t)\rangle_t^2}{\delta}. \tag{3.44} $$
Their ratio is
$$ \frac{\beta_B}{\beta_V} = 2\sqrt{\pi}\left(\frac{\delta\,\mathrm{Var}(\lambda(t))}{\langle\lambda(t)\rangle_t^2} + \frac{F_s}{t_D}\right) = 2\sqrt{\pi}\left(\delta\left(\frac{\mathrm{Std}(\lambda(t))}{\langle\lambda(t)\rangle_t}\right)^2 + \frac{F_s}{t_D}\right). \tag{3.45} $$
The term within the inner bracket, the ratio of the standard deviation to the mean of λ(t), is the coefficient of variation of λ(t):
$$ C_V(\lambda) = \frac{\mathrm{Std}(\lambda(t))}{\langle\lambda(t)\rangle_t}. \tag{3.46} $$
$$ \frac{\beta_B}{\beta_V} = 2\sqrt{\pi}\left(\delta\,C_V^2 + \frac{F_s}{t_D}\right). \tag{3.47} $$
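The invariance argument can be checked with a few lines of arithmetic. The sketch below evaluates β_V (equation 3.15) and β_B (equation 3.40) for three combinations of mean rate and trial count that share the same spike density δ = 1024 spikes/sec and the same C_V = 0.1, then compares the ratio with equation 3.47. The specific (rate, p) pairs, the duration t_D = 0.8 s, and a Fano factor of 1 (the value for Poisson spike counts) are assumptions chosen for illustration; function names are hypothetical.

```python
import math

def beta_v(t_d, mean_rate, p):
    """Variance error proportionality constant, eq. 3.15."""
    return t_d * mean_rate / (2.0 * math.sqrt(math.pi) * p)

def beta_b(t_d, var_rate, mean_rate, p, fano=1.0):
    """Bias error proportionality constant, eq. 3.40."""
    return t_d * var_rate + fano * mean_rate / p

def ratio_from_density(delta, cv, t_d, fano=1.0):
    """Ratio beta_B / beta_V written in terms of spike density, eq. 3.47."""
    return 2.0 * math.sqrt(math.pi) * (delta * cv**2 + fano / t_d)

t_d, cv = 0.8, 0.1                                   # assumed duration (s) and C_V
ratios = []
for mean_rate, p in [(4.0, 256), (16.0, 64), (64.0, 16)]:
    delta = p * mean_rate                            # spike density, eq. 3.41
    var_rate = (cv * mean_rate) ** 2                 # variance implied by C_V
    ratios.append(beta_b(t_d, var_rate, mean_rate, p) / beta_v(t_d, mean_rate, p))
    print(mean_rate, p, delta, 1.0 / delta, ratios[-1])

print(ratio_from_density(1024.0, cv, t_d))           # same value as each ratio above
```

All three configurations give the same β_B/β_V, which is why the corresponding error curves differ only by a vertical shift on logarithmic axes.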
Figure 8: A set of error curves produced when the following three parameters were held constant: the spike density δ of the data set, the coefficient of variation C_V, and the Fourier exponent α of the rate function. This shows that the shapes of error curves are nearly invariant when these parameters are held constant, and consequently have their minima σ_opt at approximately the same values. Curves are vertically shifted from each other as a function of mean firing rate. Spike density is the mean of the firing rate function ⟨λ(t)⟩_t times the number of trials per experiment p. For all three curves, spike density was fixed at 1024 spikes/sec. The coefficient of variation C_V of λ(t) was set to 0.1 for these curves, and the Fourier exponent α was 2. [Axes: error (logarithmic) on the vertical axis; horizontal axis logarithmic, 10^0 to 10^3.]
Figure 9: The optimal value of the gaussian width σ_opt plotted as a function of the pooled mean interspike time I. The pooled mean interspike time was calculated after combining multiple replication trials in an experiment and is the reciprocal of spike density. (a) Three σ_opt versus I curves generated by changing the coefficient of variation C_V of the rate function λ(t). (b) Three σ_opt versus I curves generated by changing the Fourier spectral exponent α of the rate function λ(t). In both panels, the curves show a discontinuity to σ_opt → ∞ at the point when the intermediate dip in the error curves disappears, and instead they monotonically decline to some asymptotic value. [Panels: (a) α = 2.0 with C_V = 0.05, 0.10, 0.20; (b) C_V = 0.10 with α = 1.0, 2.0, 3.0; axes: optimal sigma (ms) versus pooled mean interspike interval (ms), both logarithmic.]
$$ \sigma_{opt} = a\,I^{\,b}. \tag{3.48} $$

Order   a                    b
1       6.41 C_V^{-1.17}     0.87 α^{-0.57}
2       5.54 C_V^{-1.24}     0.84 α^{-0.57}
3       4.52 C_V^{-1.16}     0.88 α^{-0.51}
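To make the power law of equation 3.48 concrete, the sketch below evaluates σ_opt using the first row of the table above. It assumes that I and σ_opt are both in milliseconds, as on the axes of Figure 9, and it does not interpret the "Order" column; the function name and the default coefficient tuple are illustrative only.

```python
def sigma_opt(interval_ms, cv, alpha, coef=(6.41, -1.17, 0.87, -0.57)):
    """Power-law estimate sigma_opt = a * I**b (eq. 3.48).

    coef = (a_scale, a_exponent, b_scale, b_exponent), defaulting to the
    first table row: a = 6.41 * cv**-1.17 and b = 0.87 * alpha**-0.57.
    """
    a_scale, a_exponent, b_scale, b_exponent = coef
    a = a_scale * cv ** a_exponent
    b = b_scale * alpha ** b_exponent
    return a * interval_ms ** b

# Spike density of 1024 spikes/sec gives a pooled mean interspike interval of 1000/1024 ms.
interval = 1000.0 / 1024.0
print(sigma_opt(interval, cv=0.1, alpha=2.0))
```

Because b is positive for the tabulated coefficients, sparser pooled data (larger I) call for a wider kernel, and a smaller C_V increases a and therefore σ_opt.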
Figure 10: Comparison of optimal gaussian width σ_opt determined from the minima of total error curves (taken from Figure 9a) with σ_opt determined by the Sheather-Jones algorithm. [Parameters: C_V = 0.10, α = 2.]
4 Conclusion
It has been generally understood in the past that σopt is directly related to
mean interspike interval (or modifying that concept here, the pooled mean
References
Arieli, A., Sterkin, A., Grinvald, A., & Aertsen, A. (1996). Dynamics of ongoing
activity: Explanation of the large variability in evoked cortical responses. Science,
273, 1868–1871.
August, D. A., & Levy, W. B. (1996). A simple spike train decoder inspired by the
sampling theorem. Neural Computation, 8, 67–84.
Bialek, W., Rieke, F., de Ruyter van Steveninck, R. R., & Warland, D. (1991). Reading
a neural code. Science, 252, 1854–1857.
Bowman, A. W., & Azzalini, A. (1997). Applied smoothing techniques for data analysis.
Oxford: Oxford University Press.
Carandini, M., Mechler, F., Leonard, C. S., & Movshon, J. A. (1996). Spike train
encoding by regular-spiking cells of the visual cortex. Journal of Neurophysiology,
76, 3425–3441.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the
primate cerebral cortex. Cerebral Cortex, 1, 1–47.
Ferster, D., & Jagadeesh, B. (1992). EPSP-IPSP interactions in cat visual cortex studied
with in vivo whole-cell patch recording. Journal of Neuroscience, 12, 1262–1274.
Goldberg, D. H., & Andreou, A. G. (2007). Distortion of neural signals by spike
coding. Neural Computation, 19, 2797–2839.
Jones, M. C., Marron, J. S., & Sheather, S. J. (1996). A brief survey of bandwidth
selection for density estimation. Journal of the American Statistical Association, 91,
401–407.
Lehky, S. R. (2004). Bayesian estimation of stimulus responses in Poisson spike trains.
Neural Computation, 16, 1325–1343.
Mainen, Z. F., & Sejnowski, T. J. (1995). Reliability of spike timing in neocortical
neurons. Science, 268, 1503–1506.
Nowak, L. G., Sanchez-Vives, M. V., & McCormick, D. A. (1997). Influence of low
and high frequency inputs on spike timing in visual cortical neurons. Cerebral
Cortex, 7, 487–501.
Parzen, E. (1962). On estimation of a probability density function and mode. Annals
of Mathematical Statistics, 33, 1065–1076.
Paulin, M. G., & Hoffman, L. F. (2001). Optimal firing rate estimation. Neural
Networks, 14, 877–881.
Richmond, B. J., Optican, L. M., Podell, M., & Spitzer, H. (1987). Temporal encoding
of two-dimensional patterns by single units in primate inferior temporal cortex.
I. Response characteristics. Journal of Neurophysiology, 57, 132–146.
Rieke, F., Warland, D., de Ruyter van Steveninck, R., & Bialek, W. (1997). Spikes:
Exploring the neural code. Cambridge, MA: MIT Press.
Schroeder, M. (1991). Fractals, chaos, power laws: Minutes from an infinite paradise.
New York: Freeman.
Sheather, S. J., & Jones, M. C. (1991). A reliable data-based bandwidth selection
method for kernel density estimation. Journal of the Royal Statistical Society B, 53,
683–690.
Silverman, B. W. (1986). Density estimation for statistics and data analysis. Boca Raton,
FL: Chapman & Hall/CRC Press.
Tiesinga, P., Fellous, J. M., & Sejnowski, T. J. (2008). Regulation of spike timing in
visual cortical circuits. Nature Reviews Neuroscience, 9, 97–107.
Wand, M. P., & Jones, M. C. (1995). Kernel smoothing. Boca Raton, FL: Chapman &
Hall/CRC Press.