
Efficient simulation of point processes with applications to neurosciences

Cyrille Mascart, Ph.D.
ex-PhD student at I3S laboratory, Université Côte d'Azur

Lab meeting - Koulakov lab

January 31, 2022
Thesis work

Articles:
Simulation of stochastic processes (Ornstein-Uhlenbeck)[1] (published)
Simulation of large networks of point processes[2] (in press)
Application to large networks of point processes[3] (submitted)
Software:
GODDESS: generic discrete-event simulator
SPIKES: point process simulation
RECONSTRUCTION: causal interaction estimation
Supervisor: GUI for SPIKES and RECONSTRUCTION

1 P. Grazieschi, M. Leocata, C. Mascart, J. Chevallier, F. Delarue, and E. Tanré. "Network of interacting neurons with random synaptic weights". In: ESAIM: Proceedings and Surveys 65 (2019), pp. 445-475.
2 C. Mascart, A. Muzy, and P. Reynaud-Bouret. "Efficient Simulation of Sparse Graphs of Point Processes". In: arXiv preprint arXiv:2001.01702 (2020).
3 C. Mascart, G. Scarella, P. Reynaud-Bouret, and A. Muzy. "Simulation scalability of large brain neuronal networks thanks to time asynchrony". In: bioRxiv (2021).
Context: neuroscience

Neurons (= process, mark, type)
A neuron randomly emits spikes.
A neuron is a node in an interaction graph.

Spikes (= points)
A spike is an isolated point on ℝ.

(Figure: spike sorting)
Point processes

Countable random set of points (= spikes):

x_t = {T_i | T_i < t}_{i=1,…}

(Figure: a realisation T_1 < T_2 < ⋯ < T_6 on the time axis)

Conditional intensity
Some families are described by a conditional intensity φ_t(x_t) (≈ rate):

P[1 new point ∈ [t, t + dt[ | x_t] = φ_t(x_t) dt + o(dt)
Example: homogeneous Poisson point process

Constant conditional intensity function: φ_t(x_t) = Γ > 0

(Figure: constant intensity Γ over the points T_1, …, T_6; successive interpoint intervals are ℰ(Γ))

Property
Interpoint intervals are independent and exponentially distributed with rate Γ.
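The property above already gives the simplest simulation scheme. Below is a minimal Python sketch (illustrative only, not the thesis software; the function name and parameter values are assumptions): a homogeneous Poisson process is generated by accumulating independent ℰ(Γ) interpoint intervals.

import numpy as np

def simulate_homogeneous_poisson(rate, t_end, rng=None):
    """Points of a homogeneous Poisson process of intensity `rate` on (0, t_end]."""
    rng = rng or np.random.default_rng()
    points, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / rate)   # interpoint interval ~ Exp(rate)
        if t > t_end:
            return np.array(points)
        points.append(t)

# Example: Gamma = 0.5 Hz over 100 s (values chosen only for illustration)
spikes = simulate_homogeneous_poisson(rate=0.5, t_end=100.0)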
Why use point processes in neuroscience?

Neuron modelling level
Neuronal discharge can be seen as a stochastic process, and ionic channels are usually modelled stochastically.
Not all neurons are best modelled with a threshold for triggering a spike.

Network modelling level
The whole network of neurons cannot be taken into account (hidden neurons).
Point process models are used to estimate causal interactions (↔ functional connectivity)[4].

4 R. C. Lambert, C. Tuleau-Malot, T. Bessaih, V. Rivoirard, Y. Bouret, N. Leresche, and P. Reynaud-Bouret. "Reconstructing the functional connectivity of multiple spike trains using Hawkes models". In: Journal of Neuroscience Methods 297 (2018), pp. 9-21.
Point process models fit spiking data well

Classically, Poisson process models are used.
The Poisson model falls short whenever synchronization occurs or neurons are part of the same assembly[5].
We need models that embed interactions.

5 P. Reynaud-Bouret, V. Rivoirard, F. Grammont, and C. Tuleau-Malot. "Goodness-of-fit tests and nonparametric adaptive estimation for spike train analysis". In: The Journal of Mathematical Neuroscience 4.1 (2014), pp. 1-41.
Interactions and time asynchrony

(Figure: interaction graph; a point of process j feeds into the intensity φ_{i,t}(x_t) of process i through the interaction function h_{ji}(t))

Time asynchrony
At any time t, there can be only one point among all the processes. This is guaranteed as soon as the conditional intensity exists.
Multivariate Hawkes point process

Self-exciting conditional intensity function

φ_{i,t}(x_t) = f( μ_i + Σ_{j=1}^{M} ∫_0^t h_{ji}(t − s) dx^j_s )

which, in the linear case (f the identity), reads

φ_{i,t}(x_t) = μ_i + Σ_{j=1}^{M} Σ_{T_j < t, T_j ∈ x^j_t} h_{ji}(t − T_j)

(Figure: each point T_j of process j adds a kernel h_{ji}(t − T_j) on top of the spontaneous rate μ_i in the intensity φ_{i,t}(x_t) of process i)
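As a concrete reading of the linear formula, the sketch below (an illustration only; the exponential kernel, the names and the numbers are assumptions, not the thesis code) evaluates the intensity of one neuron from the past spikes of its presynaptic processes.

import numpy as np

def hawkes_intensity(t, mu_i, past_spikes, kernels):
    """Linear Hawkes intensity of neuron i at time t.
    past_spikes: {j: array of spike times of presynaptic process j}
    kernels:     {j: callable h_ji(s) for s > 0}"""
    rate = mu_i
    for j, times in past_spikes.items():
        lags = t - times[times < t]        # only points strictly before t contribute
        rate += kernels[j](lags).sum()
    return rate

# Illustrative exponential kernel h_ji(s) = 0.8 * exp(-s / 0.02)
kernels = {1: lambda s: 0.8 * np.exp(-s / 0.02)}
past = {1: np.array([0.10, 0.15, 0.31])}
print(hawkes_intensity(0.32, mu_i=0.1, past_spikes=past, kernels=kernels))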
Thinning

The conditional intensity φ_t(x_t) is bounded above by a constant Γ.
Candidate points are drawn from a homogeneous Poisson process of rate Γ (independent ℰ(Γ) gaps).
A candidate at time T_k is accepted with probability φ_{T_k}(x_{T_k}) / Γ and rejected with probability 1 − φ_{T_k}(x_{T_k}) / Γ.

(Figure: candidates T_1 and T_3 accepted, T_2 rejected, according to the ratio φ/Γ)
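A hedged Python sketch of the thinning scheme just described (illustrative only; the intensity and the bound Γ below are assumptions chosen so that φ ≤ Γ holds):

import numpy as np

def thinning(phi, gamma, t_end, rng=None):
    """Sample a point process with conditional intensity phi(t, history) <= gamma by thinning."""
    rng = rng or np.random.default_rng()
    history, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / gamma)                 # candidate from a Poisson(gamma) process
        if t > t_end:
            return np.array(history)
        if rng.uniform() <= phi(t, history) / gamma:      # accept with probability phi / gamma
            history.append(t)

# Example: an inhomogeneous intensity bounded by gamma = 2 (the history argument is unused here)
phi = lambda t, history: 1.0 + np.sin(t) ** 2
points = thinning(phi, gamma=2.0, t_end=10.0)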
Time-rescaling theorem[6]

Time-rescaling theorem (or random-time-change theorem)
Let T_1 < ⋯ < T_n be a realisation of the point process over (0, T].
Define the transformation (compensator)

Φ_t(x_t) = ∫_0^t φ_s(x_s) ds,   with ∀t ∈ (0, T], Φ_t(x_t) < ∞,

and apply it to the points T_k. Then

{Φ_{T_k}(x_{T_k}) / T_k ∈ x_T}

is a unit-rate Poisson process. Equivalently, for consecutive points,

∀(T_k < T_{k+1}) ∈ x_T,   Φ_{T_{k+1}}(x_{T_{k+1}}) − Φ_{T_k}(x_{T_k}) ∼ ℰ(1), i.e. distributed as −log(𝒰(0, 1]).

(Figure: the area under φ_t(x_t) between two consecutive points is an ℰ(1) random variable)

6 E. N. Brown, R. Barbieri, V. Ventura, R. E. Kass, and L. M. Frank. "The time-rescaling theorem and its application to neural spike train data analysis". In: Neural Computation 14.2 (2002), pp. 325-346.
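The theorem also yields a simulation recipe: draw E ∼ ℰ(1) and solve Φ_{T_{k+1}} − Φ_{T_k} = E for the next point. The sketch below assumes the intensity is a known deterministic function of time between two events (as for Hawkes processes between spikes); the names and the horizon t_max are assumptions made for illustration.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def next_point(phi, t_last, rng, t_max=1e3):
    """Next point after t_last for intensity phi(t), via the time-rescaling theorem."""
    e = rng.exponential(1.0)                              # target area E ~ Exp(1)
    gap_area = lambda t: quad(phi, t_last, t)[0] - e      # = Phi_t - Phi_{t_last} - E
    if gap_area(t_max) < 0:
        return None                                       # no point before the horizon t_max
    return brentq(gap_area, t_last, t_max)                # root: the integral reaches E

rng = np.random.default_rng(0)
phi = lambda t: 1.0 + np.sin(t) ** 2                      # illustrative intensity
print(next_point(phi, t_last=0.0, rng=rng))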
The Gillespie-inspired methods

A galaxy of methods
Doob's algorithm (Markov processes)[7].
Popularized by Gillespie, rediscovered by physicists[8,9].
Thinning method by Lewis and Shedler[10].
Peculiar methods: immigrant-birth representation (a.k.a. clustering)[11,12].

7 J. L. Doob. "Topics in the theory of Markoff chains". In: Transactions of the American Mathematical Society 52.1 (1942), pp. 37-64.
8 D. T. Gillespie. "A general method for numerically simulating the stochastic time evolution of coupled chemical reactions". In: Journal of Computational Physics 22.4 (1976), pp. 403-434.
9 A. Bouchard-Côté, S. J. Vollmer, and A. Doucet. "The bouncy particle sampler: A nonreversible rejection-free Markov chain Monte Carlo method". In: Journal of the American Statistical Association 113.522 (2018), pp. 855-867.
10 P. A. W. Lewis and G. S. Shedler. "Simulation of nonhomogeneous Poisson processes by thinning". In: Naval Research Logistics Quarterly 26.3 (1979), pp. 403-413.
11 J. Møller and J. G. Rasmussen. "Perfect simulation of Hawkes processes". In: Advances in Applied Probability 37.3 (2005), pp. 629-646.
12 E. Bacry, I. Mastromatteo, and J.-F. Muzy. "Hawkes processes in finance". In: Market Microstructure and Liquidity 1.01 (2015), p. 1550005.
Ogata's algorithm[13] (next occurring point)

Step 1: computation of the cumulated sum Σ_{j=1}^{M} φ_{j,t}(x_t).
(Figure: the intensities are stacked into the partial sums φ_{1,t}(x_t), φ_{1,t}(x_t) + φ_{2,t}(x_t), …, φ_{1,t}(x_t) + ⋯ + φ_{M,t}(x_t), the last one being the intensity of the joint process.)

Step 2: computation of the next point T of the joint process.

Step 3: choice of the process emitting the point (a process is picked with probability proportional to its intensity at T; here neuron 5).

Step 4: impact of the new point emitted by 5 on its children.
Since φ_{i,t}(x_t) = μ_i + Σ_{j=1}^{M} Σ_{T_j < t, T_j ∈ x^j_t} h_{ji}(t − T_j), the new spike from 5 requires updating φ_{6,t}(x_t), φ_{8,t}(x_t) and φ_{9,t}(x_t), the intensities of its children in the interaction graph.

13 Y. Ogata. "On Lewis' simulation method for point processes". In: IEEE Transactions on Information Theory 27.1 (1981), pp. 23-31.
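For concreteness, a compact sketch of the four steps for a linear Hawkes network with exponential kernels h_{ji}(s) = α_{ji} e^{−βs} (the kernel choice, the names and the parameter values are assumptions; this is a didactic version, not the thesis implementation). With this kernel the joint intensity only decays between events, so its current value is a valid thinning bound.

import numpy as np

def ogata_hawkes_exp(mu, alpha, beta, t_end, seed=0):
    """Ogata's algorithm for a linear Hawkes network with kernels h_ji(s) = alpha[j, i] * exp(-beta * s)."""
    rng = np.random.default_rng(seed)
    M = len(mu)
    excitation = np.zeros(M)                               # kernel contributions to each phi_i at time t
    spikes = [[] for _ in range(M)]
    t = 0.0
    while True:
        lam_bar = mu.sum() + excitation.sum()              # Step 1: cumulated intensity = thinning bound
        t_next = t + rng.exponential(1.0 / lam_bar)        # Step 2: candidate next point of the joint process
        if t_next > t_end:
            return spikes
        excitation *= np.exp(-beta * (t_next - t))         # decay every intensity to the candidate time
        t = t_next
        phi = mu + excitation
        if rng.uniform() * lam_bar <= phi.sum():           # accept the candidate (thinning)
            i = rng.choice(M, p=phi / phi.sum())           # Step 3: emitter chosen proportionally to phi_i
            spikes[i].append(t)
            excitation += alpha[i]                         # Step 4: the spike excites the children of i

# Two neurons with illustrative parameters; alpha[j, i] is the weight of the connection j -> i
mu = np.array([0.2, 0.1])
alpha = np.array([[0.0, 0.5], [0.3, 0.0]])
out = ogata_hawkes_exp(mu, alpha, beta=2.0, t_end=50.0)

Note that every candidate point touches all M intensities; this is precisely the bottleneck discussed on the next slide.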
Bottleneck

Figure: Bottleneck of the classical Ogata algorithm. (a) Ogata's algorithm; (b) local graph.
Using context

Neuroscience:
Sparse graph (the number of synapses per neuron is large, but the connectivity relative to the network size is very low).
Low spiking rate (0.1 Hz-1 Hz on average)[14,15].
Large to very large networks (M ≫ 10^6)[16].

How can we take advantage of this particular background to improve execution time and memory footprint?

14 P. Lennie. "The cost of cortical computation". In: Current Biology 13.6 (2003), pp. 493-497.
15 S. Herculano-Houzel. "Scaling of brain metabolism with a fixed energy budget per neuron: implications for neuronal activity, plasticity and evolution". In: PLoS ONE 6.3 (2011), e17514.
16 S. Herculano-Houzel, B. Mota, and R. Lent. "Cellular scaling rules for rodent brains". In: Proceedings of the National Academy of Sciences 103.32 (2006), pp. 12138-12143.
Local Independence

Didelez[17]: Granger causality for point processes[18]
A point process P_b is said to Granger-cause another point process P_a when the past of P_b provides statistically significant information about the future occurrences of P_a.

Concept and graphical representation by Didelez[17]:

(Figure: local independence graph over three processes P_1, P_2, P_3)

At any time t, the next occurrence of P_3 is Granger-causally independent from P_1.

Property
The local independence graph and the interaction graph are the same for the Hawkes process.

17 V. Didelez. "Graphical models for marked point processes based on local independence". In: Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70.1 (2008), pp. 245-264.
18 F. Figueiredo, G. Resende Borges, P. O. S. Vaz de Melo, and R. Assunção. "Fast Estimation of Causal Interactions using Wold Processes". In: Advances in Neural Information Processing Systems 31. Curran Associates, Inc., 2018.
Event-based simulation

Event-based (discrete-event) simulation
Models the operation of a system as a discrete sequence of events in time. The system undergoes discrete jumps of state whenever an event occurs. It can be specified using the DEVS (Discrete-Event System Specification) formalism.

Successfully used for LIF neurons[18] and Izhikevich neurons[19].
Traditional simulation algorithms for point processes are event-based.
No previous work has been done to link the two fields.

18 A. Muzy, B. P. Zeigler, and F. Grammont. "Iterative specification as a modeling and simulation formalism for I/O general systems". In: IEEE Systems Journal 12.3 (2017), pp. 2982-2993.
19 G. L. Grinblat, H. Ahumada, and E. Kofman. "Quantized state simulation of spiking neural networks". In: Simulation 88.3 (2012), pp. 299-313.
Scheduler

A scheduler is a data structure that:
stores points (= 2-tuples (time, value)),
keeps them sorted at all times.

We use self-balancing trees to represent them.
Complexity of access and modification is constant (𝒪(1)) or logarithmic (𝒪(log₂ N)).

(Figure: a self-balancing tree of (time, value) pairs of height h ≤ log₂ N — root (7, v_2), with (3, v_3) and (12, v_4) below it, then (1, v_6), (5, v_7), (9, v_1), (13, v_0) and (10, v_5) — whose in-order traversal yields the sorted sequence (1, v_6) (3, v_3) (5, v_7) (7, v_2) (9, v_1) (10, v_5) (12, v_4) (13, v_0))
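A minimal scheduler sketch in Python, using the standard-library heapq module (a binary heap rather than the self-balancing trees mentioned above, a simplification made here for brevity); insertion and extraction of the earliest event are both O(log N).

import heapq

class Scheduler:
    """Keeps (time, value) events ordered by time; sketch only."""

    def __init__(self):
        self._heap = []

    def schedule(self, time, value):
        heapq.heappush(self._heap, (time, value))     # O(log N)

    def next_event(self):
        return heapq.heappop(self._heap)              # earliest event, O(log N)

    def __len__(self):
        return len(self._heap)

sched = Scheduler()
for time, value in [(7, "v2"), (3, "v3"), (12, "v4"), (1, "v6")]:
    sched.schedule(time, value)
print(sched.next_event())   # (1, 'v6')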
How to represent functions?

Use piecewise constant functions to approximate the real interaction function.

(Figure: thinning against a piecewise constant upper bound; a candidate at T_1 is accepted or rejected according to the ratio φ_{T_1}(x_{T_1}) / Γ_{T_1}, where Γ_{T_1} is the local value of the bound)
Efficient representation of a piecewise constant function

A piecewise constant function is stored as a sorted sequence of (breakpoint, value) pairs.
Example: a function worth 1 on [0, 1), 2 on [1, 2), 3 on [2, 3), 2 on [3, 4), 1 on [4, 5) and 0 afterwards is stored as
(0, 1) (1, 2) (2, 3) (3, 2) (4, 1) (5, 0)

Adding a second piecewise constant function (e.g. an interaction term) to φ_t(x_t) merges their breakpoints:
(0, 1) (1, 2) (1.5, 2.5) (2, 3.5) (2.5, 4) (3, 3) (3.5, 2.5) (4, 1.5) (4.5, 1) (5, 0)

Alternatively, store the jumps (increments) instead of the values:
(0, 1) (1, 1) (2, 1) (3, −1) (4, −1) (5, −1)
so that the sum of two functions is simply the merge of their jump sequences:
(0, 1) (1, 1) (1.5, .5) (2, 1) (2.5, .5) (3, −1) (3.5, −.5) (4, −1) (4.5, −.5) (5, −1)
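A short sketch of the increment representation (names are illustrative): because each function is a sorted list of (time, jump) pairs, the sum of two functions is a plain sorted merge, reproducing the last sequence above.

from heapq import merge

def add_pcwise(f, g):
    """Sum of two piecewise constant functions encoded as sorted (time, jump) lists."""
    out = []
    for t, jump in merge(f, g):                        # merge keeps the breakpoints sorted
        if out and out[-1][0] == t:
            out[-1] = (t, out[-1][1] + jump)           # coincident breakpoints: add the jumps
        else:
            out.append((t, jump))
    return out

f = [(0, 1), (1, 1), (2, 1), (3, -1), (4, -1), (5, -1)]
g = [(1.5, 0.5), (2.5, 0.5), (3.5, -0.5), (4.5, -0.5)]
print(add_pcwise(f, g))
# [(0, 1), (1, 1), (1.5, 0.5), (2, 1), (2.5, 0.5), (3, -1), (3.5, -0.5), (4, -1), (4.5, -0.5), (5, -1)]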
Local graph algorithm

Step 0 (initialization): schedule all possible next points, computed independently for each process (scheduler: t_5, t_3, t_8, t_1, t_6, …).

Step 1: choose the event point with minimum time (here t_5, the next point of neuron 5).

Step 2: update only the processes that are children of the emitter in the interaction graph:
update φ_{6,t}(x_t), φ_{8,t}(x_t) and φ_{9,t}(x_t) with the new spike from 5.

Step 3: reschedule the next points of the emitting process and of the updated processes, then go back to Step 1.
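The steps above can be turned into a compact sketch for the exponential-kernel Hawkes case (an illustration under stated assumptions — exponential kernels, positive spontaneous rates, lazy invalidation of stale scheduled points — and not the SPIKES implementation): each spike triggers work only for the emitter and its children.

import heapq
import numpy as np

def local_graph_sim(mu, alpha, beta, children, t_end, seed=0):
    """Local-graph simulation of a Hawkes network with kernels h_ji(s) = alpha[j, i] * exp(-beta * s)."""
    rng = np.random.default_rng(seed)
    M = len(mu)
    excitation = np.zeros(M)            # kernel contribution to phi_i, valid at last_update[i]
    last_update = np.zeros(M)
    version = np.zeros(M, dtype=int)    # stale scheduled points are recognised by their version
    spikes = [[] for _ in range(M)]

    def phi(i, t):
        return mu[i] + excitation[i] * np.exp(-beta * (t - last_update[i]))

    def draw_next(i, now):
        """Tentative next point of neuron i by thinning; phi_i only decays until its next update."""
        t = now
        while True:
            bound = phi(i, t)                          # valid bound while phi_i keeps decaying
            t += rng.exponential(1.0 / bound)
            if rng.uniform() * bound <= phi(i, t):
                return t

    schedule = [(draw_next(i, 0.0), i, 0) for i in range(M)]
    heapq.heapify(schedule)
    while schedule:
        t, i, v = heapq.heappop(schedule)              # Step 1: earliest tentative point
        if v != version[i]:
            continue                                   # stale entry: neuron i was updated meanwhile
        if t > t_end:
            break
        spikes[i].append(t)
        for c in children[i]:                          # Step 2: update only the children of i
            excitation[c] = excitation[c] * np.exp(-beta * (t - last_update[c])) + alpha[i, c]
            last_update[c] = t
            version[c] += 1                            # Step 3: reschedule every updated child
            heapq.heappush(schedule, (draw_next(c, t), c, version[c]))
        version[i] += 1                                # and the emitter itself
        heapq.heappush(schedule, (draw_next(i, t), i, version[i]))
    return spikes

mu = np.array([0.2, 0.1])
alpha = np.array([[0.0, 0.5], [0.3, 0.0]])             # alpha[j, i]: weight of the connection j -> i
out = local_graph_sim(mu, alpha, beta=2.0, children=[[1], [0]], t_end=50.0)

The per-event cost is thus proportional to the out-degree of the spiking neuron (plus the logarithmic scheduler operations) instead of the full network size, which is what the complexity comparison below formalises.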
Comparison of theoretical complexities

Complexities of the classical versus local-graph algorithms[20]
For a connectivity degree bounded by a small d,
and for balanced spiking rates in the network,
the complexities are
𝒪(T M² d log₂(dM))   vs.   𝒪(T M d [d + log₂ M]).

20 C. Mascart, A. Muzy, and P. Reynaud-Bouret. "Efficient Simulation of Sparse Graphs of Point Processes". In: arXiv preprint arXiv:2001.01702 (2020).
Comparison of execution times

Execution time on "small" and sparse graphs, on a single core[21].

21 C. Mascart, A. Muzy, and P. Reynaud-Bouret. "Efficient Simulation of Sparse Graphs of Point Processes". In: arXiv preprint arXiv:2001.01702 (2020).
Connection matrices

The interaction functions form an M × M matrix (h_{ji}); its support is the binary connection matrix. For the 9-neuron example:

0 1 0 1 1 0 0 0 0
0 0 1 0 1 1 0 0 0
0 0 0 0 0 1 0 0 0
0 0 0 0 1 0 1 1 0
0 0 0 0 0 1 0 1 1
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0

Connection matrices
Full connectivity matrices can be very large.
Example: M × M × 64 bits = 64 · 10^12 bits ≈ 64 Tbit for M = 10^6.
But connection matrices are very sparse! Can we store only the connections?
(Old) Yale's format

Yale's format (also called compressed sparse row (CSR) or compressed row storage (CRS)) stores, for each row, the number of nonzero entries and the column indices of the connections. For the 9-neuron example, row 1 has 3 connections, to columns 2, 4 and 5:

counts:  (3, 3, 1, 3, 3, 1, 1, 1, 0)
columns: (2 4 5), (3 5 6), (6), (5 7 8), (6 8 9), (9), (8), (9), ()

Can we do better? Procedural connectivity.
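A small sketch (illustrative names; the counts-plus-columns variant shown above rather than the offset-based CSR layout) that builds the two vectors from the 9-neuron binary matrix:

import numpy as np

def to_yale(dense):
    """Per-row counts and 1-based column indices of a binary connection matrix."""
    counts = [int(row.sum()) for row in dense]
    columns = [list(np.flatnonzero(row) + 1) for row in dense]   # 1-based, as on the slide
    return counts, columns

A = np.array([
    [0,1,0,1,1,0,0,0,0],
    [0,0,1,0,1,1,0,0,0],
    [0,0,0,0,0,1,0,0,0],
    [0,0,0,0,1,0,1,1,0],
    [0,0,0,0,0,1,0,1,1],
    [0,0,0,0,0,0,0,0,1],
    [0,0,0,0,0,0,0,1,0],
    [0,0,0,0,0,0,0,0,1],
    [0,0,0,0,0,0,0,0,0],
])
counts, columns = to_yale(A)
print(counts)       # [3, 3, 1, 3, 3, 1, 1, 1, 0]
print(columns[0])   # [2, 4, 5]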
Procedural connectivity[22,23]

Idea: use the determinism of (pseudo) random-number generators. Instead of storing the connectivity matrix, store one seed per neuron and regenerate its row of connections on demand:

(S_1)        (C_11 C_12 … C_1M)
(S_2)   ⟶    (C_21 C_22 … C_2M)
( ⋮ )        (  ⋮    ⋱   ⋱   ⋮ )
(S_M)        (C_M1   ⋱   ⋱  C_MM)

22 P. Grazieschi, M. Leocata, C. Mascart, J. Chevallier, F. Delarue, and E. Tanré. "Network of interacting neurons with random synaptic weights". In: ESAIM: Proceedings and Surveys 65 (2019), pp. 445-475.
23 J. C. Knight and T. Nowotny. "Larger GPU-accelerated brain simulations with procedural connectivity". In: Nature Computational Science 1.2 (2021), pp. 136-142.
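A hedged sketch of the idea (the Erdős-Rényi connectivity rule, the seeding scheme and all names are assumptions for illustration): the outgoing connections of a neuron are regenerated deterministically from its seed whenever they are needed, so the M × M matrix is never stored.

import numpy as np

def children_of(i, M, p, master_seed=12345):
    """Regenerate the outgoing connections of neuron i from a per-neuron random seed."""
    rng = np.random.default_rng([master_seed, i])   # deterministic per-neuron stream
    return np.flatnonzero(rng.random(M) < p)

M = 1_000_000
print(children_of(42, M, p=1e-4)[:5])
print(np.array_equal(children_of(42, M, 1e-4), children_of(42, M, 1e-4)))  # True: same seed, same row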
Results

Execution time and memory consumption on large graphs, on a single core[24].

(Figure: log-log plots of execution time (s) and maximum memory (GB) against the number of neurons, from 10^5 to 10^8, for a 5 s Hawkes simulation with connectivity degree d = 250, 500 and 1000; execution time grows with slopes 1.12-1.15 and memory with slopes 0.95-0.97.)

24 C. Mascart, G. Scarella, P. Reynaud-Bouret, and A. Muzy. "Simulation scalability of large brain neuronal networks thanks to time asynchrony". In: bioRxiv (2021).
Goodness-of-fit tests[25]

Why?
To test how well a model fits experimental data.
To validate simulation software.

How?
Ogata's tests (time-rescaling theorem + autocorrelation function).
Martingale property of point processes.

25 C. Mascart, A. Muzy, and P. Reynaud-Bouret. "Efficient Simulation of Sparse Graphs of Point Processes". In: arXiv preprint arXiv:2001.01702 (2020).
Ogata's goodness-of-fit tests

Reminder: time-rescaling theorem
Let T_1 < ⋯ < T_n be a realisation of the point process over (0, T].

Φ_t(x_t) = ∫_0^t φ_s(x_s) ds,   with ∀t ∈ (0, T], Φ_t(x_t) < ∞.

{Φ_{T_k}(x_{T_k}) / T_k ∈ x_T} is a unit-rate Poisson process.

Tests, for each process i:
{Φ_{i,T_k} / T_k ∈ x_T} should be distributed as 𝒰(0, Φ_{i,T}).
For T_k < T_{k+1}, the increments {Φ_{i,T_{k+1}} − Φ_{i,T_k}}_{k=1,…,n−1} should be ℰ(1).
Test that the delays between the rescaled points {Φ_{i,T_k} / T_k ∈ x_T} are independent (e.g. auto-correlation test).
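A sketch of how these tests can be run in practice (scipy-based and illustrative only; using the last rescaled point as a stand-in for Φ_{i,T} and a lag-1 autocorrelation are simplifying assumptions):

import numpy as np
from scipy import stats

def ogata_gof(spike_times, compensator):
    """Ogata-style goodness-of-fit checks for one process, given its fitted compensator t -> Phi_t."""
    rescaled = np.array([compensator(t) for t in spike_times])
    gaps = np.diff(rescaled)
    ks_exp = stats.kstest(gaps, "expon")                          # increments should be Exp(1)
    ks_unif = stats.kstest(rescaled / rescaled[-1], "uniform")    # rescaled times approximately uniform
    lag1 = np.corrcoef(gaps[:-1], gaps[1:])[0, 1]                 # crude lag-1 autocorrelation check
    return ks_exp.pvalue, ks_unif.pvalue, lag1

# Check on a homogeneous Poisson(2) realisation, whose compensator is Phi_t = 2t
rng = np.random.default_rng(1)
times = np.cumsum(rng.exponential(1 / 2, size=500))
print(ogata_gof(times, lambda t: 2 * t))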
Time transformation

Using the Martingale property of Hawkes processes

Theorem
Let N be a point process with intensity φ_t(x_t), associated counting process N_t and compensator Φ_t(x_t). Then the difference

M_t = N_t − Φ_t(x_t)

is a martingale. The property remains true after integration with respect to a predictable process ψ_t:

X = ∫_0^T ψ_t (dN_t − dΦ_t),   E[X] = 0.
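The zero-mean property is what the scatter plots on the next slide check empirically. A minimal Monte Carlo sanity check under simplifying assumptions (homogeneous Poisson process, deterministic hence predictable ψ_t; names and values are illustrative):

import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(2)

def martingale_integral(rate, t_end, psi):
    """X = integral of psi_t (dN_t - dPhi_t) for a homogeneous Poisson(rate) process; E[X] should be 0."""
    spikes = np.cumsum(rng.exponential(1.0 / rate, size=int(3 * rate * t_end) + 20))
    spikes = spikes[spikes <= t_end]
    jump_part = psi(spikes).sum()                 # integral of psi against dN_t: sum over the spikes
    drift_part = rate * quad(psi, 0, t_end)[0]    # integral of psi against dPhi_t = rate * dt
    return jump_part - drift_part

samples = [martingale_integral(rate=2.0, t_end=20.0, psi=np.sin) for _ in range(2000)]
print(np.mean(samples))   # close to 0, within Monte Carlo error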
Martingale property

(Figure: scatter plots of the martingale "Difference" for three panels — "Node b, spontaneous + connected nodes", "Node b, disconnected", "Node a, spontaneous + disconnected" — with a legend distinguishing spontaneous, parents, grand-parents and disconnected nodes.)
36/39
Conclusion

New algorithm for (Hawkes) point process simulation, on single core

37/39
Conclusion

New algorithm for (Hawkes) point process simulation, on single core

Able to simulate networks with large number of elements (done


108 processes, number of neurons in small monkey brain)

37/39
Conclusion

New algorithm for (Hawkes) point process simulation on a single core.
Able to simulate networks with a large number of elements (10^8 processes simulated, about the number of neurons in a small monkey brain).
Exploits the graph sparsity and the relatively low spiking rate.
Perspectives

The simulation algorithms can be applied to more general classes of point processes.
Add parallelization to the algorithm.
Help improve the performance of Deep Spiking Neural Networks[26,27].

26 H. Mei and J. Eisner. "The neural Hawkes process: A neurally self-modulating multivariate point process". In: arXiv preprint arXiv:1612.09328 (2016).
27 N. Du, H. Dai, R. Trivedi, U. Upadhyay, M. Gomez-Rodriguez, and L. Song. "Recurrent marked temporal point processes: Embedding event history to vector". In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016, pp. 1555-1564.
The end?

Thank you for your attention!
