
Gabor feature processing in spiking neural networks from retina-inspired data

Aristeidis Tsitiridis, Cristina Conde, Isaac Martin de Diego, Jose Sanchez del Rio Saez, Jorge Raul Gomez, and
Enrique Cabello
Department of Computer Science and Statistics
King Juan Carlos University
Madrid, Spain

Abstract- In recent years, there has been a growing interest in dynamic vision sensors due to their incredible advantages in speed, computational cost and power consumption. These new vision sensors have been inspired from biological retinae and use asynchronous address-event representation for visual information instead of a series of snapshots taken from traditional frame-based devices. Spiking neurons are biologically-plausible artificial neurons that process information in sequences of time events and are particularly suited for processing address-event information. A novel and refined biologically-inspired Gabor feature approach based on spiking neural networks is presented here. This approach utilises the retina-inspired data from dynamic vision sensors with Gabor edge detection in a hierarchical structure that has been populated with Leaky-Integrate and Fire neurons that have been trained via the Remote Supervision Method. The number of active spiking neurons at each time instance depends on the number of time events. This idea provides a flexible approach that avoids unnecessary computations and complexity. The biologically-inspired model developed for this preliminary work has shown promising results and has laid the foundation for a rapid parallel object recognition model designed for the new retina-like address-event representation sensors.

Keywords-Biologically-inspired machine vision; Gabor edge detection; Spiking neural networks; Hierarchical vision model; Address-Event Representation (AER)

978-1-4799-1959-8/15/$31.00 ©2015 IEEE

I. INTRODUCTION

An investigation of biological visual perception properties in primates can quickly reveal certain irrefutable advantages. It can be fast and efficient under various conditions, for example, humans can recognise faces in different poses in as little as 140ms [1]. In addition, these cognitive operations consume relatively small amounts of energy and portray adaptation to a wide spectrum of real-world situations. Remarkably, all these visual cognitive operations occur in parallel with other senses in a seamless manner. Therefore, it is not surprising that researchers across various fields, over the last years, have intensified their focus on harnessing biologically-inspired techniques for their applications.

Dynamic vision sensors (DVS) [2] [3] mimic certain biological traits of the retinae. The continuous data output of these sensors is an asynchronous stream of reflectance variations encoded as address-events. The Address-Event Representation (AER) [4], i.e. reflectance events asynchronously encoded in x, y pixel coordinates, resembles the precisely timed electrical impulses or spikes of the spatially arranged optical nerves stemming from the retinae to the primary visual cortex.

The most prevalent method for constructing biologically-inspired vision models requires alternating hierarchical layers following the early visual processing stages [5]. Neurons at higher layers progressively exhibit a combination of selectivity and invariance to object translations such as size, position, rotation, depth etc. In the recent past, there had been many models and variants that employed this kind of architecture such as the Neocognitron [6], Convolutional neural network [7], and Hierarchical model and X (HMAX) [8], and all of them produced promising results for a variety of object recognition tasks. However, these models had been solely applied for frame-driven visual scenarios which considerably increased their computational cost especially as the complexity of a given situation rose. The high customisation for images or frames from videos had made their integration with temporal features and consequently video in real-time, exceptionally difficult and eventually implausible with biology.

Soon after the emergence of AER retina sensors, several models were introduced to harness their approach to visual representation. A convolutional neural network was directly applied in [9] for object recognition with minor alterations. Although this work reported a frame-based architecture that was applied directly to retina sensors, its fast recognition response and rate offer a promising insight to methodologies that share similar hierarchical characteristics. Edge detection was performed by applying Gabor filter templates directly on AER images. Furthermore, this model was extended to a neuromorphic engineering application with an event-driven convolution module [10] and combined with AER sensors was employed for high-speed recognition examples. The spiking neuron model presented in [11], was a biologically-plausible approach that captured temporal visual information and learned features in an unsupervised manner based on the Spike-Timing Dependent Plasticity (STDP) [12]. This hierarchical Spiking Neural Network (SNN) performed,
similarly to the model in [9], fast and accurately in tests for tracking cars. While the topology of this network was biologically-plausible, there was no provision for edge detection operations to match the activity of V1 cells, i.e. neurons located in area V1 of the visual cortex. More recently, the model in [13] proposed a convolutional neural network with Gabor filter templates for edge detection and a SNN classifier (Tempotron) for object recognition with impressive results compared to HMAX.

Inspired from the early hierarchical operations of the primary visual cortex, this work presents a new Gabor feature-based model of feed-forward Leaky-Integrate and Fire (LIF) spiking neurons that acts as precursor work for a future object recognition model that can fully exploit retina sensor data in real-time. The LIF neurons are trained as V1 simple cells to perform orientation selective edge detection spatiotemporally. In contrast to previous models where Gabor filters are directly convolved over the original image, in this work, spiking neurons are created to exploit the asynchronous temporal nature of AER information. Subsequently, a layer of neurons with properties similar to complex cells pool AER information via spatial summation with greater Receptive Fields (RF) following the principles of their biological inspiration [5].

II. METHODOLOGY

This section presents the development steps taken to create a biologically-inspired model capable of processing AER information. The discussion centres on the nature of AER data, the fundamental processing units to carry the subsequent biological-like operations, the learning rule to appropriately tune neurons, and the overall model structure layer by layer.

AER data

DVS provide a wide dynamic range with low response latency. Notably, incoming visual events are processed in the time domain real-time in contrast to conventional cameras that capture visual information in regular intervals (frames) regardless of their significance. Another advantage of DVS is their extremely low power consumption which makes them particularly suitable for a wide selection of applications where power management is critical, e.g. navigation and robotics. At this moment, two major drawbacks of this type of sensors are the low spatial resolution and limited spectral information.

The biological retinae generate continuous electrical impulses to represent differences in their perceivable light wavelengths. This information is propagated with retinal ganglion cells (optic nerves) via the lateral geniculate nucleus as far as the primary visual cortex. Similarly to this, Address-Event Representation is a protocol of asynchronous transmission of reflection changes in the form of digital spike events. These continuous visual signals are encoded in spatial (x, y) addresses of pixels from DVS and time is represented by their occurrence. Therefore, within a given time window for address-events (AE):

$$AE_{timewindow} = \int_{t_{start}}^{t_{end}} AE_{xy}(t_{ev})\, dt_{ev} \qquad (1)$$

In equation (1), $t_{start}$ and $t_{end}$ show the time in which the address-events have been processed. These input visual events are directly supplied to LIF spiking neurons.

A. Spiking neurons

For many years, extensive biological research on the mammalian brain has been continuously unveiling information on its structure and mechanisms. It is known that biological neurons are the building block of all information processing in the brain and that they are connected in complicated patterns to fulfil everyday activities, such as the extremely demanding visual tasks. Mimicking biological neuron behaviour, SNN models have emerged to improve our understanding of the brain and indirectly harness some of its capabilities in potential applications. SNN simulate many biological properties with the main compromise being between biological plausibility and computational efficiency. Amongst several models examined for this work, LIF neurons [14] with a refractory period were preferred since they are computationally efficient. The LIF neuron simulates general membrane behaviour with less emphasis on other aspects such as gate voltages. The LIF neuron temporal operation can be described by the following equation:

$$\tau_m \frac{dV_m(t)}{dt} = -V_m(t) + R I_s(t) \qquad (2)$$

In equation (2), the decay term for the neuron membrane is given by $\tau_m = RC$ where $R$ is the membrane resistance and $C$ is the membrane capacitance. Furthermore, $V_m$ is the membrane potential and $I_s$ the synaptic current. The synaptic current in a neuron is usually expressed as the total sum of currents from excitatory and inhibitory synapses connected to the neuron itself. As the neuron is depolarised it reaches a threshold voltage $V_{th}$ and then generates a spike (event). The refractory period following repolarisation, restricts the generation of further spikes immediately after a spike was transmitted and adds realism to the model.

B. The learning rule - ReSuMe

Much like V1 simple cells receiving optical nerve impulses, the LIF neurons are expected to process incoming AER information. A learning rule is employed so that LIF neurons are instructed to work like simple V1 cells (section C). This learning rule is REmote SUpervision MEthod (ReSuMe) [15] and it is a biologically-inspired supervised learning method for spiking neurons that concentrates on the precise timing of their spikes. More specifically, it utilises the learning windows approach, originally introduced from Hebbian learning for spike trains [16], in a supervised manner. This learning approach expands from the well-known unsupervised method STDP [17]. In ReSuMe learning, two opposite mechanisms are balanced comparably to STDP and take the form of functions within specific learning windows in time. Its first rule states that an excitatory synapse is facilitated if a presynaptic spike is transmitted and vice versa, an inhibitory synapse is depressed for a similar situation. Its second rule states that an excitatory synapse is depressed if a spike is received directly
before a postsynaptic spike and vice versa, an inhibitory synapse is facilitated for a similar situation. The synaptic efficacies (w) equation for ReSuMe between a presynaptic neuron $n_k$ and a postsynaptic neuron $n_i$ is expressed as:

$$\frac{d}{dt} w_{ki}(t) = S^d(t)\left[a^d + \int_0^{\infty} W^d(s^d)\, S^{in}(t - s^d)\, ds^d\right] + S^l(t)\left[a^l + \int_0^{\infty} W^l(s^l)\, S^{in}(t - s^l)\, ds^l\right] \qquad (3)$$

In equation (3), $S^d$ is the desired signal of spikes, $S^l$ the learning signal of spikes, $a^d$ and $a^l$ the respective non-Hebbian amplitudes. $W^d$ and $W^l$ are the learning windows given by the following equations:

$$W^d(s^d) = \begin{cases} +A^d \exp\left(-\dfrac{s^d}{\tau^d}\right) & \text{if } s^d > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

$$W^l(s^l) = \begin{cases} -A^l \exp\left(-\dfrac{s^l}{\tau^l}\right) & \text{if } s^l > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (5)$$

In equations above, $W^d$ and $W^l$ are the exponential windows for the desired and learning signals. $A^d$ and $A^l$ are constants and can be positive for excitatory synapses and negative for inhibitory. $\tau^d$ and $\tau^l$ are positive time constants.

C. V1 Cells - Gabor edge detection

V1 simple cells in the primary visual cortex process incoming visual data [5] from the retinae and perform edge detection operations for subsequent layers of the visual cortex. Gabor filters have been found to match the response of V1 cells to oriented bars or gratings and, as such, have been widely used to model that response [18]-[20]. Their properties are essentially encoded with ReSuMe as discussed in the previous section.

Fig. 1. An example of a 7x7 Gabor filter at the 8 different orientations used by the model.

A Gabor filter is a linear filter defined as the product of a complex sinusoid with a 2D Gaussian envelope and for values in pixel coordinates (x,y), it is expressed as:

$$G(x,y) = \exp\left(-\frac{X^2 + \gamma^2 Y^2}{2\sigma^2}\right) \cos\left(\frac{2\pi}{\lambda} X\right) \qquad (6)$$

$$X = x\cos\theta - y\sin\theta \qquad (7)$$

$$Y = -x\sin\theta + y\cos\theta \qquad (8)$$

In equation (6), $\gamma$ is the aspect ratio and in this work is set to 1. Parameter $\lambda$ is known as the wavelength of the cosine factor and together with parameter $\sigma$, the effective width, specify the spatial tuning accuracy of the Gabor filter. In order to preserve the spatial integrity of the objects the chosen number of orientations used is eight (Fig. 1). It has been shown that the V1 cell RF sizes may vary considerably [21], [22] and can adapt and reorganise [23] during the lifetime of the visual cortex. This principle is adopted here and Gabor filter parameters in addition to their sizes are explained further in the experimental setup section. Parameterisation follows the tuning properties of V1 parafoveal cells that are exhibited on average as a response to visual stimuli [24].

D. Model design

The hierarchical architecture of the early processing steps in the visual cortex has been reported in many past findings [25]-[27]. This topology has also been proven efficient in its adaptation to many applications in object and face recognition [7], [8]. The main objective of such a topology is the progressive creation of a view-invariant representation of objects with some important invariance properties being size, position, rotation and illumination. Similarly, the future goal of the model is to obtain enhanced object invariance properties. Hence, this view-invariant approach (Fig. 2) is also partly followed here.

[Fig. 2 diagram labels, bottom to top: Input layer (128x128), S1 layer, C1 layer]

Fig. 2. Algorithm diagram.

1) Input to S1 layer

The first layer is named S1 layer and it consists of the simple cells created from the process previously described. These cells essentially perform edge detection operations on all incoming AER information, arranged in a spatial layer which covers the entire extent of the retina sensor data. The fixed AER array size of 128x128 is processed by various S1 RF sizes in the input layer with a subsampling factor of 2
which means that for each different RF size there is a different cell population.

For instance in the input layer and assuming a RF size of 3x3 with a total number of 142884 incoming retina events, then potentially 15876 neurons can exist each with 72 synapses that produce a total of 1143072 synapses. However, the simultaneous operations of all of these neurons are necessary only when the entire input layer array is transmitting events. In practice, the number of asynchronous events is much less and respectively for each RF a smaller number of neurons that fire. The algorithm examines only incoming events, showing a significant advantage over traditional sensors that may process redundant information with models which may need more resources and time to process image frames as a whole.

Lateral inhibition is a well-studied phenomenon of biological vision and is implemented in this work for both S1 and C1 layers. Lateral inhibition is a mechanism which promotes the activity of maximally firing spikes by reducing the activity of their neighbours. In some respects this activity can be viewed as a biological threshold technique when excessive noise is present and as such is treated here, i.e. a noise filter.

2) S1 to C1 layer

In the next layer referred as C1, complex cells receive local edge information from S1 layer neurons and pool their responses across all different RF sizes and orientations. The pooling operation is achieved via spatial summation [28]-[30]. Spatial summation, i.e. the integration of spike events from various retinal RF, is another established biological function of vision that explains how edge features propagate in higher layers of the visual cortex. Spatial summation is directly applied over each RF and for all orientations:

(9)

Equation (9) indicates that a C1 neuron response r is the sum of vector values x between all orientations i of a particular RF size k. Following spatial summation, lateral inhibition is applied according to the approach proposed in [31]:

(10)

The threshold value r2 is controlled by the inhibition value h which is affected by the minimum and maximum values of the entire visual field. Lateral inhibition then takes place for every rj value lower than the threshold value r2.

III. EXPERIMENTAL SETUP

This section focuses on the preliminary experiments conducted for testing the SNN model of this work. AER processing and software code development for the model, were accomplished in the MATLAB environment with standard pc hardware (Fig. 3). All videos were originally recorded with jAER, a Java interface software specifically designed for retina sensors which is publicly available [34]. The hardware setup consisted of the retina sensor [35] and an USBAERmini2 board [4] which enables timestamp synchronised capturing of AER events.

Fig. 3. Outline of the hardware and software setup.

LIF neurons are pre-trained using ReSuMe with the Gabor parameters (as explained in section II.C and D) and should respond maximally to the specific orientations of edges they are tuned for. More specifically, Gabor filter amplitude values are scaled and encoded to spike trains in the time domain.

As shown in Fig. 4, each of the input neurons fires at precisely timed instances forming temporal patterns of Gabor filters. Higher vector values are translated as delayed responses in the time domain and vice versa, lower values to faster responses. Since these responses are scaled the exact firing time of each of the input neurons directly depends on the chosen time period. All these responses are applied to ReSuMe in order to train multiple LIF neurons that after training will behave as Gabor filters.

[Fig. 4 plots, panel titles: Input spike trains; horizontal axis: time [s]]

Fig. 4. 3x3 and 7x7 Gabor templates at 90 degrees. Top row shows the original templates and the bottom row the input spiking neurons firing at precise instances (events) in time.
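As a rough illustration of the encoding step described above, the following Python sketch evaluates equations (6) to (8) on a small grid and then maps every amplitude to a single firing time. The function names, the linear amplitude-to-latency scaling and the 0.25 s period are assumptions made for this example (the period merely mirrors the approximate extent of the time axis in Fig. 4); the σ and λ values are those listed for the 3x3 RF in Table 1. The original work was implemented in MATLAB, so this is not the authors' code.

```python
import math

def gabor_template(size, sigma, lam, theta, gamma=1.0):
    """Evaluate equation (6) on a size x size grid centred on the origin."""
    half = size // 2
    rows = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotated coordinates, equations (7) and (8) as printed in the text.
            X = x * math.cos(theta) - y * math.sin(theta)
            Y = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(X * X + gamma * gamma * Y * Y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * X / lam))
        rows.append(row)
    return rows

def latency_encode(template, period=0.25):
    """Map each amplitude to one firing time in [0, period].

    Assumption: the scaling is linear. Higher amplitudes fire later and
    lower amplitudes fire earlier, as stated in the text.
    """
    flat = [v for row in template for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat template
    return [period * (v - lo) / span for v in flat]

# 3x3 template at 90 degrees with the Table 1 values sigma = 1.4, lambda = 2.
template = gabor_template(size=3, sigma=1.4, lam=2.0, theta=math.pi / 2)
spike_times = latency_encode(template)
```

Each entry of `spike_times` would be the firing time of one of the 9 input neurons of a 3x3 RF; a different choice of `period` rescales all firing times proportionally.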
[Fig. 5 plots, panels a) to c); recovered axis labels: input #, epoch #]
Fig. 5. An example of a Gabor neuron training process of 300 epochs for a 7x7 RF size at 45°. a) The lower row of spike events (green dots) is
generated by membrane potential spikes which vary in each epoch and readjust according to a Poisson distribution (red dots). b) Synaptic weight
values of excitatory (positive) and inhibitory (negative) synapses. c) Correlation values for 300 training epochs
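The training process summarised in Fig. 5 rests on two ingredients defined in the methodology: the LIF membrane dynamics of equation (2) and the ReSuMe update of equations (3) to (5). The Python sketch below is one possible discrete reading of both, not the authors' MATLAB implementation. It assumes forward-Euler integration for the neuron and treats spike trains as lists of spike times, so that integrating equation (3) over a trial reduces to sums over spike pairs. All parameter values, the function names and the choice of a learning amplitude equal and opposite to the desired one are assumptions for illustration only.

```python
import math

def simulate_lif(current, dt=1e-4, tau_m=0.01, resistance=1.0,
                 v_th=1.0, v_reset=0.0, t_ref=0.002):
    """Forward-Euler integration of tau_m * dV/dt = -V + R * I_s (equation 2)."""
    v, spikes, ready_at = v_reset, [], 0.0
    for step, i_s in enumerate(current):
        t = step * dt
        if t < ready_at:          # refractory period: hold the membrane at reset
            v = v_reset
            continue
        v += (dt / tau_m) * (-v + resistance * i_s)
        if v >= v_th:             # threshold crossing emits a spike (event)
            spikes.append(t)
            v = v_reset
            ready_at = t + t_ref
    return spikes

def window(s, amplitude, tau):
    """Exponential learning window: amplitude * exp(-s / tau) for s > 0, else 0."""
    return amplitude * math.exp(-s / tau) if s > 0 else 0.0

def resume_delta_w(input_spikes, desired_spikes, output_spikes,
                   a_d=0.01, a_l=-0.01, A_d=0.1, A_l=0.1,
                   tau_d=0.01, tau_l=0.01):
    """Weight change of one synapse over a trial (equation 3 with spike lists)."""
    dw = 0.0
    for t_d in desired_spikes:    # desired spikes strengthen recently active inputs
        dw += a_d + sum(window(t_d - t_in, A_d, tau_d) for t_in in input_spikes)
    for t_l in output_spikes:     # actual output spikes weaken them (equation 5)
        dw += a_l + sum(window(t_l - t_in, -A_l, tau_l) for t_in in input_spikes)
    return dw

spikes = simulate_lif([2.0] * 1000)                       # supra-threshold drive
silent = simulate_lif([0.5] * 1000)                       # sub-threshold drive
dw_missing = resume_delta_w([0.010], [0.015], [])         # desired spike not produced
dw_extra = resume_delta_w([0.010], [], [0.015])           # unwanted output spike
dw_match = resume_delta_w([0.010], [0.015], [0.015])      # output equals target
```

With these assumed settings the update is positive when a desired spike is missing, negative when an undesired spike is emitted, and vanishes once the output train coincides with the desired one, which is the balance between the two opposite mechanisms that the text describes.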

The number of input neuron events processed by each Gabor neuron depends on the respective RF size, e.g. for a small RF size of 3x3, the total number of input neurons is 9. LIF neurons are trained for 8 different orientations and for 4 different RF sizes (Table 1). This means that in this work the total number of pre-trained types of Gabor-like neurons is 32.

Table 1. RF sizes and their respective σ and λ values.

RF sizes    σ      λ
3x3         1.4    2
5x5         2      2.8
7x7         2.8    3.5
9x9         3.6    4.6

ReSuMe is an efficient learning rule and requires a relatively small number of epochs before neurons have reached their highest possible training score. The training score is calculated as the highest cross-correlation number between membrane potential spikes and desired output spikes. Training scores are most successful when they approach an absolute value of 1. In practice, reaching an absolute score is not important as discussed further below.

The output spike pattern is set according to a random predefined Poisson distribution (Fig. 5a). In past literature, the Poisson-like distributions are known to exist in the primary visual cortex and have often been examined in bio-inspired systems [32], [33]. Positive synaptic weight values signify excitatory synapses and conversely, negative values indicate inhibitory synapses (Fig. 5b) and with every training epoch, synaptic weight values rearrange according to the equations presented in the previous section to match the desired spike train response $S^d$ (equation 3). As shown in Fig. 5c, the correlation value between the desired train response $S^d$ and the output spike train $S^l$ reaches a maximum after a relatively small number of training epochs. Membrane spikes are monitored throughout the training process and spikes with the highest correlation score achieved from all epochs, are stored along with their respective synaptic weights.

The Poisson-like membrane spikes and synaptic weights are the necessary parameters to define Gabor simple neurons (or simple cells) tuned at that particular orientation and RF. In the S1 layer, all correlation measurement values higher than a pre-set threshold value are neglected. If an edge exists within a certain RF area of S1 cells then their membrane spikes emit patterns similar to the pre-trained spike trains. Therefore, the cross-correlation difference between these S1 unit responses and the pre-trained responses can be measured. These measurement values fluctuate from 0 to 1, and like weights, they are multiplied with the incoming signals to indicate how strong the presence of a particular edge is.

IV. RESULTS

The SNN model is first tested against ideal shapes of 0 and 1 values without the presence of noise. Fig. 6 shows a simple example of a solid fill circle processed by the model with various RF sizes in the S1 layer. For the given circle the 3x3 RF size produces the closest circular shape to the original.

Fig. 6. From left to right, the leftmost input circle image is processed with various RF sizes of Gabor neurons, at 3x3, 5x5, 7x7 and 9x9 producing the respective S1 layer results.
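The S1 results above depend on the cross-correlation score used during training and in the S1 layer, for which the text gives only the range (0 to 1) and not the exact measure. The Python sketch below shows one plausible way to obtain such a score. It assumes that spike trains are binned into fixed-width bins and that the score is the largest normalised cross-correlation over a small range of lags; the bin width, the lag range and the function names are illustrative assumptions.

```python
import math

def bin_spikes(spike_times, duration, bin_width):
    """Count spikes per bin over [0, duration)."""
    n_bins = int(round(duration / bin_width))
    counts = [0.0] * n_bins
    for t in spike_times:
        idx = min(int(t / bin_width), n_bins - 1)  # clamp spikes at the end
        counts[idx] += 1.0
    return counts

def correlation_score(a, b, max_lag=2):
    """Highest normalised cross-correlation over lags in [-max_lag, max_lag]."""
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if norm == 0.0:
        return 0.0                                 # an empty train matches nothing
    best = 0.0
    n = len(a)
    for lag in range(-max_lag, max_lag + 1):
        total = sum(a[i] * b[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, total / norm)
    return best

desired = bin_spikes([0.02, 0.07, 0.15], duration=0.25, bin_width=0.01)
same = correlation_score(desired, desired)
far = correlation_score(bin_spikes([0.02], 0.25, 0.01),
                        bin_spikes([0.20], 0.25, 0.01))
```

Under these assumptions identical trains score 1 and trains with no nearby spikes score 0, so the value can be used directly as a multiplicative weight on the incoming signal, as the text describes.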
Moreover, it is noticeable that as the RF size increases, the quality of edge detection decreases which is caused by progressively larger RF sizes overlapping on a constant small area over different angles. Also by incorporating more information from the homogeneous space of the original circle with larger edge detection windows, thicker overlapping edges create artefacts. This is evident from edge information that has advanced further inside the circle, creating a thick uneven outline.

Further tests were conducted with other 'ideal' shapes without the presence of noise (Fig. 7). In simpler examples such as the triangle and square, edge integrity appears slightly better. However, as the number of orientations increases such as in the pentagon and star examples, corners progressively exhibit a thicker outline. Regardless of some of these tuning difficulties with RF sizes and parameterisation, all images prove the model's ability to extract spatial features from Gabor-like spiking neurons.

AER videos pose a challenging task compared to the perfect shapes that have been presented so far. The experiments centred on objects in motion without any obstructions or clutter. The video data of this work were captured under natural light conditions and contained some reflectance noise from background objects and surfaces. Contrast polarity changes in AER data vary between three event states -1, 0 and 1. The -1 state indicates that the AER sensor pixel has detected illumination reductions and vice versa for 1. Naturally, zero events indicate that no changes have been detected. Since this work only focuses on the extraction and processing of spatial features, -1 occurrences are treated simply as 1. This effectively neglects directional motion and concentrates on the edges being detected by the AER sensor. In real-world situations object edges and surfaces are not uniformly illuminated, partly due to naturally occurring shadows or the direction of the light source. Therefore, objects appear significantly distorted in AER data by salt and pepper-like noise or inhomogeneous edges. The jAER software package provides de-noise filter options but such filtering has been noticed to further distort edges and can be incompatible with the biological-like standards that were set for the model here.

Fig. 8 shows some examples from original AER video data and their C1 layer images. Noise reduction from lateral inhibition is noticeable. There are some minor improvements in the thickness of the edges and the space between them but it is apparent that the integrity of morphological information cannot be further enhanced.

S1 layer errors are mostly compensated by inhibition and spatial summation in the C1 layer. In practice, using the optimal RF sizes and spatial frequencies for S1 unit tuning on the numerous objects that can be found in the real world, is a subject of rigorous analysis [20], [23] which has been planned for this model in the near future. Furthermore, sensor related errors are expected to be improved drastically with higher spatial resolution DVS, less sensitive to noise, that are planned for production.

It is important to mention that contrary to frame-based approaches, AER-based data are processed in time windows as they occur. These time windows can be set by the user manually to as little as 1 µs. The chosen temporal resolution was set empirically to capture the object's particular speed of motion. With faster moving objects this setting would require smaller values and with variable speed monitoring, a mechanism which adjusts accordingly.

The speed of AER image processing was not the main focus of these preliminary experiments. However, given the nature of AER data, the model was exceptionally efficient in processing only meaningful information as it occurred. The model performs its operations only in sections where there are time events. This is an additional advantage over frame-based techniques which need to scan, often aimlessly, the entire image for important or meaningful visual information. Consequently, CPU load and stored data for all the experiments were minimal.

Fig. 7. C1 layer examples with simple shapes. Top row shows the original shapes and bottom row the processed results at the C1 layer.
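The time-window processing of equation (1), together with the polarity handling described in this section, can be sketched in a few lines of Python. The event layout used here, a tuple of (x, y, timestamp in microseconds, polarity), and the function name are assumptions for illustration only and do not reproduce the jAER file format.

```python
def accumulate_events(events, t_start, t_end, width=128, height=128):
    """Sum address-events with t_start <= t < t_end into a 2D activity map."""
    frame = [[0] * width for _ in range(height)]
    for x, y, t, polarity in events:
        if t_start <= t < t_end and polarity != 0:
            frame[y][x] += abs(polarity)   # -1 events are treated simply as 1
    return frame

# Hypothetical events: (x, y, timestamp in microseconds, polarity).
events = [(10, 20, 5, 1), (10, 20, 40, -1), (64, 64, 90, 1), (0, 0, 250, -1)]
frame = accumulate_events(events, t_start=0, t_end=100)

# Only addresses that actually received events need further processing.
active = [(x, y) for y, row in enumerate(frame) for x, v in enumerate(row) if v]
```

In this example only two of the 16384 addresses are active in the window, which illustrates why the number of neurons that need to be evaluated follows the number of events rather than the array size.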
[Fig. 8 plots. Top row panel titles: Retina Events 1716, Time window 162; Retina Events 368, Time window 108; Retina Events 907, Time window 127. Bottom row panel titles: C1 units, Time window 162; C1 units, Time window 108; C1 units, Time window 127. Axis ticks on both axes run from 20 to 120.]

Fig. 8. Results from AER videos. Top row shows the original AER events in MATLAB of a triangle, a pool table 8-ball and a hand. Bottom row shows the processed C1 layer images from the model.
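The C1 images in Fig. 8 result from spatial summation followed by lateral inhibition. The Python sketch below shows one way these two steps could fit together. The summation over orientations follows the verbal description of the C1 response. The threshold, placed at a fraction h between the minimum and maximum response, is only an assumed form suggested by the description of the inhibition value and is not taken from [31]; the function names and the value of h are likewise hypothetical.

```python
def c1_pool(s1_responses):
    """Sum S1 responses over orientations at every position.

    s1_responses[i][p] is the response of orientation i at position p
    for a single RF size.
    """
    n_pos = len(s1_responses[0])
    return [sum(orient[p] for orient in s1_responses) for p in range(n_pos)]

def lateral_inhibition(responses, h=0.5):
    """Suppress responses below a threshold set between min and max.

    Assumed form: threshold = min + h * (max - min).
    """
    lo, hi = min(responses), max(responses)
    threshold = lo + h * (hi - lo)
    return [r if r >= threshold else 0.0 for r in responses]

# Two orientations, three positions (hypothetical values).
s1 = [[0.1, 0.8, 0.0],
      [0.2, 0.9, 0.1]]
pooled = c1_pool(s1)
inhibited = lateral_inhibition(pooled)
```

With these example values only the strongest position survives, which corresponds to the noise-filter role that the text assigns to lateral inhibition.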

Naturally, in the near future with more advanced retina-like sensors of higher spatiotemporal resolutions that can additionally process spectral information, the amount of data and their processing load is expected to increase.

V. CONCLUSIONS

A biologically-plausible model for Gabor feature extraction using spiking neural networks is described in this paper. Its methodology relies on LIF neurons that have been pre-trained with the Remote Supervision Method. Subsequently, these neurons form a Gabor edge detection layer and progressively the model processes these features with alternating layers in a temporal manner. The number of LIF neurons being created to handle incoming visual information depends on the events being captured at a given time window. This flexible approach closely simulates the overall structure of the mammalian brain and exhibits adaptation to AER data. Furthermore, the model with the proposed methodology avoids the unnecessary complexity that would otherwise involve thousands of additional neurons and their respective synapses with the extra data storage required for unwanted visual information.

The model is examined with noiseless images containing simple shapes and then applied on retina-like data from an AER sensor. The preliminary investigation produced satisfactory results in the absence of noise and promising results for the actual retina-like data. More specifically with AER data, the model sufficiently identifies the edges of objects. However, edges that are separated, dotted or broken proved to be a difficult task that necessitates more advanced AER sensors or techniques. With the current version of the model, there is no provision for techniques that introduce or simulate any of the Gestalt principles found in higher cortical areas [36] and therefore these edges cannot be accurately detected or joined.

The work presented in this paper has contributed the following for the first time: a) Gabor filters are directly encoded with SNN in the time domain, b) a biologically-inspired learning technique is used to teach neurons as V1 cells for AER processing and c) a hierarchical and biologically-inspired model with increased biological plausibility is applied on retina-like data. Furthermore, the model has been successful in establishing the foundation upon which future enhancements and modifications will rely for an advanced low power, low cost model which will perform rapid parallel object recognition in a biologically-inspired manner, utilising SNNs together with AER sensors.

In the near future, this approach is going to be enriched with more advanced DVS of higher spatiotemporal resolutions and additional spectral information that will either be introduced by an upgraded DVS or a frame-based device in a more elaborate schema. Moreover, by incorporating extra layers in the hierarchy, more complex classification problems and practical pattern recognition scenarios will be investigated for specific AER applications in video surveillance and navigation. Particular attention will be given to future security projects and applications involving the use of fast response recognition systems.

ACKNOWLEDGEMENT

The authors would like to thank the Robotics and Computer Technology group in the University of Seville and the Institute of Microelectronics in Seville, CSIC. Finally, the authors thank the ABC4EU project for funding this work.
REFERENCES

[1] S. Yamamoto and K. Kashikura, "Speed of face recognition in humans: an event-related potentials study," 1999.
[2] P. Lichtsteiner, C. Posch, and T. Delbrück, "A 128x128 120dB 15µs Latency Asynchronous Temporal Contrast Vision Sensor," IEEE J. Solid-State Circuits, vol. 43, pp. 566-576, 2008.
[3] C. Posch, D. Matolin, and R. Wohlgenannt, "A QVGA 143 dB dynamic range frame-free PWM image sensor with lossless pixel-level video compression and time-domain CDS," in IEEE Journal of Solid-State Circuits, 2011, vol. 46, pp. 259-275.
[4] R. Berner, T. Delbruck, A. Civit-Balcells, and A. Linares-Barranco, "A 5 Meps $100 USB2.0 Address-Event Monitor-Sequencer Interface," 2007 IEEE Int. Symp. Circuits Syst., 2007.
[5] D. H. Hubel and T. N. Wiesel, "Receptive fields and functional architecture of monkey striate cortex," J. Physiol., vol. 195, no. 1, pp. 215-243, 1967.
[6] K. Fukushima, "Neocognitron: A self organizing neural network for a mechanism of pattern recognition unaffected by shift in position," Biol. Cybern., vol. 36, no. 4, pp. 193-202, 1980.
[7] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, pp. 2278-2324, 1998.
[8] M. Riesenhuber and T. Poggio, "Hierarchical models of object recognition in cortex," Nat. Neurosci., vol. 2, no. 11, pp. 1019-1025, 1999.
[9] J. A. Perez-Carrasco, C. Serrano, B. Acha, T. Serrano-Gotarredona, and B. Linares-Barranco, "Spike-based convolutional network for real-time processing," in Proceedings - International Conference on Pattern Recognition, 2010, pp. 3085-3088.
[19] J. G. Daugman, "Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters," J. Opt. Soc. Am., vol. 2, no. 7, pp. 1160-1169, 1985.
[20] M. A. Webster and R. L. De Valois, "Relationship between spatial-frequency and orientation tuning of striate-cortex cells," J. Opt. Soc. Am. A., vol. 2, pp. 1124-1132, 1985.
[21] C. J. McAdams and R. C. Reid, "Attention modulates the responses of simple cells in monkey primary visual cortex," J. Neurosci., vol. 25, pp. 11023-11033, 2005.
[22] N. C. Rust, O. Schwartz, J. A. Movshon, and E. P. Simoncelli, "Spatiotemporal elements of macaque V1 receptive fields," Neuron, vol. 46, pp. 945-956, 2005.
[23] M. P. Sceniak, M. J. Hawken, and R. Shapley, "Contrast-dependent changes in spatial frequency tuning of macaque V1 neurons: effects of a changing receptive field size," J. Neurophysiol., vol. 88, pp. 1363-1373, 2002.
[24] T. Serre and M. Riesenhuber, "Realistic Modeling of Simple and Complex Cell Tuning in the HMAX Model, and Implications for Invariant Object Recognition in Cortex," Methods. p. -017, 2004.
[25] D. Felleman and D. Van Essen, "Distributed hierarchical processing in the primate cerebral cortex," Cereb. Cortex, vol. 1, no. 1, pp. 1-47, 1991.
[26] J. P. Van Kleef, S. L. Cloherty, and M. R. Ibbotson, "Complex cell receptive fields: evidence for a hierarchical mechanism," J. Physiol., vol. 588, no. 18, pp. 3457-3470, 2010.
[27] V. Axelrod and G. Yovel, "Hierarchical Processing of Face Viewpoint in Human Visual Cortex," Journal of Neuroscience, vol. 32, pp. 2442-2452, 2012.
[28] E. R. Howell and R. F. Hess, 'The functional area for summation to
[ 10] L. Camuiias-Mesa, C. Zamarreiio-Ramos, A. Linares-Barranco, A. 1. threshold for sinusoidal gratings.," Vision Res., vol. 18, pp. 369-374,
Acosta-Jimenez, T. Serrano-Gotarredona, and B. Linares-Barranco, 1978.
"An event-driven multi-kernel convolution processor module for
[29] S. J. Anderson and D. C. Burr, "Spatial summation properties of
event-driven vision sensors," IEEE J. Solid-State Circuits, vol. 47, pp.
directionally selective mechanisms in human vision.," J. Opt. Soc. Am.
504-5 17,2012.
A., vol. 8,pp. 1330-1339, 1991.
[ 1 1] O. Bichler, D. Querlioz, S. 1. Thorpe, 1. P. Bourgoin, and C. Gamrat,
[30] S. Sukumar and S. J. Waugh, "Separate first- and second-order
"Extraction of temporally correlated features from dynamic vision
processing is supported by spatial summation estimates at the fovea
sensors with spike-timing-dependent plasticity," Neural Networks, vol.
and eccentrically," Vision Res., vol. 47,pp. 58 1-596,2007.
32,pp. 339-348,2012.
[3 1] J. Mutch and D. Lowe, "Object class recognition and localisation using
[ 12] H. Markram,1. Lubke, M. Frotscher,and B. Sakmann, " Regulation of
sparse features with limited receptive fields," int. J. Comput. Vis., vol.
synaptic efficacy by coincidence of postsynaptic APs and EPSPs,"
80,no. I,pp. 45-57, 2008.
Science (80-. ).,vol. 275,no. 5297,pp. 2 13-215, 1997.
[32] E. Niebur and C. Koch, "A model for the neuronal implementation of
[ 13] B. Zhao, Q. Vu, H. Vu, S. Chen, and H. Tang, "A bio-inspired
selective visual attention based on temporal correlation among
feedforward system for categorization of AER motion events," in
neurons," J. Comput. Neurosci., vol. I,pp. 141- 158, 1994.
Biomedical Circuits and Systems Conference (BioCAS), 2013, pp. 9-
12. [33] I. C. Lin, D. Xing, and R. Shapley, "Integrate-and-fire vs Poisson
models of LGN input to VI cortex: Noisier inputs reduce orientation
[ 14] S. Thorpe and J. Gautrais, "Rank order coding," Comput. Neurosci.
selectivity," J. Comput. Neurosci., vol. 33,pp. 559-572,2012.
Trends Res., vol. 13,pp. 1 13- 1 19, 1998.
[34] T. Delbruck and L. Longinotti, ')AER "
[ 15] F. Ponulak, "ReSuMe-new supervised learning method for Spiking
http://sourceforge.netlpl}aerlwikiIHomel, 2014.
Neural Networks," in International Conference on Machine Learning,
ICML,2005. [35] T. Serrano-Gotarredona and B. Linares-Barranco, "A 128,x 128 1.5%
contrast sensitivity 0.9% FPN 3 J.lS latency 4 mW asynchronous frame-
[ 16] W. Gerstner and W. M. Kistler, "Mathematical formulations of
free dynamic vision sensor using transimpedance preamplifiers," IEEE
Hebbian learning," BioI. Cybern., vol. 87,pp. 404-4 15,2002.
J. Solid-State Circuits, vol. 48,pp. 827-838,2013.
[ 17] R. C. Froemke and Y. Dan, "Spike-timing-dependent synaptic
[36] W. Ehrenstein, L. Spillmann, and V. Sarris, "Gestalt issues in modem
modification induced by natural spike trains.," Nature, vol. 4 16, pp.
neuroscience," Axiomathes, vol. 13,pp. 433-458, 2003.
433-438,2002.
[ 18] S. Marcelja, "Mathematical description of the responses of simple
cortical cells.," J. Opt. Soc. Am., vol. 70,pp. 1297-1300, 1980.
