
S. Afshar1,2, G. Cohen1, R. Wang1, A. van Schaik1, J. Tapson1, T. Lehmann2, T.J. Hamilton*1,2

The Ripple Pond: Enabling Spiking Networks to See

1 Bioelectronics and Neurosciences, The MARCS Institute, University of Western Sydney, Penrith NSW, Australia. 2 School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney NSW, Australia. *Correspondence: Tara Julia Hamilton, The MARCS Institute, Bioelectronics and Neuroscience, University of Western Sydney, Locked Bag 1797, Penrith NSW 2751, Australia. t.hamilton@uws.edu.au

Abstract

In this paper we present the biologically inspired Ripple Pond Network (RPN), a simply connected spiking neural network that, operating together with recently proposed PolyChronous Networks (PCNs), enables rapid, unsupervised, scale and rotation invariant object recognition using efficient spatio-temporal spike coding. The RPN has been developed as a hardware solution linking previously implemented neuromorphic vision and memory structures, capable of delivering end-to-end high-speed, low-power and low-resolution recognition for mobile and autonomous applications where slow, highly sophisticated and power-hungry signal processing solutions are ineffective. Key aspects of the proposed approach include utilising the spatial properties of physically embedded neural networks and the propagating waves of activity therein for information processing, the dimensional collapse of imagery information into amenable temporal patterns, and the use of asynchronous frames for information binding.

Key Words: object recognition, spiking neural network, neuromorphic engineering, image transformation invariance, view invariance, polychronous network

INTRODUCTION

How did the earliest predators achieve the ability to recognize their prey regardless of their relative position and orientation? (Figure 1). What minimal neural networks could possibly achieve the task of real-time view invariant recognition, which is so ubiquitous in animals (Caley, M.J., 2003), even those with minuscule nervous systems (Van der Velden, 2008), (Tricarico, 2011), (Neri, 2012), (Avargues-Weber, 2010), (Gherardi, 2012), yet so difficult for artificial systems? (Pinto, 2008). If realised, such a minimal solution would be ideal for today's autonomous imaging sensor networks, wireless phones, and other embedded vision systems, which are coming up against the same constraints of limited size, power and real-time operation as those earliest animals of the Cambrian oceans. In this paper we outline such a simple network. We call this system the Ripple Pond Network (RPN). The name Ripple Pond alludes to the network's fluid-like rippling operation. The RPN is a feed-forward time-delay spiking neural network with static unidirectional connectivity. The network converts centered input images received from an unspecified

upstream salience detector into spatio-temporal spike patterns that can be learnt and recognized easily by a downstream distributed Polychronous Network (PCN). Working together, the two networks can perform shift, scale and rotation invariant recognition.

Figure 1. Batesian mimicry: the highly poisonous pufferfish Canthigaster valentini (top) and its edible mimic Paraluteres prionurus (bottom). The remarkable degree of precision in the deception reveals the sophisticated recognition capabilities of local predatory fish, whose neural networks, despite being orders of magnitude smaller than those of primates, nonetheless seem capable of matching human vision in performance (note: the dorsal, anal and pectoral fins are virtually invisible in the animals' natural environment).

The View Invariance Problem

In the 2005 book, 23 Problems in Systems Neuroscience (van Hemmen, 2005), the 16th problem, as posed by Laurenz Wiskott, is the view invariance problem. The problem arises from the fact that any real-world 3D object can produce an infinite set of possible projections onto the 2D retina. Leaving aside factors such as partial images, occlusions, shadows and lighting variations, the problem comes down to shift, rotation, scale and skew variations. How then, with no central control, could a network like the brain learn, store, classify and recognize in real time the multitude of relevant objects in its environment? Most biologically based object recognition solutions have been based on vertebrate vision, in particular mammalian vision, and have used either statistical methods (Sonka, 2007), (Gong, 2012), (Sountsov, 2011), signal processing techniques such as log-polar filters (Cavanagh, 1978), (Reitboeck, 1984), artificial (i.e. non-spiking) neural networks (Norouzi 2009), (Nakamura, 2002), (Iftekharuddin, 2011), and more recently, spiking neural networks (SNNs) (Meng 2011), (Serrano-Gotarredona, 2009), (Rasche, 2007), (Serre, 2005). Since many of these approaches aim at the accuracy and resolution of mammalian vision, they are very complex and can only be fully implemented on computers (Sonka, 2007), (Gong, 2012), (Norouzi 2009), (Nakamura, 2002), (Iftekharuddin, 2011), (Meng 2011), (Serre, 2005), (Jhuang, 2007), sometimes with very slow computation times. Other implementations that have been demonstrated in hardware (Serrano-Gotarredona, 2009), (Rasche, 2007), (Folowosele, 2011) have proven that vision can be achieved for small, low-power robots, UAVs and remote sensing applications; however, the functionality of such neuromorphic systems has been limited.

The few models of invertebrate visual recognition have had an explanatory focus (Huerta, 2009), (Arena, 2012) and, not being developed for hardware implementability, assume highly connected networks not suitable for hardware. CMOS implementations of bio-inspired radial image sensors are most closely related to the RPN; however, these sensors ultimately interface with conventional processors (Pardo, 1998), (Traver, 2010) rather than biologically plausible SNNs, as is the case with the RPN. Figure 2 shows the human fovea (a), a typical log-polar implementation used to model it (b), the visual field (c), and the mapping of the visual field in V1 of a primate (d). While the mapping in (d) may look approximately log-polar, the mapping at the fovea is approximately uniform which, together with the highly intricate connectivity required, reduces the biological plausibility of such log-polar approaches.

Figure 2. a) Human fovea centralis, responsible for the central 2° of the visual field (two thumb widths at arm's length) (Ahnelt, 1987), b) standard log-polar model used in artificial vision (Traver, 2010), c) visual field, and d) visuotopic organization of V1 (i.e. the mapping of the visual field) in the monkey, from (Gattass, 2005).

The RPN trades the high accuracy and resolution favoured by other implementations for speed and view invariance. Recent papers (Afshar, 2012), (Hamilton, 2011) showing the ability and efficiency with which SNNs with competition between neurons overcome problems of noise, incomplete data, power consumption and computation speed suggest that this is a good starting point for building a network capable of solving the view invariance problem. In the following sections we describe the learning network and the RPN that build on these principles.

Polychronous Learning Network

Before discussing the RPN, we briefly describe the neuromorphic learning network with which it interfaces, namely the PolyChronous Network (PCN) (Izhikevich, 2006). A PCN uses a spatio-temporal pattern of spiking neurons to represent information asynchronously, whereas classic artificial neural networks discard timing information by modelling neuron firing rates sampled synchronously by a central clock. The PCN's additional use of asynchronous temporal information results in greater energy efficiency and speed (Levy, 1996), (Van Rullen, 2001). The PCN shares this feature with other recently proposed dynamic neural networks, including liquid and echo state machines (Maass, 2002), (Jaeger H., 2001). However, unlike PCNs, liquid state machines do not perform well for spatio-temporal pattern recognition unless the reservoir network has been set up to ensure feedback stability, fading memory and input separability (Manevitz, L., 2010), and as such we prefer the simpler, feed-forward PCN for pattern recognition.

Figure 3. Biological model (top) and mathematical description (bottom) of a single element in a PCN.

A PCN can function as a content-addressable, distributed memory system comprising multiple neurons connected to each other via multiple pathways. Through dynamic adaptation of synaptic weights (W1, W2, W3 in Figure 3) and dendritic propagation delays (d1, d2, d3 in Figure 3), the PCN makes particular neurons exclusively responsible for particular inter-spike intervals. It achieves this by continuously adapting its parameters to maximize recognition at its output in response to the statistics of its input. Figure 4 shows an example of a spike pattern entered into a PCN with five neurons (Wang, 2011). Here, in order for neuron 3 to spike, a spike at neuron 1 is delayed by T1+T2 and a spike at neuron 5 is delayed by T2. Thus the spikes from neurons 1 and 5 arrive simultaneously at neuron 3, causing it to spike. A longer spatio-temporal pattern can be stored in a network of neurons by learning the delays and weights required between neurons (Paugam-Moisy, H., 2008), (Ranhel, J., 2012). Subsequently the PCN's output can be measured as the relative activation of certain neurons which, individually or in concert, indicate the recognition of a certain pattern (Martinez, R., 2009).
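As a concrete illustration of this delay-coincidence mechanism, the following minimal Python sketch reproduces the Figure 4 scenario; the delay values, data layout and firing threshold are illustrative assumptions rather than parameters taken from (Wang, 2011).

# Toy sketch of polychronous coincidence detection: neuron 3 fires only when
# the spike from neuron 1 (delayed by T1+T2) and the spike from neuron 5
# (delayed by T2) arrive in the same time bin. All values are illustrative.
T1, T2 = 3, 2                         # propagation delays, in time steps
spikes = {1: [0], 5: [T1]}            # neuron 1 fires at t=0, neuron 5 at t=T1

def neuron3_fires(spikes, threshold=2):
    arrivals = [t + T1 + T2 for t in spikes.get(1, [])] + \
               [t + T2 for t in spikes.get(5, [])]
    # neuron 3 spikes in any time bin receiving >= threshold coincident inputs
    return sorted(t for t in set(arrivals) if arrivals.count(t) >= threshold)

print(neuron3_fires(spikes))          # -> [5]: both delayed spikes coincide at t = T1 + T2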

Figure 4. Neuron activation in a polychronous network (PCN) (Wang, 2011).

However, PCNs or other distributed memory systems tasked with recognition cannot directly interface with the retina, since they expect their learnt patterns to appear via the same channels every time (see Figure 5). The RPN is in the class of systems that serve as the interface between such a memory system and retinotopically mapped inputs from the visual field. It does so by converting the raw visual stimulus to a Temporal Pattern (TP) suitable for the PCN; in this way the RPN transforms an image from its two-dimensional geometry into a one-dimensional temporal sequence. Subsequent extensions to the RPN allow extraction of different image features, which are presented to the PCN as additional TPs, thus converting a two-dimensional image into an m-dimensional spatio-temporal sequence.

Figure 5. A PCN element with the spiking pattern at the input flipped compared with Figure 3, resulting in no recognition.

MATERIALS AND METHODS

The RPN Network

A central feature of the RPN is getting the most functionality from a minimally connected network. Connectivity is a limiting factor which, though largely absent in biology, frequently constrains hardware implementations, and yet is not often considered in the development of artificial neural networks (Sivilotti, 1991), (Hall, 2004), (Furber, 2012), (Park, 2012).

An Upstream Shift Invariant Salience Network

As shown in Figure 6, the RPN receives an image as the spatio-temporal, high-pass filtered activation pattern of neurons on a conceptual 2D sheet representing the field of attention, produced by an upstream salience detection system. By sliding the window of attention and focusing it onto a single salient object at a time, the salience detector effectively allows the overall network to operate in a shift invariant manner. The field of computational and biologically based salience detection is extensive, with a wide range of models, techniques and approaches (Drazen, 2011), (Gao, 2009), (Itti, 2001), (Vogelstein, 2007). The proposed RPN system does not require any specific salience model, its only requirement being a centered input image. For simplicity, however, we may assume the upstream salience network to consist of only a motion detector, which physically fixes the creature's gaze onto a moving object. In fact, this simplified system is not far off the mark in the case of many organisms (Land, 1999), (Dill, 1993) and may serve in robotics applications where energy and hardware are also limiting factors.

Input Images Ripple Outwards on the Neuron Disc

After centring by the salience network and high-pass filtering, the incoming image stimulates, in a frame-based fashion, the neurons distributed on a disc as shown in Figure 6. For visual clarity, in all the illustrated figures the disc comprises only ten neurons on each arm (N=10) with a total of thirty arms (Θ=30). The ripple propagation can be written as

a_{n+1,θ}(t+1) = a_{n,θ}(t),  for n = 1, ..., N and θ = 1, ..., Θ,    (1)

where a_{n,θ}(t) is the activity of the nth neuron on arm θ at time t.
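The update in (1) amounts to shifting activity one neuron outward along every arm at each time step. A minimal Python sketch of this relay behaviour, assuming an N x Θ array of binary neurons (the array shape and all names are illustrative), is:

import numpy as np

# A minimal sketch of the relay update in (1): activity shifts one neuron
# outward along every arm per time step. Names (N, THETA, ripple_step) are
# illustrative assumptions, not taken from the paper.
N, THETA = 10, 30                        # neurons per arm, number of arms

def ripple_step(a):
    nxt = np.zeros_like(a)
    nxt[1:, :] = a[:-1, :]               # a[n+1, theta](t+1) = a[n, theta](t)
    return nxt

a = np.zeros((N, THETA), dtype=bool)     # disc activity a[n, theta]
a[0, ::3] = True                         # toy "image": spikes near the centre on every third arm
for t in range(N):
    a = ripple_step(a)                   # the wave ripples outward; after N steps the disc is empty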

Figure 6. The spiral RPN System Diagram: Input image to Temporal Pattern generation.

Functionally, these neurons are simple binary relays with fixed unit delays between them. More complex neuron models (Indiveri, 2011) could increase the system's resolution but would incur an increased hardware cost. In the RPN, each neuron a_{n+1,θ}(t) can be activated if its upstream neuron (on the same spiral arm but closer to the disc centre), a_{n,θ}(t), transmits a pulse to its input. All neurons are also sensitive to any incoming image, functioning as radially distributed, outwardly connected pixels on a circular retina. With implementation in mind, and in contrast to log-polar filters, the density of the neurons along each arm increases with distance from the centre of the disc, permitting an approximately uniform distribution of neurons. Starting from the disc's centre, the neuronal connections radiate outward to the edge of the disc. This outward connection creates a rippling effect. Since the neurons act as delayed relays, stimulation of a neuron close to the centre of the disc travels outward along the disc arm as a discrete wave, stimulating succeeding neurons in turn (1).

The Summing Neuron Outputs a Temporal Pattern

The neurons at the edge of the disc all connect to a common summing neuron (red sigma in Figure 6) that outputs a real-valued TP, which can be represented as a 1×N vector (2). From the

geometry of the neuron disc it is clear that this output TP is rotationally invariant. More subtle, but just as important, is the RPN's central feature of image-to-collapsed-temporal-pattern conversion. This conversion greatly simplifies the task of scale invariant recognition for the downstream PCN.

TP_sum(t) = Σ_{θ=1}^{Θ} a_{N,θ}(t)    (2)

where TP_sum(t) is the temporal pattern output of the summing neuron, Θ is the total number of arms on the disc, N is the total number of neurons per arm, and a_{n,θ}(t) is the activity of the nth neuron on arm θ at time t.
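A compact sketch of how (2) could be evaluated over the course of one frame, repeating the relay update sketched above (all names are again illustrative), is:

import numpy as np

# Hedged sketch of (2): the summing neuron adds, at every time step, the
# activity of the outermost neuron on each arm, producing a length-N
# real-valued temporal pattern.
def ripple_step(a):
    nxt = np.zeros_like(a)
    nxt[1:, :] = a[:-1, :]
    return nxt

def temporal_pattern(a0):
    a, tp = a0.copy(), []
    for _ in range(a0.shape[0]):        # at most N steps to flush the disc
        tp.append(a[-1, :].sum())       # TP_sum(t): sum over arms of the edge activity
        a = ripple_step(a)
    return np.array(tp)                 # length-N real-valued TP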

The summing neuron sums the activity of the neurons at the edge of the disc. Note that for simplicity we represent the summing neuron as a single neuron with a real-valued spike output; it may also be represented by a network of neurons which, receiving the same input, collectively produce the same output. The PCN, however, receives spatio-temporal spike patterns as inputs which are, to a first approximation, binary pulses. Thus, the real-valued output of the RPN's summing neuron needs to be encoded into binary values via multiple channels into the PCN. This can be achieved in the PCN's input layer through the use of multiple heterogeneous neurons with a range of different internal parameters, such as varied input weights, thresholds and centre frequencies, that relay the value of the RPN output to the PCN's reservoir. These neurons would effectively act as multiple input channels which, though all receiving the same input, can provide an arbitrary number of channels into the PCN in a simple fan-out fashion.

The Inhibitory Neuron Acts as an Asynchronous Shutter

In addition to being sensitive to an incoming image and synapsing onto outer neurons along the disc arm, all neurons on the disc also connect to an inhibitory neuron (green sigma in Figure 6), such that the inhibitory neuron carries information about the net activity of all neurons on the disc:

Inh(t) = Σ_{θ=1}^{Θ} Σ_{n=1}^{N} a_{n,θ}(t)    (3)

where Inh(t) is the output of the inhibitory neuron. The output of the inhibitory neuron feeds back inhibitively onto the visual pathway carrying the image to the disc. The inhibitory neuron blocks this pathway, preventing further transmission of the image. In this way it ensures that as long as any activity remains on the disc (i.e. while the RPN is processing the image), no new image will be projected onto it. In hardware this feedback loop can be realised via an asynchronous sample and hold; however, biologically plausible systems require a more distributed approach (Wang, 2010), (McDonnell, 2012), (Brunel, 2000), which we will consider further in the discussion section.
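A small sketch of (3) and of the shutter behaviour it enables (the boolean disc array and function names are illustrative assumptions) might look like:

import numpy as np

# Sketch of (3) and the asynchronous shutter: Inh(t) sums all activity on the
# disc, and a new frame is admitted only when Inh(t) == 0, i.e. when the
# previous image has finished rippling out.
def inhibition(a):
    return a.sum()                       # Inh(t) = sum over all n, theta of a[n, theta](t)

def maybe_project(a, new_frame):
    """Admit new_frame onto the disc only if no activity remains (shutter open)."""
    if inhibition(a) == 0:
        return new_frame.astype(bool)    # shutter open: project the new image
    return a                             # shutter closed: keep processing the old one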

Need for Normalization Depends on PCN Robustness

In this paper we do not make assumptions about the functional complexity of the PCN and propose a complete system compatible with the simplest PCN realization; however, we also suggest useful PCN functionalities that would enhance the RPN-PCN system. One example is the ability of the PCN to recognize spatio-temporal patterns received at different speeds, which we here refer to as being robust. In order to compensate for a PCN lacking this capability, we include the following network augmentation.

Figure 7. Normalization of amplitude and time of the TP for scaled images for a non-robust PCN. (N=10, Θ=30. All signals are accurate.)

It can be observed that the information carried by the inhibitory neuron can also be used to normalize the system's TP output. The timing and level of initial activation of this neuron can be made to dynamically affect the magnitude and propagation speed of the output signal of the summing neuron, as shown in Figure 7. In biological networks this process, called shunting inhibition (Koch, 1983), (Volman, 2010), is a fundamental element of neuronal computation and results in normalization of both the signal's magnitude and its temporal scale (i.e. duration). Shunting inhibition involves a control signal (for example the net activity

of a group of neurons) affecting the time constants of dynamic neuronal processes through which a signal travels, effectively speeding up or slowing down the signal while also altering its amplitude (Wills, 2004), (Carandini, 2012). Figure 7 outlines how the normalization process affects the output from the RPN. Immediately after the projection of the image, in this case the letter 'R' at different sizes, onto the disc, the total activation of all neurons is summed at the inhibitory neuron (iv); the value of this signal at its initial rise can be used to normalize the magnitude of the output, whereas the interval between this inhibitory pulse (v) and the activation of the summing neuron, L, indicates the required degree of warping in time. The output from the summing neuron for both the large (a) and small (b) images of 'R' is given in (i). This output then passes through a low-pass filter (LPF), which converts (i) into the continuous signal given in (ii). Note that this LPF operation is only needed due to the assumption of an ideal system. In a biological context the summing neuron would be represented by a population of neurons, all with the same input rather than a single neuron, and the output would be in the form of a modulated local field potential and thus already continuous. Furthermore, in a hardware implementation context, device non-idealities (internal parasitic capacitance, imperfect neuron placement, mismatched delays along arms, noise in combination with high neuron numbers, etc.) would similarly result in a continuous signal exiting the summing neuron. This utilization of non-idealities in the aid of processing is also a common feature of biological systems (McDonnell, 2011). The (now continuous) TP is normalized both in time and magnitude, resulting in the output signal shown in (iii).
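A minimal Python sketch of this two-part normalisation is given below; it assumes the amplitude is divided by the square root of the initial inhibitory activation and the time axis is linearly re-sampled after removing a silent lead-in of length L. Both the interpolation scheme and the function names are our own illustrative assumptions.

import numpy as np

# Sketch of the normalisation in Figures 7 and 8: magnitude scaling by the
# square root of the initial inhibitory activation, then a simple time-warp
# of the remaining (active) segment to a fixed output length. Assumes L is
# shorter than the TP.
def normalise_tp(tp, inh_initial, L, n_out):
    tp = np.asarray(tp, dtype=float) / np.sqrt(inh_initial)   # magnitude normalisation
    active = tp[int(L):]                                       # drop the silent lead-in of length L
    x_old = np.linspace(0.0, 1.0, len(active))
    x_new = np.linspace(0.0, 1.0, n_out)
    return np.interp(x_new, x_old, active)                     # time normalisation (re-sampling)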

Figure 8. Computational description of the compartmentalized normalization block (N = 10).

The functional diagram of the normalization block is shown in Figure 8. We include this block, for simplicity, in order to interface with non-robust downstream PCNs; however, whether in the context of biology, neurocomputational models or hardware, no normalization may be required, as the internal parameters of the PCN could be made to automatically vary in response to incoming signals, ensuring robust recognition of rescaled signals (Carandini, 2012), (Kohonen, 1982), (Tapson, 2013a), (Paugam-Moisy, 2008), (Gutig, 2009). Figure 8 details the functional operations used to normalize the TP if required. The output of the summing neuron (the red TPin in Figure 7) is down-sampled using L, the time interval between the image initially hitting the disc and the first activity of the summing neuron, such that:

!"!"# !

!"#$%&'()*!!"!" !! !! !"! !! ! !!!


(4)

! ! !!"# ! !!"! where, TPout is the output of the normalization block. TPin is the output of the summing neuron, Inh is the output of the Inh neuron, M is the down sampling factor, N is the number of neurons per arm, L is the time interval between the first activity of the inhibitory and summing neurons tTP is the time of initial activation of the summing neuron, tinh is the time of activation of the inhibitory neuron, The square root of the total initial activation via the inhibitory neuron is used to normalize the TPs magnitude. Taking the square root of the inhibitory signal compensates for the collapse of the discs area onto a one dimensional TP. As a result, regardless of the size or relative intensity of the incident image, its corresponding TP is recognizable by a non-robust PCN. The result of the overall system is that an image projected onto the disc in the form of spikes begins to radiate outwards from the disc centre. Finally all the incident spikes reach the edge of the disc and their sum is translated into the output activation of the summing neuron, which forms the TP. This TP along with the inhibitory total activation signal is then input directly to a robust PCN, or alternatively the TP is low-pass filtered, normalized using the total activation signal and then input to a non-robust PCN. The PCN uses synaptic weight and delay adjustments to learn and recognize the commonest encountered TPs. Thus the completed system delivers recognition using a simple evolvable network. Thus the complete RPN-PCN system delivers scale and rotation recognition. Multiple Parallel Heterogeneous Discs Result in Higher Specificity A drawback of the collapse of a feature rich 2D image into a temporal pattern is the significant loss of information. To prevent this loss of information instead of using a single disc, the input image can be projected onto multiple discs operating in parallel. By varying disc connectivities, densities and pre-filtering, multiple feature maps can be created so that the incident image can be processed into an array of multiple, independent TPs the combination of which are unique for every object. Examples of such heterogeneities include introduction of cross talk or coupling between the discs arms, use of discs with different neuron densities and use of hardware implemented Gabor filters (analogous to orientation sensitive hypercolumns in the visual cortex (Bressloff, 2002), (Dahlem, 2004)) to create orientation sensitive discs (Chicca, 2007), (Choi, 2005), (Shi, 2006). To function correctly the RPN and all pre-processing system working with it must have the rotation invariance property. If standard Gabor filters operating on Cartesian coordinates were used the resulting feature maps would be sensitive to rotation. A simple solution to this problem is to first transform the image into polar coordinates and then perform Cartesian Gabor filtering. However in the RPN we propose using hardware implemented radial Gabor

filters which group features based on their orientation relative to the disc centre, as shown in Figure 9. This approach has the benefit of merging the two steps into one, delivering possible speed advantages, avoiding a biologically implausible transform, and leaving open the option of extending the RPN in a distributed manner where information about a dynamic centre of attention can be used locally to generate rotationally invariant feature maps.
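To make the distinction concrete, the Python sketch below computes a rotation-invariant orientation map by measuring local gradient orientation relative to the direction towards the image centre. This is only an illustrative stand-in for the hardware radial Gabor filters, whose actual design is not specified here.

import numpy as np

# Illustrative sketch of a "radial" orientation feature: edge (gradient)
# orientation is measured relative to the direction towards the image centre,
# so rotating the whole image about the centre leaves the feature map unchanged.
def radial_orientation_map(img):
    gy, gx = np.gradient(img.astype(float))
    grad_angle = np.arctan2(gy, gx)                        # gradient orientation (Cartesian)
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radial_angle = np.arctan2(yy - h / 2.0, xx - w / 2.0)  # direction from the centre
    return np.mod(grad_angle - radial_angle, np.pi)        # orientation relative to the radius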

Figure 9. Feature extraction via Cartesian Gabor filters, G(θ), in the top two blocks, which extract edges relative to Cartesian coordinates, and via radial Gabor filters, g(θ), in the bottom two blocks, where edges of similar orientation are extracted relative to radial coordinates.

Another useful multi-disc arrangement would be the use of discs with different neuron densities along their arms, which produce higher-speed TPs that can reach the PCN more rapidly. These parallel high-speed TPs not only provide early information for signal normalization but can also be used to narrow the range of possible objects the image could represent. Recalling that patterns are represented in the PCN as signal propagation pathways, signals from the sparsely populated discs can readily be used to deactivate the vast majority of PCN pathways not matching the early, small-scale, low-resolution TP, thus saving most of the energy required to check a high-resolution TP against every known pattern. This ensures a highly energy-efficient system which rapidly narrows the number of possible object candidates with successively greater certainty. As an illustration of the fan-out feature extraction architecture, Figure 10 shows the separation of an incident image via parallel radial Gabor filters to create a higher-dimensional spatio-temporal pattern for the PCN, thus enabling greater selectivity. As examples, radial Gabor filters with 0°, 45° and 90° orientation relative to the centre are shown. Also illustrated are the outputs of discs with N (full), N/2 and N/4 neurons on each arm, demonstrating the relative temporal order of the multi-resolution spatio-temporal patterns delivered to the PCN.

Figure 10. RPN producing a spatiotemporal pattern using parallel radial Gabor filters and varying neuron density discs.

After passing through the parallel Gabor filters, each with a different orientation, the incident image in Figure 10 is separated into independent feature sets matching the preferred direction of the corresponding filter. The resultant separated images are then each processed by multiple parallel discs, each with a different neuron density (dj), resulting in a corresponding number of TPs with different resolutions. Thus, for j discs operating on i Gabor filters, i × j TPs are produced. However, as shown in the right half of Figure 10, the low-resolution TPs (those resulting from the discs with N/4 and N/2 neurons per arm) are produced and available to the PCN very rapidly and far sooner than the full-resolution TP. A PCN with complex features such as lateral and recurrent inhibitory connections, having recognized these earlier, low-resolution TPs, which could correspond to more general categories of objects, can use this information to switch off the recognition paths obviously not corresponding to the received pattern, rapidly and efficiently narrowing the PCN's propagation pathways to the target object. The simplicity of such a parallelized, multi-scale system, the biological evidence for multi-scale visual receptive fields (Itti, 1998), (Riesenhuber, 1999), the presence of multi-speed pathways in the visual cortex (Loxley, 2011) and the potential impact on energy consumption, the primary limiting factor for all biological systems, argue in favour of further investigation of such a multi-scale, multi-speed scheme.
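The timing argument can be sketched as follows: a single radial activity profile is mapped onto discs with different neuron counts per arm, and the sparser discs finish emitting their TPs in proportionally fewer steps. The resampling scheme and all names below are illustrative assumptions.

import numpy as np

# Sketch of the multi-resolution idea: the same radial profile is fed to discs
# with N, N/2 and N/4 neurons per arm; each disc emits one TP sample per time
# step, so the sparser discs complete their TPs first.
def tp_from_arm_profile(profile, n_neurons):
    # resample the radial profile onto a disc with n_neurons per arm
    x_old = np.linspace(0.0, 1.0, len(profile))
    x_new = np.linspace(0.0, 1.0, n_neurons)
    disc = np.interp(x_new, x_old, profile)
    # the outermost activity reaches the summing neuron first
    return disc[::-1]                                  # TP emitted over n_neurons steps

profile = np.random.rand(200)                          # hypothetical radial activity profile
tps = {n: tp_from_arm_profile(profile, n) for n in (200, 100, 50)}
# tps[50] is complete after 50 steps, long before tps[200] has finished.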

RESULTS

To better illustrate the pertinent characteristics of the RPN, we focus only on the simple one-disc case without the multi-disc extensions. Although these extensions deliver significant improvements in the overall system's performance, conceptually they are repetitions of the simple case and merely make the PCN more effective by delivering more information in parallel.

Figure 11. Temporal patterns (TPs) from the RPN illustrating rotational invariance (left) and scale invariance (right). Measures of similarity between the TPs are given in the form of the cosine similarity (cos θ) and Spearman's rank correlation coefficient (ρ). Both of these measures show high degrees of similarity between the images.

RPN's Geometry Enables Scale and Rotation Invariance

It may be apparent by inspection that the RPN is invariant to rotation due to the radial symmetry of the disc. The left panel in Figure 11 shows the output TPs resulting from a 200×200 pixel image and its rotated equivalent. The TP similarity, measured via cosine similarity (cos θ) and Spearman's rank correlation coefficient (ρ), is high for the rotation transformation. A less obvious result of the RPN is the scale invariance shown in Figure 11, right. Unlike rotational information, scale information is preserved by the RPN in the form of the time interval between the initial projection of the image on the disc and the activation of the summing neuron. Using this information, the additional TP normalization step discussed above can be performed. To measure the RPN performance as a function of image rotation, scale and shift, a mixed set of 300 different 200×200 pixel test images consisting of letters, numbers, words, shapes, faces and fish was used, in approximately equal numbers. All images were high-pass filtered using a difference-of-Gaussians kernel and processed by an RPN disc with 200 spiral arms, each with 200 neurons. The similarity metrics, Spearman's ρ and cosine similarity, were measured for each image across a range of rotation, translation and scale transforms with respect to the original image, with the average values shown in Figure 12. Variance as a function of rotation is shown in the left panel, where the interlaced spiral distribution of the 200 disc arms results in a high level of similarity. The pattern shown from 0 to π/100 radians repeats, as expected, due to the disc's symmetry.
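For reference, the two similarity measures quoted here can be computed between a pair of TPs as follows (a sketch using NumPy and SciPy; variable names are illustrative):

import numpy as np
from scipy.stats import spearmanr

# The two similarity measures reported in Figures 11 and 12, computed between
# a TP and the TP of a transformed image (both 1-D arrays of equal length).
def cosine_similarity(tp_a, tp_b):
    return float(np.dot(tp_a, tp_b) /
                 (np.linalg.norm(tp_a) * np.linalg.norm(tp_b)))

def spearman_rho(tp_a, tp_b):
    return float(spearmanr(tp_a, tp_b).correlation)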

Figure 12. Variance as a function of rotation, scale and shift.

The central panel shows variance with respect to scaling. Here the resampling operation on the small 200×200 image was the dominant source of variance, a source which would not be present in a real-world context. Nonetheless the system is robust to rescaling down to image sizes where the resampling operation makes recognition challenging even for the human eye. The rightmost panel shows variance as a function of shift, or translational transform. As would be expected for a global polar transform, the RPN is highly sensitive to non-centered images, where a 10% shift of a 200×200 image results in a drop of 0.17 and 0.25 on the cosine and Spearman similarity metrics respectively.

RPN is Fast

The RPN has been designed for speed. In the worst case, such as the examples in Figure 13, where the only distinguishing feature of two otherwise identical objects is at the object centre (since the centre of the image takes the longest time to propagate out to the summing neuron), the TP from a disc with N neurons on each arm would be recognized in:

T_recognize = T_project + N·T_ripple + N·T_PCN    (5)

where T_recognize is the total time needed for recognition and T_project is the time needed for the image from the retina/sensor to be projected onto the RPN disc. In the multi-disc case this term would consist of the time required to generate the most time-consuming feature maps, e.g. hardware-implemented Gabor filters, resulting in T_project ≈ 240 ns (Chicca, 2007). T_ripple is the time needed for the activation pattern to propagate one step out from the original activation point on the disc along its corresponding radial arm. Assuming implementation via a digital relay, T_ripple is on the order of nanoseconds or even smaller. T_PCN is the time needed by the PCN to process one weighted input spike, which can loosely be interpreted as the temporal resolution of the PCN. Given T_PCN, the length of a temporal pattern, N, and assuming a linear relationship between the length of the input signal and time to recognition, the response time of the PCN can be estimated. With T_PCN on the order of 10 nanoseconds in current first-generation hardware implementations of the PCN (Wang, 2013), and choosing a high N (say N=500), we obtain an approximate T_recognize on the order of 5 microseconds.
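The estimate works out as follows (a trivial sketch using the figures quoted above; the 1 ns value for T_ripple is our own assumption within the stated "order of nanoseconds or smaller"):

# Worked version of the timing estimate (5) with the figures quoted in the text.
T_project = 240e-9          # s, feature-map generation (Chicca, 2007)
T_ripple  = 1e-9            # s, one relay step on the disc (assumed value)
T_PCN     = 10e-9           # s, per-spike processing time of the PCN (Wang, 2013)
N         = 500             # neurons per arm, i.e. TP length

T_recognize = T_project + N * T_ripple + N * T_PCN
print(f"{T_recognize * 1e6:.2f} us")     # ~5.7 us, i.e. on the order of 5 microseconds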

Figure 13. Distinguishing between a juvenile Spotted Hampala (Hampala macrolepidota) and an adult plain Hampala would represent a worst case for the RPN.

Since most recognizable objects do not share the properties of the pathological examples in Figure 13, the PCN often needs only some initial section of a TP to reach its recognition threshold for that pattern, since in most cases no other learnt pattern apart from the target object matches the initial TP of the object. This results in a T_recognize that is considerably shorter than the worst case. In addition, the ability of the PCN to recognize the TP as it is being generated by the RPN effectively eliminates the T_ripple term due to the temporal overlap, which results in:

T_recognize = T_project + N·T_PCN    (6)

Signal processing programs running on sequential von Neumann machines require computation times on the order of several milliseconds just to convert Cartesian images into log-polar images, while consuming significant computational resources (Chessa, 2011). Hardware-implemented log-polar solutions provide significantly higher speed than mapping techniques (Traver, 2010); however, unlike the RPN's rippling operation, which delivers TPs usable by a distributed memory system like the PCN, the log-polar foveated systems operate essentially as simple sensor grids and must interface with conventional sequential processors, introducing bottlenecks that distributed memory systems avoid. Other hardware implementations can partially bypass this bottleneck by using processor-per-pixel architectures or convolution networks, resulting in very high speeds (Dudek, 2005), (Perez-Carrasco, 2013). However, these implementations lack complete view invariance.

RPN Connects the Dots

The two dashed squares in Figure 14 (bottom left and right) are pixel-for-pixel equivalent matches to the square, yet the heterogeneous TP (and critically its early section) output by the RPN matches our intuitions. In the case of the RPN, the conversion process results in the outer features being the first to reach the PCN. As mentioned previously, PCNs often require only some initial fraction of the entire TP to confidently recognize the object, and in the case where multiple, independently structured discs operate in parallel, producing multiple feature maps for the PCN in the form of spatio-temporal patterns, the fraction is further reduced. Therefore the system can afford to trade the obtained high confidence level for increased speed, rapidly classifying a TP and moving on to the next object once a threshold of certainty is reached. This allows rapid recognition of the salient exterior of objects, with anomalies and inconsistencies in the internal regions of the objects leaving only an after-taste of something amiss. However, any unanticipated anomalies in the initial section of the TP could bring the entire process to a halt. If the initial section of the TP does not match the pattern expected by the PCN, automatic recognition ceases and higher, more complex, deliberate mechanisms, which would

presumably operate at much slower speeds, must be called upon to resolve the anomaly. Why, in human perception, the exterior features of an object may carry greater weight or significance in recognition, producing results similar to the proposed RPN-PCN in these limited cases, could warrant further investigation.

Figure 14. The RPN illustrates the intuitive relationships between object temporal patterns. Here a square with corners intact is recognized as a square (left) while a square with corners removed is recognized as an octagon (right).

DISCUSSION

Hardware Implementability and Biological Plausibility

Advantages of Spiral Packing in Hardware and Biology

As the RPN uses a global image transform to achieve view invariance, it is at best an incomplete element in a possible model of biology, highlighting the utility of approaches such as the use of propagating activation waves, the geometric properties of simply connected networks, and the conversion of imagery information to temporal information in modelling biology. But in the context of artificial systems, where limited connectivity is a major constraint, the RPN's minimally connected network, together with a high-speed upstream salience detector and downstream PCN, can potentially improve end-to-end processing time significantly. In the RPN model we present here, the connectivity of neurons that radiate from the centre of the disc comes in two flavours: spokes and spirals (see Figure 15). The spoke connectivity is easier to understand; however, spiralling the arms allows for more uniform neuron placement. In our implementation we used a simple adaptive algorithm that incrementally varied the radial twist of the spiral arms as a function of the disc's local neuron density. While this algorithm is not the optimal solution, it was sufficient for the purposes of the RPN. Such spiral structural symmetry, as well as the spreading of wave-like activation, has been observed in the visual pathways of a range of animals (see Figure 16) (Dahlem, 1997), (Huang, 2004), (Dahlem, 2009), (Wu, 2008), suggesting possible utility in visual processing. In the context of artificial systems, the use of wave dynamics for computation and recognition has only recently begun (Izhikevich, 2009), (Fernando, 2003), (Maass, 2007), (Adamatzky, 2002), (Adamatzky, 2005).
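A minimal sketch of spiral-arm placement with approximately uniform areal density is given below; the equal-area radial spacing and the fixed twist per ring are our own simplifying assumptions, not the adaptive algorithm used in our implementation.

import numpy as np

# Sketch of spiral-arm neuron placement with roughly uniform areal density:
# radial positions are spaced so that each neuron covers an equal annular
# area (r ~ sqrt(n)), and a fixed twist per ring turns spokes into spirals.
def spiral_disc(n_per_arm=10, n_arms=30, twist_per_ring=0.15):
    n = np.arange(1, n_per_arm + 1)
    r = np.sqrt(n / n_per_arm)                              # equal-area radial spacing
    coords = []
    for k in range(n_arms):
        theta = 2 * np.pi * k / n_arms + twist_per_ring * n # spiral twist along the arm
        coords.append(np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1))
    return np.stack(coords)                                 # shape (n_arms, n_per_arm, 2)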

Figure 15. RPN connectivity: spokes (left) and spirals (right). Spiral packing is superior to spoke packing.

Figure 16. a) Spiral and b) semi-circular propagating wave on the chicken retina. Images from (Yu, 2012) and (Dahlem, 2009).

Uniform vs Log-Polar Distribution

Log-polar solutions require precise neuron connectivity and a radially uniform but continuously increasing neuron density from the periphery to the centre of the visual field. This makes these systems biologically implausible and difficult to implement in hardware (Cavanagh, 1978), (Reitboeck, 1984). Additionally, log-polar solutions are particularly sensitive to centring, a problem not evidenced in biology. Instead, the RPN is constructed simply, with an approximately uniform neuron distribution throughout the visual field: firstly, because this is a more efficient use of available sensor space, and secondly, because recognition with this uniform distribution allows extension of the RPN to more challenging tasks and greater biological plausibility not achievable with log-polar designs.

Simple Neuron Distribution and Network Evolvability

The RPN, with spiral or spoke arms, is a simple time-delay neural network. The simplicity of this network is illustrated in the following example: if the network received the image of a small spot centred on the disc, this would result in a single spike stimulating a neuron at the centre of the disc. The resultant spike from this single neuron would produce a propagating spike wave front, or ripple, reaching the summing neuron along all the spiral arms at the same moment. Thus, in the RPN, the delay along each of the arms (spiral or spoke), as a spike propagates from neuron to neuron along the arm, is equal.

This property of matching delays between the arms may imply that the connectivity of the disc needs to be precisely ordered. This connectivity, however, could have evolved autonomously from a randomly connected phase to a self-organised phase based on reinforcing the delay between a spike occurring at the central neuron and a spike generated at the summing neuron. The arbitrary selection of a neuronal region as the disc centre and summing node would enable unsupervised wiring and pruning of connections. The process might include the spontaneous, genetically programmed activation of the central region followed by delay adaptation and pruning along and between the arms such that the signal received by the summing neuron is a single sharp impulse. The initial feedback loop between the disc centre and the summing neuron would create the error signal required to adjust delays and weaken inter-arm connections until the error signal is minimised. An argument against the existence of log-polar filters in the brain is that the required connectivity is highly intricate, and no mechanism for its construction has been proposed (van Hemmen, 2005). Thus, unlike previous models that have attempted to show view invariant recognition (Cavanagh, 1978), (Reitboeck, 1984), the RPN is evolvable using the mechanisms of competition and reinforcement learning present in biological neurons (Barabási, 2004).

Simple Neuron Distribution and Hardware Implementation

Unlike the log-polar filters, the RPN does not require increased neuron density at the image centre; in fact, the central regions are less significant and can often be left empty. The number of neurons on each arm can be set to increase with the radius from the centre, thus permitting a uniform density of sensors on the disc. This is a highly useful property for implementation in hardware, where rectangular grids and maximum element density are dictated by the various technologies. In this context, the structures on the initial sensor disc can be minimally complex: the sensors, being tasked only with producing a pulse upon excitation, can be made very small, and their output routed onto a propagation/buffer/flip-flop layer which itself need only cascade the spikes down its arms at every T_ripple. Such simple structures permit approximately radial placement of elements despite the rectangular grid on which most integrated circuits are developed.

RPN is Scalable in Parallel

Multiple discs with different neuron distributions and connectivity schemes can be used in parallel to increase the resolution available to the PCN network, increasing robustness without adding extra delays. Moreover, this increase in resolution does not require any restructuring of the system, allowing a smooth increase in performance as well as graceful degradation. Evidence of such separate parallel visual processing is particularly pronounced in invertebrates, where features such as colour, motion and stimulus timing are processed through anatomically segregated parts of the brain (Paulk, 2008). The parallelized structure allows different levels of recognition to occur simultaneously. Discs with sparse neuron populations can very rapidly (via very few propagation steps) provide a low-resolution outline of the TP, permitting the PCN to pre-inhibit obviously incongruent templates long before the first full-resolution TP arrives. In this way, discs with increasingly higher resolution would report their TPs, providing higher and higher confidence of increasingly refined classification, leading to high-speed hierarchical recognition.

The RPN Approach can be Extended to Three Dimensions

All the features of the RPN work equally well in three dimensions, and the network can just as easily recognize reconstructed three-dimensional images. In this context the disc is replaced by a sphere, with the three-dimensional image being mapped into the sphere, rippling outwards and being integrated at the surface. Here the sphere does not necessarily refer to the physical shape of the network but to the conceptual structure of the connectivity, with a highly connected sphere centre and connectivity radiating out to an integrating layer of neurons on the sphere surface. Given a 3D projection of an object within the sphere, skew invariant recognition could also be realised, which is among the most difficult challenges in image recognition (Pinto, 2008). The reconstruction of 3D images in artificial systems is a well-developed field (Faugeras, 1993). In contrast, the underlying mechanisms performing this 3D information representation task in humans are still an area of active research (Bülthoff, 1995), (Fang, 2009), where the evidence points to complex interactions between multiple mechanisms.

RPN Operates on Framed Inputs

The RPN converts two- (or three-) dimensional data into a one-dimensional TP that can theoretically be learnt by a single neuron with dendritic delays corresponding to the pattern. In this way, the RPN is similar to a Liquid State Machine (LSM), in that in both, high-dimensional data can be collapsed into lower-dimensional data, allowing multiple uses by downstream networks (Maass, 2002). However, unlike an LSM, which requires a continuous computational medium but can operate on continuous input signals, the RPN operates well with a limited number of neurons and connections, but requires framed inputs. The frame-based operation of the RPN makes it more useful from a hardware implementation perspective, but appears to detract from its biological plausibility, prompting a search for a frameless solution. Yet despite attempts to eliminate the frame requirement, to date every proposed and implemented recognition system, including those with the express goal of performing frameless event-based visual processing, has had to introduce some variant of a frame-based approach when attempting recognition, and although the approach tends to acquire different names along the way, the final result presented to the downstream memory system is the convergence of temporally distant pieces of information by the partial slowing or arresting of the leading segments of the incoming signal (Chen, 2009), (Wiesmann, 2012), (Perez-Carrasco, 2013), (Zelnik-Manor, 2001), (Farabet, 2012), (Lazar, 2011). However, this failure may speak more to the inherent nature of the visual recognition problem than to any lack of human ingenuity. Increasing evidence from neuroscience points to functional synchronicity in the visual cortex in the form of synchronized gamma waves, where one might hypothesise an RPN or other recognition system to exist. The function of this synchronicity has been attributed to the unification of related elements in the visual field, an effect especially pronounced with attention (Fries, 2009), (Gregoriou, 2009), (Van Rullen, 2007), (Buschman, 2009), (Dehaene, 2011), (Meador, 2002).
Furthermore, the mechanisms proposed to explain such observed waves correspond to a more distributed analogue of the RPN's inhibitory neuron (Martinez, 2005), namely the inhibitory lateral and feedback connections that clump related but spatially distant information into compact wavefronts separated by periods of inactivity. This convergence from separate fields may point to the usefulness of temporal synchrony for visual inputs in the context of recognition (Seth, 2004).

The Shortcomings of the RPN Motivate the Dynamic RPN

A significant drawback of the RPN system is the need for precise centring of a salient object by an unexplained salience detection system. This system not only needs to detect objects of interest based on cues such as colour or movement, but, more challengingly, it must shift the object onto the neuron disc. In the context of centralized processing, the problem of shifting an image by an arbitrary value is trivial; however, in the context of a self-similar distributed network with no central control, the task is challenging. A proposed solution is the use of dynamic routing systems (Olshausen, 1993), (Hudson, 1997), where a series of route-controlling units transport the input image to some hypothetical central recognition aperture in the brain. The switching speeds required to operate such control systems, however, are too high to be biologically achievable, yet we know that humans and other animals are capable of recognizing non-centred objects in the visual field. Even more importantly, the extraordinary high-speed performance of biological systems (the primary motivation for our approach) points to a more distributed system than this naïve centralized solution. In a follow-up paper we will present our solution to the salience detection black box and show that by restricting the design to distributed, and thus biologically plausible, operations, a more powerful solution presents itself, namely a high-speed, fully parallel, multiple-scale, multiple-object recognition system operating on stochastically structured networks, which we call the dynamic Ripple Pond Network (dRPN).

CONCLUSION

In this paper we have introduced the RPN-PCN system, a simple yet robust biologically inspired view invariant vision recognition system that is hardware implementable and capable of rapidly performing simple vision recognition tasks. We demonstrated the very particular disabilities of the system, which appear to coincide with those of our own distributed recognition system. We described a few of the ways in which the RPN can be extended, as well as detailing its shortcomings. With these as motivation, we briefly outlined the characteristics of a more powerful general solution that meets these challenges and more.

REFERENCES
Adamatzky, A., De Lacy Costello, B., & Ratcliffe, N. M. (2002). Experimental reactiondiffusion preprocessor for shape recognition. Physics Letters A, 297(56), 344352. Adamatzky, A., De Lacy Costello, B., & Asai, T. (2005). Reaction-Diffusion Computers. Elsevier Science. Afshar S., Kavehei O., van Schaik A., Tapson J., Skafidas S., Hamilton T.J., (2012). "Emergence of competitive control in a memristor-based neuromorphic circuit," Neural Networks (IJCNN), The 2012 International Joint Conference on , vol., no., pp.1-8. Ahnelt PK, Kolb H, Pflug R. (1987). Identification of a subtype of cone photoreceptor, likely to be blue sensitive, in the human retina, J Comp Neurol. 1987 Jan 1;255(1):18-34. Arena, P., et al. (2012). "Learning expectation in insects: a recurrent spiking neural model for spatio-temporal representation." Neural Netw 32: 35-45. Arthur, J. V. and K. Boahen (2004). Recurrently connected silicon neurons with active dendrites for one-shot learning. Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on. Avargues-Weber, A., et al. (2010). "Configural processing enables discrimination and categorization of facelike stimuli in honeybees." J Exp Biol 213(4): 593-601. Barabsi A.L., Oltvai Z.N. (2004). Network biology: understanding the cell's functional organization, Nat Rev Genet. 5(2):101-13. Bressloff P.C., Cowan J.D., Golubitsky M., Thomas P.J., Wiener M.C. (2002). What geometric visual hallucinations tell us about the visual cortex, Neural Comput. 14(3):473-91. Blthoff H.H., Edelman S.Y., Tarr M.J. (1995). How are three-dimensional objects represented in the brain? Cereb Cortex. 5(3):247-60. Buschman, T. J. and E. K. Miller (2009). "Serial, covert shifts of attention during visual search are reflected by the frontal eye fields and correlated with population oscillations." Neuron 63(3): 386-396. Carandini M. and Heeger D. J., (2012). "Normalization as a canonical neural computation," Nat Rev Neurosci, vol. 13, pp. 51-62. Cavanagh P. (1978). "Size and position invariance in the visual system," Perception 7(2) pg. 167 177.

Shoushun, C., et al. (2009). A bio-inspired event-based size and position invariant human posture recognition algorithm, Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium on. Chessa M., Sabatini S.P., Solari F., and Tatti F., (2011). A quantitative comparison of speed and reliability for log-polar mapping techniques, In Proceedings of the 8th international conference on Computer vision systems (ICVS'11), James L. Crowley, Bruce A. Draper, and Monique Thonnat (Eds.). Springer-Verlag, Berlin, Heidelberg, 41-50. Choi, T. Y. W., et al. (2005). "Neuromorphic implementation of orientation hypercolumns." Circuits and Systems I: Regular Papers, IEEE Transactions on 52(6): 1049-1060. Caley, M.J., & Schluter, D. (2003). Predators favour mimicry in a tropical reef fish. Proceedings of the Royal Society of London. Series B: Biological Sciences , 270 (1516 ), 667672. doi:10.1098/rspb.2002.2263 Dahlem M.A., Chronicle E.P. (2004). A computational perspective on migraine aura, Prog Neurobiol. 74(6):351-61. Dahlem MA, Hadjikhani N. (2009). Migraine aura: retracting particle-like waves in weakly susceptible cortex, PLoS One. 4(4):e5007. Dahlem M.A., Mller S.C. (1997). Self-induced splitting of spiral-shaped spreading depression waves in chicken retina, Exp Brain Res. 115(2):319-24. Dehaene, S., & Changeux, J.-P. (2011). Experimental and Theoretical Approaches to Conscious Processing. Neuron, 70(2), 200227. Dill, M., Wolf, R., & Heisenberg, M. (1993). Visual pattern recognition in Drosophila involves retinotopic matching. Nature, 365(6448), 751753. Drazen D., Lichtsteiner P., Hfliger P., Delbrck T., Jensen A., (2011). Toward real-time particle tracking using an event-based dynamic vision sensor, Experiments in Fluids, 51(5) pp 1465-1469. Dudek P. (2005). A general-purpose processor-per-pixel analog SIMD vision chip, IEEE Transactions on Circuits and Systems I: Regular Papers 52(1) pp13-20. Fang L., Grossberg S. (2009). From stereogram to surface: how the brain sees the world in depth, Spat Vis. 22(1):45-82. Farabet, C., et al. (2012). "Comparison between Frame-Constrained Fix-Pixel-Value and Frame-Free SpikingDynamic-Pixel ConvNets for Visual Processing." Front Neurosci 6: 32. Faugeras O. (1993). Three-Dimensional Computer Vision, The MIT Press. Fernando, C., & Sojakka, S. (2003). Pattern Recognition in a Bucket. In W. Banzhaf, J. Ziegler, T. Christaller, P. Dittrich, & J. Kim (Eds.), Advances in Artificial Life SE - 63 (Vol. 2801, pp. 588597). Springer Berlin Heidelberg. Folowosele F.O., Vogelstein R.J., Etienne-Cummings R., (2011). Towards a Cortical Prosthesis: Implementing A Spike-Based HMAX Model of Visual Object Recognition in Silico, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 1(4), pg. 516525. Fries P. (2009). Neuronal gamma-band synchronization as a fundamental process in cortical computation, Annu Rev Neurosci. 32:209-24. Furber, S.B. et al. (2012). "Overview of the SpiNNaker System Architecture." IEEE Transactions on Computers 99 (PrePrints). Gao D., Vasconcelos N., (2009). Decision-Theoretic Saliency: Computational Principles, Biological Plausibility, and Implications for Neurophysiology and Psychophysics, Neural Computation 21(1), pages 239-271. Gattass R., et al. (2005). Cortical visual areas in monkeys: location, topography, connections, columns, plasticity and cortical dynamics, Philos Trans R Soc Lond B Biol Sci. 2005 Apr 29;360 (1456):709-31. Gherardi, F., et al. (2012). "Revisiting social recognition systems in invertebrates." 
Anim Cogn 15(5): 745-762. Gong B., Shi Y., Sha F., Grauman K. (2012). Geodesic Flow Kernel for Unsupervised Domain Adaptation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Page(s): 2066 - 2073. Gregoriou G.G., Gotts S.J., Zhou H., Desimone R. (2009). High-frequency, long-range coupling between prefrontal and visual cortex during sustained attention, Science. 324(5931):1207-10. Gtig, R., & Sompolinsky, H. (2009). Time-WarpInvariant Neuronal Processing. PLoS Biol, 7(7), e1000141. Hall, T. S., et al. (2004). Developing large-scale field-programmable analog arrays. Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International. Hamilton T.J., Tapson J., (2011)."A neuromorphic cross-correlation chip," Circuits and Systems (ISCAS), 2011 IEEE International Symposium on , vol., no., pp.865-868. Huang X., Troy W.C., Yang Q., Ma H., Laing C.R., Schiff S.J., Wu J.Y. (2004). Spiral waves in disinhibited mammalian neocortex, J Neurosci. 24(44):9897-902. Hudson, P. T., et al. (1997). "SCAN: A Scalable Model of Attentional Selection." Neural Netw 10(6): 993-1015. Huerta, R. and T. Nowotny (2009). "Fast and robust learning by reinforcement signals: explorations in the insect brain." Neural Comput 21(8): 2123-2151.

Hussain S., Basu A., Wang M., Hamilton T.J. (2012). DELTRON: Neuromorphic architectures for delay based learning, 2012 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), pp. 304-307.
Iftekharuddin K.M. (2011). Transformation Invariant On-Line Target Recognition, IEEE Transactions on Neural Networks 22(6): 906-918.
Indiveri G., et al. (2011). Neuromorphic silicon neuron circuits, Frontiers in Neuroscience 5.
Itti L., Koch C., Niebur E. (1998). A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 20(11): 1254-1259.
Itti L., Koch C. (2001). Computational modelling of visual attention, Nat Rev Neurosci 2(3): 194-203.
Izhikevich E.M. (2006). Polychronization: Computation with spikes, Neural Computation 18(2): 245-282.
Izhikevich E.M., Hoppensteadt F.C. (2009). Polychronous Wavefront Computations, International Journal of Bifurcation and Chaos 19(05): 1733-1739.
Jaeger H. (2001). The echo state approach to analysing and training recurrent neural networks, GMD Report 148, German National Research Institute for Computer Science.
Jhuang H., Serre T., Wolf L., Poggio T. (2007). A biologically inspired system for action recognition, IEEE 11th International Conference on Computer Vision (ICCV 2007).
Koch C., Poggio T., Torre V. (1983). Nonlinear interactions in a dendritic tree: localization, timing, and role in information processing, Proceedings of the National Academy of Sciences 80(9): 2799-2802.
Kohonen T. (1982). Self-organized formation of topologically correct feature maps, Biological Cybernetics 43: 59-69.
Land M.F. (1999). Motion and vision: why animals move their eyes, Journal of Comparative Physiology A 185(4): 341-352.
Lazar A.A., Pnevmatikakis E.A. (2011). Video Time Encoding Machines, IEEE Transactions on Neural Networks 22(3): 461-473.
Levy W.B., Baxter R.A. (1996). Energy efficient neural codes, Neural Comput 8(3): 531-543.
Loxley P.N., Bettencourt L.M., Kenyon G.T. (2011). Ultra-fast detection of salient contours through horizontal connections in the primary visual cortex, EPL (Europhysics Letters) 93(6): 64001.
Maass W., Natschläger T., Markram H. (2002). Real-time computing without stable states: a new framework for neural computation based on perturbations, Neural Comput 14(11): 2531-2560.
Maass W. (2007). Liquid Computing. In S.B. Cooper, B. Löwe, A. Sorbi (Eds.), Computation and Logic in the Real World (Vol. 4497, pp. 507-516), Springer Berlin Heidelberg.
Manevitz L., Hazan H. (2010). Stability and Topology in Reservoir Computing. In G. Sidorov, A. Hernández Aguirre, C. Reyes García (Eds.), Advances in Soft Computing (Vol. 6438, pp. 245-256), Springer Berlin Heidelberg.
Martinez D. (2005). Oscillatory synchronization requires precise and balanced feedback inhibition in a model of the insect antennal lobe, Neural Comput 17(12): 2548-2570.
McDonnell M.D., Ward L.M. (2011). The benefits of noise in neural systems: bridging theory and experiment, Nat Rev Neurosci 12(7): 415-426.
McDonnell M.D., et al. (2012). Input-rate modulation of gamma oscillations is sensitive to network topology, delays and short-term plasticity, Brain Res 1434: 162-177.
Meador K.J., Ray P.G., Echauz J.R., Loring D.W., Vachtsevanos G.J. (2002). Gamma coherence and conscious perception, Neurology 59(6): 847-854.
Meng Y., Jin Y., Yin J. (2011). Modeling Activity-Dependent Plasticity in BCM Spiking Neural Networks With Application to Human Behavior Recognition, IEEE Transactions on Neural Networks 22(12): 1952-1966.
Nakamura K., Arimura K., Takase K. (2002). Shape, Orientation and Size Recognition of Normal and Ambiguous Faces by a Rotation and Size Spreading Associative Neural Network, Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN '02), pp. 2439-2444.
Neri P. (2012). Feature binding in zebrafish, Animal Behaviour 84(2): 485-493.
Norouzi M., Ranjbar M., Mori G. (2009). Stacks of Convolutional Restricted Boltzmann Machines for Shift-Invariant Feature Learning, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2735-2742.
Olshausen B., et al. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information, The Journal of Neuroscience 13(11): 4700-4719.
Oster M., Douglas R., Liu S.-C. (2009). Computation with spikes in a winner-take-all network, Neural Computation 21(9).
Pardo F., et al. (1998). Space-variant nonorthogonal structure CMOS image sensor design, IEEE Journal of Solid-State Circuits 33(6): 842-849.
Perez-Carrasco J.A., Zhao B., Serrano C., Acha B., Serrano-Gotarredona T., Chen S., Linares-Barranco B. (2013). Mapping from Frame-Driven to Frame-Free Event-Driven Vision Systems by Low-Rate Rate-Coding and Coincidence Processing. Application to Feed Forward ConvNets, IEEE Trans Pattern Anal Mach Intell, 10 April 2013 [Epub ahead of print].
Pinto N., Cox D.D., DiCarlo J.J. (2008). Why is Real-World Visual Object Recognition Hard? PLoS Comput Biol 4(1): e27.
Paugam-Moisy H., Martinez R., Bengio S. (2008). Delay learning and polychronization for reservoir computing, Neurocomputing 71(7-9): 1143-1158.
Paulk A.C., et al. (2008). The Processing of Color, Motion, and Stimulus Timing Are Anatomically Segregated in the Bumblebee Brain, The Journal of Neuroscience 28(25): 6319-6332.
Ranhel J. (2012). Neural Assembly Computing, IEEE Transactions on Neural Networks and Learning Systems.
Rasche C. (2007). Neuromorphic Excitable Maps for Visual Processing, IEEE Transactions on Neural Networks 18(2): 520-529.
Reitboeck H.J., Altmann J. (1984). A model for size- and rotation-invariant pattern processing in the visual system, Biological Cybernetics 51.
Riesenhuber M., Poggio T. (1999). Hierarchical models of object recognition in cortex, Nat Neurosci 2(11): 1019-1025.
Serrano-Gotarredona R., Oster M., Lichtsteiner P., et al. (2009). CAVIAR: A 45k Neuron, 5M Synapse, 12G Connects/s AER Hardware Sensory-Processing-Learning-Actuating System for High-Speed Visual Object Recognition and Tracking, IEEE Transactions on Neural Networks 20(9): 1417-1438.
Serre T., Kouh M., Cadieu C., Knoblich U., Kreiman G., Poggio T. (2005). A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex, AI Memo 2005-036, CBCL Memo 259.
Seth A.K., McKinstry J.L., Edelman G.M., Krichmar J.L. (2004). Visual Binding Through Reentrant Connectivity and Dynamic Synchronization in a Brain-based Device, Cerebral Cortex 14(11): 1185-1199.
Shi B.E., et al. (2006). Expandable hardware for computing cortical feature maps, 2006 IEEE International Symposium on Circuits and Systems (ISCAS).
Sivilotti M.A. (1991). Wiring considerations in analog VLSI systems, with application to field-programmable networks, California Institute of Technology.
Sonka M., Hlavac V., Boyle R. (2007). Image Processing, Analysis, and Machine Vision, 3rd ed., CL Engineering.
Sountsov P., et al. (2011). A Biologically Plausible Transform for Visual Recognition that is Invariant to Translation, Scale, and Rotation, Front Comput Neurosci 5: 53.
Tapson J., van Schaik A. (2013a). Learning the pseudoinverse solution to network weights, Neural Networks, online.
Tapson J., Cohen G., Afshar S., Stiefel K., Buskila Y., Wang R., Hamilton T.J., van Schaik A. (2013b). Synthesis of neural networks for spatio-temporal spike pattern recognition and processing, submitted to Frontiers in Neuromorphic Engineering.
Traver V.J., Bernardino A. (2010). A review of log-polar imaging for visual perception in robotics, Robotics and Autonomous Systems 58(4): 378-398.
Tricarico E., Borrelli L., Gherardi F., Fiorito G. (2011). I Know My Neighbour: Individual Recognition in Octopus vulgaris, PLoS ONE 6(4): e18710.
Van der Velden J., Zheng Y., Patullo B.W., Macmillan D.L. (2008). Crayfish Recognize the Faces of Fight Opponents, PLoS ONE 3(2): e1695.
van Hemmen J.L., Sejnowski T.J. (2005). 23 Problems in Systems Neuroscience, Oxford University Press, USA.
Van Rullen R., Thorpe S.J. (2001). Rate coding versus temporal order coding: what the retinal ganglion cells tell the visual cortex, Neural Comput 13(6): 1255-1283.
Van Rullen R., et al. (2007). The blinking spotlight of attention, Proceedings of the National Academy of Sciences 104(49): 19204-19209.
Vogelstein R.J., et al. (2007). A Multichip Neuromorphic System for Spike-Based Visual Information Processing, Neural Comput 19(9): 2281-2300.
Volman V., Levine H., Sejnowski T.J. (2010). Shunting Inhibition Controls the Gain Modulation Mediated by Asynchronous Neurotransmitter Release in Early Development, PLoS Comput Biol 6(11): e1000973.
Wang R., Tapson J., Hamilton T.J., van Schaik A. (2011). An analogue VLSI implementation of polychronous spiking neural networks, Seventh International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), pp. 97-102.
Wang R., Cohen G., Stiefel K.M., Hamilton T.J., Tapson J., van Schaik A. (2013). An FPGA implementation of a polychronous spiking neural network with delay adaptation, Frontiers in Neuroscience 7(14).
Wang X.-J. (2010). Neurophysiological and Computational Principles of Cortical Rhythms in Cognition, Physiological Reviews 90(3): 1195-1268.
Wiesmann G., et al. (2012). Event-driven embodied system for feature extraction and object recognition in robotic applications, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
Wills S. (2004). Computation with Spiking Neurons, Ph.D. dissertation, University of Cambridge.
Wu J.Y., Huang X., Zhang C. (2008). Propagating waves of activity in the neocortex: what they are, what they do, Neuroscientist 14(5): 487-502.
Yu Y., Santos L.M., Mattiace L.A., Costa M.L., Ferreira L.C., Benabou K., Kim A.H., Abrahams J., Bennett M.V., Rozental R. (2012). Reentrant spiral waves of spreading depression cause macular degeneration in hypoglycemic chicken retina, Proc Natl Acad Sci U S A 109(7): 2585-2589.
Zelnik-Manor L., Irani M. (2001). Event-based analysis of video, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001).