A Simple Algorithm For Topographic ICA
Ewaldo Santana1, Allan Kardec Barros1, Christian Jutten2, Eder Santana1 and Luis Claudio Oliveira1
1 Federal University of Maranhão, Dept. Electrical Engineering, São Luís, Maranhão, Brazil
2 GIPSA-lab, Department Images et Signal, Saint Martin d'Hères Cedex, France
Abstract: A number of algorithms have been proposed which find structures that resemble that of the visual cortex. However, most of these works require sophisticated computations and lack a rule for how the structure arises. This work presents an unsupervised model for finding topographic organization with a very simple and local learning algorithm. Using a simple rule in the algorithm, we can anticipate which kind of structure will result. When applied to natural images, this model yields an efficient code and the emergence of simple-cell-like receptive fields. Moreover, we conclude that local interactions in spatially distributed systems and local optimization with the L2 norm are sufficient to create sparse bases, which normally require higher-order statistics.
Santana, E., Barros, A., Jutten, C., Santana, E. and Oliveira, L.
A Simple Algorithm for Topographic ICA.
DOI: 10.5220/0005683501500155
In Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2016) - Volume 4: BIOSIGNALS, pages 150-155
ISBN: 978-989-758-170-0
Copyright © 2016 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
tive fields when applied to natural images. In this work we propose an unsupervised model for finding topographic organization using convex nonlinearities with a very simple and local learning algorithm. Our development is based on the fact that most nonlinearities yield higher-order statistical properties in modeling. In our context, we model the neuron as a system which performs non-linear stimulus filtering Barros and Ohnishi (2002), and use its output as input to the neighboring neurons. This arrangement gives spatially coherent responses. This paper is organized as follows: in Section 2 we present the model and develop the methods for obtaining the algorithm; Section 3 shows simulations and results; discussions and conclusions are presented in Section 4.

2 METHODS

Figure 1: Proposed model, which works in a parallel fashion. First layer: input vector x; second layer: the input vector is locally linearly combined by the synaptic weight vectors w, yielding the signal u; third layer: the signal u passes through a non-linearity, f(.), to capture higher-order characteristics of the input signal; fourth layer: the local discrepancy.

Input signals, x_k = [x_1(k), ..., x_n(k)], are assumed to be generated by a linear combination of real-world source signals, s(k) = [s_1(k), ..., s_n(k)]^T, x = As, where A is an unknown set of basis functions.

In order to facilitate further development, we also assume that the input signals are preprocessed by whitening Hyvarinen and Hoyer (2000). In the demixing model each signal, x_j, at the input of synapse j connected to neuron k is multiplied by the synaptic weight w_kj and summed to produce the linearly combined output signal of neuron k, which is an estimate of s_k,

    u_k = ∑_{j=1}^{n} w_kj x_j ,    y_k = f(u_k),    (1)

where f is an even symmetric non-linear function, which is chosen to exploit the higher-order statistical information in the input signal Hyvarinen and Hoyer (2000). This signal, y_k, interacts with the neighboring outputs, since in the nervous system the response of a neuron tends to be correlated with the response of its neighboring area Durbin and Mitchison (2007). So, y_k is used to generate a signal that adapts w_kj and makes the neuron's output coherent with the responses of its neighbors.

We define the neighborhood of neuron k as all neurons i that are close to neuron k in a matrix representation, and define the action of a neuron's neighborhood as a weighted sum of the output signals of all neurons in the neighborhood,

    v_k = ∑_i y_ik = ∑_i f(u_ik),    (2)

where i runs over the indices of the outputs neighboring k. The signal v_k interacts with y_k to generate a self-reference signal, ε_k = y_k + α v_k, where α determines the influence of the neighborhood on the neuron's activation. This self-reference signal must be minimized in order to reduce the local redundancy. Thus, we define the following local cost,

    J_k = E[ε_k²] = E[(y_k + α v_k)²],    (3)

which should be minimized with respect to all weights of neuron k. Taking the partial derivative of Eq. (3) with respect to the parameters and equating it to zero, we obtain the following equation:

    2E{f′(u_k)(y_k + α v_k)x} = 0,    (4)

where we used the fact that the derivative of f(u_ik) with respect to w_k is zero for i ≠ k.

In order to avoid the trivial solution of zero weight, we use Lagrange multipliers to minimize J_k under the constraint ‖w_k‖ = 1, and observe that the derivative of ∑_i f(u_ik) with respect to w_k is zero. With this, we obtain the following two-step algorithm,

    w_k = E{f′(u_k) ε_k x},    w_k = w_k / ‖w_k‖.    (5)

This procedure is carried out in parallel to adjust the weights of every neuron in the network, where we include orthogonalization steps taken from Foldiák (1990). In this way a matrix W is obtained. Moreover, we can show that the proposed algorithm converges, as stated in the theorem below.

Theorem: Assume that the observed stimuli follow the ICA model and that the whitened version of
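The two-step rule in Eq. (5) is local and easy to implement. Below is a minimal batch sketch of ours, not the authors' code: the quadratic nonlinearity f(u) = u², the function names, and the learning setup are illustrative assumptions, and the orthogonalization steps borrowed from Foldiák (1990) are omitted for brevity.

```python
import numpy as np

def topographic_update(W, X, neighbors, alpha=0.1):
    """One batch application of the two-step rule of Eq. (5) to every neuron.

    W         : (n_neurons, n_inputs) weight matrix, rows are the w_k
    X         : (n_inputs, n_samples) whitened input vectors
    neighbors : list where neighbors[k] holds the indices of the
                neighborhood v_k of neuron k
    """
    f = lambda u: u ** 2             # even symmetric nonlinearity (assumed choice)
    f_prime = lambda u: 2.0 * u

    U = W @ X                        # linear outputs u_k, Eq. (1)
    Y = f(U)                         # nonlinear outputs y_k
    V = np.stack([Y[idx].sum(axis=0) for idx in neighbors])  # v_k, Eq. (2)
    eps = Y + alpha * V              # self-reference signal eps_k
    # Step 1: w_k <- E{ f'(u_k) eps_k x }, estimated as a batch average
    W_new = (f_prime(U) * eps) @ X.T / X.shape[1]
    # Step 2: renormalize each row to satisfy the constraint ||w_k|| = 1
    W_new /= np.linalg.norm(W_new, axis=1, keepdims=True)
    return W_new
```

Iterating this update over whitened image patches, with the neighborhood structure defined in the Appendix, is how one would train the full network; a practical run would also interleave the omitted orthogonalization of W between updates.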
REFERENCES

Hyvarinen, Aapo and Hoyer, Patrik O., Topographic Independent Component Analysis. Neural Computation, v.13, pp 1527-1558, 2000.

Kohonen, T., Self-Organizing Maps. Springer, 2000.

Linsker, R., Local Synaptic Rules Suffice to Maximize Mutual Information in a Linear Network. Neural Computation, v.4, pp 691-702, 1992.

Laughlin, Simon, A Simple Coding Procedure Enhances a Neuron's Information Capacity. Z. Naturforsch., v.36, pp 910-912, 1981.

Oja, E., Neural Networks, Principal Components and Linear Neural Networks. Neural Networks, v.5, pp 927-935, 1989.

Olshausen, B. A. and Field, D. J., Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images. Nature, v.381, pp 607-609, 1996.

Osindero, Simon, Welling, Max and Hinton, Geoffrey E., Topographic Product Models Applied to Natural Scene Statistics. Neural Computation, v.18, pp 381-414, 2006.

Ray, K. L., McKay, D. R., Fox, P. M., Riedel, M. C., Uecker, A. M., Beckmann, C. F., Smith, S. M., Fox, P. T. and Laird, A., ICA Model Order Selection of Task Co-activation Networks. Front. Neurosci., v.7, 237, 2013.

Simoncelli, E. P. and Olshausen, B. A., Natural Image Statistics and Neural Representation. Annu. Rev. Neurosci., v.24, pp 1193-1216, 2001.

Shannon, C. E., A Mathematical Theory of Communication. Bell System Technical Journal, v.27, pp 379-423, 1948.

Stork, D. G. and Wilson, H. R., Do Gabor Functions Provide Appropriate Descriptions of Visual Cortical Receptive Fields? J. Opt. Soc. Am. A, v.7, pp 1362-1373, 1990.

Vinje, W. E. and Gallant, J. L., Sparse Coding and Decorrelation in Primary Visual Cortex During Natural Vision. Science, v.287, pp 1273-1276, 2000.

APPENDIX

Take an indexed set of N neurons, as one can see in Table 1, and put them into a √N × √N matrix (let us suppose N = 64, for example), as one can see in Table 2.

Table 1: A set of N neurons.

    1 2 3 4 5 6 7 8 ··· N

We define the neighborhood of neuron k as all neurons i that are close to neuron k in the matrix representation. In other words, neurons i are neighbors of neuron k if the coordinates of neuron i obey the following relation:

    D(i, k) = √((a_i − a_k)² + (b_i − b_k)²) ≤ T,    (7)

where (a_k, b_k) and (a_i, b_i) are the coordinates of neuron k and neuron i, respectively, in the matrix representation, and T is a constant that defines the neighborhood size. For example, for T = 1, the neighborhood of neuron 28 is as highlighted in Table 2. In this case v_28 = {19, 20, 21, 27, 29, 35, 36, 37}.

Table 2: One neighborhood of neuron 28.

     1  2  3  4  5  6  7  8
     9 10 11 12 13 14 15 16
    17 18 19 20 21 22 23 24
    25 26 27 28 29 30 31 32
    33 34 35 36 37 38 39 40
    41 42 43 44 45 46 47 48
    49 50 51 52 53 54 55 56
    57 58 59 60 61 62 63 64

With this definition of neighborhood, we are able to analyze the stability of the algorithm in the following steps.

Assume that the signal x can be modeled as x = As, where A is an invertible matrix.

Let us make the following change of coordinates, p_k = [p_1, p_2, ..., p_n]^T = A^T V^T w_k, where V is the whitening matrix used to obtain the signals x.

In this new coordinate basis, we define the cost function as

    J(p_k) = E[ε_k²] = E{[f(p_k^T s) + α v_k]²}.    (8)

One can analyze, without loss of generality, the stability of J(p_k) at the point p_k = [0, ..., p_k, ..., 0]^T = e_k. In this case p_k = 1, and then w_k is one row of the inverse of VA. By assumption, J(p_k) is an even function, so this analysis also applies for p_k = −1.

Let d = [d_1, ..., d_n]^T be a small perturbation of e_k. Expanding J(e_k + d) in a Taylor series about e_k, we obtain

    J(e_k + d) = J(e_k) + d^T ∂J(e_k)/∂p + (1/2) d^T [∂²J(e_k)/∂p²] d + o(‖d‖²),    (9)

with

    ∂J(p)/∂p_k = s_k f′(p^T s)[f(p^T s) + v_k],    (10)

    ∂²J(p)/∂p_k² = s_k² f″(p^T s)[f(p^T s) + v_k] + s_k² f′(p^T s) f′(p^T s),    (11)

and

    ∂²J(p)/∂p_k ∂p_j = s_k s_j f″(p^T s)[f(p^T s) + v_k] + s_k s_j f′(p^T s) f′(p^T s).    (12)

Taking into account that signals outside the neighborhood are statistically independent, we have

    d^T ∂J(e_k)/∂p = 2{E[s_k f(s_k) f′(s_k)] d_k + α ∑_{i∈v_k} E[s_i f(s_k) f′(s_k)] d_i},    (13)

because, as a consequence of independence, E[s_j f(s_k) f′(s_k)] = 0 if j ∉ v_k.
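The neighborhood rule of Eq. (7) is simple to check numerically. The sketch below is illustrative and not from the paper; note that the diagonal members of the quoted set v_28 lie at Euclidean distance √2 from neuron 28, so reproducing that set requires a threshold √2 ≤ T < 2 rather than T = 1 exactly (the paper's T = 1 apparently denotes the full one-cell ring).

```python
import math

def neighborhood(k, N, T):
    """Neurons i with 0 < D(i, k) <= T in the sqrt(N) x sqrt(N) grid of Eq. (7).

    Neurons are numbered 1..N row by row, as in Table 1.
    """
    side = int(math.isqrt(N))
    a_k, b_k = divmod(k - 1, side)            # (row, col) of neuron k
    out = []
    for i in range(1, N + 1):
        a_i, b_i = divmod(i - 1, side)
        if 0 < math.hypot(a_i - a_k, b_i - b_k) <= T:
            out.append(i)
    return out
```

Here `neighborhood(28, 64, 1.5)` returns [19, 20, 21, 27, 29, 35, 36, 37], matching the set quoted above, while a strict Euclidean threshold of 1 keeps only [20, 27, 29, 36].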