1 Introduction
The theory of evolution, proposed by Darwin and Wallace (1; 2), is one of the most important theories
we have and is fundamental to understanding our world. It is widely accepted by scientists as the
starting point for studying everything related to living beings or their parts. All organs
of living creatures were molded by evolution.
The brain, too, grew and developed under the action of natural selection, and it works in a way
that reflects its evolutionary history. But what does that history mean? It means
the development, from generation to generation and from species to species, of a complex system of
prompt responses to external stimuli, a fundamental condition for survival. In fact, a living being
must react, in many cases instantaneously, to escape a predator, to catch prey, to avoid
an accident, or for many other reasons. This means that the nervous systems of animals
developed as the fundamental tools these organisms use to react promptly to changes in the
environment.
After millions of years of development of nervous systems, generally increasing in size
and complexity, a concentrated region of nerve cells appeared, the primitive brain, which quickly
increased in size. This was a breakthrough in the development of living beings, but the task
remained the same: allowing the living being to react quickly and more efficiently to changes in the
environment. This task was at the origin of the brain and is recorded “au fer et au feu” in the way
the brain works. This stimulus-dependent way of working should lie behind all the tasks the
brain performs, the old ones (essentially instinctive reactions) and the biologically newer ones (more
related to cognition).
We will discuss here a stimulus-dependent neural network model for pattern recognition, although
a similar scenario can certainly be constructed for other brain activities. We can summarize
the main steps of pattern recognition by a living being as:
• through its history, the living being has accumulated memories (input patterns from the past)
that are stored in some way, for example via the Hebb rule,
• without an external stimulus, the system does not recognize any pattern; it remains in a noisy state,
• in the presence of an external stimulus associated with some stored pattern, the system
recognizes that pattern; if the external stimulus has no relation to any stored pattern, the
system does not recognize any stored memory, not even the input pattern,
• if the external stimulus associated with some memory disappears, the effectiveness of the recog-
nized pattern decreases, going practically to zero after some time delay and returning the system
to the noisy state,
• if the external stimulus changes abruptly, as often happens in nature, the system quickly
switches from the pattern recognized under the first stimulus to the pattern recognized
under the second stimulus,
• these steps are repeated again and again for each new stimulus or sequence of
stimuli.
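The cycle above can be sketched in a toy Hopfield-type network driven by an external field. This is only an illustrative sketch: the network size, the load, and the stimulus intensity (here called kappa) are arbitrary choices, not values fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, kappa = 400, 200, 0.85          # load p/N = 0.5, well above the Hopfield limit

xi = rng.choice([-1, 1], size=(p, N))            # stored memories
J = (xi.T @ xi).astype(float) / N                # Hebbian couplings
np.fill_diagonal(J, 0.0)

def run(sigma, eta, kappa, sweeps=50):
    """Synchronous zero-temperature dynamics with external field kappa*eta."""
    for _ in range(sweeps):
        h = J @ sigma + kappa * eta
        sigma = np.where(h >= 0, 1, -1)
    return sigma

overlap = lambda s, mu: (s @ xi[mu]) / N

sigma = rng.choice([-1, 1], size=N)              # random initial state
sigma = run(sigma, np.zeros(N), 0.0)             # no stimulus: noisy state
m_noise = overlap(sigma, 0)

sigma = run(sigma, xi[0], kappa)                 # stimulus correlated with memory 0
m_during = overlap(sigma, 0)                     # pattern is recognized

sigma = run(sigma, np.zeros(N), 0.0)             # stimulus removed
m_after = overlap(sigma, 0)                      # recognition decays away

print(m_noise, m_during, m_after)
```

The three printed overlaps follow the steps listed above: small without a stimulus, close to one while the correlated stimulus is present, and small again after the stimulus disappears.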
A biologically realistic pattern recognition neural network model should reproduce the steps
sketched above. These steps have many good features, pointing in the direction of the modern view
of complex systems as systems that live at the border between order and chaos. Purely chaotic,
ordered, or disordered systems can never be good models for a neural network,
or even for life. A really effective neural network model should work at the order-chaos border.
This paper is organized as follows: in section II we discuss attractor neural networks
(ANN) and some problems related to them. In section III we present our model, intended to respect
the several points sketched in this introduction. We also discuss the new concepts involved in this
approach, including a discussion of the number of stored patterns and how this number
can be much larger than in ANN models. This approach, or framework, can be used in many neural
network (NN) models, but we will illustrate it here with the Hopfield model, due to its simplicity.
In section IV we apply our approach to the Hopfield model, present analytical calculations, and
present some numerical results, focusing mainly on the number of stored patterns
and on the quality of the pattern recognition. In section V we present our final comments.
as the number of stored memories increases, severely limiting the number of stored memories, even
if these memories are not correlated. Improvements on this model, such as making the coupling constants
asymmetric (so that there is no longer a Hamiltonian) (4), coupling one memory to another
(5), dilution (6; 7), etc., modify some details but still fail to satisfy some of the items listed
above, especially items 4 and 5, while items 2 and 3 cannot always be satisfied. In general these
systems also work on the ordered side, not at the order-chaos border; that is why they are ANN.
Asymmetric coupling constants, coupling among sequential memories, and dilution lead
to fixed points of the dynamics or to cycles, which also belong to the ordered side; they could also
lead to a chaotic state (8). In both situations we are not at the order-chaos border but on
one side or the other, and both sides are inappropriate for dealing with a complex system like the
brain.
the order of the signal, and the system is no longer able to recognize any stored pattern. The
ability of the system to recognize stored memories fails because the variance of the noise
distribution swamps the signal from the initial external stimulus.
The main hypothesis we propose in this work is that long-time evolution allows
living beings to calibrate the influence of the external stimulus in such a way as to cancel, as
much as possible, the noise caused by the memories not correlated with the external stimulus. In
short, in the presence of an external stimulus correlated with some stored memory, we can write
the influence (h_i) exerted on a particular neuron i by the other neurons as
3.1 Example
Let us illustrate this framework with a simple neural network model, one for which we can even obtain
many analytical results; in fact, this approach can be used in any NN model. We use the Hopfield
model as our example because the many analytical results available for it help us understand
better what is going on.
In our framework, applied to the simplest Hopfield model, the local field hi at time t, acting on
the i-th neuron, has the following expression:
\[ h_i(t) = \sum_{j \neq i}^{N} J_{i,j}\,\sigma_j(t) + \kappa\,\eta_i \,, \tag{2} \]
where σ_i(t) = ±1 represents the state of the i-th neuron at time t (activated or at rest), J_{i,j}
represents the intensity of the synapse between neurons i and j, here taken to be symmetric,
η_i = ±1 represents the effect of the external stimulus on the i-th neuron, remaining fixed for as long
as the external stimulus is present, and κ represents the intensity with which the system takes
the external stimulus into account in the recognition process, a fundamental point in what follows.
The intensity of the synapses J_{i,j} can be expressed as a function of the p stored memories {ξ_i^μ} via Hebb's rule:

\[ J_{i,j} = \frac{1}{N}\sum_{\mu=1}^{p} \xi_i^{\mu}\,\xi_j^{\mu} \qquad (i \neq j), \tag{3} \]
where ξ_i^μ = ±1 for any i and μ. The memories are assumed orthogonal, satisfying the distribution

\[ P(\xi_i^{\mu}) = \tfrac{1}{2}\,\delta(\xi_i^{\mu} - 1) + \tfrac{1}{2}\,\delta(\xi_i^{\mu} + 1)\,. \tag{4} \]
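Eqs. (2)-(4) translate directly into code. The sketch below (the sizes, the seed, and the value of κ are arbitrary illustrative choices) builds the Hebbian couplings and evaluates the local field with the stimulus term:

```python
import numpy as np

rng = np.random.default_rng(42)
N, p, kappa = 200, 20, 0.85

# Eq. (4): p binary memories, each component +/-1 with probability 1/2
xi = rng.choice([-1, 1], size=(p, N))

# Eq. (3): Hebb's rule, J_ij = (1/N) sum_mu xi_i^mu xi_j^mu, with J_ii = 0
J = (xi.T @ xi).astype(float) / N
np.fill_diagonal(J, 0.0)

# Eq. (2): local field on every neuron, with stimulus eta = stored pattern 0
sigma = rng.choice([-1, 1], size=N)   # current neuron states
eta = xi[0]                           # stimulus correlated with memory 0
h = J @ sigma + kappa * eta

print(h[:5])
```

The field h is then used, for instance, in zero-temperature dynamics σ_i → sign(h_i).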
In general, for the storage of an extensive number of memories, p can be written as p = αN.
This is the case we are interested in here, since the storage of a non-extensive number of patterns
is well understood, and it is known that when α ≳ 0.14 the system is no longer able to recognize any pattern.
Therefore, any good improvement in the recognition process should be able to recognize a stored
pattern even for values of α > 0.14.
With respect to the external stimulus, we will consider that it may have some overlap with one
particular stored memory, say memory ρ, 1 ≤ ρ ≤ p. This can be expressed mathematically
by saying that η_i obeys the probability distribution:
The RHS of eq. (6) has three terms: the first is the signal, present when the configuration of the
neurons is the same as the stored memory {ξ_i^ρ}; the second is the noise, induced by the
memories other than ρ, considered here mutually orthogonal; and the third is the term induced
by the external stimulus. Consider now that, at a time t, the states of the neurons, {σ_i}, as
well as the external stimulus {η_i}, are exactly the same as the pattern {ξ_i^ρ}, i.e., γ = 1. Then,
\[ h_i(t) = \frac{N-1}{N}\,\xi_i^{\rho} + \frac{1}{N}\sum_{j \neq i}^{N}\sum_{\mu \neq \rho}^{p} \xi_i^{\mu}\,\xi_j^{\mu}\,\xi_j^{\rho} + \kappa\,\xi_i^{\rho}\,, \tag{7} \]
showing that κ helps the signal term against the noise term, which can have the opposite sign
with respect to ξ_i^ρ. In fact, if we could calibrate κ so that it has practically the
same strength as the variance of the noise, caused by the second RHS term of eq. (7), the external
stimulus can essentially cancel the noise term, the signal predominates, and the pattern can be
recognized even in the presence of large noise. This could allow us to recognize patterns in the
Hopfield model, for example, for values of p much higher than 0.14N (α = 0.14), the
upper limit of recognizable patterns in the standard Hopfield model. The calibrated value of κ is the crucial
point of this new framework. It cannot be very small, because then it will not cancel the noise term,
nor can it be much larger than the first two terms of the RHS of eq. (6), since in that case the
external field would dominate the local field and there would be no pattern recognition at all; the local
field would simply reproduce the external stimulus, whether associated with a memory or not.
Another comment: if the external stimulus changes after a time t*, being associated for times
t > t* with another stored memory, say {ξ^ν}, the memory {ξ^ρ} loses stability and is no longer
recognized, and the memory associated with the pattern ν becomes the recognized one. In this way,
the system changes the recognized pattern according to the external stimulus presented, as happens
normally. The model captures this important facet of living systems. In fact, all the points listed
in the introduction are satisfied, as they should be.
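This switching behavior can be checked in a small simulation. The sketch below is illustrative only (the network size, the low load, and the value κ = 1.2 are assumptions chosen to make the switch clean, not quantities from the text): the network is driven first by a stimulus correlated with memory ρ and then by one correlated with memory ν.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, kappa = 500, 3, 1.2            # low load; stimulus strong enough to switch

xi = rng.choice([-1, 1], size=(p, N))
J = (xi.T @ xi).astype(float) / N    # Hebbian couplings, eq. (3)
np.fill_diagonal(J, 0.0)

def evolve(sigma, eta, sweeps=20):
    """Zero-temperature synchronous dynamics under local field of eq. (2)."""
    for _ in range(sweeps):
        sigma = np.where(J @ sigma + kappa * eta >= 0, 1, -1)
    return sigma

m = lambda s, mu: (s @ xi[mu]) / N   # macroscopic overlap with memory mu

sigma = rng.choice([-1, 1], size=N)
sigma = evolve(sigma, xi[0])         # t < t*: stimulus correlated with rho = 0
m_rho = m(sigma, 0)

sigma = evolve(sigma, xi[1])         # t > t*: stimulus switches to nu = 1
m_nu, m_rho_after = m(sigma, 1), m(sigma, 0)

print(m_rho, m_nu, m_rho_after)
```

After the switch the overlap with ρ collapses while the overlap with ν grows to nearly one, as described in the text.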
In order to find this optimal value of κ, let us carry out, as far as possible, the
analytical calculation of the model given by eq. (2).
The Hamiltonian of this system, with symmetric J_{i,j} for simplicity, is:

\[ H = -\sum_{i}^{N} h_i(t)\,\sigma_i(t) = -\frac{1}{2N}\sum_{\substack{i,j \\ j\neq i}}^{N} \xi_i^{\rho}\,\xi_j^{\rho}\,\sigma_j(t)\,\sigma_i(t) - \frac{1}{2N}\sum_{\substack{i,j \\ j\neq i}}^{N}\sum_{\mu\neq\rho}^{p} \xi_i^{\mu}\,\xi_j^{\mu}\,\sigma_j(t)\,\sigma_i(t) - \kappa\sum_{i=1}^{N}\eta_i\,\sigma_i(t)\,, \tag{8} \]
where ⟨...⟩_{η,ξ} indicates the quenched average over η or ξ; we will use the replica method to calculate
the free energy.
The quenched free energy can be expressed as a function of the following order parameters:
(a) the macroscopic superposition with the condensed pattern (in our model there is just one condensed
pattern),

\[ m^{\rho} = \frac{1}{N}\left\langle \sum_{i=1}^{N} \xi_i^{\rho}\,\langle \sigma_i \rangle_T \right\rangle_{\eta,\xi}\,, \tag{10} \]
where ⟨...⟩_T means the thermal average. The free energy can be written as (see Appendix A):

\[ f = \frac{\alpha}{2} + \frac{1}{2}\sum_{\nu=1}^{s}(m^{\nu})^2 + \frac{\alpha\beta}{2}\,r\,(1-q) + \frac{\alpha}{2\beta}\left[\ln(1-\beta+\beta q) - \frac{\beta q}{1-\beta+\beta q}\right] - \frac{1}{\beta}\left\langle\!\left\langle \ln\left\{2\cosh\left[\beta\left(z\sqrt{\alpha r} + \sum_{\nu=1}^{s} m^{\nu}\xi^{\nu} + \kappa\eta\right)\right]\right\}\right\rangle_{\eta}\right\rangle_{z}\,, \tag{13} \]
where \( \langle \dots \rangle_z \equiv \int \frac{dz}{\sqrt{2\pi}}\, e^{-z^2/2}\, \langle \dots \rangle_{\xi^\rho} \) represents the average over the condensed pattern ξ^ρ and
over the Gaussian noise z. The equilibrium solution comes from the derivatives of the free energy
with respect to the parameters, leading to the saddle-point equations:
\[ m^{\rho} = \left\langle\!\left\langle \xi^{\rho} \tanh\left[\beta\left(z\sqrt{\alpha r} + m^{\rho}\xi^{\rho} + \kappa\eta\right)\right]\right\rangle_{\eta}\right\rangle_{z} \]
\[ q = \left\langle\!\left\langle \tanh^2\left[\beta\left(z\sqrt{\alpha r} + m^{\rho}\xi^{\rho} + \kappa\eta\right)\right]\right\rangle_{\eta}\right\rangle_{z} \]
\[ r = \frac{q}{(1-\beta+\beta q)^2}\,. \tag{14} \]
3.3 T = 0 solutions
In the limit T = 0, i.e. β → ∞, equations (14) yield (writing from now on m for m^ρ):

\[ m = \gamma\,\mathrm{erf}\!\left(\frac{m+\kappa}{\sqrt{2\alpha r}}\right) + (1-\gamma)\,\mathrm{erf}\!\left(\frac{m-\kappa}{\sqrt{2\alpha r}}\right)\,, \tag{15} \]

where r = (1 − C)^{-2}, erf(x) is the error function, and

\[ C = \sqrt{\frac{2}{\pi\alpha r}}\left\{\gamma\exp\!\left[-\frac{(m+\kappa)^2}{2\alpha r}\right] + (1-\gamma)\exp\!\left[-\frac{(m-\kappa)^2}{2\alpha r}\right]\right\}\,. \tag{16} \]
With the definitions \( m = y\sqrt{2\alpha r} \) and \( \kappa = x\sqrt{2\alpha r} \), equations (15) and (16) can finally be rewritten as:

\[ y = \frac{\gamma\,\mathrm{erf}(y+x) + (1-\gamma)\,\mathrm{erf}(y-x)}{\sqrt{2\alpha} + \frac{2}{\sqrt{\pi}}\left[\gamma\, e^{-(y+x)^2} + (1-\gamma)\, e^{-(y-x)^2}\right]}\,, \]
\[ m = \gamma\,\mathrm{erf}(y+x) + (1-\gamma)\,\mathrm{erf}(y-x)\,. \tag{17} \]
When γ = 1 we recover the equations corresponding to an external stimulus exactly equal
to the stored pattern ξ^ρ,

\[ y = \frac{\mathrm{erf}(y+x)}{\sqrt{2\alpha} + \frac{2}{\sqrt{\pi}}\, e^{-(y+x)^2}}\,, \qquad m = \mathrm{erf}(y+x)\,, \tag{18} \]
and when y + x → x we get the equations corresponding to an external stimulus that is not
related to any stored pattern,

\[ y = \frac{\mathrm{erf}(x)}{\sqrt{2\alpha} + \frac{2}{\sqrt{\pi}}\, e^{-x^2}}\,, \qquad m = \mathrm{erf}(x)\,. \tag{19} \]
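Equations (17)-(19) can be solved numerically by simple fixed-point iteration. A minimal sketch follows; the values of α and x used below are arbitrary test choices, not values from the figures.

```python
import math

def solve_y(alpha, x, gamma=1.0, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for y in eq. (17); gamma = 1.0 reduces to eq. (18)."""
    y = 1.0
    for _ in range(max_iter):
        num = gamma * math.erf(y + x) + (1 - gamma) * math.erf(y - x)
        den = math.sqrt(2 * alpha) + (2 / math.sqrt(math.pi)) * (
            gamma * math.exp(-(y + x) ** 2) + (1 - gamma) * math.exp(-(y - x) ** 2)
        )
        y_new = num / den
        if abs(y_new - y) < tol:
            break
        y = y_new
    return y

alpha, x = 0.5, 0.7                      # alpha well above the Hopfield limit 0.138
y = solve_y(alpha, x)
m_stored = math.erf(y + x)               # eq. (18): stimulus equal to a stored pattern
m_unrelated = math.erf(x)                # eq. (19): stimulus unrelated to any pattern
print(m_stored, m_unrelated)
```

The gap between m_stored and m_unrelated at a given x is the analytical counterpart of the triangle curves in the figures below.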
3.4 Numerical simulation
One part of the good performance of this new approach can be seen in figures (1-3). Figure (1)
shows numerical simulations of the macroscopic superposition m^ρ with the memory {ξ^ρ} as a
function of κ, for γ = 1, i.e., the external pattern is exactly the stored pattern. This simulation was
done for N =??? neurons and p = αN stored memories. The parameter α takes the values 0.5, 0.7, 0.9
and 1.0 in graphs (a), (b), (c) and (d), all values above (or well above) the critical value α_c ≃ 0.138,
above which the standard Hopfield neural network does not recognize any pattern. We show
the curves for m^ρ and for the macroscopic superposition with an external pattern that is
not stored and is orthogonal to all the stored patterns. The difference between these two curves,
represented by the curve with triangles, indicates the best value of κ in each case. For this value
we can easily recognize the stored pattern associated with the external pattern, with macroscopic
superposition almost equal to one; even for α = 1 we find m^ρ ≃ 0.9. Recognition works, in fact,
even for values of α much higher than 1. The value of κ at the maximum of the
difference between the two curves is the optimal one: it is strong enough to allow recognition of
the stored pattern for α > 0.138, but not so strong that the external signal
dominates the other two terms. This subtle balance is the heart of the approach. The
same can be seen in figures (2), where γ = 0.9, and (3), where γ = 0.74. In these two figures we
consider an external pattern that is not exactly equal to the stored pattern but differs in a fraction
1 − γ of the bits. In figure (2) we see that the macroscopic superposition
m^ρ decreases, which is expected since the external pattern differs somewhat from the stored
pattern. But even for α = 1, m^ρ ≃ 0.7, which is enough for recognition of the pattern, since
the macroscopic superposition with the other patterns is practically zero. The last figure
(3) shows the limit to which we can disturb the stored pattern while still allowing recognition:
there is practically no maximum left in the difference between the two curves, and we essentially cannot
separate a stored pattern from a new pattern that is not stored and is orthogonal to the other stored patterns.
Therefore, in this approach we can deform the stored pattern by up to 25% and the system will still
recognize it, even for values of α greater than α_c, which is an impressive performance.
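A scan of m^ρ versus κ like the one behind these figures can be reproduced with a short simulation. The sketch below is only a sketch under assumptions: the figures' N is not stated, so the size, sweep count, and the value κ = 0.85 used here are illustrative choices. It compares κ = 0 with a calibrated κ at α = 0.5, well above α_c.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 400
alpha = 0.5
p = int(alpha * N)                        # load well above alpha_c ~ 0.138

xi = rng.choice([-1, 1], size=(p, N))
J = (xi.T @ xi).astype(float) / N         # Hebbian couplings, eq. (3)
np.fill_diagonal(J, 0.0)

def overlap_after(kappa, sweeps=30):
    """Relax from a random state under stimulus eta = xi[0]; return m^rho."""
    sigma = rng.choice([-1, 1], size=N)
    for _ in range(sweeps):
        sigma = np.where(J @ sigma + kappa * xi[0] >= 0, 1, -1)
    return (sigma @ xi[0]) / N

m_no_stimulus = overlap_after(0.0)        # standard Hopfield above capacity: fails
m_calibrated = overlap_after(0.85)        # kappa near the optimum for alpha = 0.5
print(m_no_stimulus, m_calibrated)
```

Sweeping a grid of κ values with `overlap_after`, together with a second run against a non-stored orthogonal pattern, reproduces the curve-difference construction used in the figures.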
[Figure 1, panels (a)-(d): m versus κ, with the maxima of the curve difference at κ_c = 0.70, 0.85, 0.90 and 0.95, respectively.]
Figure 1: Macroscopic superposition m^ρ for γ = 1. (a) α = 0.5; (b) α = 0.7; (c) α = 0.9; (d) α = 1.0.
The curve with blue dots is the macroscopic superposition with the memory ρ, m^ρ; the curve with
green dots is the macroscopic superposition with an external pattern that is not stored and is
orthogonal to all the stored patterns; and the curve with triangles is the difference between the
macroscopic recognition of the stored pattern ρ and of the non-stored pattern.
[Figure 2, panels (a)-(d): m versus κ, with κ_c = 0.70, 0.85, 0.90 and 0.95, respectively.]
Figure 2: Macroscopic superposition m^ρ for γ = 0.9. (a) α = 0.5; (b) α = 0.7; (c) α = 0.9; (d) α = 1.0.
The curve with blue dots is the macroscopic superposition with the memory ρ, m^ρ; the curve with
green dots is the macroscopic superposition with an external pattern that is not stored and is
orthogonal to all the stored patterns; and the curve with triangles is the difference between the
macroscopic recognition of the stored pattern ρ and of the non-stored pattern.
[Figure 3, panels (a)-(d): m versus κ; only panel (a) shows a marked value, κ_c = 0.50.]
Figure 3: Macroscopic superposition m^ρ for γ = 0.74. (a) α = 0.5; (b) α = 0.7; (c) α = 0.9; (d) α = 1.0.
The curve with blue dots is the macroscopic superposition with the memory ρ, m^ρ; the curve with
green dots is the macroscopic superposition with an external pattern that is not stored and is
orthogonal to all the stored patterns; and the curve with triangles is the difference between the
macroscopic recognition of the stored pattern ρ and of the non-stored pattern.
[Figure 4, panels (a)-(d): m versus κ, comparing the simulation curves (m_θ, m_⊥, ∆m) with the corresponding theoretical curves (m_θ^T, m_⊥^T, ∆m^T).]
In figure (??) we can see how the neural network reacts when we change the external pattern.
We plot the macroscopic superposition m as a function of time. Up to t = 50 no external
pattern is presented, and no memory is recognized.
7 Conclusions
[Figure 5, panels (a)-(d): m versus time (×10³ steps); after the stimulus onset the panels use κ = 0.2, 0.85, 1.5 and 2.5, respectively, with κ = 0 before.]
Figure 5: Time evolution of the macroscopic superposition m as the external pattern is changed.
A Mean-Field Equation
When α = p/N is finite, the calculation follows the steps of reference (3), and the quenched average over the
ξ's is performed using the replica method. The free energy is thus calculated from the average:

\[ \langle Z^n \rangle_{\eta,\xi} = \operatorname{Tr}_{S^a} \left\langle \exp\left[-\beta \sum_a \left(-\frac{1}{N}\sum_{\substack{i,j \\ (j>i)}} \sum_{\mu=1}^{p} \xi_i^{\mu}\,\xi_j^{\mu}\, S_i^a S_j^a - \kappa \sum_i \eta_i S_i^a \right)\right]\right\rangle_{\eta,\xi}\,, \tag{22} \]

where a labels the n fictitious replicas and Tr_{S^a} denotes the trace over the spins in each of the n replicas.
Decoupling the term quadratic in ξ with the help of the Gaussian transformation

\[ \exp\left(\lambda a^2\right) = \frac{1}{\sqrt{2\pi}}\int dx\, \exp\left[-x^2/2 + \sqrt{2\lambda}\,a\,x\right]\,, \tag{23} \]

and averaging over the high ξ's (μ > s), we find:

\[ \langle Z^n \rangle_{\xi} = e^{-\beta p n/2}\operatorname{Tr}_{S^a} \int \prod_{\substack{a,\mu \\ (\mu>s)}} \frac{dm_a^{\mu}}{\sqrt{2\pi}}\, \exp\left[-\sum_{\substack{\mu,a \\ (\mu>s)}} \frac{(m_a^{\mu})^2}{2} + \sum_{\substack{i,\mu \\ (\mu>s)}} \ln\cosh\left(\sqrt{\frac{\beta}{N}}\sum_a m_a^{\mu} S_i^a\right)\right] \times \left\langle \exp\left[-\sum_{\nu,a}\frac{(m_a^{\nu})^2}{2} + \sqrt{\frac{\beta}{N}}\sum_{\nu,a} m_a^{\nu} \sum_i \xi_i^{\nu} S_i^a + \beta\kappa \sum_{i,a} \eta_i S_i^a \right]\right\rangle_{\eta,\xi^{\nu}}\,, \tag{24} \]

\[ \langle Z^n \rangle = e^{-\beta p n/2}\operatorname{Tr}_{S^a} \exp\left\{-\frac{1}{2}\,p\,\operatorname{Tr}\ln\left[(1-\beta)\,I - \beta Q\right]\right\} \times \left\langle \int \prod_{\nu,a} \frac{dm_a^{\nu}}{\sqrt{2\pi}}\, \exp\left[\beta N\left(-\frac{1}{2}\sum_{\nu,a}(m_a^{\nu})^2 + \frac{1}{N}\sum_{i,\nu,a} m_a^{\nu}\,\xi_i^{\nu} S_i^a + \frac{\kappa}{N}\sum_{i,a}\eta_i S_i^a\right)\right]\right\rangle_{\eta,\xi^{\nu}}\,, \tag{25} \]

where, as in (3), \( q_{ab} = \frac{1}{N}\sum_i S_i^a S_i^b - \delta_{ab} \). Introducing r_{ab} as Lagrange multipliers
for the off-diagonal elements of Q, q_{ab}, we find for the free energy

\[ f = \frac{\alpha}{2} + \frac{1}{2n}\sum_{\nu,a}(m_a^{\nu})^2 + \frac{\alpha}{2\beta n}\operatorname{Tr}\ln\left[(1-\beta)\,I - \beta Q\right] + \frac{\alpha\beta}{2n}\sum_{\substack{a,b \\ (a\neq b)}} r_{ab}\,q_{ab} - \frac{1}{\beta n}\left\langle \ln \operatorname{Tr}_{S}\exp\left(\frac{\alpha\beta^2}{2}\sum_{a\neq b} r_{ab}\,S^a S^b + \beta\sum_{\nu,a} m_a^{\nu}\,\xi^{\nu} S^a + \beta\kappa\sum_a \eta\, S^a\right)\right\rangle_{\eta,\xi^{\nu}}\,. \tag{26} \]

Assuming replica symmetry and letting n go to zero in (26), we obtain the following expression for
the free energy:

\[ f = \frac{\alpha}{2} + \frac{\alpha\beta}{2}\,r\,(1-q) + \frac{1}{2}\sum_{\nu}(m^{\nu})^2 + \frac{\alpha}{2\beta}\left[\ln(1-\beta+\beta q) - \frac{\beta q}{1-\beta+\beta q}\right] - \frac{1}{\beta}\left\langle\!\left\langle \ln 2\cosh\left[\beta\left(z\sqrt{\alpha r} + \sum_{\nu} m^{\nu}\xi^{\nu} + \kappa\eta\right)\right]\right\rangle_{\eta}\right\rangle_{z}\,, \tag{27} \]

where ⟨...⟩_z indicates a Gaussian average over z as well as an average over the discrete ξ, with
the ξ's distributed according to (5).
References
[1] C. Darwin, On the Origin of Species (1859).