8 Recurrent Networks Rooted in Statistical Physics

8.1 Introduction

The multilayer perceptron and the radial-basis function network considered in the previous two chapters represent important examples of a class of neural networks known as nonlinear layered feedforward networks. In this chapter we consider another important class of neural networks that have a recurrent structure, and the development of which is inspired by different ideas from statistical physics. In particular, they share the following distinctive features:

- Nonlinear computing units
- Symmetric synaptic connections
- Abundant use of feedback

All the characteristics described herein are exemplified by the Hopfield network, the Boltzmann machine, and the mean-field-theory machine.

The Hopfield network is a recurrent network that embodies a profound physical principle, namely, that of storing information in a dynamically stable configuration. Prior to the publication of Hopfield's influential paper in 1982, this approach to the design of a neural network had occupied the attention of several investigators, among them Grossberg (1967, 1968), Amari (1972), Little (1974), and Cowan (1968); the work of some of these pioneers predated that of Hopfield by more than a decade. Nevertheless, it was in Hopfield's 1982 paper that the physical principle of storing information in a dynamically stable network was formulated in precise terms for the first time. Hopfield's idea of locating each pattern to be stored at the bottom of a "valley" of an energy landscape, and then permitting a dynamical procedure to minimize the energy of the network in such a way that the valley becomes a basin of attraction, is novel indeed!

The standard discrete-time version of the Hopfield network uses the McCulloch-Pitts model for its neurons. Retrieval of information stored in the network is accomplished via a dynamical procedure of updating the state of a neuron selected from among those that want to change, with that particular neuron being picked randomly and one at a time. This asynchronous dynamical procedure is repeated until there are no further state changes to report. In a more elaborate version of the Hopfield network, the firing mechanism of the neurons (i.e., switching them on or off) follows a probabilistic law. In such a situation, we refer to the neurons as stochastic neurons. The use of stochastic neurons permits us to develop further insight into the statistical characterization of the Hopfield network by linking its behavior with the well-established subject of statistical physics.
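The following minimal Python sketch (not part of the original text) illustrates the asynchronous retrieval procedure just described; the Hebbian outer-product weight construction and the bipolar ±1 coding anticipate the formal treatment in Section 8.3 and are assumptions at this point.

```python
import numpy as np

def store(patterns):
    """Hebbian (outer-product) storage of bipolar patterns; an assumed
    construction that anticipates Section 8.3."""
    P = np.asarray(patterns, dtype=float)
    N = P.shape[1]
    W = P.T @ P / N          # w_ji = (1/N) * sum over patterns of x_j * x_i
    np.fill_diagonal(W, 0.0) # no self-connections
    return W

def recall(W, x, rng, max_sweeps=100):
    """Asynchronous retrieval: visit neurons one at a time in random
    order and align each with the sign of its local field; stop when a
    full sweep produces no state change."""
    x = np.array(x, dtype=float)
    for _ in range(max_sweeps):
        changed = False
        for j in rng.permutation(len(x)):
            new = 1.0 if W[j] @ x >= 0.0 else -1.0  # McCulloch-Pitts unit
            if new != x[j]:
                x[j], changed = new, True
        if not changed:      # no neuron wants to change: a stable state
            break
    return x

rng = np.random.default_rng(0)
pattern = rng.choice([-1.0, 1.0], size=16)
W = store([pattern])
noisy = pattern * np.where(rng.random(16) < 0.2, -1.0, 1.0)  # flip ~20% of bits
print(np.array_equal(recall(W, noisy, rng), pattern))        # expect True
```

Because each accepted flip aligns a neuron with its local field under symmetric weights, every update can only lower the network's energy, which is why the procedure terminates in a stable state.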
The Boltzmann machine represents a generalization of the Hopfield network (Hinton and Sejnowski, 1983, 1986; Ackley et al., 1985). It combines the use of symmetric synaptic connections (a distinctive feature of the Hopfield network) with the use of hidden neurons (a distinctive feature of multilayer feedforward networks). For its operation, the Boltzmann machine relies on a stochastic concept rooted in statistical thermodynamics that is known as simulated annealing (Kirkpatrick et al., 1983). The Boltzmann machine was so named by Hinton and Sejnowski in honor of Boltzmann. The general discipline of statistical thermodynamics grew out of the work of Boltzmann who, in 1872, made the discovery that the random motion of the molecules of a gas has an energy related to temperature.
The mean-field-theory (MFT) machine is derived from the Boltzmann machine by invoking a "naive" approximation known as the mean-field approximation (Peterson and Hartman, 1989; Peterson and Anderson, 1987). According to this approximation, the stochastic binary units of the Boltzmann machine are replaced by deterministic analog units, as illustrated in the sketch below. The motivation for the mean-field approximation is to circumvent the excessive computer time required for the implementation of the Boltzmann machine.
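To make this replacement concrete, the following sketch (illustrative Python, not from the text) contrasts a stochastic binary unit with its deterministic mean-field counterpart; the sigmoidal firing probability and the pseudo-temperature T anticipate Sections 8.8 and 8.13 and are assumptions here.

```python
import numpy as np

def stochastic_unit(field, T, rng):
    """Stochastic binary unit: fires +1 with the sigmoidal probability
    P = 1/(1 + exp(-2*field/T)) and -1 otherwise (anticipating the
    stochastic neurons of Section 8.8)."""
    p = 1.0 / (1.0 + np.exp(-2.0 * field / T))
    return 1.0 if rng.random() < p else -1.0

def mean_field_unit(field, T):
    """Mean-field replacement: the deterministic analog value
    <x> = tanh(field/T), i.e., the average of the stochastic unit."""
    return np.tanh(field / T)

rng = np.random.default_rng(1)
field, T = 0.8, 1.0
samples = [stochastic_unit(field, T, rng) for _ in range(100_000)]
print(np.mean(samples))           # fluctuates around 0.664
print(mean_field_unit(field, T))  # tanh(0.8) = 0.664...
```

Averaged over many trials the stochastic unit reproduces the deterministic tanh value; the mean-field machine simply computes that average directly, trading Monte Carlo sampling for a single deterministic evaluation.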
The Hopfield network operates in an unsupervised manner. As such, it may be used as a content-addressable memory or as a computer for solving optimization problems of a combinatorial kind. In a combinatorial optimization problem we have a discrete system with a large but finite number of possible solutions; the requirement is to find the solution that minimizes a cost function providing a measure of system performance. The Boltzmann machine and its derivative, the mean-field-theory machine, on the other hand, may require supervision by virtue of using input and output units.

The Hopfield network, the Boltzmann machine, and the mean-field-theory machine require time to settle to an equilibrium condition; they may therefore be excessively slow, unless special-purpose chips or hardware are used for their implementation. Moreover, they are relaxation networks with a local learning rule. Above all, however, they are all rooted in statistical physics.
Organization of the Chapter
The main body of this chapter is organized as follows. In Section 8.2 we present an overview of the dynamics of the class of recurrent networks considered here. In Section 8.3 we describe the Hopfield network, which uses the formal neuron of McCulloch and Pitts (1943) as its processing unit. The convergence properties of the Hopfield network are given particular attention here. This is followed by a computer experiment illustrating the behavior of the Hopfield network in Section 8.4. Then, in Section 8.5, we discuss the energy function of the Hopfield network and the related issue of spurious states. In Section 8.6 we present a probabilistic treatment of associative recall in a Hopfield network. The material covered in this latter section establishes a fundamental limit on the storage capacity of the Hopfield network as an associative memory for correlated patterns. In Section 8.7 we discuss the "isomorphism" between the Hopfield network and the spin-glass model that is rooted in statistical mechanics. This is followed by a description of stochastic neurons in Section 8.8, and then a qualitative discussion of the phase diagram of a stochastic Hopfield network in Section 8.9. The phase diagram delineates the lines across which the network changes its computational behavior.

In Section 8.10 we describe the stochastic simulated annealing algorithm. This material paves the way for a detailed description of the Boltzmann machine in Section 8.11 from a statistical physics perspective. In Section 8.12 we view the Boltzmann machine as a Markov chain model. Next, we describe the mean-field-approximation theory in Section 8.13. In Section 8.14 we describe a computer experiment comparing the Boltzmann and mean-field-theory machines. The chapter concludes with some general discussion in Section 8.15.
8.2 Dynamical Considerations
Consider a recurrent network (i.e., a neural network with feedback) made up of N neurons with symmetric coupling described by w_ji = w_ij, where w_ji is the synaptic weight connecting neuron i to neuron j. The symmetry of the synaptic connections results in a powerful theorem about the behavior of the network, as discussed here. Let u_j(t) denote the activation potential acting on neuron j, and let x_j(t) denote the corresponding value of the neuron's output. These two variables are related by

x_j = \varphi_j(u_j) \qquad (8.1)

where φ_j(·) is the sigmoidal nonlinearity of neuron j. Both u_j and x_j are functions of the continuous-time variable t. The state of neuron j may be described in terms of the activation potential u_j(t) or, equivalently, the output signal x_j(t). In the former case, the dynamics of the recurrent network is described by a set of coupled nonlinear differential equations as follows (Hopfield, 1984a; Cohen and Grossberg, 1983):
C_j \frac{du_j(t)}{dt} = \sum_{\substack{i=1 \\ i \neq j}}^{N} w_{ji} x_i(t) - \frac{u_j(t)}{R_j} + \theta_j \qquad (8.2)
where θ_j is a threshold applied to neuron j from an external source. The finite rate of change of the activation potential u_j(t) with respect to time t is due to the capacitive effects C_j associated with neuron j, which are an intrinsic property of biological neurons or of the physical implementation of artificial neurons. According to Eq. (8.2), three factors contribute to the rate of change du_j/dt:
1. Postsynaptic effects induced in neuron j due to the presynaptic activities of neurons i = 1, 2, ..., N, excluding i = j
2. Leakage due to the finite input resistance R_j of the nonlinear element of neuron j
3. The threshold θ_j
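The following minimal sketch (illustrative Python, not from the original text) integrates Eq. (8.2) numerically; the tanh choice for φ_j, the unit values for C_j and R_j, the zero thresholds, the random symmetric weights, and the forward-Euler scheme are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8
W = rng.normal(size=(N, N))
W = (W + W.T) / 2.0           # symmetric coupling: w_ji = w_ij
np.fill_diagonal(W, 0.0)      # the sum in Eq. (8.2) excludes i = j
C = np.ones(N)                # capacitive effects C_j (illustrative)
R = np.ones(N)                # input resistances R_j (illustrative)
theta = np.zeros(N)           # external thresholds theta_j

u = rng.normal(scale=0.1, size=N)  # initial activation potentials
dt = 0.01                          # forward-Euler step size
for _ in range(5000):
    x = np.tanh(u)                      # Eq. (8.1) with phi_j = tanh
    du = (W @ x - u / R + theta) / C    # right-hand side of Eq. (8.2)
    u = u + dt * du
print(np.tanh(u))   # network state after settling
```

With symmetric weights and a monotonic nonlinearity, the trajectory settles to a fixed point, in line with the energy argument developed next.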
For the recurrent network with symmetric coupling as described here, we may define an energy function or Liapunov function as follows (Hopfield, 1984a):

E = -\frac{1}{2} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} w_{ji} x_i x_j + \sum_{j=1}^{N} \frac{1}{R_j} \int_0^{x_j} \varphi_j^{-1}(x)\,dx - \sum_{j=1}^{N} \theta_j x_j \qquad (8.3)
where x_j is the output of neuron j, related to the activation potential u_j by Eq. (8.1). The energy function of Eq. (8.3) is a special case of a theorem due to Cohen and Grossberg (1983), which is considered in Chapter 14, devoted to neurodynamics. The importance of the energy function E is that it provides the basis for a deep understanding of how specific problems may be solved by recurrent networks. For now, it suffices to note that the energy function E is fully descriptive of the recurrent network under study in that it includes all the synaptic weights and all the state variables of the network, and that we may state the following theorem for the case when the threshold θ_j changes slowly over the time of computation (Hopfield, 1984a; Cohen and Grossberg, 1983):

The energy function E is a monotonically decreasing function of the network state {x_j | j = 1, 2, ..., N}.

When the network is started in any initial state, it will move in a downhill direction of the energy function E until it reaches a local minimum; at that point, it stops changing.
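This descent can be checked numerically. The sketch below (illustrative Python, not from the text) evaluates Eq. (8.3) along a simulated trajectory of Eq. (8.2) and confirms that E never increases; the tanh nonlinearity, parameter values, and forward-Euler discretization are the same assumptions as in the earlier sketch.

```python
import numpy as np

def energy(x, W, R, theta):
    """Energy function of Eq. (8.3) for phi_j = tanh, using
    phi^{-1}(x) = arctanh(x) and the closed form
    integral_0^x arctanh(s) ds = x*arctanh(x) + 0.5*log(1 - x**2)."""
    leak = np.sum((x * np.arctanh(x) + 0.5 * np.log(1.0 - x**2)) / R)
    return -0.5 * x @ W @ x + leak - theta @ x

rng = np.random.default_rng(3)
N = 8
W = rng.normal(size=(N, N))
W = (W + W.T) / 2.0           # symmetric coupling
np.fill_diagonal(W, 0.0)
C, R, theta, dt = np.ones(N), np.ones(N), np.zeros(N), 0.01

u = rng.normal(scale=0.1, size=N)
energies = []
for _ in range(5000):
    x = np.tanh(u)
    energies.append(energy(x, W, R, theta))
    u = u + dt * (W @ x - u / R + theta) / C   # Eq. (8.2) dynamics
# E should decrease monotonically, up to discretization error:
print(all(b <= a + 1e-9 for a, b in zip(energies, energies[1:])))
```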
