8 / Recurrent Networks Rooted in Statistical Physics
connections (a distinctive feature of the Hopfield network) with the use of hidden neurons (a distinctive feature of multilayer feedforward networks). For its operation, the Boltzmann machine relies on a stochastic concept rooted in statistical thermodynamics that is known as simulated annealing (Kirkpatrick et al., 1983). The Boltzmann machine was named by Hinton and Sejnowski in honor of Boltzmann. The general discipline of statistical thermodynamics grew out of the work of Boltzmann who, in 1872, made the discovery that the random motion of the molecules of a gas has an energy related to temperature.
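Simulated annealing is treated in detail later in the chapter (Section 8.10). As a rough illustrative sketch of the idea only, the loop below accepts energy-increasing moves with probability exp(-dE/T) while the temperature T is slowly lowered; the energy function, neighbor rule, and cooling schedule here are arbitrary choices for illustration, not anything prescribed by the text:

```python
import math
import random

def simulated_annealing(energy, neighbor, x0, t0=10.0, cooling=0.95, steps=1000):
    """Minimize `energy`, accepting uphill moves with probability exp(-dE/T)."""
    x, t = x0, t0
    for _ in range(steps):
        y = neighbor(x)
        d_e = energy(y) - energy(x)
        # Downhill moves are always accepted; uphill moves only stochastically,
        # with a probability that shrinks as the temperature t is lowered.
        if d_e <= 0 or random.random() < math.exp(-d_e / t):
            x = y
        t *= cooling
    return x

# Toy usage: minimize a one-dimensional quadratic over the integers.
random.seed(0)
best = simulated_annealing(lambda x: (x - 3) ** 2,
                           lambda x: x + random.choice([-1, 1]),
                           x0=20)
```

At high temperature the walk explores freely, including uphill moves; as T falls, the acceptance rule becomes effectively greedy, which is the mechanism that lets the method escape shallow local minima early on.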
The mean-field theory (MFT) machine is derived from the Boltzmann machine by invoking a "naive" approximation known as the mean-field approximation (Peterson and Hartman, 1989; Peterson and Anderson, 1987). According to this approximation, the stochastic binary units of the Boltzmann machine are replaced by deterministic analog units. The motivation for the mean-field approximation is to circumvent the excessive computer time required for the implementation of the Boltzmann machine.

The Hopfield network operates in an unsupervised manner.
As such, it may be used as a content-addressable memory or as a computer for solving optimization problems of a combinatorial kind. In a combinatorial optimization problem we have a discrete system with a large but finite number of possible solutions; the requirement is to find the solution that minimizes a cost function providing a measure of system performance. The Boltzmann machine and its derivative, the mean-field theory machine, on the other hand, may require supervision by virtue of using input and output units.

The Hopfield network, the Boltzmann machine, and the mean-field theory machine require time to settle to an equilibrium condition; they may therefore be excessively slow, unless special-purpose chips or hardware are used for their implementation. Moreover, they are relaxation networks with a local learning rule. Above all, however, they are all rooted in statistical physics.
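The content-addressable-memory role mentioned above can be sketched in a few lines, using Hebbian (outer-product) storage and asynchronous updates; the 8-unit pattern and helper names below are illustrative choices, not the book's notation:

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian (outer-product) weights for +1/-1 patterns; zero self-connections."""
    n = patterns.shape[1]
    w = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(w, 0.0)
    return w / n

def recall(w, x, sweeps=10):
    """Asynchronous updates: each unit aligns with the sign of its local field."""
    x = x.copy()
    for _ in range(sweeps):
        for i in range(len(x)):
            x[i] = 1 if w[i] @ x >= 0 else -1
    return x

# Store one 8-bit pattern, then recover it from a corrupted probe.
stored = np.array([1, -1, 1, 1, -1, -1, 1, -1])
w = train_hopfield(stored[None, :])
probe = stored.copy()
probe[0] *= -1
probe[3] *= -1          # flip two bits to corrupt the probe
restored = recall(w, probe)
```

With symmetric weights and zero self-connections, each asynchronous update can only lower the network's energy, so the corrupted probe relaxes to the stored pattern at a nearby energy minimum; this settling process is also why such networks can be slow without special-purpose hardware.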
Organization of the Chapter
The main body of this chapter is organized as follows. In Section 8.2 we present an overview of the dynamics of the class of recurrent networks considered here. In Section 8.3 we describe the Hopfield network, which uses the formal neuron of McCulloch and Pitts (1943) as its processing unit. The convergence properties of the Hopfield network are given particular attention here. This is followed by a computer experiment illustrating the behavior of the Hopfield network in Section 8.4. Then, in Section 8.5, we discuss the energy function of the Hopfield network and the related issue of spurious states. In Section 8.6 we present a probabilistic treatment of associative recall in a Hopfield network. The material covered in this latter section establishes a fundamental limit on the storage capacity of the Hopfield network as an associative memory for correlated patterns. In Section 8.7 we discuss the "isomorphism" between the Hopfield network and the spin-glass model that is rooted in statistical mechanics. This is followed by a description of stochastic neurons in Section 8.8, and then a qualitative discussion of the phase diagram of a stochastic Hopfield network in Section 8.9. The phase diagram delineates the lines across which the network changes its computational behavior.

In Section 8.10 we describe the stochastic simulated annealing algorithm. This material paves the way for a detailed description of the Boltzmann machine in Section 8.11 from a statistical physics perspective. In Section 8.12 we view the Boltzmann machine as a Markov chain model. Next, we describe the mean-field approximation theory in Section 8.13. In Section 8.14 we describe a computer experiment comparing the Boltzmann and mean-field theory machines. The chapter concludes with some general discussion in Section 8.15.