Intern. J. Computer Math., Vol. 42, pp. 21-31

© 1992 Gordon and Breach Science Publishers S.A. Printed in the United Kingdom

THE DEMON ALGORITHM


THEO ZIMMERMANN and PETER SALAMON
Department of Mathematical Sciences, San Diego State University, San Diego, CA 92182, U.S.A.
(Received 1 May 1991)

This paper introduces a generalization of the simulated annealing algorithm for global optimization. Simulated annealing has been successfully applied to a number of combinatorial and continuous optimization problems. The original approach has been significantly improved by introducing adaptive annealing schedules and annealing several copies of the problem in parallel. In this paper we take a further step and propose a generalized simulated annealing algorithm called the Demon Algorithm. This algorithm is constructed in analogy to the action of Maxwell's Demon and has been motivated by an information-theoretic analysis of simulated annealing. The algorithm is based on an ensemble of identical systems that are annealed in parallel. The ensemble evolves according to a sequence of target distributions with the aim to end up in a distribution that is concentrated on optimal solutions. The evolution of the ensemble is based on collective moves. The algorithm is implemented for the problem of graph bipartitioning and its performance is compared with conventional simulated annealing and a downhill search algorithm.

KEY WORDS: Global optimization, combinatorial algorithms, simulated annealing

C.R. CATEGORIES: G.1.6, G.2.1, G.3.

I INTRODUCTION

Simulated annealing (SA) is a stochastic search algorithm [1, 2]. It draws from an analogy to statistical mechanics and aims at finding global minima by stochastic relaxation under the control of a temperature parameter T. To this end the possible states or configurations ω of the system to be optimized are identified with the states of a statistical-mechanical system where the objective function plays the role of the energy. SA uses the Metropolis algorithm [3] to define a random walk through the state space Ω = {ω}. This assumes that Ω is equipped with a topology or neighborhood structure that tells which states ω' can be reached from a given state ω in one move. This topology is also called the move class [4]. It is usually not given a priori, and the choice of a move class on a given state space can have a significant impact on the performance of SA. The Metropolis algorithm defines the probabilities for accepting or rejecting an attempted move from a state ω with energy E(ω) to ω', where ω' has been chosen among the neighbors of ω.
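In its standard form this acceptance probability is

    P(\omega \to \omega') \;=\; \min\Bigl\{ 1,\ \exp\bigl( -[E(\omega') - E(\omega)]/T \bigr) \Bigr\}.    (1)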


For fixed T, Eq. (1) describes a stationary Markov process with an invariant distribution of states given by
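    \pi_T(\omega) \;=\; \frac{\exp(-E(\omega)/T)}{Z(T)},    (2)

where Z(T) = \sum_{\omega \in \Omega} \exp(-E(\omega)/T) is the normalizing partition function; this is the standard Boltzmann form.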


This is known as the Boltzmann distribution at temperature T and gives the probability of finding the equilibrated system in state ω. The associated distribution of energies is usually addressed by the same name; it is given by
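    p_T(E) \;=\; \frac{g(E)\,\exp(-E/T)}{Z(T)},    (3)

again with the normalization Z(T) of Eq. (2).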

The number of states g(E) at an energy E is also called the density of states. SA proceeds by making moves according to Eq. (1) and slowly lowering the temperature T. If the system is allowed to equilibrate at each temperature, it evolves through a sequence of Boltzmann distributions and will eventually "freeze" at T = 0. It can be shown that if the cooling is done slowly enough [5] the system will end up in a Boltzmann distribution at T = 0, i.e. in the lowest energy state(s). However, for practical purposes this cooling will take too long and one has to define feasible temperature schedules which have a high probability of ending up with a good, though not necessarily best, solution.

Once a suitable annealing schedule has been defined, an important further question becomes how to allocate given computing resources to arrive at the best possible results. In the simplest implementation all available time is spent on a single annealing run. It is typical for such a run that the majority of the time is spent at low temperature, where it takes very long to improve on the best energy that has been encountered so far. It turns out that for large computing resources it is better to split the effort into several independent runs [6]. This improves the chances of finding low energy states by better sampling of the solution space. A further important advantage of such an approach is that these runs can be done in parallel.

This finding also motivates a change in perspective. Instead of one random walker we now look at an ensemble of random walkers moving independently through the state space Ω. Assuming that they share a common temperature, this gives easy access to useful statistical information in the form of ensemble averages. As will be discussed below, the first two moments of the energy provide the means for a useful adaptive temperature schedule.
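As an illustration of this ensemble picture, a minimal Python sketch of one Metropolis sweep over the ensemble could look as follows; energy and neighbor stand for the problem-specific objective function and move class and are placeholders, not the implementation used in this paper.

import math
import random

def metropolis_sweep(ensemble, energy, neighbor, T):
    # One Metropolis step, Eq. (1), for every walker in the ensemble.
    for i, state in enumerate(ensemble):
        candidate = neighbor(state)              # propose a neighboring state
        dE = energy(candidate) - energy(state)
        if dE <= 0 or random.random() < math.exp(-dE / T):
            ensemble[i] = candidate              # accept; otherwise keep the old state
    return ensemble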


II THE DEMON ALGORITHM

The main subject of this paper is to further exploit the ensemble approach by introducing a generalization of SA that the approach makes possible. The generalization was motivated by an analysis of stochastic search algorithms in terms of information theory [7]. The basic question asked there is what is the minimum number of evaluations of the objective function needed to proceed from one state of the ensemble to the next one. The state of the ensemble is thereby given by a probability distribution that defines the likelihood of finding a certain state in the ensemble. It is assumed that the basic step of a stochastic search algorithm consists of replacing the current ensemble members by sampling from neighboring states with the aim of achieving a certain target distribution of the energies in the ensemble. Constraints are that the mean energy of the ensemble decrease and that the number of function evaluations to produce the next generation of states be minimal. These requirements actually suggest that the ensemble should proceed via a sequence of Boltzmann distributions with decreasing temperatures [7].

This general setting does not require the Metropolis algorithm in order to produce the desired Boltzmann distribution of energies and suggests more direct ways to achieve the same result. We relax the condition that a member of the ensemble can only be replaced by one of its own neighbors. We treat the ensemble as a population whose members can reproduce and die, so each member can be replaced by any other member's neighbor. To this end the algorithm creates a pool of candidate states. The states for the next generation are selected from this pool, which is formed by the current states and (some of) their neighbors. This defines what we call a collective move class, since the ensemble evolves as a whole. This method shares several features with genetic algorithms and similar optimization algorithms which mimic biological evolution [8].

The algorithm that we are proposing is as follows. Assume there is an ensemble with N members. A current state of the ensemble at step k is then given by a list of N states, L^k = (ω_1^k, ..., ω_N^k). The aim is to generate a target list L^{k+1} where the energies of the members follow a Boltzmann distribution at some target temperature T_{k+1}. Using the Metropolis algorithm, such a distribution is created as the limiting distribution of the simple Markov process, Eq. (1). In our new method, the corresponding expected number of states at each energy is determined first and candidate states are accepted and rejected accordingly. The calculation of these numbers, which we call target frequencies, is actually a nontrivial task and can be done using a run-time estimation of the local density of states, as discussed below.

In our implementation, the target list is filled in such a way that each state in L^k contributes one neighbor to an auxiliary list L'. This can be easily generalized such that each state in L^k contributes several neighbors. The neighbor list L' is then scanned (in a random way) and each state with an energy still needed to arrive at the target frequencies is moved to the target list L^{k+1}. Thus states at each energy are gathered until a sufficient number of them have been accepted to fill up the "quotas", after which states at that energy are rejected. If L' is exhausted but L^{k+1} is not complete, the original list L^k is scanned in the same way.

If L^{k+1} is still not complete, it is filled with randomly chosen states from L'. It is then decided using a χ²-test at some prespecified confidence level whether L^{k+1} is close enough to the target distribution. If no, we generate a new neighbor list L' and try again to reach L^{k+1}. If yes, we set a new target temperature and new target frequencies for L^{k+2} and the whole process is repeated.

This way of achieving a certain target distribution is reminiscent of the way Maxwell's Demon achieves one: the demon controls a door in a wall separating two gas containers. Initially both gases are at the same temperature. The demon then creates a temperature difference by only allowing fast molecules to go from the left to the right container and slow ones only from the right to the left. Because of this analogy we are calling our method the Demon Algorithm (DA). A schematic sketch of one DA step is given below.

Similar to traditional SA this algorithm proceeds through a sequence of Boltzmann distributions with decreasing temperature. The main difference between the two methods is that in SA the target distribution is achieved through a stochastic relaxation process where a number of random walkers move independently through the state space according to the Metropolis transition probabilities, Eq. (1). In the presence of energy barriers this relaxation can be very slow. The DA, on the other hand, is characterized by a population that evolves under a selection pressure defined by the target distribution, where walkers die and/or reproduce, removing the need for barrier climbing. The only effective barrier climbing that can take place involves transitions within a few energy standard deviations of the current mean energy. It also becomes possible for a state, together with one or several of its neighbors, to become part of the next list, whereas other states die out in the sense that none of their neighbors becomes part of successive lists. The DA is thus able to achieve a certain target distribution of energies much faster than SA. The question, however, is whether the quality of the solutions will be comparable. This problem will be addressed in Section V.
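The following Python sketch shows one such selection step under simplifying assumptions: target_counts maps each energy value to its target frequency n_i, and energy and neighbor are problem-specific placeholders. It illustrates the collective move class, not the authors' original implementation (the χ²-test and the padding rules are simplified).

import random
from collections import Counter

def demon_step(current, energy, neighbor, target_counts):
    # Build the auxiliary neighbor list L': one neighbor per current state.
    pool = [neighbor(s) for s in current]
    needed = Counter(target_counts)          # copies still required at each energy
    next_list = []
    # Scan L' first, then the current list L^k, both in random order.
    for source in (pool, current):
        for s in random.sample(source, len(source)):
            e = energy(s)
            if needed[e] > 0:                # this energy is still below its quota
                next_list.append(s)
                needed[e] -= 1
    # If the quotas could not be met, pad with random states from L'.
    while len(next_list) < len(current):
        next_list.append(random.choice(pool))
    return next_list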

III ADAPTIVE TEMPERATURE CONTROL AND ESTIMATION OF TARGET FREQUENCIES

To completely specify our optimization algorithms we have to define how the temperature is controlled during an annealing run. We are using a temperature schedule that has been proposed by Nulton and Salamon [9, 10] and can be motivated by the idea that two successive Boltzmann distributions should not be too far apart. This can be made more precise and leads to the requirement that the mean energy should always be lowered by a certain fraction v of the current energy standard deviation σ_E. For the target mean energy ⟨E⟩_{k+1} at temperature T_{k+1} we then have
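    \langle E \rangle_{k+1} \;=\; \langle E \rangle_k \;-\; v\,\sigma_E(T_k),    (4)

where ⟨E⟩_k and σ_E(T_k) denote the mean energy and energy standard deviation of the current (k-th) distribution.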

The constant v is called the thermodynamic speed and Eq. (4) defines a constant thermodynamic speed schedule. The new target temperature T_{k+1} follows implicitly


from Eq. (4) via
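    \langle E \rangle_{k+1} \;=\; \frac{\sum_i E_i\, g_i \exp(-E_i/T_{k+1})}{\sum_i g_i \exp(-E_i/T_{k+1})},    (5)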


where g_i is the density of states, i.e. the number of states with energy E_i. The basic problem in using this schedule is that the density of states g_i is usually not known a priori. However, it can be estimated locally during the execution of the algorithm [11]. The idea is to keep a matrix Q' of counters q'_{ji} which are incremented by 1 each time a state with energy E_i generates a neighbor with energy E_j. Normalizing Q' to unit column sums we arrive at estimates

    q_{ji} \;=\; \frac{q'_{ji}}{\sum_l q'_{li}}    (6)

for the infinite temperature Metropolis transition probability of going from energy E_i to E_j. The equilibrium distribution of the so defined Markov process is simply the density of states g_i and can be obtained as the eigenvector with eigenvalue 1 of the stochastic matrix Q. From the density of states also follow directly the target frequencies for the DA, i.e. the expected number n_i of states at energy E_i in the target list is

    n_i \;=\; N\, g_i\, \exp(-E_i/T_{k+1}) \,/\, Z(T_{k+1}),    (7)

where N = \sum_i n_i is the size of the ensemble.

At a certain stage of the algorithm only matrix elements q'_{ji} connecting energies in the vicinity of the current mean energy will be encountered. For energies below the lowest energy seen so far the corresponding counters will be zero. However, the Boltzmann distribution is typically concentrated around the current mean energy, so it is only in that region where the density of states is actually needed. For this reason we only consider a submatrix R' taken from Q' that is centered about the current mean energy and covers a few energy standard deviations. If R' covers an energy interval [E_m, E_n], the matrix elements q'_{ji} not included in R' with m ≤ i ≤ n and j < m or j > n have to be added to the diagonal element q'_{ii} in the same column in order to obtain an unbiased estimate of g. The matrix R' is then normalized to unit column sums. The eigenvector with eigenvalue 1 of the resulting stochastic matrix R is the current estimate of g and is used to set up the target frequencies for the DA and to calculate the next target temperature via Eqs. (4) and (5). In our implementation, a new estimate for g is calculated from the updated counter matrix Q' whenever a new target temperature has to be set or after a certain number of new lists has been generated.
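Under the assumption that the counter matrix is stored with columns indexed by the source energy and rows by the destination energy, this estimation step can be sketched in a few lines of numpy; the function and variable names (counts, energies, m, n) are illustrative rather than taken from the original implementation.

import numpy as np

def estimate_targets(counts, energies, m, n, T_next, N):
    # Restrict the counter matrix Q' to the energy window [E_m, E_n].
    R = counts[m:n + 1, m:n + 1].astype(float)
    # Fold transitions that leave the window back onto the diagonal of the
    # same column, so that column sums are preserved (unbiased estimate of g).
    leaked = counts[:, m:n + 1].sum(axis=0) - counts[m:n + 1, m:n + 1].sum(axis=0)
    R[np.diag_indices_from(R)] += leaked
    R /= R.sum(axis=0, keepdims=True)            # unit column sums -> stochastic matrix
    # Density-of-states estimate: eigenvector of R with eigenvalue 1.
    w, V = np.linalg.eig(R)
    g = np.abs(np.real(V[:, np.argmin(np.abs(w - 1.0))]))
    g /= g.sum()
    # Target frequencies n_i of Eq. (7) at the next target temperature.
    boltz = g * np.exp(-np.asarray(energies[m:n + 1]) / T_next)
    return g, N * boltz / boltz.sum()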



IV APPLICATION TO GRAPH BIPARTITIONING

A first test of the DA was performed on a caricature of a state space which consisted of a weighted graph defining transition probabilities between its vertices [12]. The main feature was a single energy barrier which separated an unfavorable local minimum with a large weight from the lesser weighted global minimum. Numerical results showed the DA to be an order of magnitude faster than SA. This is of course a very simplified situation; as a more realistic testing ground we implemented the DA for the problem of graph bipartitioning. In this problem the task is to divide the vertices of a given graph into two sets of equal size such that the number of edges crossing between these sets is minimal. This problem is NP-complete [13] and hence among the hardest combinatorial optimization problems. Graph bipartitioning is a well studied problem with interesting properties. For example, the random graph bipartitioning problem can be mapped onto a spin glass system and it is known that such systems have a complex state space structure with many local energy minima [14]. A random graph is defined by the number n of vertices and a connection probability p, i.e. each of the possible n(n - 1)/2 edges is present with probability p. SA has been found to give good results for bipartitioning such graphs [15] and we shall concentrate on this type of partitioning problem.

We implemented the DA as well as the (ensemble-) SA algorithm with a constant thermodynamic speed schedule as described in Section III. For a complete comparison we also implemented a random downhill search which simply generates a sequence of neighboring states whose energy never increases. This is actually equivalent to SA with a target temperature T = 0 and is called "quenching". The random graphs we used had n = 500 vertices and a connection probability of p = 0.004. This leads to a mean vertex degree (i.e. number of edges originating from a vertex) of 2. A state of the system to be annealed is now simply given by a partition, i.e. by specifying for each vertex to which one of the two sets it belongs. The move class is defined such that a neighboring partition is generated by randomly selecting one vertex from each set and exchanging them; a small sketch of this energy and move class is given below.

To evaluate the performance of the different algorithms we monitor the value of the very best so far energy (vbsfe), i.e., the best energy found so far during an optimization run. Figure 1 shows these values from representative runs of SA, DA and quenching for partitioning the same random graph. The Metropolis based SA used an ensemble size N = 100 and a thermodynamic speed of v = 0.1. The DA was run with an ensemble size N = 1000 and a speed v = 0.01. The quenching was performed by doing 1000 independent runs with different initial conditions in parallel. The figure also shows the results for a variant of the Demon Algorithm that will be introduced in Section V. The vbsfe values are plotted with respect to the number of energy evaluations. This does not reflect the actual computing time spent on the problem since quenching does not involve any significant overhead, whereas the adaptive temperature control for SA and DA can take up a significant fraction of the available computing resources. For graph partitioning, the calculation of the energy difference for a move is fairly cheap, but in general the evaluation of the energy function can be a costly step and usually justifies a large overhead provided this can reduce the number of energy evaluations.
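For concreteness, the cut-size energy and the vertex-exchange move class just described might look as follows in Python; edges is a list of vertex pairs and side assigns each vertex to set 0 or 1 (the names are illustrative, not taken from the original code).

import random

def cut_size(edges, side):
    # Energy of a partition: number of edges crossing between the two sets.
    return sum(1 for u, v in edges if side[u] != side[v])

def exchange_move(side):
    # Neighboring partition: swap one randomly chosen vertex from each set.
    u = random.choice([v for v in range(len(side)) if side[v] == 0])
    w = random.choice([v for v in range(len(side)) if side[v] == 1])
    new_side = list(side)
    new_side[u], new_side[w] = 1, 0
    return new_side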

Figure 1  Very best energy seen so far (vbsfe) with respect to the number of energy evaluations performed (horizontal axis: energy evaluations, in units of 10^6). The inset also gives the final vbsfe in parentheses. The data are only displayed up to the last improvement of vbsfe; the calculations have been continued substantially longer.

Figure 1 shows three different regimes in the behavior of the various optimization algorithms. For small computing resources quenching gives the best results. This is simply due to the fact that quenching does not make any uphill moves. For intermediate resources the DA becomes superior. However, the algorithm terminates with a final best energy value of vbsfe = 29. For very large resources therefore the SA algorithm performs best, with a final vbsfe = 19. The quenching algorithm ended with vbsfe = 26. All runs shown in Figure 1 were only stopped after there was no improvement of the vbsfe for a time span longer than at least 3 times the time of the last change of vbsfe.

A closer analysis of the behavior of the DA showed that in the course of the run eventually only states within a single local minimum in state space survive. To make this point more clear, note that the state space Ω together with its topology and the objective function E defines an energy hypersurface. For a continuous problem E: Ω → R one has Ω = R^n, while for our problem Ω is discrete. The energy surface can then be viewed as a hierarchy of basins that are formed by energy barriers of various heights [16]. A random walker moving on that surface at a certain mean energy is only restricted by energy barriers that are higher than that energy. For decreasing energy, the state space is divided into more and more different basins or components. In SA the Metropolis algorithm guarantees in the ideal case that the distribution of random walkers is uniform on sets of states with the same energy.


The state space is thus uniformly probed. For the DA, however, sampling fluctuation effects may prematurely remove all states from a basin that would actually lead to a good energy minimum. This effect will be addressed more closely below. There is also a second, more drastic effect. Namely, in our runs the DA typically ended such that a single state in a local minimum reproduced and the state list was rapidly filled with copies or neighbors of that state. This happened when at some low temperature the generation of a lower energy neighbor became a very unlikely event, but the probability for a neighbor of the same energy was considerably higher. For the sparse random graphs treated here, neighboring states with the same energy are actually quite likely since there is a considerable fraction of isolated vertices (i.e. vertices of degree zero). SA does not have this kind of problem since there are only individual moves and random walkers cannot die out or multiply.

The Metropolis algorithm in SA generates a Boltzmann distribution of states as opposed to a Boltzmann distribution of energies. In particular, the Boltzmann distribution over states is uniform over states with the same energy. This guarantees an unbiased probing of the state space. The DA, on the other hand, only enforces a Boltzmann distribution of energies and in this way the distribution of states can be biased due to fluctuation and selection effects. In a practical application where the Boltzmann distribution is only approximately realized, one can argue that SA works well because large basins attract many walkers and it is plausible that deep basins are associated with large rims. This conjecture is true for Ω = R^n and a smooth energy function [7]. It is also true for our graph partitioning problem. It has not been proven in a general context, however, but we believe it to be a basis for the success of SA. The same argument should also make it plausible that the DA should work well; however, one has to take into account the "counterproductive" effect of statistical sampling fluctuations.

To understand this effect we investigated a simple "neutral" model of the DA which neglects selective forces. Assume an ensemble of N walkers and a state space with v disconnected basins. At some stage of the algorithm, basin i contains m_i walkers. In a Demon step each walker generates k - 1 neighbors (each in the same basin) and we simply assume that the new list is generated by randomly selecting N walkers from the available set of kN walkers, i.e. current and neighboring states. The probability of ending up with m'_i walkers in basin i is then given by
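    P(m'_1, \ldots, m'_v) \;=\; \frac{\prod_{i=1}^{v} \binom{k\,m_i}{m'_i}}{\binom{kN}{N}},    (8)

i.e. the multivariate hypergeometric law for drawing N survivors without replacement from the kN available states, k m_i of which lie in basin i.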

where we have \sum_i m_i = \sum_i m'_i = N. Eq. (8) is a hypergeometric distribution and defines a Markov process. It is clear that as soon as a basin is empty (m_i = 0), it will remain empty and eventually all states will be within one basin. It is interesting to note that Eq. (8) is also known from population genetics and describes the effect of neutral evolution [17], i.e. evolutionary processes that are solely due to statistical fluctuations in finite populations in the absence of selection forces.
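This neutral drift is easy to simulate directly; the following toy Python sketch (not part of the original study) samples one step of the process defined by Eq. (8) and counts the steps until all walkers share a single basin.

import random

def neutral_step(m, k):
    # One step of Eq. (8): draw N survivors without replacement from the
    # k*N available copies, k*m_i of which belong to basin i.
    pool = [i for i, mi in enumerate(m) for _ in range(k * mi)]
    survivors = random.sample(pool, sum(m))
    return [survivors.count(i) for i in range(len(m))]

def steps_until_one_basin(m, k=2):
    # Number of Demon steps until only a single basin is populated.
    t = 0
    while sum(1 for mi in m if mi > 0) > 1:
        m = neutral_step(m, k)
        t += 1
    return t

For example, steps_until_one_basin([500, 500], k=2) gives a rough empirical counterpart of the mean emptying time discussed next.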


For the simple case of v = 2 basins one can calculate approximately the mean time predicted from Eq. (8) until a basin is emptied [18]. This time τ(x) depends on the initial relative population x = m_1/N and is given by

    \tau(x) \;=\; -\,\frac{k - 1/N}{k - 1}\, 2N\, \bigl( x \ln x + (1 - x) \ln(1 - x) \bigr).    (9)


It follows that τ grows approximately proportional to the ensemble size N. This means that it may require rather large ensembles to suppress these sampling fluctuations to a sufficient degree such that selective forces alone are dominating. Note that for k ≥ 2, Eq. (9) is maximal for k = 2. This is the value we used in our implementation of the Demon Algorithm.

V THE DELTA DEMON ALGORITHM

The DA in its present implementation involves a significant overhead for the temperature control. This fact, and also the above mentioned problem of a single state quickly filling up the target list at low temperature, motivated us to investigate a rather simplified version of the DA. This version simply requires the target list to be filled with states at a single energy. In the language of statistical mechanics this means that the ensemble follows a microcanonical distribution. Since this distribution is basically a δ-function centered on the target energy we call this version of the DA the "Delta Demon" (DD). In our graph partitioning example there are only integer energies and so we defined the target energy to be equal to the current energy minus one.

The DD algorithm only accepts states with energy lower than the current one and is therefore closer to the quenching algorithm. The main difference to quenching is that the DD algorithm is able to reallocate its random walkers to more favorable parts of the state space. In this way the algorithm avoids early trapping in unfavorable local minima and eventually outperforms quenching. The algorithm proceeds in such a way that each state of the current list L^k produces a neighboring state. If the energy of this state is equal to the target energy, the state is moved to the target list L^{k+1}. If its energy is less, then that state is kept in a reservoir from where it will be moved in a later stage of the algorithm into the target list for that energy. In our implementation, the current list is scanned in a random order and the scan is repeated until the target list is filled. In this way it takes at least N sweeps through the current list before the target list could be filled with equal energy neighbors of a single low energy state. For the DA this may happen much faster, in log_2 N steps, since both the original and the neighbor state can be moved to the target list. Since at low energy it can take very long to fill the target list, we set an upper limit on the number of energy evaluations per target list. The algorithm then only continues with the target states found so far. This actually decreases the ensemble size N. A schematic sketch of one DD step is given below.

Figure 1 shows the vbsfe values for the DD algorithm for a representative run. The initial ensemble size was N = 1000 and the maximum number of energy evaluations was set to 100000. As expected, the DD algorithm is initially faster than SA and the DA because there are only downhill moves.
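One DD step can be sketched as follows; energy and neighbor are placeholders for the problem-specific functions, and the reservoir is a dictionary mapping energies below the current target to the states parked there (a simplified illustration, not the authors' code).

import random

def delta_demon_step(current, energy, neighbor, target_energy, max_evals, reservoir):
    # Start from any states already parked at the target energy.
    next_list = reservoir.pop(target_energy, [])
    evals = 0
    while len(next_list) < len(current) and evals < max_evals:
        # Scan the current list in random order, one neighbor per state.
        for s in random.sample(current, len(current)):
            cand = neighbor(s)
            evals += 1
            e = energy(cand)
            if e == target_energy:
                next_list.append(cand)
            elif e < target_energy:
                reservoir.setdefault(e, []).append(cand)   # keep for a later target energy
            if len(next_list) >= len(current) or evals >= max_evals:
                break
    return next_list          # may be shorter than the current list: the ensemble shrinks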


The algorithm is slower than quenching because the latter does not require a target list to be filled. Eventually, however, quenching becomes slower due to trapping in local minima. In that regime the DD algorithm performs similarly to the DA with a tendency to find slightly better final best energies; in the example of Figure 1 the final value was vbsfe = 24. It was interesting to observe that there is a certain critical energy E* ≈ 26 below which it rather suddenly becomes very difficult to find lower energies. This threshold is known from the spin glass analogue as the glass transition, indicating the sudden transition to a regime where typical time scales (in particular relaxation times) become very large. Our results indicate that only SA was really able to find energies significantly below E*.

VI SUMMARY AND CONCLUSIONS

Conventional simulated annealing in the ensemble approach is based on individual moves of the ensemble members according to the Metropolis algorithm. In the present paper we generalize this approach by replacing the Metropolis algorithm by a collective move class. The new members of the ensemble are selected from a pool of states that is formed by the current states and some of their neighbors. The selection is done with the aim of achieving a certain target distribution of the energies in the next state of the ensemble. In this way ensemble members can be replaced by more favorable neighbors of other states. In analogy to the Metropolis algorithm we use the Boltzmann distribution in order to set up the required number of states at each energy in the target. Such a target distribution method reminds one of the action of Maxwell's Demon, hence the name Demon Algorithm.

We implemented the Demon Algorithm together with conventional simulated annealing and random downhill search (quenching) for the problem of graph bipartitioning. The results showed that each one of these algorithms is optimal within a certain range of available computing resources. As a measure we took the very best energy seen so far with respect to the number of energy function evaluations performed. For a small number of available energy function evaluations quenching gives the best results, simply because this algorithm does not accept any uphill moves. For intermediate computing resources quenching slows down due to trapping in local minima and the Demon Algorithm becomes superior. In the very long run conventional simulated annealing performs best. The reason for that is that sampling fluctuations in the Demon Algorithm lead to a clustering of all ensemble states in a small region of state space. The algorithm in this way "freezes" in a single local minimum. We modelled these fluctuations with a simple model which shows that the clustering takes place on a timescale proportional to the ensemble size, emphasizing the need for large ensembles. Metropolis based simulated annealing does not have this kind of problem; the individual moves of the ensemble members guarantee a more uniform sampling of the state space. Simple techniques can be implemented to reduce the extent of clustering. As an example, we mention the possibility of restricting the fertility of ensemble members, e.g., reducing the likelihood of considering neighbors of states whose neighbors have already been added to the list.


Our results show that the choice of the best algorithm depends on the available computing resources. Where exactly the distinction is to be made will of course depend on the specific optimization problem. The Demon Algorithm allows for a number of possible variations, e.g., the rather simplified Delta Demon discussed in Section V. Implementation of the Demon Algorithm as described in Section IV leaves open the optimal choice of characteristic parameters such as ensemble size, thermodynamic speed and number of neighbors generated per ensemble member.

References
[1] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, Optimization by simulated annealing, Science 220 (1983), 671-680.
[2] V. Černý, Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm, JOTA 45 (1985), 41.
[3] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (1953), 1087-1092.
[4] S. R. White, Concepts of scale in simulated annealing, ICCD '84 (1984), 646.
[5] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Analysis and Machine Intelligence PAMI-6 (1984), 721-741.
[6] G. Ruppeiner, J. M. Pedersen and P. Salamon, Ensemble approach to simulated annealing, J. Phys. I France 1 (1991), 455-470.
[7] P. Salamon, K. H. Hoffmann, J. Harland and J. D. Nulton, An information theoretic bound on the performance of simulated annealing algorithms, SDSU IRC report 88-1, 1988.
[8] J. J. Grefenstette and J. E. Baker, How genetic algorithms work: A critical look at implicit parallelism, ICGA (1989).
[9] J. D. Nulton and P. Salamon, Statistical mechanics of combinatorial optimization, Phys. Rev. A 37 (1988), 1351-1356.
[10] P. Salamon, J. Nulton, J. Robinson, J. Pedersen, G. Ruppeiner and L. Liao, Simulated annealing with constant thermodynamic speed, Comput. Phys. Commun. 49 (1988), 423-428.
[11] B. Andresen, K. H. Hoffmann, K. Mosegaard, J. Nulton, J. M. Pedersen and P. Salamon, On lumped models for thermodynamic properties of simulated annealing problems, J. Phys. France 49 (1988), 1485-1492.
[12] O. Mercado Kalas, The demon algorithm, MS thesis, San Diego State University, 1989.
[13] M. R. Garey and D. S. Johnson, Computers and Intractability (Freeman, San Francisco, 1979).
[14] Y. Fu and P. W. Anderson, Application of statistical mechanics to NP-complete problems in combinatorial optimization, J. Phys. A 19 (1986), 1605.
[15] D. S. Johnson, C. R. Aragon, L. A. McGeoch and C. Schevon, Optimization by simulated annealing: An experimental evaluation (Part I), Oper. Res. (1989).
[16] B. Hajek, Cooling schedules for optimal annealing, Math. Oper. Res. 13 (1988), 311-329.
[17] J. F. Crow and M. Kimura, An Introduction to Population Genetics (Harper & Row, New York, 1970).
[18] M. Kimura and T. Ohta, The average number of generations until fixation of a mutant gene in a finite population, Genetics 61 (1969), 763-771.
