Finding Minimum Energy Atomic Clusters Using Social Insect-based Algorithms

Phil Tomson

February 7, 2006

Contents

List of Tables
List of Figures
List of Algorithms

1 Introduction
2 Background research
  2.1 Simulated Annealing
  2.2 Evolutionary Algorithms
3 Problem description
4 Method description
  4.1 Overview
  4.2 Ant Colony Optimization (ACO)
    4.2.1 Basic ACO algorithm
    4.2.2 Population-based ACO algorithm
    4.2.3 Cluster Searching with ACO
  4.3 Particle Swarm Optimization (PSO)
    4.3.1 PSO algorithm
    4.3.2 Applying PSO to cluster optimization
    4.3.3 Methods for escaping local minima in PSO
5 Experimental Results
  5.1 P-ACO experiments and results
    5.1.1 Searching for known optimal cluster conformations
    5.1.2 Monte Carlo sampling
    5.1.3 Observations
  5.2 PSO Experiments and Results
    5.2.1 Basic PSO results
    5.2.2 Determining the best values for c1, c2 and τ
    5.2.3 Observing Swarm Diversity
    5.2.4 Escaping local minima using ARPSO
    5.2.5 Escaping local minima using particle bouncing
    5.2.6 Escaping local minima using function stretching
    5.2.7 Using PSO without ACO
6 Future work
  6.1 Improving PSO
  6.2 Using relaxation to improve results
  6.3 Speeding up runtimes
References

List of Tables

1 Optimal conformations for small clusters [22],[23].
2 Results of experiment to determine best population size and values for τ0, τl while searching for the 7-atom low-energy cluster among 1000 randomly placed atoms.
3 Results from running the basic PSO algorithm for cluster sizes 7 to 20 (10 times each).
4 Varying values of c1, c2 and τ to determine the best values for each (success rate based on 20 runs for each set of values).
5 Tracking positional and velocity diversity of swarm while running PSO on 7-atom cluster.
6 Results using PSO with function stretching to escape local minima.
7 PSO with function stretching using random seed cluster.

List of Figures

1 The scaled Lennard-Jones potential energy function.
2 A: Ants form pheromone trail between nest and food; B: An obstacle placed in the trail leads to initial confusion; C: Ants find two paths around the obstacle; D: The shorter trail gets more pheromone than the longer one and eventually all ants follow the shorter one.
3 A Monte Carlo technique is used to position atoms in 3D space. Three tours of length 4 are highlighted. Each tour creates a 4-atom cluster. An atom may be in more than one tour.
4 Average cluster energy/iteration while searching for the minimum energy 7-atom cluster embedded among 200 randomly placed atoms.
5 Average cluster energy/iteration while searching for the minimum energy 7-atom cluster embedded among 400 randomly placed atoms.
6 Average cluster energy/iteration while searching for a low energy 7-atom cluster among 200 randomly placed atoms.
7 Average cluster energy/iteration while searching for a low energy 7-atom cluster among 400 randomly placed atoms.
8 Average cluster energy/iteration while searching for the minimum energy 14-atom cluster (-47.84 eV) among 200 and 400 randomly placed atoms.
9 -16.505 eV global minimum 7-atom conformation.
10 -15.533 eV local minimum 7-atom conformation.
11 -15.935 eV local minimum 7-atom conformation.

List of Algorithms

1 Basic ACO algorithm.
2 The P-ACO add_cluster function.
3 Basic PSO algorithm.

1 Introduction

Atomic clusters are aggregates of atoms held together by the same forces that cause, for example, the phase transition from vapor to liquid, the formation of crystals, and so on. Cluster sizes range from as few as three atoms to more than several hundred atoms. In bulk material, physical properties are independent of size, but at sufficiently small scales cluster size has a significant effect. The physical and chemical characteristics of a cluster often vary with its size. Even the addition of a single atom can result in an entirely different structure. At present the question of how large a cluster has to grow before bulk properties prevail remains unanswered. Atomic clusters have been studied for some time, but the emergence of the nanotechnology field in recent years has inspired renewed interest in cluster research. One of the most exciting research areas involves the role clusters play in the design of nano-scale systems. For example, researchers recently discovered that 20-atom gold clusters have large energy gaps [16], and thus large amounts of energy are required to induce any chemical reaction with them. Since these gold clusters are chemically inert, they could be used as insulators, even though bulk gold is an excellent conductor. Moreover, their inertness suggests a potential as a building block for new materials. Semiconductor clusters have received attention as well and are considered important to the development of quantum dots [1].


This thesis focuses on methods for finding minimal-energy atomic cluster conformations, since they represent the most stable conformations. This is an NP-hard problem [28] and many methods have been applied to solving it in the past. This thesis builds on work outlined in an earlier paper by Dr. Greenwood and myself [27]. In that paper we used Ant Colony Optimization (ACO), which to our knowledge had not been applied to atomic clusters before. This thesis presents results obtained using another new optimization method, Particle Swarm Optimization (PSO), both in conjunction with ACO and by itself, for finding minimal-energy atomic clusters.


2 Background research

Given that the problem of finding minimum-energy (or ground state) atomic cluster conformations is NP-hard, heuristic optimization techniques are the only viable approaches to finding solutions. Optimization algorithms fall into two distinct categories: deterministic and probabilistic (or stochastic). Deterministic methods such as branch-and-bound search [15] and gradient descent methods are useful for small problems, but are generally not applied to larger problems because they become computationally expensive in larger search spaces. Probabilistic search and optimization algorithms rely on some degree of randomness being applied to solve the problem at hand. Simulated Annealing (SA) and Genetic Algorithms (GAs) are two probabilistic optimization algorithms which have been applied to this problem with some success.

2.1 Simulated Annealing

As the name implies, Simulated Annealing is a probabilistic optimization heuristic patterned after annealing in metallurgy, in which metals are heated and then cooled in a controlled manner so that the crystalline structure of the metal eventually settles into a low energy state, which results in a stronger structure. The heating phase allows atoms to wander randomly from their initial positions through other states, including higher energy states. Slow cooling gives more chance of finding configurations with lower energy than the initial configuration (prior to heating). Simulated annealing as an optimization technique was first proposed by Kirkpatrick, Gelatt and Vecchi in 1983 [13]. The user-defined annealing schedule determines the rate of cooling and has a large effect on the quality of the final result. Higher initial temperatures and slower cooling generally result in better solutions; however, the time required to run the algorithm increases as the cooling is slowed. Coleman, Shalloway and Wu applied simulated annealing to searching for low energy clusters ranging in size from 3 to 27 atoms [5]. Known global minimum values were not achieved for any of the cluster sizes presented (their modified SA algorithm did somewhat better, but still did not achieve global minimum values in most cases).
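The acceptance rule at the heart of SA can be stated compactly. The following is a minimal sketch, not the implementation used in [5]; energy and perturb are placeholder hooks for a cluster energy evaluation and a random move, and the geometric cooling constants are illustrative assumptions:

import math
import random

def simulated_annealing(conf, energy, perturb, t_start=2.0, t_end=0.01,
                        alpha=0.95, steps_per_t=100):
    # Geometric cooling schedule: the temperature t is multiplied by
    # alpha (an assumed value) after each sweep of steps_per_t moves.
    t = t_start
    e = energy(conf)
    best, e_best = conf, e
    while t > t_end:
        for _ in range(steps_per_t):
            cand = perturb(conf)
            e_cand = energy(cand)
            d_e = e_cand - e
            # Metropolis criterion: always accept improvements; accept
            # uphill moves with probability exp(-dE / t).
            if d_e <= 0 or random.random() < math.exp(-d_e / t):
                conf, e = cand, e_cand
                if e < e_best:
                    best, e_best = cand, e_cand
        t *= alpha
    return best, e_best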

2.2 Evolutionary Algorithms

The term evolutionary algorithm refers to a category of algorithms which make use of the principles of Darwinian evolution in order to find optimal solutions. Genetic Algorithms (GAs) and Evolution Strategies (ES) are two algorithms which fit into this category. In recent years GAs have been shown to perform better than simulated annealing for many types of optimization problems. GAs make use of reproduction operators (by which child solutions are created from two or more parent solutions) and random mutation operators. Deaven, Morris and Ho in 1996 applied a genetic algorithm to finding low energy clusters and were able to find new lowest-energy conformations for six clusters ranging in size from 38 to 98 atoms [6]. The reproduction operator they chose was to bisect two parent clusters, choosing a random plane passing through the center of each, and then form a child cluster from the two halves created by this plane. Wolf and Landman built on this work by adding twinning mutations as well as seeding the population with structural motifs to improve the GA's search capability [30]. However, the optimized GA proposed by Wolf and Landman was designed to use the Lennard-Jones potential energy function; if another energy function were used, the GA would have to be re-optimized.
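As an illustration of the plane-bisection reproduction operator described above, the following sketch cuts two equal-sized parents with a random plane through each parent's center of mass and splices the halves together. It is a reconstruction from the description in [6], not their code, and it omits the repair steps a real implementation needs when spliced atoms overlap:

import math
import random

def cut_and_splice(parent_a, parent_b):
    # Parents are equal-length lists of (x, y, z) atom positions.
    n = len(parent_a)
    # Random unit normal defining the cutting plane.
    nx, ny, nz = (random.gauss(0, 1) for _ in range(3))
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    normal = (nx / norm, ny / norm, nz / norm)

    def signed_dists(cluster):
        # Signed distance of each atom from the plane through the
        # cluster's center of mass.
        cx = sum(p[0] for p in cluster) / n
        cy = sum(p[1] for p in cluster) / n
        cz = sum(p[2] for p in cluster) / n
        return [((p[0] - cx) * normal[0] + (p[1] - cy) * normal[1]
                 + (p[2] - cz) * normal[2], p) for p in cluster]

    above = sorted(signed_dists(parent_a), key=lambda dp: -dp[0])
    below = sorted(signed_dists(parent_b), key=lambda dp: dp[0])
    k = sum(1 for d, _ in above if d >= 0)
    # k atoms from parent_a's upper half, n - k from parent_b's lower half.
    return [p for _, p in above[:k]] + [p for _, p in below[:n - k]]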


3 Problem description

In this thesis we are interested in finding the geometric conformation (i.e., the 3-D structure) of atoms which results in the most stable cluster. The stability of any cluster conformation is determined, in part, by the total energy of the cluster. By exploiting the Born-Oppenheimer approximation [3] (which assumes that atomic nuclei are stationary with respect to the electrons, thus allowing electron-nucleus interactions to ignore nuclei movements), one can express the total energy of an atomic cluster as a function of the positions of the individual atoms. That is, E = E(r1, ..., rN), where E is the total energy, N is the number of atoms in the cluster and ri is the position of the i-th atom. Since the total energy is invariant with respect to the overall translation and rotation of the whole cluster, the total number of degrees of freedom is 3N − 6. The energy required to dissociate a conformation into isolated atoms is called the cluster's binding energy. A lower total energy value corresponds to a higher binding energy magnitude. Consequently, conformations with minimum total energy are the most stable. The problem of finding this minimum energy structure is therefore equivalent to optimizing E with respect to variations in all 3N − 6 degrees of freedom. The object is then to solve the following optimization problem:

Given an atomic cluster of N atoms which are subject to two-body central forces, find the conformation in 3-D Euclidean space that has the minimum total energy.
One method of solving this optimization problem is to explore the potential energy surface (PES) composed of all possible cluster conformations. Unfortunately, as the cluster size increases, so does the number of degrees of freedom in the placement of atoms. This characteristic produces a PES where the number of local optima grows exponentially with the cluster size [2]. Determining the ground-state energy level, which is the most energetically stable level, is extremely difficult. Wille and Vennik [28] proved this problem is NP-hard for homonuclear clusters (i.e., clusters with only one type of atom) and Greenwood [10] later proved the same for heteronuclear clusters. Hence, heuristic search techniques are the only viable alternative. There are different methods of quantifying the energy of a conformation. The most accurate are ab initio methods; however, they are computationally very expensive and typically scale with the 4th power of the number of electrons, making them impractical for even a single large cluster conformation evaluation, let alone global optimization. Practical alternatives to ab initio methods include density functional methods and simple force fields derived from fitting analytical functions to empirical and/or ab initio data. The results presented in this thesis rely on the latter approach for potential energy functions. In particular, we approximate


the total energy as the sum of all pairwise interactions between atoms:

E = Σ_{i=1}^{N−1} Σ_{j=i+1}^{N} v(rij)    (1)

where rij is the Euclidean distance between atoms i and j, and v(rij) is the pairwise potential energy function. Such approximations are adequate when quantum effects and many-body effects are negligible. The commonly used Lennard-Jones pairwise potential energy function is used for v(rij) in (1):

v(rij) = (1/rij)^12 − 2 · (1/rij)^6    (2)

The first term in (2) accounts for the repulsion between atoms and is the dominant interaction when atoms are only a short distance apart. The second term represents the attractive component between neutral atoms, and this component dominates as the distance increases. This function is scaled, meaning that the globally minimal potential energy level v(r0) = −1 occurs at a bond distance of r0 = 1. Figure 1 shows this pairwise potential function. Having defined a pairwise potential energy function, cluster conformations can now be constructed and their energy levels computed using equations (1) and (2). Table 1 gives the globally minimum total energy conformations for small clusters using the scaled Lennard-Jones potential energy function.
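Equations (1) and (2) translate directly into code. The following minimal sketch computes the total energy of a cluster given as a list of 3-D coordinates (the function names are illustrative, not from the thesis):

import math

def lj_pair(r):
    # Scaled Lennard-Jones pair potential, equation (2): minimum of -1 at r = 1.
    return (1.0 / r)**12 - 2.0 * (1.0 / r)**6

def cluster_energy(atoms):
    # Total energy, equation (1): sum of lj_pair over all unique pairs.
    e = 0.0
    for i in range(len(atoms) - 1):
        for j in range(i + 1, len(atoms)):
            e += lj_pair(math.dist(atoms[i], atoms[j]))
    return e

# Sanity check: a 2-atom cluster at the optimal bond distance has energy -1.
# cluster_energy([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]) -> -1.0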



Figure 1: The scaled Lennard-Jones potential energy function.

N   Conformation
2   linear chain separated by 1 unit
3   equilateral triangle, each side 1 unit
4   tetrahedron, each side 1 unit
5   triangular bipyramid, contracted along the symmetry axis and distended in the symmetry plane
6   regular octahedron with slightly contracted sides
7   pentagonal bipyramid

Table 1: Optimal conformations for small clusters [22],[23].


4 Method description

4.1 Overview

The method for finding minimal energy atomic clusters described in this thesis involves the use of two different algorithms based on the behavior of social insects: Ant Colony Optimization and Particle Swarm Optimization. The first stage of the method is to find a candidate cluster using Monte Carlo sampling in conjunction with Ant Colony Optimization. The cluster resulting from this initial stage will not likely be optimal, so it is used as an input to the Particle Swarm Optimizer to improve the cluster's energy, driving it closer to the global minimum.

4.2 Ant Colony Optimization (ACO)

The Ant Colony Optimization metaheuristic was first proposed by Dorigo [7]. Ants are social insects with behaviors oriented towards the benefit of the colony in which they live. Of particular interest is the way they forage for food. Ants keep track of the path from the colony to a food source by depositing a chemical pheromone on the trail. The pheromone acts as a chemical marker which other ants can detect. The stronger the concentration of pheromone, the greater the probability that other ants will follow it as opposed to searching randomly. Ants deposit pheromone on both the outgoing trip to find food and the return trip to the nest. Therefore, ants


Figure 2: A: Ants form pheromone trail between nest and food; B: An obstacle placed in the trail leads to initial confusion; C: Ants find two paths around the obstacle; D: The shorter trail gets more pheromone than the longer one and eventually all ants follow the shorter one.

which return to the nest sooner from foraging will have built stronger pheromone trails than ants which have not yet completed the full round trip. Evaporation of pheromone also plays a role: the longer it takes for a particular ant to make a round trip, the more time the pheromone has to evaporate. Over time the shortest path will tend to have the strongest pheromone concentration as more ants travel it. For the ants in a colony, the pheromone trails are a communication mechanism which causes individual ants to coordinate the food foraging activity. This method of communication, whereby individual agents communicate with one another by modifying their local environment, is known as stigmergy. Ant algorithms use multiple artificial 'ants' to search for solutions to difficult combinatorial optimization problems. They conduct this search by emulating


the stigmergic communication used in real ant colonies. Each agent represents an ant that explores the edges of a graph, thereby incrementally constructing a solution to the optimization problem. As an ant moves through the graph it deposits artificial pheromones on graph edges that are part of good solutions. These artificial pheromones are represented by floating-point numerical values, with higher values indicating more pheromone. Other agents can read these pheromone values and make decisions based on them. Hence, agents communicate with each other using a form of distributed memory. Ant algorithms are particularly well suited for optimization problems that involve paths in graphs, such as routing in communications networks, vehicle routing and the well-known Traveling Salesman Problem (TSP) [8]. The ACO algorithm does differ in several respects from real ants:

• Artificial ants are agents that transition from one discrete state to another discrete state.

• Artificial ants have an internal state that records previous movements. Specifically, artificial ants can avoid traversing edges of the graph which have already been visited.

• Artificial ants can deposit an amount of pheromone which reflects the solution's quality.

• Artificial ants can exhibit behaviors (such as backtracking) which real ants do not.

As is the case with real ant trails in which the chemical pheromone evaporates, ant algorithms introduce a pheromone evaporation rate to allow the colony to 'forget' suboptimal solutions. However, Dorigo notes that pheromone evaporation seems to play a much more important role in artificial ant algorithms than it does in the case of real ants [9].

Algorithm 1 Basic ACO algorithm.

Initialize fully connected graph G
Initialize pheromone level on each edge of G to τ0
while (termination conditions not met)
    position each ant on a different node
    do
        each ant incrementally applies a state transition rule to construct a tour
        update pheromones on visited edges using a local update rule
    until (all ants construct a tour)
    update pheromones using a global updating rule
end while

4.2.1 Basic ACO algorithm

The basic ACO algorithm is shown in Algorithm 1. During each iteration of the ACO algorithm an ant moves across some edge to a new node. Let τij(t) be the amount of pheromone on edge (i, j) of the graph at time t. (The more pheromone there is, the more favorable the edge.) As soon as


an ant traverses an edge, the pheromone level is updated by the local update rule:

τij(t + 1) = (1 − ρ) · τij(t) + ρ · τ0    (3)

where ρ is a user-defined constant in the range 0 < ρ < 1 that determines pheromone persistence, 1 − ρ can be interpreted as the pheromone evaporation rate, and τ0 is the initial pheromone level. After N iterations the ant will have found a tour of length N. After completing a tour the following global update rule is applied:

τij(t + N) = (1 − α) · τij(t) + α · Δτ    (4)

where α is a user-defined constant in the range 0 < α < 1 that determines pheromone persistence (it plays the same role as ρ in the local update rule) and Δτ is determined by:

Δτ = { 1/ε    if edge is in tour
     { 0      otherwise

where ε is the total energy of the cluster (or the length of the tour in the case of the TSP). The aim of the global update rule is to allocate a greater amount of pheromone to a better solution (in this case to a cluster with lower energy). The probability that an ant k on atom r will choose to move to atom s is given


by a random proportional rule:

pk(r, s) = { [τ(r, s)] · [η(r, s)]^β / Σ_{u∈Jk(r)} [τ(r, u)] · [η(r, u)]^β    if s ∈ Jk(r)
           { 0                                                               otherwise    (5)
where τ is the pheromone level and η(r, s) = 1/(v(rrs) + ε), with ε = 1.000001 to keep the denominator from going to zero (recall that the Lennard-Jones energy function has a minimum value of -1); v(rrs) is the pairwise energy between atoms r and s computed using equation (2). Jk(r) is the set of atoms that remain to be visited by ant k positioned on atom r. β is a user-defined positive parameter which determines the relative importance between pheromone level and energy level. An ant located on atom r chooses the atom s to move to by applying a state transition rule given by

s = { argmax_{u∈Jk(r)} {[τ(r, u)] · [η(r, u)]^β}    if q ≤ q0 (exploitation)
    { S                                             otherwise (biased exploration)    (6)

where q is a random number uniformly distributed between 0 and 1, q0 is a constant (0 ≤ q0 ≤ 1), and S is a random variable selected according to the probability distribution given in equation (5). The state transition rule given by equations (5) and (6) is a pseudo-random

proportional rule. This rule tends to favor movement towards nodes (atoms) connected by edges with a large amount of pheromone, as well as towards nodes which will result in lower energy levels for the cluster. Before choosing an atom to move to, the ant samples a random number q from the unit interval and compares it to a user-defined constant q0. This constant allows the user to adjust the amount of exploration vs. exploitation in the algorithm. That is, as q0 approaches 1, the ant favors moving to an atom that lowers the energy. Conversely, as q0 approaches 0, lower-energy moves are abandoned in favor of more random exploration. If q ≤ q0, then (6) governs; otherwise (5) is used. A sketch of this rule appears below.
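The following is a minimal sketch of the pseudo-random proportional rule of equations (5) and (6). The pheromone table tau and heuristic table eta are assumed to be dictionaries keyed by atom pairs, and candidates is the (non-empty) consideration set Jk(r); these data-structure choices are illustrative assumptions, not the thesis implementation:

import random

def next_atom(r, tau, eta, candidates, beta, q0):
    # Score each candidate atom by pheromone times heuristic (equations 5/6).
    scores = {s: tau[(r, s)] * eta[(r, s)]**beta for s in candidates}
    if random.random() <= q0:
        # Exploitation (equation 6): take the best-scoring atom.
        return max(scores, key=scores.get)
    # Biased exploration (equation 5): sample proportionally to the scores.
    pick = random.random() * sum(scores.values())
    running = 0.0
    for s, score in scores.items():
        running += score
        if running >= pick:
            return s
    return s  # guard against floating-point round-off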

4.2.2 Population-based ACO algorithm

The results in this thesis were obtained using a modification of the basic ACO algorithm called the Population-based Ant Colony Optimization algorithm (P-ACO) [11]. P-ACO has the advantage of minimizing the number of floating-point operations and was recently used as the basis for an FPGA implementation of ACO [26]. The quality of results produced by P-ACO was found to be indistinguishable from those produced by the basic ACO implementation. P-ACO transfers less information (only the most important information) from one iteration of the algorithm to the next, thus making it more efficient. The P-ACO implementation was faster by at least a factor of 2, so it was used to produce the results reported in the Experimental Results section below. P-ACO keeps a population of best solutions, in this case best clusters. An elitism

strategy is employed so that the best (lowest energy) cluster found up to that point is always kept. In addition to this best cluster, a list of N clusters is kept such that the best cluster found in each iteration of the algorithm is inserted into the list; this insertion replaces the oldest cluster in the list with the current one, meaning that the population list acts as a FIFO. This means that instead of updating a pheromone matrix after each iteration of ants has constructed a solution, P-ACO updates the population list. The evaporation concept in P-ACO occurs when older solutions leave the population. The local update rule for P-ACO simply becomes:

τij(t + 1) = τij(t) + τl    (7)

where τij(t) is the amount of pheromone on edge (i, j) at time t, and τl is a user-defined constant integer used to represent the amount of pheromone deposited in a local update. Recall that τ0 is the amount of pheromone initially applied to all edges at initialization; in our implementation of P-ACO we've introduced a different pheromone level, τl, for local updates. (See the Experimental Results section for results using different values of τ0 and τl.) In P-ACO the global update rule from ACO shown in equation (4) is replaced by the add_cluster function shown in pseudo code in Algorithm 2.


Algorithm 2 The P-ACO add_cluster function.

Given a population_list of clusters and a best_cluster:
def add_cluster(new_cluster)
    if new_cluster energy < best_cluster energy
        for each edge in best_cluster
            edge pheromone level = edge pheromone level - τ0
        end
        best_cluster ← new_cluster
        for each edge in best_cluster
            edge pheromone level = edge pheromone level + τ0
        end
    end
    insert new_cluster in population_list
    removed_cluster ← oldest cluster in population_list
    for each edge in removed_cluster
        edge pheromone level = edge pheromone level - τ0
    end
    for each edge in new_cluster
        edge pheromone level = edge pheromone level + τ0
    end
end


4.2.3 Cluster Searching with ACO

ACO is easy to apply to problems like the Traveling Salesman Problem, where there is a collection of cities and the optimizer tries to find the shortest closed path that visits all cities. The criterion to optimize is Euclidean distance on a two-dimensional map. In the case of the minimal energy atomic cluster problem, atoms are positioned in three-dimensional space and the goal is to minimize the pairwise Lennard-Jones energy between all atoms of the cluster. To apply ACO to this problem, a Monte Carlo sampling method was used in which a large number of atoms were randomly scattered in a bounded 3-D space. The number of atoms should be at least 10N for a cluster size of N. The boundaries of the 3-D space should be such that they can easily accommodate the cluster of size N which is being sought. After the atoms are randomly placed, ants are randomly placed on atoms. Each ant must now visit exactly N − 1 other atoms to build a cluster of size N. This process is shown in Figure 3, where three clusters of size N = 4 are shown. The Monte Carlo sampling process can be very inefficient if the ant needs to consider each of the randomly placed atoms when choosing the next atom to move to using equation (6). For example, given 1000 atoms placed randomly in the 3D space, an ant would initially need to evaluate 999 other atoms if it were to evaluate every other atom in the space. The Lennard-Jones energy function offers clues for making the sampling process more efficient. We know that

Figure 3: A Monte Carlo technique is used to position atoms in 3D space. Three tours of length 4 are highlighted. Each tour creates a 4-atom cluster. An atom may be in more than one tour.

two atoms very close together will have a large energy as calculated by Lennard-Jones. It has been shown that in an optimum cluster the pairwise inter-atomic normalized distance is bounded from below by 0.5 [31]. This being the case, there is no need to consider atoms which are closer than 0.5 normalized distance. We also know that atoms beyond a certain distance will not contribute to reducing the energy of the cluster, since the energy approaches 0 as the distance between atoms increases. This allows us to ignore atoms beyond a certain distance. After the atoms are randomly placed, a consideration list is built for each atom such that only atoms at a distance greater than 0.5 but less than some maximum distance from the atom are inserted in the list; 3.0 was used as the maximum distance in most experiments. As the ACO algorithm runs, only atoms in a particular atom's

consideration list are considered as candidates. While using more memory up front, this adaptation can speed up the operation of the algorithm, especially as the number of randomly placed atoms grows. A sketch of the consideration-list construction appears below.
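The following is a minimal sketch of building the consideration lists, assuming atoms are (x, y, z) tuples; the distance bounds are the ones used in the experiments:

import math

def build_consideration_lists(atoms, r_min=0.5, r_max=3.0):
    # Keep only neighbors whose distance lies in (r_min, r_max): closer
    # atoms cannot appear in an optimal cluster [31], and farther atoms
    # contribute almost nothing to the energy.
    lists = {i: [] for i in range(len(atoms))}
    for i in range(len(atoms) - 1):
        for j in range(i + 1, len(atoms)):
            d = math.dist(atoms[i], atoms[j])
            if r_min < d < r_max:
                lists[i].append(j)
                lists[j].append(i)
    return lists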

4.3 Particle Swarm Optimization (PSO)

Particle swarm optimization is a population-based stochastic optimization technique first proposed by Eberhart and Kennedy in 1995 [12]. PSO was inspired by the social behavior of swarming insects, flocking birds and schooling fish. The technique has some similarities with evolutionary algorithms such as Genetic Algorithms (GA) and Evolution Strategies, though PSO does not include mutation or crossover operators. PSO starts with a population of randomly generated solutions referred to as

particles. Each particle in the population has a position and a velocity. The goal is to have particles converge on the optimum solution much like a swarm of insects converges on a food source. Each particle remembers the best solution it has found so far; this value is referred to as pbest. The best solution found by the entire swarm is also tracked; this value is referred to as gbest. Solutions can also be tracked by neighborhood. A sociometry defines how particles are related to other particles, i.e., which particles are considered neighbors. The most common sociometries in PSO are the star (in which all particles in the swarm are considered neighbors) and the ring (in which each particle has two

neighbors); however, many other sociometries have been proposed and investigated [18]. The best solution found by a neighborhood of particles is referred to as nbest. Over time, the velocity of each particle is adjusted so that it will tend to move toward its own best position (pbest) and either the best position found by the whole swarm (gbest) or the best solution found by its neighborhood group (nbest), depending on the sociometry of the swarm.

4.3.1 PSO algorithm

The PSO pseudo code is given in Algorithm 3. The equations used to calculate the next position and next velocity are:

p(t + 1) = p(t) + v(t)    (8)

v(t + 1) = ω · v(t) + c1 · U · (pbest − p(t)) + c2 · U · (lbest − p(t))    (9)

where:

• p is a vector representing the current position of the particle.

• v is a vector representing the particle's velocity.

• t is time.

• ω is a scalar used to adjust the momentum or inertia weight of the particle (usually set to between 0.5 and 2.0).

• c1 and c2 are scalar constants that determine how much the particle is directed towards good positions. Usually set near 1.0.

• U is a uniform random vector of values usually distributed over [0,1].

• pbest is the position of the best solution found so far by the particle.

• lbest is determined by:

lbest = { nbest    if rand() > τ
        { gbest    otherwise    (10)

where rand() is a random number between 0 and 1, and τ is a user-defined constant between 0 and 1; τ thus defines a threshold which determines whether nbest or gbest is selected. Values of τ under 0.5 favor choosing nbest, while values over 0.5 favor choosing gbest. A sketch of one update step follows.
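One particle update per equations (8)-(10) can be sketched as below. Positions and velocities are plain lists of 3N floats, and a fresh random factor is drawn per component for U; these representation choices are assumptions for illustration:

import random

def pso_step(p, v, pbest, nbest, gbest, omega, c1, c2, tau):
    lbest = nbest if random.random() > tau else gbest    # equation (10)
    new_p = [p[k] + v[k] for k in range(len(p))]         # equation (8)
    new_v = [omega * v[k]
             + c1 * random.random() * (pbest[k] - p[k])
             + c2 * random.random() * (lbest[k] - p[k])  # equation (9)
             for k in range(len(p))]
    return new_p, new_v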

4.3.2 Applying PSO to cluster optimization

Each atom in a cluster has a location in 3-D space, so the position vector for the PSO contains the x, y and z coordinates of each of the atoms in the cluster. Given a cluster size of N atoms, the resulting position vector (p) and velocity vector (v) sizes are 3N. We start out with a seed cluster, which is the output from the ACO phase of the method. The seed cluster is added to the swarm as-is and is also used to generate some number of particles which will become members of the swarm.

Algorithm 3 Basic PSO algorithm.

Initialize a list of N randomly generated particles
if using ring sociometry
    Initialize list of neighborhoods and assign particles to neighborhoods
end
Do
    For each particle
        Calculate fitness value
        if fitness is better than current pbest
            set pbest to current position
        end
        Calculate particle's next position (using Eq. 8)
        Calculate particle's next velocity (using Eq. 9)
    end
    Choose particle with best fitness value as gbest
    if using ring sociometry
        For each neighborhood
            Choose particle with best fitness as nbest
        end
    end
While maximum iterations or optimum solution not attained


Other, completely random particles are generated to complete the swarm. Usually a swarm consists of between 20 and 40 particles; more details on swarm size will be given in the Experimental Results section below. Each of the particles of the swarm is initially given a random velocity vector. To create particles based on the seed cluster input, a random delta value in the range from -1.0 to 1.0 is added to each value in the seed cluster's position vector. Half of the particles in the swarm are these particles derived from the seed cluster; the other half are randomly generated clusters. A sketch of this initialization appears below.
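The following minimal sketch mirrors the initialization just described. The assumption that random clusters are drawn uniformly in the bounded search space, and the exact velocity range, are illustrative rather than the thesis implementation:

import random

def make_swarm(seed, swarm_size=35, delta=1.0, space=10.0):
    # seed: 3N-dimensional position vector from the ACO phase.
    swarm = [list(seed)]  # the seed cluster joins the swarm as-is
    # Half the swarm: jittered copies of the seed cluster.
    while len(swarm) < swarm_size // 2:
        swarm.append([x + random.uniform(-delta, delta) for x in seed])
    # Other half: completely random clusters in the bounded space.
    while len(swarm) < swarm_size:
        swarm.append([random.uniform(0.0, space) for _ in seed])
    # Every particle starts with a random velocity vector (range assumed).
    velocities = [[random.uniform(-1.0, 1.0) for _ in seed] for _ in swarm]
    return swarm, velocities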

4.3.3 Methods for escaping local minima in PSO

While it has the advantage of converging quickly, the basic PSO algorithm has a tendency to converge too readily on local minima. Several modifications to the basic algorithm have been proposed to deal with this problem. The following methods were tried:

1. Attractive-Repulsive Particle Swarm Optimizer (ARPSO) [25]: The main idea of ARPSO is to measure the positional diversity of the swarm and, when it falls below a certain threshold, to put the swarm in reverse so that instead of being attracted by the best positions found so far by the swarm (gbest), neighborhood (nbest) and each particle (pbest), each particle becomes repulsed by these best positions. The swarm continues to run in this repulsive mode until the positional diversity of the swarm increases above the threshold

criterion. In theory this should increase the diversity of the swarm while also backing the swarm away from a local minimum. A fairly simple change to the basic PSO next velocity equation (9) is required:

v(t + 1) = { ω · v(t) + c1 · U · (pbest − p(t)) + c2 · U · (lbest − p(t))      if Sd > t
           { ω · v(t) − (c1 · U · (pbest − p(t)) + c2 · U · (lbest − p(t)))    otherwise    (11)
where Sd is swarm diversity and t is a user-defined threshold.

2. Particle Swarm Optimization with Spatial Particle Extension (Particle Bouncing) [14]: Similar to ARPSO, but instead of tracking the diversity of the whole swarm, the distance between particles is tracked. If two particles get too close to each other (meaning there are two solutions which are essentially the same) they are made to 'bounce' away from each other in the same direction they came from. The next velocity equation is very similar to the one for ARPSO in equation (11), except that instead of the whole swarm being reversed, only the two particles which have become too close are reversed:

v(t + 1) = { ω · v(t) − (c1 · U · (pbest − p(t)) + c2 · U · (lbest − p(t)))    if d < t
           { ω · v(t) + c1 · U · (pbest − p(t)) + c2 · U · (lbest − p(t))      otherwise    (12)

where d is the distance between particles and t is a user-defined threshold.

3. Function Stretching [21]: This method modifies the objective function so

that a local minimum is eliminated while still preserving the global minimum. Just as in the ARPSO method, swarm diversity must be monitored in order to detect when the swarm has converged on a local minimum. When a local minimum has been detected, a two-stage transformation is made to the original objective function f(x):

G(x) = f(x) + γ1 · ‖x − x̂‖ · (sign(f(x) − f(x̂)) + 1) / 2    (13)

H(x) = G(x) + γ2 · (sign(f(x) − f(x̂)) + 1) / (2 · tanh(μ · (G(x) − G(x̂))))    (14)

where x̂ is the detected local minimum of f, γ1, γ2 and μ are arbitrarily chosen positive constants, and sign(x) returns -1 if x is negative, 0 if x is 0 and +1 if x is positive. A sketch of this transformation is given below.
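The following is a minimal sketch of the two-stage transformation in equations (13) and (14); γ1, γ2 and μ are left as parameters, and the guard against division by zero at x̂ itself is an addition for numerical safety, not part of the published method:

import math

def sign(x):
    return -1 if x < 0 else (1 if x > 0 else 0)

def stretch(f, x_hat, gamma1, gamma2, mu):
    # Returns H(x) per equations (13) and (14); note that G(x_hat) = f(x_hat).
    f_hat = f(x_hat)
    def H(x):
        fx = f(x)
        step = (sign(fx - f_hat) + 1) / 2.0
        dist = math.sqrt(sum((a - b)**2 for a, b in zip(x, x_hat)))
        g = fx + gamma1 * dist * step            # equation (13)
        t = math.tanh(mu * (g - f_hat))
        if t == 0.0:
            return float('inf')                  # H spikes at x_hat itself
        return g + gamma2 * step / t             # equation (14)
    return H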


Since the first method (ARPSO) and the third method (Function Stretching) depend on monitoring the diversity of the swarm, it is important to develop methods for measuring swarm diversity. Two different notions of diversity can be developed: positional diversity and velocity diversity. Positional diversity considers the average distance between particles in the swarm, while velocity diversity considers the average velocity of all particles in the swarm. Riget and Vesterstrom [25] define a positional diversity function:

diversity(S) = (1 / (|S| · |L|)) · Σ_{i=1}^{|S|} √( Σ_{j=1}^{N} (pij − p̄j)² )    (15)

where S is the swarm, |S| is the swarm size, |L| is the length of the longest diagonal in the search space, N is the dimensionality of the problem (number of atoms in the cluster × 3), pij is the j-th value of the i-th particle and p̄j is the j-th value of the average point p̄. Positional diversity may not be the best measure of swarm diversity for our particular problem, however. Imagine a cluster conformation which is either rotated around a chosen axis to produce a second cluster, or, alternatively, imagine adding some constant value to one dimension of each atom's position to create a second cluster. Both of the clusters created will have the same energy value as the original cluster, as calculated by equations (1) and (2); however, if each of these clusters were considered to be particles in a PSO swarm there would be some distance between them and thus they would be considered positionally diverse. Sketches of both diversity measures follow.
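Minimal sketches of the two measures, assuming each particle and velocity is a list of floats; velocity diversity is taken here to be simply the mean velocity magnitude:

import math

def positional_diversity(swarm, diag_len):
    # Equation (15): mean distance of the particles from the swarm's
    # average point, normalized by the longest search-space diagonal.
    n = len(swarm[0])
    avg = [sum(p[j] for p in swarm) / len(swarm) for j in range(n)]
    total = sum(math.sqrt(sum((p[j] - avg[j])**2 for j in range(n)))
                for p in swarm)
    return total / (len(swarm) * diag_len)

def average_velocity(velocities):
    # Mean magnitude of the particle velocity vectors.
    return (sum(math.sqrt(sum(vk * vk for vk in v)) for v in velocities)
            / len(velocities))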


5 Experimental Results

5.1 P-ACO experiments and results

There are two main categories of experiments which were performed using P-ACO:

1. Searching for known optimal clusters which have been 'hidden' among a group of randomly placed atoms.

2. Random or Monte Carlo sampling, where the search for a minimal-energy cluster occurs only among randomly placed atoms.

5.1.1 Searching for known optimal cluster conformations

The purpose of this series of experiments was to determine how well the P-ACO algorithm would do at finding a known optimal cluster which is hidden among a larger number of randomly generated atoms. Unless noted otherwise, the constants used for these experiments were: α = ρ = 0.1, β = 2.0, τ0 = τl = 1.0, and the population size was 3 (refer to equation (7) and Algorithm 2). For the first experiment, 200 atoms were randomly placed in a 3-dimensional space sized 10 × 10 × 10. Then a hand-built (created according to the geometries presented in [23]), near minimum-energy 7-atom cluster (-16.4353 eV; the minimum is -16.505 eV) was embedded among the randomly placed atoms. Fifteen ants were then randomly placed at atom locations within the 3-dimensional space in each iteration. Each run went for 30 iterations. Figure 4 shows the results of averaging 10 runs.


Figure 4: Average cluster energy/iteration while searching for the minimum energy 7-atom cluster embedded among 200 randomly placed atoms.

The optimal cluster was found in all 10 runs; the earliest the optimum was found was in iteration #1, while the latest was in iteration #5. In the next experiment the number of randomly placed atoms was increased to 400 to determine how the results would be affected by a larger number of randomly placed atoms. Figure 5 shows the results averaged over 10 runs. In this case the optimum was found in only 7 out of 10 runs; the earliest the optimum was found was in iteration #3, while the latest was in iteration #28. Having a larger number of atoms proved more difficult for the algorithm, but this result seems reasonable as it is less likely that a randomly placed ant will end up on an atom that is in, or near, the optimal cluster.


Figure 5: Average cluster energy/iteration while searching for the minimum energy 7-atom cluster embedded among 400 randomly placed atoms.

5.1.2 Monte Carlo sampling

The next series of experiments duplicated the first series, except that the optimal cluster was not embedded in the randomly placed atoms. Figure 6 shows the result of searching for a low energy 7-atom cluster among 200 randomly placed atoms, averaged over 10 runs. On average, the lowest cluster energy obtained was about -9.0 eV. The best result of the 10 runs was -9.7987 eV. The next experiment is the same as the previous one, except that the number of randomly placed atoms was increased to 400, on the theory that more atoms in the space might result in a greater probability that a near-optimum cluster exists among the randomly placed atoms.



Figure 6: Average cluster energy/iteration while searching for a low energy 7-atom cluster among 200 randomly placed atoms.

Results of this experiment are shown in Figure 7. While the outcome is not very different from the previous experiment, it does show some improvement when more randomly placed atoms are available to sample. The minimum energy found in this set of runs was -10.4793 eV. The next experiment investigates what happens when larger clusters are embedded among the randomly placed atoms. Figure 8 shows the results of an experiment run by embedding the known optimal 14-atom cluster (-47.84 eV) among 200 and 400 randomly placed atoms, again averaging over 10 runs for each case.


Figure 7: Average cluster energy/iteration while searching for a low energy 7-atom cluster among 400 randomly placed atoms.

In the case where the optimum 14-atom cluster is embedded with 200 randomly placed atoms, the optimum was found 3 times out of 10 (a 30% success rate); however, in the case where it was embedded in 400 randomly placed atoms the optimum was not found in any of the 10 runs. These results suggest that the algorithm has more difficulty finding larger clusters. The last experiment in this section involved varying the population size and the

τ0, τl pheromone values used in the P-ACO algorithm. Recall that τ0 is the level of pheromone applied to all edges initially; it is also the amount added to edges included in the best cluster for each iteration (see Algorithm 2). τl is the amount of pheromone added in the local update rule (equation 7). The experiment was set up as follows:


Figure 8: Average cluster energy/iteration while searching for the minimum energy 14-atom cluster (-47.84 eV) among 200 and 400 randomly placed atoms.

• τ0 and τl were each varied between 1 and 4.

• Two population sizes were tried: 2 and 3.

This results in a total of 32 tests (16 combinations of τ0, τl and 2 different population sizes). In each test the same 7-atom nearly minimal-energy cluster (-16.4353 eV) used in the previous experiments was embedded among 1000 randomly generated atoms. For each test, the algorithm was run 10 times to find the average number of iterations in which the optimal cluster was found. The success rate (number of times the optimal cluster was found out of the total of 10 runs) was also tracked.


When multiple tests have the same success rate, the one with the lowest average number of iterations to find the optimum is considered the best. Results of this experiment appear in Table 2. The best results were obtained using τ0 = 1, τl = 2 with a population size of 3.

5.1.3 Observations

The more randomly placed atoms there are in a given area, the greater the likelihood that some subset of those atoms of size N will be a near-optimal conformation for a cluster of size N. However, as more atoms are placed into the search space, there is less probability that a randomly placed ant will land on an atom in or near that near-optimum conformation. This can account for the fact that the algorithm did not perform as well at finding an embedded optimal conformation when the number of randomly placed atoms was doubled. This suggests that as cluster size increases, the number of ants per iteration probably needs to be increased as well.

5.2 PSO Experiments and Results

5.2.1 Basic PSO results

Table 3 shows the results of running the basic PSO algorithm for cluster sizes 7 to 20. These results were generated by using output clusters from the P-ACO algorithm as seed clusters for the PSO algorithm. The number of iterations run was cluster size × 200 (for example, the 7-atom cluster was run for 1400 iterations) and the number of particles was 35.

Population size   τ0   τl   Ave. iterations to find optimum   Success rate %
2                 1    1    9.375                             80
2                 1    2    8.0                               80
2                 1    3    8.14                              70
2                 1    4    19.57                             70
2                 2    1    8.75                              40
2                 2    2    13.4                              50
2                 2    3    5.875                             80
2                 2    4    10.25                             80
2                 3    1    11.375                            80
2                 3    2    12.875                            80
2                 3    3    9.625                             80
2                 3    4    6.33                              60
2                 4    1    10.778                            90
2                 4    2    12.86                             70
2                 4    3    10.5                              80
2                 4    4    10.667                            60
3                 1    1    5.833                             60
3                 1    2    7.333                             90 (best)
3                 1    3    10.667                            90
3                 1    4    7.667                             60
3                 2    1    10.75                             80
3                 2    2    10.857                            70
3                 2    3    9.0                               70
3                 2    4    9.285                             70
3                 3    1    10.57                             70
3                 3    2    9.375                             80
3                 3    3    12.889                            90
3                 3    4    12.28                             70
3                 4    1    14.5                              60
3                 4    2    13.2                              50
3                 4    3    11.5                              60

Table 2: Results of experiment to determine best population size and values for τ0, τl while searching for the 7-atom low-energy cluster among 1000 randomly placed atoms.


N    Start energy (eV)   Average final energy (eV)   Success rate %   Opt. energy (eV)
7    -8.850              -15.928                     20               -16.505
8    -10.048             -19.492                     20               -19.821
9    -14.691             -23.348                     20               -24.113
10   -13.234             -26.790                     0                -28.422
11   -14.195             -31.042                     0                -32.765
12   -18.494             -34.826                     10               -37.967
13   -21.789             -39.719                     0                -44.845
14   -16.130             -44.454                     10               -47.845
15   -22.054             -50.161                     10               -52.322
16   -18.914             -53.656                     10               -56.815
17   -5.1426             -57.290                     0                -61.317
18   237.422             -62.801                     10               -66.530
19   24.169              -66.270                     0                -72.659
20   40.034              -71.548                     0                -77.177

Table 3: Results from running the basic PSO algorithm for cluster sizes 7 to 20 (10 times each).

Other user-defined settings: c1 = c2 = 0.9 (equation 9) and τ = 0.3 (meaning that nbest was favored 70% of the time). The PSO algorithm was run 10 times for each cluster size in order to track the success rate of the algorithm (the number of times the optimum cluster was found). To save time, the ACO algorithm was only run once to generate a seed cluster. Note: the optimal energy values in the last column are the best known as reported in the literature [29, 20]. Examining the results for the 7-atom cluster more closely: for this cluster size the algorithm achieved only a 20% success rate. The conformation for the global optimum 7-atom cluster is shown in Figure 9.


Figure 9: -16.505 eV global minimum 7-atom conformation.

Out of the 10 runs, the following non-optimal energy values seem to repeat: -15.93 eV (found 5 times; conformation shown in Figure 11) and -15.53 eV (found 3 times; conformation shown in Figure 10). These appear to be local minima which the basic PSO algorithm has trouble escaping once they are encountered. To test that theory, both the -15.93 eV and -15.53 eV non-optimal clusters were used as inputs to the algorithm (instead of using the output from the P-ACO algorithm) and the algorithm was run for 3000 iterations in each case, more than twice the number of iterations used in the original runs. In 10 runs the final cluster's energy did not differ significantly from the input; this seems to indicate that the basic PSO algorithm has great difficulty escaping local minima.

5.2.2 Determining the best values for c1, c2 and τ

Next a set of experiments was run to determine the best values for c1, c2 (in equation 9) and τ (in equation 10).

Figure 10: -15.533 eV local minimum 7-atom conformation.

Figure 11: -15.935 eV local minimum 7-atom conformation.


c1     c2     τ      Success rate %
0.85   0.50   0.05   10
0.85   0.50   0.10   25
0.85   0.50   0.25   15
0.95   0.55   0.05   20
0.95   0.55   0.10   5
0.95   0.55   0.25   30

Table 4: Varying values of c1, c2 and τ to determine the best values for each (success rate based on 20 runs for each set of values).

Since the first experiment showed that the algorithm easily gets stuck in local minima, it was hypothesized that larger values of c1 relative to c2 would favor exploration by increasing movement towards the particle's best (pbest) position as opposed to movement towards the swarm best (gbest) or neighborhood best (nbest). Also, lower values of τ would cause nbest to be chosen more often than gbest, which should also help delay the convergence of the swarm. Table 4 shows the results of this set of experiments run on the 7-atom cluster. Each test was run 20 times to determine a success rate. The best settings were found to be c1 = 0.95, c2 = 0.55 and τ = 0.25. These settings were used in all subsequent experiments unless otherwise noted. Fine-tuning these parameters increased the success rate from 20% to 30% for the 7-atom cluster, but the algorithm was still getting stuck in local minima 70% of the time.

5.2.3 Observing Swarm Diversity

As noted previously, some methods of escaping local minima require the ability to determine when the diversity of the swarm has fallen below some threshold.


case   starting positional   starting ave.   local min. positional   local min. ave.
       diversity             velocity        diversity               velocity
1      21.916                2.108           2.281                   0.074
2      21.573                2.049           1.572                   0.061
3      21.663                1.934           18.935                  0.067
4      22.584                2.041           2.953                   0.074
5      22.004                1.906           3.639                   0.064
6      22.381                1.878           1.727                   0.071
7      22.285                1.993           1.498                   0.075
8      21.936                1.694           3.966                   0.073
9      22.074                2.139           2.176                   0.074
10     22.210                1.942           2.114                   0.071

Table 5: Tracking positional and velocity diversity of swarm while running PSO on 7-atom cluster.

In this experiment PSO was run on the 7-atom cluster and the gbest was monitored to see when it converged on the -15.53 eV local minimum, while also tracking two measures of swarm diversity: positional diversity as defined in equation (15), and average velocity. In this case we consider that a local minimum has been encountered when there has not been a significant change in particle energy within 200 iterations. Table 5 shows the results of this experiment. Based on these results it is difficult to determine a good threshold to use for positional diversity in a local minimum, because it ranges from 1.498 to 18.935. However, the average velocity measurement seems much more consistent. All subsequent experiments which relied on a diversity measure used an average velocity threshold of 0.075.


5.2.4 Escaping local minima using ARPSO

The first method tried for improving the ability of PSO to escape local minima was ARPSO. The experiment was set up as follows:

• The basic PSO algorithm was run for a maximum number of iterations set at 27,500.

• Cluster size was 7.

• Swarm size was 35, c1 = 0.95, c2 = 0.55, τ = 0.25.

• Every 20 iterations the average velocity of the swarm was calculated; if the value fell below 0.075 the swarm was reversed (i.e., put into the repulsive mode in which each particle is repelled from the best values) for 60 iterations.

Out of 20 runs the success rate was 25%, which was not a significant improvement over the basic PSO results. Upon closer examination it appears that a local minimum was not encountered in any of the successful runs, meaning that the swarm never entered the reverse mode in any of the successful cases. Conversely, in all of the unsuccessful runs one of the local minima was encountered and the swarm did enter reverse mode, but reversing the swarm never proved to be a successful strategy for exiting the local minimum. Even with a significantly larger number of iterations (27,500 vs. 1,400), the ARPSO algorithm was not able to escape local minima.

5.2.5 Escaping local minima using particle bouncing

This experiment was set up exactly as the previous one. Instead of tracking swarm diversity, however, at each iteration the distance between each particle in the swarm and every other particle in the swarm was checked; a sketch of this check appears below. The next velocity of particles was calculated according to equation (12). The minimum distance threshold was set to 0.1. Out of 20 runs the success rate was again 25%, as in the ARPSO case. Similar to the ARPSO experiment, when local minima were encountered the swarm was not able to escape. Examination of the results showed that particles that bounced seemed to enter into an oscillation where they would continue to be attracted towards the local minimum and eventually bounce again (after some number of iterations) as other particles were also attracted to the same area.
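A minimal sketch of the distance check that triggers a bounce, assuming particles are lists of floats; the velocity reversal itself follows the repulsive form of equation (12):

import math

def particles_to_bounce(swarm, d_min=0.1):
    # Return the indices of all particles closer than d_min to some other
    # particle; their next velocity uses the repulsive branch of (12).
    bounced = set()
    for i in range(len(swarm) - 1):
        for j in range(i + 1, len(swarm)):
            d = math.sqrt(sum((a - b)**2
                              for a, b in zip(swarm[i], swarm[j])))
            if d < d_min:
                bounced.update((i, j))
    return bounced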

5.2.6 Escaping local minima using function stretching

This experiment was set up exactly as the ARPSO experiment, where swarm diversity was tracked every 20 iterations. However, instead of reversing the swarm when swarm diversity falls below 0.075 (which implies that a local minimum has been encountered), the objective function is changed from equation (1) to equation (14) (where f(x) is the original pairwise Lennard-Jones energy function (1), and x̂ is the local minimum just encountered). The algorithm is then run for 800 iterations using this modified objective function before returning to the original objective function. The hypothesis was that the swarm will move away from the

local minimum during this 800-iteration period, since function stretching effectively raises the local minimum. Using this method, the results for the 7-atom cluster were dramatically improved over the basic PSO results. In 20 runs the global minimum (-16.505 eV) was found in all but one run, a 95% success rate. Closer examination of the results revealed that in several of the 20 runs a local minimum was encountered and later avoided after function stretching was employed; clear evidence that the function stretching method can escape local minima in the context of this problem. Table 6 shows the results for cluster sizes 7 to 20 (success rate based on 20 runs). Of the three methods tested, function stretching shows the most promise for escaping local minima in the context of this problem. The results for the 19- and 20-atom clusters were still not favorable, because the optimum value was not found in 20 runs of the algorithm for either case. However, if the maximum number of iterations allowed were raised above 27,500 it is possible that the optimum would be found, given that the average number of iterations needed to find the optimum generally tends to increase with the number of atoms in the cluster, and in the 18-atom case it was already 26,286 iterations.

5.2.7 Using PSO without ACO

In the final experiment, PSO with function stretching was used as in the previous set of experiments. However, this time the seed cluster was randomly generated instead of being generated by ACO. The results are shown in Table 7.

N    Start energy (eV)   Avg. final energy (eV)   Success rate %   Avg. iterations
7    -9.045              -16.489                  95               5956
8    -11.204             -19.821                  100              4753
9    -13.566             -24.066                  95               6543
10   -13.412             -28.028                  60               14428
11   -15.748             -32.029                  45               17491
12   -15.943             -36.918                  50               18755
13   -15.611             -42.606                  45               18407
14   -20.549             -46.094                  40               19305
15   -5.261              -50.389                  20               22996
16   -22.584             -55.433                  35               22189
17   -21.735             -58.956                  45               18455
18   -25.387             -64.996                  5                26286
19   -27.048             -67.946                  0                NA
20   -29.623             -69.205                  0                NA

Table 6: Results using PSO with function stretching to escape local minima.

For the smaller cluster sizes it seems that starting with a randomly generated cluster versus starting with a cluster generated by the ACO algorithm makes little difference. As cluster size increases, however, more iterations are required and the success rate falls when the random cluster is used. Then again, the results are somewhat mixed, since in this set of experiments there was actually one successful finding of the global minimum for the 20-atom cluster, which was not found in the previous set of experiments.


N    Start energy (eV)   Ave. final energy (eV)   Success rate %   Ave. iterations
7    39.612              -16.500                  100              5895
8    -3.714              -19.817                  95               6280
9    16815               -23.980                  85               10759
10   9.583               -27.671                  40               22184
11   14.044              -32.318                  50               17923
12   63079               -36.804                  40               21116
13   13246               -41.251                  15               25429
14   1619.13             -45.679                  20               25159
15   156.95              -50.318                  35               23304
16   -2.787              -53.974                  15               25557
17   591162              -58.0795                 15               24647
18   2851.85             -61.439                  0                NA
19   14626               -63.438                  0                NA
20   40082.9             -68.196                  5                26501

6 Future work

6.1 Improving PSO

Further work needs to be done to improve the efficiency of the PSO-based method. Note in Table 6 that the average number of iterations required to find the global minimum tends to increase as the number of atoms in the cluster increases. In the 18-atom cluster case the average number of iterations was 26,286. Given that 35 particles were used in that experiment, this means there were on average

35 × 26,286 = 920,010

evaluations of the pairwise energy function (equation 1) before arriving at the global minimum. Reducing the number of particles would likely not help, since more iterations would then be required; Richards and Ventura [24] found that reducing swarm size has a negative impact in cases where there are numerous local minima. More experimentation is needed to discover the optimum swarm size for this problem.

Locatelli and Schoen noted that the Lennard-Jones function (2) does not penalize pairs of atoms that are far apart any more than atoms that are about 1.8 units apart (where the Lennard-Jones function effectively approaches 0), so they introduced a modified Lennard-Jones function [17]. Their modification essentially adds the scaled interatomic distance to the basic Lennard-Jones value, so that the result increases as the distance between atoms increases instead of asymptotically approaching 0, thereby penalizing larger distances. They then use this modified Lennard-Jones function in the pairwise energy function (1), which becomes the objective function (the function to minimize) for a simple Multistart-based minimization algorithm. Even with this fairly simple minimization algorithm they were able to find the global minima of some clusters that are considered difficult. Modifying the Lennard-Jones function in this way could prove beneficial in the PSO algorithm as well, improving efficiency (reducing the average number of iterations) and possibly helping to avoid some local minima. The idea would be to first run the PSO algorithm using the modified Lennard-Jones energy function for some number of iterations and then run it for some number of iterations with the traditional Lennard-Jones energy function.
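The sketch below illustrates this idea in Ruby following the description above; it is not the exact modification from [17], which should be consulted for the precise form, and the penalty weight PENALTY is an assumed value.

    # Illustrative take on the Locatelli-Schoen idea [17]: add a scaled
    # distance term to the Lennard-Jones pair potential so that widely
    # separated pairs are penalized rather than contributing ~0 energy.
    # The reduced-units LJ form and the PENALTY weight are assumptions.
    PENALTY = 0.1

    def lj(r)
      r6 = r**-6.0
      r6 * r6 - 2.0 * r6   # minimum of -1 at r = 1 in reduced units
    end

    def modified_lj(r)
      lj(r) + PENALTY * r  # grows with r instead of tending toward 0
    end

    # Pairwise cluster energy using the modified potential;
    # coords is an array of [x, y, z] triples.
    def modified_energy(coords)
      coords.combination(2).sum do |a, b|
        r = Math.sqrt(a.zip(b).map { |p, q| (p - q)**2 }.sum)
        modified_lj(r)
      end
    end

Running PSO against modified_energy for an initial phase and then switching back to the standard pairwise energy would implement the two-phase scheme suggested above.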

Monson has suggested that Spatial Particle Extension (particle bouncing) as used in PSO adds parameters which are difficult to tune [19]; the problematic parameters are the minimum distance between particles and the magnitude of the bounce. Monson proposes a proportional (or adaptive) bouncing method which is much less sensitive to these settings; his initial results indicate that it is quite successful on highly multimodal functions with many local minima. The method involves setting the minimum distance between particles (the distance which triggers a bounce) relatively high initially and then decreasing it over time, while the length of the bounce is increased over time. Future experiments should include this method to determine whether it is beneficial for this problem.
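As a rough illustration of such a schedule (the constants here are assumptions, not values from [19]), the trigger distance could be annealed downward while the bounce magnitude grows:

    # Sketch of an adaptive bounce schedule in the spirit of Monson's
    # proposal [19]: a large trigger distance early on keeps particles
    # spread out, while a longer bounce later kicks particles out of
    # crowded regions. All constants are illustrative.
    def bounce_params(iteration, max_iterations)
      frac = iteration.to_f / max_iterations
      trigger_distance = 1.0 - 0.95 * frac  # shrinks from 1.0 down to 0.05
      bounce_length    = 0.1 + 0.9 * frac   # grows from 0.1 up to 1.0
      [trigger_distance, bounce_length]
    end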

While function stretching definitely seems to do well at escaping local minima, it still does not prevent the swarm from re-entering a previously visited local minimum. Since we have developed a method for detecting local minima, perhaps that information could be used to avoid re-entering the same local minimum as the swarm progresses. When a local minimum is detected, the particle at the minimum could be added to a list of fixed tabu particles to be avoided using a method similar to particle bouncing: the distances between all particles in the swarm and the particles in the fixed tabu list could be measured, and particles too close to a tabu particle could be 'bounced' away from it. Some initial work was done in this direction, but the results were not favorable, probably because two particles can have the same conformation and energy level while being quite distant from each other in coordinate space. Perhaps a similar idea could be implemented using tabu energy levels only.
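A minimal sketch of that tabu-energy variant might look like the following; the tolerance EPSILON and all names are hypothetical, since this variant was only proposed, not implemented.

    # Sketch of the tabu-energy idea: remember the energies of detected
    # local minima and bounce any particle whose energy comes within
    # EPSILON of a remembered level. EPSILON is a hypothetical tolerance.
    EPSILON = 0.05

    def near_tabu_energy?(energy, tabu_energies)
      tabu_energies.any? { |e| (energy - e).abs < EPSILON }
    end

    # Inside the PSO loop (illustrative):
    #   tabu_energies << swarm_best_energy if local_minimum_detected
    #   bounce(particle) if near_tabu_energy?(particle.energy, tabu_energies)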

6.2 Using relaxation to improve results

Biomimetic algorithms such as ACO, PSO, and GAs have real potential for finding good solutions to the clustering problem. However, these algorithms perform a macro-search, often making it difficult to find the lowest-energy configuration in a highly multimodal PES. It is often necessary to augment them with some sort of relaxation method in order to find really good low-energy structures [32]. The ACO and PSO algorithms presented here should be augmented with a relaxation method such as the bound-constrained method proposed by Byrd et al. [4], which has had wide acceptance for these types of problems.
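No Ruby implementation of the Byrd et al. method is assumed here; as a stand-in, the sketch below shows the general shape of such a relaxation step using naive steepest descent with a central-difference gradient. Step size, iteration count, and difference width are illustrative.

    # Naive steepest-descent relaxation, a simple stand-in for the
    # L-BFGS-B method of Byrd et al. [4]. energy_fn is a callable taking
    # an array of [x, y, z] triples and returning an energy.
    def relax(coords, energy_fn, step = 1.0e-3, iters = 200, h = 1.0e-5)
      flat = coords.flatten
      iters.times do
        grad = flat.each_index.map do |i|
          plus  = flat.dup; plus[i]  += h
          minus = flat.dup; minus[i] -= h
          (energy_fn.call(plus.each_slice(3).to_a) -
           energy_fn.call(minus.each_slice(3).to_a)) / (2.0 * h)
        end
        flat = flat.zip(grad).map { |x, g| x - step * g }
      end
      flat.each_slice(3).to_a
    end

    # usage (pairwise_energy stands in for this project's energy routine):
    #   relaxed = relax(cluster, method(:pairwise_energy))

In such a scheme each candidate conformation found by ACO or PSO would be relaxed to the bottom of its local basin before its energy is compared against the best found so far.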

6.3 Speeding up runtimes

Finally, all of the programs written as the basis for the experiments presented here were written in a combination of Ruby (an interpreted language) and C (the vector math functions, for example, were implemented in C). Profiling the Ruby code would allow for concentrated speed optimization; some of the slower functions could perhaps be rewritten entirely in C to speed the program up, allowing tests with larger clusters than were presented here. Faster runtimes would also allow a greater number of experiments to be run in a given period of time, so that more algorithmic modifications could be tried in a shorter amount of time to determine which ones are effective. In the tradeoff between program development speed (where Ruby has the advantage) and program runtime speed (where C has the advantage), I opted for reduced development time.
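As a first step, Ruby's standard Benchmark library can time suspected hot spots (the stdlib profile library or the ruby-prof gem would give per-method detail); pairwise_energy and random_cluster below stand in for this project's actual routines.

    # Timing a suspected hot spot with Ruby's standard Benchmark library.
    require 'benchmark'

    cluster = random_cluster(20)
    Benchmark.bm(20) do |x|
      x.report('pairwise energy:') { 10_000.times { pairwise_energy(cluster) } }
    end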


References

[1] A. P. Alivisatos. Semiconductor clusters, nanocrystals, and quantum dots. Science, 271:933–937, 1996.

[2] R. Berry. Potential surfaces and dynamics: what clusters tell us. Chem. Rev., 93:2379–2394, 1993.

[3] M. Born and R. Oppenheimer. Zur Quantentheorie der Molekeln. Ann. Physik, 84:457–484, 1927.

[4] R. Byrd, P. Lu, and J. Nocedal. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. and Stat. Comp., 16:1190–1208, 1995.

[5] T. F. Coleman, D. Shalloway, and Z. Wu. Isotropic effective energy simulated annealing searches for low energy molecular cluster states. Computational Optimization and Applications, 2:145–170, 1993.

[6] D. M. Deaven, N. Tit, J. R. Morris, and K. M. Ho. Structural optimization of Lennard-Jones clusters by a genetic algorithm. Chem. Phys. Lett., 256:195–198, 1996.

[7] Marco Dorigo and Alberto Colorni. The ant system: Optimization by a colony of cooperating agents. IEEE Trans. on Systems, Man and Cybernetics - Part B, 26:1–13, 1996.

[8] Marco Dorigo and Luca Maria Gambardella. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1:1–24, 1997.

[9] Marco Dorigo and Thomas Stützle. Ant Colony Optimization. MIT Press, 2004.

[10] G. Greenwood. Revisiting the complexity of finding globally minimum energy configurations in atomic clusters. Z. Phys. Chem., 211:105–114, 1999.

[11] Michael Guntsch and Martin Middendorf. A population based approach for ACO. In S. Cagnoni, J. Gottlieb, E. Hart, M. Middendorf, and G. R. Raidl, editors, Applications of Evolutionary Computing: EvoWorkshops 2002: EvoCOP, EvoIASP, EvoSTIM/EvoPLAN, volume 2279, pages 72–81, 2002.

[12] J. Kennedy and R. C. Eberhart. Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks, pages 1942–1948, 1995.

[13] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220:671–680, 1983.

[14] T. Krink, J. Vesterstrom, and J. Riget. Particle swarm optimization with spatial particle extension. In Proceedings of the Congress on Evolutionary Computation, 2002.

[15] A. H. Land and A. G. Doig. An automatic method for solving discrete programming problems. Econometrica, 28:497–520, 1960.

[16] J. Li, X. Li, and L. Wang. Au20: a tetrahedral cluster. Science, 299:864–867, 2003.

[17] M. Locatelli and F. Schoen. Fast global optimization of difficult Lennard-Jones clusters. Computational Optimization and Applications, 21:55–70, 2002.

[18] R. Mendes, J. Kennedy, and J. Neves. The fully informed particle swarm: Simpler, maybe better. IEEE Transactions on Evolutionary Computation, 8:204–210, 2004.

[19] C. Monson. Email communication, November 2005.

[20] J. A. Northby. Structure and binding of Lennard-Jones clusters: 13 ≤ N ≤ 147. Journal of Chemical Physics, 87:6166–6178, 1987.

[21] K. Parsopoulos, V. Plagianakos, G. Magoulas, and M. Vrahatis. Stretching technique for obtaining global minimizers through particle swarm optimization. In Proc. Particle Swarm Optimization Workshop, pages 22–29, 2001.

[22] W. Pullan. Energy minimization of mixed argon-xenon microclusters using a genetic algorithm. Journal of Computational Chemistry, 18:1096–1111, 1997.

[23] K. Raghavachari and C. M. Rohlfing. Bonding and stabilities of small silicon clusters: A theoretical study of Si7–Si10. J. Chem. Phys., 89:2219–2234, 1988.

[24] M. Richards and D. Ventura. Dynamic sociometry in particle swarm optimization. In International Conference on Computational Intelligence and Natural Computing, pages 1557–1560, 2003.

[25] J. Riget and J. S. Vesterstrom. A diversity-guided particle swarm optimizer - the ARPSO. Technical report, EVALife, Dept. of Computer Science, University of Aarhus, Denmark, 2002.

[26] B. Scheuermann, K. So, M. Guntsch, M. Middendorf, O. Diessel, H. ElGindy, and H. Schmeck. FPGA implementation of population-based ant colony optimization. Applied Soft Computing, 4:303–322, 2004.

[27] P. Tomson and G. Greenwood. Using ant colony optimization to find low energy atomic structures. In Proc. 2005 Congress on Evolutionary Computation, pages 121–126, 2005.

[28] L. Wille and J. Vennik. Computational complexity of the ground-state determination of atomic clusters. J. Phys. A, 18:L419–L422, 1985.

[29] L. T. Wille. Minimum-energy configurations of atomic clusters: New results obtained by simulated annealing. Chem. Phys. Lett., 133:405–410, 1987.

[30] M. Wolf and U. Landman. Genetic algorithms for structural cluster optimization. J. Phys. Chem. A, 102:6129–6137, 1998.

[31] G. Xue. Inter-particle distance at global minimizers of Lennard-Jones clusters. Journal of Global Optimization, 11:83–90, 1997.

[32] J. Zhao and R. Xie. Genetic algorithms for the geometry optimization of atomic and molecular clusters. J. Theo. Nanosci., 1:117–131, 2004.
