mate by direct contact or trail-laying. Then a group of ants collectively carries the
large prey back. Although this scenario seems to be well understood in biology, the
mechanisms underlying cooperative transport remain unclear. Roboticists have
attempted to model this cooperative transport. For instance, Kube and Zhang introduced a simulation model that includes stagnation recovery through task modeling. The collective behavior of their system appears to be very similar to that of real ants.
ACO was first introduced using the Travelling Salesman Problem (TSP).
Starting from its start node, an ant iteratively moves from one node to another. When located at node i at time t, ant k chooses to move to an unvisited node j with the probability given by
P^k_{i,j}(t) = [τ_{i,j}(t)]^α [η_{i,j}(t)]^β / Σ_{l ∈ N_i^k} ([τ_{i,l}(t)]^α [η_{i,l}(t)]^β),   for j ∈ N_i^k
where N_i^k is the feasible neighborhood of ant k, that is, the set of cities which ant k has not yet visited; τ_{i,j}(t) is the pheromone value on the edge (i, j) at time t, and α is the weight of the pheromone; η_{i,j}(t) is the a priori available heuristic information on the edge (i, j) at time t, and β is the weight of the heuristic information. The two parameters α and β determine the relative influence of the pheromone trail and the heuristic information. A
generalized version of the pseudo-code for the ACO algorithm with reference to the TSP is illustrated in the algorithm below.
14. Update the trail pheromone intensity for every edge (i, j).
15. Compare and update the best solution;
16. End While.
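To make the transition rule concrete, the following is a minimal Python sketch of how a single ant could select its next city; the function name, the roulette-wheel selection, and the data layout (a pheromone matrix tau and a heuristic matrix eta) are illustrative assumptions, not the report's actual implementation.

import random

def choose_next_city(current, unvisited, tau, eta, alpha=2.0, beta=2.0):
    # tau[i][j]: pheromone on edge (i, j); eta[i][j]: heuristic value,
    # typically 1 / distance(i, j); alpha, beta: the weights defined above.
    weights = [(tau[current][j] ** alpha) * (eta[current][j] ** beta)
               for j in unvisited]
    total = sum(weights)
    if total == 0:
        # No pheromone or heuristic information yet: choose uniformly.
        return random.choice(unvisited)
    r = random.uniform(0, total)
    acc = 0.0
    for j, w in zip(unvisited, weights):
        acc += w
        if r <= acc:
            return j
    return unvisited[-1]

Each ant repeatedly makes such a selection until all cities are visited, after which the trail pheromone is updated and the best solution is compared, as in steps 14 and 15 of the pseudo-code above.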
3.1.2 ANT COLONY ALGORITHMS FOR OPTIMIZATION PROBLEMS
3.1.2.1 TRAVELLING SALESMAN PROBLEM (TSP)
Given a collection of cities and the cost of travel between each pair of them,
the travelling salesman problem is to find the cheapest way of visiting all of the cities
and returning to the starting point. It is assumed that the travel costs are symmetric in
the sense that travelling from city X to city Y costs just as much as travelling from Y
to X. The parameter settings used for the ACO algorithm are as follows:
Number of ants = 5
Maximum number of iterations = 1000
α = 2
β = 2
ρ = 0.9
Q = 10
A TSP with 20 cities is used to illustrate the ACO algorithm.
The best route obtained is depicted as 1 → 14 → 11 → 4 → 8 → 10 → 15 → 19 → 7 → 18 → 16 → 5 → 13 → 20 → 6 → 17 → 9 → 2 → 12 → 3 → 1, and is illustrated in the figure with a cost of 24.5222. The search result for a TSP with 198 cities is illustrated in Figure 1.7, with a total cost of 19961.3045.
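For reference, the reported cost of a route is simply the sum of the edge lengths along the closed tour; a minimal sketch of this evaluation, assuming a symmetric distance matrix dist and a route given as a list of city indices that ends where it starts, is shown below.

def tour_cost(route, dist):
    # route[0] must equal route[-1] so the return edge is included.
    # dist is assumed symmetric: dist[a][b] == dist[b][a].
    return sum(dist[a][b] for a, b in zip(route, route[1:]))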
of objects in its neighborhood, and their similarity. Lumer and Faieta used an average
similarity, mixing distances between objects with their number, incorporating it
simultaneously into a response threshold function like the algorithm proposed by
Deneubourg et al. ACLUSTER uses a combination of two independent response threshold functions, each associated with a different environmental factor: the number of objects in the area and their similarity.
where w is called the inertia factor; r1 and r2 are random numbers, which are used to maintain the diversity of the population and are uniformly distributed in the interval [0, 1] for the j-th dimension of the i-th particle; c1 is a positive constant, called the coefficient of the self-recognition component; and c2 is a positive constant, called the coefficient of the social component. From this equation a particle decides where to move next, considering its own experience, which is the memory of its best past position, and the experience of the most successful particle in the swarm. In the
particle swarm model, the particle searches for solutions in the problem space within a range [−s, s] (if the range is not symmetrical, it can be translated to the corresponding symmetrical range). In order to guide the particles effectively in the search space, the maximum moving distance during one iteration is clamped to the maximum velocity range [−vmax, vmax]:
v_ij = sign(v_ij) · min(|v_ij|, vmax).
The value of vmax is p·s, with 0.1 ≤ p ≤ 1.0, and is usually chosen to be s, i.e. p = 1.
The pseudo-code for the particle swarm optimization algorithm is illustrated in Algorithm 3.2.1.
3.2.1 PARTICLE SWARM OPTIMIZATION ALGORITHM
01. Initialize the size of the particle swarm n, and other parameters.
02. Initialize the positions and the velocities for all the particles randomly.
03. While (the end criterion is not met) do
04. t= t + 1;
05. Calculate the fitness value of each particle;
06. x* = argmin_{i=1..n} (f(x*(t − 1)), f(x_1(t)), f(x_2(t)), ..., f(x_i(t)), ..., f(x_n(t)));
07. For i= 1 to n
08. x#_i(t) = argmin (f(x#_i(t − 1)), f(x_i(t)));
09. For j = 1 to Dimension
10. Update the j-th dimension value of x_i and v_i
11. according to the velocity and position update equations;
12. Next j
13. Next i
14. End While.
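As a concrete illustration, below is a minimal Python sketch of a gbest-style PSO following the pseudo-code above, including the velocity clamping of the previous section; the sphere objective, the parameter values, and the variable names are illustrative assumptions rather than the report's actual implementation.

import random

def pso(objective, dim, n=30, iters=1000, w=0.7, c1=2.0, c2=2.0, s=5.0, p=1.0):
    # Search in the symmetric range [-s, s]^dim with v_max = p * s.
    v_max = p * s
    x = [[random.uniform(-s, s) for _ in range(dim)] for _ in range(n)]
    v = [[random.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n)]
    pbest = [xi[:] for xi in x]                 # personal best positions x#_i
    pbest_f = [objective(xi) for xi in x]       # personal best fitness values
    g = min(range(n), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]    # global best position x*
    for _ in range(iters):
        for i in range(n):
            for j in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][j] = (w * v[i][j]
                           + c1 * r1 * (pbest[i][j] - x[i][j])
                           + c2 * r2 * (gbest[j] - x[i][j]))
                # Clamp: v = sign(v) * min(|v|, v_max)
                v[i][j] = max(-v_max, min(v_max, v[i][j]))
                x[i][j] += v[i][j]
            f = objective(x[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = x[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = x[i][:], f
    return gbest, gbest_f

# Example: minimize the sphere function in two dimensions.
best_x, best_f = pso(lambda z: sum(t * t for t in z), dim=2)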
The end criteria are usually one of the following:
Maximum number of iterations: the optimization process is terminated after a fixed
number of iterations, for example, 1000 iterations.
previous best success and the success of some other particle. Various methods have
been used to identify some other particle to influence the individual. Eberhart and Kennedy called the two basic methods the gbest model and the lbest model. In the lbest model, particles have information only of their own best and that of their nearest array neighbors (lbest), rather than that of the entire group. Namely, in the velocity update equation, gbest is replaced by lbest. So a new neighborhood relation is defined for the swarm:
v_id(t+1) = w·v_id(t) + c1·r1·(p_id(t) − x_id(t)) + c2·r2·(p_ld(t) − x_id(t)),
x_id(t+1) = x_id(t) + v_id(t+1).
In the gbest model, the trajectory of each particle's search is influenced by the best
point found by any member of the entire population. The best particle acts as an
attractor, pulling all the particles towards it. Eventually all particles will converge to
this position. The lbest model allows each individual to be influenced by some
smaller number of adjacent members of the population array. The particles selected to
be in one subset of the swarm have no direct relationship to the other particles in the
other neighborhood. Typically lbest neighborhoods comprise exactly two neighbors.
When the number of neighbors increases to all but itself in the lbest model, the case is
equivalent to the gbest model. Some experimental results have shown that the gbest model converges quickly on problem solutions but has a weakness for becoming trapped in local optima, while the lbest model converges slowly on problem solutions but is able to
flow around local optima, as the individuals explore different regions. The gbest
model is recommended strongly for unimodal objective functions, while a variable
neighborhood model is recommended for multimodal objective functions. Kennedy
and Mendes studied the various population topologies on the PSO performance.
Different concepts of neighborhood can be envisaged. A neighborhood can be spatial, when it is determined by the Euclidean distance between the positions of two particles, or sociometric (e.g. based on the index position in the storing array). The different concepts of neighborhood lead to different neighborhood topologies. Different neighborhood topologies primarily affect the
communication abilities and thus the group's performance. Different topologies are illustrated in Fig. In the case of a global neighborhood, the structure is a fully connected network where every particle has access to every other particle's best position. But in
local neighborhoods there are more possible variants. In the von Neumann topology,
neighbors above, below, and each side on a two dimensional lattice are connected.
Fig. illustrates the von Neumann topology with one section flattened out. In a
pyramid topology, three dimensional wire frame triangles are formulated as
illustrated in Fig. As shown in Fig, one common structure for a local neighborhood is the circle topology, where individuals are far away from others (in terms of graph structure, not necessarily distance) and are independent of each other, but neighbors are closely connected. Another structure is called the wheel (star) topology and has a more hierarchical structure, because all members of the neighborhood are connected to a leader individual, as shown in Fig. So all information has to be communicated through this leader, which then compares the performances of all others.
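To make these topologies concrete, the sketch below builds the neighbor index set of each particle for the gbest (fully connected), circle (ring), and wheel (star) structures; the function name and the choice of particle 0 as the wheel's leader are illustrative assumptions.

def neighbors(topology, n, leader=0):
    # Returns one set of neighbor indices per particle.
    if topology == "gbest":
        # Fully connected: every particle sees every other particle.
        return [set(range(n)) - {i} for i in range(n)]
    if topology == "circle":
        # Ring: each particle sees only its two adjacent array neighbors.
        return [{(i - 1) % n, (i + 1) % n} for i in range(n)]
    if topology == "wheel":
        # Star: all particles communicate only through the leader.
        return [set(range(n)) - {leader} if i == leader else {leader}
                for i in range(n)]
    raise ValueError("unknown topology: " + topology)

The lbest of particle i is then simply the best previous position found among neighbors(topology, n)[i].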
management, security, grid service marketing, collaboration and so on. The load
sharing of computational jobs is the major task of computational grids. To formulate
our objective, define C_{i,j} (i ∈ {1, 2, ..., m}, j ∈ {1, 2, ..., n}) as the completion time at which the grid node G_i finishes the job J_j, and let C_i represent the time at which the grid node G_i finishes all of its scheduled jobs. Define C_max = max{C_i} as the makespan and Σ_{i=1}^{m} C_i as the flowtime. An optimal schedule will be the one that optimizes both the flowtime and the makespan. The conceptually obvious rule to minimize Σ_{i=1}^{m} C_i is to schedule the Shortest Job on the Fastest Node (SJFN). The simplest rule to minimize C_max is to schedule the Longest Job on the Fastest Node (LJFN). Minimizing Σ_{i=1}^{m} C_i asks that the average job finish quickly, at the expense of the largest job taking a long time, whereas minimizing C_max asks that no job take too long, at the expense of most jobs taking a long time. Minimization of C_max will result in the maximization of Σ_{i=1}^{m} C_i.
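A small sketch of how these two quantities could be computed from per-node completion times follows; the data layout (a dict mapping each grid node to the completion times of the jobs scheduled on it) is an illustrative assumption.

def makespan_and_flowtime(schedule):
    # schedule: {node: [completion times of the jobs scheduled on that node]}
    # C_i is the time at which node i finishes its last job.
    node_finish = {node: max(times) for node, times in schedule.items() if times}
    makespan = max(node_finish.values())   # C_max = max{C_i}
    flowtime = sum(node_finish.values())   # sum over i of C_i
    return makespan, flowtime

# Example with two grid nodes: G1 finishes its jobs at 3.0 and 7.5, G2 at 4.2.
ms, ft = makespan_and_flowtime({"G1": [3.0, 7.5], "G2": [4.2]})  # ms = 7.5, ft = 11.7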
improves the simplicity of the rule, since a shorter rule is usually easier for the user to understand than a longer one. As soon as the current particle completes the construction of its rule, the rule pruning procedure is called. The quality of a rule, denoted by Q, is computed by the formula Q = sensitivity × specificity. Just after the
covering algorithm returns a rule set, another post-processing routine is used: rule set
cleaning, where rules that will never be applied are removed from the rule set. The
purpose of the validation algorithm is to statistically evaluate the accuracy of the rule
set obtained by the covering algorithm. This is done using a method known as tenfold
cross validation [29]. Rule set accuracy is evaluated and presented as the percentage
of instances in the test set correctly classified. In order to classify a new test case,
unseen during training, the discovered rules are applied in the order they were
discovered.
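For clarity, the rule quality Q can be computed directly from the confusion matrix of a single rule; a minimal sketch, using the standard definitions of sensitivity and specificity, is shown below.

def rule_quality(tp, fp, fn, tn):
    # sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity * specificity       # Q = sensitivity * specificity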
The performance of PSO-miner was evaluated using four public-domain data
sets from the UCI repository [4]. The parameter settings used are as follows: swarm size = 30; c1 = c2 = 2; maximum position = 5; maximum velocity = 0.1–0.5; maximum uncovered cases = 10; and maximum number of iterations = 4000. The results are reported in Table 1.3. The algorithm is not only simpler than many other methods, but is also a good alternative method for data mining.
3.3 RIVER FORMATION DYNAMICS
Instead of associating pheromone values to edges, we associate altitude values
to nodes. Drops erode the ground (they reduce the altitude of nodes) or deposit the
sediment (increase it) as they move. The probability that a drop takes a given edge instead of another is proportional to the downhill gradient along the edge, which in turn depends on the difference of altitudes between both nodes and the distance (i.e. the cost of the edge). At the beginning, a flat environment is provided, that is, all nodes have the same altitude. The exception is the destination node, which is a hole. Drops are released at the origin node and spread around the flat environment until some of them fall into the destination node. This erodes adjacent nodes, which creates new down slopes, and in this way the erosion process is propagated. New
drops are inserted in the origin node to transform paths and reinforce the erosion of
promising paths. After some steps, good paths from the origin to the destination are
found. These paths are given in the form of sequences of decreasing edges from the
origin to the destination.
3.3.1 ALGORITHM OF RIVER FORMATION DYNAMICS
The basic scheme of the RFD algorithm follows:
initializeDrops()
initializeNodes()
while (not allDropsFollowTheSamePath())
      and (not otherEndingCondition())
    moveDrops()
    erodePaths()
    depositSediments()
    analyzePaths()
end while
The scheme shows the main ideas of the proposed algorithm. We comment on the behavior of each step. First, drops are initialized (initializeDrops()), i.e., all drops are put in the initial node. Next, all nodes of the graph are initialized (initializeNodes()). This consists of two operations. On the one hand, the altitude of the destination node is fixed to 0. In terms of the river formation dynamics analogy, this node represents the sea, that is, the final goal of all drops. On the other hand, the altitude of the remaining nodes is set to some equal value. The while loop of the algorithm is executed until either all drops follow the same path (allDropsFollowTheSamePath()), that is, all drops traverse the same sequence of nodes, or another alternative finishing condition is satisfied (otherEndingCondition()). This condition may be used, for example, for limiting the number of iterations or the execution time. Another choice is to finish the loop if the best solution found so far is not surpassed during the last n iterations. The first step of the loop body consists in moving the drops across the nodes of the graph (moveDrops()) in a partially random way. The
following transition rule defines the probability that a drop k at a node i chooses the
node j to move next:
P_k(i, j) = decreasingGrad(i, j) / Σ_{l ∈ V_k(i)} decreasingGrad(i, l),   if j ∈ V_k(i)
P_k(i, j) = 0,   if j ∉ V_k(i)
where V_k(i) is the set of nodes that are neighbors of node i that can be visited by the drop k, and decreasingGrad(i, j) represents the negative gradient between nodes i and j, which is defined as follows:
decreasingGrad(i, j) = (altitude(j) − altitude(i)) / distance(i, j)
where altitude(x) is the altitude of the node x and distance(i, j) is the length of the edge connecting node i and node j. Let us note that, at the beginning of the algorithm, the altitude of all nodes is the same, so Σ_{l ∈ V_k(i)} decreasingGrad(i, l) is 0. In order to
give a special treatment to flat gradients, we modify this scheme as follows: We
consider that the probability that a drop moves through an edge with 0 gradient is set
to some (non null) value. This enables drops to spread around a flat environment,
which is mandatory, in particular, at the beginning of the algorithm. In fact, going one
step further, we also introduce this improvement:
We let drops climb increasing slopes with a low probability. This probability will be inversely proportional to the increasing gradient, and it will be reduced during the execution of the algorithm by using a method similar to the one used in Simulated Annealing. This new feature improves the search for good paths. In the next phase
(erodePaths()) paths are eroded according to the movements of drops in the previous
phase. In particular, if a drop moves from node A to node B then we erode A. That is,
the altitude of this node is reduced depending on the current gradient between A and
B. The erosion is higher if the down slope between A and B is high. If the edge is flat
or increasing then a small erosion is performed.
The altitude of the final node (i.e., the sea) is never modified and remains equal to 0 throughout the execution. Once the erosion process finishes, the altitude of all nodes is slightly increased (depositSediments()). The objective is to avoid that, after some iterations, the erosion process leads to a situation where all altitudes are close
to 0, which would make gradients negligible and would ruin all the formed paths. In fact, we also enable drops to deposit sediment in nodes. This happens when all movements available for a drop imply climbing an increasing slope and the drop fails to climb any edge (according to the probability assigned to it). In this case, the drop is blocked and it deposits the sediment it transports. This increases the altitude of the current node. The increment is proportional to the amount of accumulated sediment. Finally, the last step (analyzePaths()) studies all solutions found by drops and stores the best one.
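To tie these steps together, here is a minimal Python sketch of the drop movement and erosion steps described above; the data structures (an altitude dict and a distance function), the constants for flat and uphill edges, and the erosion factor are illustrative assumptions rather than the report's actual implementation.

import random

FLAT_WEIGHT = 0.1     # fixed non-null weight for edges with 0 gradient (assumed value)
UPHILL_WEIGHT = 0.01  # small weight letting drops occasionally climb (assumed value)
EROSION = 0.1         # erosion factor (assumed value)

def decreasing_grad(i, j, altitude, distance):
    # Gradient between i and j as defined above; negative for downhill edges.
    return (altitude[j] - altitude[i]) / distance(i, j)

def move_drop(i, candidates, altitude, distance):
    # Choose the next node for a drop at node i among its not-yet-visited neighbors.
    if not candidates:
        return None  # no unvisited neighbor: the drop disappears
    weights = []
    for j in candidates:
        g = decreasing_grad(i, j, altitude, distance)
        if g < 0:
            weights.append(-g)                  # downhill: proportional to the slope
        elif g == 0:
            weights.append(FLAT_WEIGHT)         # flat: small fixed probability
        else:
            weights.append(UPHILL_WEIGHT / g)   # uphill: low, inverse to the climb
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for j, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return j
    return candidates[-1]

def erode(i, j, altitude, distance, sea):
    # Erode node i after a move from i to j; the sea node is never modified.
    if i == sea:
        return
    g = decreasing_grad(i, j, altitude, distance)
    # Stronger erosion for steeper down slopes, a small erosion otherwise.
    altitude[i] -= EROSION * (-g if g < 0 else FLAT_WEIGHT)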
3.3.2 ADAPTING THE METHOD TO TSP
In this section we show how the previous general scheme is particularized to the TSP
problem. The adaptation of our method to TSP has several similarities with the way
ACO is applied to this problem. Like ants, each drop must remember the nodes
traversed so far in previous movements. In particular, this is necessary to avoid
repeating nodes. In spite of the fact that the use of gradients strongly minimizes these
situations in RFD, sequences of (improbable) climbing movements can still lead a
drop to a local cycle. Besides, altitudes are not eroded after each individual drop
movement. On the contrary, the altitudes of traversed nodes are eroded all together
when the drop actually completes a round-trip (as ants do when they increase
pheromone trails). This is another reason to keep track of the sequence of traversed
nodes. As in ACO, when a drop finds that all adjacent nodes have already been
visited, the drop disappears (again, this is much less probable in RFD than in ACO).
There is an additional reason in RFD why a drop may disappear: when all adjacent nodes are higher than the current node (i.e., the node is a valley) and the drop fails to climb any up gradient, the drop is removed and it deposits the sediment it carries at the current node. Note that, in this case, the computations performed to move the drop should not be considered lost despite the fact that it did not find a solution. This is because the accumulation of sediment increases the node altitude, and thus the gradients coming from adjacent nodes are flattened. This eventually prevents other drops from falling into the same blind alley. The adaptation of RFD to TSP has other
peculiarities that do not appear in ACO. Our algorithm provides an intrinsic method to prevent drops from traversing cycles, but the goal of TSP is precisely to find a cycle. In order to allow drops to follow a cycle involving all nodes, the origin node (which, as we said before, can be any fixed node) is cloned, as well as all edges involving it. The original node plays the role of the origin node and the cloned node plays the role of the destination node. In this way, drops can form decreasing gradients from the origin node to the destination node (for us, from a node to itself). Finally, the adaptation of RFD to TSP requires introducing other additional nodes for different technical reasons. The number of these additional nodes, called barrier nodes, belongs to O(n^2), where n is the number of nodes of the original graph. In the next section, ACO and RFD will be compared in terms of graphs with the same number of standard nodes. That is, if an experiment deals with a 30-node graph, then the quality of solutions and the consumed time are computed and compared in the case where ACO deals with the original 30-node graph and RFD deals with the corresponding adapted, bigger graph.
3.3.3 APPLYING RFD TO DYNAMIC TSP
In this section we describe the application of our approach to solve dynamic
TSP and we report some experimental results. We compare the solutions found by using ACO methods with those found by using our method. All the experiments were performed on an Intel T2400 processor at 1.83 GHz. In these experiments we follow this procedure. First, we calculate the solution of an instance of the problem by using both algorithms. Next, we introduce one of the following changes in the graph: (a) we delete an edge that is common to the solutions found by both methods; (b) we delete a node of the graph; (c) we add a new node. Once a change is introduced, we calculate a solution for this instance of the problem. Let us remark that we do not restart the algorithms, but we keep the state of the methods (that is, the amount of pheromone at each edge and the altitudes of the nodes, respectively) just before the change. The figures show the best solution found by each method for each execution time before and after the changes are introduced. These graphs have been
randomly generated assuming that each node is connected to only a few other nodes (between 10 and 20 connections per host), as this is the most common case in complex networks. The basic shape shown in the figures is also obtained with other benchmark examples. Next we comment on the experimental results.
3.3.3.1 DELETING EDGES
In this experiment we can see how RFD and ACO recalculate the solution after deleting an edge of the graph (i.e. after removing a connection of the network). It is especially remarkable that the difference between the best solutions obtained by RFD and ACO increases after the change is introduced. That is, RFD has more capacity to reconstruct good solutions after a change happens.
3.3.3.2 DELETING NODE
In this experiment we show how both methods recalculate the solution after deleting a node of the graph (i.e. after a host of the network breaks down). Note that the results are similar to those obtained when removing one edge.
3.3.3.3 ADDING NODES
Finally, we show how both methods recalculate their solutions after adding a
new host in the network. When we are using our algorithm and we add a new node,
we have to set its altitude to some value. This initial altitude should ease as much as
possible the task of finding new solutions. In particular, the altitude of the new node
is set to the average of its neighbors' altitudes. We use this altitude because it is the value that is least aggressive with respect to its environment, and in fact we observed that it allows good solutions to be found. On the other hand, there are many different strategies for treating dynamic graphs in ACO. When a change is introduced in the graph, it is necessary to reset part of the pheromone information. Let us remark that, in this case, ACO did not find any solution to the problem after the node is added. However, RFD obtains a solution. Hence, in this case the advantage of RFD over ACO is quite relevant. Obviously, we could completely restart the computation of ACO from the beginning.
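A minimal sketch of this initialization step, assuming an adjacency list and an altitude dict as in the earlier sketch, is:

def init_new_node_altitude(new_node, neighbor_nodes, altitude):
    # Set the altitude of a newly added node to the average of its neighbors' altitudes.
    altitude[new_node] = sum(altitude[v] for v in neighbor_nodes) / len(neighbor_nodes)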
4. CONCLUSION