
Engineering Optimization

ISSN: 0305-215X (Print) 1029-0273 (Online) Journal homepage: https://www.tandfonline.com/loi/geno20

RePAMO: Recursive Perturbation Approach for Multimodal Optimization

Bhaskar Dasgupta, Kotha Divya, Vivek Kumar Mehta & Kalyanmoy Deb

To cite this article: Bhaskar Dasgupta, Kotha Divya, Vivek Kumar Mehta & Kalyanmoy Deb (2013) RePAMO: Recursive Perturbation Approach for Multimodal Optimization, Engineering Optimization, 45:9, 1073-1090, DOI: 10.1080/0305215X.2012.725050

To link to this article: https://doi.org/10.1080/0305215X.2012.725050

Published online: 28 Nov 2012.


RePAMO: Recursive Perturbation Approach for Multimodal Optimization
Bhaskar Dasgupta^a*, Kotha Divya^b, Vivek Kumar Mehta^a and Kalyanmoy Deb^a
^a Department of Mechanical Engineering, IIT Kanpur 208 016, India; ^b ISRO Satellite Centre, Bangalore, India

(Received 5 October 2011; final version received 2 August 2012)

In this article, a strategy is presented for exploiting classical algorithms in multimodal optimization problems. The strategy recursively applies a suitable local optimization method (in the present case, Nelder and Mead's simplex search method) over the search domain, following a systematic way of restarting the algorithm. The idea of climbing the hills and sliding down to the neighbouring valleys is utilized, and the implementation finds local minima as well as maxima. The concept of perturbing a minimum/maximum in several directions and restarting the algorithm for maxima/minima is introduced. The method performs favourably in comparison to other global optimization methods: the results of the algorithm, named RePAMO, are compared with the GA-clearing and ASMAGO techniques in terms of the number of function evaluations, and RePAMO is found to outperform both by a significant margin.

Keywords: multimodal optimization; recursive optimization; alternate minimization and maximization; RePAMO algorithm

1. Introduction

Suggesting multiple solutions to optimization problems that are multimodal in nature has the
benefit of providing flexibility of choice while making decisions. It also leads to a better under-
standing of the problem which is very useful in a wide range of applications such as decision
making, designing tasks, motion planning, scheduling, etc. Optimization problems often involve
human judgements that are not quantifiable. In such problems, an optimization algorithm needs
to propose possible alternative solutions that can later be judged by a human decision-maker.
There have been two approaches in the development of global optimization algorithms: deterministic and stochastic. Deterministic approaches can guarantee absolute success, but only by making restrictive assumptions about the objective function. Stochastic approaches, on the other hand, evaluate the objective function at randomly generated points under milder assumptions; their convergence is not absolute, but their probability of success approaches unity as the sample size tends to infinity. The present work attempts to imbibe the determinism of classical approaches while retaining the diversity of a population-based approach.

*Corresponding author. Email: dasgupta@iitk.ac.in

© 2013 Taylor & Francis



Among deterministic (classical) methods, several approaches have been proposed, such as the deflation techniques of Brown and Gearhart (1971) for computing further solutions of a nonlinear system and, more recently, interval-analysis-based methods (Hansen 1993, Hansen and Walster 2004).
Stochastic approaches promise the global optimum, but they are usually heuristic in nature and
expensive in application. In some approaches, gradient-based methods are coupled with certain
auxiliary functions to move successfully from one local minimum to a better one. The algorithms
developed based on this idea are the tunnelling method of Levy and Montalvo (1985), the bridging
method of Liu and Teo (1999), and the filled function method of Liu (2001), which bypass the
previously generated local minima. The success of these algorithms relies heavily on the effective
construction of suitable auxiliary functions. Yiu et al. (2004) proposed a hybrid descent method,
consisting of simulated annealing and a gradient-based method, to retain the robustness of the
stochastic optimization method and the speed of the local minimizing algorithm to find the global
solution.
Shashikala (1992) proposed an ingenious heuristic to find all the extrema of a given function.
The problem of determining the critical points of a function on a manifold is transformed to
the problem of finding all the equilibrium points of an appropriate vector field. The solution to
the vector field is evaluated through a numerical approach wherein this algorithm (Branin 1972,
Shashikala 1992, Shashikala et al. 1992, Sudarsan and Sathiya Keerthi 1998) used the constrained
stabilization approach and Adams formulas for integration.
Another recent method, called A Simple Multistart Algorithm for Global Optimization (ASMAGO), was proposed by Hickernell and Yuan (1997). This algorithm starts with a quasirandom sample of N points representing the feasible set. It applies p iterations of an inexpensive local search to concentrate the sample set and retains the q points with the smallest function values. The remaining (N − q) points are replaced by new quasirandom points, and the concentration step is repeated. Any point that is retained for s concentration iterations is used to start an efficient local search, provided that its function value is not significantly larger than the smallest function value obtained thus far. The algorithm terminates when no new local minimum is found after several 'major iterations'. The strength of this algorithm lies in choosing quasirandom rather than pseudorandom sample points, thus targeting better coverage of the feasible region. The algorithm fails, however, to locate the global optima in the case of highly oscillating functions.
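The concentrate-and-restart structure of ASMAGO described above can be sketched as follows. This is an illustrative skeleton rather than the published algorithm: plain pseudorandom points stand in for the quasirandom sample, the 'inexpensive' and 'efficient' local searches are a simple coordinate sweep and a shrinking-step pattern search invented here for the sketch, and the acceptance test against the smallest function value so far is omitted.

```python
import random

def asmago_sketch(f, bounds, N=40, p=3, q=10, s=3, majors=20, seed=0):
    """Illustrative skeleton of ASMAGO's concentrate-and-restart idea.

    Simplifications (not in the published method): pseudorandom rather
    than quasirandom sampling, simple stand-in local searches, and no
    bounds enforcement on the search steps.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def cheap_step(x, h=0.05):
        # one coordinate-wise descent sweep (the 'inexpensive' search)
        best = list(x)
        for i in range(dim):
            for d in (-h, h):
                y = list(best); y[i] += d
                if f(y) < f(best):
                    best = y
        return best

    def efficient_search(x):
        # shrinking-step pattern search (the 'efficient' local search)
        h, best = 0.1, list(x)
        while h > 1e-6:
            y = cheap_step(best, h)
            if f(y) < f(best): best = y
            else: h /= 2
        return best

    pts = [[sample(), 0] for _ in range(N)]   # [point, retained-count]
    minima = []
    for _ in range(majors):
        for _ in range(p):                    # p concentration iterations
            pts = [[cheap_step(x), c] for x, c in pts]
        pts.sort(key=lambda pc: f(pc[0]))
        keep = [[x, c + 1] for x, c in pts[:q]]        # q best survive
        fresh = [[sample(), 0] for _ in range(N - q)]  # rest replaced
        for x, c in keep:
            if c >= s:                        # retained s times: refine
                m = efficient_search(x)
                if all(sum((a - b) ** 2 for a, b in zip(m, m2)) > 1e-4
                       for m2 in minima):
                    minima.append(m)
        pts = keep + fresh
    return minima
```

For a one-dimensional double-well objective such as f(x) = (x² − 1)², every minimum the sketch returns lies near x = ±1.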
Various approaches in the history of evolutionary algorithms, which facilitate the determination
of multiple solutions, can be divided into three subgroups even though all the techniques fall
under the major group called niching techniques: (i) the sharing function approach, (ii) sequential
niching and (iii) crowding.
In multimodal GAs, instead of assigning the usual fitness to all individuals in a population, the population is divided into species, and each individual in a species is forced to share the fitness of all the other individuals in its species. This forced sharing of resources discourages crowding of the population at a particular optimal point and causes the formation of stable sub-populations (species) at different optima (niches). Furthermore, the number of individuals devoted to each niche is proportional to the expected niche payoff.
A practical scheme that directly uses the sharing metaphor to induce niching and speciation
was detailed in Goldberg and Richardson (1987), Deb (1989) and Deb and Goldberg (1989). In
this scheme, a sharing function is defined to determine the neighbourhood and degree of sharing
for each string in the population. It has been reported that the algorithm fails to establish niches in massively multimodal problems (Goldberg et al. 1992) and at peaks whose relative fitness is worse than that of the global optimum. Scaling the fitness function has been proposed as a remedy. Although other fitness scaling methods exist, such as linear scaling, exponential scaling and sigma truncation scaling, power law scaling has been

adjudged the best in a study by Kreinovich et al. (1993). Modifications based on the niche radius were proposed by Yu (2005), in which the niche radius is encoded in the chromosome itself, since setting this parameter was the major difficulty in the earlier schemes.
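The sharing idea above can be made concrete with the commonly used triangular sharing function sh(d) = 1 − (d/σ_share)^α for d < σ_share and 0 otherwise: each individual's raw fitness is divided by its niche count m_i = Σ_j sh(d_ij). A minimal sketch follows; the function and parameter names are illustrative, not taken from any particular implementation.

```python
import math

def shared_fitness(pop, fitness, sigma_share, alpha=1.0):
    """Fitness sharing in the spirit of Goldberg and Richardson (1987):
    each individual's fitness is divided by its niche count
    m_i = sum_j sh(d_ij), computed over the whole population."""
    def sh(d):
        # triangular sharing function with niche radius sigma_share
        return 1.0 - (d / sigma_share) ** alpha if d < sigma_share else 0.0

    def dist(x, y):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

    shared = []
    for i, xi in enumerate(pop):
        m = sum(sh(dist(xi, xj)) for xj in pop)  # O(N) per point, O(N^2) in all
        shared.append(fitness[i] / m)
    return shared
```

Two coincident individuals each see a niche count of 2 and have their fitness halved, while an isolated individual keeps its raw fitness; the all-pairs distance computation is the O(N²) cost mentioned below.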
In the above algorithms, calculation of the sharing fitness value requires computation of order O(N^2); hence various algorithms have been proposed with the aim of reducing the complexity. Examples include Monte Carlo sampling (Goldberg and Richardson 1987), k-means clustering analysis (Yin and Germay 1993), and k-means clustering analysis with collective sharing (Pictet et al. 1996). Miller and Shaw (1996) used a technique called dynamic niche sharing, which dynamically identifies the niches based on a greedy algorithm. Oei et al. (1991) chose tournament selection over other selection methods to reduce the computational complexity.
Gan and Warwick (1999) proposed a niching scheme called dynamic niche clustering (DNC-I) that used a clustering algorithm for the identification of peaks, followed by a sharing function. DNC-I was originally proposed for one-dimensional spaces and was later extended to DNC-II (Gan and Warwick 2000), in which the authors introduced a new triangular sharing function and the use of hill-valley function topology. It has been reported that DNC-II requires an increased number of fitness evaluations, which significantly adds to the computational time as compared to simpler approaches.
Hanagandi and Nikolaou (1998) proposed a clustering scheme without a sharing function to solve multimodal problems. This hybrid GA clustering (GA-CL) used the local optima present in the population as seed points for the formation of clusters. Although it emphasized the maintenance of multiple solutions, the algorithm was unable to maintain them and suffered from genetic drift. Petrowski (1996) suggested an alternative sharing scheme where the
fered from genetic drift. Petrowski (1996) suggested an alternative sharing scheme where the
available resources (fitness) are distributed only among the top individuals of the population.
This algorithm has been shown to require a smaller population size than the sharing methods and to incur less computational cost. Petrowski (1997a,b) and Petrowski and Genet (1999) suggested an algorithm called clearing with clustering, in which a hierarchical algorithm based on vector quantization principles is used to determine the sub-populations before a sharing or clearing method is applied; its implementation, however, was reported to be more complex. Finally, an algorithm that falls into the sharing function category is the evolutionary local search algorithm proposed by Menczer (1998). All the algorithms described above experience some difficulty in finding multiple solutions because of their parameter settings, e.g. the niche radius in sharing and the clearing radius in clearing. Even though some proposals were made to handle particular classes of problems, unless one is already conversant with the problem at hand, one faces difficulty in setting appropriate values for these parameters.
Sequential niching (Beasley et al. 1993) is another class of algorithms to extract multiple
solutions of a given problem. This procedure locates the niches temporally through an iterative
procedure of derating the fitness landscape according to previously located optima. Mahfoud (1995) has highlighted various disadvantages of this method: derating a peak may alter the location of nearby optima and may also create false peaks close to previously located ones, both of which lead to errors.
Crowding, the third subgroup of algorithms falling within the niching category, is similar to the sharing methods, although the resource considered and the methodology used to induce niching differ. An important distinction between sharing and crowding is that in sharing methods the manner in which a stable mixture of sub-populations (species) is reached, and its expected distribution, depend upon the representative niche fitness, whereas crowding strives to maintain the diversity of the pre-existing mixture. Thus, the initial distribution of the population assumes significance in these schemes, and this dependence is a major weakness of such algorithms. Moreover, these schemes are based on steady-state genetic

algorithms instead of generational genetic algorithms and thus replace only a part of the population
every generation. Various algorithms that fall within this category are Cavicchio’s preselection
(Cavicchio 1970), De Jong's crowding factor model (De Jong 1975), deterministic crowding (Mahfoud 1992), multi-niche crowding (Cedeño 1995), Harik's restricted tournament selection (Harik
1995), probabilistic crowding (Mengshoel and Goldberg 1999), and finally the struggle genetic
algorithm (Grüninger and Wallace 1996). The above methods opt for GA as the baseline technique
and attempt to induce diversity in the existing population. An original motivation for developing
niching methods was, in fact, to promote diversity in the traditional GA. Diversity can serve two purposes in GAs. The first is to delay convergence in order to increase exploration, so that a better single solution can be located; convergence to undesirable or non-global optima is termed premature convergence in the GA literature. The second purpose is to locate multiple final solutions.
The concept of niche and species, in a framework of classical optimization, is used in the
Universal Evolutionary Global Optimizer (UEGO) by Jelasity et al. (2001), developing the concept
of Genetic Algorithm Species (GAS) (Jelasity and Dombi 1998), for exploring multiple solutions
in highly oscillating problem domains, where sharing function approaches fail. Recently, Deb
and Saha (2012) have suggested a bi-objective optimization procedure for finding the minima in a
multimodal problem. This study has also suggested scalable constrained multimodal optimization
problems and solved some of them by using the proposed approach.
Apart from GA, some studies suggest Differential Evolution (DE) (Storn and Price 1995),
which is a real-parameter evolutionary algorithm, to be an efficient approach for solving multimodal optimization problems. Though it was proposed for finding single global optimum solutions,
some authors (Hendershot 2004, Zaharie 2004, Li 2005, Ronkkonen and Lampinen 2007) have
extended DE to solve multimodal problems as well.
From the above review, it is clear that most of the recent contributions in multimodal opti-
mization are based on evolutionary algorithms. However, in the opinion of the authors, classical
multi-start methods have not been explored to their complete potential. In the present work,
building on the original idea of Shashikala (1992), an algorithm called the Recursive Perturbation
Approach for Multimodal Optimization (RePAMO) has been developed for exploring multiple
solutions of a multimodal optimization problem. The RePAMO, in a way, is also a population based
approach. But, in contrast to GAs, it deals with a varying population size in every generation rather
than a constant population. The number of previously generated independent ‘optima’ constitute
the current generation. Analogous to the three basic operators of GAs, namely reproduction, crossover and mutation, RePAMO has its own set of three operators: perturbation, optimization and comparison, though these are used in a different context. GAs apply niching and speciation to split the
entire population into many sub-populations where each sub-population converges to a specific
peak or valley in a given problem. Therefore, in GAs, the population size is a major parameter to be chosen carefully in order to explore all the prominent regions; it is required to be a constant multiple of the number of peaks in a given domain, which increases the computational cost
enormously. The major intuitive theme of the present work is to develop an algorithm which can
find all the optima by following a systematic way of restarting any existing classical optimization
algorithm, eventually capturing the entire solution space with reasonable implementation cost,
for a fairly general class of problems.
The scope of the present article is limited to the development of a systematic restarting strat-
egy for multimodal optimization problems. The algorithm developed recursively applies existing
optimization techniques to locate all the solutions. For this, the concepts of alternate minimization
and maximization have been used. The algorithm is tested on several test functions (constrained
and unconstrained, continuous and discontinuous) and on a few problems of practical inter-
est. The results of the algorithm are also compared with the existing GA technique as well as
ASMAGO.

In the next section, the detailed formulation and the practical details of the implementation are presented; two subsequent sections describe several test functions and the results of RePAMO on them.

2. Formulation

RePAMO is a classical multi-start algorithm that deals with variable populations instead of a
constant population as in GAs. A selected classical optimization algorithm is recursively applied
to trace all the extrema of a given function. The idea of climbing the hills and sliding down
to the valleys is utilized. The algorithm has three basic operators: direction set generation and
perturbation, optimization and comparison.
The algorithm starts with some initial feasible random guess and finds a minimum. A direction
set containing a number of directional vectors is generated, and the minimum point is perturbed
along these directions. With these perturbed points as starting solutions, the function is opti-
mized for its neighbourhood maxima, and only the distinct maxima are retained after comparison.
The algorithm utilizes the previously generated minima/maxima to find the present generation
maxima/minima.

2.1. The fundamental principle

The idea, simply put, is to design some heuristics A+ and A−, where A+ starts at some local minimum x_min and returns all the local maxima z_1, ..., z_l adjacent to x_min. Similarly, A− returns all the neighbouring minima by starting at some initial local maximum x_max. The algorithm is designed to find all the extrema of a function defined over a domain.
The authors propose a powerful heuristic design to implement A+ and A−. Here, they present the details of A+ only, since those of A− are similar. Let x* ∈ Min = {minima}. The aim of A+(x*) is to find a set of small perturbation vectors P(x*) = {p_i, i = 1, ..., l}, such that the solutions of the given optimization problem, with (x* + α p_i) as the initial points, lead to the adjacent maxima.
The perturbation vectors are generated by first perturbing an obtained optimum solution along the coordinate axes, 2N directions in all. Additionally, the solution is perturbed along 2N random directions. Only if new solutions are obtained by perturbing along the random directions is the solution perturbed along a further set of random directions.
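The direction-set construction described above can be sketched as follows. The article does not specify how the random directions are drawn; here unit vectors obtained by normalizing Gaussian samples are assumed, generated in plus/minus pairs, with α = 0.01 as used later in the text.

```python
import math
import random

def direction_set(n, rng=None):
    """Perturbation directions as described in the text: 2N directions
    along the coordinate axes (plus and minus each axis) and 2N random
    unit directions (N plus/minus pairs of a random vector).

    Assumption: random directions are drawn as normalized Gaussian
    vectors; the article does not fix the sampling scheme."""
    rng = rng or random.Random(0)
    dirs = []
    for i in range(n):                      # 2N coordinate directions
        e = [0.0] * n
        e[i] = 1.0
        dirs.append(e)
        dirs.append([-v for v in e])
    for _ in range(n):                      # 2N random directions
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        u = [c / norm for c in v]
        dirs.append(u)
        dirs.append([-c for c in u])
    return dirs

def perturbed_starts(x_star, alpha=0.01, rng=None):
    """Starting points x* + alpha * p_i for the next round of searches."""
    n = len(x_star)
    return [[xi + alpha * pi for xi, pi in zip(x_star, p)]
            for p in direction_set(n, rng)]
```

For an N-dimensional optimum this yields 4N unit directions and hence 4N restart points; the further set of 2N random directions, used only when new solutions keep appearing, would be obtained by calling direction_set again with a fresh random state.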
The idea is illustrated through Rosenbrock's function in Figure 1. Here, the arrow directs the feasible random point to the global minimum located in a long narrow valley, which completes the first generation of the algorithm. The minimum obtained is then perturbed along coordinate and random directions.
If an algorithm based on derivatives is used as the native optimization method in individual
searches, then the extent of the perturbation (α) needs to be small enough (for the domain) to
be considered a perturbation and large enough to avoid extremely small values of the derivatives
due to being too close to a stationary point. These considerations leave quite a wide spectrum of
possible values of α. In the present implementation, however, a derivative-free method (simplex
search method) is used. Hence, the perturbation is only to generate multiple starting points in
different directions and the lower bound for α to be useful is governed only by the consideration
of how much of it will make the resulting points really different. In the present implementation,
a value of α = 0.01 is used; even α = 0.1 or α = 0.001 poses no bottleneck to the performance.
The concurrent application of the optimization algorithm from all these perturbed points as the starting points gives the neighbourhood maxima, as shown in Figure 2. The Rosenbrock function has three maxima located on the boundary of the domain, so only

Figure 1. Perturbing the minimum in 6N directions.


Figure 2. Perturbed points leading to neighbourhood maxima.

three among all the searches are useful and the remaining paths re-discover the same peaks and
are excluded from the list of peaks. This completes the second generation. As there are no more new minima in the domain of the function, the algorithm terminates in the third generation, when all A− paths lead back to the original minimum of the first generation.

2.2. The Algorithm

Begin
• Step 1: Assume bounds on x, (x_L, x_U); set α = 0.01 for use in A+ and A−; set the maximum number of generations, K_max. Set the generation number k = 1, X_Opt^k = ∅, and choose a starting point x^(0).
• Step 2: Apply a suitable optimization algorithm to find a minimum x̄; set X_Opt^k = {x̄}.
• Step 3: If X_Opt^k = ∅, terminate; else set k = k + 1.
• Step 4: For each x ∈ X_Opt^(k−1):
  • Set counter = 1.
  • If k is even, apply A+; else apply A−.
  • If counter = 1, take the perturbation directions as the coordinate directions; else take them as random directions.
  • Along each direction p_i, determine the perturbed point x^(0) = x + α p_i and optimize the function to obtain x̄.
  • If x̄ ∉ X_Opt^j for all j ≤ k, set X_Opt^k = X_Opt^k ∪ {x̄}.
  • If counter = 1, check whether any new solution has been obtained.
  • If new solutions are found, set counter = counter + 1 and continue; else stop perturbing the current point and move to the next x ∈ X_Opt^(k−1).
• Go to Step 3.
End

At the end of the execution of the algorithm, the set X_Opt^k contains minima for all odd generations k and maxima for all even generations k.
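The steps above can be condensed into a short sketch. A simple bounded pattern search stands in for the native Nelder-Mead optimizer (any local method with the usual convergence properties would do), and the counter logic that defers the random directions until the coordinate directions stop producing new solutions is simplified to always using all 4N directions. All names and tolerances here are the sketch's own.

```python
import random

def pattern_minimize(f, x0, lo, hi, h=0.25, tol=1e-6):
    """Tiny derivative-free local minimizer on a box (a stand-in for the
    native Nelder-Mead simplex search used in the article)."""
    x = [min(max(v, l), u) for v, l, u in zip(x0, lo, hi)]
    while h > tol:
        moved = False
        for i in range(len(x)):
            for d in (-h, h):
                y = list(x)
                y[i] = min(max(y[i] + d, lo[i]), hi[i])
                if f(y) < f(x):
                    x, moved = y, True
        if not moved:
            h *= 0.5
    return x

def repamo_sketch(f, x0, lo, hi, alpha=0.01, k_max=8, same_tol=1e-3, seed=0):
    """Alternate-minimization/maximization loop of Section 2.2 (simplified)."""
    rng = random.Random(seed)
    n = len(x0)

    def directions():
        dirs = []
        for i in range(n):                  # 2N coordinate directions
            e = [0.0] * n; e[i] = 1.0
            dirs += [e, [-v for v in e]]
        for _ in range(n):                  # 2N random unit directions
            v = [rng.gauss(0, 1) for _ in range(n)]
            s = sum(c * c for c in v) ** 0.5
            dirs += [[c / s for c in v], [-c / s for c in v]]
        return dirs

    def seen(x, archive):                   # inter/intra-generation comparison
        return any(sum((a - b) ** 2 for a, b in zip(x, y)) < same_tol ** 2
                   for gen in archive for y in gen)

    archive = [[pattern_minimize(f, x0, lo, hi)]]   # generation 1: one minimum
    for k in range(2, k_max + 1):
        sign = -1.0 if k % 2 == 0 else 1.0  # even k: maximize (A+)
        g = lambda x: sign * f(x)
        new = []
        for x in archive[-1]:
            for p in directions():
                start = [xi + alpha * pi for xi, pi in zip(x, p)]
                xb = pattern_minimize(g, start, lo, hi)
                if not seen(xb, archive) and not seen(xb, [new]):
                    new.append(xb)
        if not new:
            break                           # X_Opt^k empty: terminate
        archive.append(new)
    minima = [x for i, g_ in enumerate(archive) if i % 2 == 0 for x in g_]
    maxima = [x for i, g_ in enumerate(archive) if i % 2 == 1 for x in g_]
    return minima, maxima
```

On the one-dimensional double well f(x) = (x² − 1)² over [−2, 2], the sketch alternates between the two minima near ±1 and the interior maximum near 0, illustrating how each generation seeds the next.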

2.3. Implementation details

RePAMO can be implemented with any existing classical optimization technique, depending
upon the nature of the problem. It is particularly observed that its performance follows similar
trends irrespective of the native algorithm used inside the recursive steps, as long as it has the
usual convergence properties for reaching a local optimum in the search domain. Thus, it is the
power of the proposed recursive strategy and not the native algorithm that is the subject matter
of the present work. Independence of the native algorithm has, in fact, the advantage of being
applicable in combination with several native algorithms, which may be decided based on the kind
of problem one needs to solve. The performance statistics associated with the results presented
in the article pertain to its use in combination with the authors’ implementation of Nelder and
Mead’s simplex search method, which is just one of the native algorithms with which the authors
tried RePAMO.
The method follows a particular sequence of restarting. It may seem that the computational cost increases exponentially, but the comparison of solutions across and within generations substantially decreases the size of the current generation; it also provides a theoretical basis for termination.
Like any optimization method, the algorithm becomes costly in higher dimensional problems.
The successive generation of random directions and alternate minimization and maximization
sometimes makes the process expensive, compared to which the intermediate comparisons add
only a small fraction of the total computational cost. The default implementation finds minima
and maxima alternately, but only one of the two sets is required in most optimization problems.
The implementation makes use of maxima/minima to explore minima/maxima respectively. So,
it may seem that almost 50% of search effort is spent finding unwanted optima.
The idea behind finding alternate maxima follows the general intuition that when one is on a
hill-top one can view all the neighbouring valleys easily. Without climbing up to the next hill-top,
one cannot know how far the other valleys are located from the present one. Theoretically, the
positive gradient nature of the contour lines towards the maximum hampers the search for the
next minimum. One way of handling the problem is to take larger steps from the present valley,
ensuring that the step is away from the effect of the positive gradient, and minimize. But, without knowing the function behaviour, it becomes difficult to estimate the required step size. In this case, one cannot be assured of finding all the solutions of interest with only minimization (without conducting alternate maximization). Thus, the maxima of the function, in fact, get generated as a by-product of the efficient procedure to explore the solution space in search of the minima.

Figure 3. Generation 1: Random initial point leading to a minimum. Generation 2: Generation-1 minimum finds four boundary maxima. Generation 3: Previously generated maxima find remaining minima. Generation 4: Remaining boundary maxima and one internal maximum are obtained from the previous-generation minima.

3. Results and discussion

The algorithm RePAMO, explained previously, has been implemented using Nelder and Mead’s
simplex search method as the basic optimization algorithm. The authors have extended the basic
algorithm, which is for unconstrained and unbounded problems, to accommodate bounds and
constraints. Several well-known benchmark multimodal functions of various complexities were
considered to test the behaviour of the algorithm. The complexity of a function is considered
in terms of the distribution (uniform/non-uniform) of its optima. Out of the case studies made,
only a representative set, for functions listed in Appendix A, is presented here. Statistics for the
algorithm for some of the test functions are shown in Table 1. To illustrate the working of the
algorithm, the authors first present a simple 2-D example of the Six-Hump Camel function.

3.1. Simple 2-D example P1: Six-Hump Camel function

The Six-Hump Camel function has 6 internal minima, 2 maxima within the search domain and 10
boundary maxima. The solutions obtained after the execution of the algorithm are represented on
respective contour plots as a posteriori analyses; this is to verify whether the algorithm identified
all the solutions of a function or not. Figure 3 shows the generational progress of the algorithm for
this function. The connectivity of the optima is represented on a contour plot given in Figure 4.
The algorithm finds all the minima and maxima of the function. It took 5 generations and 71,998
function evaluations.

Figure 4. P1: Connectivity of all optimal solutions for the Six-Hump Camel function.


Figure 5. P2: Connectivity of the optima for the Himmelblau function.

3.2. Other 2-D examples

3.2.1. P2: Himmelblau function

The Himmelblau function has 6 boundary maxima and 4 internal minima. The execution of the
algorithm found all 10 solutions. It took 50,181 function evaluations over 5 generations. The
connectivity of the optima is shown in Figure 5.

3.2.2. P3: Guilin Hills function

The Guilin Hills function (Hickernell and Yuan 1997) has 15 local minima and 8 internal maxima. The optima are very unevenly distributed: a large number of them are clustered in the region X = {x | 0.65 ≤ x_i ≤ 1 (i = 1, 2)}. The


Figure 6. P3: Connectivity of the optima for the Guilin Hills function.

Table 1. Statistics of the algorithm for some of the test functions.

Function (Minima)(Maxima)          NFun       NGen   Solutions obtained
Six-Hump Camel function (6)(12)    71,998     5      ALL
Himmelblau function (4)(6)         50,181     5      ALL
Guilin Hills function (15)(25)     211,603    7      ALL
Rastrigin function (16)(25)        176,673    11     ALL
Ackley function (25)(36)           244,785    7      ALL

algorithm took 211,603 function evaluations and terminated in 7 generations. The connectivity of the optima is represented in Figure 6.

3.2.3. P4: Rastrigin’s function

Rastrigin’s function has 16 local minima uniformly distributed in the search domain. The algorithm
finds all the solutions in 11 generations with 176,673 function evaluations (see Figure 7).

3.2.4. P5: Ackley’s path function

Ackley's path function has 25 local minima. The algorithm took 244,785 function evaluations to find all of them, terminating in 7 generations. The optima are shown in Figure 8.

3.3. Examples of problems with constraints

3.3.1. P6: Constrained Himmelblau function

In this problem, with an equality constraint, all the solutions lie on a circle of radius 5 units, and the problem has 4 minima and 4 maxima. The implementation of RePAMO captured all of them in 100,347 function evaluations over 6 generations. The connectivity of the optima is shown in Figure 9.


Figure 7. P4: Connectivity of the optima for the Rastrigin function.


Figure 8. P5: Connectivity of the optima for Ackley’s path function.

3.3.2. P7: Constrained Guilin Hills function

The algorithm was also applied to the constrained Guilin Hills function, in which the Guilin Hills function is subjected to two inequality constraints. The algorithm successfully captures all the optima in 6 generations with 205,539 function evaluations. The optima obtained are shown in Figure 10.

3.4. Higher dimensional problems

Apart from the 2-D cases, the algorithm has been tested on a few higher dimensional problems as well. For example, with the Rastrigin function as the model function, 3-D, 4-D, 5-D, 6-D and 8-D problems have been tested. The details of the test runs are given in Table 2.

Table 2. Statistics of the algorithm for the higher dimensional Rastrigin function.

Function (Minima)              NFun         NGen   Solutions obtained
3-D Rastrigin function (8)     18,587       5      ALL
4-D Rastrigin function (16)    77,786       5      ALL
5-D Rastrigin function (32)    305,874      7      ALL
6-D Rastrigin function (64)    1,606,742    5      ALL
8-D Rastrigin function (256)   12,572,378   7      ALL


Figure 9. P6: Constrained Himmelblau function.

3.5. An application-based example: inverse kinematics problem in robotics

In robot kinematics, the forward kinematics problem finds a specific posture of the end-
effector/hand in the Cartesian space for a given set of joint angles of a robot. Its solution provides
a transformation from joint space to Cartesian space. On the other hand, the inverse kinematics
problem calculates all possible sets of joint angles (θ1 , θ2 , . . . , θn ), for a given Cartesian position
and orientation of the hand. Its solution is a transformation from Cartesian space to joint space.
Here the authors have formulated the inverse kinematics problem as a multimodal optimization
problem to locate multiple sets of joint angles, which lead to a specified end-effector posture for
the 6-dof PUMA robot. For any given robot, the link-to-link transformation matrix is given by (Fu
et al. 1987)
\[
{}^{i-1}_{\ \ i}T =
\begin{bmatrix}
c\theta_i & -s\theta_i\, c\alpha_i & s\theta_i\, s\alpha_i & a_i\, c\theta_i \\
s\theta_i & c\theta_i\, c\alpha_i & -c\theta_i\, s\alpha_i & a_i\, s\theta_i \\
0 & s\alpha_i & c\alpha_i & d_i \\
0 & 0 & 0 & 1
\end{bmatrix}. \tag{1}
\]
Once the description of any link with respect to its previous link frame is available, the authors
can represent any link (n ≤ N) with respect to the base frame {0} by applying the following
transformation.
\[
{}^{0}_{n}T = {}^{0}_{1}T\; {}^{1}_{2}T\; {}^{2}_{3}T \cdots {}^{n-1}_{\ \ n}T. \tag{2}
\]
The authors have considered the joint angles (θ1 , θ2 , . . . , θn ) as the variables of the problem and
the robot is intended to reach the point specified by the Cartesian coordinates [500, 500, 400]T


Figure 10. P7: Constrained Guilin Hills function.

Table 3. DH parameters for a 6-link robot.

Link Link-lengths, ai (mm) Link-twists, αi (deg) Link-offsets, di (mm) Joint-variables, θi

1 0.0 −90 0.0 θ1
2 432 0 149.5 θ2
3 0.0 90 0.0 θ3
4 0.0 −90 432 θ4
5 0.0 90 0.0 θ5
6 0.0 0 55.5 θ6

Table 4. Inverse kinematic solutions of a 6-link robot.

θ1 θ2 θ3 θ4 θ5 θ6 funval

0.5730 0.0084 0.6800 −0.0000 −0.6884 −0.5730 0.0000
0.5730 −0.9314 2.5554 0.0000 −1.6239 −0.5730 0.0001
−2.1437 −2.2102 0.6800 0.0002 1.5302 2.1436 0.0003
−2.1444 −3.1497 2.5546 −0.0136 0.5963 2.1564 0.0078

with an orientation of [0, 0, 0]T with respect to the base frame. The DH parameters considered for
this purpose are given in Table 3.
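The computation described by Equations (1) and (2), and the resulting multimodal objective, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names are illustrative, NumPy is used for the matrix products, and only the position error is penalized, whereas the objective actually optimized in the article may also include an orientation-error term.

```python
import numpy as np

def dh_transform(a, alpha, d, theta):
    """Link-to-link homogeneous transform of Equation (1) (DH convention)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

# DH parameters of Table 3 as (a_i, alpha_i, d_i), twists converted to radians
DH = [(0.0, -np.pi / 2, 0.0),
      (432.0, 0.0, 149.5),
      (0.0, np.pi / 2, 0.0),
      (0.0, -np.pi / 2, 432.0),
      (0.0, np.pi / 2, 0.0),
      (0.0, 0.0, 55.5)]

def forward_kinematics(thetas):
    """Chain the six link transforms as in Equation (2)."""
    T = np.eye(4)
    for (a, alpha, d), theta in zip(DH, thetas):
        T = T @ dh_transform(a, alpha, d, theta)
    return T

def ik_objective(thetas, target=(500.0, 500.0, 400.0)):
    """Squared position error of the end-effector; every zero of this
    multimodal function corresponds to one inverse kinematic solution."""
    p = forward_kinematics(thetas)[:3, 3]
    return float(np.sum((p - np.asarray(target)) ** 2))
```

Minimizing `ik_objective` over (θ1, …, θ6) with a multimodal optimizer such as RePAMO would, in principle, recover the distinct joint-angle sets of Table 4.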
RePAMO was executed on the above problem in direct form (without any pre-processing) and
the various sets of solutions obtained from the execution are given in Table 4. The table shows
four inverse kinematic solutions with four distinct sets of the positional variables. It is known
(Craig 2004) that there cannot exist more than four distinct configurations of the PUMA robot.

4. Comparison of RePAMO with clearing and ASMAGO

RePAMO is compared with clearing on functions having variable complexities, which include
the Himmelblau function, the Six-Hump Camel function and the Guilin Hills function, and with
ASMAGO on the Guilin Hills function, in terms of the number of function evaluations.

Table 5. Comparison between clearing and RePAMO algorithms for problems P1, P2 and P3.

Clearing RePAMO
Function σclear NPop NGen NFun NGen NFun

Himmelblau 0.7 300 300 90,000 5 50,181
Six-Hump Camel 0.05 3000 50 150,000 5 71,998
Guilin Hills 0.02 5000 100 500,000 7 211,603

The authors' implementation of clearing requires a population of size 300 run for 300 generations
to locate the 4 minima of the Himmelblau function, whereas RePAMO requires 50,181 function
evaluations over 5 generations (see Table 5). On the Six-Hump Camel function, clearing requires
50 generations with a population size of 3000, whereas the present algorithm requires 71,998
evaluations over 5 generations. In the case of the Guilin Hills function, RePAMO requires 7
generations with 211,603 function evaluations to locate all 15 highly oscillating local minima,
whereas clearing had to run for 100 generations with a population size of 5000.
Moreover, for the Guilin Hills function it has been reported that ASMAGO (Hickernell and
Yuan 1997) could not converge to the global minimum even at termination, after 7077 function
evaluations and 3609 gradient evaluations. The clearing method requires all the minima to have
function values of the same sign, either all negative or all positive. The Six-Hump Camel function
has some minima with positive function values and the rest with negative ones; the function
therefore had to be lifted by an amount approximately equal to the highest function value among
the minima. Only after this modification could the clearing method capture all the minima of the
function.

5. Conclusions and future work

This article reports the development of an algorithm for determining all the extrema of a function in
a given domain. The algorithm is based on a simple idea of locating neighbouring maxima from
a minimum and vice versa. The algorithm recursively applies existing classical optimization
techniques to locate all the solutions and is also well-suited for parallel programming.
The algorithm was tested on a class of problems of various complexities and succeeded in most
of the situations. The cost of computation was less than that for most of the existing multimodal
optimization techniques. The algorithm outperformed the GA–clearing technique in terms of the
number of function evaluations for all the test functions compared.
There are a few avenues in which extensions of the method can be attempted. For example,
as the algorithm stands at the present stage, domains consisting of disconnected parts cannot be
handled. Moreover, the algorithm needs to be extended to deal with problems having sparse optima
in higher dimensional discontinuous functions and to recognize the cases where the problem has
very few minima and a very large number of maxima, or vice versa. The algorithm can be tested
extensively on root finding and other practical problems. Rediscovering the same optima multiple
times increases the number of function evaluations; tracing the path of each search and terminating
a search in intermediate iterations when it retraces an earlier path would reduce this cost, but such
tracing is not included in the present implementation. A combination of an extension
of interval analysis with RePAMO may prove to be an excellent strategy.

Note

1. Other than bounds, no explicit constraint is considered at this stage.



References

Beasley, D., Bull, D., and Martin, R., 1993. A sequential niche technique for multimodal function optimization.
Evolutionary Computation, 1 (2), 101–125.
Branin, F., 1972. Widely convergent method for finding multiple solutions of simultaneous nonlinear equations. IBM
Journal of Research and Development, 16 (5), 504–522.
Brown, K. and Gearhart, W., 1971. Deflation techniques for the calculation of further solutions of a nonlinear system.
Numerische Mathematik, 16 (4), 334–342.
Cavicchio, D.J, 1970. Adaptive search using simulated evolution. Technical Report. Computer and Communication
Sciences Department, University of Michigan, Ann Arbor, MI. Available from: http://hdl.handle.net/2027.42/4042
[Accessed 2 September 2012].
Cedeño, W., 1995. The multi-niche crowding genetic algorithm: analysis and applications. Thesis (PhD). University of
California, Davis, CA.
Craig, J.J., 2004. Introduction to robotics: mechanics and control: international edition. 3rd ed. Upper Saddle River, NJ:
Prentice Hall.
De Jong, K., 1975. Analysis of the behavior of a class of genetic adaptive systems. Thesis (PhD). University of Michigan,
Ann Arbor, MI.
Deb, K., 1989, Genetic algorithm in multimodal function optimization. Technical Report 89002. The Clearinghouse for
Genetic Algorithms, University of Alabama, Tuscaloosa, AL.
Deb, K. and Goldberg, D., 1989. An investigation of niche and species formation in genetic function optimization. In:
Proceedings of the 3rd international conference on genetic algorithms, 4–7 June 1989, George Mason University,
Fairfax, VA. San Francisco, CA: Morgan Kaufmann, 42–50.
Deb, K. and Saha, A., 2012. Multimodal optimization using a bi-objective evolutionary algorithm. Evolutionary
Computation, 20 (1), 27–62.
Fu, K., Gonzalez, R., and Lee, C., 1987. Robotics: control, sensing, vision, and intelligence. New York: McGraw-Hill.
Gan, J. and Warwick, K., 1999. A genetic algorithm with dynamic niche clustering for multimodal function optimisation.
In: Proceedings of the 4th international conference on artificial neural networks and genetic algorithms (ICANNGA
99), 6–9 April, Vienna. New York: Springer, 248–255.
Gan, J. and Warwick, K., 2000. A variable radius niching technique for speciation in genetic algorithms. In: Proceedings
of the 2nd genetic and evolutionary computation conference (GECCO 2000), 8–12 July 2000, Las Vegas, NV. San
Francisco, CA: Morgan-Kaufmann, 96–103.
Goldberg, D., Deb, K., and Horn, J., 1992. Massive multimodality, deception, and genetic algorithms. In: R. Männer and
B. Manderick, eds. Parallel problem solving from Nature, PPSN II. Amsterdam: North-Holland, 37–46.
Goldberg, D. and Richardson, J., 1987. Genetic algorithms with sharing for multimodal function optimization. In: J.J.
Grefenstette, ed. Proceedings of the second international conference on genetic algorithms and their applications,
28–31 July 1987, MIT, Cambridge, MA. Hillsdale, NJ: Lawrence Erlbaum Associates, 41–49.
Grüninger, T. and Wallace, D., 1996. Multimodal optimization using genetic algorithms. Thesis (Master’s). Stuttgart
University.
Hanagandi, V. and Nikolaou, M., 1998. A hybrid approach to global optimization using a clustering algorithm in a genetic
search framework. Computers & Chemical Engineering, 22 (12), 1913–1925.
Hansen, E.R., 1993. Computing zeros of functions using generalized interval arithmetic. Interval Computations, 3, 3–28.
Hansen, E. and Walster, G., 2004. Global optimization using interval analysis. Monographs and textbooks in pure and
applied mathematics Vol. 264. Boca Raton, FL: CRC Press.
Harik, G., 1995. Finding multimodal solutions using restricted tournament selection. In: L.J. Eshelman, ed. Proceedings
of the Sixth International Conference on Genetic Algorithms, 15–19 July 1995, Pittsburgh, PA. San Francisco, CA:
Morgan Kaufmann, 24–31.
Hendershot, Z., 2004. A differential evolution algorithm for automatically discovering multiple global optima in multi-
dimensional, discontinuous spaces. In: Proceedings of the fifteenth Midwest artificial intelligence and cognitive
sciences conference (MAICS 2004), 16–18 April 2004, Chicago, IL. Madison, WI: Omnipress, 92–97.
Hickernell, F. and Yuan, Y., 1997. A simple multistart algorithm for global optimization. OR Transactions, 1 (2), 1–11.
Jelasity, M. and Dombi, J., 1998. GAS, a concept on modeling species in genetic algorithms. Artificial Intelligence, 99
(1), 1–19.
Jelasity, M., Ortigosa, P., and García, I., 2001. UEGO, an abstract clustering technique for multimodal global optimization.
Journal of Heuristics, 7 (3), 215–233.
Kreinovich, V., Quintana, C., and Fuentes, O., 1993. Genetic algorithms: what fitness scaling is optimal? Cybernetics and
Systems, 24 (1), 9–26.
Levy, A.V. and Montalvo, A., 1985. The tunneling algorithm for the global minimization of functions. SIAM Journal on
Scientific and Statistical Computing, 6 (1), 15–29.
Li, X., 2005. Efficient differential evolution using speciation for multimodal function optimization. In: Proceedings of the
2005 conference on genetic and evolutionary computation (GECCO ’05), 25–29 June 2005, Washington DC. New
York: ACM, 873–880.
Liu, X., 2001. Finding global minima with a computable filled function. Journal of Global Optimization, 19 (2), 151–161.
Liu, Y. and Teo, K., 1999. A bridging method for global optimization. Journal of the Australian Mathematical Society—
Series B, 41 (1), 41–57.

Mahfoud, S., 1992. Crowding and preselection revisited. In: R. Männer and B. Manderick, eds. Parallel problem solving
from Nature, PPSN II. Amsterdam: North-Holland, 27–36.
Mahfoud, S., 1995. A comparison of parallel and sequential niching methods. In: L.J. Eshelman, ed. Proceedings of the
sixth international conference on genetic algorithms, 15–19 July 1995, Pittsburgh, PA. San Francisco, CA: Morgan
Kaufmann, 136–143.
Menczer, F., 1998. Life-like agents: Internalizing local cues for reinforcement learning and evolution. Thesis (PhD).
University of California, San Diego, CA.
Mengshoel, O. and Goldberg, D., 1999. Probabilistic crowding: deterministic crowding with probabilistic replacement.
In: W. Banzhaf, et al., eds. Proceedings of the genetic and evolutionary computation conference (GECCO 1999),
13–17 July 1999, Orlando, FL. San Francisco, CA: Morgan Kaufmann, 409–416.
Miller, B. and Shaw, M., 1996. Genetic algorithms with dynamic niche sharing for multimodal function optimization. In:
Proceedings of IEEE international conference on evolutionary computation (ICEC ’96), 20–22 May 1996, Nayoya
University, Japan. IEEE, 786–791.
Oei, C., Goldberg, D., and Chang, S., 1991. Tournament selection, niching, and the preservation of diversity. Technical
Report 91011. Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL.
Petrowski, A., 1996. A clearing procedure as a niching method for genetic algorithms. In: Proceedings of the 1996 IEEE
international conference on evolutionary computation (ICEC ’96), 20–22 May 1996, Nagoya, Japan. New York:
IEEE Press, 798–803.
Petrowski, A., 1997a. An efficient hierarchical clustering technique for speciation. Presented at Evolution Artificielle,
third European conference (AE’97), 22–24 October 1997, Nîmes, France.
Petrowski, A., 1997b. A new selection operator dedicated to speciation. In: Proceedings of the 7th international conference
on genetic algorithms (ICGA’97), 19–23 July 1997, Michigan State University, East Lansing, MI. San Francisco,
CA: Morgan Kaufmann, 144–151.
Petrowski, A. and Girod Genet, M.G., 1999. A classification tree for speciation. In: Proceedings of the 1999 congress on
evolutionary computation (CEC 99), 6–9 July 1999, Washington, D.C. Vol. 1. Piscataway, NJ: IEEE Press, 204–211.
Pictet, O., et al., 1996. Genetic algorithms with collective sharing for robust optimization in financial applications. Olsen
& Associates, Research Institute for Applied Economics, Zürich, 1–16.
Ronkkonen, J. and Lampinen, J., 2007. On determining multiple global optima by differential evolution. In: Proceedings
of the evolutionary and deterministic methods for design, optimization and control conference (Eurogen 2007), 11–13
June 2007, Jyväskylä, Finland. Barcelona: CIMNE, 146–151.
Shashikala, H., 1992. A hybrid descent method for global optimization. Thesis (PhD). Indian Institute of Science,
Bangalore, India.
Shashikala, H., Sancheti, N., and Keerthi, S., 1992. Path planning: an approach based on connecting all the minimizers
and maximizers of a potential function. In: Proceedings of the 1992 IEEE international conference on robotics and
automation, 12–14 May 1992, Nice, France. New York: IEEE Press, 2309–2314.
Storn, R. and Price, K., 1995. Differential evolution—a simple and efficient adaptive scheme for global optimization over
continuous spaces. Technical Report TR-95-012. International Computer Science Institute, Berkeley, CA.
Sudarsan, R. and Sathiya Keerthi, S., 1998. Numerical approaches for solution of differential equations on manifolds.
Applied Mathematics and Computation, 92 (2–3), 153–193.
Yin, X. and Germay, N., 1993. A fast genetic algorithm with sharing scheme using cluster analysis methods in multi-
modal function optimization. In: R.F. Albrecht, C.R. Reeves and N.C. Steele, eds. Proceedings of the international
conference on artificial neural networks and genetic algorithms, Innsbruck, Austria. Berlin: Springer-Verlag,
450–457.
Yiu, K., Liu, Y., and Teo, K., 2004. A hybrid descent method for global optimization. Journal of Global Optimization, 28
(2), 229–238.
Yu, X., 2005. Fitness sharing genetic algorithm with self-adaptive annealing peaks radii control method. In: L. Wang,
K. Chen and Y. Ong, eds. Proceedings of the first international conference on advances in natural computation
(ICNC’05). 27-29 August 2005, Changsha, Hunan, PR China. Lecture notes in computer science Vol. 3612. Berlin:
Springer-Verlag, 1064–1071.
Zaharie, D., 2004. Extensions of differential evolution algorithms for multimodal optimization. In: D. Petcu, et al.,
eds. Proceedings of the 6th international symposium on symbolic and numeric algorithms for scientific computing
(SYNASC ’04), 26–30 September 2004, Timisoara, Romania. Timisoara: Mirton Publishing House, Vol. 4., 523–534.

Appendix A. Function definitions

P1: Six-Hump Camel function

The Six-Hump Camel function,

\[
f(x) = 4x_1^2 - 2.1x_1^4 + \tfrac{1}{3}x_1^6 + x_1 x_2 - 4x_2^2 + 4x_2^4, \qquad X = \{x \mid -2 \le x_1 \le 2,\ -1 \le x_2 \le 1\}, \tag{A1}
\]

has six local minima and two local maxima in the region of interest and ten boundary maxima.
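Equation (A1) transcribes directly into code; Python is used here only for illustration, as the article does not prescribe an implementation language.

```python
def six_hump_camel(x1, x2):
    """Six-Hump Camel function of Equation (A1)."""
    return (4.0 * x1**2 - 2.1 * x1**4 + x1**6 / 3.0
            + x1 * x2 - 4.0 * x2**2 + 4.0 * x2**4)
```

The two global minima, f ≈ −1.0316 near (±0.0898, ∓0.7126), are among the six local minima mentioned above.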

P2: Himmelblau function

The Himmelblau function,


\[
f(x) = (x_1^2 + x_2 - 11)^2 + (x_1 + x_2^2 - 7)^2, \qquad X = \{x \mid -5 \le x_1 \le 5,\ -5 \le x_2 \le 5\}, \tag{A2}
\]
has four local minima in the region of interest and six maxima on the domain boundary.
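Equation (A2) in code (an illustrative Python sketch; each local minimum of this function attains the value 0, one of them exactly at (3, 2)):

```python
def himmelblau(x1, x2):
    """Himmelblau function of Equation (A2); all four local minima
    attain the function value 0."""
    return (x1**2 + x2 - 11.0)**2 + (x1 + x2**2 - 7.0)**2
```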

P3: Guilin Hills function

The Guilin Hills function,



\[
f(x) = 3 + \sum_{i=1}^{n} c_i\, \frac{x_i + 9}{x_i + 10}\, \sin\!\left(\frac{\pi}{1 - x_i + 1/(2k_i)}\right), \qquad X = \{x \mid 0 \le x_i \le 1\ (i = 1, \ldots, n)\}, \tag{A3}
\]

where ci > 0 are parameters and ki are positive integers, has k1 × k2 × · · · × kn local minima in the region
X. This function is called Guilin Hills because of its similarity to the mountains in Guilin, PR China. Only
one of the local minima of the function is the global minimum, which is very close to the point
\[
\left(1 - \frac{1}{8k_1^2 - 4k_1},\ 1 - \frac{1}{8k_2^2 - 4k_2},\ \ldots,\ 1 - \frac{1}{8k_n^2 - 4k_n}\right)^{\mathrm{T}}. \tag{A4}
\]

For the 2-D case, as in Hickernell and Yuan (1997), n = 2, c1 = 1, c2 = 1.5, k1 = 5 and k2 = 3 are used in the example here.
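With these parameters, Equation (A3) can be transcribed as follows (illustrative Python sketch; the default arguments encode the 2-D parameters just mentioned):

```python
import math

def guilin_hills(x, c=(1.0, 1.5), k=(5, 3)):
    """Guilin Hills function of Equation (A3), with the 2-D parameters
    of Hickernell and Yuan (1997) as defaults."""
    return 3.0 + sum(
        ci * (xi + 9.0) / (xi + 10.0)
        * math.sin(math.pi / (1.0 - xi + 1.0 / (2.0 * ki)))
        for xi, ci, ki in zip(x, c, k)
    )
```

With k = (5, 3) the function has 5 × 3 = 15 local minima, matching the count reported in the comparison of Section 4.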

P4: Rastrigin’s function

Rastrigin’s function,

\[
f(x) = \sum_{i=1}^{n} \left[\, 10 + x_i^2 - 10\cos(2\pi x_i) \,\right], \qquad X = \{x \mid -A_1 \le x_i \le A_2\ (i = 1, \ldots, n)\}, \tag{A5}
\]

has ([A1] + [A2] + 1)^n local minima in the region X. In the present case, for the 2-D problem the domain of X is considered
as X = {x | −0.5 ≤ xi ≤ 3.5}, and for the 3-D, 4-D, 5-D, 6-D and 8-D problems, X = {x | −0.5 ≤ xi ≤ 1.5}.

P5: Ackley’s path function

Ackley’s path function,

\[
f(x) = -a \exp\!\left(-b \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\,\right) - \exp\!\left(\frac{1}{n}\sum_{i=1}^{n} \cos(c\, x_i)\right) + a + \exp(1), \qquad X = \{x \mid -2 \le x_i \le 2\ (i = 1, \ldots, n)\}, \tag{A6}
\]
is widely used as a multi-modal test function, where a = 20, b = 0.2 and c = 2π . The function has 25 local minima and
33 local maxima in the region of interest.
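Equation (A6) with the stated constants (illustrative Python sketch):

```python
import math

def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """n-dimensional Ackley path function of Equation (A6)."""
    n = len(x)
    rms = math.sqrt(sum(xi * xi for xi in x) / n)
    avg_cos = sum(math.cos(c * xi) for xi in x) / n
    return -a * math.exp(-b * rms) - math.exp(avg_cos) + a + math.exp(1.0)
```

The global minimum is f = 0 at the origin; the surrounding ring of local minima gives the 25 minima counted above for the 2-D region of interest.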

P6: Constrained Himmelblau function

The constrained Himmelblau function,


\[
f(x) = (x_1^2 + x_2 - 11)^2 + (x_1 + x_2^2 - 7)^2, \tag{A7}
\]
such that
\[
x_1^2 + x_2^2 = 25,
\]
has four minima and four maxima.

P7: Constrained Guilin Hills function

The constrained Guilin Hills function,


\[
f(x) = 3 + \sum_{i=1}^{n} c_i\, \frac{x_i + 9}{x_i + 10}\, \sin\!\left(\frac{\pi}{1 - x_i + 1/(2k_i)}\right), \qquad X = \{x \mid 0 \le x_i \le 1\ (i = 1, 2)\}, \tag{A8}
\]

such that

\[
0.15\,\{1 + \sin(10 x_1)\} - x_2 \le 0 \qquad \text{and} \qquad x_2 - (x_1 - 0.5)^2 - 0.5 \le 0.
\]
