
Accepted Manuscript

Preference-inspired co-evolutionary algorithms using weight vectors

Rui Wang, Robin C. Purshouse, Peter J. Fleming

PII: S0377-2217(14)00426-3
DOI: http://dx.doi.org/10.1016/j.ejor.2014.05.019
Reference: EOR 12319

To appear in: European Journal of Operational Research

Received Date: 3 October 2013


Accepted Date: 8 May 2014

Please cite this article as: Wang, R., Purshouse, R.C., Fleming, P.J., Preference-inspired co-evolutionary algorithms using weight vectors, European Journal of Operational Research (2014), doi: http://dx.doi.org/10.1016/j.ejor.2014.05.019

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers
we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting proof before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Preference-inspired co-evolutionary algorithms using weight vectors

Rui Wang a,b, Robin C. Purshouse b, Peter J. Fleming b

a Department of System Engineering, College of Information System and Management, National University of Defense Technology, Changsha, China, 410073
b Department of Automatic Control & Systems Engineering, University of Sheffield, Mappin Street, Sheffield, S1 3JD, UK

Abstract
Decomposition based algorithms perform well when a suitable set of weights
is provided; however, determining a good set of weights a priori for real-
world problems is usually not straightforward due to a lack of knowledge
about the geometry of the problem. This study proposes a novel algorithm
called preference-inspired co-evolutionary algorithm using weights (PICEA-
w) in which weights are co-evolved with candidate solutions during the search
process. The co-evolution enables suitable weights to be constructed adap-
tively during the optimisation process, thus guiding candidate solutions to-
wards the Pareto optimal front effectively. The benefits of co-evolution are
demonstrated by comparing PICEA-w against other leading decomposition
based algorithms that use random, evenly distributed and adaptive weights
on a set of problems encompassing the range of problem geometries likely to
be seen in practice, including simultaneous optimisation of up to seven con-
flicting objectives. Experimental results show that PICEA-w outperforms
the comparison algorithms for most of the problems and is less sensitive to
the problem geometry.
Keywords: Evolutionary algorithms, multi-objective optimisation,
many-objective, co-evolution, weights

Email address: ruiwangnudt@gmail.com (Rui Wang)

Preprint submitted to European Journal of Operational Research May 15, 2014


1. Introduction
Multi-objective optimisation problems (MOPs) arise in many real-world
applications, where multiple conflicting objectives must be simultaneously
satisfied. Typically, the optimal solution set of MOPs is not a single solu-
tion but comprises a set of trade-off solutions. Multi-objective evolution-
ary algorithms (MOEAs) are well suited for solving MOPs since (i) their
population-based nature leads naturally to the generation of an approximate
trade-off surface in a single run; and (ii) they tend to be robust to underlying
cost function characteristics (Coello et al., 2007, pp. 5-7).
Over the last two decades, a variety of MOEA approaches have been
proposed. Most of these approaches are based on the concept of Pareto-
dominance and niching technique suggested by Goldberg (1989): for exam-
ple, MOGA (Fonseca & Fleming, 1993), NSGA-II (Deb et al., 2002) and
SPEA2 (Zitzler et al., 2002). It is accepted that Pareto-dominance based
MOEAs perform well on MOPs with 2 and 3 objectives. However, their
search capability often degrades significantly as the number of objectives
increases (Ishibuchi et al., 2008). This is because the proportion of non-
dominated objective vectors in the population grows large when MOPs have
more than 3 objectives, i.e., so-called many-objective problems (Purshouse &
Fleming, 2003). As a result, insufficient selection pressure can be generated
towards the Pareto front (Purshouse & Fleming, 2007).
In addition to the Pareto-dominance based approaches, there has been
considerable effort invested in other types of MOEAs. One of the most
promising alternatives1 is to use a decomposition approach (Hughes, 2003;
Zhang & Li, 2007), denoted as D-MOEA in this study. Note that in some
studies decomposition based methods are also called aggregation based or
scalarising function based methods (Hughes, 2003; Ishibuchi et al., 2010).
Decomposition based MOEAs transform a MOP into a set of single objective
problems by means of scalarising functions with different weights. Compared
with Pareto-dominance based approaches, decomposition based approaches
have a number of advantages such as high search ability for combinatorial op-
timisation, computational efficiency on fitness evaluation and high compati-
bility with local search (Zhang & Li, 2007; Ishibuchi et al., 2010). The seminal

1. There are also other types of MOEAs such as indicator based MOEAs (Zitzler & Künzli, 2004), e.g., SMS-EMOA (Emmerich et al., 2005) and HypE (Bader & Zitzler, 2011); and modified dominance based MOEAs, e.g., ε-EMOA (Deb et al., 2005).

decomposition based MOEA, MOEA/D, which popularised this approach,
has been used in many real-world applications2. Its modified version,
MOEA/D-DRA (Zhang et al., 2009) won the “Unconstrained multi-objective
evolutionary algorithm” competition at the 2009 Congress on Evolutionary
Computation.
Despite these advantages, more recent studies have identified that de-
composition based algorithms, e.g., MOEA/D, (i) face difficulties on prob-
lems having a complex Pareto front geometry3 (Gu et al., 2012); and (ii)
though performing well on bi-objective problems, are not particularly use-
ful for many-objective problems due to a loss of solution diversity (Wang
et al., 2013). These issues are likely to arise from an inappropriate specifi-
cation of search directions (which are determined by the weights) a priori,
itself arising from a general lack of problem knowledge. In other words the
choice of the scalarising functions’ underlying search directions is typically
problem-dependent and therefore is difficult if no information about the prob-
lem characteristics is known before the search proceeds. For example, evenly
distributed search directions are good for problems having a linear Pareto
optimal front, see Figure 1(a): however, they are not suitable for problems
with complex Pareto optimal fronts, e.g., disconnected ones, see Figure 1(b).
Our interest remains in a posteriori decision-making, that is, providing
decision-makers with both a proximal and diverse representation of the entire
Pareto optimal front. This study then proposes a novel strategy to adaptively
modify the search directions (i.e., weights) on-line for decomposition based
MOEAs so as to obtain a good approximation of the Pareto optimal front
for problems having different geometries.
This new strategy adopts the concept of preference-inspired co-evolution
(Purshouse et al., 2011; Wang et al., 2013), that is, candidate solutions are
co-evolved with weight vectors (used as preferences) during the search. The
usefulness of weights is maintained by evaluating them against the current
population of candidate solutions. It is hypothesised that, via co-evolution,
suitable weights can be constructed on the fly, thus making decomposition
based algorithms less sensitive to problem geometries and enabling them to
scale up well on many-objective problems.

2. Professor Qingfu Zhang maintains a website which records the related research and applications of MOEA/D: http://dces.essex.ac.uk/staff/zhang/webofmoead.htm
3. Pareto front geometry and problem geometry are used interchangeably in this paper.

Figure 1: Illustration of a good distribution of search directions for different Pareto fronts: • = weights; □ = solution images. (a) a linear Pareto front; (b) a disconnected Pareto front.

The rest of this paper is organised as follows. Section 2 introduces work
related to this study, comprising an introduction to decomposition based
MOEAs and their issues as well as a brief review of some representative
decomposition based MOEAs. Section 3 elaborates the proposed co-evolution
based weights adaptation strategy and the associated algorithm: preference-
inspired co-evolutionary algorithms using weights (PICEA-w). Section 4
describes the experiment setup. Section 5 presents the experiment results.
A further discussion of the algorithm PICEA-w is provided in Section 6.
Section 7 concludes.

2. Related work
Without loss of generality, a minimisation MOP is defined as follows:

    minimise    f_m(x),              m = 1, 2, ..., M
    subject to  g_j(x) ≤ 0,          j = 1, 2, ..., J          (1)
                h_k(x) = 0,          k = 1, 2, ..., K
                x_i^l ≤ x_i ≤ x_i^u, i = 1, 2, ..., n

A solution x is a vector of n decision variables: x = (x_1, x_2, ..., x_n),
x ∈ R^n. Each decision variable x_i is subject to a lower bound x_i^l and an
upper bound x_i^u. f_m represents the m-th objective function. M is the number
of objectives (generally, M > 2). J and K are the number of inequality and
equality constraints, respectively.

2.1. Basics of decomposition based MOEAs


Decomposition based algorithms handle a MOP by simultaneously solving
a set of single objective problems defined by means of scalarising functions
with different weights. The optimal solution of each single objective problem,
defined by a weighted scalarising function, corresponds to one Pareto opti-
mal solution of a MOP. The weight vector defines a search direction for the
scalarising function. Thus, one can then employ different weights to search
for a set of diversified Pareto optimal solutions.
Typically, in decomposition based algorithms weights can either be ini-
tialised as an even distribution before the search, or randomly generated, or
adaptively modified during the search. A variety of scalarising functions can
be employed in decomposition based algorithms. The weighted sum and the
weighted Chebyshev are two frequently-used scalarising functions, and can
be written as follows.

• The weighted sum scalarising function:

    g^{ws}(x|w) = \sum_{i=1,2,...,M} \lambda_i (f_i(x) − z_i^*),   \lambda_i = 1/w_i     (2)

• The weighted Chebyshev scalarising function:

    g^{wc}(x|w) = \max_{i=1,2,...,M} \lambda_i (f_i(x) − z_i^*),   \lambda_i = 1/w_i     (3)

Note that λ_i is often defined directly as w_i. However, here we define
λ_i as 1/w_i in order to simplify the related analysis later, i.e., to avoid an
inconsistency between search directions and weights. With this definition, the
search direction determined by a weight vector is identical to the direction
from the ideal point4 to the given weight vector (Giagkiozis et al., 2013a).
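As a minimal, hypothetical sketch (plain Python; the function names are ours, not from the paper), the scalarising functions of Equations (2) and (3), with λ_i = 1/w_i, can be written as:

```python
def weighted_sum(f, w, z_star):
    # Equation (2): g_ws(x|w) = sum_i (f_i(x) - z_i*) / w_i
    return sum((fi - zi) / wi for fi, wi, zi in zip(f, w, z_star))

def weighted_chebyshev(f, w, z_star):
    # Equation (3): g_wc(x|w) = max_i (f_i(x) - z_i*) / w_i
    return max((fi - zi) / wi for fi, wi, zi in zip(f, w, z_star))

# Both are minimised; e.g. for f = (0.5, 0.5), w = (0.5, 0.5), z* = (0, 0):
# weighted_sum -> 2.0, weighted_chebyshev -> 1.0
```

A solution minimising g^{wc} for a given w lies where the ray from the reference point through w meets the Pareto front, which is why each weight vector defines one search direction.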
For minimisation problems, both g ws (x|w) and g wc (x|w) should be min-
imised. In both Equations 2 and 3, w = (w1 , w2 , · · · , wM ) represents a

4. An ideal point is a vector composed of all the best (e.g., minimum, for minimisation problems) values of each objective.

weight vector, w_i ∈ [0, 1], \sum_{i=1}^{M} w_i = 1, and z^* = (z_1^*, z_2^*, ..., z_M^*) is a refer-
ence point. Typically, the reference point is updated once a better (smaller)
value of f_i is found during the algorithm execution, see Equation (4):

    z_i^* = min {f_i(x) | x ∈ Ω} − ε     (4)


where Ω denotes the set of all solutions examined during the algorithm execution,
and ε is a very small positive value. Essentially, if the ideal point is available,
the reference point should be set to the ideal point.
It is worth mentioning that the weighted sum scalarising function en-
counters difficulties with certain problem geometries: it cannot find Pareto
optimal solutions in concave regions of the Pareto optimal front unless some
additional technique (e.g., the ε-constraint method) is applied (Kim & De
Weck, 2005, 2006). The Chebyshev scalarising function does not have this
issue. However, its search ability is not as strong as that of the weighted sum
approach: for a specified weighted Chebyshev function, the proportion of
solutions that are considered better than a given reference point is (1/2)^{M−1}.
Thus, the search ability of the Chebyshev function degrades significantly as
the number of objectives increases. Readers are referred to Ishibuchi et al.
(2009, 2010) and Giagkiozis & Fleming (2012) for more details.
Additionally, decomposition based MOEAs combine different objectives
into one metric. These objectives might have various units of measurement.
Thus, it is important to rescale different objectives to dimension-free units
before aggregation. Moreover, normalisation is useful for obtaining evenly
distributed solutions when the objectives are disparately scaled. Typically,
the normalisation procedure transforms an objective value fi (in Equations
(2) or (3)) by
    \bar{f}_i = (f_i − z_i^{ide}) / (z_i^{nad} − z_i^{ide})     (5)

where z^{ide} = (z_1^{ide}, z_2^{ide}, ..., z_M^{ide}) is the ideal point and z^{nad} = (z_1^{nad}, z_2^{nad}, ..., z_M^{nad}) is the nadir point5. After normalisation, objective values lie within
[0, 1]. If the true z_i^{ide} and z_i^{nad} are not available (they are often difficult to
obtain, especially z^{nad} (Deb et al., 2010)), we can use the smallest and

5. A nadir point is a vector composed of all the worst (e.g., maximum, for minimisation problems) Pareto optimal objective values in a MOP.

largest f_i of all non-dominated solutions found so far to estimate z_i^{ide} and
z_i^{nad}, respectively.
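The normalisation of Equation (5), with the ideal and nadir points estimated from the current non-dominated set as described above, can be sketched as follows (plain Python; the helper name is ours):

```python
def normalise(front):
    # Estimate z_ide (per-objective minima) and z_nad (per-objective maxima)
    # from the non-dominated set, then rescale every objective to [0, 1].
    z_ide = [min(col) for col in zip(*front)]
    z_nad = [max(col) for col in zip(*front)]
    return [[(fi - zi) / (zn - zi) if zn > zi else 0.0
             for fi, zi, zn in zip(f, z_ide, z_nad)]
            for f in front]
```

In a full algorithm the estimates would be refined each generation as better solutions are found; the guard against z_nad = z_ide simply avoids division by zero on a degenerate objective.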
2.2. A selection of decomposition based MOEAs
This section reviews a selection of decomposition based MOEAs. These
algorithms are classified into two categories, i.e., using pre-defined weights
or using adaptive weights. Note that in this study we refer to pre-defined
weights as weights that are randomly generated during the search or ini-
tialised as an even distribution before the search.
2.2.1. Decomposition based MOEAs using pre-defined weights
Hajela and Lin’s genetic algorithm (denoted as HLGA) (Hajela et al.,
1993) and multi-objective genetic local search (MOGLS (Ishibuchi & Murata,
1998)) are two representative decomposition based MOEAs that use random
weights.
• In HLGA and I-MOGLS (the MOGLS of Ishibuchi & Murata (1998)), a
random weight vector is generated as follows: first, randomly generate
M numbers a_1, a_2, ..., a_M with a_i ≥ 0; second, normalise each a_i by
\sum_{i=1}^{M} a_i to obtain a valid component of a weight vector, i.e.,
w_i = a_i / \sum_{i=1}^{M} a_i. This method has a limitation: the generated
weights are dense in the centre while sparse at the edge. This is because
the method is equivalent to directly projecting randomly distributed points
in a hypercube onto a hyperplane, see Figure 2.
• Jaszkiewicz (2002) proposed another variant of MOGLS (denoted as
J-MOGLS) in which the components of a weight vector are calculated by
Equation (6), where the function rand() returns a random value within the
interval (0, 1) according to a uniform probability distribution:

    w_1 = 1 − rand()^{1/(M−1)}
    ...
    w_i = (1 − \sum_{j=1}^{i−1} w_j) (1 − rand()^{1/(M−i)})     (6)
    ...
    w_M = 1 − \sum_{j=1}^{M−1} w_j
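The two random-weight schemes above can be contrasted in a short sketch (plain Python; the function names are ours). The first is the HLGA/I-MOGLS normalisation, which biases weights toward the centre of the simplex; the second follows Equation (6) and samples the weight simplex uniformly:

```python
import random

def naive_weight(M):
    # HLGA / I-MOGLS: normalise M uniform numbers.  The resulting weights
    # are dense in the centre of the simplex and sparse at the edges.
    a = [random.random() for _ in range(M)]
    s = sum(a)
    return [ai / s for ai in a]

def jaszkiewicz_weight(M):
    # J-MOGLS, Equation (6): uniform sampling of the weight simplex via
    # (M - i)-th roots of uniform variates.
    w = []
    for i in range(1, M):
        rest = 1.0 - sum(w)
        w.append(rest * (1.0 - random.random() ** (1.0 / (M - i))))
    w.append(1.0 - sum(w))
    return w
```

Both return a vector with non-negative components summing to one; plotting many samples reproduces the centre-heavy pattern of Figure 2 for the first scheme and a uniform spread for the second.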

Figure 2: Illustration of the distribution of random weights: (a) random points in a cube; (b) random weights on a plane.

MOEA/D and MSOPS (Hughes, 2003) are two representative decompo-
sition based MOEAs that employ evenly distributed weights.

• In MOEA/D weights are evenly distributed on the first quadrant of the
hyperplane f_1 + f_2 + ··· + f_M = 1. Specifically, the weights comprise
all normalised weight vectors with components chosen from the set
{0, 1/H, ..., (H − 1)/H, 1}, where H is a positive integer (this is known
as the simplex-lattice design method (Tan et al., 2012)). The same
method was also used in the cellular multi-objective genetic algorithm
in an earlier study (Murata et al., 2001). For example, for 2-objective
problems, if H is specified as 100, then we can generate C_{101}^{1} = 101
weight vectors: (0, 1), (0.01, 0.99), ..., (1, 0). It is worth mentioning
that the number of weights generated by this method increases
significantly as M increases; the number of weights is C_{H+M−1}^{M−1}.
Given H = 30, the number of weights is 5456 for M = 4, while the
number increases to 46376 when M = 5.

• In MSOPS the author proposed an approach which generates evenly
distributed points on the hypersphere f_1^2 + f_2^2 + ··· + f_M^2 = 1 via
minimising a metric V defined by Equation (7). These points can be
transformed into valid weights using w_i = x_i / \sum_{i=1}^{M} x_i, where
x_i is the i-th component of a point.

    V = \max_{i=1}^{N} \max_{j=1, j≠i}^{N} (x_i · x_j)     (7)

The metric V measures the worst-case angle between two nearest neighbours.
The dot product x_i · x_j gives the cosine of the angle between the vectors
x_i and x_j. For each point, the inner maximisation finds its nearest
neighbour in terms of angle (the largest cosine corresponds to the smallest
angle); the outer maximisation then identifies the closest such pair over the
whole set. The optimal set of weights is produced when this outer maximum
is minimised, i.e., when the minimum pairwise angle is maximised.
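As an illustrative sketch (plain Python; the function name is ours), the metric of Equation (7) over a set of unit vectors is simply the largest pairwise dot product:

```python
def sharpness_V(points):
    # Equation (7): V = max over all ordered pairs (i, j), i != j, of x_i . x_j.
    # For unit vectors the dot product is the cosine of their angle, so
    # minimising V maximises the smallest pairwise angle.
    return max(sum(a * b for a, b in zip(xi, xj))
               for i, xi in enumerate(points)
               for j, xj in enumerate(points) if i != j)
```

A well-spread set such as the coordinate axes of R^2 gives V = 0 (a 90° worst case); adding a vector close to an existing one drives V toward 1, which an optimiser minimising V would penalise.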

The main issue with pre-defined weights is that the obtained solutions
might not be well spread. This is because problem geometries are often
unknown beforehand, and therefore determining a suitable set of weights
that yields evenly distributed solutions is difficult (this will be explained
later). To overcome this limitation, researchers have attempted to use adap-
tive weights in decomposition based algorithms.
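For reference, the simplex-lattice design used by MOEA/D above can be reproduced in a few lines (plain Python; the recursive helper is ours), and the resulting counts match C_{H+M−1}^{M−1}:

```python
from math import comb

def simplex_lattice(M, H):
    # All weight vectors whose components come from {0, 1/H, ..., 1}
    # and sum to one.
    def parts(m, h):
        if m == 1:
            return [[h]]
        return [[k] + rest for k in range(h + 1) for rest in parts(m - 1, h - k)]
    return [[k / H for k in p] for p in parts(M, H)]

# For M = 2, H = 100 this yields 101 vectors: (0, 1), (0.01, 0.99), ..., (1, 0);
# for M = 4, H = 30 it yields comb(33, 3) = 5456 vectors.
```

The rapid growth of the count with M is exactly the scalability problem noted above: the lattice resolution H cannot be kept fine without an explosion in the number of weights (and, in MOEA/D, in the population size).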

2.2.2. Decomposition based MOEAs using adaptive weights


A variety of methods for weight adaptation have been proposed. MSOPS-
II (Hughes, 2007) is one such method. In MSOPS-II weight vectors are gener-
ated by bootstrapping them from the on-line archive of locally non-dominated
solutions. Specifically, the weight vectors are updated as follows: the current
weight set is augmented, in turn, with a valid weight vector created from each
normalised non-dominated solution. Each time a weight vector is added, the
angles between nearest neighbours are computed and the most crowded weight
vector is removed.
Jiang et al. (2011) improved MOEA/D by using Pareto adaptive weights,
paλ. The paλ approach automatically adjusts weights according to the geom-
etry characteristics of the Pareto front. Specifically, it assumes that the
Pareto optimal front is symmetric, of the form f_1^p + f_2^p + ··· + f_M^p = 1.
Based on the non-dominated solutions in the archive, the parameter p is esti-
mated. Having determined p, evenly distributed points on the approximated
symmetric shape are generated; these are then converted to valid weights.
The use of paλ can significantly improve the performance of MOEA/D when
the problem geometry is close to the form f_1^p + f_2^p + ··· + f_M^p = 1. However,
this method faces difficulty on problems with asymmetric and disconnected
Pareto fronts.

Gu et al. (2012) proposed to incorporate a dynamic weight design method
into MOEA/D (denoted as DMOEA/D) in order to enable MOEA/D to per-
form well on problems having different geometries. Specifically, every ten
generations the weights are re-generated according to the shape of the current
non-dominated front. A piecewise linear interpolation method is applied to
fit a curve (in the 2-objective case) to the current non-dominated solutions.
For each objective f_i, N_i interpolation points, whose projections on the i-th
objective are uniform, are generated; N_i is determined by
10^{M−1} N \prod_{j=1,j≠i}^{M} D_j / \sum_{i=1}^{M} \prod_{j=1,j≠i}^{M} D_j,
where N is the population size and D_j is the maximum value of the projection
of the current non-dominated solutions on the j-th objective. At the same
time, non-dominated solutions whose distance to an interpolation point is
smaller than D_j/10 are removed. All the \sum N_i interpolation points serve as
an approximation to the Pareto front. These interpolation points are then
ordered by one objective (e.g., f_1); points that are adjacent in f_1 are
clustered into one group. The maximum number of points in a group is
⌊\sum N_i / N′⌋ + 1, where N′ is the number of currently non-dominated solu-
tions. A weight vector is then created by converting the point defined by
the mean of the members of each group. Experimental results show that this
adaptive method works well on 2- and 3-objective problems. The approach
is reported to be applicable to many-objective problems; however, no
experimental results are shown.
Qi et al. (2013) also proposed an algorithm named MOEA/D-AWA where
weights are periodically adjusted. Specifically, MOEA/D-AWA applies a two-
stage strategy to deal with the generation of the weight vectors. First, a set of
pre-determined weights are used until the population is considered converged
to some extent. Then a portion of the weight vectors are adjusted according
to the obtained solutions based on a geometric analysis. Weights toward
a sparse region are newly created, and weights toward a dense region are
removed. The density of solutions is measured by the k-nearest neighbour
approach (Deb et al., 2002). MOEA/D-AWA is demonstrated to perform well
on many-objective problems having a low-dimensional Pareto optimal front.
It also outperforms DMOEA/D for most of the selected bi- and tri-objective
problems.
All the above MOEAs attempt to maintain evenly distributed solutions
on the fly. There are also some methods that aim to first obtain as many
diversified solutions as possible and then apply some ad-hoc methods to the
obtained solutions so as to get evenly distributed solutions. EMOSA belongs

to this type (Li & Landa-Silva, 2011). It hybridises MOEA/D with simulated
annealing. The simulated annealing based local search is applied to improve
the current solution of each single objective problem. In EMOSA weights
are adaptively modified as follows: for each member Fs_i in the current
population, first find the closest neighbour (say Fs_j) of Fs_i and its as-
sociated weight vector w_j. Second, identify the weights in the pre-defined
weight set (in Li & Landa-Silva (2011) this weight set comprises evenly
distributed weights generated by the simplex-lattice design method) whose
Euclidean distance to w_j is larger than the distance between w_i and w_j.
Third, amongst the identified weights, select all those that are closer to w_i
than to any of the neighbours of w_i; the definition of the neighbourhood (T)
is the same as in MOEA/D. If multiple weights remain, pick
one randomly. Evenly distributed solutions are obtained by applying the
ε-dominance strategy to the offline archive solutions. Experimental results
show that EMOSA outperforms three multi-objective memetic algorithms
(I-MOGLS, the improved I-MOGLS and MOEA/D) on 2- and 3-objective
knapsack problems and travelling salesman problems. However, its perfor-
mance on many-objective problems is not discussed.
Overall, the use of adaptive weights is potentially helpful in handling the
issue of problem geometry for decomposition based algorithms. However,
none of the existing approaches has clearly shown its benefits on MaOPs;
in addition, it is suspected that adapting weights during the search may affect
the convergence performance of MOEAs (as will be discussed next).

2.3. Two issues of decomposition based algorithms


This section discusses two difficulties encountered by decomposition based
algorithms: problem geometry and many-objective optimisation.

2.3.1. Problem geometry


In a decomposition based algorithm, each Pareto optimal solution corre-
sponds to an optimal solution of a single objective problem that is defined by
a weighted scalarising function. That is to say, once the weighted scalarising
function is determined, the distribution of the obtained Pareto optimal so-
lutions is determined. Furthermore, once the scalarising function is chosen,
the distribution of solutions would only be affected by the distribution of the
employed weights.

Gu et al. (2012) and Giagkiozis et al. (2013a) discussed what the opti-
mal distribution of weights should be for a specified Pareto front when using
different scalarising functions. For example, when the Chebyshev scalar-
ising function of Equation (3) is used, the optimal weight to search for a
solution x is (f_1(x)/\sum_{i=1}^{M} f_i(x), f_2(x)/\sum_{i=1}^{M} f_i(x), ..., f_M(x)/\sum_{i=1}^{M} f_i(x)).
Knowing this, it is easy to see that the optimal distribution of weights changes
with the problem geometry. Figure 3 illustrates the optimal distribution of
weights for problems having convex and concave geometries. Taking Figure
3(b) as an example, the optimal distribution of weights for a concave Pareto
front is dense in the centre while sparse at the edge.
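Under the Chebyshev scalarisation, the optimal weight for a solution x described above is just its normalised objective vector; a one-line hypothetical sketch (the function name is ours):

```python
def optimal_chebyshev_weight(f):
    # w_i = f_i(x) / sum_j f_j(x): the weight whose search direction passes
    # through the image of x, so that x minimises g_wc(.|w) on the front.
    s = sum(f)
    return [fi / s for fi in f]
```

Applying this map to an evenly spread sample of a concave front yields weights that are dense in the centre of the simplex, reproducing the pattern illustrated in Figure 3.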

Figure 3: The optimal distributions of weights for convex and concave Pareto fronts in the 2-objective case using the Chebyshev scalarising function: ◦ = weights; □ = solution images. (a) Convex Pareto front; (b) Concave Pareto front.

Due to a lack of knowledge of the underlying problem geometry, it is usu-


ally not straightforward to determine a proper distribution of weights a priori
for decomposition based algorithms. Although the use of adaptive weights
is potentially helpful in handling this issue, it is suspected that adaptive
weights might have a deleterious effect on an algorithm’s convergence.
2.3.2. Many-objective optimisation
Decomposition based algorithms using evenly distributed weights, such
as MOEA/D and MSOPS, face difficulties on many-objective problems. This
is because the number of Pareto optimal solutions that are required to de-
scribe the entire Pareto optimal front of a MaOP is very large (Ishibuchi

et al., 2008). In a decomposition based algorithm each weight vector typ-
ically corresponds to one Pareto optimal solution. The evenly distributed
weights are often initialised before the search and remain unchanged during
the search. It is therefore difficult to employ a limited number of weights to
obtain a full and representative approximation of the entire Pareto optimal
front. To illustrate this issue, we apply MOEA/D with 20 evenly distributed
weights to solve the 2-objective DTLZ2 problem. Figure 4(a) and Figure
4(b) show the obtained non-dominated solutions in the last generation and
in the archive, respectively, after running MOEA/D for 500 generations. It
is obvious that the obtained solutions are not sufficient to approximate the
entire Pareto optimal front. It should be noted that due to the stochastic
nature of MOEAs, neighbouring solutions of s_w (where s_w is the optimal
solution of the single objective problem defined by the weighted scalarising
function g(x|w)) are likely to be obtained during the search. However, it is
less likely to find solutions that are distant from s_w.
Figure 4: An approximation of the Pareto front of 2-objective DTLZ2 obtained by MOEA/D using 20 weights: ◦ = weights; □ = solution images. (a) Solutions in the last generation; (b) Solutions in the offline archive.

A natural way to address this limitation, i.e., the lack of solution diversity,
is to employ a large number of weights. However, it is argued that compared
by employing a large number of weights. However, it is argued that compared
with the number of solutions required to describe the entire Pareto optimal
front, the number of employed weights is always relatively small. Besides,
for some decomposition based MOEAs, e.g., MOEA/D, the population size
is required to be equal to the number of weights. It is not easy to strike an

effective balance between the population size and the number of generations
under a fixed computational budget – the larger the population size, the
more the beneficial dynamics of evolution are curtailed.
Another alternative is to use non-fixed weights. Typically, non-fixed
weights could be either randomly generated or adaptively modified during
the search. The use of non-fixed weights enables MOEAs to have more op-
portunities to explore different regions, thereby obtaining a set of diversified
solutions. However, this might slow down the convergence speed of an algo-
rithm. It is hypothesised that when using fixed weights, solutions are guided
towards the Pareto optimal front directly along the search directions con-
structed by the weights, see Figure 5(a); when using non-fixed weights, the
constructed search directions keep changing. This suggests that solutions are
guided towards the Pareto optimal front in a polyline trajectory as shown
in Figure 5(b), that is, the convergence speed is degraded (Giagkiozis et al.,
2013b). Certainly, it should be admitted that in some cases, e.g., multi-
modal problems, the use of random/adaptive weights is helpful to maintain
diversified solutions and to prevent the algorithm from being trapped in
local optima, resulting in better convergence.

Figure 5: Illustration of the search behaviour of MOEAs using fixed weights and non-fixed weights: (a) fixed weights; (b) non-fixed weights.

Overall, decomposition based algorithms using pre-defined weights suffer
from the issue of problem geometry. Although this issue can be handled by
employing adaptive weights, it is suspected that adaptive weights degrade the

algorithm’s convergence speed. Thus, it is important to develop an effective
weights adaptation strategy to address this issue.

3. Preference-inspired co-evolutionary algorithm using weight vectors: PICEA-w
This section proposes a novel algorithm called preference-inspired co-
evolutionary algorithm using weight vectors (PICEA-w) in which
weights are adaptively modified by co-evolving with candidate solutions during
the search process. The co-evolution enables appropriate sets of weights to be
constructed adaptively such that candidate solutions are guided towards the
Pareto optimal front effectively. It is expected that PICEA-w would be less
sensitive to the problem geometry and also perform well on many-objective
problems.
Similar to a general decomposition based MOEA, PICEA-w decomposes
a MOP into a set of single objective problems that are defined by different
weighted scalarising functions. The main feature of PICEA-w is that the
scalarising functions’ underlying weights are adaptively modified in a co-
evolutionary manner during the search. Specifically, in PICEA-w candidate
solutions are ranked by each of the weighted scalarising functions, creating a
ranking matrix. The fitness of candidate solutions is then calculated based
on the ranking matrix. Weights are co-evolved with the candidate solutions
towards an optimal distribution, and they are also responsible for striking a
balance between exploration and exploitation. For each selected solution, an
effective weight is one which ranks this solution as the best (maintaining
convergence, i.e., exploitation) and which is distant from this solution
(improving diversity, i.e., exploration).
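As a hypothetical sketch of the ranking step just described (plain Python, using the weighted Chebyshev function; the names and tie-handling are ours, not the paper's exact specification), each weight vector ranks all candidate solutions, producing one row of the ranking matrix:

```python
def chebyshev(f, w, z_star):
    # Weighted Chebyshev scalarisation with lambda_i = 1 / w_i (Equation (3)).
    return max((fi - zi) / wi for fi, wi, zi in zip(f, w, z_star))

def ranking_matrix(FS, W, z_star):
    # One row per weight vector: the rank of each candidate solution under
    # that weight's scalarising function (1 = best, i.e. smallest value).
    R = []
    for w in W:
        scores = [chebyshev(f, w, z_star) for f in FS]
        order = sorted(range(len(FS)), key=scores.__getitem__)
        ranks = [0] * len(FS)
        for r, i in enumerate(order):
            ranks[i] = r + 1
        R.append(ranks)
    return R
```

The fitness of a candidate solution can then be aggregated over its column, and an "effective" weight for a selected solution is one whose row ranks that solution first.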
We implement PICEA-w within a (μ+λ) elitist framework shown as Fig-
ure 6. Populations of candidate solutions and weight vectors, S and W , of
fixed size, N and Nw , are evolved for a fixed number of generations. At each
generation t, parents S(t) are subjected to genetic variation operators to
produce N offspring, Sc(t). Simultaneously, Nw new weight vectors, W c(t),
are randomly generated. S(t) and Sc(t), W (t) and W c(t), are then pooled
respectively and the combined populations are sorted according to fitness.
Truncation selection is applied to select the best N solutions as the new can-
didate solution population, S(t + 1), and the best Nw weight vectors as the new
weight population, W(t + 1). Additionally, an offline archive is employed to store
all the non-dominated solutions found during the search. Evenly distributed

15
solutions are obtained by using the clustering technique described in Zitzler
et al. (2002) after the optimization process has conducted.
Figure 6: A (μ+λ) elitist framework of PICEA-w.

The pseudo-code of PICEA-w is presented in Algorithm 1. In the following we explain the main steps of PICEA-w.

• Line 1 initialises the offline archive ArchiveF as the null set ∅.


• In lines 2 and 3, N candidate solutions S are initialised and their objective values F S are calculated. The offline archive ArchiveF is updated by function updateArchive in line 4.
• Line 5 applies function weightGenerator to generate Nw random weights.
• In line 7 function geneticOperation is applied to generate offspring
candidate solutions Sc. Their objective values F Sc are calculated in
line 8. S and Sc, F S and F Sc are pooled together, respectively in
line 9.
• Line 10 generates another set of weights W c. In line 11, W and W c
are pooled together.
• Line 12 sets the parameter θ, which is used in function coEvolve to implement a local selection operation.
• Line 13 co-evolves the joint candidate solutions JointS and the joint weights JointW, so as to obtain the new parent populations S and W.

Algorithm 1: Preference-inspired co-evolutionary algorithm using
weights (PICEA-w)
Input: Initial candidate solutions, S of size N, initial weight vectors,
W of size Nw , the maximum number of generations, maxGen,
the number of objectives, M, a specified number (of evenly
distributed solutions), ASize
Output: S, W , offline archive, ArchiveF , evenly distributed solution
set, BestF of size ASize
1 ArchiveF ← ∅;
2 S ← initialiseS(N);
3 F S ← objectiveFunction(S);
4 ArchiveF ← updateArchive(ArchiveF, F S);
5 W ← weightGenerator(Nw );
6 for t ← 1 to maxGen do
7 Sc ← geneticOperation(S,F S);
8 F Sc ← objectiveFunction(Sc);
9 (JointS, JointF ) ← multisetUnion(S, Sc,F S, F Sc);
10 W c ← weightGenerator(Nw );
11 JointW ← multisetUnion(W, W c);
12 θ ← thetaConfiguration(t, π/2);
13 (S, F S, W ) ← coEvolve(JointF, JointS, JointW, θ);
14 ArchiveF ← updateArchive(ArchiveF, F S);
15 end
16 BestF ← pruningArchive(ArchiveF, ASize);

• Line 14 updates the offline archive with newly obtained solutions F S.
• Line 16 selects ASize evenly distributed solutions BestF from ArchiveF using the function pruningArchive.
The core part of PICEA-w lies in the function coEvolve, which will be
elaborated next. Prior to introducing this function, we describe the following
five functions: weightGenerator, geneticOperation, thetaConfiguration,
updateArchive and pruningArchive.
(i) Function weightGenerator forms a new weight set W c with Nw weight
vectors that are generated according to Equation (6) (Jaszkiewicz, 2002).
Alternatively, W c can be formed by randomly sampling Nw weights
from an evenly distributed candidate weight set Ψ. Ψ can be created
by, for example, the simplex-lattice design method (as described in p.8).
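As a sketch of this step (Equation (6) from Jaszkiewicz (2002) is not reproduced here), random weights can be drawn uniformly from the unit simplex; the helper name `weight_generator` is ours, not from the paper's code.

```python
import random

def weight_generator(n_w, m, rng=random.Random(1)):
    """Draw n_w weight vectors uniformly from the (m-1)-simplex.

    Sorting m-1 uniform variates in [0, 1] and taking successive
    differences yields non-negative weights that sum to 1.
    """
    weights = []
    for _ in range(n_w):
        cuts = sorted(rng.random() for _ in range(m - 1))
        points = [0.0] + cuts + [1.0]
        weights.append([points[i + 1] - points[i] for i in range(m)])
    return weights
```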
(ii) Function geneticOperation applies genetic operators to produce off-
spring Sc. Many genetic operators are available, for example, single
point crossover, uniform crossover, simulated binary crossover, sim-
plex crossover, one bit-flip mutation, polynomial mutation. These ge-
netic operators have their own advantages and disadvantages. In this
study the SBX and PM operators are chosen. It should be noted that
for different problems different genetic operators may lead to different
algorithm performance. Selecting suitable genetic operators is often
algorithm- and problem-dependent (Srinivas & Patnaik, 1994).
(iii) Function thetaConfiguration adjusts parameter θ by Equation (8),
that is, θ increases linearly from a small value to π/2 radians:

      θ = (π/2) × (t / maxGen);     (8)

The use of θ implements a local selection at the early stage of the
evolution and a global selection at the late stage. A local selection
means that a candidate solution only competes with its neighbours; a
global selection means that a candidate solution competes with all the
other candidate solutions (Wang et al., 2014). The benefits of this
strategy will be demonstrated later (in Section 6).
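Equation (8) amounts to a one-line schedule; a minimal sketch:

```python
import math

def theta_configuration(t, max_gen):
    """Equation (8): theta grows linearly from 0 to pi/2 over the run,
    moving the selection from local (small neighbourhoods) to global."""
    return (math.pi / 2) * (t / max_gen)
```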
(iv) Function updateArchive updates the offline archive ArchiveF by F S.
For each solution (e.g. F si ) in F S, if F si is dominated by a solution
in the archive, then F si is rejected. Otherwise it is accepted as a
new archive member. Simultaneously, solutions in the archive that are
dominated by F si are removed.
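The archive update described above can be sketched as follows (minimisation assumed; the function names mirror the pseudo-code, but the implementation is ours):

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive_f, fs):
    """Insert each new objective vector unless it is dominated; remove
    archive members that the newcomer dominates (function updateArchive)."""
    for f in fs:
        if any(dominates(g, f) for g in archive_f):
            continue  # f is dominated by an archive member: reject
        archive_f = [g for g in archive_f if not dominates(f, g)]
        archive_f.append(f)
    return archive_f
```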

(v) Function pruningArchive employs the clustering method in SPEA2
(Zitzler et al., 2002) to obtain a specified number of evenly distributed
solutions. It iteratively removes the most crowded solution from ArchiveF
until only ASize solutions are left.
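A simplified sketch of this pruning step (the actual SPEA2 truncation compares full sorted distance lists lexicographically; here only the nearest-neighbour distance is used):

```python
import math

def prune_archive(archive_f, a_size):
    """Repeatedly drop the most crowded point (smallest distance to its
    nearest neighbour) until only a_size points remain."""
    pts = [list(p) for p in archive_f]
    while len(pts) > a_size:
        def nearest(i):
            return min(math.dist(pts[i], pts[j])
                       for j in range(len(pts)) if j != i)
        pts.pop(min(range(len(pts)), key=nearest))
    return pts
```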

Function coEvolve evaluates the performance of candidate solutions and
weight vectors, and then selects the new parent populations S and W from the
joint populations JointS and JointW, respectively. A candidate solution
gains higher fitness by performing well on more weighted scalarising functions.
A weight vector only gains fitness by being rewarded by the candidate
solutions that it ranks as the best. The pseudo-code of function coEvolve
is as follows.

Function coEvolve(JointF, JointS, JointW, θ)

Input: The joint populations JointF, JointS, JointW; the parameter θ
Output: New parents, S, F S, W
1 R ← rankingSW(JointF, JointW, θ);
2 (F S, S, ix) ← selectS(JointF, JointS, R);
3 W ← selectW(JointW, F S, R, ix);

(i) Line 1 applies function rankingSW to rank JointF by each weighted
scalarising function. Specifically, for each w ∈ JointW, we first identify
its neighbouring candidate solutions. The neighbourhood is measured
by the angle between a candidate solution, s, and a weight vector, w.
If the angle is smaller than the θ value, then s and w are defined as
neighbours. See Figure 7: s1 is a neighbour of w as α11 < θ while s2
is not (α12 > θ). We then rank these neighbouring candidate solutions
based on their performance as measured by the corresponding weighted
Chebyshev scalarising function. This produces a [2N × 2Nw] ranking
matrix, denoted as R. Rij represents the rank of the candidate solution
si on the weighted Chebyshev function using wj, i.e., g^wc(si|wj). The
best solution is ranked 1. The rank for solutions that are not neighbours
of w is set as inf, i.e., these solutions are ignored. The Chebyshev
scalarising function is used in PICEA-w due to its guarantee of producing
a Pareto optimal solution for each weight vector, and also its
robustness to problem geometries.
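The neighbourhood test and ranking can be sketched as below. We assume the reciprocal form of the weighted Chebyshev function, g^wc(s|w) = max_i |f_i(s) − z*_i| / w_i, and measure the angle between (F(s) − z*) and the weight ray emanating from z* (whose direction vector is w); both assumptions reproduce the angles and ranks reported in the worked example later in this section.

```python
import math

def chebyshev(f, w, z):
    # Reciprocal-weight Chebyshev value (smaller is better); assumes w_i > 0.
    return max(abs(fi - zi) / wi for fi, wi, zi in zip(f, w, z))

def angle(f, w, z):
    # Angle between (f - z) and the weight ray from z (direction w).
    u = [fi - zi for fi, zi in zip(f, z)]
    dot = sum(a * b for a, b in zip(u, w))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in w))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def ranking_sw(joint_f, joint_w, theta, z):
    """Build the [2N x 2Nw] ranking matrix R: solutions within angle theta
    of a weight are ranked by their Chebyshev value; all others get inf."""
    inf = float('inf')
    R = [[inf] * len(joint_w) for _ in joint_f]
    for j, w in enumerate(joint_w):
        nbrs = [i for i, f in enumerate(joint_f) if angle(f, w, z) < theta]
        nbrs.sort(key=lambda i: chebyshev(joint_f[i], w, z))
        for rank, i in enumerate(nbrs, start=1):
            R[i][j] = rank
    return R
```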

Figure 7: Illustration of the neighbourhood of candidate solutions and weight vectors.

(ii) Line 2 applies function selectS to select the best N candidate solutions
as new parents S. Note that ix returns the index of S in JointS.
Specifically, the following steps are executed.
• Sort each row of the ranking matrix R in an ascending order, pro-
ducing another matrix Rsort . Rsort holds in the first column the
smallest (best performance) ranking result achieved by each candi-
date solution across all the weighted scalarising functions. Simul-
taneously, the second column holds the second smallest ranking
result for each candidate solution and so on.
• All candidate solutions are lexicographically ordered based on Rsort .
This returns a rank (denoted as r) for each candidate solution.
The fitness of a candidate solution is then determined by 2N − r.
Truncation selection is applied to select the best N solutions as
new parents, S.
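The lexicographic fitness assignment above can be sketched as follows (a minimal version; ties receive equal r, as in the worked example later in this section):

```python
def select_s(R, n):
    """Sort each row of R ascending (Rsort), order solutions
    lexicographically, assign fitness 2N - r and keep the best n indices."""
    rows = [sorted(row) for row in R]                      # Rsort
    # r = 1 + number of solutions with a strictly better sorted row
    r = [1 + sum(rows[j] < rows[i] for j in range(len(rows)))
         for i in range(len(rows))]
    fitness = [len(rows) - ri for ri in r]                 # 2N - r
    order = sorted(range(len(rows)), key=lambda i: rows[i])
    return order[:n], fitness
```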
(iii) Line 3 applies function selectW to select the most suitable weight vector
from JointW for each of the survived candidate solutions, i.e., members
in the new parent S. The basic idea of the selection is to balance
exploration and exploitation. To do so, two criteria are set for the
weights selection. First, for each candidate solution the selected weight
must be the one that ranks the candidate solution as the best. Secondly,
if more than one weight is found by the first criterion, we choose the
one that is the furthest from the candidate solution. The first criterion
is helpful in driving the search quickly towards the Pareto front; the
second criterion is helpful in guiding the search to explore new areas.
second criterion is helpful in guiding the search to explore new areas.
More specifically, for a candidate solution si , first we identify all the
weights that rank si the best. If there is only one weight, then this
weight is selected for si. This guarantees that the surviving candidate
solution si will not lose its advantage in the next generation, as there
is still one weight that ranks this candidate solution the best unless
a better solution along this direction is generated, in which case the
convergence performance is improved. If more than one weight is found, we
select the weight that has the largest angle between si and itself. In
this way, the algorithm is guided to investigate some unexplored areas,
i.e., improving the diversity. It is worth mentioning that the second
criterion is not considered unless multiple weights are identified by the
first criterion. This guarantees that while exploring for a better diver-
sity, the exploitation for a better convergence is also maintained. The
pseudo-code of the function selectW is described as follows.

Function selectW(JointW, F S, R, ix)

Input: the objective values F S of the survived candidate solutions,
the joint weight vectors, JointW, the ranking matrix, R, the
index list of F S in JointF, ix
Output: weight set, W
1 W ← ∅;
2 construct the ranking matrix R′ by extracting all the ix rows of R;
3 rank weights for each si according to R′ and obtain another matrix R′′;
4 foreach F si ∈ F S do
5     J ← {j | R′′ij = 1};
6     if |J| > 1 then
7         k ← arg maxj∈J angle(wj, F si);
8     else
9         k ← j, where J = {j};
10    end
11    W ← W ∪ {wk};
12    set the k-th column of R′′ as inf;
13 end

• Line 1 initialises the new weight set as ∅. Line 2 forms a new
ranking matrix R′ by selecting the ix-th rows of R, where ix is the
index list of F S in JointF.
• Line 3 ranks the contribution of weights to each of the candidate
solutions according to R′. The results are stored in matrix R′′.
R′′ij describes the relative order of the contribution that a candidate
solution si receives from a weight vector wj. The lower the rank,
the higher the contribution.
• Line 5 finds all the weight vectors that give the highest contribution
to solution si, i.e., weights that have R′′ij = 1. If there is only
one such weight then we choose this weight for si. If multiple weights
are found then we choose the weight wk that has the largest angle
with F si (lines 6 to 11). To avoid multiple selections of a weight,
once the weight wk is selected the k-th column of R′′ is set as inf.
• Additionally, function angle computes the angle, α, between a
candidate solution and a weight vector, i.e., between the vector
(F si − z∗) and the weight ray emanating from z∗ (whose direction
vector is w). The dot product of two normalised vectors returns the
cosine of the angle between them; thus
α = arccos((F si − z∗) · w / (‖F si − z∗‖ ‖w‖)).
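A sketch of the weight selection step (it processes solutions in index order rather than the random order used in the paper, assumes every surviving solution has at least one neighbouring weight, and derives J directly from the minimal entries of each remaining row of R′, which is equivalent to R′′ij = 1):

```python
def select_w(joint_w, r_rows, angles):
    """For each surviving solution, pick an unused weight that ranks it
    best, breaking ties by the largest solution-weight angle."""
    chosen, taken = [], set()
    for i, row in enumerate(r_rows):
        free = [j for j in range(len(row)) if j not in taken]
        best = min(row[j] for j in free)             # best remaining rank
        J = [j for j in free if row[j] == best]      # weights with R''_ij = 1
        k = max(J, key=lambda j: angles[i][j])       # second criterion
        chosen.append(joint_w[k])
        taken.add(k)                                 # column k set to inf
    return chosen
```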
To further explain the co-evolution procedure, let us consider a bi-objective
minimisation instance shown in Figure 8 with four candidate solutions and
four weight vectors, i.e., N = Nw = 2 (so the joint populations each contain
four members). Table 1 presents the settings for this example, including the
objective values of the four candidate solutions, the four weights, the θ
value and the reference point z∗.

Table 1: Settings of the bi-objective minimisation instance.

Solutions  Objective values (F si)    Weights  Values
s1         (0.20, 1.45)               w1       (0.2, 0.8)
s2         (0.55, 1.35)               w2       (0.45, 0.55)
s3         (0.95, 1.05)               w3       (0.55, 0.45)
s4         (1.55, 0.95)               w4       (0.8, 0.2)
z∗         (0.05, 0.05)               θ        π/18 (≈ 0.0556π)

The angle αij between each pair of si and wj is calculated, and is shown
in Table 2. From the table, we know that when θ is set to π/18 radians, w1
has two neighbours, s1 and s2; w2 has only one neighbour, s3; w3 has two
neighbours, s3 and s4; w4 has no neighbour.

Table 2: Angles (in units of π) between si and wj (neighbourhood is labelled with *).

αij    w1          w2          w3          w4
s1     0.0440 (*)  0.1843      0.2478      0.3880
s2     0.0389 (*)  0.1014      0.1649      0.3051
s3     0.1553      0.0150 (*)  0.0485 (*)  0.1888
s4     0.2500      0.1097      0.0463 (*)  0.0940


Figure 8: Illustration of the co-evolution procedure of PICEA-w with a bi-objective minimisation example.

Table 3: Candidate solutions selection process.

(a) Ranking matrix R
     w1   w2   w3   w4
s1    1   inf  inf  inf
s2    2   inf  inf  inf
s3   inf   1    1   inf
s4   inf  inf   2   inf

(b) Ranking matrix Rsort, rank r and fitness Fit(si)
     Rsort (sorted ranks)   r   Fit(si)
s1    1  inf  inf  inf      2     2
s2    2  inf  inf  inf      3     1
s3    1   1   inf  inf      1     3
s4    2  inf  inf  inf      3     1

Table 4: Weights selection process.

(a) Ranking matrix R′
     w1   w2   w3   w4
s1    1   inf  inf  inf
s3   inf   1    1   inf

(b) Ranking matrix R′′ (selected weight in the last column)
     w1   w2   w3   w4   selected
s1    1    2    2    2   w1
s3    2    1    1    2   w3

Table 3 shows the selection process of candidate solutions. First, candidate
solutions are ranked by each of the weighted Chebyshev functions. Then the
fitness of each candidate solution is calculated according to the procedure of
function selectS. Based on the fitness, s1 (Fit(s1) = 2) and s3 (Fit(s3) = 3)
are selected as the new parent candidate solutions S.
Next we select the weight for each candidate solution in S. Firstly,
we randomly select one solution from S without replacement, e.g., s1. Then
we identify the weights that contribute the most to s1, that is, weights on
which s1 is ranked the best. There is only one such weight, w1, and therefore
w1 is selected. After this, another candidate solution is randomly selected
from the set S\{s1}, e.g., s3. Similarly, we find the weights on which s3
performs the best. Both w2 and w3 satisfy the condition. We then apply the
second criterion: the angle between s3 and w3 is larger than that between s3
and w2, and so w3 is selected for s3. This procedure continues until each
candidate solution in S is assigned a weight vector.
Additionally, in PICEA-w the number of weight vectors Nw is not required
to equal the number of candidate solutions N. However, since in each
generation each surviving candidate solution is assigned a different
weight vector, it is required that 2Nw ≥ N.
With respect to the time complexity of PICEA-w, evaluation of a population
of candidate solutions runs in O(M × N), where M is the number
of objectives and N is the number of candidate solutions. The main cost of
PICEA-w is in function coEvolve, in which three sub-functions are involved.
The sub-function rankingSW ranks all candidate solutions on each weight
vector and so runs in O(N² × Nw). The sub-function selectS selects the
best N solutions from 2N solutions, which runs in O(N²). The sub-function
selectW requires calculating the angle between each pair of candidate
solution and weight vector, which runs in O(N × Nw). Therefore, the overall
time complexity of PICEA-w is O(N² × Nw).

4. Experiment description
4.1. Test problems
Eight test problems are used in this study. They are constructed by ap-
plying different shape functions provided in the WFG toolkit to the stan-
dard WFG4 benchmark problem (Huband et al., 2006). The WFG pa-
rameters k (position parameter) and l (distance parameter) are set to 18
and 14, i.e., the number of decision variables is n = k + l = 32 for each
problem instance. These problems are invoked in 2-, 4- and 7-objective
instances. The source code of these problems can be downloaded from
http://www.sheffield.ac.uk/acse/staff/rstu/ruiwang/index.
• WFG41 is the same as the WFG4 problem which has a concave Pareto
optimal front.
• WFG42 has a convex Pareto optimal front. It is built by replacing the
concave shape function used in WFG4 with the convex shape function.
• WFG43 has a strongly concave Pareto optimal front. It is built by scaling
the concave shape function with power 1/4.
• WFG44 has a strongly convex Pareto optimal front. It is built by scaling
the convex shape function with power 1/4.
• WFG45 has a mixed Pareto optimal front. It is built by replacing the
concave shape function used in WFG4 with the mixed shape function.
• The Pareto optimal front of WFG46 is a hyperplane. It is built by
replacing the concave shape function used in WFG4 with the linear
shape function.
• The Pareto optimal front of WFG47 is disconnected and concave. It is
built by replacing the concave shape function used in WFG4 with the
concave (for the first M − 1 objectives) and the disconnected (for the
last objective) shape function. Parameters used in the disconnected
function are set as α = β = 1/2, A = 2.
• The Pareto optimal front of WFG48 is disconnected and convex. It is
built by replacing the concave shape function used in WFG4 with the
convex (for the first M − 1 objectives) and the disconnected (for the
last objective) shape function. Parameters used in the disconnected
function are set as α = β = 1/2, A = 2.

The Pareto optimal fronts of these problems have the same trade-off magnitude,
lying within [0, 2]. Thus, the nadir point for these problems
is [2, 2, · · · , 2]. The Pareto optimal fronts of these problems are shown in
Appendix A. Please note that WFGn-Y refers to problem WFGn with Y
objectives.

4.2. The competitor MOEAs

To benchmark the performance of PICEA-w, four competitor MOEAs
are considered. All the competitors use the same algorithmic framework
as PICEA-w, and the Chebyshev scalarising function is chosen. The only
difference lies in the method of constructing JointW.
• The first algorithm (denoted as RMOEA) forms JointW by combin-
ing Nw weights that are randomly selected from current JointW and
another set of Nw randomly generated weights. RMOEA represents
decomposition based MOEAs using random weights, e.g., I-MOGLS
and J-MOGLS.

• The second competitor MOEA applies 2Nw weights that are evenly dis-
tributed on a unit-length hyperplane as JointW (denoted as UMOEA).
UMOEA represents decomposition based MOEAs using evenly dis-
tributed weights, e.g., MSOPS and MOEA/D.

• The other two competitor MOEAs use adaptive weights. The weights
adaptation strategies are extracted from DMOEA/D and
EMOSA, respectively. The reason for choosing these two algorithms is
that DMOEA/D has been demonstrated to perform well on problems having
complex Pareto front geometries and EMOSA has been found to outperform
MOGLS and MOEA/D on 2- and 3-objective problems (Li & Landa-Silva, 2011).
Note that the neighbourhood size used in DMOEA/D and
EMOSA is set as T = 10, which we demonstrated in previous work to
offer good performance (Wang et al., 2013).

4.3. General parameters

Each algorithm is run 31 times, each run for 25,000 function
evaluations. For all algorithms the population sizes of candidate solutions
and weights are set as N = 100 and Nw = 100, respectively. The simulated
binary crossover (SBX) and polynomial mutation (PM) are applied as genetic
operators. The recombination probability pc of SBX is set to 1 per individual
and the mutation probability pm of PM is set to 1/n per decision variable.
distribution indices ηc of SBX and ηm of PM are set as 15 and 20, respectively.
These parameter settings are summarised in Table 5 and are fixed across all
algorithm runs.

Table 5: Algorithm testing parameter setting.

Parameters            Values
N                     100
Nw                    100
n                     n = k + l = 18 + 14 = 32
maxGen                250
Crossover operator    SBX (pc = 1, ηc = 15)
Mutation operator     PM (pm = 1/n, ηm = 20)

4.4. Performance assessment

The hypervolume metric (HV) (Zitzler & Thiele, 1999) is used as the performance
metric, and the method developed by Fonseca et al. (2006) is used to
compute the HV value. A favourable hypervolume (larger, for a minimization
problem) implies a better combination of proximity and diversity. The
approximation sets used in the HV calculation are the members of the offline
archive of all non-dominated points found during the search, since this is the
set most relevant to a posteriori decision-making. For reasons of computational
feasibility, prior to analysis the set is pruned to a maximum size of
100 using the SPEA2 truncation procedure (Zitzler et al., 2002). Note that
prior to calculating the HV, we normalize all objective values to be within the
range [0, 1] by the nadir point (Deb et al., 2010) (which assumes equal relative
importance of normalised objectives across the search domain). The reference
point for the hypervolume calculation is set as ri = 1.2, i = 1, 2, · · · , M.
For more details on the dependency of the hypervolume value on the
chosen reference point, readers are referred to Knowles & Corne (2002); Knowles et al.
(2003) and Auger et al. (2009).
Performance comparisons between algorithms based on the HV metric are
made according to a rigorous non-parametric statistical framework, drawing
on recommendations in Zitzler et al. (2003). Specifically, we first test the
hypothesis that all algorithms perform equally using the Kruskal-Wallis test.
If this hypothesis is rejected at the 95% confidence level, we then consider

pair-wise comparisons between the algorithms using the Wilcoxon-ranksum
two-sided comparison procedure at the 95% confidence level, employing Šidák
correction to reduce Type I errors (Curtin & Schulz, 1998).

5. Experiment results
5.1. Non-dominated Pareto fronts and the co-evolved weights
To visualise the performance of PICEA-w, we plot the obtained Pareto
front (that has the median HV metric value across the 31 runs) as well as
the co-evolved weights of the 2-objective problems in Figures 9 and 10.
From the results, it is observed that, for most of the problems, the obtained
solutions spread evenly along the Pareto front, that is, the performance of
PICEA-w is robust to the problem geometry. In addition, the distribution
of the obtained weights approximates the optimal distribution for each
problem (this will be further discussed in Section 6.3). For example, problem
WFG43-2 features strong concavity and the distribution of the obtained
weights is dense in the centre and sparse at the edges. WFG44-2 features
strong convexity and the distribution of the co-evolved weights is sparse in the
centre and dense at the edges. A much clearer observation can be made on
problems WFG47-2 and WFG48-2: on these two problems the co-evolution
leads most of the weights to be constructed in the relevant places, intersecting
the Pareto optimal front.

5.2. Statistical results

Results of the Kruskal-Wallis tests followed by pair-wise Wilcoxon-ranksum
plus Šidák correction tests based on the performance metric are provided in
this section. The initial Kruskal-Wallis test rejects the hypothesis that all
five algorithms are equivalent. Therefore the outcomes of pair-wise statistical
comparisons for 2-, 4- and 7-objective WFG problems are shown in Tables
6, 7 and 8 respectively. A smaller rank value indicates better performance;
ordering within a rank is purely alphabetical.
For 2-objective problems, it is observed from Table 6 that:
(i) UMOEA performs the best for WFG41, WFG42, WFG45 and WFG46,
for which the Pareto optimal fronts are neutral. However, UMOEA
exhibits worse performance on the rest of four problems. Specifically, its
performance is worse than PICEA-w on three out of the four problems;
and is worse than DMOEA/D and EMOSA on WFG43 and WFG44.

[Figure 9 plots omitted: panels (a) WFG41, (b) WFG42, (c) WFG43, (d) WFG44; axes f1 (horizontal) and f2 (vertical), both over [0, 2]]

Figure 9: Approximation sets and weights obtained by PICEA-w for WFG41-2 to WFG44-2: ◦ = weights;  = solution images.

[Figure 10 plots omitted: panels (a) WFG45, (b) WFG46, (c) WFG47, (d) WFG48; axes f1 (horizontal) and f2 (vertical), both over [0, 2]]

Figure 10: Approximation sets and weights obtained by PICEA-w for WFG45-2 to WFG48-2: ◦ = weights;  = solution images.

Table 6: HV results for 2-objective instances (ranking by performance).

WFG41: 1. UMOEA; 2. DMOEA/D, EMOSA, PICEA-w; 3. RMOEA
WFG42: 1. UMOEA; 2. EMOSA, PICEA-w; 3. DMOEA/D; 4. RMOEA
WFG43: 1. DMOEA/D, EMOSA, PICEA-w; 2. UMOEA; 3. RMOEA
WFG44: 1. DMOEA/D, EMOSA, PICEA-w; 2. UMOEA; 3. RMOEA
WFG45: 1. DMOEA/D, EMOSA, PICEA-w, UMOEA; 2. RMOEA
WFG46: 1. UMOEA; 2. DMOEA/D, EMOSA, PICEA-w; 3. RMOEA
WFG47: 1. PICEA-w, UMOEA; 2. DMOEA/D, EMOSA; 3. RMOEA
WFG48: 1. PICEA-w; 2. UMOEA; 3. DMOEA/D, EMOSA; 4. RMOEA

(ii) The three adaptive weights based D-MOEAs have comparable per-
formance on five problems (WFG41, WFG43, WFG44, WFG45 and
WFG46). On WFG47 and WFG48, DMOEA/D and EMOSA have
comparable performance and both are worse than PICEA-w. On WFG42,
PICEA-w and EMOSA have comparable performance and they both
perform better than DMOEA/D.
(iii) RMOEA is the worst optimiser for all the problems.
For 4-objective problems, it is observed from Table 7 that:
(i) The performance of UMOEA degrades. It is inferior to the three adap-
tive weights based D-MOEAs for all the problems. It even performs
worse than RMOEA for five out of the eight problems, and on the
other three problems (WFG41, WFG42 and WFG46) UMOEA per-
forms comparably with RMOEA.
(ii) PICEA-w is always amongst the top performing algorithms. It is exclu-
sively the best for four problems, i.e., WFG41, WFG46, WFG47 and
WFG48.

Table 7: HV results for 4-objective instances (ranking by performance).

WFG41: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA, UMOEA
WFG42: 1. EMOSA, PICEA-w; 2. DMOEA/D; 3. RMOEA, UMOEA
WFG43: 1. EMOSA, PICEA-w; 2. DMOEA/D; 3. RMOEA; 4. UMOEA
WFG44: 1. EMOSA, PICEA-w; 2. DMOEA/D; 3. RMOEA; 4. UMOEA
WFG45: 1. DMOEA/D, EMOSA, PICEA-w; 2. RMOEA; 3. UMOEA
WFG46: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA, UMOEA
WFG47: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA; 4. UMOEA
WFG48: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA; 4. UMOEA

(iii) With respect to EMOSA and DMOEA/D, it is found that EMOSA
exhibits performance better than or comparable to that of DMOEA/D for all
the problems.
(iv) Although RMOEA performs better than UMOEA, it is worse than the
three adaptive weights based algorithms.
As the number of objectives increases to 7, we can observe from Table 8
that the performance of PICEA-w becomes even more promising:
(i) PICEA-w ranks exclusively or jointly (on WFG45 and WFG46) the
best for all the problems.
(ii) Among the remaining four algorithms, EMOSA is the most effective.
It performs better than DMOEA/D for five out of the eight problems,
and is comparable to DMOEA/D for the remaining three problems.
(iii) Although DMOEA/D is worse than PICEA-w and EMOSA, it is better
than RMOEA for most of the problems.
(iv) RMOEA performs better than UMOEA for all the problems. UMOEA
performs the least well.

5.3. Supplementary convergence and diversity results


To further investigate the performance of the algorithms, we also sepa-
rately calculated the proximity (as measured by generational distance – GD

Table 8: HV results for 7-objective instances (ranking by performance).

WFG41: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA; 4. UMOEA
WFG42: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA; 4. UMOEA
WFG43: 1. PICEA-w; 2. EMOSA; 3. DMOEA/D; 4. RMOEA; 5. UMOEA
WFG44: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA; 4. UMOEA
WFG45: 1. EMOSA, PICEA-w; 2. DMOEA/D, RMOEA; 3. UMOEA
WFG46: 1. EMOSA, PICEA-w; 2. DMOEA/D; 3. RMOEA; 4. UMOEA
WFG47: 1. PICEA-w; 2. EMOSA; 3. DMOEA/D, RMOEA; 4. UMOEA
WFG48: 1. PICEA-w; 2. EMOSA; 3. DMOEA/D, RMOEA; 4. UMOEA

(Van Veldhuizen & Lamont, 2000)) and diversity (as measured by the Δ
metric (Deb et al., 2002)) for the 2-, 4- and 7-objective WFG41 problems.
The Pareto optimal front of WFG41 is the surface of an M-dimensional hypersphere
with radius r = 2 in the first quadrant, which is amenable to uniform
sampling. We sample 20,000 points as the reference set for calculating the
performance metrics.

Table 9: GD and Δ results for WFG41 problems.

GD (ranking by performance):
WFG41-2: 1. UMOEA; 2. DMOEA/D, PICEA-w; 3. EMOSA; 4. RMOEA
WFG41-4: 1. UMOEA; 2. PICEA-w; 3. DMOEA/D, EMOSA; 4. RMOEA
WFG41-7: 1. UMOEA; 2. PICEA-w; 3. DMOEA/D, EMOSA; 4. RMOEA

Δ (ranking by performance):
WFG41-2: 1. DMOEA/D, EMOSA, PICEA-w; 2. UMOEA; 3. RMOEA
WFG41-4: 1. EMOSA, PICEA-w; 2. DMOEA/D; 3. RMOEA; 4. UMOEA
WFG41-7: 1. PICEA-w; 2. DMOEA/D, EMOSA; 3. RMOEA; 4. UMOEA

From Table 9, it is found that:

• In terms of convergence, UMOEA performs the best while RMOEA
performs the worst for all three problems. Among the other three algorithms,
PICEA-w performs better than DMOEA/D and EMOSA for
WFG41-4 and WFG41-7. For WFG41-2, PICEA-w performs comparably
with DMOEA/D; both algorithms are better than EMOSA.

• In terms of diversity, PICEA-w is amongst the top performing algorithms
for all three problems. EMOSA performs competitively with
PICEA-w on WFG41-2 and WFG41-4; however, it is worse than PICEA-w
on WFG41-7. DMOEA/D is worse than EMOSA and PICEA-w on
WFG41-4 and WFG41-7, but is better than RMOEA and UMOEA
on these two problems. Comparing UMOEA with RMOEA, RMOEA
is worse on WFG41-2 but better on the other two many-objective
problems.

5.4. Visual examination using Pareto-Box problems

This section further demonstrates the effect of weights adaptation by comparing
PICEA-w against MOEA/D on two Pareto-Box problems introduced
in Ishibuchi et al. (2011). The Pareto-Box problem minimises the distances to
each of a set of given points in the two-dimensional decision space. For example,
if three points are given (e.g., A, B and C), the constructed Pareto-Box
problem is a three-objective problem, written as follows:

Minimize {distance(A, x), distance(B, x), distance(C, x)}. (9)

where x is a decision vector.
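The construction in Equation (9) can be sketched as below (the helper name `pareto_box` is ours; the four points match the first problem instance used in the experiments):

```python
import math

def pareto_box(points):
    """Build a Pareto-Box objective function: the i-th objective is the
    Euclidean distance from decision vector x to points[i]."""
    def objectives(x):
        return [math.hypot(x[0] - p[0], x[1] - p[1]) for p in points]
    return objectives

# the four-point instance A, B, C, D used in the experiments
f4 = pareto_box([(50, 90), (50, 10), (48, 12), (52, 12)])
```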


[Figure 11 plots omitted: decision space x1 vs x2 over [0, 100] × [0, 100], showing the points A–D; panels (a) MOEA/D and (b) PICEA-w]

Figure 11: Experimental results on four-objective Pareto-Box problems.

In this study, the decision space is set within [0, 100] × [0, 100]. The
first problem has four points, that is, A = [50, 90], B = [50, 10], C =
[48, 12], D = [52, 12]; the second problem considers six points: A = [50, 90],
B = [50, 10], C = [48, 12], D = [52, 12], E = [48, 88], F = [52, 88]. The
population size N is set as 220 and 252 for the four-objective and six-objective
problems, respectively. The number of weights is set equal to N. Other
parameters are set the same as those adopted in Table 5.
Experimental results of a single run of each algorithm on the four-objective
Pareto-Box problem are shown in Figure 11. Pareto optimal solutions are
points inside the region enclosed by the four points A, B, C and D. From
Figure 11, we can observe that solutions obtained by PICEA-w spread across
the whole region.
[Figure 12 plots omitted: decision space x1 vs x2 over [0, 100] × [0, 100], showing the points A–F; panels (a) MOEA/D and (b) PICEA-w]

Figure 12: Experimental results on six-objective Pareto-Box problems.

However, for MOEA/D we observe some regions with no solutions. With
respect to the convergence performance, MOEA/D is better: most of the
obtained solutions are inside the region enclosed by the four points. Similar
results are obtained for the six-objective Pareto-Box problem (see Figure 12):
MOEA/D has better convergence performance, while its diversity performance
is inferior to that of PICEA-w.
Although solutions obtained by PICEA-w can almost cover the entire
Pareto optimal front, the uniformity of these solutions is not as good as ex-
pected. One possible reason could be that the weights adaptation strategy is
specially designed for maintaining good diversity in objective-space. There
is no guarantee that good diversity in objective-space leads to good diver-
sity in decision-space. Maintaining good diversity in both objective-space
and decision-space is an important and challenging issue, which needs to be
studied further in future work.

6. Discussions
6.1. General discussions
UMOEA, which employs a set of pre-defined evenly distributed weights,
faces difficulties on many-objective problems or problems having extremely
complex geometry. The reason is that evenly distributed weights lead to a
poor distribution of solutions for problems whose geometry is not similar to

a hyperplane. Simultaneously, as the employed weights are fixed, UMOEA
can only obtain a limited number of solutions and these are insufficient to
approximate the entire Pareto optimal front, particularly for many-objective
problems. Certainly, when the problem is low-dimension and has a neutral
Pareto optimal front, UMOEA performs well, e.g., the 2-objective WFG41,
WFG42, WFG45 and WFG46.
RMOEA, which employs weights that are randomly generated in each
generation, performs worse than UMOEA on low-dimensional problems. The
reason is that, as mentioned earlier, the search directions in RMOEA are
not fixed as in UMOEA but keep changing during the whole search process.
This leads RMOEA to have inferior convergence performance compared
with UMOEA. However, RMOEA tends to perform better than UMOEA on
high-dimensional problems. This, too, is due to the use of random weights:
random weights guide the algorithm to search different regions of the
Pareto optimal front, therefore yielding a set of diversified solutions, i.e.,
better diversity performance.
The three adaptive weights based D-MOEAs perform better than UMOEA
and RMOEA on the four 2-objective problems (WFG43, WFG44, WFG47
and WFG48) that have complex Pareto optimal fronts, and on most of the
many-objective problems. This indicates that the weights adaptation strate-
gies employed in these algorithms are helpful for handling the issue of problem
geometry and many-objective optimisation. On closer examination, we find
that, amongst the three algorithms, the co-evolution based weights
adaptation is the most effective. It appropriately balances exploration
and exploitation and so enables PICEA-w to perform well for most of
the problems. Moreover, during the co-evolution, weights gradually learn the
geometry of the problem and evolve towards an optimal distribution. This
leads PICEA-w to be less sensitive to problem geometry.

6.2. The effect of the parameter θ


In PICEA-w a parameter θ, which measures the angle between a candi-
date solution and a weight vector, is employed to implement local selection.
Here we demonstrate the impact of the local selection mechanism.
The 2- and 4-objective WFG47 are selected as test problems. Two differ-
ent choices of θ are examined. The first sets θ to π/2 radians during
the whole search process, which indicates that the evaluation of candidate
solutions is executed globally. That is, every solution competes against all
the other solutions. The second sets θ to π/18 radians during the whole
search process, which indicates that the evaluation of candidate solutions
is always executed locally. That is, every solution competes against its
neighbours.

(a) PICEA-w1 (b) PICEA-w2

Figure 13: The Pareto front obtained by PICEA-w1 and PICEA-w2 for WFG47-2.

All the other parameters are the same as adopted in Table
5. We use PICEA-w1 and PICEA-w2 to denote PICEA-w using θ = π/2
and θ = π/18 radians, respectively. These two variants are compared with
PICEA-w in which θ is adjusted by Equation (8).
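To make the mechanism concrete, the following Python sketch illustrates angle-based local selection. Equation (8) is not reproduced here: the linear schedule `theta_schedule`, growing θ from π/18 to π/2 over the run, and all function names are illustrative assumptions rather than the paper's exact rule.

```python
import numpy as np

def theta_schedule(gen, max_gen, theta_min=np.pi/18, theta_max=np.pi/2):
    """Illustrative linear schedule: start with local selection (small theta)
    and gradually widen towards global selection. The paper's Equation (8)
    may differ from this assumed form."""
    return theta_min + (theta_max - theta_min) * gen / max_gen

def angle(u, v):
    """Angle (in radians) between two non-negative vectors."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def local_competitors(objs, weight, theta):
    """Indices of candidate solutions whose objective vectors lie within an
    angle theta of the weight vector; only these compete under that weight."""
    return [i for i, f in enumerate(objs) if angle(f, weight) <= theta]

objs = np.array([[1.0, 0.1], [0.7, 0.7], [0.1, 1.0]])
w = np.array([0.5, 0.5])
print(local_competitors(objs, w, np.pi/18))  # only the middle solution
print(local_competitors(objs, w, np.pi/2))   # all three solutions
```

With a small θ each weight evaluates only its angular neighbourhood (local selection); with θ = π/2 every solution competes against all others (global selection), matching the PICEA-w1 setting above.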
All algorithms are executed for 31 runs. To visualise the performance
of PICEA-w, Pareto fronts (that have the median HV metric value across
the 31 runs) obtained by PICEA-w1 and PICEA-w2 are shown in Figure 13.
Experimental results in terms of the HV metric are shown in Table 10. The
symbol ‘−’, ‘=’ or ‘+’ indicates that the considered algorithm performs
statistically worse than, equal to or better than PICEA-w at the 95%
confidence level according to the two-sided Wilcoxon rank-sum test.
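As an illustration of this comparison procedure, the sketch below applies the two-sided rank-sum test (via its normal approximation, without tie correction) and maps the outcome to the ‘−’/‘=’/‘+’ convention. The HV samples are invented for the example, not the paper's data.

```python
import numpy as np
from math import erf, sqrt

def ranksum_p(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation
    (adequate for samples of ~31 runs; this sketch ignores ties)."""
    n1, n2 = len(x), len(y)
    ranks = np.argsort(np.argsort(np.concatenate([x, y]))) + 1
    rsum = ranks[:n1].sum()                       # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2                   # mean under the null
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)    # std under the null
    z = (rsum - mu) / sigma
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

rng = np.random.default_rng(0)
# Hypothetical HV values over 31 runs (invented numbers for illustration).
hv_picea_w  = rng.normal(0.591, 0.011, 31)
hv_picea_w1 = rng.normal(0.503, 0.013, 31)

p = ranksum_p(hv_picea_w1, hv_picea_w)
if p >= 0.05:
    symbol = '='
else:
    symbol = '-' if np.median(hv_picea_w1) < np.median(hv_picea_w) else '+'
print(symbol)  # '-' : the compared algorithm is statistically worse
```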

Table 10: HV comparison results (mean±std) for WFG47-2 and WFG47-4.


PICEA-w1 PICEA-w2 PICEA-w
WFG47-2 0.5028±0.0131 (−) 0.5730±0.0158 (=) 0.5910±0.0109
WFG47-4 0.6346±0.1212 (−) 0.7615±0.0219 (−) 0.7961±0.0182

From the results we can observe that (i) PICEA-w1 performs worse than
PICEA-w for all the problems; and (ii) PICEA-w2 performs comparably with
PICEA-w on the 2-objective WFG47, whilst it performs worse than PICEA-w
on WFG47-4.
The poor performance of PICEA-w1 in terms of the HV metric arises
because PICEA-w1 has not found the entire Pareto optimal front (see Figure
13). The likely reason is that each solution competes globally with other
solutions, which leads some potentially useful, though dominated, solutions
to be removed at an early stage of the evolution. For example, in Figure
14, although solutions in region A are dominated, they are helpful to search
for Pareto optimal solutions in region B. Using small θ at the beginning
is helpful in keeping those locally good solutions in region A and therefore
obtaining the entire Pareto optimal front.

Figure 14: The first generational objective vectors and the Pareto optimal front for
WFG47-2: ◦ = Pareto optimal front;  = solution images in the first generation.

The reason for the poor performance of PICEA-w2 is likely to be that


the local selection has a deleterious effect on convergence. This operation
might assign higher fitness to some dominated solutions. Although this is
helpful from a diversity perspective, it slows down the convergence speed
as some dominated solutions are stored during the search. This side effect
becomes more significant on higher-dimensional problems since convergence
is inherently more difficult to achieve on many-objective fitness landscapes.
Equation (8), used to adjust θ, though simple, is an effective strategy to
balance convergence and diversity performance. In future, a refined strategy
for configuring θ should be investigated.

6.3. The distribution of the co-evolved weights


In PICEA-w the weights also evolve towards the optimal distribution dur-
ing the search. The optimal distribution of weights is defined as that which
corresponds to a uniform distribution of solutions in objective-space. Com-
paring Figure 9 and Figure 10 with Figure A.15 and Figure A.16 (shown
in Appendix A), we can observe that for most of the problems the distri-
bution of the co-evolved weights approximates qualitatively to the optimal
distribution.
Moving beyond visual inspection, we employ a non-parametric method –
the two-sample multi-dimensional Kolmogorov–Smirnov (K–S) test for the
independence of two distributions (Peacock, 1983; Justel et al., 1997) – to
quantitatively examine the similarity between the optimal and evolved weight
sets. For the purposes of this analysis, we can conceive of the state of PICEA-
w at some generation, t, as representing some stochastic and/or emergent
distribution function from which the weight vectors are sampled. Any run of
PICEA-w represents samples from an instance of the distribution function.
We therefore perform a K–S test for each run of the algorithm, compar-
ing against a single set of sample weights drawn from the optimal distri-
bution for each problem. To provide an indication of the effect of sample
(i.e. population) size on the K–S results, and of the effect of alternative
weight distributions, we also provide comparison K–S results for the optimal
weights sample against: (i) another sample from the optimal distribution;
(ii) a sample from a uniform distribution of weights; (iii) a sample from an
even distribution of weights. The methods for generating the sample sets are
summarised below:

(i) Optimal weights: Nw solutions from the Pareto optimal front are uni-
formly sampled. The selected solutions are then scaled and normalised
to create valid weight vectors (see Section 2.3.1).
(ii) Co-evolved weights: use the Nw adapted weights in the last generation
of PICEA-w (the algorithm is run 31 times to generate 31 sets of such
weights).
(iii) Uniform weights: uniformly sample Nw points from the line x + y = 1,
x, y ≥ 0.
(iv) Even weights: use the simplex-lattice design method (see p.8) to gen-
erate Nw weights.
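The two synthetic cases, (iii) and (iv), can be sketched in the bi-objective setting as follows; the function names are ours, not the paper's.

```python
import numpy as np

def uniform_weights(n, rng):
    """Case (iii): sample n points uniformly at random from the line
    x + y = 1 with x, y >= 0."""
    x = rng.random(n)
    return np.column_stack([x, 1.0 - x])

def even_weights(n):
    """Case (iv): 2-objective simplex-lattice design, i.e. evenly spaced
    weights (i/(n-1), 1 - i/(n-1)) for i = 0, ..., n-1."""
    x = np.linspace(0.0, 1.0, n)
    return np.column_stack([x, 1.0 - x])

rng = np.random.default_rng(1)
U = uniform_weights(5, rng)
E = even_weights(5)
# Every generated weight vector is non-negative and sums to 1.
print(np.allclose(U.sum(axis=1), 1.0), np.allclose(E.sum(axis=1), 1.0))
```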

For each case, the null hypothesis of the K–S test is that both sets of
weights are drawn from the same underlying distribution. The K–S statistic
quantifies a distance (denoted as D) between the empirical distribution func-
tions of the two samples. The D metric value is calculated using the method
described in Peacock (1983). We also calculate the p-values for the
independence test. Results are presented in Table 11, for which we show the 10%,
50% and 90% deciles for the PICEA-w comparisons to provide an indication
of uncertainty in the performance of the algorithm. The PICEA-w results
are sorted so that 10% is better performing and 90% is worse performing.
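For illustration, the sketch below computes a simplified two-sample 2-D K–S distance in the spirit of Peacock (1983): the maximum, over quadrants anchored at every pooled sample point, of the difference between the two empirical quadrant probabilities. Peacock's full procedure and the calibration of p-values are more involved than shown here, and the sample data are synthetic.

```python
import numpy as np

def ks2d_statistic(a, b):
    """Simplified two-sample 2-D K-S distance: max over quadrants anchored
    at every pooled point of |F_a(quadrant) - F_b(quadrant)|."""
    pooled = np.vstack([a, b])
    d = 0.0
    for (x, y) in pooled:
        for sx in (1, -1):        # quadrant to the left/right of x
            for sy in (1, -1):    # quadrant below/above y
                fa = np.mean((sx * (a[:, 0] - x) <= 0) & (sy * (a[:, 1] - y) <= 0))
                fb = np.mean((sx * (b[:, 0] - x) <= 0) & (sy * (b[:, 1] - y) <= 0))
                d = max(d, abs(fa - fb))
    return d

rng = np.random.default_rng(2)
same  = ks2d_statistic(rng.random((60, 2)), rng.random((60, 2)))
shift = ks2d_statistic(rng.random((60, 2)), rng.random((60, 2)) + 0.5)
print(same < shift)  # a shifted sample yields a much larger D
```

Two samples from the same distribution give a small D, whereas a shifted sample gives a large one, mirroring the interpretation of D in Table 11.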

Table 11: K–S results: D statistics (p-values)


Case WFG41 WFG42 WFG43 WFG44
Optimal 0.01 (0.999) 0.03 (0.987) 0.01 (0.999) 0.01 (0.999)
Uniform 0.11 (0.621) 0.20 (0.168) 0.20 (0.168) 0.25 (0.056)
Even 0.06 (0.897) 0.16 (0.332) 0.13 (0.499) 0.25 (0.056)
PICEA-w 10% 0.03 (0.987) 0.05 (0.682) 0.04 (0.966) 0.04 (0.966)
PICEA-w 50% 0.05 (0.935) 0.08 (0.798) 0.08 (0.798) 0.10 (0.682)
PICEA-w 90% 0.10 (0.682) 0.10 (0.682) 0.10 (0.682) 0.14 (0.440)
WFG45 WFG46 WFG47 WFG48
Optimal 0.01 (0.999) 0.04 (0.966) 0.01 (0.999) 0.03 (0.987)
Uniform 0.14 (0.440) 0.14 (0.440) 0.18 (0.241) 0.43 (0.000)
Even 0.08 (0.798) 0.06 (0.896) 0.17 (0.285) 0.39 (0.001)
PICEA-w 10% 0.03 (0.987) 0.03 (0.987) 0.03 (0.987) 0.03 (0.987)
PICEA-w 50% 0.06 (0.896) 0.08 (0.798) 0.06 (0.896) 0.08 (0.798)
PICEA-w 90% 0.09 (0.742) 0.12 (0.560) 0.09 (0.742) 0.13 (0.499)

For the WFG48 problem, for uniformly and evenly distributed samples,
we observe that the K–S test identifies that these samples come from a differ-
ent distribution to the optimal distribution of weights with 95% certainty. In
all other cases, the null hypothesis is not rejected. Considering the PICEA-w
results, it can be seen that in general the D metric lies between that result-
ing from an optimal versus optimal comparison and those from the known
alternative distributions, even at the 90% decile. The best runs of PICEA-
w produce results very close to those of an optimal sample. Note that, for
WFG46, evenly distributed weights approximate well to the optimal distri-
bution because the Pareto optimal front of WFG46 is linear (i.e. an
even distribution is actually optimal in this case).
Overall, from both the qualitative and the quantitative examinations, we
have shown that the distribution of the co-evolved weights is more similar to
the optimal weights distribution than a distribution of uniform weights and
even weights for most of the problems. This provides good evidence, at a
mechanism level, for the good performance of PICEA-w, in comparison to
RMOEA and UMOEA.

7. Conclusion
Decomposition based algorithms comprise a popular class of evolutionary
multi-objective optimiser, and have been demonstrated to perform well when
a suitable set of weights are provided. However, determining a good set of
weights a priori for real-world problems is usually not straightforward due to
a lack of knowledge on the underlying problem structure. This study has pro-
posed a new decomposition based algorithm for multi-objective optimisation,
PICEA-w, that eliminates the need to specify appropriate weights in advance
of performing the optimisation. Specifically, weights are adaptively modified
by being co-evolved with candidate solutions during the search process. The
co-evolution enables suitable weights to be constructed adaptively during
the optimisation process, thus guiding the candidate solutions towards the
Pareto optimal front effectively. Through rigorous empirical testing, we have
demonstrated the benefits of PICEA-w compared to other leading decom-
position based algorithms. The chosen test problems encompass the range
of problem geometries likely to be seen in practice, including simultaneous
optimisation of up to seven conflicting objectives. The main findings are as
follows:

(1) PICEA-w is less sensitive to the problem geometry, and outperforms


other leading decomposition based algorithms on many-objective prob-
lems. Moreover, it is shown that while guiding candidate solutions to-
wards the Pareto optimal front, the weights also evolve towards the optimal
distribution.
(2) The other two adaptive weights based D-MOEAs perform
well on most of the 2-objective problems. However, their performance
is not noticeably outstanding on most of the many-objective problems.
This is because the weights adaptation strategies employed in these
algorithms cannot strike an effective balance between exploration and
exploitation.
(3) UMOEA (e.g., MOEA/D and MSOPS) faces difficulties on problems
having complex Pareto optimal fronts and on many-objective problems.
The poor performance of UMOEA is due to a lack of solution diversity.
However, UMOEA is found to perform the best in terms of convergence.

This could perhaps be because the employed weights are kept fixed during
the whole search process, which guides candidate solutions directly towards
the Pareto optimal front.
(4) RMOEA (e.g, HLGA and MOGLS) also faces difficulties on problems
having a complex Pareto optimal front. Noticeably, although it per-
forms worse than UMOEA on bi-objective problems, it performs better
than UMOEA on many-objective problems. The reason is that, for bi-
objective problems, the employed evenly distributed weights are suffi-
cient to describe the entire Pareto optimal front and therefore the diver-
sity performance of UMOEA is not much inferior to that of RMOEA. However,
UMOEA demonstrates a better convergence performance than RMOEA.
For many-objective problems, the limited number of weights employed in
UMOEA is not sufficient to approximate the entire Pareto optimal front
while the use of random weights enables the search to explore different
regions of the Pareto optimal front, producing better solution diversity.

The main limitation of this study is that its findings are based on real-
parameter function optimisation problems. It is also important to assess the
performance of PICEA-w on other problem types, e.g. multi-objective com-
binatorial problems, and also, crucially, real-world problems. With respect to
further research, it is our view that, first, given the superiority of PICEA-w
to other decomposition based MOEAs on many-objective problems, it would
be valuable to compare PICEA-w against other competitive many-objective
optimisers such as HypE and PICEA-g. Also, it would be valuable to inves-
tigate the performance of PICEA-w on multi-objective combinatorial prob-
lems. Thirdly, the search ability of decomposition based MOEAs is affected
by the chosen scalarising function. It would be useful to investigate how to
choose a suitable scalarising function for different problems (Ishibuchi et al.,
2009). Fourthly, in the current version of PICEA-w, new weight vectors are
randomly generated; it would be interesting to see how the performance of
PICEA-w would be affected if genetic operators are applied to generate new
weights. Lastly, different weights lead to different Pareto optimal solutions:
therefore, PICEA-w could be easily extended to incorporate the decision
maker’s real preferences during the optimisation process and so to search for
solutions that are of interest to the decision makers.

Acknowledgements
This research was conducted in the Department of Automatic Control
and Systems Engineering, The University of Sheffield and the first author is
grateful for the facilities and support provided by the University.

Appendix A. Pareto optimal fronts of the WFG4x problems

(a) WFG41 (b) WFG42
(c) WFG43 (d) WFG44

Figure A.15: Pareto optimal fronts and the optimal distributions of weights for WFG41-2
to WFG44-2: ◦ = weights;  = solution images.

(a) WFG45 (b) WFG46
(c) WFG47 (d) WFG48

Figure A.16: Pareto optimal fronts and the optimal distributions of weights for WFG45-2
to WFG48-2: ◦ = weights;  = solution images.

Optimal solutions of these eight modified WFG4 problems satisfy the
condition below, see Equation (A.1) (Huband et al., 2006):

x_i = 2i × 0.35, for i = k + 1, . . . , n (A.1)

where n is the number of decision variables, n = k + l, and k and l are the
numbers of position-related and distance-related parameters, respectively.
To obtain an approximation of the Pareto
optimal front, we first randomly generate 20,000 optimal solutions for the
test problem and compute their objective values. Second, we employ the
clustering technique employed in SPEA2 (Zitzler et al., 2002) to select a set
of evenly distributed solutions from all the generated solutions.
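The truncation step can be sketched as follows. This is a simplified version of SPEA2's archive truncation that repeatedly deletes the point in the densest area, breaking ties using only the nearest neighbour, whereas SPEA2 also considers the second-, third-, ... nearest distances.

```python
import numpy as np

def spea2_truncate(points, k):
    """Simplified SPEA2-style truncation: repeatedly delete the point whose
    nearest-neighbour distance is smallest, until k points remain."""
    pts = list(map(tuple, points))
    while len(pts) > k:
        arr = np.array(pts)
        # Pairwise Euclidean distances; ignore self-distances on the diagonal.
        d = np.linalg.norm(arr[:, None, :] - arr[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        crowded = int(np.argmin(d.min(axis=1)))  # point in the densest area
        pts.pop(crowded)
    return np.array(pts)

# A cluster of three near-duplicates plus two isolated points: truncation
# thins the cluster while keeping the spread-out points.
pts = np.array([[0.0, 0.0], [0.01, 0.0], [0.0, 0.01], [1.0, 0.0], [0.0, 1.0]])
kept = spea2_truncate(pts, 3)
print(kept)
```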
The Pareto optimal fronts of the 2-objective versions of these problems are
shown in Figure A.15 and Figure A.16. Additionally, the optimal
distribution of weights for each of the problems is also plotted. They are
calculated according to the method provided by Giagkiozis et al. (2013a).

References
Auger, A., Bader, J., Brockhoff, D., & Zitzler, E. (2009). Theory of the hyper-
volume indicator: optimal μ-distributions and the choice of the reference
point. In Proceedings of the tenth ACM SIGEVO workshop on Foundations
of genetic algorithms (pp. 87–102). ACM.

Bader, J., & Zitzler, E. (2011). HypE: an algorithm for fast hypervolume-
based many-objective optimization. Evolutionary Computation, 19 , 45–76.

Coello, C., Lamont, G., & Van Veldhuizen, D. (2007). Evolutionary Algo-
rithms for Solving Multi-Objective Problems. Springer.

Curtin, F., & Schulz, P. (1998). Multiple correlations and Bonferroni's cor-
rection. Biological Psychiatry, 44 , 775–777.

Deb, K., Miettinen, K., & Chaudhuri, S. (2010). Toward an estimation


of nadir objective vector using a hybrid of evolutionary and local search
approaches. IEEE Transactions on Evolutionary Computation, 14 , 821–
841.

Deb, K., Mohan, M., & Mishra, S. (2005). Evaluating the epsilon-domination
based multi-objective evolutionary algorithm for a quick computation of
Pareto-optimal solutions. Evolutionary Computation, 13 , 501–25.

Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. (2002). A fast and
elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on
Evolutionary Computation, 6 , 182–197.

Emmerich, M., Beume, N., & Naujoks, B. (2005). An EMO algorithm using
the hypervolume measure as selection criterion. In Evolutionary Multi-
Criterion Optimization (pp. 62–76). Springer.

Fonseca, C., & Fleming, P. (1993). Genetic Algorithms for Multiobjective


Optimization: Formulation, Discussion and Generalization. In Proceedings
of the 5th International Conference on Genetic Algorithms (pp. 416–423).
Morgan Kaufmann Publishers Inc.

Fonseca, C., Paquete, L., & López-Ibáñez, M. (2006). An improved


dimension-sweep algorithm for the hypervolume indicator. In Evolutionary
Computation (CEC), 2006 IEEE Congress on (pp. 1157–1163). IEEE.

Giagkiozis, I., & Fleming, P. (2012). Methods for Many-Objective Optimiza-


tion: An Analysis. Research Report No. 1030 Automatic Control and
Systems Engineering, University of Sheffield.

Giagkiozis, I., Purshouse, R. C., & Fleming, P. J. (2013a). Generalized


decomposition. In Evolutionary Multi-Criterion Optimization (pp. 428–
442). Springer.

Giagkiozis, I., Purshouse, R. C., & Fleming, P. J. (2013b). Towards Under-


standing the Cost of Adaptation in Decomposition-Based Optimization
Algorithms. In Systems, Man, and Cybernetics (SMC), 2013 IEEE Inter-
national Conference on (pp. 615–620). IEEE.

Goldberg, D. (1989). Genetic Algorithms in Search, Optimization, and Ma-


chine Learning. Addison-wesley.

Gu, F., Liu, H., & Tan, K. (2012). A Multiobjective Evolutionary Algorithm
using Dynamic Weight Design Method. International Journal of Innovative
Computing Information and Control, 8 , 3677–3688.

Hajela, P., Lee, E., & Lin, C. (1993). Genetic algorithms in structural topol-
ogy optimization. Topology design of structures, 227 , 117–134.

Huband, S., Hingston, P., Barone, L., & While, L. (2006). A review of
multiobjective test problems and a scalable test problem toolkit. IEEE
Transactions on Evolutionary Computation, 10 , 477–506.
Hughes, E. (2003). Multiple single objective pareto sampling. In Evolutionary
Computation (CEC), 2003 IEEE Congress on (pp. 2678–2684). IEEE.
Hughes, E. J. (2007). MSOPS-II: A general-purpose Many-Objective opti-
miser. In Evolutionary Computation (CEC), 2007 IEEE Congress on (pp.
3944–3951). IEEE.
Ishibuchi, H., Hitotsuyanagi, Y., Ohyanagi, H., & Nojima, Y. (2011). Ef-
fects of the Existence of Highly Correlated Objectives on the Behavior of
MOEA/D. In Evolutionary Multi-Criterion Optimization (pp. 166–181).
Springer.
Ishibuchi, H., & Murata, T. (1998). A multi-objective genetic local search
algorithm and its application to flowshop scheduling. IEEE Transactions
on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 28 ,
392–403.
Ishibuchi, H., Sakane, Y., Tsukamoto, N., & Nojima, Y. (2009). Adaptation
of scalarizing functions in MOEA/D: An adaptive scalarizing function-
based multiobjective evolutionary algorithm. In Evolutionary Multi-
Criterion Optimization (pp. 438–452). Springer.
Ishibuchi, H., Sakane, Y., Tsukamoto, N., & Nojima, Y. (2010). Simultane-
ous use of different scalarizing functions in MOEA/D. In GECCO 2010:
Proceedings of the Genetic and Evolutionary Computation Conference (pp.
519–526). Portland, USA: ACM.
Ishibuchi, H., Tsukamoto, N., & Nojima, Y. (2008). Evolutionary many-
objective optimization: A short review. In Evolutionary Computation
(CEC), 2008 IEEE Congress on (pp. 2419–2426). IEEE.
Jaszkiewicz, A. (2002). Genetic local search for multi-objective combinatorial
optimization. European Journal of Operational Research, 137 , 50–71.
Jiang, S., Cai, Z., Zhang, J., & Ong, Y. (2011). Multiobjective optimization
by decomposition with Pareto-adaptive weight vectors. In Natural Com-
putation (ICNC), 2011 Seventh International Conference on (pp. 1260–
1264). IEEE.

Justel, A., Peña, D., & Zamar, R. (1997). A multivariate Kolmogorov-
Smirnov test of goodness of fit. Statistics & Probability Letters, 35 , 251–
259.
Kim, I. Y., & De Weck, O. (2005). Adaptive weighted-sum method for
bi-objective optimization: Pareto front generation. Structural and multi-
disciplinary optimization, 29 , 149–158.
Kim, I. Y., & De Weck, O. (2006). Adaptive weighted sum method for
multiobjective optimization: a new method for Pareto front generation.
Structural and Multidisciplinary Optimization, 31 , 105–116.
Knowles, J., & Corne, D. (2002). On metrics for comparing nondominated
sets. In Evolutionary Computation (CEC), 2002 IEEE Congress on (pp.
711–716). IEEE.
Knowles, J. D., Corne, D. W., & Fleischer, M. (2003). Bounded archiving
using the Lebesgue measure. In Evolutionary Computation (CEC), 2003
IEEE Congress on (pp. 2490–2497). IEEE.
Li, H., & Landa-Silva, D. (2011). An adaptive evolutionary multi-objective
approach based on simulated annealing. Evolutionary Computation, 19 ,
561–595.
Murata, T., Ishibuchi, H., & Gen, M. (2001). Specification of genetic search
directions in cellular multi-objective genetic algorithms. In Evolutionary
Multi-Criterion Optimization (pp. 82–95). Springer.
Peacock, J. (1983). Two-dimensional goodness-of-fit testing in astronomy.
Monthly Notices of the Royal Astronomical Society, 202 , 615–627.
Purshouse, R., & Fleming, P. (2003). Evolutionary many-objective optimisa-
tion: an exploratory analysis. In Evolutionary Computation (CEC), 2003
IEEE Congress on (pp. 2066–2073). IEEE.
Purshouse, R., & Fleming, P. (2007). On the Evolutionary Optimization of
Many Conflicting Objectives. IEEE Transactions on Evolutionary Com-
putation, 11 , 770–784.
Purshouse, R., Jalbǎ, C., & Fleming, P. (2011). Preference-driven co-
evolutionary algorithms show promise for many-objective optimisation. In
Evolutionary Multi-Criterion Optimization (pp. 136–150). Springer.

Qi, Y., Ma, X., Liu, F., Jiao, L., Sun, J., & Wu, J. (2013). MOEA/D with
Adaptive Weight Adjustment. Evolutionary Computation. doi:10.1162/
EVCO_a_00109.

Srinivas, M., & Patnaik, L. M. (1994). Adaptive probabilities of crossover


and mutation in genetic algorithms. IEEE Transactions on Systems, Man
and Cybernetics, 24 , 656–667.

Tan, Y., Jiao, Y., Li, H., & Wang, X. (2012). A modification to MOEA/D-
DE for multiobjective optimization problems with complicated Pareto sets.
Information Sciences, 213 , 14–38.

Van Veldhuizen, D., & Lamont, G. (2000). On measuring multiobjective


evolutionary algorithm performance. In Evolutionary Computation (CEC),
2000 IEEE Congress on (pp. 204–211). IEEE.

Wang, R., Fleming, P., & Purshouse, R. (2014). General framework for
localised multi-objective evolutionary algorithms. Information Sciences,
258 , 29–53.

Wang, R., Purshouse, R., & Fleming, P. (2013). Preference-inspired Co-


evolutionary Algorithms for Many-objective Optimisation. IEEE Trans-
actions on Evolutionary Computation, 17 , 474–494.

Zhang, Q., & Li, H. (2007). MOEA/D: A Multiobjective Evolutionary Al-


gorithm Based on Decomposition. IEEE Transactions on Evolutionary
Computation, 11 , 712–731.

Zhang, Q., Liu, W., & Li, H. (2009). The performance of a new version of
MOEA/D on CEC09 unconstrained MOP test instances. In Evolutionary
Computation (CEC), 2009 IEEE Congress on (pp. 203–208). IEEE.

Zitzler, E., & Künzli, S. (2004). Indicator-based selection in multiobjective


search. In Parallel Problem Solving from Nature – PPSN VIII (pp. 832–
842). Springer.

Zitzler, E., Laumanns, M., & Thiele, L. (2002). SPEA2: Improving the
Strength Pareto Evolutionary Algorithm for Multiobjective Optimization.
Evolutionary Methods for Design Optimisation and Control with Applica-
tion to Industrial Problems EUROGEN 2001 , 3242 , 95–100.

Zitzler, E., & Thiele, L. (1999). Multiobjective evolutionary algorithms: A
comparative case study and the strength pareto approach. IEEE Transac-
tions on Evolutionary Computation, 3 , 257–271.

Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C., & da Fonseca, V. (2003).
Performance assessment of multiobjective optimizers: an analysis and re-
view. IEEE Transactions on Evolutionary Computation, 7 , 117–132.


Research highlights

• A new decomposition based algorithm PICEA-w is proposed.
• PICEA-w adaptively varies the weights.
• Adaptive weights bring robustness to different problem geometries.
• PICEA-w outperforms other leading decomposition based algorithms.
