Compare Results

Compilation of Results on the 2005 CEC Benchmark Function Set

Nikolaus Hansen
Computational Laboratory (CoLab)
Institute of Computational Science
ETH Zurich
May 4, 2006
A Note on Evaluation Criteria

- Quantitative performance measurement: the Success Performance, abbreviated SP or FEs, is used as a measure of the expected number of function evaluations needed to reach a target function value.
- Invariance is a non-empirical statement about the ability to generalize performance results. Invariance guarantees identical performance on a class of functions. Possible invariances are
  - invariance against translation, scaling, or even order-preserving transformations of the objective function value
  - invariance against angle-preserving (rigid) transformations of the search space (translation, rotation)
- Meta-parameters
  - how many parameters of the algorithm need to be adjusted to the objective function?
  - how many different setups were tested?
  - how many different setups were finally used?
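As an illustration of invariance against order-preserving transformations of the objective function value, the following minimal sketch runs a simple comparison-based search on a sphere function and on a strictly increasing transformation of it. The search routine, both test functions, and all parameter values are assumptions chosen for illustration; none of them is taken from the compared algorithms. Because the search only compares function values, both runs should visit the same points.

```python
import random


def comparison_based_search(f, seed, steps=200, dim=3):
    """Simple (1+1)-style search that uses f only through comparisons."""
    rng = random.Random(seed)
    x = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    fx = f(x)
    path = [tuple(x)]
    for _ in range(steps):
        # The candidate is drawn before looking at any function value,
        # so the random stream does not depend on f.
        y = [xi + rng.gauss(0.0, 0.5) for xi in x]
        fy = f(y)
        if fy <= fx:  # only the ordering of the two values matters
            x, fx = y, fy
        path.append(tuple(x))
    return path


def sphere(x):
    return sum(xi * xi for xi in x)


def transformed_sphere(x):
    # Strictly increasing transformation of the sphere value (illustrative).
    return 2.0 * sphere(x) ** 3 + 1.0


path_a = comparison_based_search(sphere, seed=1)
path_b = comparison_based_search(transformed_sphere, seed=1)
print(path_a == path_b)
```

An algorithm that uses the function values themselves, for example to weight candidate points, would in general not pass this check.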
Methods
- task: black-box optimization of 25 benchmark functions
- 25 runs on each benchmark function for each dimension n = 10, 30
- a run is successful if the global optimum is reached with the given precision before the maximum number of function evaluations, FEmax, is reached, where
  $\mathrm{FEmax} = 10^5$ for $n = 10$
  $\mathrm{FEmax} = 3 \times 10^5$ for $n = 30$

Remark
The setting of FEmax has a considerable influence on the results if the target function value can be reached with high probability only for a (slightly) larger number of function evaluations. Where FEs exceeds FEmax, the result must be taken with great care.
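To see why a Success Performance above FEmax deserves care, here is a minimal sketch assuming the definition used in these slides, FEs = mean(#fevals) x #all runs / #successful runs, with #fevals taken over successful runs only. The function name and the example numbers are hypothetical.

```python
FE_MAX = 10**5  # budget for n = 10, as stated in the slides


def success_performance(fevals_successful, n_runs=25):
    """Success Performance from the evaluation counts of the successful runs.

    Returns None when no run was successful (FEs is then not available).
    """
    n_success = len(fevals_successful)
    if n_success == 0:
        return None
    mean_fevals = sum(fevals_successful) / n_success
    return mean_fevals * n_runs / n_success


# Hypothetical case: a single run out of 25 succeeds shortly before the budget ends.
sp = success_performance([80_000], n_runs=25)
print(sp, sp > FE_MAX)
```

In this hypothetical case the estimate rests on one successful run, so a slightly larger budget could change the number of successes, and with it FEs, substantially.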
Reference
Suganthan, Hansen, Liang, Deb, Chen, Auger, and Tiwari (2005). Problem Definitions and Evaluation
Criteria for the CEC 2005 Special Session on Real-Parameter Optimization, Technical report, Nanyang
Technological University, Singapore, May 2005, http://www.ntu.edu.sg/home/EPNSugan
References to Algorithms
BLX-GL50    García-Martínez and Lozano (Hybrid Real-Coded. . . )
BLX-MA      Molina et al. (Adaptive Local Search. . . )
CoEVO       Pošík (Real-Parameter Optimization. . . )
DE          Rönkkönen et al. (Real-Parameter Optimization. . . )
DMS-L-PSO   Liang and Suganthan (Dynamic Multi-Swarm. . . )
EDA         Yuan and Gallagher (Experimental Results. . . )
G-CMA-ES    Auger and Hansen (A Restart CMA. . . )
K-PCX       Sinha et al. (A Population-Based,. . . )
L-CMA-ES    Auger and Hansen (Performance Evaluation. . . )
L-SaDE      Qin and Suganthan (Self-Adaptive Differential. . . )
SPC-PNX     Ballester et al. (Real-Parameter Optimization. . . )
In: CEC 2005 IEEE Congress on Evolutionary Computation, Proceedings
Summarized Results
Empirical Distribution of Normalized Success Performance

[Figure, two panels (n = 10 and n = 30): empirical distribution over all functions of FEs / FEsbest, one curve per algorithm.]

$\mathrm{FEs} = \mathrm{mean}(\#\mathrm{fevals}) \times \frac{\#\text{all runs (25)}}{\#\text{successful runs}}$, where #fevals includes only successful runs.
Shown: empirical distribution function of the Success Performance FEs divided by FEs of the best algorithm on the respective function.
Results of all functions are used where at least one algorithm was successful at least once, i.e. where the target function value was reached in at least one experiment (out of 11 × 25 experiments).
Small values for FEs, and therefore large (cumulative frequency) values in the graphs, are preferable.
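The normalization described above can be sketched as follows. This is only an illustration: the function names and the example data are hypothetical, and the Success Performance formula is the one given on this slide.

```python
def success_performance(fevals_successful, n_runs=25):
    """mean(#fevals) * #all runs / #successful runs; None without any success."""
    n_success = len(fevals_successful)
    if n_success == 0:
        return None
    return sum(fevals_successful) / n_success * n_runs / n_success


def normalized_sp(results, n_runs=25):
    """Divide each algorithm's FEs by the FEs of the best algorithm.

    `results` maps an algorithm name to the evaluation counts of its
    successful runs on one function. Algorithms without a successful run
    are left out, because FEs is not available for them.
    """
    sp = {name: success_performance(fevals, n_runs) for name, fevals in results.items()}
    solved = {name: value for name, value in sp.items() if value is not None}
    if not solved:
        return {}
    best = min(solved.values())
    return {name: value / best for name, value in solved.items()}


# Hypothetical data for one function: A succeeds in all 25 runs, B in 20, C in none.
example = {"A": [1000] * 25, "B": [2000] * 20, "C": []}
ratios = normalized_sp(example)
print(ratios)
```

The best algorithm on a function always gets the ratio 1, which is why every column of the tables on the following slides contains an entry 1.0 whenever the function was solved at all.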
Function Sets
We split the function set into three subsets:
- unimodal functions
- solved multimodal functions: at least one algorithm conducted at least one successful run
- unsolved multimodal functions: no single run was successful for any algorithm
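The split can be expressed as a small routine. In this sketch the data layout and the function name are assumptions; the set of unimodal functions (1 to 6) follows the tables on the following slides.

```python
UNIMODAL = {1, 2, 3, 4, 5, 6}  # functions listed under "Unimodal Functions"


def split_functions(success_counts, unimodal=UNIMODAL):
    """Assign each function id to one of the three subsets.

    `success_counts` maps a function id to a dict of
    {algorithm name: number of successful runs}.
    """
    subsets = {"unimodal": [], "solved multimodal": [], "unsolved multimodal": []}
    for fid in sorted(success_counts):
        if fid in unimodal:
            subsets["unimodal"].append(fid)
        elif any(count > 0 for count in success_counts[fid].values()):
            subsets["solved multimodal"].append(fid)
        else:
            subsets["unsolved multimodal"].append(fid)
    return subsets


# Small excerpt in the spirit of the n = 10 results (two algorithms only).
counts = {
    1: {"G-CMA-ES": 25, "L-SaDE": 25},
    9: {"G-CMA-ES": 19, "L-SaDE": 25},
    8: {"G-CMA-ES": 0, "L-SaDE": 0},
}
print(split_functions(counts))
```

Note that the split depends on the dimension and on FEmax: a function can count as solved for n = 10 and as unsolved for n = 30.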
Unimodal Functions
Normalized Success Performance, Tabulated
n = 10, FEmax = 100000

Functions: 1 Sphere, 2 Schwefel 1.2, 3 Ellipsoid (condition 10^6), 4 Schwefel 1.2 with Noise, 5 Schwefel 2.6 on Bounds, 6 Rosenbrock

Algorithm  Solved  Rate  1         2         3         4         5         6
best FEs                 1000      2400      6500      2900      5900      7100
G-CMA-ES   6       100%  1.6(25)   1.0(25)   1.0(25)   1.0(25)   1.0(25)   1.5(25)
EDA        6       97%   10.0(25)  4.6(25)   2.5(23)   4.1(25)   4.2(25)   9.6(22)
DE         6       96%   29.0(25)  19.2(25)  18.5(20)  17.9(25)  6.9(25)   6.6(24)
L-CMA-ES   6       88%   1.7(25)   1.1(25)   1.0(25)   65.5(7)   1.0(25)   1.3(25)
BLX-GL50   5       83%   19.0(25)  17.1(25)  [9]       14.5(25)  4.7(25)   7.3(25)
DMS-L-PSO  5       80%   12.0(25)  5.0(25)   1.8(25)   [11]      18.6(20)  7.7(25)
L-SaDE     5       77%   10.0(25)  4.2(25)   8.0(16)   15.9(24)  [9]       6.9(25)
SPC-PNX    4       67%   6.7(25)   12.9(25)  [11]      10.7(25)  6.8(25)   [10]
CoEVO      4       67%   23.0(25)  11.3(25)  6.8(25)   16.2(25)  [10]      [11]
K-PCX      4       62%   1.0(25)   1.0(25)   [8]       19.7(21)  [11]      1.0(22)
BLX-MA     3       49%   12.0(25)  15.4(25)  [10]      25.9(24)  [8]       [9]

First row (best FEs): Success Performance $\mathrm{FEs} = \mathrm{mean}(\#\mathrm{fevals}) \times \frac{\#\text{all runs (25)}}{\#\text{successful runs}}$ of the best algorithm, where #fevals includes only successful runs.
Table entries: Success Performance FEs divided by FEs of the best algorithm (first row), (number of successful runs in round brackets), [rank of median final function value, where FEs is not available, in square brackets]
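The success-rate column of the n = 10 table is consistent with the fraction of successful runs over all runs on the six unimodal functions. The slides do not state this definition, so the sketch below treats it as an assumption and checks it against two rows of the table; the function name is hypothetical.

```python
def success_rate(successful_runs_per_function, n_runs=25):
    """Fraction of successful runs over all runs on the given functions."""
    total_runs = n_runs * len(successful_runs_per_function)
    return sum(successful_runs_per_function) / total_runs


# Successful runs on functions 1 to 6, read from the round brackets in the
# n = 10 table (0 where only a rank in square brackets is given).
eda = [25, 25, 23, 25, 25, 22]
blx_ma = [25, 25, 0, 24, 0, 0]

print(round(100 * success_rate(eda)), round(100 * success_rate(blx_ma)))
```

Under this reading, EDA has 145 successful runs out of 150 and BLX-MA has 74 out of 150, which rounds to the 97% and 49% shown in the table.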
Unimodal Functions
n = 30, FEmax = 300000

Functions: 1 Sphere, 2 Schwefel 1.2, 3 Ellipsoid (condition 10^6), 4 Schwefel 1.2 with Noise, 5 Schwefel 2.6 on Bounds, 6 Rosenbrock

Algorithm  Solved  Rate  1         2         3         4         5         6
best FEs                 2700      12000     43000     59000     66000     60000
G-CMA-ES   6       90%   1.7(25)   1.1(25)   1.0(25)   1.0(10)   1.0(25)   1.0(25)
L-CMA-ES   5       83%   1.8(25)   1.2(25)   1.0(25)   [11]      1.1(25)   1.1(25)
EDA        4       67%   55.6(25)  13.3(25)  5.1(25)   3.4(25)   [3]       [10]
DMS-L-PSO  4       63%   1.9(25)   10.8(25)  7.9(21)   [9]       [9]       5.5(24)
BLX-GL50   3       50%   21.5(25)  13.3(25)  [7]       [6]       [6]       3.7(25)
SPC-PNX    4       45%   11.1(25)  26.7(22)  [11]      6.1(19)   [10]      86.7(1)
K-PCX      3       43%   1.0(25)   1.0(25)   [6]       [8]       [7]       1.1(14)
L-SaDE     3       41%   7.4(25)   12.5(24)  [8]       9.2(13)   [4]       [8]
BLX-MA     1       17%   11.9(25)  [10]      [10]      [7]       [7]       [8]
DE         1       17%   51.9(25)  [11]      [9]       [5]       [5]       [7]
CoEVO      2       7%    519(3)    70.0(8)   [5]       [10]      [11]      [11]

First row (best FEs): Success Performance $\mathrm{FEs} = \mathrm{mean}(\#\mathrm{fevals}) \times \frac{\#\text{all runs (25)}}{\#\text{successful runs}}$ of the best algorithm, where #fevals includes only successful runs.
Table entries: Success Performance FEs divided by FEs of the best algorithm (first row), (number of successful runs in round brackets), [rank of median final function value, where FEs is not available, in square brackets]
Unimodal Functions
[Figure, two panels (n = 10 and n = 30): empirical distribution over unimodal functions of FEs / FEsbest, one curve per algorithm.]

Empirical distribution function of the Success Performance FEs divided by FEs of the best algorithm (table entries of the last slides).
$\mathrm{FEs} = \mathrm{mean}(\#\mathrm{fevals}) \times \frac{\#\text{all runs (25)}}{\#\text{successful runs}}$, where #fevals includes only successful runs.
Small values of FEs, and therefore large values in the empirical distribution graphs, are preferable.
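A minimal sketch of how such an empirical distribution can be computed from the tabulated ratios. It assumes that the step height is one over the number of functions in the subset, so that an algorithm which fails on some functions never reaches 1; the slides do not spell this out, and the function name is hypothetical.

```python
def ecdf_points(ratios, n_functions):
    """Return (x, y) steps of the empirical distribution of FEs / FEsbest.

    `ratios` holds one value per function the algorithm solved;
    functions it did not solve contribute no step.
    """
    return [(x, (i + 1) / n_functions) for i, x in enumerate(sorted(ratios))]


# Ratios read from the n = 10 unimodal table (six functions in the subset).
g_cma_es = [1.6, 1.0, 1.0, 1.0, 1.0, 1.5]
blx_ma = [12.0, 15.4, 25.9]

print(ecdf_points(g_cma_es, 6)[-1])
print(ecdf_points(blx_ma, 6)[-1])
```

Under this assumption the curve of an algorithm that solved all six functions ends at 1, while the curve of one that solved three of them ends at 0.5.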
Solved Multimodal Functions

n = 10, FEmax = 100000

Functions: 7 Griewank without Bounds, 9 Rastrigin Separable, 10 Rastrigin Rotated, 11 Weierstrass, 12 Schwefel 2.13, 15 Hybrid

Algorithm  Solved  Rate  7         9         10        11        12        15
best FEs                 4700      17000     55000     190000    8200      33000
G-CMA-ES   5       63%   1.0(25)   4.5(19)   1.2(23)   1.4(6)    4.0(22)   [6]
L-SaDE     4       53%   36.2(6)   1.0(25)   [5]       [8]       3.9(25)   1.0(23)
DMS-L-PSO  4       47%   126(4)    2.1(25)   [3]       [7]       6.6(19)   1.7(22)
K-PCX      3       40%   [10]      2.9(24)   1.0(22)   [10]      1.0(14)   [11]
DE         5       30%   255(2)    10.6(11)  [9]       1.0(12)   8.8(19)   75.8(1)
L-CMA-ES   2       25%   1.2(25)   [11]      [10]      [6]       11.6(12)  [6]
BLX-GL50   3       17%   12.3(9)   10.0(3)   [5]       [5]       12.1(13)  [9]
BLX-MA     2       15%   [11]      5.7(18)   [7]       [9]       [9]       8.5(5)
EDA        3       9%    404(1)    [9]       [4]       2.9(3)    4.3(10)   [9]
SPC-PNX    2       1%    383(1)    [8]       [8]       5.8(1)    [10]      [6]
CoEVO      0       0%    [9]       [10]      [11]      [11]      [11]      [8]

First row (best FEs): Success Performance $\mathrm{FEs} = \#\mathrm{fevals} / p_{\mathrm{succ}}$ of the best algorithm.
Table entries: Success Performance FEs normalized (divided) by the first row, (number of successful runs in round brackets), [rank of median final function value, where FEs is not available, in square brackets]
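The Success Performance is written on this slide with psucc and on the other slides with run counts. Both forms agree if psucc is read as the fraction of successful runs, which is an assumption here because the slides do not define it. The comments add a worked example with values from the n = 10 table, which are rounded there.

```latex
\[
\mathrm{FEs}
  = \operatorname{mean}(\#\mathrm{fevals}) \times
    \frac{\#\text{all runs}}{\#\text{successful runs}}
  = \frac{\operatorname{mean}(\#\mathrm{fevals})}{p_{\mathrm{succ}}},
\qquad
p_{\mathrm{succ}} = \frac{\#\text{successful runs}}{\#\text{all runs}} .
\]
% Worked example, function 7 (Griewank), n = 10:
% EDA has the entry 404(1) and the first row is 4700,
% so FEs is about 404 x 4700 = 1898800.
% With p_succ = 1/25, the single successful run used about
% 1898800 / 25 = 75952 function evaluations, which is below FEmax = 100000.
```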
Solved Multimodal Functions

n = 30, FEmax = 300000

Functions: 7 Griewank without Bounds, 9 Rastrigin Separable, 10 Rastrigin Rotated, 11 Weierstrass, 12 Schwefel 2.13, 15 Hybrid

Algorithm  Solved  Rate  7         9         10        11        12        15
best FEs                 6100      99000     450000    5000000   180000    -
K-PCX      4       38%   2.5(10)   3.3(18)   1.0(14)   [7]       1.0(5)    [11]
G-CMA-ES   5       37%   1.0(25)   8.0(9)    5.3(3)    1.0(1)    1.3(8)    [1]
L-SaDE     2       36%   21.3(20)  1.0(25)   [4]       [5]       [4]       [2]
DMS-L-PSO  2       22%   9.8(24)   [6]       [5]       [5]       8.3(4)    [4]
EDA        1       20%   21.3(25)  [10]      [9]       [10]      [7]       [4]
BLX-GL50   1       20%   10.2(25)  [5]       [3]       [4]       [8]       [4]
DE         1       20%   32.8(22)  [6]       [6]       [10]      [5]       [8]
L-CMA-ES   1       20%   1.1(25)   [11]      [11]      [3]       [9]       [8]
SPC-PNX    1       13%   60.7(16)  [8]       [7]       [2]       [10]      [8]
CoEVO      1       9%    93.4(11)  [9]       [10]      [9]       [11]      [10]
BLX-MA     1       7%    [11]      6.7(9)    [8]       [8]       [6]       [4]

First row (best FEs): Success Performance $\mathrm{FEs} = \mathrm{mean}(\#\mathrm{fevals}) \times \frac{\#\text{all runs (25)}}{\#\text{successful runs}}$ of the best algorithm, where #fevals includes only successful runs.
Table entries: Success Performance FEs divided by FEs of the best algorithm (first row), (number of successful runs in round brackets), [rank of median final function value, where FEs is not available, in square brackets]
Multimodal Functions
[Figure, two panels (n = 10 and n = 30): empirical distribution over multimodal functions of FEs / FEsbest, one curve per algorithm.]

Empirical distribution function of the Success Performance FEs divided by FEs of the best algorithm (table entries of the last slides).
$\mathrm{FEs} = \mathrm{mean}(\#\mathrm{fevals}) \times \frac{\#\text{all runs (25)}}{\#\text{successful runs}}$, where #fevals includes only successful runs.
Small values of FEs, and therefore large values in the empirical distribution graphs, are preferable.
Never Solved Multimodal Functions

Rank of Final Function Value
n = 10, $\mathrm{FEmax} = 10^5$

Functions: 8 Ackley on Bounds, 13 Expanded, 14 Expanded Schaffer F6, 16 to 25 Hybrid

Algorithm  mean  8     13    14    16    17    18    19    20    21    22    23    24    25
G-CMA-ES   3.8   5.5   4     7     1     5.5   3     2.5   2.5   5.5   1.5   4     5     2.5
BLX-GL50   4.2   5.5   6     1.5   2.5   3     3     2.5   2.5   5.5   5.5   4     5     8.5
L-SaDE     5.3   5.5   1     6     5.5   3     8     7.5   7.5   5.5   5.5   4     5     4.5
DMS-L-PSO  5.7   5.5   2     3.5   4     3     8     7.5   7.5   5.5   5.5   8     5     8.5
L-CMA-ES   6.0   5.5   3     11    7.5   11    6     5     5     1     3     4     11    4.5
SPC-PNX    6.2   11    8     8     7.5   5.5   3     2.5   2.5   5.5   9     4     5     8.5
EDA        6.2   5.5   10    5     9     8     3     7.5   7.5   5.5   8     4     5     2.5
BLX-MA     6.6   5.5   7     1.5   5.5   7     8     7.5   7.5   10    5.5   10    5     6
K-PCX      7.0   5.5   5     3.5   2.5   1     10    11    10    11    1.5   11    10    8.5
DE         7.0   5.5   11    10    11    10    3     2.5   2.5   5.5   10    4     5     11
CoEVO      8.2   5.5   9     9     10    9     11    10    11    5.5   11    9     5     1

Table entries: Rank of the median of the final best function values from 25 runs
measured with two digits of precision
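The ranking behind this table can be sketched as follows. The sketch assumes that "two digits of precision" means rounding the median to two significant digits and that tied algorithms share the average of their ranks, which would explain entries such as 5.5. All names and the example values are hypothetical.

```python
from statistics import median


def round_two_digits(value):
    """Round to two significant digits (assumed meaning of 'two digits')."""
    return float(f"{value:.1e}")


def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def rank_algorithms(final_values):
    """Rank algorithms by the rounded median of their final function values."""
    names = list(final_values)
    medians = [round_two_digits(median(final_values[name])) for name in names]
    return dict(zip(names, average_ranks(medians)))


# Hypothetical final function values of three algorithms (three runs each).
example = {
    "A": [20.31, 20.35, 20.28],
    "B": [20.33, 20.29, 20.34],
    "C": [23.1, 22.9, 23.0],
}
print(rank_algorithms(example))
```

Here A and B both round to a median of 20 and share rank 1.5, which mirrors columns of the table where many algorithms receive the same rank.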
Never Solved Multimodal Functions

Rank of Final Function Value
n = 30, $\mathrm{FEmax} = 3 \times 10^5$

Algorithm  mean
G-CMA-ES   4.1
EDA        4.8
BLX-MA     4.9
SPC-PNX    5.0
BLX-GL50   5.1
L-CMA-ES   5.6
DE         6.1
K-PCX      6.2
CoEVO      8.5
L-SaDE
DMS-L-PSO

Functions: 8 Ackley on Bounds, 13 Expanded, 14 Expanded Schaffer F6, 15 to 25 Hybrid
Ranks as listed in the source, one line per function; a line with fewer than 11 values lacks entries for some algorithms:
2.5  7.5  7.5  7.5  7.5  2.5  7.5  11   7.5  2.5  2.5
5    11   4    8    6.5  2    6.5  10   9    1    3
6.5  6.5  6.5  6.5  2    10   6.5  10   6.5  2    2
1    8    4.5  8    4.5  2    8    11   10   4.5  4.5
1    7    10   4.5  4.5  3    8    2    9    6
7    5    2    3    10   8    1    9
5.5  1    3    5.5  5.5  8.5  5.5  2    10   8.5
5.5  1    3    5.5  5.5  8.5  5.5  2    10   8.5
5.5  1    3    5.5  5.5  8.5  5.5  2    10   8.5
4.5  4.5  4.5  4.5  4.5  4.5  4.5  9    4.5
1    3    7    3    5    3    6    9    8
3.5  3.5  3.5  3.5  7    3.5  3.5  9    8
6    2.5  2.5  2.5  7    9    2.5  5    8
4    4    4    4    4    4    8    4    9
Table entries: Rank of the median of the final best function values from 25 runs
measured with two digits of precision