You are on page 1of 14

Compilation of Results on the 2005 CEC

Benchmark Function Set


Nikolaus Hansen
Computational Laboratory (CoLab)
Institute of Computational Science
ETH Zurich

May 4, 2006

Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

A Note on Evaluation Criteria


Quantitative performance measurements, the Success
Performance, abbreviated SP or FEs, is used as a
measure for the expected number of function evaluations
to reach a target function value.
Invariance is a non-empirical statement on the ability to
generalize performance results. Invariance guarantees
identical performance on a class of functions. Possible
invariances are
invariance against translation, scaling, or even order
preserving transformations of the objective function value
invariance against angle preserving (rigid) transformations
of the search space (translation, rotation)

Meta-Parameters
how many parameters of the algorithm need to be adjusted
to the object function?
how many different setups were tested?
how many different setups were finally used?
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Methods
task: black-box optimization of 25 benchmark functions
25 runs on each benchmark function for each dimension
n = 10, 30
a run is successful if the global optimum is reached with
the given precision, before the
maximumnumber of function evaluations
105 for n = 10
FEmax =
is reached
3 105 for n = 30
Remark
the setting of FEmax has a remarkable influence on the results, if the target function value can be
reached only for a (slightly) larger number of function evaluations with a high probability. Where
FEs FEmax the result must be taken with great care.
Reference
Suganthan, Hansen, Liang, Deb, Chen, Auger, and Tiwari (2005). Problem Definitions and Evaluation
Criteria for the CEC 2005 Special Session on Real-Parameter Optimization, Technical report, Nanyang
Technological University, Singapore, May 2005, http://www.ntu.edu.sg/home/EPNSugan

Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

References to Algorithms

BLX-GL50
BLX-MA
CoEVO
DE
DMS-L-PSO
EDA
G-CMA-ES
K-PCX
L-CMA-ES
L-SaDE
SPC-PNX

Garca-Martnez and Lozano (Hybrid Real-Coded. . . )


Molina et al. (Adaptive Local Search. . . )
Posk (Real-Parameter Optimization. . . )

Ronkk
onen
et al. (Real-Parameter Optimization. . . )
Liang and Suganthan (Dynamic Multi-Swarm. . . )
Yuan and Gallagher (Experimental Results. . . )
Auger and Hansen (A Restart CMA. . . )
Sinha et al. (A Population-Based,. . . )
Auger and Hansen (Performance Evaluation. . . )
Qin and Suganthan (Self-Adaptive Differential. . . )
Ballester et al. (Real-Parameter Optimization. . . )
In: CEC 2005 IEEE Congress on Evolutionary Computation, Proceedings

Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Summarized Results
Empirical Distribution of Normalized Success Performance
n = 10

0.9

DE

0.8

LSaDE
DMSLPSO

0.7

EDA

0.6

BLXGL50

0.5

LCMAES

0.4

KPCX

0.3

SPCPNX

0.2

BLXMA

0.1

CoEVO

0 0
10

10

10
FEs / FEsbest

FEs = mean(#fevals)

#all runs (25)


,
#successful runs

GCMAES

0.9

KPCX

0.8

LCMAES
DMSLPSO

0.7

LSaDE

0.6

EDA

0.5

SPCPNX

0.4

BLXGL50

0.3

CoEVO

0.2

BLXMA

0.1
0 0
10

n = 30

GCMAES

empirical distribution over all functions

empirical distribution over all functions

DE
1

10

10
FEs / FEsbest

where #fevals includes only successful runs.

Shown: empirical distribution function of the Success Performance FEs divided by


FEs of the best algorithm on the respective function.
Results of all functions are used where at least one algorithm was successful at least once, i.e. where
the target function value was reached in at least one experiment (out of 11 25 experiments).

Small values for FEs and therefore large (cumulative frequency) values in the graphs
are preferable.
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Function Sets

We split the function set into three subsets


unimodal functions
solved multimodal functions
at least one algorithm conducted at least one
successful run

unsolved multimodal functions


no single run was successful for any algorithm

Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Unimodal Functions
Normalized Success Performance, Tabulated
n = 10, FEmax = 100000
solved success
rate
functions

G-CMA-ES
EDA
DE
L-CMA-ES
BLX-GL50
DMS-L-PSO
L-SaDE
SPC-PNX
CoEVO
K-PCX
BLX-MA

6
6
6
6
5
5
5
4
4
4
3

100%
97%
96%
88%
83%
80%
77%
67%
67%
62%
49%

1S

re
phe

1000
1.6(25)
10.0(25)
29.0(25)
1.7(25)
19.0(25)
12.0(25)
10.0(25)
6.7(25)
23.0(25)
1.0(25)
12.0(25)

6
e
ds
10
oun
Nois
ion
with
on B
ndit
2
2
6
.
.
.
o
ck
1
1
2
C
efel
efel
efel
oid
nbro
ose
chw
llips
chw
chw
2S
3E
4S
5S
6R

2400
6500
1.0(25) 1.0(25)
4.6(25) 2.5(23)
19.2(25) 18.5(20)
1.1(25) 1.0(25)
17.1(25)
[9]
5.0(25) 1.8(25)
4.2(25) 8.0(16)
12.9(25)
[11]
11.3(25) 6.8(25)
1.0(25)
[8]
15.4(25)
[10]

2900
5900
1.0(25) 1.0(25)
4.1(25) 4.2(25)
17.9(25) 6.9(25)
65.5( 7) 1.0(25)
14.5(25) 4.7(25)
[11] 18.6(20)
15.9(24)
[9]
10.7(25) 6.8(25)
16.2(25)
[10]
19.7(21)
[11]
25.9(24)
[8]

First row: Success Performance FEs = mean(#fevals)


algorithm, where #fevals includes only successful runs

#all runs (25)


#successful runs

7100
1.5(25)
9.6(22)
6.6(24)
1.3(25)
7.3(25)
7.7(25)
6.9(25)
[10]
[11]
1.0(22)
[9]
of the best

Table entries: Success Performance FEs divided by FEs of the best algorithm (first
row), (number of successful runs in round brackets), [rank of median final function
value, where FEs is not available, in square brackets]
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Unimodal Functions
Normalized Success Performance, Tabulated

n = 30, FEmax = 300000


1S

solved success
functions rate

G-CMA-ES
L-CMA-ES
EDA
DMS-L-PSO
BLX-GL50
SPC-PNX
K-PCX
L-SaDE
BLX-MA
DE
CoEVO

6
5
4
4
3
4
3
3
1
1
2

90%
83%
67%
63%
50%
45%
43%
41%
17%
17%
7%

re
phe

2700
1.7(25)
1.8(25)
55.6(25)
1.9(25)
21.5(25)
11.1(25)
1.0(25)
7.4(25)
11.9(25)
51.9(25)
519 ( 3)

6
ds
oise
10
oun
ith N
nB
ition
.6 o
.2 w
ond
ck
2
1
C
l
l
r
e
e
ef
ef
oid
nb o
ose
llips
chw
chw
3E
4S
5S
6R

.2

1
efel
chw

2S

12000
1.1(25)
1.2(25)
13.3(25)
10.8(25)
13.3(25)
26.7(22)
1.0(25)
12.5(24)
[10]
[11]
70.0( 8)

43000
1.0(25)
1.0(25)
5.1(25)
7.9(21)
[7]
[11]
[6]
[8]
[10]
[9]
[5]

59000
1.0(10)
[11]
3.4(25)
[9]
[6]
6.1(19)
[8]
9.2(13)
[7]
[5]
[10]

66000 60000
1.0(25) 1.0(25)
1.1(25) 1.1(25)
[3]
[10]
[9] 5.5(24)
[6] 3.7(25)
[10] 86.7( 1)
[7] 1.1(14)
[4]
[8]
[7]
[8]
[5]
[7]
[11]
[11]

#all runs (25)


First row: Success Performance FEs = mean(#fevals) #successful
of the
runs
best algorithm, where #fevals includes only successful runs

Table entries: Success Performance FEs divided by FEs of the best algorithm
(first row), (number of successful runs in round brackets), [rank of median final
function value, where FEs is not available, in square brackets]
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Unimodal Functions
Empirical Distribution of Normalized Success Performance
n = 10

0.9

EDA

0.8

DE
LCMAES

0.7

LSaDE

0.6

DMSLPSO

0.5

BLXGL50

0.4

SPCPNX

0.3

KPCX

0.2

CoEVO

0.1

BLXMA

0 0
10

10

FEs / FEsbest

10

GCMAES

0.9

LCMAES

0.8

DMSLPSO
EDA

0.7

SPCPNX

0.6

KPCX

0.5

LSaDE

0.4

BLXGL50

0.3

CoEVO

0.2

BLXMA

0.1
0 0
10

n = 30

GCMAES

empirical distribution over unimodal functions

empirical distribution over unimodal functions

DE
1

10

10
FEs / FEsbest

Empirical distribution function of the Success Performance FEs divided by FEs of the
best algorithm (table entries of last slides).
#all runs (25)
FEs = mean(#fevals) #successful
, where #fevals includes only successful runs.
runs
Small values of FEs and therefore large values in the empirical distribution graphs are
preferable.
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Solved Multimodal Functions


Normalized Success Performance, Tabulated

n = 10, FEmax = 100000


solved success
functions rate

G-CMA-ES
L-SaDE
DMS-L-PSO
K-PCX
DE
L-CMA-ES
BLX-GL50
BLX-MA
EDA
SPC-PNX
CoEVO

5
4
4
3
5
2
3
2
3
2
0

63%
53%
47%
40%
30%
25%
17%
15%
9%
1%
0%

nk

t
ou

wa
r ie
7G

ds

ble

un

Bo

tr

as
9R

4700
17000
1.0(25) 4.5(19)
36.2( 6) 1.0(25)
126 ( 4) 2.1(25)
[10] 2.9(24)
255 ( 2) 10.6(11)
1.2(25)
[11]
12.3( 9) 10.0( 3)
[11] 5.7(18)
404 ( 1)
[9]
383 ( 1)
[8]
[9]
[10]

First row: Success Performance FEs =

str

10

Ra

ed

tat

ara

ep

S
igin

igin

Ro

as
str
ier
e
1W

s
12

fel
we
ch

3
2.1
15

Hy

ep
dS

ara

ble

bri

55000 190000
8200 33000
1.2(23) 1.4( 6) 4.0(22)
[6]
[5]
[8] 3.9(25) 1.0(23)
[3]
[7] 6.6(19) 1.7(22)
1.0(22)
[10] 1.0(14)
[11]
[9] 1.0(12) 8.8(19) 75.8( 1)
[10]
[6] 11.6(12)
[6]
[5]
[5] 12.1(13)
[9]
[7]
[9]
[9] 8.5( 5)
[4] 2.9( 3) 4.3(10)
[9]
[8] 5.8( 1)
[10]
[6]
[11]
[11]
[11]
[8]
#fevals
psucc

of the best algorithm

Table entries: Success Performance FEs normalized (divided) by the first row,
(number of successful runs in round brackets), [rank of median final function value,
where FEs is not available, in square brackets]
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Solved Multimodal Functions


Normalized Success Performance, Tabulated

n = 30, FEmax = 300000


solved success
functions rate

K-PCX
G-CMA-ES
L-SaDE
DMS-L-PSO
EDA
BLX-GL50
DE
L-CMA-ES
SPC-PNX
CoEVO
BLX-MA

4
5
2
2
1
1
1
1
1
1
1

38%
37%
36%
22%
20%
20%
20%
20%
13%
9%
7%

nk

wa
r ie

ds

un

Bo

9R

le
rab

pa

trig

as

7G

6100
2.5(10)
1.0(25)
21.3(20)
9.8(24)
21.3(25)
10.2(25)
32.8(22)
1.1(25)
60.7(16)
93.4(11)
[11]

t
ou

e
nS

str

10

Ra

tat

igin

Ro

ed

11

ie
We

ras
rst

ble
ara
.13
ep
l2
e
S
f
e
d
bri
hw
Sc
Hy
12
15

99000 450000 5000000 180000


3.3(18) 1.0(14)
[7] 1.0( 5) [11]
8.0( 9) 5.3( 3)
1.0( 1) 1.3( 8) [1]
1.0(25)
[4]
[5]
[4] [2]
[6]
[5]
[5] 8.3( 4) [4]
[10]
[9]
[10]
[7] [4]
[5]
[3]
[4]
[8] [4]
[6]
[6]
[10]
[5] [8]
[11]
[11]
[3]
[9] [8]
[8]
[7]
[2]
[10] [8]
[9]
[10]
[9]
[11] [10]
6.7( 9)
[8]
[8]
[6] [4]

#all runs (25)


First row: Success Performance FEs = mean(#fevals) #successful
of the
runs
best algorithm, where #fevals includes only successful runs

Table entries: Success Performance FEs divided by FEs of the best algorithm
(first row), (number of successful runs in round brackets), [rank of median final
function value, where FEs is not available, in square brackets]
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Multimodal Functions
n = 10

GCMAES

0.9

DE

0.8

LSaDE
DMSLPSO

0.7

KPCX

0.6

BLXGL50

0.5

EDA

0.4

BLXMA

0.3

LCMAES

0.2

SPCPNX

0.1

CoEVO

0 0
10

10

10
FEs / FEsbest

empirical distribution over multimodal functions

empirical distribution over multimodal functions

Empirical Distribution of Normalized Success Performance


n = 30

GCMAES

0.9

KPCX

0.8

DMSLPSO
LSaDE

0.7

LCMAES

0.6

BLXMA

0.5

BLXGL50

0.4

EDA

0.3

DE

0.2

SPCPNX

0.1
0 0
10

CoEVO
1

10

FEs / FEsbest

10

Empirical distribution function of the Success Performance FEs divided by FEs of the
best algorithm (table entries of last slides).
#all runs (25)
FEs = mean(#fevals) #successful
, where #fevals includes only successful runs.
runs
Small values of FEs and therefore large values in the empirical distribution graphs are
preferable.
Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Never Solved Multimodal Functions


Rank of Final Function Value

n = 10, FEmax = 105

G-CMA-ES
BLX-GL50
L-SaDE
DMS-L-PSO
L-CMA-ES
SPC-PNX
EDA
BLX-MA
K-PCX
DE
CoEVO

mean
3.8
4.2
5.3
5.7
6.0
6.2
6.2
6.6
7.0
7.0
8.2

s
6
2
ion inuou
rF
dit
e
t
10
ffe
ds
on
on
ois
un
w
ion 6&7 Scha ated
o
C
C
t
o
N
o
i
h
B
d
n
t
rr
h
1
4
8
tB
ed
ed
on
F2
Hig d No d F2 d ou
Ro d wit d F1 d Na d on
y C pand pand brid
ri
rid
rid
ri
ri
ri
ri
ri
ri
e
b
b
b
b
b
b
b
b
b
l
ck
Hy 7 Hy 8 Hy 9 Hy 0 Hy 1 Hy 2 Hy 3 Hy 4 Hy 5 Hy
Ex 4 Ex
8A
1
16
1
1
2
2
2
2
2
2
13
1

5.5
5.5
5.5
5.5
5.5
11
5.5
5.5
5.5
5.5
5.5

4
6
1
2
3
8
10
7
5
11
9

7
1.5
6
3.5
11
8
5
1.5
3.5
10
9

1
2.5
5.5
4
7.5
7.5
9
5.5
2.5
11
10

5.5
3
3
3
11
5.5
8
7
1
10
9

3
3
8
8
6
3
3
8
10
3
11

2.5
2.5
7.5
7.5
5
2.5
7.5
7.5
11
2.5
10

2.5
2.5
7.5
7.5
5
2.5
7.5
7.5
10
2.5
11

5.5
5.5
5.5
5.5
1
5.5
5.5
10
11
5.5
5.5

1.5
5.5
5.5
5.5
3
9
8
5.5
1.5
10
11

4
4
4
8
4
4
4
10
11
4
9

5
5
5
5
11
5
5
5
10
5
5

2.5
8.5
4.5
8.5
4.5
8.5
2.5
6
8.5
11
1

Table entries: Rank of the median of the final best function values from 25 runs
measured with two digits of precision

Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

Never Solved Multimodal Functions


Rank of Final Function Value

n = 30, FEmax = 3 105

G-CMA-ES
EDA
BLX-MA
SPC-PNX
BLX-GL50
L-CMA-ES
DE
K-PCX
CoEVO
L-SaDE
DMS-L-PSO

mean
4.1
4.8
4.9
5.0
5.1
5.6
6.1
6.2
8.5

6
2
ion inuo
rF e
dit
e
10
ffe
ds
on Cont
d
ois
ha arabl
un
w
e
ion 6&7
C
c
t
t
o
N
o
i
S
h
B
d
n
ta
rr
p
h
1
8
4
ed
ed
on
Se d Ro d wit d F1 d Na d on
F2 d Hig d No d F2
y C pand pand brid
ri
ri
ri
ri
ri
rid
ri
ri
ri
y
yb
yb Hyb
yb
yb
yb
yb Hyb
yb
yb
x
x
kle
c
E
H
H
H
H
H
H
H
H
E
H
14
15
16
17
18
19
20
21
22
23
24
25
8A
13

2.5
7.5
7.5
7.5
7.5
2.5
7.5
11
7.5
2.5
2.5

5
11
4
8
6.5
2
6.5
10
9
1
3

6.5
6.5
6.5
6.5
2
10
6.5
10
6.5
2
2

1
8
4.5
8
4.5
2
8
11
10
4.5
4.5

1
7
10
4.5
4.5
3
8
2
9

6
7
5
2
3
10
8
1
9

5.5
1
3
5.5
5.5
8.5
5.5
2
10

8.5

5.5
1
3
5.5
5.5
8.5
5.5
2
10

8.5

5.5
1
3
5.5
5.5
8.5
5.5
2
10

8.5

4.5
4.5
4.5
4.5
4.5
4.5
4.5
9
4.5

1
3
7
3
5
3
6
9
8

3.5
3.5
3.5
3.5
7
3.5
3.5
9
8

6
2.5
2.5
2.5
7
9
2.5
5
8

4
4
4
4
4
4
8
4
9

Table entries: Rank of the median of the final best function values from 25 runs
measured with two digits of precision

Nikolaus Hansen

Compilation of Results on the 2005 CEC Benchmark Function Set

You might also like