Performance of Evolutionary Algorithms on NK Landscapes with Nearest Neighbor Interactions and Tunable Overlap

Martin Pelikan, Kumara Sastry, David E. Goldberg, Martin V. Butz, and Mark Hauschild

MEDAL Report No. 2009002

January 2009

Abstract
This paper presents a class of NK landscapes with nearest-neighbor interactions and tunable overlap. The considered class of NK landscapes is solvable in polynomial time using dynamic programming; this allows us to generate a large number of random problem instances with known optima. Several variants of standard genetic algorithms and estimation of distribution algorithms are then applied to the generated problem instances. The results are analyzed and related to scalability theory for selectorecombinative genetic algorithms and estimation of distribution algorithms.

Keywords
NK fitness landscape, hierarchical BOA, genetic algorithm, univariate marginal distribution algorithm, performance analysis, scalability, crossover, hybridization.

Missouri Estimation of Distribution Algorithms Laboratory (MEDAL)
Department of Mathematics and Computer Science
University of Missouri–St. Louis
One University Blvd., St. Louis, MO 63121
E-mail: medal@cs.umsl.edu
WWW: http://medal.cs.umsl.edu/


1 Introduction

Testing on random instances of challenging classes of problems is an important approach to testing optimization techniques. NK fitness landscapes (Kauffman, 1989; Kauffman, 1993) were introduced by Kauffman as tunable models of rugged fitness landscapes and are one of the most popular classes of problems used in testing optimization techniques on random problem instances. An NK landscape is a function defined on binary strings of fixed length and is characterized by two parameters: (1) n, the overall number of bits, and (2) k, the neighborhood size. For each bit, k neighbors are specified and a function is given that determines the fitness contribution for any combination of values of the bit and its neighbors. In general, neighbors for each bit can be chosen arbitrarily. The problem of finding the global optimum of NK landscapes is NP-complete for k > 1, although some variants of NK landscapes are polynomially solvable and there exist approximation algorithms for other cases (Wright, Thompson, & Zhang, 2000; Gao & Culberson, 2002; Choi, Jung, & Kim, 2005). Nonetheless, NK landscapes remain a challenge for any optimization algorithm and they are also interesting from the perspective of complexity theory and computational biology (Kauffman, 1993; Altenberg, 1997; Wright, Thompson, & Zhang, 2000; Gao & Culberson, 2002; Aguirre & Tanaka, 2003; Choi, Jung, & Kim, 2005). Despite that, only a few in-depth empirical studies discuss the performance of selectorecombinative genetic algorithms and estimation of distribution algorithms on NK landscapes and relate empirical results to existing scalability theory.

The purpose of this paper is to propose a class of NK landscapes with nearest-neighbor interactions and tunable overlap, and to present an in-depth empirical performance analysis of various genetic and evolutionary algorithms on the proposed class of NK landscapes. The paper considers the genetic algorithm with two-point and uniform crossover, the univariate marginal distribution algorithm, and the hierarchical Bayesian optimization algorithm. All algorithms are improved by incorporating a simple deterministic local search based on single-bit flips. To provide insight into the effects of overlap between subproblems on algorithm performance, the number of bits that overlap between consecutive subproblems is controlled by a user-specified parameter. All considered types of NK landscape instances are challenging yet solvable in polynomial time using dynamic programming; this allows us to consider a large number of random instances with known optima in practical time. The results are related to existing scalability theory and interesting directions for future research are outlined.

The work presented in this paper combines and extends some of the facets of two previous studies on similar problems, specifically, the analysis of evolutionary algorithms on random additively decomposable problems (Pelikan, Sastry, Butz, & Goldberg, 2006; Sastry, Pelikan, & Goldberg, 2007) and the analysis of evolutionary algorithms on standard NK landscapes (Pelikan, Sastry, Butz, & Goldberg, 2008).

The paper starts by describing NK landscapes in section 2. Section 3 outlines the dynamic programming algorithm which can find guaranteed global optima of the considered instances in polynomial time. Section 4 outlines the compared algorithms. Section 5 presents experimental results. Section 6 discusses future work. Finally, section 7 summarizes and concludes the paper.

2 NK Landscapes

This section describes NK landscapes. First, approaches to testing evolutionary algorithms are briefly discussed. The general form of NK landscapes is then described. Next, the classes of NK landscape instances considered in this paper are discussed and the method used to generate random problem instances of these classes is outlined.

2.1 Testing Evolutionary Algorithms

There are three basic approaches to testing optimization techniques:

(1) Testing on the boundary of the design envelope using artificial, adversarial test problems. For example, fully deceptive concatenated traps (Ackley, 1987; Deb & Goldberg, 1991) represent a class of artificial test problems that can be used to test whether the optimization algorithm can automatically decompose the problem and exploit the discovered decomposition effectively. Testing on artificial problems on the boundary of the design envelope is also a common practice outside standard optimization; for example, consider testing car safety using car crash tests or testing durability of cell phones using drop tests.

(2) Testing on classes of random problems. For example, to test algorithms for solving maximum satisfiability (MAXSAT) problems, large sets of random formulas in conjunctive normal form can be generated and analyzed (Cheeseman, Kanefsky, & Taylor, 1991). Similar approaches to testing are common outside standard optimization as well; for example, new software products are often tested by groups of beta testers in order to discover problems in situations that were not anticipated during testing on the boundary of the design envelope.

(3) Testing on real-world problems or their approximations. For example, the problem of designing military antennas can be considered for testing (Santarelli, Goldberg, & Yu, 2004). As an example outside optimization, when a new model of an airplane has been designed and thoroughly tested on the boundary of its design envelope, it is ready to take its real-world test: the first actual flight.

In this paper, we focus on testing on classes of random problems. More specifically, we consider random instances of a restricted class of NK landscapes with nearest-neighbor interactions and tunable overlap. All considered instances are solvable in low-order polynomial time with a dynamic programming algorithm. This allows us to generate a large number of random problem instances with known optima in practical time. The main focus is on simple hybrids based on standard selectorecombinative genetic algorithms, estimation of distribution algorithms, and deterministic hill climbing based on single-bit flips.

The general form of NK landscapes is described next. Then, the considered class of NK landscapes and the procedure for generating random problem instances are outlined.

2.2 Problem Definition

An NK fitness landscape (Kauffman, 1989; Kauffman, 1993) is fully defined by the following components:

• The number of bits, $n$.
• The number of neighbors per bit, $k$.
• A set of $k$ neighbors $\Pi(X_i)$ for the $i$-th bit, $X_i$, for every $i \in \{0, \ldots, n-1\}$.
• A subfunction $f_i$ defining a real value for each combination of values of $X_i$ and $\Pi(X_i)$ for every $i \in \{0, \ldots, n-1\}$. Typically, each subfunction is defined as a lookup table with $2^{k+1}$ values.

The objective function $f_{nk}$ to maximize is defined as

\[ f_{nk}(X_0, X_1, \ldots, X_{n-1}) = \sum_{i=0}^{n-1} f_i(X_i, \Pi(X_i)). \]

The difficulty of optimizing NK landscapes depends on all four components defining an NK problem instance. One useful approach to analyzing complexity of NK landscapes is to focus on the influence of k on problem complexity. For k = 0, NK landscapes are simple unimodal functions similar to onemax or binint, which can be solved in linear time and should be easy for practically any genetic and evolutionary algorithm. The global optimum of NK landscapes can be obtained in polynomial time (Wright, Thompson, & Zhang, 2000) even for k = 1; on the other hand, for k > 1, the problem of finding the global optimum of unrestricted NK landscapes is NP-complete (Wright et al., 2000). The problem becomes polynomially solvable with dynamic programming even for k > 1 if the neighbors are restricted to only adjacent string positions (Wright et al., 2000) or if the subfunctions are generated according to some distributions (Gao & Culberson, 2002). For unrestricted NK landscapes with k > 1, a polynomial-time approximation algorithm exists with the approximation threshold $1 - 1/2^{k+1}$ (Wright et al., 2000).
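To make the definition concrete, the following minimal sketch evaluates a general NK landscape from its neighbor lists and lookup tables. The function name `nk_fitness` and the table indexing convention (the bit $X_i$ is the most significant bit of the index, followed by its neighbors in the listed order) are assumptions made for illustration only; the definition above does not prescribe any particular encoding.

```python
def nk_fitness(x, neighbors, tables):
    """Sum the contributions f_i(X_i, Pi(X_i)) over all bits i.

    x         -- list of 0/1 values of length n
    neighbors -- neighbors[i] lists the positions of the k neighbors of bit i
    tables    -- tables[i] is the lookup table of f_i with 2**(k+1) entries
    """
    total = 0.0
    for i, (nbrs, table) in enumerate(zip(neighbors, tables)):
        # Assumed encoding: X_i is the most significant bit of the index,
        # followed by the neighbors in the order they are listed.
        index = x[i]
        for j in nbrs:
            index = (index << 1) | x[j]
        total += table[index]
    return total


# Tiny hand-made instance with n = 2 and k = 1 (illustrative values only).
neighbors = [[1], [0]]
tables = [[0, 1, 2, 3], [10, 20, 30, 40]]
print(nk_fitness([1, 0], neighbors, tables))
```

For the hand-made instance, the first subfunction is looked up at index 2 and the second at index 1, so the two contributions are 2 and 20.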

2.3 NK Instances with Nearest Neighbors and Tunable Overlap

In this paper we consider NK instances with the following two restrictions:

1. Neighbors of each bit are restricted to the $k$ bits that immediately follow this bit. When fewer than $k$ bits are left to the right of the considered bit, the neighborhood contains all the bits to the right of the considered bit.

2. Some subproblems may be excluded to provide a mechanism for tuning the size of the overlap between consecutive subproblems. Specifically, the fitness is defined as

\[ f_{nk}(X_0, X_1, \ldots, X_{n-1}) = \sum_{i=0}^{\lfloor (n-1)/step \rfloor} f_i\bigl(X_{i \times step}, \Pi(X_{i \times step})\bigr), \]

where step ∈ {1, 2, . . . , k + 1} is a parameter denoting the step with which the basis bits are selected. For standard NK landscapes, step = 1. With larger values of step, the amount of overlap between consecutive subproblems can be reduced. For step = k + 1, the problem becomes separable (the subproblems are fully independent).

The reason for restricting neighborhoods to nearest neighbors is to ensure that the problem instances can be solved in polynomial time even for k > 1 using a simple dynamic programming algorithm. The main motivation for introducing the step parameter is to provide a mechanism for tuning the strength of the overlap between different subproblems. The resulting class of problems is a subset of standard, unrestricted NK landscapes (Kauffman, 1989; Kauffman, 1993). Furthermore, the resulting instances are a superset of the polynomially solvable random additively decomposable problems introduced in Pelikan, Sastry, Butz, and Goldberg (2006). The subfunctions in the considered class of NK landscapes are encoded as look-up tables; thus, the subfunctions can be defined arbitrarily.
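The effect of the step parameter on the structure of an instance can be illustrated with a short sketch that lists the string positions covered by each subproblem. The function name `subproblem_positions` is hypothetical; positions refer to the original ordering of the string.

```python
def subproblem_positions(n, k, step):
    """Positions covered by each subfunction f_i.

    Subfunction f_i uses the basis bit i * step and the (up to) k bits
    that immediately follow it; neighborhoods are truncated at the end
    of the string.
    """
    if not 1 <= step <= k + 1:
        raise ValueError("step must lie in {1, ..., k + 1}")
    return [list(range(b, min(b + k + 1, n))) for b in range(0, n, step)]


for step in (1, 2, 3):
    print(step, subproblem_positions(7, 2, step))
```

With n = 7 and k = 2, step = 1 gives seven heavily overlapping subproblems, step = 2 gives subproblems that share a single bit with each neighbor, and step = 3 = k + 1 gives disjoint subproblems, which matches the separable case described above.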

2.4 Generating Random Instances

The overall number of subfunctions and the set of neighbors for each of these subfunctions are fully specified by parameters n, k, and step. The only components that vary from instance to instance for any valid combination of values of n, k, and step are the subfunctions themselves and the encoding, as described below.

The lookup table for all possible instantiations of bits in each subfunction is generated randomly using the same distribution for each entry in the table. Each of the values is generated using the uniform distribution over the interval [0, 1). To make the instances more challenging, string positions in each instance are shuffled randomly. This is done by reordering string positions according to a permutation drawn uniformly at random from all permutations.

The following section describes the dynamic programming approach which can be used to find guaranteed optima for the aforementioned class of NK instances in polynomial time.
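The generation procedure can be sketched as follows. This is an illustrative sketch rather than the generator used in the experiments: the names `generate_instance` and `fitness`, the `seed` argument, the table indexing convention, and the choice to give truncated subproblems at the end of the string correspondingly smaller tables are all assumptions.

```python
import random


def generate_instance(n, k, step, seed=None):
    """Random nearest-neighbor NK instance with shuffled string positions."""
    rng = random.Random(seed)
    subproblems = [list(range(b, min(b + k + 1, n))) for b in range(0, n, step)]
    # Every table entry is drawn independently and uniformly from [0, 1).
    tables = [[rng.random() for _ in range(2 ** len(pos))] for pos in subproblems]
    # perm[p] is the shuffled string position that holds original position p.
    perm = list(range(n))
    rng.shuffle(perm)
    return subproblems, tables, perm


def fitness(x, instance):
    """Evaluate a shuffled string x on the instance."""
    subproblems, tables, perm = instance
    total = 0.0
    for pos, table in zip(subproblems, tables):
        index = 0
        for p in pos:
            # Assumed encoding: first position of the subproblem is the
            # most significant bit of the table index.
            index = (index << 1) | x[perm[p]]
        total += table[index]
    return total


instance = generate_instance(n=20, k=3, step=2, seed=42)
print(fitness([0] * 20, instance))
```

Because the permutation only relabels string positions, an optimizer that is not told the permutation cannot rely on interacting bits being adjacent in the string.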

3 Dynamic Programming for Nearest-Neighbor NK Landscapes

The dynamic programming algorithm used to solve the described class of NK landscape instances is based on Pelikan et al. (2006). It uses the knowledge of the location of subproblems and the permutation imposed on the string positions, and considers subproblems in order from left to right according to the original permutation of string positions before shuffling. For example, consider the problem with $n = 7$, $k = 2$, and $step = 2$, which contains 4 subproblems defined on the following subsets of positions (according to the original permutation of the string positions): $\{0, 1, 2\}$ for the subproblem $f_0$, $\{2, 3, 4\}$ for $f_1$, $\{4, 5, 6\}$ for $f_2$, and $\{6\}$ for $f_3$. The dynamic programming algorithm processes the subproblems in the following order: $(f_0, f_1, f_2, f_3)$. For each subproblem, the optimal fitness contribution of this and the previous subproblems is computed for any combination of bits that overlap with the next subproblem to the right. The global optimum is then given by the computed fitness contribution of the last subproblem on the right.

Denoting by $o = k - step + 1$ the maximum number of bits in which the subproblems overlap and by $m$ the overall number of subproblems, the dynamic programming algorithm starts by creating a matrix $G = (g_{i,j})$ of size $m \times 2^o$. The element $g_{i,j}$ for $i \in \{0, 1, \ldots, m-1\}$ and $j \in \{0, 1, \ldots, 2^o - 1\}$ encodes the maximum fitness contribution of the first $(i + 1)$ subproblems where the $o$ (or fewer) bits that overlap with the next subproblem to the right are equal to $j$, using integer representation for these $o$ bits. The last few subproblems may overlap with the next subproblem in fewer than $o$ bits; that is why some entries in the matrix $G$ are not going to be used. For example, for the above example problem with $n = 7$, $k = 2$, and $step = 2$, $g_{1,0}$ represents the best fitness contribution of $f_0$ and $f_1$ (ignoring $f_2$ and $f_3$) under the assumption that the 5th bit (position 4) is 0; analogously, $g_{1,1}$ represents the best fitness contribution of $f_0$ and $f_1$ under the assumption that the 5th bit is 1.

The algorithm starts by considering all $2^{k+1}$ instances of the $k + 1$ bits in the first subproblem, and records the best found fitness for each combination of values of the $o$ (or fewer) bits that overlap with the second subproblem; the resulting values are stored in the first row of $G$ (elements $g_{0,j}$). Then, the algorithm goes through all the remaining subproblems from left to right. For the subproblem $f_i$, all $2^{k+1}$ instances of the $k + 1$ bits in this subproblem are examined; the only exception may be the right-most subproblems, for which the neighborhood may be restricted due to the fixed string length. For each instance, the algorithm first looks at the column $j'$ of $G$ that corresponds to the $o$ (or fewer) bits of the subproblem $f_i$ that overlap with the previous subproblem $f_{i-1}$. The fitness contribution is computed as the sum of $g_{i-1,j'}$ and the fitness contribution of the considered instance of $f_i$. For each possible instantiation of bits that overlap with the next subproblem, the optimum fitness contribution is recorded in matrix $G$, forming the next row (that is, row $i$) of the matrix.

After processing all subproblems, the value of the global optimum is equal to the fitness contribution stored in the first element of the last row of $G$. The values that lead to the optimum fitness can be found by examining all choices made when choosing the best combination of bits in each subproblem.
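The left-to-right recurrence described in this section can be sketched as follows. This is a minimal illustration under several assumptions, not the implementation used in the paper: it works on the original (unshuffled) positions, stores each row of $G$ as a dictionary keyed by the values of the overlapping bits instead of an $m \times 2^o$ matrix, returns only the optimum value without backtracking to recover the optimal string, and uses hypothetical names (`dp_optimum`, `brute_force_optimum`, `table_index`). A brute-force check on a small instance is included for comparison.

```python
from itertools import product
import random


def subproblems(n, k, step):
    """Positions of each subproblem in the original (unshuffled) order."""
    return [list(range(b, min(b + k + 1, n))) for b in range(0, n, step)]


def table_index(bits):
    """Encode a sequence of bits as an integer, first bit most significant."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


def dp_optimum(n, k, step, tables):
    """Maximum fitness, processing subproblems from left to right."""
    subs = subproblems(n, k, step)
    best = {(): 0.0}   # plays the role of the previous row of G
    prev_shared = []   # positions shared with the previous subproblem
    for i, pos in enumerate(subs):
        following = subs[i + 1] if i + 1 < len(subs) else []
        next_shared = [p for p in pos if p in following]
        row = {}
        for bits in product((0, 1), repeat=len(pos)):
            value = dict(zip(pos, bits))
            prev_key = tuple(value[p] for p in prev_shared)
            total = best[prev_key] + tables[i][table_index(bits)]
            key = tuple(value[p] for p in next_shared)
            if total > row.get(key, float("-inf")):
                row[key] = total
        best, prev_shared = row, next_shared
    # The last subproblem shares no bits with a successor: single entry.
    return best[()]


def brute_force_optimum(n, k, step, tables):
    """Exhaustive search over all 2**n strings; only for small n."""
    subs = subproblems(n, k, step)
    return max(
        sum(t[table_index(x[p] for p in pos)] for pos, t in zip(subs, tables))
        for x in product((0, 1), repeat=n)
    )


rng = random.Random(0)
n, k, step = 7, 2, 2
tables = [[rng.random() for _ in range(2 ** len(pos))]
          for pos in subproblems(n, k, step)]
assert abs(dp_optimum(n, k, step, tables)
           - brute_force_optimum(n, k, step, tables)) < 1e-9
```

Each subproblem is processed by enumerating at most $2^{k+1}$ bit combinations, so the running time of this sketch grows linearly with the number of subproblems for fixed k. Handling a shuffled instance would only require mapping positions through the known permutation.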

4 Compared Algorithms

This section outlines the optimization algorithms discussed in this paper: (1) the genetic algorithm (GA) (Holland, 1975; Goldberg, 1989), (2) the univariate marginal distribution algorithm (UMDA) (Mühlenbein & Paaß, 1996), and (3) the hierarchical Bayesian optimization algorithm (hBOA) (Pelikan & Goldberg, 2001; Pelikan, 2005). Additionally, the section describes the deterministic hill climber (DHC) (Pelikan & Goldberg, 2003), which is incorporated into all compared algorithms to improve their performance. In all compared algorithms, candidate solutions are represented by binary strings of n bits.

The genetic algorithm (GA) (Holland, 1975; Goldberg, 1989) evolves a population of candidate solutions with the first population generated at random according to the uniform distribution over all binary strings. Each iteration starts by selecting promising solutions from the current population; we use binary tournament selection without replacement. New solutions are created by applying variation operators to the population of selected solutions. Specifically, crossover is used to exchange bits and pieces between pairs of candidate solutions and mutation is used to perturb the resulting solutions. Here we use uniform or two-point crossover, and bit-flip mutation (Goldberg, 1989). To maintain useful diversity in the population, the new candidate solutions are incorporated into the original population using restricted tournament selection (RTS) (Harik, 1995). The run is terminated when termination criteria are met. In this paper, each run is terminated either when the global optimum has been found or when a maximum number of iterations has been reached.

The univariate marginal distribution algorithm (UMDA) (Mühlenbein & Paaß, 1996) proceeds similarly to GA. However, instead of using crossover and mutation to create new candidate solutions, UMDA learns a probability vector (Juels, 1998; Baluja, 1994) for the selected solutions and generates new candidate solutions from this probability vector. The probability vector stores the proportion of 1s in each position of the selected population. Each bit of a new candidate solution is set to 1 with the probability equal to the proportion of 1s in this position; otherwise, the bit is set to 0. Consequently, the variation operator of UMDA preserves the proportions of 1s in each position while decorrelating different string positions.

The hierarchical Bayesian optimization algorithm (hBOA) (Pelikan & Goldberg, 2001; Pelikan, 2005) proceeds similarly to UMDA. However, to model promising solutions and generate new candidate solutions, Bayesian networks with local structures (Chickering, Heckerman, & Meek, 1997; Friedman & Goldszmidt, 1999) are used instead of the simple probability vector of UMDA.

The deterministic hill climber (DHC) is incorporated into GA, UMDA and hBOA to improve their performance. DHC takes a candidate solution represented by an n-bit binary string as input. Then, it repeatedly performs the one-bit change that leads to the maximum improvement of solution quality. DHC is terminated when no single-bit flip improves solution quality and the solution is thus locally optimal. Here, DHC is used to improve every solution in the population before the evaluation is performed.
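The deterministic hill climber can be sketched in a few lines. The function name, the tie-breaking rule (the lowest improving position wins), and the full re-evaluation of fitness after every trial flip are assumptions of this sketch; a practical implementation would likely re-evaluate only the subfunctions that contain the flipped bit.

```python
def deterministic_hill_climber(x, fitness):
    """Steepest-ascent single-bit-flip local search.

    Returns the locally optimal string, its fitness, and the number of
    flips that were actually applied.
    """
    x = list(x)
    current = fitness(x)
    flips = 0
    while True:
        best_gain, best_pos = 0.0, None
        for i in range(len(x)):
            x[i] ^= 1                      # try flipping bit i
            gain = fitness(x) - current
            x[i] ^= 1                      # undo the trial flip
            if gain > best_gain:
                best_gain, best_pos = gain, i
        if best_pos is None:               # no flip improves the solution
            return x, current, flips
        x[best_pos] ^= 1                   # apply the best flip
        current = fitness(x)
        flips += 1


# Usage on onemax (the number of 1s), where every 0 is flipped in turn.
print(deterministic_hill_climber([0, 1, 0, 0], sum))
```

The returned flip count corresponds to the kind of statistic reported in the experiments as the number of DHC flips, although the exact accounting used there is not specified in this section.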

5 Experiments

This section describes experiments and presents experimental results. First, problem instances and the experimental setup are discussed. Next, the analysis of hBOA, UMDA and several GA variants is presented. Finally, all algorithms are compared and the results of the comparisons are discussed.

5.1 Problem Instances

The parameters n, k, and step were set as follows: n ∈ {20, 30, 40, 50, 60, 70, 80, 90, 100, 120}, k ∈ {2, 3, 4, 5}, and step ∈ {1, 2, . . . , k + 1}. For each combination of n, k, and step, we generated 10,000 random problem instances. Then, we applied GA, UMDA and hBOA to each of these instances and collected empirical results, which were subsequently analyzed. That means that overall 180,000 unique problem instances were generated and all of them were tested with every algorithm included in this study. For UMDA, the largest instances were infeasible even with extremely large population sizes of more than $10^6$; that is why some problem sizes are excluded for this algorithm and the main focus is on GA and hBOA.

5.2 Compared Algorithms

The following list summarizes the algorithms included in this study:

(i) Hierarchical BOA (hBOA).
(ii) Univariate marginal distribution algorithm (UMDA).
(iii) Genetic algorithm with uniform crossover and bit-flip mutation.
(iv) Genetic algorithm with two-point crossover and bit-flip mutation.

5.3 Experimental Setup

To select promising solutions, binary tournament selection without replacement is used. New solutions (offspring) are incorporated into the old population using RTS with window size w = min{n, N/5} as suggested in Pelikan (2005). In hBOA, Bayesian networks with decision trees (Chickering et al., 1997; Friedman & Goldszmidt, 1999; Pelikan, 2005) are used and the models are evaluated using the Bayesian-Dirichlet metric with likelihood equivalence (Heckerman et al., 1994; Chickering et al., 1997) and a penalty for model complexity (Friedman & Goldszmidt, 1999; Pelikan, 2005). All GA variants use bit-flip mutation with the probability of flipping each bit $p_m = 1/n$. Two common crossover operators are considered in a GA: two-point and uniform crossover. For both crossover operators, the probability of applying crossover is set to 0.6. A stochastic hill climber with bit-flip mutation was also considered in the initial stage, but its performance was far inferior to that of any other algorithm included in the comparison and most problem instances were intractable with this algorithm; that is why the results for this algorithm are omitted.

For each problem instance and each algorithm, an adequate population size is approximated with the bisection method (Sastry, 2001; Pelikan, 2005); here, the bisection method finds an adequate population size to find the optimum in 10 out of 10 independent runs. Each run is terminated when the global optimum has been found (success) or when the maximum number of generations, n, is reached without finding the optimum (failure). The results for each problem instance consist of the following statistics: (1) the population size, (2) the number of iterations (generations), (3) the number of evaluations, and (4) the number of flips of DHC. The most important statistic relating to the overall complexity of each algorithm is the number of flips of DHC, since this statistic combines all important statistics and can be consistently compared regardless of the used algorithm. That is why we focus on presenting the results with respect to the overall number of DHC flips until the optimum has been found.

For each combination of values of n, k and step, all observed statistics were averaged over the 10,000 random instances. Since for each instance, 10 successful runs were performed, for each n, k and step and each algorithm the results are averaged over the 100,000 successful runs. Overall, for all algorithms except for UMDA, the results correspond to 1,800,000 successful runs on a total of 180,000 unique problem instances.
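The population-sizing step can be illustrated with a generic bisection sketch. The details below (the initial size of 10, the doubling phase, and the 10% stopping tolerance) are assumptions chosen for illustration and are not taken from this paper; `succeeds` is a hypothetical callback that would run the algorithm 10 times with the given population size and report whether all 10 runs found the optimum.

```python
def bisection_population_size(succeeds, n_min=10, tolerance=0.1):
    """Approximate the smallest population size for which succeeds(N) holds.

    succeeds  -- callable taking a population size and returning True if
                 all required independent runs found the optimum
    n_min     -- initial population size (assumed value)
    tolerance -- relative width of the final bracket (assumed value)
    """
    lo = hi = n_min
    # Doubling phase: grow the population until the runs succeed.
    while not succeeds(hi):
        lo, hi = hi, 2 * hi
    # Bisection phase: shrink the bracket between a failing size (lo)
    # and a succeeding size (hi).
    while hi - lo > 1 and (hi - lo) / lo > tolerance:
        mid = (lo + hi) // 2
        if succeeds(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Usage with a deterministic stand-in for the success test.
print(bisection_population_size(lambda size: size >= 137))
```

With a stochastic success test, the returned size is itself a random quantity, which is one reason for averaging results over many instances and runs.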

5.4 Initial Performance Analysis

The number of DHC flips until optimum is shown in figure 1 for hBOA, figure 2 for UMDA, figure 3 for GA with uniform crossover, and figure 4 for GA with two-point crossover. The number of evaluations for k = 2 and k = 5 with step = 1 is shown in figure 5; for brevity, we omit analogous results for other values of k and step. There are three main observations that can be made from these results. First of all, for hBOA, the number of DHC flips as well as the number of evaluations both appear to be upper bounded by a low-order polynomial for all values of k and step. However, for GA with both crossover operators, for larger values of k the growth of the number of DHC flips and the number of evaluations appears to be worse than polynomial with respect to n and it can be expected that the results will get even worse for k > 5. The worst performance is obtained with UMDA, for which the growth of the number of evaluations and the number of DHC flips appears to be faster than polynomial for all values of k and step.

Figure 1: Average number of flips for hBOA. (Four panels, for k = 2, 3, 4, and 5; each panel plots the number of flips against problem size, with one curve per value of step from 1 to k + 1.)

To visualize the effects of k on performance of all compared algorithms, figure 6 shows the growth of the number of DHC flips with k for hBOA and GA on problems of size n = 120; the results for UMDA are not included, because UMDA was incapable of solving many instances of this size in practical time. Two cases are considered: (1) step = 1, corresponding to standard NK landscapes and (2) step = k + 1, corresponding to the separable problem with no interactions between the different subproblems. For both cases, the vertical axis is shown in log-scale to support the hypothesis that the time complexity of selectorecombinative genetic algorithms should grow exponentially fast with the order of problem decomposition even when recombination is capable of identifying and processing the subproblems in an adequate problem decomposition. The results confirm this hypothesis: indeed, the number of flips for all algorithms appears to grow at least exponentially fast with k, regardless of the value of the step parameter.

5.5 Comparison of All Algorithms

How do the different algorithms compare in terms of performance? While it is difficult to compare the exact running times due to the variety of computer hardware used and the accuracy of time measurements, we can easily compare other recorded statistics, such as the number of DHC flips or the number of evaluations until optimum. The main focus is again on the number of DHC flips because for each fitness evaluation at least one flip is typically performed and that is why the number of flips is expected to be greater than or equal to both the number of evaluations and the product of the population size and the number of generations.

One of the most straightforward approaches to quantify relative performance of two algorithms is to compute the ratio of the number of DHC flips (or some other statistic) for each problem instance. The mean and other moments of the empirical distribution of these ratios can then be estimated for different problem sizes and problem types. The results can then be used to better understand how the differences between the compared algorithms change with problem size or other problem-related parameters. This approach has been used for example in Pelikan et al. (2006) and Pelikan et al. (2008).

Figure 2: Average number of flips for UMDA. (Four panels, for k = 2, 3, 4, and 5; each panel plots the number of flips against problem size, with one curve per value of step from 1 to k + 1.)

The mean ratio of the number of DHC flips until optimum for GA with uniform crossover and that for hBOA is shown in figure 7. The ratio encodes the multiplicative factor by which hBOA outperforms GA with uniform crossover. When the ratio is greater than 1, hBOA outperforms GA with uniform crossover by this factor; when the ratio is smaller than 1, GA with uniform crossover outperforms hBOA.

The results show that when measuring performance by the number of DHC flips, for k = 2, hBOA is outperformed by GA with uniform crossover on the entire range of tested instances. However, even for k = 2 the ratio grows with problem size and the situation can thus be expected to change for bigger problems. For larger values of k, hBOA clearly outperforms GA with uniform crossover and for the largest values of n and k, that is, for n = 120 and k = 5, the ratio for the number of flips required is more than 3 regardless of step; that means that for the largest values of n and k, hBOA requires on average less than a third of the flips to solve the same problem. Even more importantly, the ratio appears to grow faster than polynomially, indicating that the differences will become much more substantial as the problem size increases.

The ratio between the number of DHC flips until optimum for GA with two-point crossover and that for hBOA is shown in figure 8. The results show that for two-point crossover, the differences between hBOA and GA are even more substantial than for the uniform crossover. This is supported by the results presented in figure 9, which compares the two crossover operators used in GA; uniform crossover clearly outperforms two-point crossover for all tested problem instances.
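The per-instance ratio used in these comparisons can be sketched as follows. The function name and the numbers in the usage example are made up purely for illustration and are not results from the experiments. The example also shows that the mean of per-instance ratios generally differs from the ratio of the mean flip counts.

```python
from statistics import mean


def mean_flip_ratio(flips_a, flips_b):
    """Mean over instances of flips_a[i] / flips_b[i].

    A value above 1 means that, on average over instances, algorithm B
    needed fewer flips than algorithm A by that factor.
    """
    if len(flips_a) != len(flips_b):
        raise ValueError("both algorithms must be run on the same instances")
    return mean(a / b for a, b in zip(flips_a, flips_b))


# Illustrative, made-up flip counts for two instances.
flips_ga = [400, 100]
flips_hboa = [100, 200]
print(mean_flip_ratio(flips_ga, flips_hboa))   # mean of per-instance ratios
print(mean(flips_ga) / mean(flips_hboa))       # ratio of the means, for contrast
```

Pairing the statistics instance by instance, as done here, compares the two algorithms on identical problems before averaging.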
In summary, the performance of hBOA is substantially better than that of the other compared algorithms, especially for the most difficult problem instances with large neighborhood size. These results are highlighted in tables 1 and 2, which show the average number of DHC flips and evaluations until optimum required by hBOA and GA for k = 5 and n = 120.

Figure 3: Average number of flips for GA with uniform crossover. (Four panels, for k = 2, 3, 4, and 5; each panel plots the number of flips against problem size, with one curve per value of step from 1 to k + 1.)

Another interesting question is whether the instances that are difficult for one algorithm are also difficult for other algorithms included in the comparison. Figures 10 and 11 visualize the relationship between the number of flips required for the different instances for n = 120 and k = 5 with step = 1 and step = 6, respectively. In both these figures, the number of flips for each instance is normalized by dividing the actual number of flips by the mean number of flips over all instances for the same n, k, and step. The results show that in terms of the number of flips the performance of all compared algorithms is much more strongly correlated for the problem with high overlap (step = 1) than for the separable problem (step = 6). The results also show that, as expected, the correlation between the two variants of GA is much stronger than that between hBOA and GA.

The relationship between performance of the compared algorithms on different problem instances is also visualized in figure 12, which shows the average number of DHC flips until optimum for various percentages of problems that are easiest for hBOA. These results also support the correlation between performance of the different algorithms included in the comparison, because the instances that are simpler for hBOA are clearly also simpler for the two variants of GA.

5.6 Instance Difficulty

Assuming that the recombination operator processes partial solutions or building blocks effectively, scalability theory for selectorecombinative genetic algorithms (Goldberg & Rudnick, 1991; Thierens & Goldberg, 1994; Harik et al., 1997; Goldberg, 2002) and multivariate estimation of distribution algorithms (Mühlenbein & Mahnig, 1998; Pelikan et al., 2002; Pelikan, 2005; Yu et al., 2007) suggests three main sources of problem difficulty for the generated instances: (1) signal-to-noise ratio, (2) scaling, and (3) overlap between subproblems. These factors and their effects on the difficulty of the considered instances of NK landscapes are the main focus of this subsection. First, the effects of parameters n, k, and step are discussed. Then, the results are related to existing scalability theory. Of course, the performance of selectorecombinative genetic algorithms also strongly depends on their ability to preserve and mix important partial solutions or building blocks (Goldberg, 2002; Thierens, 1999), and the importance of effective recombination is also expected to vary from instance to instance.

Figure 4: Average number of flips for GA with twopoint crossover. [Plot omitted: panels for k = 2, 3, 4, and 5 show the number of flips against problem size, one curve per value of step from 1 to k + 1.]

The results presented thus far in figures 1, 2, 3, 4, and 5 show that the larger the number of bits n, the more difficult the problem becomes, whether we measure algorithm performance by the number of DHC flips or the number of evaluations until optimum. The main reason for this behavior is that as n grows, the signal-to-noise ratio decreases, and the complexity of selectorecombinative GAs as well as hBOA is expected to grow with decreasing signal-to-noise ratio (Goldberg & Rudnick, 1991; Thierens & Goldberg, 1994; Harik et al., 1997; Goldberg, 2002; Pelikan et al., 2002); the relationship between instance difficulty and the signal-to-noise ratio is discussed further later in this section. For hBOA, time complexity expressed in terms of the number of DHC flips or the number of fitness evaluations appears to grow polynomially with n; for the remaining algorithms, time complexity appears to grow slightly faster than polynomially. These results are not surprising. It was argued elsewhere that if the problem can be decomposed into subproblems of bounded order, then hBOA should be capable of discovering such a decomposition and solving the problem in low-order polynomial time with respect to n (Pelikan, Sastry, & Goldberg, 2002; Pelikan, 2005; Yu, Sastry, Goldberg, & Pelikan, 2007).
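One simple way to examine whether a measured cost grows polynomially or faster with n is to compare a straight-line fit in log-log space (polynomial growth) with a straight-line fit in semi-log space (exponential growth). The Python sketch below illustrates the idea on synthetic data; it is not the procedure used to produce the results in this paper, and the function names fit_line and compare_growth_models are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def compare_growth_models(sizes, costs):
    """Fit log(cost) against log(n) (polynomial) and against n (exponential)."""
    log_costs = [math.log(c) for c in costs]
    poly_slope, _, poly_sse = fit_line([math.log(n) for n in sizes], log_costs)
    exp_slope, _, exp_sse = fit_line([float(n) for n in sizes], log_costs)
    return {
        "polynomial_order": poly_slope,         # p in cost ~ c * n**p
        "polynomial_sse": poly_sse,
        "exponential_base": math.exp(exp_slope),  # b in cost ~ c * b**n
        "exponential_sse": exp_sse,
    }


# Synthetic cost curves (NOT measurements from the paper).
sizes = [20, 40, 60, 80, 100]
quadratic = compare_growth_models(sizes, [3.0 * n ** 2 for n in sizes])
exponential = compare_growth_models(sizes, [1.05 ** n for n in sizes])
print(quadratic["polynomial_order"], exponential["exponential_base"])
```

The model with the smaller sum of squared errors describes the data better. With only a handful of problem sizes the two models can be hard to tell apart, which is one reason the text says that complexity "appears to" grow polynomially.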
Furthermore, it is known that fixed crossover operators are often not capable of solving such problems in polynomial time because they often break important partial solutions to the different subproblems or do not juxtapose these partial solutions effectively enough (Thierens & Goldberg, 1993; Thierens, 1995). While it is possible to create adversarial decomposable problems for which some model-building algorithms used in multivariate estimation of distribution algorithms, such as hBOA, fail (Coffin & Smith, 2007), this happens only in very specific cases and is unlikely to be the case with random problem instances or real-world decomposable problems.

Figure 5: Average number of evaluations for k = 2 and k = 5 with step = 1. [Plot omitted: panels (a) k = 2, step = 1 and (b) k = 5, step = 1 show the number of evaluations against problem size for UMDA, GA (uniform), GA (twopoint), and hBOA.]

Figure 6: Growth of the number of DHC flips with k for step = 1 (most overlap) and step = k + 1 (no overlap). All results are for n = 120. [Plot omitted: panels (a) step = 1 and (b) step = k + 1 show the number of flips against neighborhood size k for GA (twopoint), GA (uniform), and hBOA.]

The effects of k on performance are visualized in figure 6, which shows the growth of the number of DHC flips until optimum with k for n = 120 and step ∈ {1, k + 1}; a similar relationship can be observed for the number of evaluations (results omitted). The results show that for all algorithms included in the comparison, the number of DHC flips grows at least exponentially with the value of k. Furthermore, the results show that both variants of GA are much more sensitive to the value of k than hBOA. This is not a surprising result, because hBOA is capable of identifying the subproblems and recombining solutions to respect the discovered decomposition, whereas GA uses a fixed recombination strategy regardless of the problem. The results for other values of n and step are qualitatively similar and are thus omitted.

The effects of step on performance are somewhat more intricate. Intuitively, instances where all subproblems are independent should be easiest, and this is supported by all experimental results presented thus far. The results also indicate that the effects of overlap vary between the compared algorithms, as shown in figure 13. More specifically, for both variants of GA, the most difficult percentage of overlap (relative to the size of the subproblems) is about 0.5, whereas for hBOA it is about 0.7.

As was mentioned earlier, the time complexity of selectorecombinative GAs as well as hBOA is directly related to the signal-to-noise ratio, where the signal is the difference between the fitness contributions of the best and the second best instances of a subproblem, and the noise models the fitness contributions of the other subproblems (Goldberg & Rudnick, 1991; Goldberg, Deb, & Clark, 1992). The smaller the signal-to-noise ratio, the larger the expected population size as well as the overall complexity of an algorithm. As was discussed above, the signal-to-noise ratio is influenced primarily by the value of n; however, the signal-to-noise ratio also depends on the subproblems themselves. The influence of the signal-to-noise ratio on algorithm performance should be strongest for separable problems with uniform scaling, where all subproblems have approximately the same signal; for problems with overlap and nonuniform scaling, other factors contribute to instance difficulty as well. Another important factor influencing the difficulty of decomposable problems is the scaling of the signal coming from different subproblems (Thierens, Goldberg, & Pereira, 1998). Next we examine the influence of the signal-to-noise ratio and scaling on the performance of the compared algorithms in more detail.

Figure 7: Ratio of the number of flips for GA with uniform crossover and hBOA. [Plot omitted: panels for k = 2, 3, 4, and 5 show the ratio against problem size, one curve per value of step from 1 to k + 1.]

                 Number of DHC flips until optimum
n     k   step   hBOA      GA (uniform)   GA (twopoint)
120   5   1      37,155    141,108        220,318
120   5   2      40,151    212,635        353,748
120   5   3      37,480    249,217        443,570
120   5   4      27,411    195,673        310,894
120   5   5      15,589    100,378        145,406
120   5   6       9,607     35,101         47,576

Table 1: Comparison of the number of DHC flips until optimum for hBOA and GA. For all settings, the superiority of the results obtained by hBOA was verified with a paired t-test with 99% confidence.

Figure 8: Ratio of the number of flips for GA with twopoint crossover and hBOA. [Plot omitted: panels for k = 2, 3, 4, and 5 show the ratio against problem size, one curve per value of step from 1 to k + 1.]

Figure 14 visualizes the effects of the signal-to-noise ratio on the number of flips until optimum for n = 120, k = 5, and step ∈ {1, 6}; since UMDA was not capable of solving many of these problem instances in practical time, the results for UMDA are not included. The figure shows the average number of DHC flips until optimum for different percentages of instances with the smallest signal-to-noise ratios. To make the visualization more effective, the number of flips is normalized by dividing the values by the mean number of flips over the entire set of instances. The results clearly show that for the separable problems (that is, step = 6), the smaller the signal-to-noise ratio, the greater the number of flips. However, for problem instances with strong overlap (that is, step = 1), problem difficulty does not appear to be directly related to the signal-to-noise ratio, and the primary source of problem difficulty appears to be elsewhere.

Figure 15 visualizes the influence of scaling on the number of flips until optimum. The figure shows the average number of flips until optimum for different percentages of instances with the smallest signal variance. The larger the variance of the signal, the less uniformly the signal is distributed between the different subproblems. For the separable problem (that is, step = 6), the more uniformly scaled instances appear to be more difficult for all compared algorithms than the less uniformly scaled ones. For instances with strong overlap (that is, step = 1), the effects of scaling on algorithm performance are negligible; again, the source of problem difficulty appears to be elsewhere.

Two observations related to the signal-to-noise ratio and scaling are somewhat surprising: (1) Although scalability of selectorecombinative GAs gets worse with nonuniform scaling of subproblems, the results indicate that the actual performance is better on more nonuniformly scaled problems.
(2) Performance of the compared algorithms on problems with strong overlap does not appear to be directly affected by signal-to-noise ratio or signal variance.

How could these results be explained? We believe that the primary reason why more uniformly scaled problems are more difficult for all tested algorithms is related to the effectiveness of recombination. More specifically, practically any recombination operator becomes more effective when the scaling is highly nonuniform; on the other hand, for uniformly scaled subproblems, fixed recombination operators are often expected to suffer from inefficient juxtaposition and frequent disruption of the important partial solutions (building blocks) contained in the optimum (Goldberg, 2002; Thierens, 1999; Harik & Goldberg, 1996; Thierens & Goldberg, 1993).

Figure 9: Ratio of the number of flips for GA with twopoint and uniform crossover. [Plot omitted: panels for k = 2, 3, 4, and 5 show the ratio against problem size, one curve per value of step from 1 to k + 1.]

For problems with strong overlap, the influence of the overlap appears to overshadow both the signal-to-noise ratio and scaling. We believe that a likely reason for this is that the order of interactions that must be covered by the probabilistic model may increase due to the effects of the overlap, which leads to a larger order of important building blocks that must be reproduced and juxtaposed to form the optimum. According to the existing population sizing theory (Goldberg & Rudnick, 1991; Harik et al., 1997; Pelikan et al., 2002), this should lead to an increase in the population size required to solve the problem (Harik et al., 1997). We are currently exploring possible approaches to quantifying the effects of overlap in order to confirm or refute this hypothesis.
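To make the notions of signal, signal-to-noise ratio, and signal variance used in this subsection concrete, the following Python sketch computes them for a separable instance given as a list of subproblem lookup tables. It is only one plausible reading of the verbal definitions above, not the exact procedure used in the experiments. Assumptions introduced here: the noise for a subproblem is taken as the standard deviation of the summed contributions of all other subproblems (treated as independent and uniformly sampled), the per-instance ratio is the mean over subproblems, and the names subproblem_signal and instance_statistics are hypothetical. The random tables are placeholders, and the instance must contain at least two subproblems with non-constant tables.

```python
import math
import random
import statistics


def subproblem_signal(table):
    """Difference between the best and the second best fitness contribution."""
    best, second_best = sorted(table, reverse=True)[:2]
    return best - second_best


def instance_statistics(tables):
    """Signal-to-noise ratio and signal variance under the stated assumptions."""
    signals = [subproblem_signal(t) for t in tables]
    variances = [statistics.pvariance(t) for t in tables]
    total_variance = sum(variances)
    # Assumption: noise for subproblem i is the standard deviation of the
    # summed contributions of all OTHER subproblems (independent subproblems).
    ratios = [
        signal / math.sqrt(total_variance - variance)
        for signal, variance in zip(signals, variances)
    ]
    return {
        "mean_signal_to_noise": statistics.fmean(ratios),
        "signal_variance": statistics.pvariance(signals),
    }


# Placeholder instance: 20 independent subproblems of k + 1 = 6 bits each,
# with contributions drawn uniformly from [0, 1) (illustrative only).
rng = random.Random(0)
tables = [[rng.random() for _ in range(2 ** 6)] for _ in range(20)]
stats = instance_statistics(tables)
print(stats)
```

Sorting a set of instances by either statistic and averaging the normalized number of flips over the lowest percentiles gives curves of the kind shown in figures 14 and 15.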

6 Future Work

Probably the most important topic for future work in this area is to develop tools that could be used to gain a better understanding of the behavior of selectorecombinative genetic algorithms on additively decomposable problems with overlap between subproblems. This is a challenging topic but also a crucial one, because most difficult decomposable problems contain substantial overlap between the different subproblems. A good starting point for this research is the theoretical work on the factorized distribution algorithm (FDA) (Mühlenbein & Mahnig, 1998) and the scalability theory for multivariate estimation of distribution algorithms (Pelikan, Sastry, & Goldberg, 2002; Pelikan, 2005; Yu, Sastry, Goldberg, & Pelikan, 2007; Mühlenbein, 2008).

It may also be useful to look at other random distributions for generating the subproblems. This avenue of research may split into at least two main directions. On the one hand, one may bias the distribution used to generate subproblems in order to generate instances that are especially hard for the algorithm under consideration, with the goal of addressing weaknesses of this algorithm. On the other hand, one may want to generate problem instances that resemble important classes of real-world problems. It may also be interesting to study instances of restricted and unrestricted NK landscapes from the perspective of the theory of elementary landscapes (Barnes, Dimova, Dokov, & Solomon, 2003). Finally, the instances provided by the methods described in this paper can be used to test other optimization algorithms and hybrids.

                 Number of evaluations until optimum
n     k   step   hBOA     GA (uniform)   GA (twopoint)
120   5   1      7,414    16,519         34,696
120   5   2      9,011    25,032         56,059
120   5   3      9,988    30,285         72,359
120   5   4      8,606    24,016         51,521
120   5   5      7,307    13,749         26,807
120   5   6      7,328     6,004         10,949

Table 2: Comparison of the number of evaluations until optimum for hBOA and GA. For all settings except for step = 6, the superiority of the results obtained by hBOA was verified with a paired t-test with 99% confidence.

Figure 10: Correlation between the number of flips for hBOA and GA for n = 120, k = 5 and step = 1. [Plot omitted: three scatter plots of the normalized number of flips for pairs of hBOA, GA (uniform), and GA (twopoint); the correlation coefficients shown in the panels are 0.493, 0.330, and 0.675.]

7 Summary and Conclusions

This paper described a class of nearest-neighbor NK landscapes with tunable strength of overlap between consecutive subproblems. Shuffling was introduced to eliminate tight linkage and make problem instances more challenging for algorithms with fixed variation operators. A dynamic programming approach was described that can be used to solve the described instances to optimality in low-order polynomial time. A large number of random instances of the described class of NK landscapes were generated. Several evolutionary algorithms were then applied to the generated instances; more specifically, the paper considered the genetic algorithm (GA) with two-point and uniform crossover and two estimation of distribution algorithms, specifically, the hierarchical Bayesian optimization algorithm (hBOA) and the univariate marginal distribution algorithm (UMDA). All algorithms were combined with a simple deterministic local searcher, and niching was used to maintain useful diversity. The results were analyzed and related to existing scalability theory for selectorecombinative genetic algorithms.

hBOA was shown to outperform the other algorithms included in the comparison on instances with large neighborhood sizes. The factor by which hBOA outperforms the other algorithms for the largest neighborhoods appears to grow faster than polynomially with problem size, indicating that the differences will become even more substantial for problems with larger neighborhoods and larger problem sizes. This suggests that linkage learning is advantageous when solving the considered class of NK landscapes. The second best performance was achieved by GA with uniform crossover, whereas the worst performance was achieved by UMDA. The complexity of all algorithms was shown to grow exponentially fast with the size of the neighborhood.

The correlations between the time required to solve the compared problem instances were shown to be strongest for the two variants of GA; however, correlations were also observed between the other compared algorithms. For problems with no overlap, the signal-to-noise ratio and the scaling of the signal in different subproblems were shown to be significant factors affecting problem difficulty. More specifically, the smaller the signal-to-noise ratio and the signal variance, the more difficult the instances become. However, for problems with a substantial amount of overlap between consecutive subproblems, the effects of overlap were shown to overshadow the effects of signal-to-noise ratio and scaling.

Figure 11: Correlation between the number of flips for hBOA and GA for n = 120, k = 5 and step = 6. [Plot omitted: three scatter plots of the normalized number of flips for pairs of hBOA, GA (uniform), and GA (twopoint); the correlation coefficients shown in the panels are 0.221, 0.170, and 0.723.]

Figure 12: Performance of GA and hBOA as a function of instance difficulty for n = 120 and k = 5. [Plot omitted: panels (a) step = 1 and (b) step = 6 show the average number of flips (divided by mean) against the percentage of easiest hBOA instances for GA (twopoint), GA (uniform), and hBOA.]

Figure 13: Influence of overlap for n = 120 and k = 5 (step varies with overlap). [Plot omitted: panels (a) hBOA, (b) GA (uniform), and (c) GA (twopoint) show the number of flips against the percentage of overlap, with curves labeled k = 2, 3, 4, and 5.]
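As a rough cross-check of the overlap values quoted in the text, the Python sketch below relates step to the percentage of overlap and locates the hardest setting for each algorithm using the flip counts from Table 1. The formula in overlap_fraction is an assumption made here, not a definition taken from the paper: it supposes that neighboring subproblems of k + 1 bits share k + 1 - step bits, which gives no overlap for step = k + 1. The function and variable names are hypothetical.

```python
K = 5  # neighborhood size used in Table 1 (n = 120)

# step -> (hBOA, GA (uniform), GA (twopoint)); DHC flips until optimum, Table 1.
FLIPS = {
    1: (37155, 141108, 220318),
    2: (40151, 212635, 353748),
    3: (37480, 249217, 443570),
    4: (27411, 195673, 310894),
    5: (15589, 100378, 145406),
    6: (9607, 35101, 47576),
}
ALGORITHMS = ("hBOA", "GA (uniform)", "GA (twopoint)")


def overlap_fraction(k, step):
    """Assumed overlap relative to subproblem size: (k + 1 - step) / (k + 1)."""
    return (k + 1 - step) / (k + 1)


# Step with the largest number of flips for each algorithm.
hardest_step = {
    name: max(FLIPS, key=lambda s: FLIPS[s][col])
    for col, name in enumerate(ALGORITHMS)
}

# Ratio of GA (uniform) flips to hBOA flips for each step.
ratio_uniform = {step: row[1] / row[0] for step, row in FLIPS.items()}

for name, step in hardest_step.items():
    print(f"{name}: hardest step = {step}, overlap = {overlap_fraction(K, step):.2f}")
```

Under this assumption, the hardest setting in Table 1 is step = 2 (overlap of about 0.67) for hBOA and step = 3 (overlap of 0.5) for both variants of GA, which agrees with the values of about 0.7 and 0.5 reported in the discussion of figure 13.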
Figure 14: Influence of signal-to-noise ratio on the number of flips for n = 120 and k = 5. [Plot omitted: panels (a) step = 1 and (b) step = 6 show the average number of flips (divided by mean) against the signal-to-noise percentile (% smallest) for GA (twopoint), GA (uniform), and hBOA.]

Acknowledgments
This project was sponsored by the National Science Foundation under CAREER grant ECS0547013, by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant FA9550-06-1-0096, and by the University of Missouri in St. Louis through the High Performance Computing Collaboratory sponsored by Information Technology Services, and the Research Award and Research Board programs. The U.S. Government is authorized to reproduce and distribute reprints for government purposes notwithstanding any copyright notation thereon. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation, the Air Force Office of Scientific Research, or the U.S. Government. Some experiments were done using the hBOA software developed by Martin Pelikan and David E. Goldberg at the University of Illinois at Urbana-Champaign and most experiments were performed on the Beowulf cluster maintained by ITS at the University of Missouri in St. Louis.

Figure 15: Influence of signal variance on the number of flips for n = 120 and k = 5. [Plot omitted: panels (a) step = 1 and (b) step = 6 show the average number of flips (divided by mean) against the signal variance percentile (% smallest) for GA (twopoint), GA (uniform), and hBOA.]

References
Ackley, D. H. (1987). An empirical study of bit vector function optimization. Genetic Algorithms and Simulated Annealing, 170–204.
Aguirre, H. E., & Tanaka, K. (2003). Genetic algorithms on nk-landscapes: Effects of selection, drift, mutation, and recombination. In Raidl, G. R., et al. (Eds.), Applications of Evolutionary Computing: EvoWorkshops 2003 (pp. 131–142).
Altenberg, L. (1997). NK landscapes. In Bäck, T., Fogel, D. B., & Michalewicz, Z. (Eds.), Handbook of Evolutionary Computation (pp. B2.7:5–10). Bristol, New York: Institute of Physics Publishing and Oxford University Press.
Baluja, S. (1994). Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning (Tech. Rep. No. CMU-CS-94-163). Pittsburgh, PA: Carnegie Mellon University.
Barnes, J. W., Dimova, B., Dokov, S. P., & Solomon, A. (2003). The theory of elementary landscapes. Appl. Math. Lett., 16 (3), 337–343.
Cheeseman, P., Kanefsky, B., & Taylor, W. M. (1991). Where the really hard problems are. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-91), 331–337.
Chickering, D. M., Heckerman, D., & Meek, C. (1997). A Bayesian approach to learning Bayesian networks with local structure (Technical Report MSR-TR-97-07). Redmond, WA: Microsoft Research.
Choi, S.-S., Jung, K., & Kim, J. H. (2005). Phase transition in a random NK landscape model. pp. 1241–1248.
Coffin, D. J., & Smith, R. E. (2007). Why is parity hard for estimation of distribution algorithms? Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2007), 624–624.
Deb, K., & Goldberg, D. E. (1991). Analyzing deception in trap functions (IlliGAL Report No. 91009). Urbana, IL: University of Illinois at Urbana-Champaign, Illinois Genetic Algorithms Laboratory.
Friedman, N., & Goldszmidt, M. (1999). Learning Bayesian networks with local structure. In Jordan, M. I. (Ed.), Graphical models (pp. 421–459). Cambridge, MA: MIT Press.
Gao, Y., & Culberson, J. C. (2002). An analysis of phase transition in NK landscapes. Journal of Artificial Intelligence Research (JAIR), 17, 309–332.
Goldberg, D. E. (1989). Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-Wesley.
Goldberg, D. E. (2002). The design of innovation: Lessons from and for competent genetic algorithms, Volume 7 of Genetic Algorithms and Evolutionary Computation. Kluwer Academic Publishers.
Goldberg, D. E., Deb, K., & Clark, J. H. (1992). Genetic algorithms, noise, and the sizing of populations. Complex Systems, 6, 333–362.
Goldberg, D. E., & Rudnick, M. (1991). Genetic algorithms and the variance of fitness. Complex Systems, 5 (3), 265–278. Also IlliGAL Report No. 91001.
Harik, G. R. (1995). Finding multimodal solutions using restricted tournament selection. Proceedings of the International Conference on Genetic Algorithms (ICGA-95), 24–31.
Harik, G. R., Cantú-Paz, E., Goldberg, D. E., & Miller, B. L. (1997). The gambler’s ruin problem, genetic algorithms, and the sizing of populations. Proceedings of the International Conference on Evolutionary Computation (ICEC-97), 7–12. Also IlliGAL Report No. 96004.
Harik, G. R., & Goldberg, D. E. (1996). Learning linkage. Foundations of Genetic Algorithms, 4, 247–262.
Heckerman, D., Geiger, D., & Chickering, D. M. (1994). Learning Bayesian networks: The combination of knowledge and statistical data (Technical Report MSR-TR-94-09). Redmond, WA: Microsoft Research.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University of Michigan Press.
Juels, A. (1998). The equilibrium genetic algorithm. Submitted for publication.
Kauffman, S. (1989). Adaptation on rugged fitness landscapes. In Stein, D. L. (Ed.), Lecture Notes in the Sciences of Complexity (pp. 527–618). Addison Wesley.
Kauffman, S. (1993). The origins of order: Self-organization and selection in evolution. Oxford University Press.
Mühlenbein, H. (2008). Convergence of estimation of distribution algorithms for finite samples (Technical Report). Sankt Augustin, Germany: Fraunhofer Institut Autonomous intelligent Systems.
Mühlenbein, H., & Mahnig, T. (1998). Convergence theory and applications of the factorized distribution algorithm. Journal of Computing and Information Technology, 7 (1), 19–32.
Mühlenbein, H., & Paaß, G. (1996). From recombination of genes to the estimation of distributions I. Binary parameters. Parallel Problem Solving from Nature, 178–187.
Pelikan, M. (2005). Hierarchical Bayesian optimization algorithm: Toward a new generation of evolutionary algorithms. Springer.
Pelikan, M., & Goldberg, D. E. (2001). Escaping hierarchical traps with competent genetic algorithms. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), 511–518. Also IlliGAL Report No. 2000020.
Pelikan, M., & Goldberg, D. E. (2003). Hierarchical BOA solves Ising spin glasses and maxsat. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2003), II, 1275–1286. Also IlliGAL Report No. 2003001.
Pelikan, M., Sastry, K., Butz, M. V., & Goldberg, D. E. (2006). Performance of evolutionary algorithms on random decomposable problems. Parallel Problem Solving from Nature, 788–797.
Pelikan, M., Sastry, K., Butz, M. V., & Goldberg, D. E. (2008). Analysis of estimation of distribution algorithms and genetic algorithms on NK landscapes. pp. 1033–1040.
Pelikan, M., Sastry, K., & Goldberg, D. E. (2002). Scalability of the Bayesian optimization algorithm. International Journal of Approximate Reasoning, 31 (3), 221–258. Also IlliGAL Report No. 2001029.
Santarelli, S., Goldberg, D. E., & Yu, T.-L. (2004). Optimization of a constrained feed network for an antenna array using simple and competent genetic algorithm techniques. Proceedings of the Workshop Military and Security Application of Evolutionary Computation (MSAEC-2004).
Sastry, K. (2001). Evaluation-relaxation schemes for genetic and evolutionary algorithms. Master’s thesis, University of Illinois at Urbana-Champaign, Department of General Engineering, Urbana, IL. Also IlliGAL Report No. 2002004.
Sastry, K., Pelikan, M., & Goldberg, D. E. (2007). Empirical analysis of ideal recombination on random decomposable problems. pp. 1388–1395.
Thierens, D. (1995). Analysis and design of genetic algorithms. Doctoral dissertation, Katholieke Universiteit Leuven, Leuven, Belgium.
Thierens, D. (1999). Scalability problems of simple genetic algorithms. Evolutionary Computation, 7 (4), 331–352.
Thierens, D., & Goldberg, D. (1994). Convergence models of genetic algorithm selection schemes. Parallel Problem Solving from Nature, 116–121.
Thierens, D., & Goldberg, D. E. (1993). Mixing in genetic algorithms. Proceedings of the International Conference on Genetic Algorithms (ICGA-93), 38–45.
Thierens, D., Goldberg, D. E., & Pereira, A. G. (1998). Domino convergence, drift, and the temporal-salience structure of problems. Proceedings of the International Conference on Evolutionary Computation (ICEC-98), 535–540.
Wright, A. H., Thompson, R. K., & Zhang, J. (2000). The computational complexity of N-K fitness functions. IEEE Transactions Evolutionary Computation, 4 (4), 373–379.
Yu, T.-L., Sastry, K., Goldberg, D. E., & Pelikan, M. (2007). Population sizing for entropy-based model building in estimation of distribution algorithms. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2007), 601–608.
