P. 1
gp

gp

|Views: 39|Likes:
Published by Maksim Tsvetovat

More info:

Published by: Maksim Tsvetovat on Oct 26, 2010
Copyright:Attribution Non-commercial

Availability:

Read on Scribd mobile: iPhone, iPad and Android.
download as PPT, PDF, TXT or read online from Scribd
See more
See less

10/26/2010

pdf

text

original

GENETIC PROGRAMMING

THE CHALLENGE

"How can computers learn to solve problems without being explicitly programmed? In other words, how can computers be made to do what is needed to be done, without being told exactly how to do it?" € Attributed to Arthur Samuel (1959)

CRITERION FOR SUCCESS

"The aim [is] ... to get machines to exhibit behavior, which if done by humans, would be assumed to involve the use of intelligence.³ € Arthur Samuel (1983)

REPRESENTATIONS
‡ Decision trees ‡ If-then production rules ‡ Horn clauses ‡ Neural nets ‡ Bayesian networks ‡ Frames ‡ Propositional logic ‡ Binary decision diagrams ‡ Formal grammars ‡ Coefficients for polynomials ‡ Reinforcement learning tables ‡ Conceptual clusters ‡ Classifier systems

A COMPUTER PROGRAM

GENETIC PROGRAMMING (GP)
‡ GP applies the approach of the genetic algorithm to the space of possible computer programs ‡ Computer programs are the lingua franca for expressing the solutions to a wide variety of problems ‡ A wide variety of seemingly different problems from many different fields can be reformulated as a search for a computer program to solve the problem.

GP € MAIN POINTS
‡ Genetic programming now routinely delivers high-return human-competitive machine intelligence. ‡ Genetic programming is an automated invention machine. ‡ Genetic programming has delivered a progression of qualitatively more substantial results in synchrony with five approximately order-of-magnitude increases in the expenditure of computer time.

DEFINITION OF ³HIGHRETURN´
The AI ratio (the ³artificial-to-intelligence´ ratio) of a problem-solving method as the ratio of that which is delivered by the automated operation of the artificial method to the amount of intelligence that is supplied by the human applying the method to a particular problem

DEFINITION OF ³ROUTINE´
A problem solving method is routine if it is general and relatively little human effort is required to get the method to successfully handle new problems within a particular domain and to successfully handle new problems from a different domain.

CRITERIA FOR ³HUMAN-COMPETITIVENESS´
‡ Previously patented, an improvement over a patented invention, or patentable today ‡ Publishable in its own right as a new scientific result €independent of the fact that the result was mechanically created ‡ Holds it own in regulated competition against humans (or programs) ‡ 5 other similar criteria that are ³arms-length´ from the fields of AI, ML, GP

PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP

‡ ‡ ‡ ‡ ‡

Toy problems Human-competitive non-patent results 20th-century patented inventions 21st-century patented inventions Patentable new inventions

GP FLOWCHART

A COMPUTER PROGRAM IN C
int foo (int time) { int temp1, temp2; if (time > 10) temp1 = 3; else temp1 = 4; temp2 = temp1 + 1 + 2; return (temp2); }

OUTPUT OF C PROGRAM
Time 0 1 2 3 4 5 6 7 8 9 10 11 12 Output 6 6 6 6 6 6 6 6 6 6 6 7 7

PROGRAM TREE

(+ 1 2 (IF (> TIME 10) 3 4))

CREATING RANDOM PROGRAMS

CREATING RANDOM PROGRAMS

‡ Available functions F = {+, -, *, %, IFLTE} ‡ Available terminals T = {X, Y, Random-Constants} ‡ The random programs are:
± Of different sizes and shapes ± Syntactically valid ± Executable

GP GENETIC OPERATIONS

‡ ‡ ‡ ‡

Reproduction Mutation Crossover (sexual recombination) Architecture-altering operations

MUTATION OPERATION

MUTATION OPERATION ‡ ‡ ‡ ‡ Select 1 parent probabilistically based on fitness Pick point from 1 to NUMBER-OF-POINTS Delete subtree at the picked point Grow new subtree at the mutation point in same way as generated trees for initial random population (generation 0) ‡ The result is a syntactically valid executable program ‡ Put the offspring into the next generation of the population

CROSSOVER OPERATION

CROSSOVER OPERATION
‡ Select 2 parents probabilistically based on fitness ‡ Randomly pick a number from 1 to NUMBER-OFPOINTS for 1st parent ‡ Independently randomly pick a number for 2nd parent ‡ The result is a syntactically valid executable program ‡ Put the offspring into the next generation of the population ‡ Identify the subtrees rooted at the two picked points

REPRODUCTION OPERATION

‡ Select parent probabilistically based on fitness ‡ Copy it (unchanged) into the next generation of the population

FIVE MAJOR PREPARATORY STEPS FOR GP

‡ ‡ ‡ ‡ ‡

Determining the set of terminals Determining the set of functions Determining the fitness measure Determining the parameters for the run Determining the method for designating a result and the criterion for terminating a run

ILLUSTRATIVE GP RUN

SYMBOLIC REGRESSION
Independent Dependent variable X variable Y -1.00 -0.80 -0.60 -0.40 -0.20 0.00 0.20 0.40 0.60 0.80 1.00 1.00 0.84 0.76 0.76 0.84 1.00 1.24 1.56 1.96 2.44 3.00

PREPARATORY STEPS
Objective: Find a computer program with one input (independent variable X) whose output equals the given data T = {X, Random-Constants} F = {+, -, *, %}

1 2 3

Terminal set: Function set: Fitness:

The sum of the absolute value of the differences between the candidate program¶s output and the given data (computed over numerous values of the independent variable x from ±1.0 to +1.0) Population size M = 4 An individual emerges whose sum of absolute errors is less than 0.1

4 5

Parameters: Termination:

SYMBOLIC REGRESSION POPULATION OF 4 RANDOMLY CREATED INDIVIDUALS FOR GENERATION 0

SYMBOLIC REGRESSION x2 + x + 1 FITNESS OF THE 4 INDIVIDUALS IN GEN 0

x+1 0.67

x2 + 1 1.00

2 1.70

x 2.67

SYMBOLIC REGRESSION x2 + x + 1 GENERATION 1

Mutant of (c) Copy of (a) picking ³2´ as mutation point

First offspring of crossover of (a) and (b) picking ³+´ of parent (a) and left-most ³x´ of parent (b) as crossover points

Second offspring of crossover of (a) and (b) picking ³+´ of parent (a) and left-most ³x´ of parent (b) as crossover points

CLASSIFICATION

GP TABLEAU ± INTERTWINED SPIRALS
Create a program to classify a given point in the x-y plane to the red or blue spiral Terminal set: T = {X,Y,Random-Constants} Function set: F = {+,-,*,%,IFLTE,SIN,COS} Fitness: The number of correctly classified points (0 ± 194) Parameters: M = 10,000. G = 51 Termination: An individual program scores 194 Objective:

1 2 3 4 5

WALL-FOLLOWER

FITNESS

BEST OF GENERATION 57

BOX MOVER ± BEST OF GEN 0

BOX MOVER GEN 45 ± FITNESS CASE 1

TRUCK BACKER UPPER

TRUCK BACKER UPPER
‡ 4-Dimensional control problem
± ± ± ± horizontal position, x vertical position, y angle between trailer and horizontal, 5t angle between trailer and cab, 5d

‡ One control variable (steering wheel turn angle) ‡ State transition equations map the 4 state variables into 1 output (the control variable) ‡ Simulation run over many initial conditions and over hundreds of time steps

GENETIC PROGRAMMING: ON THE PROGRAMMING OF COMPUTERS BY MEANS OF NATURAL SELECTION (Koza 1992)

2 MAIN POINTS FROM 1992 BOOK ‡ Virtually all problems in artificial intelligence, machine learning, adaptive systems, and automated learning can be recast as a search for a computer program. ‡ Genetic programming provides a way to successfully conduct the search for a computer program in the space of computer programs.

SOME RESULTS FROM 1992 BOOK
‡ ‡ ‡ ‡ ‡ ‡ ‡ ‡ ‡ ‡ Intertwined Spirals Truck Backer Upper Broom Balancer Wall Follower Box Mover Artificial Ant Differential Games Inverse Kinematics Central Place Foraging Block Stacking ‡ ‡ ‡ ‡ ‡ ‡ ‡ ‡ Randomizer Cellular Automata Task Prioritization Image Compression Econometric Equation Optimization Boolean Function Learning Co-Evolution of GamePlaying Strategies

PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP

‡ ‡ ‡ ‡ ‡

Toy problems Human-competitive non-patent results 20th-century patented inventions 21st-century patented inventions Patentable new inventions

COMPUTER PROGRAMS

‡ Subroutines provide one way to REUSE code € possibly with different instantiations of the dummy variables (formal parameters) ‡ Loops (and iterations) provide a 2nd way to REUSE code ‡ Recursion provide a 3rd way to REUSE code ‡ Memory provides a 4th way to REUSE the results of executing code

SYMBOLIC REGRESSION
Fitness case L0 1 2 3 4 5 6 7 8 9 10 3 7 10 3 4 3 5 1 2 8 W0 4 10 9 9 3 3 9 2 6 1 H0 7 9 4 5 2 1 9 9 8 10 L1 2 10 8 1 7 9 1 3 2 7 W1 5 3 1 6 6 5 7 9 6 5 H1 3 1 6 4 1 4 6 2 10 1 Dependent variable D 54 600 312 111 -18 -171 363 -36 -24 45

EVOLVED SOLUTION

(- (* (* W0 L0) H0) (* (* W1 L1) H1))

DIFFERENCE IN VOLUMES

D = L0W0H0 ± L1W1H1

AUTOMATICALLY DEFINED FUNCTION volume

AUTOMATICALLY DEFINED FUNCTION volume
(progn (defun volume (arg0 arg1 arg2) (values (* arg0 (* arg1 arg2)))) (values (- (volume L0 W0 H0) (volume L1 W1 H1))))

AUTOMATICALLY DEFINED FUNCTIONS
‡ ADFs provide a way to REUSE code ‡ Code is typically reused with different instantiations of the dummy variables (formal parameters)

ADDITION OF V0 AND V1
Fitness case L0 1 2 3 4 5 6 7 8 9 10 3 7 10 3 4 3 5 1 2 8 W0 4 10 9 9 3 3 9 2 6 1 H0 7 9 4 5 2 1 9 9 8 10 L1 2 10 8 1 7 9 1 3 2 7 W1 5 3 1 6 6 5 7 9 6 5 H1 3 1 6 4 1 4 6 2 10 1 V0 84 630 360 135 24 9 405 18 96 80 V1 30 30 48 24 42 180 42 54 120 35 D 54 600 312 111 -18 -171 363 -36 -24 45

DIVIDE AND CONQUER

DIVIDE AND CONQUER
‡ Decompose a problem into sub-problems ‡ Solve the sub-problems ‡ Assemble the solutions of the subproblems into a solution for the overall problem

CHANGE OF REPRESENTATION

CHANGE OF REPRESENTATION

‡ Identify regularities ‡ Change the representation ‡ Solve the overall problem

ADF IMPLEMENTATION
‡ Each overall program in population includes
± a main result-producing branch (RPB) and ± function-defining branch (i.e., automatically defined function, ADF)

‡ In generation 0, create random programs with different ingredients for the RPB and the ADF
± Terminal set for ADF typically contains dummy arguments (formal parameters), such as ARG0, ARG1, « ± Function set of the RPB contains ADF0 ± ADFs are private and associated with a particular individual program in the population

ADF MUTATION
‡ Select parent probabilistically on the basis of fitness ‡ Pick a mutation point from either RPB or an ADF ‡ Delete sub-tree rooted at the picked point ‡ Grow a new sub-tree at the picked point composed of the allowable ingredients appropriate for the picked point ‡ The offspring is a syntactically valid executable program

ADF CROSSOVER
‡ Select parent probabilistically on the basis of fitness ‡ Pick a crossover point from either RPB or an ADF of the FIRST patent ‡ The choice of crossover point in the SECOND parent is RESTRICTED to the picked RPB or to the picked ADF ‡ The sub-trees are swapped ‡ The offspring are syntactically valid executable programs

GENETIC PROGRAMMING II: AUTOMATIC DISCOVERY OF REUSABLE PROGRAMS (Koza 1994)

MAIN POINTS OF 1994 BOOK ‡ Scalability is essential for solving non-trivial problems in artificial intelligence, machine learning, adaptive systems, and automated learning ‡ Scalability can be achieved by reuse ‡ Genetic programming provides a way to automatically discover and reuse subprograms in the course of automatically creating computer programs to solve problems

COMPUTER PROGRAMS

‡ Subroutines provide one way to REUSE code € possibly with different instantiations of the dummy variables (formal parameters) ‡ Loops (and iterations) provide a 2nd way to REUSE code ‡ Recursion provide a 3rd way to REUSE code ‡ Memory provides a 4th way to REUSE the results of executing code

MEMORY

Settable (named)

Indexed vector

Matrix memory

Relational memory

variables memory

LANGDON'S DATA STRUCTURES

‡ ‡ ‡ ‡

Stacks Queues Lists Rings

COMPUTER PROGRAMS

‡ Subroutines provide one way to REUSE code € possibly with different instantiations of the dummy variables (formal parameters) ‡ Loops (and iterations) provide a 2nd way to REUSE code ‡ Recursion provide a 3rd way to REUSE code ‡ Memory provides a 4th way to REUSE the results of executing code

AUTOMATICALLY DEFINED ITERATION (ADI)
‡ The overall program includes an iterationperforming branch (IPB) in addition to a result-producing branch (RPB) and functiondefining branches (ADF) ‡ There are no infinite loops because the iteration is performed over a known, fixed set
± protein or DNA sequence (of varying length) ± time-series data ± two-dimensional array of pixels

‡ Memory is usually involved and is used to communicate between IPB, RPB, and ADF

TRANSMEMBRANE SEGMENT IDENTIFICATION PROBLEM

Goal is to classify a given protein segment as being a transmembrane domain or non-transmembrane area of the protein

TRANSMEMBRANE SEGMENT IDENTIFICATION PROBLEM
(progn (defun ADF0 () (ORN (ORN (ORN (I?) (H?)) (ORN (P?) (G?))) (ORN (ORN (ORN (Y?) (N?)) (ORN (T?) (Q?))) (ORN (A?) (H?)))))) (defun ADF1 () (values (ORN (ORN (ORN (A?) (I?)) (ORN (L?) (W?))) (ORN (ORN (T?) (L?)) (ORN (T?) (W?)))))) (defun ADF2 () (values (ORN (ORN (ORN (ORN (ORN (D?) (E?)) (ORN (ORN (ORN (D?) (E?)) (ORN (ORN (T?) (W?)) (ORN (Q?) (D?)))) (ORN (K?) (P?)))) (ORN (K?) (P?))) (ORN (T?) (W?))) (ORN (ORN (E?) (A?)) (ORN (N?) (R?)))))) (progn (loop-over-residues (SETM0 (+ (- (ADF1) (ADF2)) (SETM3 M0)))) (values (% (% M3 M0) (% (% (% (- L -0.53) (* M0 M0)) (+ (% (% M3 M0) (% (+ M0 M3) (% M1 M2))) M2)) (% M3 M0))))))

TRANSMEMBRANE SEGMENT IDENTIFICATION PROBLEM

‡ in-sample correlation of 0.976 ‡ out-of-sample correlation of 0.968 ‡ out-of-sample error rate 1.6%

AUTOMATICALLY DEFINED LOOP (ADL)

‡ ‡ ‡ ‡

loop initialization branch, LIB loop condition branch, LCB loop body branch, LBB loop update branch, LUB

ADL

COMPUTER PROGRAMS

‡ Subroutines provide one way to REUSE code € possibly with different instantiations of the dummy variables (formal parameters) ‡ Loops (and iterations) provide a 2nd way to REUSE code ‡ Recursion provide a 3rd way to REUSE code ‡ Memory provides a 4th way to REUSE the results of executing code

AUTOMATICALLY DEFINED RECURSION (ADR)
‡ ‡ ‡ ‡ recursion condition branch, RCB recursion body branch, RBB recursion update branch, RUB recursion ground branch, RGB

ADR

HUMAN-COMPETITIVE RESULTS (NOT RELATED TO PATENTS)
Transmembrane segment identification problem for proteins Motifs for D±E±A±D box family and manganese superoxide dismutase family of proteins Cellular automata rule for Gacs-Kurdyumov-Levin (GKL) problem Quantum algorithm for the Deutsch-Jozsa ³early promise´ problem Quantum algorithm for Grover¶s database search problem Quantum algorithm for the depth-two AND/OR query problem Quantum algorithm for the depth-one OR query problem Protocol for communicating information through a quantum gate Quantum dense coding Soccer-playing program that won its first two games in the 1997 Robo Cup competition Soccer-playing program that ranked in the middle of field in 1998 Robo Cup competition Antenna designed by NASA for use on spacecraft Sallen-Key filter

PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP

‡ ‡ ‡ ‡ ‡

Toy problems Human-competitive non-patent results 20th-century patented inventions 21st-century patented inventions Patentable new inventions

GENETIC PROGRAMMING III: DARWINIAN INVENTION AND PROBLEM SOLVING (Koza, Bennett, Andre, Keane 1999)

SUBROUTINE DUPLICATION

SUBROUTINE CREATION

SUBROUTINE DELETION

ARGUMENT DUPLICATION

ARGUMENT DELETION

16 ATTRIBUTES OF A SYSTEM FOR AUTOMATICALLY CREATING COMPUTER PROGRAMS ‡ Starts with "What needs to be done" ‡ Tells us "How to do it" ‡ Produces a computer program ‡ Automatic determination of program size ‡ Code reuse ‡ Parameterized reuse ‡ Internal storage ‡ Iterations, loops, and recursions ‡ Self-organization of hierarchies ‡ Automatic determination of program architecture ‡ Wide range of programming constructs ‡ Well-defined ‡ Problem-independent ‡ Wide applicability ‡ Scalable ‡ Competitive with humanproduced results

GENETIC PROGRAMMING PROBLEM SOLVER (GPPS)

AUTOMATIC SYNTHESIS OF BOTH THE TOPOLOGY AND SIZING OF ANALOG ELECTRICAL CIRCUITS BY MEANS OF DEVELOPMENTAL GENETIC PROGRAMMING

AUTOMATED CIRCUIT SYNTHESIS
‡ The topology of a circuit includes specifying the gross number of components in the circuit, the type of each component (e.g., a capacitor), and a netlist specifying where each lead of each component is to be connected. ‡ Sizing involves specifying the values (typically numerical) of each of the circuit's components.

COMPONENT-CREATING FUNCTIONS

‡ ‡ ‡ ‡ ‡

Resistor R function Capacitor C function Inductor L function Diode D function Transistor Q function (3-leaded)

COMPONENT-CREATING FUNCTIONS

TOPOLOGY-MODIFYING FUNCTIONS

‡ SERIES division ‡ PARALLEL division ‡ VIA ‡ FLIP

TOPOLOGY-MODIFYING FUNCTIONS

DEVELOPMENT-CONTROLLING FUNCTIONS

‡ END function ‡ NOP (No Operation) function ‡ SAFE_CUT function

THE INITIAL CIRCUIT

DEVELOPMENTAL GP

(LIST (C (± 0.963 (± (± -0.875 -0.113) 0.880)) (series (flip end) (series (flip end) (L -0.277 end) end) (L (± 0.640 0.749) (L -0.123 end)))) (flip (nop (L -0.657 end)))))

CAPACITOR-CREATING FUNCTION

(LIST (C (± 0.963 (± (± -0.875 -0.113) 0.880)) (series (flip end) (series (flip end) (L ± 0.277 end) end) (L (± -0.640 0.749) (L -0.123 end)))) (flip (nop (L -0.657 end)))))

CAPACITOR-CREATING FUNCTION

SERIES DIVISION FUNCTION
(LIST (C (± 0.963 (± (± -0.875 -0.113) 0.880)) (series (flip end) (series (flip end) (L ± 0.277 end) end) (L (± -0.640 0.749) (L -0.123 end)))) (flip (nop (L -0.657 end)))))

SERIES DIVISION

DEVELOPMENTAL GP

EVALUATION OF FITNESS

+
Program Tree

IN

z0

OUT

mbryonic ircuit

Fully Designed ircuit (Net raph) ircuit Netlist (ascii) ircuit imulator ( PI E) ircuit Behavior (Output) Fitness

DESIRED BEHAVIOR OF A LOWPASS FILTER

EVOLVED CAMPBELL FILTER

U. S. patent 1,227,113 George Campbell American Telephone and Telegraph 1917

EVOLVED ZOBEL FILTER

U. S. patent 1,538,964 Otto Zobel American Telephone and Telegraph Company 1925

EVOLVED SALLEN-KEY FILTER

EVOLVED DARLINGTON EMITTERFOLLOWER SECTION

U. S. patent 2,663,806 Sidney Darlington Bell Telephone Laboratories 1953

NEGATIVE FEEDBACK

HAROLD BLACK¶S RIDE ON THE LACKAWANNA FERRY

Courtesy of Lucent Technologies

20th-CENTURY PATENTS
Campbell ladder topology for filters Zobel ³M-derived half section´ and ³constant K´ filter sections Crossover filter Negative feedback Cauer (elliptic) topology for filters PID and PID-D2 controllers Darlington emitter-follower section and voltage gain stage Sorting network for seven items using only 16 steps 60 and 96 decibel amplifiers Analog computational circuits Real-time analog circuit for time-optimal robot control Electronic thermometer Voltage reference circuit Philbrick circuit NAND circuit Simultaneous synthesis of topology, sizing, placement, and routing

PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP

‡ ‡ ‡ ‡ ‡

Toy problems Human-competitive non-patent results 20th-century patented inventions 21st-century patented inventions Patentable new inventions

SIX POST-2000 PATENTED INVENTIONS

EVOLVED HIGH CURRENT LOAD CIRCUIT

REGISTER-CONTROLLED CAPACITOR CIRCUIT

LOW-VOLTAGE CUBIC CIRCUIT

VOLTAGE-CURRENT-CONVERSION CIRCUIT

LOW-VOLTAGE BALUN CIRCUIT

TUNABLE INTEGRATED ACTIVE FILTER

21st-CENTURY PATENTED INVENTIONS

Low-voltage balun circuit Mixed analog-digital variable capacitor circuit High-current load circuit Voltage-current conversion circuit Cubic function generator Tunable integrated active filter

PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP

‡ ‡ ‡ ‡ ‡

Toy problems Human-competitive non-patent results 20th-century patented inventions 21st-century patented inventions Patentable new inventions

NOVELTY-DRIVEN EVOLUTION
‡ Two factors in fitness measure
± Circuit¶s behavior in the frequency domain ± Largest number of nodes and edges (circuit components) of a subgraph of the given circuit that is isomorphic to a subgraph of a template representing the prior art. Graph isomorphism algorithm with the cost function being based on the number of shared nodes and edges (instead of just the number of nodes).

NOVELTY-DRIVEN EVOLUTION
‡ For circuits not scoring the maximum number of hits (101), the fitness of a circuit is the product of the two factors. ‡ For circuits scoring 101 hits (100%compliant individuals), fitness is the number of shared nodes and edges divided by 10,000.

PRIOR ART TEMPLATE

NON-INFRINGING SOLUTION NO. 1

NON-INFRINGING SOLUTION NO. 5

GP AS AN INVENTION MACHINE

CIRCUIT-CONSTRUCTING PROGRAM TREE WITH ADFs

LOWPASS FILTER WITH ADFs

ADF0

AUTOMATIC SYNTHESIS OF CIRCUIT LAYOUT INCLUDING THE PLACEMENT OF COMPONENTS AND ROUTING OF WIRES ALONG WITH THE TOPOLOGY AND SIZING

CIRCUIT LAYOUT
‡ Circuit placement involves the assignment of each of the circuit's components to a particular physical location on a printed circuit board or silicon wafer. ‡ Routing involves the assignment of a particular physical location to the wires between the leads of the circuit's components.

LAYOUT

LAYOUT ² GENERATION 0

100%-COMPLIANT LOWPASS FILTER GENERATION 25 WITH 5 CAPACITORS AND 11 INDUCTORS € AREA OF 1775.2

100%-COMPLIANT LOWPASS FILTER GENERATION 30 WITH 10 INDUCTORS AND 5 CAPACITORS € AREA OF 950.3

100%-COMPLIANT LOWPASS FILTER BEST-OF-RUN CIRCUIT OF GENERATION 138 WITH 4 INDUCTORS AND 4 CAPACITORS € AREA OF 359.4

LAYOUT € 60 DB AMPLIFIER

AUTOMATIC SYNTHESIS OF BOTH THE TOPOLOGY AND TUNING OF CONTROLLERS

PROGRAM TREE FOR A CONTROLLER

CONTROLLER BLOCKS
‡ ‡ ‡ ‡ ‡ ‡ ‡ gain integrator differentiator adder subtractor multiplier differential input integrators ‡ inverter ‡ ‡ ‡ ‡ ‡ ‡ ‡ ‡ lead lag two-parameter lag absolute value limiter divider delay conditional operators (switches)

FUNCTION SET FOR CONTROLLER SYNTHESIS
F = {GAIN,INVERTER,LEAD,LAG,LAG2, DIFFERENTIAL_INPUT_INTEGRATOR, DIFFERENTIATOR, ADD_SIGNAL, SUB_SIGNAL,ADD_3_SIGNAL,ADF0, ADF1,ADF2,ADF3,ADF4}

TERMINAL SET FOR CONTROLLER SYNTHESIS

T = { REFERENCE_SIGNAL, CONTROLLER_OUTPUT, PLANT_OUTPUT}

CONSTRAINED SYNTACTIC STRUCTURE
‡ A grammar that specifies what functions and terminals are allowed as arguments to particular functions ‡ For example, the first argument of the GAIN function must be a value-setting subtree whereas the second can be from the general pool of functions ‡ Also known as strong typing

TWO-LAG PLANT
Plant 600

+40.0 622 Control Variable 610 620 626

+1.0 632

+1.0 642 Plant Output

Lag 630

636

Lag 640

680

624 -40.0

8 FITNESS CASES
‡ 8 elements of the fitness represent 2 v 2 v 2 choices: measure

± 2 different values of the plant's internal gain, K (1.0 and 2.0), ± 2 different values of the plant's time constant X (0.5 and 1.0), ± 2 different values for the height of the reference signal (rising from 0 to 1 volts or from 0 to 1 microvolts at t = 100 milliseconds

FITNESS MEASURE
‡ For each of these 8 fitness cases, a transient analysis (time domain) is performed using the SPICE simulator. ‡ The contribution to fitness for the 8 elements is

± Integral of time-weighted absolute error (ITAE) ± e(t) is difference between plant output and reference signal. ± Multiplication by B (106 or 1) makes both reference signals equally influential. ± Additional weighting function, A, heavily penalizes non-compliant amounts of overshoot. A weights all variations up to 2% above the reference signal by 1.0, but bigger variations by 10.0.

EVOLVED CONTROLLER FOR TWO-LAG PLANT

LESS ITAE AND OVERSHOOT

BETTER DISTURBANCE REJECTION

REVERSE ENGINEERING OF METABOLIC PATHWAYS

EVOLVED PATHWAY

ANTENNA SYNTHESIS USING GP

(PROGN3 (TURN-RIGHT 0.125) (LANDMARK (REPEAT 2 (PROGN2 (DRAW 1.0 HALF-MM-WIRE) (DRAW 0.5 NO-WIRE))) (TRANSLATE-RIGHT 0.125 0.75))

USING A TURTLE TO DRAW TWODIMENSIONAL ANTENNA

BEST-OF-RUN ANTENNA FROM GENERATION 90

3-DIMENSIONAL ANTENNA

NASA EVOLVED ANTENNA

To be on satellite to be launched in 2004

OTHER STRUCTURES

GENETIC NETWORK FOR lac operon

EVOLVED NETWORK
(IF (< LACTOSE_LEVEL 9.139 ) (IF (< REPRESSOR_LEVEL 6.270 ) (IF (> GLUCOSE_LEVEL 5.491 ) 2.02 (IF (< CAP_LEVEL 0.639 ) 2.033 (IF (< CAP_LEVEL 4.858 ) (IF (> LACTOSE_LEVEL 2.511 ) (IF (> CAP_LEVEL 7.807 ) 5.586 (IF (> LACTOSE_LEVEL 2.114 ) 1.978 2.137 ) ) 0.0 ) (IF (> REPRESSOR_LEVEL 4.015 ) 0.036 (IF (< GLUCOSE_LEVEL 5.128 ) 10.0 (IF (< REPRESSOR_LEVEL 4.268 ) 2.022 9.122 ) ) ) ) ) ) (IF (> CAP_LEVEL 0.842 ) 0.0 5.97 ) ) (IF (< CAP_LEVEL 1.769 ) 2.022 (IF (< GLUCOSE_LEVEL 2.382 ) (IF (> LACTOSE_LEVEL 1.256 ) (IF (> LACTOSE_LEVEL 1.933 ) (IF (> GLUCOSE_LEVEL 2.022 ) (IF (< GLUCOSE_LEVEL 5.183 ) 6.323 (IF (> CAP_LEVEL 1.208 ) 9.713 0.842 ) ) 10.0 ) (IF (> GLUCOSE_LEVEL 6.270 ) 2.109 ) 1.965 ) ) 0.665 ) 1.982 ) ) )

IN C-STYLE PSEUDO CODE
if(LACTOSE_LEVEL < 9.139) { if(REPRESSOR_LEVEL < 6.270) { LAC_mRNA_LEVEL = 2.022; } else { LAC_mRNA_LEVEL = 0.0; } } else { if(CAP_LEVEL < 1.769) { LAC_mRNA_LEVEL = 2.022; } else { if(GLUCOSE_LEVEL < 2.382) { LAC_mRNA_LEVEL = 10.0; } else { LAC_mRNA_LEVEL = 1.982; } } }

PARAMETERIZED TOPOLOGIES

‡ One of the most important characteristics of computer programs is that they ordinarily contain inputs (free variables) and conditional operations

PARAMETERIZED TOPOLOGY FOR LOWPASS FILTER

PARAMETERIZED TOPOLOGY FOR HIGHPASS FILTER

PARAMETERIZED TOPOLOGY FOR GENERAL-PURPOSE CONTROLLER

EVOLVED EQUATIONS FOR GENERALPURPOSE CONTROLLER

EVOLVED EQUATIONS FOR GENERALPURPOSE CONTROLLER

PATENTABLE NEW INVENTIONS

PID tuning rules that outperform the Ziegler-Nichols and Åström-Hägglund tuning rules General-purpose controllers outperforming Ziegler-Nichols and Åström-Hägglund rules

PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP

‡ ‡ ‡ ‡ ‡

Toy problems Human-competitive non-patent results 20th-century patented inventions 21st-century patented inventions Patentable new inventions

PARALLELIZATION WITH SEMI-ISOLATED SUBPOPULATIONS

CONTROL PARAMETER FILE

KEYBOARD

BOSS (Tram)

MESH NODE

MESH NODE

MESH NODE

HOST (Pentium PC)

DEBUGGER (optional)

MESH NODE

MESH NODE

MESH NODE

OUTPUT FILE

VIDEO DISPLAY MESH NODE MESH NODE MESH NODE

GP PARALLELIZATION
‡ Like Hormel, Get Everything Out of the Pig, Including the Oink ‡ Keep on Trucking ‡ It Takes a Licking and Keeps on Ticking ‡ The Whole is Greater than the Sum of the Parts

PETA-OPS

‡ Human brain operates at 1012 neurons operating at 103 per second = 1015 ops per second ‡ 1015 ops = 1 peta-op = 1 bs (brain second)

EVOLVABLE HARDWARE CORNER OF XILINX XC6216

FUNCTION UNIT FOR CELL OF XILINIX XC6216

SORTING NETWORK
A B C

D H G F E

EVOLVED SORTING NETWORK

GP 1987±2002
System Dates Speed-up over first system 1 (base) 9 HumanProblem Category competitive results 0 2 toy problems human-competitive results not related to patented inventions 20th-century patented inventions 20th-century patented inventions 21st-century patented inventions patentable new inventions

Serial LISP 64 transputers

1987±1994 1994±1997

64 PowerPC¶s 70 Alpha¶s 1,000 Pentium II¶s 4-week runs on 1,000 Pentium II¶s

1995±2000 1999±2001 2000±2002 2002-2003

204 1,481 13,900 130,000

12 2 12 2

PROMISING GP APPLICATION AREAS

‡ Problem areas involving many variables that are interrelated in highly non-linear ways ‡ Inter-relationship of variables is not well understood ‡ Discovery of the size and shape of the solution is a major part of the problem ‡ "Black art" problems

PROMISING GP APPLICATION AREAS (CONTINUED)

‡ Areas where you simply have no idea how to program a solution, but where you know what you want

PROMISING GP APPLICATION AREAS (CONTINUED)

‡ Problem areas where a good approximate solution is satisfactory y design y control y bioinformatics y classification y data mining y system identification y forecasting

PROMISING GP APPLICATION AREAS (CONTINUED) y Areas where large computerized databases are accumulating and computerized techniques are needed to analyze the data y genome, protein, microarray data y satellite image data y astronomical data y petroleum databases y financial databases y medical records y marketing databases

PROMISING GP APPLICATION AREAS (CONTINUED)

y Areas for which humans find it very difficult to write good programs y parallel computers y cellular automata y multi-agent strategies y field-programmable game arrays y digital signal processors y swarm intelligence

CHARACTERISTICS SUGGESTING USE OF GP
(1) discovering the size and shape of the solution, (2) reusing substructures, (3) discovering the number of substructures, (4) discovering the nature of the hierarchical references among substructures, (5) passing parameters to a substructure, (6) discovering the type of substructures (e.g., subroutines, iterations, loops, recursions, or storage), (7) discovering the number of arguments possessed by a substructure, (8) maintaining syntactic validity and locality by means of a developmental process, or (9) discovering a general solution in the form of a parameterized topology containing free variables

DESIGNING A GIRAFFE
‡ ‡ ‡ ‡ ‡ ‡ Long neck Long tongue Vegetable-digesting enzymes in stomach 4 legs Long legs Brown coloration

THE DESIGN OF A GOOD GIRAFE
Neck length 15.11 feet loating point Tongue length Carnivorous? Number Leg of legs length 4 Integer 9.96 feet Coloration

14 inches No loating point oolean

rown

loating Categorical point

NON-LINEARITY ² GIRAFE
‡ Taken one-by-one, some gene values found in a giraffe, such as the long neck contribute (alone) negatively to fitness
± requires considerable material to construct ± requires considerable energy to maintain ± prone to injury (thereby hurting rate of survival and reproduction)

‡ Thus, maximizing any one variable will not lead to the global optimum solution

NON-LINEARITY (CONTINUED)

‡ When the variables are taken in pairs (there are 15 possible pairs), many combinations of pairs (e.g., Long neck and long tongue) are doubly detrimental

NON-LINEARITY (CONTINUED)
‡ But, certain combinations of traits, when taken together, are "co-adapted sets of alleles" that yield a very fit animal for eating high acacia leaves in the jungle environment, having good camouflage, having high escape velocity when faced with predators, and exploiting a niche (and avoiding competition) with other animals feeding on low-hanging vegetation

SEARCH METHODS IN GENERAL
‡ ‡ ‡ ‡ ‡ Initial structure(s) Fitness measure Operations for creating new structures Parameters Termination criterion and method of designating the result

SPACE WITH MANY LOCAL OPTIMA

SEARCH METHODS
‡ Blind random search does not use acquired information in deciding on the future direction of the search ‡ Hill combing and gradient descent use acquired information; however, they are prone to becoming trapped on local optima ‡ The previous point is especially true for nontrivial search spaces

7 DIFFERENCES BETWEEN GP AND ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING APPROACHES

REPRESENTATION

Genetic programming overtly conducts it search for a solution to the given problem in program space

ROLE OF POINT-TO-POINT TRANSFORMATIONS IN THE SEARCH
Genetic programming does not conduct its search by transforming a single point in the search space into another single point, but instead transforms a set of points into another set of points

ROLE OF HILL CLIMBING IN THE SEARCH
Genetic programming does not rely exclusively on greedy hill climbing to conduct its search, but instead allocates a certain number of trials, in a principled way, to choices that are known to be inferior

DETERMINISM IN THE SEARCH

Genetic programming conducts its search probabilistically

ROLE OF AN EXPLICIT KNOWLEDGE BASE

Genetic programming does NOT make use of a knowledge base

ROLE OF FORMAL LOGIC IN THE SEARCH

Genetic programming does not utilize formal logic in it¶s search strategy. Contradictory alternatives are created and actively maintained.

UNDERPINNINGS OF THE TECHNIQUE

Biologically inspired

TURING (1948)
Turing made the connection between searches and the challenge of getting a computer to solve a problem without explicitly programming it in his 1948 essay ³Intelligent Machines´
"Further research into intelligence of machinery will probably be very greatly concerned with 'searches' ... ³

TURING¶S 3 APPROACHES TO MACHINE INTELLIGENCE (1948)

LOGIC-BASED SEARCH
One approach that Turing identified is a search through the space of integers representing candidate computer programs.

TURING¶S 3 APPROACHES (CONTINUED)

CULTURAL SEARCH
A second approach is the "cultural search³ which relies on knowledge and expertise acquired over a period of years from others (akin to present-day knowledgebased systems).

TURING¶S 3 APPROACHES (CONTINUED) GENETICAL OR EVOLUTIONARY SEARCH
"There is the genetical or evolutionary search by which a combination of genes is looked for, the criterion being the survival value.³

TURING (1950)
From Turing¶s 1950 paper "Computing Machinery and Intelligence" «
³We cannot expect to find a good childmachine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution, by the identifications´

TURING (1950) (CONTINUED)

³Structure of the child machine = Hereditary material ³Changes of the child machine = Mutations ³Natural selection = Judgment of the experimenter´

You're Reading a Free Preview

Download
scribd
/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->