
Genetic Algorithm for Structuring and Optimizing of Neural Controller

Marwan A. Ali, Mat Sakim, H. A.


School of Electrical and Electronic Engineering
Universiti Sains Malaysia, Engineering Campus
14300 Nibong Tebal, Seberang Perai Selatan, Pulau Pinang, Malaysia
E-mail: amylia@eng.usm.my, marwan_engineer2000@yahoo.com
Abstract


This paper presents a learning algorithm for the control of nonlinear dynamical systems. The method is based on a new approach to controlling different linear and nonlinear discrete dynamic systems, in which evolutionary algorithms train the parameters of neural networks (NNs). The structure considered is based on a configuration that uses a feedforward neural network (FNN) as an adaptive controller within a model-reference approach. A variable-length chromosome representation is introduced with real-coded genetic algorithms (GAs), as a way to avoid modifying the GA operators as bit-string coding would require. The GAs are used not only to find and train the network parameters of the NN, but also to find the optimal network architecture of the FNN.

Keywords: Genetic Algorithms; Optimization; Neural Networks; Adaptive Model Reference.

1. Introduction.
Evolutionary algorithms (EAs) have been applied to evolve network weights and topologies (structures) simultaneously [1, 2]. EC methodologies are sometimes used in combination with other methodologies. For example, Montana and Davis [3] described the use of an evolutionary algorithm such as GAs to train NNs. Instead of replacing the entire population each generation, only one or two individuals are produced, which then have to compete to be included in the new population. The network weights were represented as real, rather than binary, numbers; Montana and Davis' paradigm includes an option for improving population members using backpropagation (BP). This hill-climbing capability, however, did not give better results than using GAs alone [1]. Several evolutionary computation (EC) techniques for improving structure determination have been developed for NNs in the literature. Yao [2] used a new evolutionary system, EPNet, for evolving NNs. The EA used in EPNet is based on Fogel's evolutionary programming. EPNet evolves NN architectures and connection weights (including biases) simultaneously in order to reduce the noise in fitness evaluation. The parsimony of evolved NNs is encouraged by preferring node and connection deletion to addition. EPNet has been tested on a number of benchmark problems in machine learning and NNs; the experimental results show that EPNet can produce very compact NNs with good generalization ability in comparison with other algorithms. Peng [7] gives a new method to construct causal networks by using EP. The evolutionary algorithm is first used to learn the activation values of nodes, and then it is used again to simultaneously acquire both the topology and the weights of the causal networks. Some researchers have used non-EA based methods for determining NN structures [4, 5, 6]. Fujita [4] proposed statistical estimation of the number of hidden units for FNNs by adding hidden units one by one. He first derives the decrease in output error per hidden unit based on the least-squares approximation, and then theoretically estimates the expected maximum value of the decrease as the largest value among a number of samples from an ideal distribution of the hidden unit. In other words, the best hidden unit is selected out of a finite number of candidate units that realize the various functions of hidden units. The expected largest value depends not only on the number of candidate units but also on the number of learning data sets, which represent the complexity and difficulty of the learning tasks. The expected largest decrease per hidden unit is used to estimate the total number of hidden units required to reduce the output error to a desired value. De-Shuang [5] makes an exhaustive analysis of the structural properties of FNNs from the viewpoint of optimization, i.e., minimizing the least-squares error cost function (objective function) defined at the network outputs. He discusses two cases, adding a hidden node and adding an input training sample one by one, and analyzes the changing process of the objective function. In addition, for the case of adding a hidden unit, he presents an optimization method for the hidden node output vectors. The problem of classification performance and structure decision of MLP NNs was discussed briefly in the paper of Dianzhi [6].

2. Variable Length Chromosome Representation.


A variable architecture means a variable number of network parameters; consequently, the variable-length bit-string representation is avoided when the architecture of the network is also determined by GAs, because of the difficulty of handling such strings. Therefore, the real-coded representation used in this work is an alternative approach suited for this task. There are two major approaches to evolving NN architectures [2]. One is the evolution of pure architectures (i.e., architectures without weights), in which the connection weights are trained after a near-optimal architecture has been found. The other is the simultaneous evolution of both architectures and weights.



2.1. The Evolution of Pure Architectures.


This type of evolution includes two extremes. At one extreme, all the connections and nodes of an architecture can be specified by the genotype; this kind of representation scheme is called a strong specification scheme. At the other extreme, only the most important parameters of an architecture, such as the number of hidden layers and the number of hidden nodes in each layer, are encoded. Other details of the architecture are either predefined or left to the training process to decide. This kind of representation scheme is called an indirect encoding scheme. The evolution of pure architectures has difficulty in evaluating fitness accurately. This is because the fitness of a trained phenotype is used to stand for the fitness of the genotype, and such an evaluation is very noisy. There are two major sources of noise:

The first source is the random initialization of the weights.

The second source is the training algorithm.

As a result, the evolution of pure architectures can be very inefficient.

2.2. The Simultaneous Evolution of Both Architectures and Weights.

To alleviate the noisy fitness evaluation problem, one way is to have a one-to-one mapping between genotypes and phenotypes: both the architecture and the weight information are encoded in each individual and are evolved simultaneously.

3. Problem Description.

Neural networks have been widely used in pattern recognition, forecasting, optimization, control of dynamic systems and language learning, but their mapping capability depends on their structure, and there is not yet a problem-independent way to choose a good network topology [7]. Generally, the numbers of input and output nodes are fixed by the definition of the problem. The number of hidden nodes and hidden layers, however, can be varied; the number of network weights increases linearly with the number of hidden layers and as the square of the number of hidden nodes [8], as equation (1) shows:

  Number of network weights = (n_i + n_o)*n_h + (n_l - 1)*n_h^2    (1)

where n_i is the number of input nodes, n_o is the number of output nodes, n_h is the number of hidden nodes and n_l is the number of hidden layers. Equation (1) gives the total number of weights for an FNN, excluding the bias weights. As mentioned earlier, the number of hidden nodes can be varied. A possible strategy to find a near-optimal number of hidden nodes is to start with a small number, for example two. Training is then started and halted when the validation error is seen to have stopped decreasing. The number of hidden nodes is then increased by one, the network weights are reinitialized, and training is restarted; the validation error is again monitored, with training halted when it is seen to have reached a minimum. This process of adding a hidden neuron and restarting training is repeated until the validation error is observed to bottom out and start to increase. At this point, the network has the optimum, or near-optimum, number of hidden nodes [8]. Therefore, to simplify the task of choosing the number of hidden nodes, GAs are well suited for this problem [3].
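As an illustration of equation (1) and the node-growing strategy just described, the following Python sketch counts the weights of an FNN and grows the hidden layer until the validation error bottoms out. The function train_and_validate is a hypothetical stub standing in for whatever training routine is used; it is assumed to train a network of the given sizes and return its minimum validation error.

def num_weights(ni, no, nh, nl):
    # Total FNN weights, excluding bias weights, as in equation (1).
    return (ni + no) * nh + (nl - 1) * nh ** 2

def grow_hidden_nodes(train_and_validate, ni, no, nl=1, nh_start=2, nh_max=8):
    # Add hidden nodes one at a time; stop when validation error starts to rise.
    best_nh, best_err = nh_start, float("inf")
    for nh in range(nh_start, nh_max + 1):
        err = train_and_validate(ni, no, nh, nl)  # train until validation error stops decreasing
        if err >= best_err:                       # error bottomed out and began to increase
            break
        best_nh, best_err = nh, err
    return best_nh, best_err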


4. The Proposed Genetic Structure Selection for the Neuro-Controller.

The proposed procedure used in this work for selecting the network architecture (as well as tuning the parameters) is based on a variable-length genome and is similar in concept to the simultaneous evolution of architectures and weights explained in section 2.2. To illustrate the crossover operation in this task, two chromosomes may be selected according to their fitness values from the population; these chromosomes may have different numbers of hidden nodes and consequently different numbers of weights, in which case the unused positions of the string take the value zero. The string can be represented as:

  [ W1 | W2 | W3 | W4 | W5 | ... | unused positions (zeros) | n_h ]

Fig-1. Representation of the Variable-Length Chromosome (W1-W5 are weights of the neuro-controller; trailing unused positions are zero; n_h is stored at the end of the string).

The applicability of GAs with real coding allows the existing operators to be used without any modification to handle these variable-length strings.

The proposed genetic learning scheme proceeds as follows.

Step1: Initialize the genetic operators Pc, Pm, N_pop and the maximum number of generations.

Step2: Generate an initial population of networks at random. The number of hidden nodes and the initial connection density for each network are generated uniformly at random within certain ranges. The random initial weights are uniformly distributed inside a small range, typically between -1 and +1.

Step3: First take the number of hidden nodes n_h (in this work considered to be between 2 and 8 nodes) as an integer number from the end of each chromosome in the population, then partially train each network on the training set to evaluate the objective function (for example, the MSE of equation (2)), and calculate the fitness function as in equation (3) or (4). The mean squared error (MSE) is used as the objective function:

  MSE = (1/N_p) * Σ_{k=1}^{N_p} [y_p(k) - y_m(k)]^2    (for SISO plants)    (2)
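A minimal Python sketch of this variable-length, real-coded representation follows. The exact gene layout is an illustrative assumption: up to MAX_W weight slots, unused slots held at zero, and the integer n_h stored at the end of the string, so that ordinary single-point crossover applies without modification.

import random

NI, NO, NL = 1, 1, 1                                 # SISO plant with one hidden layer (assumed)
NH_MAX = 8                                           # upper bound on hidden nodes
MAX_W = (NI + NO) * NH_MAX + (NL - 1) * NH_MAX ** 2  # equation (1) at nh = NH_MAX

def new_chromosome():
    # Layout: [w1 ... w_MAX_W, nh]; unused weight positions take the value zero.
    nh = random.randint(2, NH_MAX)
    used = (NI + NO) * nh + (NL - 1) * nh ** 2
    genes = [random.uniform(-1.0, 1.0) for _ in range(used)]
    genes += [0.0] * (MAX_W - used)
    genes.append(float(nh))                          # nh is read from the end (Step 3)
    return genes

def decode(chrom):
    # Recover nh and the active weights from a chromosome.
    nh = int(chrom[-1])
    used = (NI + NO) * nh + (NL - 1) * nh ** 2
    return nh, chrom[:used]

def crossover(a, b):
    # Plain single-point crossover; the zero padding keeps lengths compatible.
    cut = random.randint(1, MAX_W - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]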

where:

y_p(k) is the output of the plant at sample k,

y_m(k) is the output of the model reference at sample k,

N_p is the number of training patterns.

Since the GA maximizes its fitness function, it is necessary to map the objective function (MSE) to a fitness function. The most commonly used objective-to-fitness transformation is of the form [9]:

  FITNESS = C_max - objective function,  if objective function < C_max
          = 0,                           otherwise                        (3)

The constant C_max is chosen as an input coefficient. The other form of objective-to-fitness transformation is [9]:

  FITNESS = 1 / (a + objective function)    (4)

where a is a constant chosen to avoid division by zero.

Step4: Place all chromosomes in the current population in descending order of fitness (the first one is the fittest).

Step5: Select individuals using a hybrid selection method (roulette wheel plus deterministic selection), and apply the real-coded genetic operators of mutation and crossover (single point).

Step6: Stop if the maximum number of generations of the GA has been reached; otherwise increment the generation count by one and go to Step3.

The following flowchart illustrates the proposed genetic learning scheme.
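Putting Steps 1 to 6 together, one plausible reading of the loop is sketched below, reusing new_chromosome and crossover from the previous sketch. Here evaluate_mse is a hypothetical stub that partially trains the network encoded by a chromosome and returns its MSE (equation (2)); equation (4) is used for the objective-to-fitness mapping, and the hybrid selection of Step 5 is rendered as deterministic survival of the fitter half plus roulette-wheel choice of parents.

import random

def fitness(mse, a=1e-6):
    # Objective-to-fitness mapping of equation (4); a avoids division by zero.
    return 1.0 / (a + mse)

def ga_loop(evaluate_mse, n_pop=20, pc=0.8, pm=0.05, max_gen=3000):
    pop = [new_chromosome() for _ in range(n_pop)]        # Steps 1-2
    for _ in range(max_gen):                              # Step 6 loop
        scored = sorted(((evaluate_mse(c), c) for c in pop),
                        key=lambda mc: mc[0])             # Steps 3-4: fittest first
        elite = [c for _, c in scored[: n_pop // 2]]      # deterministic half of Step 5
        fits = [fitness(m) for m, _ in scored[: n_pop // 2]]
        total = sum(fits)

        def roulette():                                   # roulette-wheel half of Step 5
            r, acc = random.uniform(0.0, total), 0.0
            for c, f in zip(elite, fits):
                acc += f
                if acc >= r:
                    return c
            return elite[-1]

        children = []
        while len(children) < n_pop - len(elite):
            p1, p2 = roulette(), roulette()
            if random.random() < pc:
                c1, c2 = crossover(p1, p2)                # single-point crossover
            else:
                c1, c2 = p1[:], p2[:]
            for c in (c1, c2):
                if random.random() < pm:
                    # Real-coded mutation of one weight gene (the nh gene is left intact).
                    c[random.randrange(len(c) - 1)] += random.gauss(0.0, 0.1)
                children.append(c)
        pop = elite + children[: n_pop - len(elite)]
    return min(pop, key=evaluate_mse)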

5. Simulation Results.

In this section, different examples are selected, including different types of linear and nonlinear plants; a sinusoidal test input is applied to test the behavior of the proposed method. All simulation results of the MRAC with NN parameters use the same GA settings. The network to be trained starts with a random integer number of hidden nodes in the range of 2 to 8. The network size can be increased or decreased until the network successfully identifies all classes in the training set with a minimum number of nodes and interconnections. Once this is achieved, at the end of the generations there is a controller (NN) that makes the plant track the model reference optimally (with minimum error function).

Fig-2. The MRAC of Example 1: (a) Optimal Hidden Node Selection. (b) Model-Reference and Plant Output. (c) Control Signal. (d) Output Error. (e) Best MSE.

Example 1:
The second-order stable nonlinear plant (linear in the output and nonlinear in the input) can be described by the following difference equation:

  y_p(k+1) = 0.3*y_p(k) + 0.6*y_p(k-1) + 0.6*sin(π*u(k)) + 0.3*sin(3*π*u(k)) + 0.1*sin(5*π*u(k))    (5)

The model reference is described by equation (6):

  y_m(k+1) = 0.6*y_m(k) + 0.2*y_m(k-1) + r(k)    (6)

The simulation results are shown in Fig.2(a, b, c, d, e), showing the best hidden node selection against the generations, the output of the model reference and the plant, the nonlinear adaptive control action, the output error and the best MSE against the generations, respectively.
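For concreteness, the plant of equation (5) and the model reference of equation (6) can be stepped forward in time as in the following sketch; controller is a stub for whatever control law the evolved NN produces.

import math

def plant_step(yp, yp1, u):
    # Example 1 plant, equation (5).
    return (0.3 * yp + 0.6 * yp1
            + 0.6 * math.sin(math.pi * u)
            + 0.3 * math.sin(3 * math.pi * u)
            + 0.1 * math.sin(5 * math.pi * u))

def model_step(ym, ym1, r):
    # Model reference, equation (6).
    return 0.6 * ym + 0.2 * ym1 + r

def simulate(controller, r_seq):
    # Roll both systems forward and record the output error y_p - y_m.
    yp = yp1 = ym = ym1 = 0.0
    err = []
    for r in r_seq:
        u = controller(yp, yp1, r)                # hypothetical NN controller stub
        yp, yp1 = plant_step(yp, yp1, u), yp
        ym, ym1 = model_step(ym, ym1, r), ym
        err.append(yp - ym)
    return err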


Example 2:
The second-order stable nonlinear plant can be described by the following difference equation:

  y_p(k+1) = 1.5*y_p(k)*y_p(k-1) / (1 + y_p^2(k) + y_p^2(k-1)) + M    (7)

where M = 0.5*sin(0.5*(y_p(k) + y_p(k-1))) * cos(0.5*y_p(k-1)) + 1.2*u(k).

The plant is to be controlled by the NN controller to follow the model reference described by equation (6). The simulation results are shown in Fig.3(a, b, c, d, e, f, g): the best hidden node selection against the generations with MSE and SAE, the output of the model reference and the plant with MSE, the nonlinear adaptive control action, the output error, the best MSE, the model reference and plant output with SAE, and the required control signal, respectively. Here SAE denotes the sum of absolute errors:


  SAE = Σ_k |y_p(k) - y_m(k)|    (8)
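Both performance indices, the MSE of equation (2) and the SAE of equation (8), are direct to compute from recorded trajectories:

def mse(yp_seq, ym_seq):
    # Mean squared error, equation (2), for SISO plants.
    return sum((yp - ym) ** 2 for yp, ym in zip(yp_seq, ym_seq)) / len(yp_seq)

def sae(yp_seq, ym_seq):
    # Sum of absolute errors, equation (8).
    return sum(abs(yp - ym) for yp, ym in zip(yp_seq, ym_seq))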

Example 3:
The asymptotically stable gas turbine process is described by equation (9):

  G(s) = (1/Δ(s)) * [ 14.96(s+1.7)    9515(s+1.898)
                      85.2(s+1.44)    12400(s+2.037) ]    (9)

where Δ(s) = (s+10)*(s^2 + 3.225s + 2.525).

Equation (9) relates the high-pressure-spool speed and the low-pressure-spool speed to changes in the jet-pressure-nozzle area and the fuel-flow rate. The proposed neuro-genetic controller is applied to this linear multivariable plant with a sampling time of Ts = 0.1 s, using the complete difference-equation form of (9). The two-channel model reference (of third order) is chosen to be stable and linear, and is described by the difference equations (10) and (11), respectively [10]:

  y_m1(k+1) = 0.32*y_m1(k) + 0.64*y_m1(k-1) - 0.5*y_m1(k-2) + r1(k)    (10)

  y_m2(k+1) = 0.32*y_m2(k) + 0.64*y_m2(k-1) - 0.5*y_m2(k-2) + r2(k)    (11)
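Both channels of the model reference share the same coefficients, so a single third-order helper serves equations (10) and (11); a sketch:

def mref_step(ym, ym1, ym2, r):
    # Third-order model reference, equations (10) and (11).
    return 0.32 * ym + 0.64 * ym1 - 0.5 * ym2 + r

def simulate_channel(r_seq):
    # Roll one channel forward; the second channel is identical, driven by r2.
    ym = ym1 = ym2 = 0.0
    out = []
    for r in r_seq:
        ym, ym1, ym2 = mref_step(ym, ym1, ym2, r), ym, ym1
        out.append(ym)
    return out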

The simulation results are given in Fig.4(a, b, c, d, e, f, g, h), which shows the best hidden node selection against the generations, the outputs of the two model references and the plant, the two nonlinear adaptive control actions, the output errors for the two channels and the best MSE against the generations, respectively.

Fig-3. The MRAC of Example 2: (a) Optimal Hidden Node Selection. (b) Model-Reference and Plant Output with MSE. (c) Control Signal. (d) Output Error. (e) Best MSE. (f) Model-Reference and Plant Output with SAE. (g) Best SAE.

6. Discussion.

By studying the simulation results of the previous examples, the following points can be noticed:

It is obvious from the contents of Table 5.1 that the neuro-genetic controller could control different plants to follow the desired model with acceptable accuracy; it also shows the power of GAs in finding the optimal number of hidden nodes for this controller without reducing its performance.

The proposed method finds the optimal number of hidden nodes very quickly, typically in fewer than 50 generations; see Fig.2(a), Fig.3(a) and Fig.4(a).

Fig.2(d), Fig.3(d) and Fig.4((f) and (g)) represent the modeling error; the error between y_p(k) and y_m(k) is approximately zero.

The best MSE against the generations always goes to zero; this means that the output of the plant tracks the model reference output; see Fig.3(e) and Fig.4((e) and (h)).

The number of hidden nodes (n_h) does not change with a change of performance index. For example, Fig.3(a) shows the optimal number of hidden nodes for the neuro-controller with both the MSE and the SAE performance index, which is equal to 2 nodes.

The proposed method gives a deterministic, rather than heuristic, selection method for the structure of NNs.

Fig.4((b) and (c)) show the simulation results of the gas turbine plant under the control of the neuro-genetic controller. It is evident from these figures that the neuro-genetic controller gives good performance when handling strong loop interaction.

As shown in Fig.2(c), Fig.3(c) and Fig.4((d) and (e)), the generated neural network control action is sinusoidal, smooth and without sharp spikes.

Fig-4. The MRAC of Example 3: (a) Optimal Hidden Node Selection. (b) First Model-Reference and Plant Output. (c) Second Model-Reference and Plant Output. (d) First Control Signal. (e) Second Control Signal. (f) First Output Error. (g) Second Output Error. (h) Best MSE.

For the previous examples, Table 5.1 shows the optimal node selection by the proposed method for the neuro-genetic controller, the output of the controlled plant and the output of the model reference, the control signal, the modeling error, the best MSE against the generations and, finally, the reference input.

Table 5.1. Responses of the Different Examples.

Example   | Optimal number of hidden nodes after 3000 generations | Output of the plant / output of the model reference | Control signal     | Output error       | Final value of MSE after 3000 generations | Reference input
Example 1 |                                                        | Fig.2(b)                                            | Fig.2(c)           | Fig.2(d)           | 9.10E-07                                  | 0.2
Example 2 | 2                                                      | Fig.3(b)                                            | Fig.3(c)           | Fig.3(d)           | 4.10E-06                                  | 0.2
Example 3 | 3                                                      | Fig.4(b), Fig.4(c)                                  | Fig.4(d), Fig.4(e) | Fig.4(f), Fig.4(g) | 1.50E-06                                  | 0.4, 0.5

The reference input for all examples is described by:

  r(k) = sin(2πk/25) + sin(2πk/10)    (12)
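A small generator for this test signal is sketched below; the amplitude scaling to the values listed in Table 5.1 is an assumption, since the scaling is not stated explicitly alongside equation (12).

import math

def reference(k, amplitude=0.2):
    # Reference input, equation (12), scaled per the Table 5.1 column (assumed).
    return amplitude * (math.sin(2 * math.pi * k / 25)
                        + math.sin(2 * math.pi * k / 10))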

7. Conclusions.

The merits of linking the GA and NN approaches are obvious, confirmed through the comprehensive knowledge extraction, robustness and adaptive characteristics offered by the neuro-genetic system. The simulation results are used to evaluate these controllers. The neuro-genetic controller proved powerful both in the small number of generations needed to obtain the best parameters and architecture, and in achieving good performance when applied to the control of different plants, based on minimization of the error between the output of the plant and the output of the model reference. The neural network is a robust controller with the ability to handle effectively the external disturbance, tracking and decoupling of MIMO plants with strong loop interaction. The neural network acts effectively as a feedback controller using genetic learning; this controller stores the dynamic behavior of the selected plant under control and forces it to follow the model reference. Using another performance index to minimize the error between the output of the plant and the output of the model reference makes no big difference compared with the use of the MSE (see Fig.3(f)).

References

[1] Eberhart, R. C., and Shi, Y., Evolving Artificial Neural Networks, Proc. of ICNN&B'98 (Int. Conference on Neural Networks and Brain), pp. PL5-PL13, Beijing, China, 27-30 Oct. 1998.
[2] Yao, X., and Liu, Y., A New Evolutionary System for Evolving Artificial Neural Networks, IEEE Transactions on Neural Networks, Vol. 8, No. 3, May 1997.
[3] Mitchell, M., An Introduction to Genetic Algorithms, MIT Press, Cambridge, MA, 1998.
[4] Fujita, O., Statistical Estimation of the Number of Hidden Units for Feedforward Neural Networks, Neural Networks, Vol. 11, pp. 851-859, 1998.
[5] De-Shuang, H., An Analysis of Structure Properties for Image Segmentation, Proc. of ICNN&B'98 (Int. Conference on Neural Networks and Brain), pp. PL463-PL466, Beijing, China, 27-30 Oct. 1998.
[6] Dianzhi, Z., and Xiquen, X., Study on Structure Adaptation of Multilayer Feedforward Neural Networks, Proc. of ICNN&B'98 (Int. Conference on Neural Networks and Brain), pp. PL496-PL499, Beijing, China, 27-30 Oct. 1998.
[7] Peng, G., An Evolutionary Algorithm to Construct Causal Networks, Proc. of ICNN&B'98 (Int. Conference on Neural Networks and Brain), pp. PL259-PL261, Beijing, China, 27-30 Oct. 1998.
[8] Rzempoluck, E. J., Neural Network Data Analysis Using Simulnet, Springer-Verlag, New York, 1998.
[9] Pham, D. T., and Karaboga, D., Design of an Adaptive Fuzzy Logic Controller, Intelligent Systems Research Laboratory, University of Wales College of Cardiff, Cardiff CF2 1XH, United Kingdom, IEEE, 1994.
[10] Ahmed, M. S., and Tasadduq, I. A., Neural-Net Controller for Nonlinear Plants: Design Approach through Linearization, IEE Proc.-Control Theory Appl., Vol. 141, No. 5, 1994.


