
756 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998

A Genetic-Based Neuro-Fuzzy Approach for


Modeling and Control of Dynamical Systems
Wael A. Farag, Victor H. Quintana, and Germano Lambert-Torres

Abstract—Linguistic modeling of complex irregular systems constitutes the heart of many control and decision-making systems, and fuzzy logic represents one of the most effective algorithms to build such linguistic models. In this paper, a linguistic (qualitative) modeling approach is proposed. The approach combines the merits of fuzzy logic theory, neural networks, and genetic algorithms (GA's). The proposed model is presented in a fuzzy-neural network (FNN) form which can handle both quantitative (numerical) and qualitative (linguistic) knowledge. The learning algorithm of an FNN is composed of three phases. The first phase is used to find the initial membership functions of the fuzzy model. In the second phase, a new algorithm is developed and used to extract the linguistic-fuzzy rules. In the third phase, a multiresolutional dynamic genetic algorithm (MRD-GA) is proposed and used for optimized tuning of the membership functions of the proposed model. Two well-known benchmarks are used to evaluate the performance of the proposed modeling approach and compare it with other modeling approaches.

Index Terms—Dynamic control, fuzzy logic, genetic algorithms, modeling, neural networks.

I. INTRODUCTION

THE principle of incompatibility, formulated by Zadeh [1], explains the inadequacy of traditional quantitative techniques when used to describe complex systems. Zadeh has suggested a linguistic (qualitative) analysis for these systems in place of a quantitative analysis. Accordingly, linguistic modeling of complex systems has become one of the most important issues [2]–[5]. A linguistic model is a knowledge-based representation of a system; its rules and input–output variables are described in a linguistic form which can be easily understood and handled by a human operator; in other words, this kind of representation of information in linguistic models imitates the mechanism of approximate reasoning performed in the human mind.

The fuzzy set theory formulated by Zadeh [6] has been considered an appropriate representation method for linguistic terms and human concepts since Mamdani's pioneering work in fuzzy control [7]. This work has motivated many researchers to pursue their research in fuzzy modeling [2]–[5], [8]–[13]. Fuzzy modeling uses a natural description language to form a system model based on fuzzy logic with fuzzy predicates.

Manuscript received February 28, 1997; revised October 10, 1997.
W. A. Farag and V. H. Quintana are with the Electrical and Computer Engineering Department, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1.
G. Lambert-Torres is with Escola Federal de Engenharia de Itajuba, Itajuba, MG, Brazil.
Publisher Item Identifier S 1045-9227(98)06185-2.

The knowledge representation in fuzzy modeling can be viewed as having two classes. The first (class A), as suggested by Takagi and Sugeno in [11], can represent a general class of static or dynamic nonlinear systems. It is based on a "fuzzy partition" of the input space, and it can be viewed as an expansion of a piecewise linear partition represented as

    If x1 is A1^i and x2 is A2^i and ... and xn is An^i
    then y^i = c0^i + c1^i x1 + ... + cn^i xn    (1)

where i denotes the ith fuzzy rule, xj is the jth input, and y^i is the output of the ith fuzzy rule. A1^i, ..., An^i are fuzzy membership functions which can be bell-shaped, trapezoidal, or triangular, etc., and usually they are not associated with linguistic terms. From (1), it is noted that the Takagi and Sugeno approach approximates a nonlinear system with a combination of several linear systems by decomposing the whole input space into several partial fuzzy spaces and representing each output space with a linear equation. This type of knowledge representation does not allow the output variables to be described in linguistic terms, which is one of the drawbacks of this approach. Another drawback is that the parameter identification of this model is carried out iteratively using a nonlinear optimization method [10], [11]. The implementation of this method is not an easy task [8], [12], [13], because determining the optimal membership parameters involves a nonlinear programming problem.

The second class of knowledge representation (class B) in fuzzy models was developed by Mamdani [14] and used by Lin and Lee [15] and Sugeno and Yasukawa [5]. The knowledge is presented in these models as

    If x1 is A1^i and x2 is A2^i and ... and xn is An^i
    then y is B^i    (2)

where A1^i, ..., An^i and B^i are fuzzy membership functions which are bell-shaped, trapezoidal, or triangular, etc., and usually associated with linguistic terms. This approach has some advantages over the first approach. The consequent parts are presented by linguistic terms, which makes this model more intuitive and understandable and gives more insight into the model structure. Also, this modeling approach is easier to implement than the first approach [13]. This second form (class B) of knowledge representation will be adopted throughout this paper, as we are more concerned with the linguistic modeling approaches.
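To make the contrast between the two rule classes concrete, the following sketch evaluates a one-input system under both a class A (Takagi-Sugeno) rule base, cf. (1), and a class B (Mamdani-type) rule base, cf. (2). All membership functions, linguistic labels, and consequent parameters below are hypothetical values chosen for illustration, not values from this paper.

```python
# Illustrative comparison of class A and class B rule evaluation for a
# single input x. All parameter values below are hypothetical.
import math

def bell(x, m, sigma):
    """Bell-shaped membership function with mean m and width sigma."""
    return math.exp(-((x - m) ** 2) / sigma ** 2)

IN_TERMS = {"small": (0.0, 1.0), "large": (2.0, 1.0)}  # input fuzzy sets

def class_a(x):
    # Class A (Takagi-Sugeno), cf. (1): consequents are linear in x;
    # the output is the firing-strength-weighted average of the rules.
    rules = [("small", (0.5, 0.1)), ("large", (2.0, -0.3))]  # (term, (c0, c1))
    w = [bell(x, *IN_TERMS[t]) for t, _ in rules]            # firing strengths
    y = [c0 + c1 * x for _, (c0, c1) in rules]               # linear consequents
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def class_b(x):
    # Class B (Mamdani), cf. (2): consequents are linguistic terms; a
    # center-of-area style defuzzification over the output term centers
    # and widths produces the crisp output.
    out_terms = {"NS": (-1.0, 0.5), "PS": (1.0, 0.5)}        # label: (center, width)
    rules = [("small", "NS"), ("large", "PS")]
    w = [bell(x, *IN_TERMS[t]) for t, _ in rules]
    ms = [out_terms[lbl] for _, lbl in rules]
    num = sum(m * s * wi for (m, s), wi in zip(ms, w))
    den = sum(s * wi for (m, s), wi in zip(ms, w))
    return num / den

print(class_a(1.0), class_b(1.0))
```

With both rule bases firing equally at x = 1.0, the class A model returns the average of its two linear consequents, while the class B model returns a defuzzified value lying between its two linguistically labeled output term centers.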
1045–9227/98$10.00  1998 IEEE

Many studies regarding finding the rules and tuning the membership function parameters of fuzzy models have been reported [2]–[5], [8]–[13]. Neural networks are integrated with fuzzy logic in the form of fuzzy neural networks (FNN's) and used to build fuzzy models [15]–[22]. Many algorithms have been proposed to train these FNN's [15]–[22], and Jang et al. [19] have reviewed the fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control. Lin and Lee [15] have proposed a three-phase learning algorithm. In the first phase, they have used the self-organizing feature-map algorithm for coarse identification of the fuzzy model parameters. In the second phase, they have used a competitive learning technique to find the rules. And in the third phase, they have used the backpropagation algorithm for fine-tuning the parameters.

In this paper, we propose a new approach for building linguistic models for complex dynamical systems. The structure of the model is formed using a five-layer fuzzy neural network. The parameter identification of the fuzzy model is composed of three phases. The first phase uses Kohonen's self-organizing feature-map algorithm to find the initial parameters of the membership functions. A new algorithm is proposed, in the second phase, to find the linguistic rules. The third phase fine-tunes the membership function parameters using a new genetic algorithm (GA) called the multiresolutional dynamic genetic algorithm (MRD-GA). The method used in this work builds a linguistic model in a general framework known as the black-box approach in systems theory. That is, the model is built for a system without a priori knowledge about the system, provided that numerical input–output data are given.

This paper is organized as follows. In Section II, a brief overview of conventional genetic algorithms is given. Section III illustrates the structure of the neuro-fuzzy model. Section IV describes the hybrid learning algorithm. Sections V and VI present the simulation results of two benchmarks. Section VII concludes the work done in this paper.

II. OVERVIEW OF GENETIC ALGORITHMS

GA's are powerful search optimization algorithms based on the mechanics of natural selection and natural genetics. GA's can be characterized by the following features [23], [24]:
• a scheme for encoding solutions to the problem, referred to as chromosomes or strings;
• an evaluation function (referred to as a fitness function) that rates each chromosome relative to the others in the current set of chromosomes (referred to as a population);
• an initialization procedure for a population of chromosomes (strings);
• a set of operators which are used to manipulate the genetic composition of the population (such as recombination, mutation, crossover, etc.);
• a set of parameters that provide initial settings for the algorithm and operators, as well as the algorithm's termination conditions.

A candidate solution (in a GA) for a specific problem is called a chromosome and consists of a linear list of genes, where each gene can assume a finite number of values (alleles). A population consists of a finite number of chromosomes. The genetic algorithm evaluates a population and generates a new one iteratively, with each successive population referred to as a generation. Given an initial population P(0), the GA generates a new generation P(t) based on the previous generation P(t − 1) as follows [24]:

    t := 0              :Initialize population P(t) at time t = 0
    Evaluate P(t)
    While (not terminate-condition) do
    begin
        t := t + 1      :Increment generation
        Select P(t) from P(t − 1)
        Recombine P(t)  :apply genetic operators (crossover, mutation)
        Evaluate P(t)
    end
    end.

The GA uses three basic operators to manipulate the genetic composition of a population: reproduction, crossover, and mutation. Reproduction is a process by which the most highly rated chromosomes in the current generation are reproduced in the new generation. The crossover operator provides a mechanism for chromosomes to mix and match attributes through random processes. For example, if two chromosomes (parents) are selected at random and an arbitrary crossover site is selected (such as "3"), then the two resulting chromosomes (offspring) are formed by exchanging the parents' gene segments beyond the third site after the crossover operation takes place. Mutation is a random alteration of some gene values in a chromosome. Every gene in each chromosome is a candidate for mutation, and its selection is determined by the mutation probability.

GA's provide a means to optimize ill-defined, irregular problems. They can be tailored to the needs of different situations. Because of their robustness, GA's have been successfully applied to generate if–then rules and adjust membership functions of fuzzy systems [25]–[28].

The GA described above is a conventional GA, meaning one in which the parameters are kept constant while the optimization process is running (a static GA). In our approach, we introduce a new dynamic GA (MRD-GA) in which some of its parameters, as well as the problem configuration, change from one generation to the next (while the optimization process is running), as will be discussed later in Section IV-C.

III. THE NEURO-FUZZY (NF) MODEL TOPOLOGY

The NF model is built using the multilayer fuzzy neural network shown in Fig. 1. The system has a total of five layers, as proposed by Lin and Lee [15]. A model with two inputs and a single output is considered here for convenience. Accordingly, there are two nodes in layer 1 and one node in layer 5. Nodes in layer 1 are input nodes that directly transmit input signals to the next layer. Layer 5 is the output layer. Nodes in layers 2 and 4 are "term nodes" and they act as membership

Fig. 1. Topology of the neuro-fuzzy model.

functions to express the input–output fuzzy linguistic variables. A bell-shaped function is adopted to represent the membership function, in which the mean value m and the variance σ are adjusted through the learning process. The fuzzy sets of the first and the second input variables consist of n1 and n2 linguistic terms, respectively. The linguistic terms, such as positive large (PL), positive medium (PM), positive small (PS), zero (ZE), negative small (NS), negative medium (NM), and negative large (NL), are numbered in descending order in the term nodes. Hence, (n1 + n2) nodes and m nodes are included in layers 2 and 4, respectively, to indicate the input–output linguistic variables.

Each node of layer 3 is a "rule node" and represents a single fuzzy control rule. In total, there are (n1 × n2) nodes in layer 3 to form a fuzzy rule base for two linguistic input variables. The links of layers 3 and 4 define the preconditions and consequences of the rule nodes, respectively. For each rule node, there are two fixed links from the input term nodes. Layer 4 links, encircled in dotted line, are adjusted in response to varying control situations. By contrast, the links of layers 2 and 5 remain fixed between the input–output nodes and their corresponding term nodes.

The NF model can adjust the fuzzy rules and their membership functions by modifying the layer 4 links and the parameters that represent the bell-shaped membership functions for each node in layers 2 and 4. For convenience, we use the following notation to describe the functions of the nodes in each of the five layers:

    u_i^(k)             the net input value to the ith node in layer k;
    o_i^(k)             the output value of the ith node in layer k;
    m_i^(k), σ_i^(k)    the mean and variance of the bell-shaped function of the ith node in layer k;
    w_ij                the link that connects the output of the ith node in layer 3 with the input to the jth node in layer 4.

Layer 1: The nodes of this layer directly transmit input signals to the next layer. That is

    o_i^(1) = u_i^(1).    (3)

Layer 2: The nodes of this layer act as membership functions to express the linguistic terms of the input variables. For a bell-shaped function, they are

    f_i^(2) = −(u_i^(2) − m_i^(2))² / (σ_i^(2))²,    for i = 1, ..., n1 + n2    (4)
    o_i^(2) = exp(f_i^(2)),    for i = 1, ..., n1 + n2.    (5)

Note that layer 2 links are all set to unity.

Layer 3: The links in this layer are used to perform precondition matching of fuzzy rules. Thus, each node has two input values from layer 2. The correlation-minimum inference procedure is utilized here to determine the firing strength of each rule. The output of the nodes in this layer is determined by the fuzzy AND operation. Hence, the functions of the layer are as follows:

    f_i^(3) = min(inputs of rule node i from layer 2),    for i = 1, ..., n1 × n2    (6)
    o_i^(3) = f_i^(3),    for i = 1, ..., n1 × n2.    (7)

The link weights in this layer are also set to unity.

Layer 4: Each node of this layer performs the fuzzy OR operation to integrate the fired rules leading to the same output linguistic variable. The functions of the layer are expressed as follows:

    f_j^(4) = max over i of (w_ij o_i^(3))    (8)
    o_j^(4) = f_j^(4),    for j = 1, ..., m.    (9)

The link weight w_ij in this layer expresses the association of the ith rule with the jth output linguistic variable. It can take only two values: either one or zero.

Layer 5: The node in this layer computes the output signal of the NF model. The output node together with the layer 5 links acts as a defuzzifier. The center-of-area defuzzification scheme, used in this model, can be simulated by

    f^(5) = Σ_j (m_j^(4) σ_j^(4)) u_j^(5)    (10)
    o^(5) = f^(5) / Σ_j (σ_j^(4) u_j^(5)).    (11)

Hence, the jth link weight in this layer is m_j^(4) σ_j^(4).

IV. THE PROPOSED HYBRID LEARNING ALGORITHM

In this section we present a three-phase learning scheme for the above NF model. The proposed scheme is an extension of the ideas presented in [15] and [21]. It is quite convenient to divide the task of constructing the fuzzy model into the following subtasks: locating the initial membership functions, finding the if–then rules, and optimally tuning the membership functions. In phases one and two of the proposed scheme, unsupervised learning algorithms are used to perform the

first and the second subtasks. In phase three, a supervised learning scheme is used to perform the third subtask. To initiate the learning scheme, the training data and the desired or selected coarseness of the fuzzy partition (i.e., the size of the term set of each input–output linguistic variable) must be provided from the outside world. For more details about the structure identification of fuzzy models, refer to Sugeno et al. [29].

A. Learning-Phase One

The problem for the self-organized learning can be stated as: "Given the training input data, the desired output values, the fuzzy partitions, and the desired shapes of the membership functions, we want to locate the membership functions." In this phase, the network works in a two-sided manner; that is, the nodes and the links at layer 5 are in the up–down transmission mode (follow the dotted lines in Fig. 1), so that the training input and output data are fed into this network from both sides.

The centers (or means) and the widths (or variances) of the membership functions are determined by a self-organized learning technique that is analogous to statistical clustering. This serves to allocate network resources efficiently by placing the domains of the membership functions over only those regions of the input–output space where data are present.

Kohonen's self-organized feature-map (SOM) algorithm is adapted to find the centers of the membership functions [15]:

    ||x(t) − m_c(t)|| = min over i of ||x(t) − m_i(t)||    (12)
    m_c(t + 1) = m_c(t) + α(t)[x(t) − m_c(t)]    (13)
    m_i(t + 1) = m_i(t),    for m_i ≠ m_c    (14)

where α(t) is a monotonically decreasing scalar learning rate and m_c is the center closest to the current training sample x(t). This adaptive formulation runs independently for each input and output linguistic variable. The determination of which of the m_i's is m_c can be accomplished in constant time via a winner-take-all circuit [15].

Once the centers of the membership functions are found, their widths can be determined by the N-nearest-neighbors heuristic, by minimizing the following objective function with respect to the widths σ_i:

    (15)

where r is an overlap parameter that usually ranges from 1.0 to 2.0. Since our third learning phase will optimally adjust the centers and the widths of the membership functions, the widths can be simply determined by the first-nearest-neighbor heuristic at this stage as

    σ_i = |m_i − m_nearest| / r    (16)

where m_nearest is the center closest to m_i.

B. Learning-Phase Two

After the parameters of the membership functions have been found, the training signals from both external sides can reach the outputs of the term nodes at layer two and layer four. Furthermore, the outputs of the term nodes at layer two can be transmitted to the rule nodes through the initial architecture of the layer-three links. Thus we can get the firing strength of each rule node. Based on these rule-firing strengths (denoted as o_i^(3)'s) and the outputs of the term nodes at layer four (denoted as o_j^(4)'s), we want to decide the correct consequence link for each rule node (from the connected layer-four links) to find the rules. A new algorithm is proposed here to perform this task. We refer to this algorithm as the maximum matching-factor algorithm (MMFA). The MMFA is described as follows.

Step 1: For each layer-three rule node we construct m matching factors, one per output linguistic term. Each matching factor is denoted as MF_ij, where the subscript i is the rule-node index (i = 1, ..., n1 × n2), and the subscript j is the output-linguistic-variable index (output-term-node index) (j = 1, ..., m).

Step 2: MF_ij is calculated according to the following pseudocode:

    For t = 1 To T    :(T = no. of available training examples)
        For i = 1 To n1 × n2
            For j = 1 To m
                If o_j^(4) is the maximum element in the set {o_1^(4), ..., o_m^(4)}
                    then increase MF_ij by the rule's firing strength o_i^(3);
                Otherwise, leave MF_ij unchanged.
            end
        end
    end.

Step 3: After calculating all the MF_ij's using the previous code over all the available training patterns, the rule consequences can be determined from these factors according to the following pseudocode:

    For i = 1 To n1 × n2
        Find the maximum matching factor MF_ij* from the set {MF_i1, ..., MF_im}
        Find the corresponding term-node index j* of MF_ij*
        Delete all the layer-four links of the ith rule node except the one
            connecting it with the term node of index j*
    end

Step 4: From the above algorithm, only one term in the output linguistic variable's term set can become the consequence of a certain fuzzy-logic rule. If all the matching factors of a rule node are very small (meaning that this rule has little or no effect on the output), then all the corresponding links are deleted, and this rule node is eliminated.
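The two unsupervised phases can be sketched end to end on a toy configuration. In the following sketch, the sizes (three terms per input, three output terms), the Gaussian term nodes with min as the fuzzy AND, the synthetic target y = (x1 + x2)/2, and the exact matching-factor update (a rule's firing strength is accumulated under whichever output term gives the strongest layer-four response for each training example) are all illustrative assumptions paraphrasing the description above, not this paper's implementation.

```python
# Toy end-to-end sketch of learning-phase two. Assumptions (not the
# paper's settings): 3 terms per input, 3 output terms, Gaussian term
# nodes with min as the fuzzy AND, a synthetic target y = (x1 + x2)/2,
# and a matching-factor update that accumulates a rule's firing
# strength under the winning output term.
import math
import random
random.seed(2)

def bell(x, m, s):
    return math.exp(-((x - m) ** 2) / s ** 2)

in_terms = [(-1.0, 0.6), (0.0, 0.6), (1.0, 0.6)]   # (center, width) per term
out_terms = [(-1.0, 0.6), (0.0, 0.6), (1.0, 0.6)]
N_RULES, N_OUT = 9, 3                              # 3 x 3 rules, 3 output terms

def firing_strengths(x1, x2):
    """Layers 1-3: Gaussian term nodes, then fuzzy AND (min) per rule node."""
    a1 = [bell(x1, m, s) for m, s in in_terms]
    a2 = [bell(x2, m, s) for m, s in in_terms]
    return [min(a1[i], a2[j]) for i in range(3) for j in range(3)]

def mmfa(samples, threshold=0.01):
    """samples: (x1, x2, desired y). Returns one consequent output-term
    index per rule node, or None when the rule is pruned (Step 4)."""
    mf = [[0.0] * N_OUT for _ in range(N_RULES)]           # Step 1
    for x1, x2, y in samples:                              # Step 2
        fire = firing_strengths(x1, x2)
        o4 = [bell(y, m, s) for m, s in out_terms]         # desired y fed top-down
        jmax = max(range(N_OUT), key=lambda j: o4[j])      # winning output term
        for i in range(N_RULES):
            mf[i][jmax] += fire[i]
    top = max(max(row) for row in mf)
    consequents = []
    for row in mf:                                         # Steps 3 and 4
        best = max(range(N_OUT), key=lambda j: row[j])
        consequents.append(best if row[best] >= threshold * top else None)
    return consequents

samples = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
samples = [(x1, x2, (x1 + x2) / 2) for x1, x2 in samples]
rules = mmfa(samples)
print(rules)
```

For this synthetic target, the center rule (both inputs near zero) is mapped to the middle output term, and rules whose matching factors stay below 1% of the largest one are pruned, mirroring the deletion criterion applied in Section V.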

C. Learning-Phase Three

After the fuzzy rules are found, the whole network structure is established, and the third learning phase is started in order to optimally adjust the parameters of the membership functions. Optimization, in the most general form, involves finding the optimum solution from a family of reasonable solutions according to an optimization criterion. For all but a few trivial problems, finding the global optimum can never be guaranteed [30, ch. 14]. Hence, optimization in the last three decades has focused on methods that achieve the best solution per unit computational cost.

The problem for this supervised learning phase can be stated as: "Given the training input data, the desired output values, the fuzzy partitions, the desired shapes of the membership functions, and the fuzzy rules, adjust the parameters of the membership functions optimally." In this phase, the network works in the feedforward manner; that is, the nodes and the links at layer four are in the down–up transmission mode.

A new approach is proposed here that uses GA's for the optimization of the membership function parameters. Problem-specific knowledge is used to tailor a GA to the needs of this learning phase. The main attribute of the proposed approach is that the fuzzy-model configuration is dynamically adapted while the optimization process is running. Accordingly, the MRD-GA changes its search space with the change of the problem configuration and with the advance of generations. The GA search space monotonically gets narrower and narrower, while the model parameters get closer and closer to the optimal values.

The GA is coded using the well-structured language C++. The program allows the user to define the values for the population size (pop_size), the maximum number of generations (max_gen), the probability of crossover (pcross), and the probability of mutation (pmut). In order to select the individuals for the next generation, the tournament selection method is used. In this method, two members of the population are selected at random and their fitness values compared. The member with the higher fitness advances to the next generation. An advantage of this method is that it needs less computational effort than other methods. Also, it does not need a scaling process (like the roulette-wheel selection). However, the particulars of the reproduction scheme are not critical to the performance of the GA; virtually any reproduction scheme that biases the population toward the fitter strings works well [25].

The MRD-GA uses decimal-integer strings to encode the model parameters. The decimal strings are considered a more suitable representation method than binary strings, since this representation allows the use of more compact strings. The number of alleles (the individual locations which make up the string) is determined from the total number of fuzzy sets used to partition the spaces of the input–output variables. For the model configuration shown in Fig. 1, we have (n1 + n2 + m) membership functions. Each bell-shaped membership function is defined by two parameters (the center m and the width σ). To optimize the membership functions, we have to optimize 2(n1 + n2 + m) parameters. Thus, the GA uses strings of length 2(n1 + n2 + m) alleles. Each allele is allowed to take any value in the set [0, 1, ..., 9]. To convert an allele value to a new center or width of a certain membership function, we use the following procedure.

Step 1: The initial values of the centers and widths of the fuzzy controller are entered into the GA program.

Step 2: The new centers and widths are calculated from the allele values as

    m_new = m_old + (a_i − 5) Δm    (17)
    σ_new = σ_old + (a_i − 5) Δσ    (18)

where m_new and σ_new are the new center and width values, respectively, a_i is the value of the ith allele in the string, and Δm and Δσ are the offsets of the centers and widths, respectively. It is recommended to set these offsets to very small values (around 0.001). This allows a more stable convergence of the MRD-GA.

Step 3: If the allele value of any center or width equals five, then no change occurs. If a_i is greater than five, then a positive change occurs (the center or width increases). If it is less than five, then a negative change occurs (the center or width decreases).

The MRD-GA uses the mean squared error (MSE) (the error is the difference between the actual output and the output estimated by the fuzzy model) to form its fitness function. Simply, for each chromosome, (1/MSE) is considered its fitness measure. The MSE is calculated from N data points as

    MSE = (1/N) Σ_{k=1}^{N} (y_k − ŷ_k)²    (19)

where y_k is the actual value and ŷ_k is the estimated value.

Every G generations, the offset values (Δm and Δσ) decrease according to the following decaying functions:

    Δm_new = Δm_old / K_m    (20)
    Δσ_new = Δσ_old / K_σ    (21)

where K_m and K_σ are the modifying factors for the centers and widths, respectively. The decaying functions can take any decaying shape such as, for example, an exponential decay. The usual GA terminating condition is a maximum allowable number of generations or a certain value of MSE required to be reached. In this GA algorithm, the stopping criterion is the execution of a certain number of generations without any improvement in the best fitness value. With this criterion, one does not need to specify a required MSE value (which is usually unknown in advance) or a required number of generations (for which there is no guarantee that it will produce an appropriate solution).

The MRD-GA pseudocode is shown at the bottom of the next page.

This GA offers exciting advantages over the conventional GA [31]–[33]. It allows a dynamic increase in the resolution of the search space (by decreasing Δm and Δσ) as the model parameters approach their optimal values. It also changes the nature of the model-identification problem from a static type to a dynamic type (by adapting Δm and Δσ continuously), which decreases the chances of premature convergence of the GA. The MRD-GA also has advantages over the backpropagation (BP)

algorithm [15], [21]. The MRD-GA allows one to obtain intermediate solutions, which the BP usually cannot offer; also, the GA does not suffer from convergence problems to the same degree that the BP does (i.e., the MRD-GA is more robust).

V. THE FIRST NUMERICAL EXAMPLE

The proposed modeling algorithm is examined using the well-known example of system identification given by Box and Jenkins [34]. The process is a gas furnace with a single input u and a single output y: the gas flow rate and the CO2 concentration, respectively. The data set consists of 296 pairs of input–output measurements. The sampling interval is 9 s.

The inputs to the fuzzy model are selected to be delayed samples of u and y, respectively [35]. The input variable u is modified by subtracting ū, the average of all the u's. Each of the input variables is intuitively partitioned into seven linguistic sets (n1 = n2 = 7). The output of the fuzzy model is an estimate of the actual process output y. The model output is partitioned into nine linguistic sets (m = 9).

The gas furnace is modeled using the fuzzy-neural network shown in Fig. 1. The SOM algorithm (Section IV-A) is used to determine the initial centers and widths of the 23 membership functions of the input–output variables. The three scaling factors of this model are determined as "Gu = 0.658," "Gy = 0.227," and "Go = 7.909." The resultant membership functions after finishing this learning phase are shown in Fig. 2.

The MMFA algorithm (Section IV-B) is used to find the linguistic fuzzy rules of the gas furnace model. Out of the 49 rules of the fuzzy model, only 37 rules are retained. The rest are deleted because they have very small matching factors (less than 1% of the highest matching factor of all the rules). The 37 rules are shown in Table I.

TABLE I
THE COMPLETE FUZZY ASSOCIATIVE MEMORY MATRIX WITH THE RULES

The MRD-GA (Section IV-C) is then applied to optimize the parameters of the gas furnace model. The algorithm parameters are set as follows: pcross = 0.9, pmut = 0.1, and chromosome-length = 46. To study the pop_size effect on the search performance of the proposed GA, the GA is applied four times with different population sizes (pop_size = 80, 50, 30, and 12). After finishing the second learning phase and before applying the GA, the model has an MSE value of 0.937. This MSE value is decreased to 0.111 after 4972 generations of the MRD-GA using pop_size = 50 (note that the MSE value reached 0.15 after only 900 generations). The computation time elapsed to perform the whole learning scheme is roughly determined as shown in Table II.

TABLE II
THE COMPUTATION TIME OF EXAMPLE 1

The resultant membership functions and the model output are shown in Figs. 3 and 4, respectively. The MSE decrease rates using different population sizes are also shown in Fig. 5.

In Table III, our fuzzy model is compared with other models identified from the same data. It can be seen that our model outperforms all the other models in its class (class B). In

    t := 0              :Initialize population P(t) at time t = 0.
    best_fit := 0       :The best fitness value.
    Evaluate P(t).
    Search for the best fitness of P(t) and assign it to best_fit.
    While (not terminate-condition) do
    Begin
        t := t + 1      :Increment generation.
        If (t is a multiple of G) then modify (decrease) Δm and Δσ as given in (20) and (21).
        Select P(t) from P(t − 1) using the tournament selection criteria.
        Recombine P(t)  :apply genetic operators (crossover, mutation).
        Evaluate P(t).
        Search for the best fitness of P(t) and compare it with best_fit; if larger, then do
        Begin
            Assign best_fit the best fitness value of P(t).
            Adapt the centers m and the widths σ according to the state of the
                chromosome having the best fitness, using (17) and (18).
        End
    End
    End.
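The loop above can be condensed into a runnable sketch. The population size, the decay schedule, the toy quadratic fitness target, and the symbol G for the decay period are illustrative assumptions of this sketch; only the decimal 0–9 alleles, the (a − 5)-offset decoding of (17) and (18), the size-two tournament selection, and the shrinking offsets follow the text.

```python
# Compact sketch of the MRD-GA loop. Chromosomes are decimal-integer
# strings; allele a shifts a parameter by (a - 5) * delta, so 5 means
# "no change" (cf. (17), (18)), and the offset delta shrinks every G
# generations to raise the search resolution around the incumbent.
# The fitness target below is a stand-in problem, not the fuzzy model.
import random
random.seed(0)

def mse(params):
    # Stand-in fitness problem: fit two parameters to known optima (0.3, -0.7).
    return (params[0] - 0.3) ** 2 + (params[1] + 0.7) ** 2

def mrd_ga(n_params=2, pop_size=20, gens=300, G=25, pmut=0.1):
    base = [0.0] * n_params          # incumbent centers/widths
    delta = 0.05                     # offset (decays over time)
    decode = lambda ch: [b + (a - 5) * delta for b, a in zip(base, ch)]
    pop = [[random.randint(0, 9) for _ in range(n_params)]
           for _ in range(pop_size)]
    best, best_err = list(base), mse(base)
    for g in range(1, gens + 1):
        if g % G == 0:
            delta /= 2.0             # multiresolution: narrow the search
        nxt = []
        for _ in range(pop_size):    # tournament selection of size two
            a, b = random.sample(pop, 2)
            winner = a if mse(decode(a)) < mse(decode(b)) else b
            child = [random.randint(0, 9) if random.random() < pmut else v
                     for v in winner]
            nxt.append(child)
        pop = nxt
        for ch in pop:               # track the best decoded solution
            err = mse(decode(ch))
            if err < best_err:
                best, best_err = decode(ch), err
        base = best                  # re-center the search on the incumbent
    return best, best_err

params, err = mrd_ga()
print(params, err)
```

Because the incumbent is adopted as the new decoding origin whenever it improves, the coarse early epochs move the parameters quickly, while the shrinking offsets in later epochs refine them, which is the multiresolution behavior described above.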

Fig. 2. The normalized membership functions after SOM.

Fig. 3. The optimized membership functions after MRD-GA.

comparison with class A models, Sugeno's model [10] has a lower MSE value using six inputs but, at the same time, has a much higher MSE value using the same two inputs used by our model. Also, this model is quite difficult to build [8], [12], [13]. The most difficult aspect lies in the identification of the premise structure, mainly the membership functions of the input variables. For each membership function, at least two or three parameters have to be calculated through a nonlinear programming procedure. The choice and computation of these membership functions are rather tricky and subjective, so it is possible for different designers to sometimes get completely different results.

Wang's model (class A) [8] has comparable results and fewer rules; however, the number of rules does

Fig. 4. Output of the gas furnace fuzzy model.

Fig. 5. The MRD-GA convergence rates with different population sizes.

not necessarily give a reliable indication of the number of unknown parameters of the model. For example, in Wang's model [8], five class A rules are used with two inputs; the number of unknown parameters in this case (in both the premise and consequent parts) is 35. In our model, 37 class B rules are used with two inputs and 46 unknown parameters. Bearing in mind that our model shows about a 30% reduction in the MSE value and provides a linguistic description of the gas furnace system, these two advantages, in our view, compensate for the difference in the number of parameters (46 versus 35).

VI. THE SECOND NUMERICAL EXAMPLE

This example is taken from Narendra et al. [36], in which the plant to be identified is given by the second-order highly nonlinear difference equation

    y(k + 1) = [y(k) y(k − 1) (y(k) + 2.5)] / [1 + y²(k) + y²(k − 1)] + u(k).    (22)

Training data of 500 points are generated from the plant model, assuming a random input signal u(k) uniformly distributed in the interval [−2, 2]. This data set is used to build a linguistic-fuzzy model for this plant.

The plant is modeled using the FNN described in Section III. The model has three inputs, y(k), y(k − 1), and u(k), and a single output, y(k + 1). The inputs y(k) and y(k − 1) are intuitively partitioned into five fuzzy linguistic spaces {NL, NS, ZE, PS, PL}, the input u(k) is partitioned into three fuzzy spaces {N, Z, P}, and the output is partitioned into 11 fuzzy spaces {NVL, NL, NM, NS, NVS, ZE, PVS, PS, PM, PL, PVL}.
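Assuming the plant form given in (22) above (the second-order benchmark plant associated with [36]; the equation as reconstructed here, the zero initial conditions, and the random seed are assumptions of this sketch), the 500-point training set can be generated as follows.

```python
# Generating the 500 training points as described: random inputs u(k)
# uniform on [-2, 2] driven through the plant of (22). The equation,
# the zero initial state, and the seed are assumptions of this sketch.
import random
random.seed(1)

def plant(y0, y1, u):
    """y(k+1) from y(k) = y0, y(k-1) = y1, and input u(k), per (22)."""
    return y0 * y1 * (y0 + 2.5) / (1.0 + y0 ** 2 + y1 ** 2) + u

def make_training_data(n=500):
    y_prev, y = 0.0, 0.0
    data = []                       # rows: (y(k), y(k-1), u(k)) -> y(k+1)
    for _ in range(n):
        u = random.uniform(-2.0, 2.0)
        y_next = plant(y, y_prev, u)
        data.append(((y, y_prev, u), y_next))
        y_prev, y = y, y_next
    return data

data = make_training_data()
print(len(data), data[0])
```

Each row pairs the three fuzzy-model inputs y(k), y(k − 1), and u(k) with the desired output y(k + 1), which is the format the three learning phases of Section IV consume.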
764 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998

TABLE III
COMPARISON OF OUR MODEL WITH OTHER MODELS

The SOM algorithm described in Section IV-A is used to determine the initial centers and widths of the 24 membership functions of the input–output variables of the fuzzy model. The four scaling factors of this fuzzy model are determined from this learning phase as Gu = 0.7476, Gy = 0.4727, Gyy = 0.6261, and Go = 5.5781.

According to the structure of this fuzzy-neural network, the number of rules (rule nodes in the third layer) is 5 × 5 × 3 = 75. The MMFA described in Section IV-B is used to find the 75 rules of this fuzzy model, and the results are shown in Table IV.

The MRD-GA (Section IV-C) is applied to optimize the parameters of the dynamic-system model. The algorithm parameters are set as follows: pop size = , pmut = 0.05, chromosome-length = , and the remaining MRD-GA parameters. After finishing the second learning phase and before applying the MRD-GA, the model has an MSE value of 0.2058. This MSE value is decreased to 0.0374 after 3517 generations using a single-point crossover with a pcross value of 0.9 (note that the MSE value reached 0.06 after only 470 generations). The computation time used to perform this learning process is given in Table V. The MSE decay rates using different crossover probabilities are shown in Fig. 6.

After the learning process is finished, the model is tested by applying a sinusoidal input signal u(k) to the fuzzy model. The outputs of both the fuzzy model and the actual model are shown in Fig. 7. The fuzzy model matches the actual model well, with an MSE of 0.0403. Another test is carried out using a second input signal; the result is shown in Fig. 8, and the MSE in this case is 0.0369. After extensive testing and simulations, the fuzzy model has demonstrated good performance in forecasting the output of this complex dynamic plant. Note that in this example only 500 data points are used to build the model, whereas in [36], 100 000 data points were used to identify a neural-network model. It can be expected that the performance of the identified fuzzy model may be further improved if the number of data points used to build the model is increased.

In order to compare our modeling approach with the approaches of Sugeno [10], [11] and Wang [8], both approaches are implemented. Sugeno's approach is implemented using the MATLAB fuzzy-logic toolbox. This approach [37] applies the least-squares algorithm (LSA) and the backpropagation gradient-descent method to identify the linear (consequent) and nonlinear (premise) parameters of the class A fuzzy rules, respectively. The core function of
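The MRD-GA itself is defined in Section IV-C; as a simplified illustration of GA-based parameter tuning (plain single-point crossover with pcross = 0.9 and pmut = 0.05 as quoted above, without the multiresolutional dynamics), a minimal sketch might look like the following. The population size, parameter bounds, tournament selection, and toy fitness surface are illustrative assumptions, not the paper's settings.

```python
import random

def genetic_tune(fitness, n_params, pop_size=30, pcross=0.9, pmut=0.05,
                 generations=300, bounds=(-2.0, 2.0), seed=0):
    """Minimal GA sketch: tournament selection, single-point crossover,
    uniform-reset mutation.  `fitness` maps a parameter vector (e.g.,
    membership-function centers/widths) to an error such as the MSE."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_params)]
           for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]                      # elitism: keep the best
        while len(new_pop) < pop_size:
            p1 = min(rng.sample(pop, 3), key=fitness)   # tournament
            p2 = min(rng.sample(pop, 3), key=fitness)
            c1, c2 = p1[:], p2[:]
            if rng.random() < pcross:            # single-point crossover
                cut = rng.randrange(1, n_params)
                c1[cut:], c2[cut:] = p2[cut:], p1[cut:]
            for c in (c1, c2):                   # uniform-reset mutation
                for i in range(n_params):
                    if rng.random() < pmut:
                        c[i] = rng.uniform(lo, hi)
            new_pop.extend([c1, c2])
        pop = new_pop[:pop_size]
        best = min(pop, key=fitness)
    return best

# Toy usage: recover the minimizer of a quadratic "MSE" surface.
target = [0.5, -1.0, 1.5]
mse = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
solution = genetic_tune(mse, n_params=3)
```

In the paper's setting, the fitness function would wrap a full forward pass of the FNN over the 500 training pairs; the toy quadratic above only stands in for that evaluation.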

Fig. 6. The MRD-GA convergence rates with different crossover rates.

Fig. 7. Testing of the fuzzy model versus the actual model.

this algorithm is implemented using optimized-for-speed C code. Wang's approach is also implemented in the C programming language. This approach uses the fuzzy C-means (FCM) clustering algorithm [38] to find the premise parameters of the class A fuzzy rules, and then applies the least-squares algorithm to find the consequent linear parameters of the rules.

Table VI compares our modeling approach with both Sugeno's and Wang's approaches. The models are learned from the previously generated 500 data pairs and tested by applying a sinusoidal input signal u(k). All the experiments are carried out on a Pentium 166-MHz PC. The comparison shows the advantages of our modeling approach.
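As a rough sketch of the FCM step used in Wang's approach [38], the following minimal one-dimensional implementation alternates the standard membership and center updates. The fuzzifier m = 2, the deterministic evenly spaced initialization, and the toy data are assumptions for illustration only.

```python
def fcm(points, n_clusters, m=2.0, iters=50):
    """Minimal fuzzy C-means sketch (Bezdek [38]) for 1-D data.
    The resulting centers (and the fuzzy memberships) are what seed
    the premise parameters of the class A rules in Wang's approach."""
    lo, hi = min(points), max(points)
    # deterministic, evenly spaced initial centers (assumption)
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters
               for i in range(n_clusters)]
    exp = 2.0 / (m - 1.0)
    for _ in range(iters):
        # membership update: u[i][j] = 1 / sum_k (d_ij / d_kj)^(2/(m-1))
        u = [[0.0] * len(points) for _ in range(n_clusters)]
        for j, x in enumerate(points):
            d = [abs(x - c) + 1e-12 for c in centers]
            for i in range(n_clusters):
                u[i][j] = 1.0 / sum((d[i] / dk) ** exp for dk in d)
        # center update: membership-weighted mean of the data
        for i in range(n_clusters):
            w = [u[i][j] ** m for j in range(len(points))]
            centers[i] = sum(wj * x for wj, x in zip(w, points)) / sum(w)
    return sorted(centers)

# Toy usage: two well-separated groups around 0 and 10.
pts = [0.1, -0.2, 0.05, 9.9, 10.2, 10.05]
c_lo, c_hi = fcm(pts, n_clusters=2)
```

The consequent linear parameters would then be fitted by least squares with these premise memberships held fixed, as the text describes.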

Fig. 8. Testing of the fuzzy model versus the actual model.

TABLE IV
THE COMPLETE FAM MATRICES WITH THE FUZZY RULES

TABLE V
THE COMPUTATION TIME OF EXAMPLE 2

TABLE VI
A MODELING COMPARISON USING EXAMPLE 2
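The MSE figures behind this comparison come from driving the identified model and the actual plant with the same test input and averaging the squared output error. A minimal evaluation loop might look as follows; the test signal u(k) = sin(2πk/25), the 250-step horizon, and the zero initial states are illustrative assumptions rather than the paper's exact settings.

```python
import math

def evaluate_mse(model_step, plant_step, n_steps=250,
                 u=lambda k: math.sin(2 * math.pi * k / 25)):
    """Free-run both the identified model and the actual plant on the
    same input sequence and return the mean-squared output error.
    Both arguments map (y_k, y_km1, u_k) -> y_kp1."""
    ym_k = ym_km1 = 0.0   # model state (assumed zero initial conditions)
    yp_k = yp_km1 = 0.0   # plant state
    sq_err = 0.0
    for k in range(n_steps):
        u_k = u(k)
        ym_kp1 = model_step(ym_k, ym_km1, u_k)
        yp_kp1 = plant_step(yp_k, yp_km1, u_k)
        sq_err += (ym_kp1 - yp_kp1) ** 2
        ym_km1, ym_k = ym_k, ym_kp1
        yp_km1, yp_k = yp_k, yp_kp1
    return sq_err / n_steps

# Sanity check: a perfect model scores zero against the plant it copies
# (plant form assumed from [36]).
plant = lambda y, ym1, u_k: y * ym1 * (y + 2.5) / (1 + y**2 + ym1**2) + u_k
perfect_err = evaluate_mse(plant, plant)
biased = lambda y, ym1, u_k: plant(y, ym1, u_k) + 0.1
biased_err = evaluate_mse(biased, plant)
```

Swapping in the learned FNN for `model_step` would reproduce the kind of MSE numbers reported in Tables III and VI.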

VII. CONCLUSION

This paper presents a neuro-fuzzy approach for linguistic modeling of complex dynamical systems, in which a proposed genetic algorithm plays a central role. One of the advantages of the presented method is that it divides the learning algorithm into three phases. The initial membership functions are found in the first phase using Kohonen's self-organizing feature maps. In the second phase, a new algorithm is proposed to extract and optimize the fuzzy rules for the neuro-fuzzy model. In the third phase, the membership functions are fine-tuned using the proposed MRD-GA. The performance of the neuro-fuzzy approach is tested using two benchmarks and compared with other models. The approach shows good performance in building accurate linguistic fuzzy models.

ACKNOWLEDGMENT

The authors wish to thank the anonymous reviewers for their valuable and meticulous comments. The guidance and help of Dr. A. Y. Tawfik at Wilfrid Laurier University are also gratefully acknowledged.

REFERENCES

[1] L. Zadeh, "Outline of a new approach to analysis of complex systems and decision processes," IEEE Trans. Syst., Man, Cybern., vol. SMC-3, pp. 28–44, 1973.
[2] R. M. Tong, "The construction and evaluation of fuzzy models," in Advances in Fuzzy Set Theory and Applications, M. M. Gupta, R. K. Ragade, and R. R. Yager, Eds. Amsterdam, The Netherlands: North-Holland, 1979, pp. 559–576.
[3] W. Pedrycz, "An identification algorithm in fuzzy relational systems," Fuzzy Sets Syst., vol. 13, pp. 153–167, 1984.
[4] C. W. Xu and Y. Z. Lu, "Fuzzy model identification and self-learning for dynamic systems," IEEE Trans. Syst., Man, Cybern., vol. 17, pp. 683–689, 1987.
[5] M. Sugeno and T. Yasukawa, "A fuzzy-logic-based approach to qualitative modeling," IEEE Trans. Fuzzy Syst., vol. 1, pp. 7–31, Feb. 1993.
[6] L. A. Zadeh, "Fuzzy sets," Inform. Contr., vol. 8, pp. 338–353, 1965.
[7] E. H. Mamdani and S. Assilian, "An experiment in linguistic synthesis with a fuzzy logic controller," Int. J. Man–Machine Studies, vol. 7, no. 1, pp. 1–13, 1975.
[8] L. Wang and R. Langari, "Complex systems modeling via fuzzy logic," IEEE Trans. Syst., Man, Cybern., vol. 26, pp. 100–106, Feb. 1996.
[9] W. Pedrycz and J. V. de Oliveira, "Optimization of fuzzy models," IEEE Trans. Syst., Man, Cybern., vol. 26, no. 4, pp. 627–636, Aug. 1996.
[10] M. Sugeno and K. Tanaka, "Successive identification of a fuzzy model and its application to prediction of a complex system," Fuzzy Sets Syst., vol. 42, pp. 315–334, 1991.
[11] T. Takagi and M. Sugeno, "Fuzzy identification of systems and its applications to modeling and control," IEEE Trans. Syst., Man, Cybern., vol. SMC-15, pp. 116–132, 1985.
[12] L. Wang and R. Langari, "Building Sugeno-type models using fuzzy discretization and orthogonal parameter estimation techniques," IEEE Trans. Fuzzy Syst., vol. 3, pp. 454–458, Nov. 1995.
[13] E. Kim, M. Park, S. Ji, and M. Park, "A new approach to fuzzy modeling," IEEE Trans. Fuzzy Syst., vol. 5, pp. 328–337, Aug. 1997.
[14] E. Mamdani, "Advances in the linguistic synthesis of fuzzy controllers," Int. J. Man–Machine Studies, vol. 8, pp. 669–678, 1976.
[15] C. T. Lin and C. S. G. Lee, "Neural-network-based fuzzy logic control and decision system," IEEE Trans. Comput., vol. 40, pp. 1320–1336, Dec. 1991.
[16] H. Nomura, I. Hayashi, and N. Wakami, "A self-tuning method of fuzzy control by descent method," Int. Fuzzy Syst. Assoc., Brussels, Belgium, 1991, pp. 155–158.
[17] L. X. Wang and J. Mendel, "Generating fuzzy rules by learning from examples," IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 1414–1427, 1992.
[18] C. C. Hung, "Building a neuro-fuzzy learning control system," AI Expert, pp. 40–49, Nov. 1993.
[19] J. S. R. Jang and C. T. Sun, "Neuro-fuzzy modeling and control," Proc. IEEE, vol. 83, no. 3, pp. 378–406, Mar. 1995.
[20] Y. Lin and G. A. Cunningham, "A new approach to fuzzy-neural system modeling," IEEE Trans. Fuzzy Syst., vol. 3, pp. 190–198, May 1995.
[21] W. A. Farag, V. H. Quintana, and G. Lambert-Torres, "Neural-network-based self-organizing fuzzy-logic automatic voltage regulator for a synchronous generator," in Proc. 28th Annu. NAPS, Cambridge, MA, Nov. 10–12, 1996, pp. 289–296.
[22] R. Langari and L. Wang, "Fuzzy models, modular networks, and hybrid learning," Fuzzy Sets Syst., vol. 79, pp. 141–150, 1996.
[23] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley, 1989.
[24] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs. New York: Springer-Verlag, 1994.
[25] C. L. Karr and E. J. Gentry, "Fuzzy control of pH using genetic algorithms," IEEE Trans. Fuzzy Syst., vol. 1, pp. 46–53, Feb. 1993.
[26] D. Park, A. Kandel, and G. Langholz, "Genetic-based new fuzzy reasoning models with application to fuzzy control," IEEE Trans. Syst., Man, Cybern., vol. 24, pp. 39–47, Jan. 1994.
[27] A. Homaifar and E. McCormick, "Simultaneous design of membership functions and rule sets for fuzzy controllers using genetic algorithms," IEEE Trans. Fuzzy Syst., vol. 3, May 1995.
[28] H. Ishibuchi, K. Nozaki, N. Yamamoto, and H. Tanaka, "Selecting fuzzy if–then rules for classification problems using genetic algorithms," IEEE Trans. Fuzzy Syst., vol. 3, pp. 260–270, Aug. 1995.
[29] M. Sugeno and G. T. Kang, "Structure identification of fuzzy model," Fuzzy Sets Syst., vol. 28, pp. 15–33, 1988.
[30] T. J. Ross, Fuzzy Logic with Engineering Applications. New York: McGraw-Hill, 1995.
[31] M. Lee and H. Takagi, "Dynamic control of genetic algorithms using fuzzy logic techniques," in Proc. 5th Int. Conf. Genetic Algorithms, Univ. Illinois, Urbana-Champaign, July 17–21, 1993, pp. 76–83.
[32] A. N. Aizawa and B. W. Wah, "Dynamic control of genetic algorithms in a noisy environment," in Proc. 5th Int. Conf. Genetic Algorithms, Univ. Illinois, Urbana-Champaign, July 17–21, 1993, pp. 48–49.
[33] J. Kim and B. P. Zeigler, "Designing fuzzy logic controllers using a multiresolutional search paradigm," IEEE Trans. Fuzzy Syst., vol. 4, pp. 213–226, Aug. 1996.
[34] G. E. P. Box and G. M. Jenkins, Time Series Analysis: Forecasting and Control, 2nd ed. San Francisco, CA: Holden-Day, 1976.
[35] R. M. Tong, "Synthesis of fuzzy models for industrial processes—Some recent results," Int. J. General Syst., vol. 4, pp. 143–162, 1978.
[36] K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Trans. Neural Networks, vol. 1, pp. 4–27, Mar. 1990.
[37] J.-S. R. Jang, "ANFIS: Adaptive-network-based fuzzy inference system," IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 665–685, 1993.
[38] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.

Wael A. Farag received the B.Sc. and M.Sc. degrees from Cairo University, Egypt, in 1990 and 1993, respectively, both in electrical engineering. Currently, he is a Ph.D. candidate in the Electrical and Computer Engineering Department at the University of Waterloo, Ontario, Canada.
From 1990 to 1994, he worked as an Assistant Lecturer in the Electrical Engineering Department at Cairo University. His research interests include microprocessor-based systems, fuzzy logic and neural networks, evolutionary algorithms, and control systems.

Victor H. Quintana (M'73–SM'80) received the Dipl. Ing. degree from the State Technical University of Chile in 1959, the M.Sc. degree in electrical engineering from the University of Wisconsin, Madison, in 1965, and the Ph.D. degree in electrical engineering from the University of Toronto, Ontario, Canada, in 1970.
Since 1973, he has been with the Department of Electrical and Computer Engineering, University of Waterloo, where he is currently a Full Professor. His main research interests are in the areas of numerical optimization techniques, state estimation, and control theory as applied to power systems.
Dr. Quintana is an Associate Editor of the International Journal of Energy Systems and a member of the Association of Professional Engineers of the Province of Ontario.

Germano Lambert-Torres received the Ph.D. degree in electrical engineering from École Polytechnique de Montréal, Canada.
Since 1983, he has been with the Escola Federal de Engenharia de Itajubá (EFEI), Brazil, where he is a Full Professor and Associate Chairman of Graduate Studies. During the 1995–1996 academic year, he was a Visiting Professor at the University of Waterloo, Canada. He has served as a consultant for many power industries in South American countries. He has published more than 200 book chapters, journal articles, and conference papers on intelligent systems applied to power-system problem solving.
