IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 9, NO. 5, SEPTEMBER 1998
Abstract—Linguistic modeling of complex irregular systems constitutes the heart of many control and decision-making systems, and fuzzy logic represents one of the most effective algorithms to build such linguistic models. In this paper, a linguistic (qualitative) modeling approach is proposed. The approach combines the merits of fuzzy logic theory, neural networks, and genetic algorithms (GA's). The proposed model is presented in a fuzzy-neural network (FNN) form which can handle both quantitative (numerical) and qualitative (linguistic) knowledge. The learning algorithm of the FNN is composed of three phases. The first phase is used to find the initial membership functions of the fuzzy model. In the second phase, a new algorithm is developed and used to extract the linguistic-fuzzy rules. In the third phase, a multiresolutional dynamic genetic algorithm (MRD-GA) is proposed and used for optimized tuning of the membership functions of the proposed model. Two well-known benchmarks are used to evaluate the performance of the proposed modeling approach and compare it with other modeling approaches.

Index Terms—Dynamic control, fuzzy logic, genetic algorithms, modeling, neural networks.

I. INTRODUCTION

The knowledge representation in fuzzy modeling can be viewed as having two classes. The first (class A), as suggested by Takagi and Sugeno in [11], can represent a general class of static or dynamic nonlinear systems. It is based on a "fuzzy partition" of the input space, and it can be viewed as an expansion of a piecewise linear partition represented as

  If x1 is A1^i and x2 is A2^i and ... and xn is An^i,
  then y^i = c0^i + c1^i x1 + ... + cn^i xn    (1)

where i (i = 1, 2, ..., R) denotes the ith fuzzy rule, xj (j = 1, ..., n) is the input, and y^i is the output of the ith fuzzy rule. A1^i, ..., An^i are fuzzy membership functions which can be bell-shaped, trapezoidal, or triangular, etc., and usually they are not associated with linguistic terms. From (1), it is noted that the Takagi and Sugeno approach approximates a nonlinear system with a combination of several linear systems by decomposing the whole input space into several partial fuzzy spaces and representing each output space with a linear equation.
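To make the class A representation of (1) concrete, the sketch below evaluates a two-input Takagi-Sugeno model as the firing-strength-weighted average of the rules' linear consequents. The two rules, the bell-shaped (Gaussian) membership functions, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def gaussian(x, center, width):
    # Membership degree of x in a bell-shaped fuzzy set.
    return math.exp(-((x - center) / width) ** 2)

# Hypothetical class A rule base for a two-input system: each rule pairs
# per-input fuzzy sets (center, width) with a linear consequent
# y^i = c0 + c1*x1 + c2*x2, as in (1).
rules = [
    {"sets": [(0.0, 1.0), (0.0, 1.0)], "coeffs": [0.5, 1.0, -0.2]},
    {"sets": [(2.0, 1.0), (1.0, 1.0)], "coeffs": [1.5, 0.3, 0.7]},
]

def ts_output(x):
    # Weighted average of the rules' linear outputs, weighted by each
    # rule's firing strength (product of membership degrees).
    num, den = 0.0, 0.0
    for r in rules:
        w = 1.0
        for xi, (c, s) in zip(x, r["sets"]):
            w *= gaussian(xi, c, s)
        y = r["coeffs"][0] + sum(ci * xi for ci, xi in zip(r["coeffs"][1:], x))
        num += w * y
        den += w
    return num / den

print(ts_output([0.5, 0.2]))
```

Near the center of a rule's fuzzy sets the output approaches that rule's linear model, which is exactly the piecewise-linear behavior the text describes.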
Many studies regarding finding the rules and tuning the membership function parameters of fuzzy models have been reported [2]-[5], [8]-[13]. Neural networks are integrated with fuzzy logic in the form of fuzzy neural networks (FNN's) and used to build fuzzy models [15]-[22]. Many algorithms have been proposed to train these FNN's [15]-[22], and Jang et al. [19] have reviewed the fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control. Lin and Lee [15] have proposed a three-phase learning algorithm. In the first phase, they used the self-organizing feature map algorithm for coarse identification of the fuzzy model parameters. In the second phase, they used a competitive learning technique to find the rules. And in the third phase, they used the backpropagation algorithm for fine-tuning the parameters.

In this paper, we propose a new approach for building linguistic models for complex dynamical systems. The structure of the model is formed using a five-layer fuzzy neural network. The parameter identification of the fuzzy model is composed of three phases. The first phase uses Kohonen's self-organizing feature map algorithm to find the initial parameters of the membership functions. A new algorithm is proposed, in the second phase, to find the linguistic rules. The third phase fine-tunes the membership function parameters using a new genetic algorithm (GA) called the multiresolutional dynamic genetic algorithm (MRD-GA). The method used in this work builds a linguistic model in a general framework known as the black-box approach in systems theory. That is, the model is built for a system without a priori knowledge about the system, provided that numerical input-output data are given.

This paper is organized as follows. In Section II, a brief overview of conventional genetic algorithms is given. Section III illustrates the structure of the neuro-fuzzy model. Section IV describes the hybrid learning algorithm. Sections V and VI present the simulation results of two benchmarks. Section VII concludes the work done in this paper.

II. OVERVIEW OF GENETIC ALGORITHMS

GA's are powerful search optimization algorithms based on the mechanics of natural selection and natural genetics. GA's can be characterized by the following features [23], [24]:

• a scheme for encoding solutions to the problem, referred to as chromosomes or strings;
• an evaluation function (referred to as a fitness function) that rates each chromosome relative to the others in the current set of chromosomes (referred to as a population);
• an initialization procedure for a population of chromosomes (strings);
• a set of operators which are used to manipulate the genetic composition of the population (such as recombination, mutation, crossover, etc.);
• a set of parameters that provide initial settings for the algorithm and operators as well as the algorithm's termination conditions.

A candidate solution (in a GA) for a specific problem is called a chromosome and consists of a linear list of genes, where each gene can assume a finite number of values (alleles). A population consists of a finite number of chromosomes. The genetic algorithm evaluates a population and generates a new one iteratively, with each successive population referred to as a generation. Given an initial population P(0), the GA generates a new generation P(t) based on the previous generation P(t - 1) as follows [24]:

  t := 0                  :initialize population P(t) at time t
  evaluate P(t)
  while (not terminate-condition) do
  begin
    t := t + 1            :increment generation
    select P(t) from P(t - 1)
    recombine P(t)        :apply genetic operators (crossover, mutation)
    evaluate P(t)
  end
  end.

The GA uses three basic operators to manipulate the genetic composition of a population: reproduction, crossover, and mutation. Reproduction is a process by which the most highly rated chromosomes in the current generation are reproduced in the new generation. The crossover operator provides a mechanism for chromosomes to mix and match attributes through random processes. For example, if two chromosomes (parents) a1 a2 a3 a4 a5 and b1 b2 b3 b4 b5 are selected at random and an arbitrary crossover site is selected (such as "3"), then the resulting two chromosomes (offspring) will be a1 a2 a3 b4 b5 and b1 b2 b3 a4 a5 after the crossover operation takes place. Mutation is a random alteration of some gene values in a chromosome. Every gene in each chromosome is a candidate for mutation, and its selection is determined by the mutation probability.

GA's provide a means to optimize ill-defined, irregular problems. They can be tailored to the needs of different situations. Because of their robustness, GA's have been successfully applied to generate if-then rules and adjust membership functions of fuzzy systems [25]-[28].

The GA described above is a conventional GA, meaning one in which the parameters are kept constant while the optimization process is running (a static GA). In our approach, we introduce a new dynamic GA (MRD-GA) in which some of its parameters, as well as the problem configuration, change from one generation to the next (while the optimization process is running), as will be discussed later in Section IV-C.

III. THE NEURO-FUZZY (NF) MODEL TOPOLOGY

The NF model is built using the multilayer fuzzy neural network shown in Fig. 1. The system has a total of five layers, as proposed by Lin and Lee [15]. A model with two inputs and a single output is considered here for convenience. Accordingly, there are two nodes in layer 1 and one node in layer 5. Nodes in layer 1 are input nodes that directly transmit input signals to the next layer. Layer 5 is the output layer. Nodes in layers 2 and 4 are "term nodes" and they act as membership functions.
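The conventional (static) GA loop and the one-point crossover and mutation operators described in Section II can be sketched as follows. This is a minimal illustration only: the bit-string encoding, elitist selection scheme, fitness function, and parameter values are our assumptions, not the paper's MRD-GA.

```python
import random

def crossover(parent_a, parent_b, site):
    # One-point crossover: swap the gene tails after the chosen site.
    return (parent_a[:site] + parent_b[site:],
            parent_b[:site] + parent_a[site:])

def mutate(chrom, pmut):
    # Every gene is a candidate for mutation with probability pmut.
    return [1 - g if random.random() < pmut else g for g in chrom]

def evolve(fitness, n_genes=10, pop_size=20, pcross=0.9, pmut=0.05, gens=50):
    # Static GA: pcross, pmut, and chromosome length stay fixed for the run.
    pop = [[random.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(gens):
        scored = sorted(pop, key=fitness, reverse=True)
        pop = scored[: pop_size // 2]        # reproduction: keep the best half
        while len(pop) < pop_size:
            a, b = random.sample(scored[: pop_size // 2], 2)
            if random.random() < pcross:
                a, b = crossover(a, b, random.randint(1, n_genes - 1))
            pop.extend([mutate(a, pmut), mutate(b, pmut)])
        pop = pop[:pop_size]
    return max(pop, key=fitness)

best = evolve(fitness=sum)  # toy "one-max" objective: count the 1-genes
print(best)
```

Because the top half of each generation is copied forward unmodified, the best fitness found never decreases across generations.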
Fig. 1. Topology of the neuro-fuzzy model.

Note that layer 2 links are all set to unity.

Layer 3: The links in this layer are used to perform precondition matching of fuzzy rules. Thus, each node has two input values from layer 2. The correlation-minimum inference procedure is utilized here to determine the firing strength of each rule. The output of the nodes in this layer is determined by a fuzzy AND operation.
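The layer-3 precondition matching described above reduces to a min operation over the rule's incoming membership degrees; a minimal sketch (the membership values are placeholders):

```python
def firing_strength(membership_values):
    # Correlation-minimum inference: a rule's firing strength is the
    # minimum of the membership degrees of its precondition terms.
    return min(membership_values)

# Two layer-2 inputs feeding one rule node:
print(firing_strength([0.8, 0.3]))  # -> 0.3
```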
first and the second subtasks. In phase three, a supervised learning scheme is used to perform the third subtask. To initiate the learning scheme, training data and the desired or selected coarseness of the fuzzy partition (i.e., the size of the term set of each input-output linguistic variable) must be provided from the outside world. For more details about the structure identification of fuzzy models, refer to Sugeno et al. [29].

A. Learning-Phase One

The problem for the self-organized learning can be stated as: "Given the training input data x(t), t = 1, 2, ..., N, the desired output value y(t), the fuzzy partitions |T(x)| and |T(y)|, and the desired shapes of membership functions, we want to locate the membership functions." In this phase, the network works in a two-sided manner; that is, the nodes and the links at layer 5 are in the up-down transmission mode (follow the dotted lines in Fig. 1) so that the training input and output data are fed into this network from both sides.

The centers (or means) and the widths (or variances) of the membership functions are determined by a self-organized learning technique that is analogous to statistical clustering. This serves to allocate network resources efficiently by placing the domains of the membership functions over only those regions of the input-output space where data are present. Kohonen's self-organized feature-map (SOM) algorithm is adapted to find the centers of the membership functions [15].

B. Learning-Phase Two

After the parameters of the membership functions have been found, the training signals from both external sides can reach the outputs of the term nodes at layer two and layer four. Furthermore, the outputs of the term nodes at layer two can be transmitted to the rule nodes through the initial architecture of the layer-three links. Thus we can get the firing strength of each rule node. Based on these rule-firing strengths and the outputs of the term nodes at layer four, we want to decide the correct consequence link for each rule node (from the connected layer-four links) to find the fuzzy rules. A new algorithm is proposed here to perform this task. We refer to this algorithm as the maximum matching-factor algorithm (MMFA). The MMFA is described as follows.

Step 1: For each layer-three rule node we construct matching factors, one per output term node. Each matching factor is denoted as MF(i, j), where the subscript i is the rule-node index (i = 1, 2, ..., R) and the subscript j is the output-linguistic-variable index (output-term-node index) (j = 1, 2, ..., |T(y)|).

Step 2: MF(i, j) is accumulated over the N available training examples.
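One plausible reading of the MMFA, sketched below: for every rule node, accumulate a matching factor for each output term over the training examples (rule firing strength times the desired output's membership degree in that term), then connect the rule to the output term with the largest factor. The data layout and the product-accumulation rule are our assumptions, since the paper's Step 2 pseudocode did not survive in this copy.

```python
def mmfa(firing_strengths, output_memberships):
    """Pick one consequent (output term) per rule node.

    firing_strengths[t][i]   : firing strength of rule i on example t
    output_memberships[t][j] : membership degree of the desired output
                               in output term j on example t
    Returns one chosen consequent index per rule.
    """
    n_rules = len(firing_strengths[0])
    n_terms = len(output_memberships[0])
    # Matching factors MF(i, j), accumulated over all training examples.
    mf = [[0.0] * n_terms for _ in range(n_rules)]
    for fs, om in zip(firing_strengths, output_memberships):
        for i in range(n_rules):
            for j in range(n_terms):
                mf[i][j] += fs[i] * om[j]
    return [max(range(n_terms), key=lambda j: mf[i][j])
            for i in range(n_rules)]

# Toy data: two examples, two rules, three output terms.
fs = [[0.9, 0.1], [0.2, 0.8]]
om = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(mmfa(fs, om))  # rule 0 -> term 0, rule 1 -> term 2
```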
comparison with class A models, Sugeno's model [10] has a lower MSE value using six inputs but, at the same time, has a much higher MSE value using the same inputs used by our model. Also, this model is quite difficult to build [8], [12], [13]. The most difficult aspect lies in the identification of the premise structure, mainly the membership functions of the input variables. For each membership function, at least two or three parameters have to be calculated through a nonlinear programming procedure. The choice and computation of these membership functions are rather tricky and subjective, so it is possible for different designers to sometimes obtain completely different results.

Wang's model (class A) [8] has comparable results and fewer rules; however, the number of rules does not necessarily give a reliable indication of the number of unknown parameters of the model. For example, in Wang's model [8], five class A rules are used with two inputs, and the number of unknown parameters in this case (in both the premise and consequent parts) is 35. In our model, 37 class B rules are used with two inputs and 46 unknown parameters. Bearing in mind that our model shows about a 30% reduction in the MSE value and provides a linguistic description for the gas-furnace system, these two advantages, in our view, compensate for the difference in the number of parameters (46 versus 35).

FARAG et al.: GENETIC-BASED NEURO-FUZZY APPROACH 763

VI. THE SECOND NUMERICAL EXAMPLE

This example is taken from Narendra et al. [36], in which the plant to be identified is given by the second-order highly nonlinear difference equation

  y(k + 1) = y(k) y(k - 1) [y(k) + 2.5] / [1 + y^2(k) + y^2(k - 1)] + u(k)    (22)

Training data of 500 points are generated from the plant model, assuming a random input signal u(k) uniformly distributed in the interval [-2, 2]. This data is used to build a linguistic-fuzzy model for this plant.

The plant is modeled using the FNN described in Section III. The model has three inputs, y(k), y(k - 1), and u(k), and a single output, y(k + 1). The inputs y(k) and y(k - 1) are intuitively partitioned into five fuzzy linguistic spaces {NL, NS, ZE, PS, PL}, the input u(k) is partitioned into three fuzzy spaces {N, Z, P}, and the output is partitioned into 11 fuzzy spaces {NVL, NL, NM, NS, NVS, ZE, PVS, PS, PM, PL, PVL}.
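The training-data generation for this example can be sketched as follows. The difference equation and the [-2, 2] input interval are taken from the Narendra-Parthasarathy benchmark [36] that this example cites, since the formula itself is garbled in this copy; treat them, along with the function names, initial state, and seed, as assumptions.

```python
import random

def plant(y_k, y_km1, u_k):
    # Second-order nonlinear benchmark plant (Narendra & Parthasarathy [36]).
    return y_k * y_km1 * (y_k + 2.5) / (1.0 + y_k**2 + y_km1**2) + u_k

def generate_training_data(n_points=500, seed=0):
    # Drive the plant with a random input uniformly distributed in [-2, 2]
    # and record (y(k), y(k-1), u(k)) -> y(k+1) pairs for model building.
    rng = random.Random(seed)
    y_km1, y_k = 0.0, 0.0
    data = []
    for _ in range(n_points):
        u_k = rng.uniform(-2.0, 2.0)
        y_kp1 = plant(y_k, y_km1, u_k)
        data.append(((y_k, y_km1, u_k), y_kp1))
        y_km1, y_k = y_k, y_kp1
    return data

data = generate_training_data()
print(len(data))  # 500 training pairs
```

The recorded triples correspond directly to the model's three inputs, and the recorded y(k + 1) values to its single output.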
TABLE III
COMPARISON OF OUR MODEL WITH OTHER MODELS
The SOM algorithm described in Section IV-A is used to determine the initial centers and widths of the 24 membership functions of the input-output variables of the fuzzy model. The four scaling factors of this fuzzy model are determined from this learning phase as Gu = 0.7476, Gy = 0.4727, Gyy = 0.6261, and Go = 5.5781.

According to the structure of this fuzzy-neural network, the number of rules (rule nodes in the third layer) is 5 x 5 x 3 = 75. The MMFA described in Section IV-B is used to find the 75 rules of this fuzzy model, and the results are shown in Table IV.

The MRD-GA (Section IV-C) is applied to optimize the parameters of the dynamic-system model. The algorithm parameters are set as follows: pop size, pmut = 0.05, chromosome length, and the remaining MRD-GA parameters. After finishing the second learning phase and before applying the MRD-GA, the model has an MSE value of 0.2058. This MSE value is decreased to 0.0374 after 3517 generations using a single-point crossover with a pcross value of 0.9 (note that the MSE value reached 0.06 after only 470 generations). The computation time used to perform this learning process is given in Table V. The MSE decay rates using different crossover probabilities are shown in Fig. 6.

After the learning process is finished, the model is tested by applying a sinusoidal input signal to the fuzzy model. The outputs of both the fuzzy model and the actual model are shown in Fig. 7. The fuzzy model has a good match with the actual model, with an MSE of 0.0403. Another test is carried out using a second input signal; the result is shown in Fig. 8, and the MSE in this case is 0.0369. After extensive testing and simulations, the fuzzy model showed good performance in forecasting the output of this complex dynamic plant. Remember that in this example only 500 data points are used to build the model, while in [36], 100 000 data points have been used to identify a neural-network model. It can be expected that the performance of the identified fuzzy model may be further improved if the number of data points used to build the model is increased.

In order to compare our modeling approach with Sugeno's [10], [11] and Wang's [8] approaches, both of these approaches are implemented. Sugeno's approach is implemented using the MATLAB fuzzy-logic toolbox. The approach [37] applies the least-squares algorithm (LSA) and the backpropagation gradient-descent method for identifying the linear (consequent) and nonlinear (premise) parameters of the class A fuzzy rules, respectively. The core function of this algorithm is implemented in C code optimized for speed. Wang's approach is implemented using the C programming language. The approach uses the fuzzy C-means (FCM) clustering algorithm [38] to find the premise parameters of the class A fuzzy rules, and then applies the least-squares algorithm to find the consequent linear parameters of the rules.

Table VI compares our modeling approach with both Sugeno's and Wang's approaches. The models are learned from the previously generated 500 data pairs and tested by applying a sinusoidal input signal. All the experiments are carried out on a Pentium 166-MHz PC. The comparison shows the advantages of our modeling approach.
TABLE IV
THE COMPLETE FAM MATRICES WITH THE FUZZY RULES

TABLE V
THE COMPUTATION TIME OF EXAMPLE 2
TABLE VI
A MODELING COMPARISON USING EXAMPLE 2