
A Generation Method for Fuzzy Rules using Neural Networks with Planar Lattice Architecture


Eiichiro TAZAKI and Norimasa INOUE
Department of Control and Systems Engineering
Toin University of Yokohama
1614 Kurogane-cho, Midori-ku, Yokohama 225, JAPAN

Abstract
In this paper, we first present a method for the automated extraction of fuzzy rules using neural
networks with a planar lattice architecture. The neural network is composed of three layers:
an input layer, a hidden layer with a lattice architecture, and an output layer. In the hidden layer, the
neurons are arranged in a lattice structure, with each neuron assigned a position in the lattice.
Each neuron of the hidden layer is assigned a fuzzy proposition that forms part of a fuzzy rule. The
network is trained structurally through the generation and annihilation of neurons. After the rules
have been learned, simple fuzzy production rules can be extracted from the network.
Next, we extend the method to handle multi-dimensional rules
by composing the extracted simple rules. We apply the proposed method to generate diagnostic
rules for hernia of the intervertebral disc.

1 Introduction
Recently, neural network models and the automatic generation of expert systems based on learning
processes have attracted growing interest as useful tools for mainstream tasks in artificial intelligence [1].
Neural networks embody the information derived from the training data and are implicitly assumed to contain
the fuzzy rules and/or knowledge base used for expert systems. This paper presents a method to extract
fuzzy production rules automatically by using a modified layered feedforward neural network with a lattice
architecture. The neural network is composed of three layers: an input layer, a hidden layer with a lattice
architecture, and an output layer. In the hidden layer the neurons are arranged in a lattice structure, with each
neuron assigned a position in the lattice. Each neuron is assigned a fuzzy proposition, which may be selectively
composed into a fuzzy rule depending on the activation of the neuron. The network is trained structurally through the
generation and annihilation of neurons in the lattice structure. After the learning process we can generate simple
fuzzy rules from the network. Moreover, we extend the method to handle multi-dimensional rules and
examine it by applying it to the problem of the medical diagnosis of hernia of the intervertebral
disc.
2 Generation of Fuzzy Production Rules
The neural network model used to generate the production rules can be described in terms of its network,
neuron, and behavioral properties as follows. The neural network consists of three layers, as shown in
Fig.1. The input layer and the output layer each consist of fuzzy cell groups. The hidden layer
has a lattice architecture in which each neuron is assigned a position in the lattice. All propositions that
may be selected in production rules are arranged on the node neurons of the lattice, as shown in Fig.2. In the
hidden layer, a proposition Ai to be adopted in the antecedent of a rule is assigned to the i-th row axis of the
lattice structure. In a similar way, a fuzzy proposition Bj to be adopted in the consequent is assigned to the j-th
column axis. Neuron inputs and activations may be continuous, assuming values in the interval [0,1]. Here,
the magnitude of the input pattern to the network corresponds to the membership values of the input possibility
distribution. Each neuron of the input and output layers computes a single numerical output or activation, and
typically every neuron uses the same algorithm for computing its activation. The neurons
of the hidden layer, on the other hand, behave differently from those of the input and output layers. During the
learning process of rules, specified neurons may be generated or annihilated to grow a suitable network configuration.
After the learning process, we may generate simple fuzzy production rules corresponding to a certain level
of activation of the neurons in the lattice. We may also obtain multi-dimensional rules composed of the generated
simple rules by observing the values of the weights between the neurons. The learning process is described
in detail in the next section.

0-7803-1901-X/94 $4.00 © 1994 IEEE
3 Learning in Neural Network [2]
In this section, learning in the neural network with lattice architecture is described. We consider
the network composed of an input layer, a hidden layer with a lattice architecture, and an output layer (see Fig.1).
In the input layer there are L neurons, which receive the input signals $x = (x_1, x_2, \ldots, x_L)$. In
the hidden layer, the neurons are arranged in a lattice structure, with each neuron assigned a position in the
lattice. Each neuron within this layer is assigned a fuzzy proposition composing a fuzzy rule (see Fig.2).
The input of neuron m in the hidden layer is

$$I_m = x W_m, \quad m = 1, 2, \ldots, M, \tag{1}$$

where $W_m = (w_{1m}^{IH}, w_{2m}^{IH}, \ldots, w_{Lm}^{IH})^T$ is the connection weight vector between the neurons in the input layer and
neuron m in the hidden layer; $w_{lm}^{IH}$ is the connection weight between neuron l in the input layer and neuron
m in the hidden layer. In the output layer there are N neurons, and their output is

$$Y(t) = (y_1(t), y_2(t), \ldots, y_N(t)), \tag{2}$$

where $y_n(t) = \sum_m w_{mn}^{HO} u_m$ and $u_m$ is the output level of neuron m in the hidden layer.
Learning in the network is performed by changing the connection weights between layers and by the
generation and annihilation of neurons in the hidden layer, as described in the following subsections.
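As an illustration of (1) and (2), the forward computation reduces to two weighted sums. The following Python sketch is only a minimal reading of those equations: the function names, the example weights, and the choice of passing the hidden inputs straight through as hidden outputs are assumptions made for illustration, not part of the proposed method.

```python
def hidden_inputs(x, w_ih):
    """Eq. (1): input I_m of each hidden neuron m as the inner product x . W_m.

    w_ih[m] holds the weight vector W_m (length L) of hidden neuron m.
    """
    return [sum(xl * wl for xl, wl in zip(x, w_m)) for w_m in w_ih]


def network_outputs(u, w_ho):
    """Eq. (2): y_n = sum over m of w_ho[m][n] * u[m] for each output neuron n."""
    n_out = len(w_ho[0])
    return [sum(u[m] * w_ho[m][n] for m in range(len(u))) for n in range(n_out)]


x = [0.2, 0.8]                    # membership values fed to L = 2 input neurons
w_ih = [[1.0, 0.0], [0.5, 0.5]]   # M = 2 hidden neurons (illustrative weights)
w_ho = [[1.0], [2.0]]             # N = 1 output neuron (illustrative weights)

# Assumption: hidden outputs are taken equal to hidden inputs here; the
# lattice activation of Section 3.1 would replace this step.
u = hidden_inputs(x, w_ih)
y = network_outputs(u, w_ho)
```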
3.1 Output of Neuron in Lattice
For neuron i in the hidden layer, the rule for changing the pre-sigmoidal activation level is

where $p_i[n]$ is the pre-sigmoidal activation level for neuron i at time index n; $F()$ is a positive monotonic
decreasing function, whose purpose is to enable a neuron to generate a high output activation level
when the input vector is close to its weight vector in the pattern space; $Met()$ is a metric that measures the
distance between two vectors in a metric space; $N_i$ is the neighborhood set of neuron i; $u_j$ is the output
level of neuron j; and $r_{ij}$, called the lateral interconnection weight between neuron j and neuron i, is the
weight associating neuron i with neuron j. $r_{ij}$ depends only on the relative position between neuron i and
neuron j in the lattice, i.e. $r_{ij} = q(l_i - l_j)$, where $l_i$, $l_j$ are the lattice indexes of neurons i and j, respectively.
$q()$, called the spatial impulse response function of the network, is normally in the form of a high-pass filter.
$\alpha_1$ and $\alpha_2$ in (3) are two constants used to weight the contributions of the system inputs and of the outputs
from other neurons to the activation level.
If we use a Euclidean metric for Met() in (3), then F() can be written as:

f() can be defined as f(z) = eX".


The output level for neuron i is

$$u_i[n] = \sigma_i(p_i[n]), \tag{5}$$

where $\sigma_i()$ is the output function for neuron i.
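The activation rule of this subsection is characterized in the text through its terms: a closeness term F(Met(x, w_i)), a lateral term built from the neighbors' outputs, and two weighting constants. The Python sketch below combines them additively, which is an assumption about the form of (3); the Gaussian-like choice of F, the logistic output function, and all names and default values are likewise hypothetical.

```python
import math


def pre_activation(x, w_i, neighbor_outputs, lateral_weights, a1=1.0, a2=0.1):
    """Assumed additive form: p_i = a1 * F(Met(x, w_i)) + a2 * sum_j r_ij * u_j."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, w_i))  # squared Euclidean metric
    closeness = math.exp(-dist_sq)   # F: positive and decreasing in distance (assumed)
    lateral = sum(r * u for r, u in zip(lateral_weights, neighbor_outputs))
    return a1 * closeness + a2 * lateral


def output_level(p):
    """Eq. (5) with a logistic sigmoid as the output function (an assumption)."""
    return 1.0 / (1.0 + math.exp(-p))


p_near = pre_activation([0.5, 0.5], [0.5, 0.5], [], [])       # input equals weight vector
p_far = pre_activation([0.5, 0.5], [0.0, 1.0], [], [])        # input far from weight vector
p_inhibited = pre_activation([0.5, 0.5], [0.5, 0.5], [1.0], [-0.5])  # one inhibiting neighbor
```

With these choices a neuron whose weight vector matches the input reaches the largest pre-sigmoidal level, and a negative lateral weight from an active neighbor lowers it.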



3.2 Adjustment of Input Weight Vector in Lattice
The input weight vectors of the neurons in the network are adjusted as follows:

$$w_j[n] = w_j[n-1] + \alpha\,\phi[n-1](\|l_j - l_i\|)\,(x[n] - w_j[n-1]), \tag{6}$$

where $x[n]$ is the input signal of the network at time index n; $w_j[n]$ is the input weight vector of neuron j at
time index n; $\alpha$ is the learning rate, a constant used to control the rate of convergence of the learning
process; and $l_i$ is the position of neuron i in the lattice. $\phi[n](a)$ is

$$\phi[n](a) = \begin{cases} 1 & \text{if } a \le R[n], \\ 0 & \text{otherwise,} \end{cases} \tag{7}$$

where $R[n]$ is a positive definite monotonic decreasing function; the minimum value of $R[n]$ is 1, so that the
network retains its organizing capability.
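The update in (6) and (7) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: neuron i is taken to be the neuron selected for the current input, the neighborhood function is taken to be 1 inside the radius and 0 outside, lattice distance is Euclidean, and the function and parameter names are illustrative.

```python
import math


def neighborhood(a, radius):
    """Eq. (7), assumed hard form: 1 inside the current radius R[n], 0 outside."""
    return 1.0 if a <= radius else 0.0


def update_weights(weights, positions, winner, x, alpha, radius):
    """Eq. (6): move each neuron j toward x, gated by its lattice distance to the winner."""
    l_i = positions[winner]
    updated = []
    for w_j, l_j in zip(weights, positions):
        gate = neighborhood(math.dist(l_j, l_i), radius)
        updated.append([w + alpha * gate * (xv - w) for w, xv in zip(w_j, x)])
    return updated


weights = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
positions = [(0, 0), (0, 1), (3, 3)]      # lattice sites of the three neurons
new_weights = update_weights(weights, positions, winner=0,
                             x=[1.0, 0.0], alpha=0.5, radius=1.0)
```

Here the winner and its lattice neighbor at distance 1 move halfway toward the input, while the distant neuron at site (3, 3) is left unchanged.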
3.3 Structural Learning in Lattice
Neuron Generation Process
The average system distortion is

$$D = E[\|x - Q(x)\|^2] = \int_V \|x - Q(x)\|^2 p(x)\,dx, \tag{8}$$

where $V = \bigcup_{i=1}^{M} V_i$ is the input vector space.

Whenever the source signal x falls into the region $V_i$, neuron i is selected to represent the input signal,
and the vector quantizer replaces x by $Q(x) = w_i$; hence we may write

$$D = \sum_{i=1}^{M} \int_{V_i} \|x - w_i\|^2 p(x)\,dx, \tag{9}$$

where M is the number of neurons in the network.

The probability of selecting neuron i is

$$P_i = \int_{V_i} p(x)\,dx. \tag{10}$$

Equation (9) may be replaced by

$$D = \sum_{i=1}^{M} P_i d_i, \tag{11}$$

where $d_i$ is the average distortion observed by neuron i.
Suppose the allowable average system distortion is $\epsilon_d$; then neuron i should be split into two neurons if

$$d_i P_i > \frac{\epsilon_d}{M}. \tag{12}$$
To measure $d_i$ dynamically, we define an operational measure of $d_i$ as

$$\hat{d}_i[n_i^k] = \gamma_d \hat{d}_i[n_i^{k-1}] + (1 - \gamma_d)\|x[n_i^k] - w_i[n_i^{k-1}]\|^2, \tag{13}$$

where $n_i^k$ is the time index at which neuron i is selected for the k-th time; $\hat{d}_i[m]$ is the distortion measure for
neuron i at time index m; and $\gamma_d$, a factor between 0 and 1, is used to control the effective temporal window
size for the averaging process. Notice that $\hat{d}_i$ is updated only when neuron i is selected; between $n_i^{k-1}$ and
$n_i^k$, $\hat{d}_i$ stays the same.
If we repeatedly apply equation (13) to substitute in the right-hand side of (13), then $\hat{d}_i[n_i^k]$ can be
rewritten as
Similarly to (13), we define the operational measure of $P_i$ as

$$\hat{P}_i[n_i^k] = \gamma_p \hat{P}_i[n_i^{k-1}] + (1 - \gamma_p)\frac{1}{Int[n_i^k]}, \tag{15}$$

where $Int[n_i^k] = n_i^k - n_i^{k-1}$.
Every time neuron i is selected, $\hat{d}_i$ and $\hat{P}_i$ are checked to see if (12) is satisfied; if so, a new neuron
is generated.
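The bookkeeping behind this check can be sketched as a small Python class. It is a sketch under stated assumptions: both estimates start at zero, the probability estimate is left untouched on the first selection (no interval exists yet), and the class and method names are hypothetical.

```python
class NeuronStats:
    """Running distortion and selection-probability estimates for one hidden neuron."""

    def __init__(self, gamma_d=0.9, gamma_p=0.9):
        self.gamma_d = gamma_d
        self.gamma_p = gamma_p
        self.d_hat = 0.0           # operational measure of d_i (assumed initial value)
        self.p_hat = 0.0           # operational measure of P_i (assumed initial value)
        self.last_selected = None  # time index of the previous selection

    def on_selected(self, n, x, w):
        """Update both estimates when the neuron is selected at time index n."""
        err = sum((a - b) ** 2 for a, b in zip(x, w))
        self.d_hat = self.gamma_d * self.d_hat + (1 - self.gamma_d) * err
        if self.last_selected is not None:
            interval = n - self.last_selected          # Int[n_i^k]
            self.p_hat = self.gamma_p * self.p_hat + (1 - self.gamma_p) / interval
        self.last_selected = n

    def should_split(self, eps_d, m):
        """Split criterion (12): d_i * P_i > eps_d / M."""
        return self.d_hat * self.p_hat > eps_d / m


stats = NeuronStats(gamma_d=0.5, gamma_p=0.5)
stats.on_selected(0, [1.0, 0.0], [0.0, 0.0])
stats.on_selected(2, [1.0, 0.0], [0.0, 0.0])
```

After these two selections the distortion estimate is 0.75 and the probability estimate is 0.25, so the neuron would split for an allowable distortion of 1.0 with M = 10 neurons but not for 4.0.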
Guideline of Neuron Generation
1. Find acceptable empty lattice sites within the neighborhood region of the parent neuron and list them
in order of preference.
2. If the list generated in Step 1 is not empty, place the new neuron on the position specified by the top
entry of the list.
3. Else if the list is empty, move the lattice toward the desirable direction to make room for the new
neuron (this operation is called Lattice Expansion).
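The three steps of the guideline can be sketched in Python as below. The sketch assumes a two-dimensional lattice addressed by (row, column) pairs and takes the preference order of neighboring offsets as an input; the function name and the use of `None` to signal that lattice expansion is needed are illustrative choices.

```python
def choose_site(parent, occupied, preference):
    """Return the preferred free site next to the parent, or None if none is free.

    parent     -- (row, col) site of the parent neuron
    occupied   -- set of (row, col) sites already holding a neuron
    preference -- neighbor offsets (d_row, d_col), most preferred first
    """
    row, col = parent
    # Step 1: list the empty neighboring sites in order of preference.
    free = [(row + dr, col + dc) for dr, dc in preference
            if (row + dr, col + dc) not in occupied]
    # Step 2: take the top entry if the list is not empty.
    # Step 3: otherwise signal that the lattice must be expanded.
    return free[0] if free else None


offsets = [(0, 1), (1, 0), (0, -1), (-1, 0)]
site = choose_site((0, 0), {(0, 0), (0, 1)}, offsets)
full = choose_site((0, 0), {(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)}, offsets)
```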
The criterion for selecting the lattice site of a new neuron can be evaluated from the
distribution of distortion along the different local lattice directions. For each neuron i, we can define the local axes
by looking into the context of neuron input weight vectors in the neighborhood of neuron i. For example, if
neuron 2 is the closest neighbor of neuron 1 in the positive x direction on the lattice, then we can define the
local +x direction for neuron 1 as

$$x_1^+ = \frac{w_2 - w_1}{\|w_2 - w_1\|}. \tag{16}$$
With the local axes defined for a given neuron, we can then define the distribution of average distortion on
different local axes. For example, the average distortion on the +x axis for neuron i can be defined as

$$\hat{d}_{i,x+}[n_i^k] = \gamma_d \hat{d}_{i,x+}[n_i^{k-1}] + (1 - \gamma_d)\left(r\left((x[n_i^k] - w_i[n_i^{k-1}]) \cdot x_i^+\right)\right)^2, \tag{17}$$

where $r()$ is the unit ramp function; $\hat{d}_{i,x-}$, $\hat{d}_{i,y+}$, $\hat{d}_{i,y-}$ can be defined likewise.
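The local axis and the directional distortion update can be sketched in Python as follows. The function names are illustrative; the unit ramp is taken as r(v) = v for v > 0 and 0 otherwise, so only the part of the error that points along the +x axis contributes.

```python
import math


def local_axis(w_from, w_to):
    """Unit vector from w_from to w_to, used as a local lattice direction."""
    diff = [b - a for a, b in zip(w_from, w_to)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def ramp(v):
    """Unit ramp function: v for positive v, 0 otherwise."""
    return v if v > 0 else 0.0


def update_directional_distortion(d_prev, x, w_i, axis, gamma_d):
    """Running average of the squared error component along one local axis."""
    projection = sum((xv - wv) * av for xv, wv, av in zip(x, w_i, axis))
    return gamma_d * d_prev + (1 - gamma_d) * ramp(projection) ** 2


axis = local_axis([0.0, 0.0], [2.0, 0.0])
d_forward = update_directional_distortion(0.0, [0.5, 0.3], [0.0, 0.0], axis, 0.5)
d_backward = update_directional_distortion(0.0, [-0.5, 0.3], [0.0, 0.0], axis, 0.5)
```

An input lying in the +x direction raises the +x distortion, while an input on the opposite side leaves it unchanged and would instead be counted on the -x axis.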


Each neuron also keeps track of a measure called alias energy, which is the average distortion along the
axes perpendicular to all the current axes. The operational measure of the alias energy of neuron i is:

where

is called the alias operator, which maps the vector y to its alias component related to all the local axes $\{a_i\}$
of neuron i. $\hat{d}_{i,A}$ is used as a measure to determine whether the current axes for neuron i are sufficient to
represent the input patterns. If $\hat{d}_{i,A}$ is high, a new axis must be generated. When a new axis is generated for
neuron i, the new direction is set to be

$$\frac{Proj_A((x - w_i), i)}{\|Proj_A((x - w_i), i)\|}.$$
Whenever a new neuron is generated, its input weight vector is set to be

where $a_j$, the $a_j$ axis for neuron i, is the local direction in which to put the new neuron; $\hat{d}_{i,a_j}$ is the average distortion
along axis $a_j$ for neuron i; and $\delta$, a number between 0 and 1, is used to control the similarity between the newborn
neuron and its parent.

Neuron Annihilation and Coalition Process
The average output activity of neuron i is

where $y_i$ is the output level for neuron i; $y_i[n]$ is emulated by $O(\|C(x[n]) - l_i\|)$, where $C(x[n])$ is the lattice
position of the neuron which is the closest neighbor to the source signal $x[n]$.
If, after a long training period, $\widehat{Act}_i$ is very low for a particular neuron, then that neuron should be deleted.
Similar to equation (12), neuron i will be deleted if

$$\widehat{Act}_i < \frac{\epsilon_a}{M}.$$
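The deletion test can be sketched in Python as below. The sketch only applies the threshold; how each activity estimate is accumulated during training is left outside it, and the function and parameter names are hypothetical.

```python
def neurons_to_delete(activities, eps_a):
    """Indices of neurons whose average activity falls below eps_a / M.

    activities -- average output activity estimate of each hidden neuron
    eps_a      -- activity threshold constant (name assumed for illustration)
    """
    m = len(activities)
    return [i for i, act in enumerate(activities) if act < eps_a / m]


# With M = 4 and eps_a = 0.2 the per-neuron threshold is 0.05.
doomed = neurons_to_delete([0.4, 0.01, 0.3, 0.02], eps_a=0.2)
```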
3.4 Adjustment of Connection Weight between Hidden and Output Layer
The connection weights between the hidden layer and the output layer are adjusted as follows:

where $w_{ji}^{HO}$ is the connection weight between neuron j in the hidden layer and neuron i in the output layer;
0 < p << 1; $t_i$ is the desired output of neuron i; and $y_i$ is the output of neuron i in the output layer.
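One update consistent with the symbols defined here is the standard delta rule, in which each weight moves in proportion to the output error and the hidden neuron's output. Treating the adjustment as a delta rule is an assumption, as are the function name and the use of the hidden output levels u_j; the sketch below is illustrative only.

```python
def update_output_weights(w_ho, u, targets, outputs, rate=0.05):
    """Assumed delta rule: w_ji <- w_ji + rate * (t_i - y_i) * u_j.

    w_ho[j][i] -- weight from hidden neuron j to output neuron i
    u          -- output levels of the hidden neurons
    targets    -- desired outputs t_i
    outputs    -- actual outputs y_i
    rate       -- small positive step size (0 < rate << 1 in practice)
    """
    return [[w_ho[j][i] + rate * (targets[i] - outputs[i]) * u[j]
             for i in range(len(targets))]
            for j in range(len(u))]


# A large rate is used here only to keep the arithmetic easy to follow.
new_w = update_output_weights([[0.0], [1.0]], u=[1.0, 0.5],
                              targets=[1.0], outputs=[0.0], rate=0.5)
```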
4 Extension to Multidimensional Rules
Next, we extend the method to handle multi-dimensional rules. We first
acquire from the medical expert the elementary knowledge that will be composed into the antecedents or
consequents of the production rules, and assign it to the lattice. It is well known that the rules
for the diagnosis of this disease generally include some multi-dimensional rules. To find the relationships among
the propositions in antecedents and consequents, the values of the connection weights among the neurons on the
lattice, from the input layer and to the output layer, are observed. If these weights exceed a certain
threshold for some group of knowledge, then the propositions in that group should appear in the same antecedent and/or consequent.
We have examined the proposed method by applying it to the problem of the medical diagnosis
of hernia of the intervertebral disc. We acquired the elementary knowledge with respect to hernia of the intervertebral
disc shown in Table 1 from the medical expert. We assigned it to the lattice and trained the network
on the training data of the propositions' truth values shown in Tables 2 and 3 with the proposed
learning method. After the learning process, we obtained 14 simple production rules. By observing the values of the
connection weights among the corresponding neurons, from the input layer and to the output layer, we could
find multi-dimensional rules composed of simple rules. The generated rules for the diagnosis of hernia of the
intervertebral disc are shown in Table 3. Medical experts suggested that this rule base is almost valid.
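The thresholding step used to compose multi-dimensional rules can be sketched in Python as follows. This is a simplified reading, not the procedure used in the experiment: it assumes one weight per (proposition, consequent) pair and groups into one antecedent all propositions whose weight to the same consequent exceeds the threshold. The function name, the weight values, and the threshold are hypothetical.

```python
def compose_rules(weights, propositions, consequents, threshold):
    """Group propositions with strong weights to the same consequent.

    weights[j][i] -- weight linking proposition j to consequent i (assumed layout)
    Returns a dict mapping each consequent to the propositions in its antecedent.
    """
    rules = {}
    for i, consequent in enumerate(consequents):
        group = [propositions[j] for j in range(len(propositions))
                 if weights[j][i] > threshold]
        if group:
            rules[consequent] = group
    return rules


# Illustrative values only; they are not taken from the experiment.
rules = compose_rules(weights=[[0.9], [0.8], [0.1]],
                      propositions=["A1", "A2", "A9"],
                      consequents=["hernia"],
                      threshold=0.5)
```

In this toy case A1 and A2 end up in the same antecedent, which mirrors how two simple rules would be merged into one two-dimensional rule.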
5 Conclusion
We have examined how to construct a neural network from training data and how to use it to generate
fuzzy production rules for a knowledge base. We believe this model presents an effective approach to
practical rule-generation problems in fuzzy expert systems. Moreover, the resulting system runs much faster
than most popular systems, because many of those systems work successively, extracting one rule per
computing cycle [3]. In the proposed method, by contrast, all rules are extracted simultaneously in
one cycle of computation. We can therefore expect to obtain fuzzy rules efficiently from a group of
knowledge (propositions) prepared in advance for the knowledge base.
References
[1] D. E. Rumelhart et al.: Parallel Distributed Processing, vol. 1 & 2, MIT Press (1986)

[2] E. Tazaki, N. Inoue: “Automated Extraction of Fuzzy Rules using Neural Networks with Planar Lattice
Architecture”, EUFIT’93, Vol. 1, pp. 458-464 (1993)
[3] S. I. Gallant: “Connectionist Expert Systems”, Communications of the ACM, vol. 31, pp. 153-169 (1988)

Fig.1: Neural Network with Lattice Architecture for Extraction of Production Rules
Fig.2: Hidden Layer with Lattice Architecture

Table 1: Knowledge from Medical Expert with respect to Hernia of Intervertebral Disc
A1: radiation pain when coughing and/or straining oneself
A2: increase of pain when exercising
A3: relief of pain when resting
A4: numbness of the lower limbs
A5: pain radiated to the lower limbs
A6: increase of pain followed by some movements
A7: scoliosis due to pain
A8: pain when bowing down
A9: pain when getting up
A10: relief of pain when exercising

Table 2: Truth scales and their
very false
unknown

antecedent                                                           consequent
A1: radiation pain when coughing and/or straining oneself    true    very true
A2: increase of pain when exercising                         true
A1: radiation pain when coughing and/or straining oneself    true    very true
A3: relief of pain when resting                              true
A4: numbness of the lower limbs                              true    very true
A5: pain radiated to the lower limbs                         true    true
A1: radiation pain when coughing and/or straining oneself    true    true
A5: pain radiated to the lower limbs                         true
A5: pain radiated to the lower limbs                         true    true
A6: increase of pain followed by some movements              true
A7: scoliosis due to pain                                    true    rather true
A8: pain when bowing down                                    true    rather true
A1: radiation pain when coughing and/or straining oneself    true    unknown
A6: increase of pain followed by some movements              true    unknown
A9: pain when getting up                                     true    rather false
A10: relief of pain when exercising                          true    false
