
ANFIS

Adaptive Neuro-Fuzzy Inference Systems (ANFIS)


Takagi-Sugeno fuzzy system mapped onto a neural network structure.
Different representations are possible, but the one with 5 layers is the most common.
Network nodes in different layers have different structures.

ANFIS
Consider a first-order Sugeno fuzzy model with two inputs, x and y, and one output, z.
Rule set:
Rule 1: If x is A1 and y is B1, then f1 = p1 x + q1 y + r1
Rule 2: If x is A2 and y is B2, then f2 = p2 x + q2 y + r2

[Figure: two-rule Sugeno model — rule i fires with strength $w_i$ and has output $f_i = p_i x + q_i y + r_i$]
Weighted fuzzy mean:
$f = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2}$
ANFIS architecture
Corresponding equivalent ANFIS architecture:

ANFIS layers
Layer 1: every node is an adaptive node with node function:
$O_{1,i} = \mu_i(x_i)$
Parameters in this layer are called premise parameters.
Layer 2: every node is a fixed node whose output (representing a rule's firing strength) is the product of its inputs:
$O_{2,i} = w_i = \prod_j \mu_j(x_j)$

Layer 3: every node is fixed (normalization):

$O_{3,i} = \bar{w}_i = \frac{w_i}{\sum_j w_j}$
ANFIS layers
Layer 4: every node is adaptive (consequent parameters):
$O_{4,i} = O_{3,i}\, f_i = \bar{w}_i\,(p_0 + p_1 x_1 + \dots + p_n x_n)$

Layer 5: a single node that sums up its inputs:

$O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$

The adaptive network is functionally equivalent to a Sugeno fuzzy model!
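To make the layer-by-layer computation concrete, here is a minimal Python sketch (not the lecture's code) of the forward pass for the two-rule model above; the generalized bell MF shape and all parameter values are illustrative assumptions.

```python
import numpy as np

def gbellmf(x, a, b, c):
    """Generalized bell MF: 1 / (1 + |(x - c)/a|^(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise, consequent):
    # Layer 1: membership degrees (adaptive nodes, premise parameters)
    mu_A = [gbellmf(x, *p) for p in premise["A"]]   # A1, A2
    mu_B = [gbellmf(y, *p) for p in premise["B"]]   # B1, B2
    # Layer 2: firing strengths w_i = mu_Ai(x) * mu_Bi(y)  (product)
    w = np.array([mu_A[0] * mu_B[0], mu_A[1] * mu_B[1]])
    # Layer 3: normalization  w_bar_i = w_i / sum_j w_j
    w_bar = w / w.sum()
    # Layer 4: weighted rule outputs  w_bar_i * (p_i x + q_i y + r_i)
    f = np.array([p * x + q * y + r for (p, q, r) in consequent])
    # Layer 5: overall output  sum_i w_bar_i f_i
    return float(np.dot(w_bar, f))

premise = {"A": [(2.0, 2.0, -5.0), (2.0, 2.0, 5.0)],   # (a, b, c) per MF
           "B": [(2.0, 2.0, -5.0), (2.0, 2.0, 5.0)]}
consequent = [(1.0, 1.0, 0.0), (-1.0, 2.0, 1.0)]        # (p, q, r) per rule
print(anfis_forward(1.0, -2.0, premise, consequent))
```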


ANFIS with multiple rules

Hybrid learning for ANFIS
Consider the two-rule ANFIS with two inputs x and y and one output z.
Let the premise parameters be fixed.
The ANFIS output is then a linear combination of the consequent parameters p, q and r:

$z = \frac{w_1}{w_1 + w_2} f_1 + \frac{w_2}{w_1 + w_2} f_2 = \bar{w}_1 f_1 + \bar{w}_2 f_2$
$= \bar{w}_1 (p_1 x + q_1 y + r_1) + \bar{w}_2 (p_2 x + q_2 y + r_2)$
$= (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2$

i.e., z is linear in the consequent parameters: z = Aq.


Hybrid learning for ANFIS


Partition the total parameter set S as:
S1: set of premise (nonlinear) parameters
S2: set of consequent (linear) parameters
q: unknown vector whose elements are the parameters in S2
z = Aq: a standard linear least-squares problem
The best solution for q, minimizing ||Aq - z||^2, is the least-squares estimator q*:
$q^* = (A^T A)^{-1} A^T z$
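A minimal numpy sketch of this least-squares step, assuming the normalized firing strengths come from a fixed-premise forward pass; the data and firing strengths below are synthetic stand-ins.

```python
import numpy as np

# Toy data: N samples of (x, y), normalized firing strengths w_bar (N x 2)
# from a fixed-premise forward pass, and targets z (N,).
rng = np.random.default_rng(0)
N = 50
x, y = rng.uniform(-1, 1, N), rng.uniform(-1, 1, N)
w_bar = rng.dirichlet([1, 1], size=N)          # stand-in normalized weights
z = 2 * x - y + 1                              # stand-in targets

# Each row of A is [w1*x, w1*y, w1, w2*x, w2*y, w2] so that z = A q
A = np.column_stack([w_bar[:, 0] * x, w_bar[:, 0] * y, w_bar[:, 0],
                     w_bar[:, 1] * x, w_bar[:, 1] * y, w_bar[:, 1]])
# q* = (A^T A)^-1 A^T z, computed stably via lstsq
q_star, *_ = np.linalg.lstsq(A, z, rcond=None)
print(q_star)   # [p1, q1, r1, p2, q2, r2]
```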

Hybrid learning for ANFIS
What if the premise parameters are not optimal?
Combine steepest descent and the least-squares estimator to update the parameters of the adaptive network.
Each epoch is composed of:
1. Forward pass: node outputs go forward until Layer 4 and the consequent parameters are identified by the least-squares estimator;
2. Backward pass: error signals propagate backward and the premise parameters are updated by gradient descent.
A toy sketch of one such epoch follows.
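The sketch below runs hybrid epochs for a 1-input, 2-rule ANFIS with Gaussian premise MFs; all settings are assumptions, and a finite-difference gradient stands in for the analytic backward pass to keep the code short.

```python
import numpy as np

X = np.linspace(-3, 3, 61)        # training inputs
Z = np.sin(X)                     # training targets

def loss(prem):
    """Forward pass: with premise params fixed, identify the consequents
    by least squares and return the resulting RMSE."""
    c1, s1, c2, s2 = prem
    w = np.stack([np.exp(-0.5 * ((X - c1) / s1) ** 2),
                  np.exp(-0.5 * ((X - c2) / s2) ** 2)])
    nb = w / w.sum(axis=0)                       # normalized firing strengths
    A = np.column_stack([nb[0] * X, nb[0], nb[1] * X, nb[1]])
    q, *_ = np.linalg.lstsq(A, Z, rcond=None)    # consequents by LSE
    return np.sqrt(np.mean((A @ q - Z) ** 2))

prem = np.array([-1.5, 1.0, 1.5, 1.0])           # Gaussian [c1, s1, c2, s2]
h, lr = 1e-5, 0.5
for epoch in range(30):
    # Backward pass: gradient of the error w.r.t. the premise parameters
    grad = np.array([(loss(prem + h * e) - loss(prem - h * e)) / (2 * h)
                     for e in np.eye(4)])
    prem -= lr * grad
print(f"RMSE after 30 epochs: {loss(prem):.4f}")
```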


Hybrid learning for ANFIS


Error signals: derivative of the error measure with respect to each node output.

                        Forward pass              Backward pass
Premise parameters      Fixed                     Gradient descent
Consequent parameters   Least-squares estimator   Fixed
Signals                 Node outputs              Error signals

The hybrid approach converges much faster than pure backpropagation because it reduces the search space.

Stone-Weierstrass theorem
Let D be a compact space of N dimensions and let $\mathcal{F}$ be a set of continuous real-valued functions on D satisfying:
1. Identity function: the constant $f(x) = 1$ is in $\mathcal{F}$.
2. Separability: for any two points $x_1 \neq x_2$ in D, there is an $f$ in $\mathcal{F}$ such that $f(x_1) \neq f(x_2)$.
3. Algebraic closure: if $f$ and $g$ are two functions in $\mathcal{F}$, then $fg$ and $af + bg$ are also in $\mathcal{F}$ for any reals $a$ and $b$.
Then $\mathcal{F}$ is dense in C(D), the set of continuous real-valued functions on D, i.e.:
$\forall \varepsilon > 0,\ \forall g \in C(D),\ \exists f \in \mathcal{F}:\ |g(x) - f(x)| < \varepsilon,\ \forall x \in D.$


Universal approximator ANFIS


According to the Stone-Weierstrass theorem, an ANFIS has unlimited approximation power: it can match any continuous nonlinear function arbitrarily well.
Identity: obtained by having a constant consequent.
Separability: obtained by selecting different parameters in the network.

Algebraic closure
Consider two systems with two rules each and final outputs (primes denote the second system):

$z = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2}$ and $z' = \frac{w'_1 f'_1 + w'_2 f'_2}{w'_1 + w'_2}$

Additive:
$az + bz' = a\,\frac{w_1 f_1 + w_2 f_2}{w_1 + w_2} + b\,\frac{w'_1 f'_1 + w'_2 f'_2}{w'_1 + w'_2}$
$= \frac{w_1 w'_1 (a f_1 + b f'_1) + w_1 w'_2 (a f_1 + b f'_2) + w_2 w'_1 (a f_2 + b f'_1) + w_2 w'_2 (a f_2 + b f'_2)}{w_1 w'_1 + w_1 w'_2 + w_2 w'_1 + w_2 w'_2}$

So we can construct a 4-rule inference system that computes $az + bz'$.

Algebraic closure
Multiplicative:
$z\,z' = \frac{w_1 f_1 + w_2 f_2}{w_1 + w_2} \cdot \frac{w'_1 f'_1 + w'_2 f'_2}{w'_1 + w'_2}$
$= \frac{w_1 w'_1 f_1 f'_1 + w_1 w'_2 f_1 f'_2 + w_2 w'_1 f_2 f'_1 + w_2 w'_2 f_2 f'_2}{w_1 w'_1 + w_1 w'_2 + w_2 w'_1 + w_2 w'_2}$

So we can construct a 4-rule inference system that computes $z\,z'$.
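Both closure identities can be checked numerically. The sketch below builds the 4-rule weights $w_i w'_j$ and the corresponding consequents from two arbitrary two-rule systems; all values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
w  = rng.uniform(0.1, 1, 2); f  = rng.uniform(-1, 1, 2)   # system 1
wp = rng.uniform(0.1, 1, 2); fp = rng.uniform(-1, 1, 2)   # system 2
a, b = 1.7, -0.4

z  = (w  @ f ) / w.sum()      # output of system 1
zp = (wp @ fp) / wp.sum()     # output of system 2

W  = np.outer(w, wp)                      # 4-rule weights w_i * w'_j
Fa = a * f[:, None] + b * fp[None, :]     # additive consequents a f_i + b f'_j
Fm = f[:, None] * fp[None, :]             # multiplicative consequents f_i f'_j
assert np.isclose((W * Fa).sum() / W.sum(), a * z + b * zp)
assert np.isclose((W * Fm).sum() / W.sum(), z * zp)
print("closure identities verified")
```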

Model building guidelines
Select the number of fuzzy sets per variable:
empirically, by examining the data or by trial and error
using clustering techniques
using regression trees (CART)
Initially, distribute bell-shaped membership functions evenly (see the sketch after this list).
Using an adaptive step size can speed up training.
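A minimal sketch of this even initialization for generalized bell MFs; the roughly 50% overlap and the slope parameter b = 2 are assumptions, not values from the lecture.

```python
import numpy as np

def init_gbell(lo, hi, n):
    """Evenly distribute n generalized bell MFs (a, b, c) over [lo, hi]."""
    centers = np.linspace(lo, hi, n)
    a = (hi - lo) / (2 * (n - 1))          # half the spacing: ~50% overlap
    return [(a, 2.0, c) for c in centers]  # assumed slope parameter b = 2

for a, b, c in init_gbell(-10, 10, 4):
    print(f"gbellmf(a={a:.2f}, b={b}, c={c:+.1f})")
```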


How to design ANFIS?


Initialization
Define number and type of inputs
Define number and type of outputs
Define number of rules and type of consequents
Define objective function and stop conditions
Collect data
Normalize inputs
Determine initial rules
Initialize network
TRAIN

Ex. 1: Two-input sinc function
$z = \mathrm{sinc}(x, y) = \frac{\sin(x)\,\sin(y)}{x\,y}$
Input range: [-10, 10] x [-10, 10]; 121 training data pairs (see the data-generation sketch below).
Multi-Layer Perceptron vs. ANFIS:
MLP: 18 neurons in the hidden layer, 73 parameters, quick propagation (the best learning algorithm for backpropagation MLPs).
ANFIS: 16 rules, 4 membership functions per variable, 72 fitting parameters (48 linear, 24 nonlinear), hybrid learning rule.
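A sketch generating the 121 training pairs, assuming they form an 11 x 11 grid on [-10, 10]^2 (the grid layout is an assumption consistent with 121 = 11^2).

```python
import numpy as np

g = np.linspace(-10, 10, 11)
X, Y = np.meshgrid(g, g)
# np.sinc(t) = sin(pi t)/(pi t), so np.sinc(t/pi) = sin(t)/t with value 1 at 0
Z = np.sinc(X / np.pi) * np.sinc(Y / np.pi)
data = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])
print(data.shape)   # (121, 3): inputs x, y and target z
```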


MLP vs. ANFIS results


Average of 10 runs:
MLP: different sets of initial random weights;
ANFIS: 10 step sizes between 0.01 and 0.10.

MLP's approximation power decreases because the learning process can get trapped in local minima or some neurons can be pushed into saturation during training.
ANFIS output
[Figure: training data and ANFIS output surfaces over [-10, 10]^2; root mean squared error curve and step size curve over 250 epochs]

ANFIS model
[Figure: initial and final MFs on X and Y — degree of membership over [-10, 10] for input1 and input2]
Ex. 2: 3-input nonlinear function
$\text{output} = \left(1 + x^{0.5} + y^{-1} + z^{-1.5}\right)^2$
Two membership functions per variable, 8 rules.
Input ranges: [1, 6] x [1, 6] x [1, 6].
216 training data, 125 validation data.
[Figure: training and checking RMSE curves and step size curve over 100 epochs]

ANFIS model
[Figure: initial MFs on X, Y and Z and final MFs on X, Y and Z — degree of membership over [1, 6]]
Results comparison
$\text{APE (Average Percentage Error)} = \frac{1}{P} \sum_{i=1}^{P} \frac{|T(i) - O(i)|}{|T(i)|} \times 100\%$
where P is the number of data pairs, T(i) the desired output and O(i) the model output.
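A direct Python transcription of the APE formula:

```python
import numpy as np

def ape(T, O):
    """Average percentage error in %: mean of |T - O| / |T| * 100."""
    T, O = np.asarray(T, float), np.asarray(O, float)
    return float(np.mean(np.abs(T - O) / np.abs(T)) * 100.0)

print(ape([2.0, 4.0, 5.0], [2.1, 3.9, 5.2]))   # -> ~3.83
```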

Model               Training error   Checking error   # Param.   Training data size   Checking data size
ANFIS               0.043%           1.066%           50         216                  125
GMDH model [1]      4.7%             5.7%             -          20                   20
Fuzzy model 1 [2]   1.5%             2.1%             22         20                   20
Fuzzy model 2 [2]   0.59%            3.4%             32         20                   20

[1] T. Kondo. Revised GMDH algorithm estimating degree of the complete polynomial. Transactions of the Society of Instrument and Control Engineers, 22(9):928-934, 1986.
[2] M. Sugeno and G. T. Kang. Structure identification of fuzzy model. Fuzzy Sets and Systems, 28:15-33, 1988.


Ex. 3: Modeling dynamic system


Plant equation:
$y(k+1) = 0.3\,y(k) + 0.6\,y(k-1) + f(u(k))$
f(.) has the following form:
$f(u) = 0.6\sin(\pi u) + 0.3\sin(3\pi u) + 0.1\sin(5\pi u)$
Estimate the nonlinear function f with an ANFIS model F:
$\hat{y}(k+1) = 0.3\,\hat{y}(k) + 0.6\,\hat{y}(k-1) + F(u(k))$
Plant input: $u(k) = \sin(2\pi k / 250)$
ANFIS parameters are updated at each step (on-line).
Learning rate $\eta = 0.1$; forgetting factor $\lambda = 0.99$.
ANFIS can adapt even after the input changes.
Question: was the input signal rich enough? (A plant-simulation sketch follows.)
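A minimal sketch that simulates the plant to produce the (u(k), y(k)) pairs used for on-line identification; zero initial conditions are assumed.

```python
import numpy as np

def f(u):
    return (0.6 * np.sin(np.pi * u) + 0.3 * np.sin(3 * np.pi * u)
            + 0.1 * np.sin(5 * np.pi * u))

K = 500
u = np.sin(2 * np.pi * np.arange(K) / 250)   # plant input
y = np.zeros(K + 1)                           # assumed zero initial conditions
for k in range(1, K):
    # y(k+1) = 0.3 y(k) + 0.6 y(k-1) + f(u(k))
    y[k + 1] = 0.3 * y[k] + 0.6 * y[k - 1] + f(u[k])
print(u.shape, y.shape)
```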



Plant and model outputs


Effect of number of MFs


[Figure: 5 membership functions — initial and final MFs on u over [-1, 1]; f(u) vs. ANFIS output; each rule's output]
Effect of number of MFs
[Figure: 4 membership functions — initial and final MFs on u over [-1, 1]; f(u) vs. ANFIS output; each rule's output]

Effect of number of MFs


[Figure: 3 membership functions — initial and final MFs on u over [-1, 1]; f(u) vs. ANFIS output; each rule's output]
Ex. 4: Chaotic time series
Consider a chaotic time series generated by the Mackey-Glass delay differential equation:
$\dot{x}(t) = \frac{0.2\,x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\,x(t)$
(the standard benchmark setting uses delay $\tau = 17$)
Task: predict the system output at some future instant t + P using past outputs.
500 training data, 500 validation data.
ANFIS input: $[x(t-18),\ x(t-12),\ x(t-6),\ x(t)]$
ANFIS output: $x(t+6)$
Two MFs per variable, 16 rules.
104 parameters (24 premise, 80 consequent).
Data generated from t = 118 to t = 1117 (an integration sketch follows).
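A sketch that integrates the Mackey-Glass equation with the Euler method to generate the series; the step dt = 0.1, the initial condition x(0) = 1.2 and the zero history before t = 0 are assumed (common benchmark settings).

```python
import numpy as np

tau, dt, T = 17.0, 0.1, 1200.0
n, lag = int(T / dt), int(tau / dt)
x = np.zeros(n)
x[0] = 1.2                                       # assumed initial condition
for i in range(n - 1):
    x_del = x[i - lag] if i >= lag else 0.0      # x(t - tau), zero history
    x[i + 1] = x[i] + dt * (0.2 * x_del / (1 + x_del ** 10) - 0.1 * x[i])

t = 118
sample = [x[int((t - d) / dt)] for d in (18, 12, 6, 0)]   # ANFIS input at t
target = x[int((t + 6) / dt)]                             # ANFIS output x(t+6)
print(sample, target)
```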


ANFIS model
[Figure: final MFs on Input 1, x(t-18); Input 2, x(t-12); Input 3, x(t-6); and Input 4, x(t) — degree of membership over [0.6, 1.2]]
