5 PHASE III: Maintaining the Queue Stability using Adaptive Resonance Theory 2
5.1 Introduction
Adaptive Resonance Theory (ART) was introduced by Stephen Grossberg in 1976. ART self-organizes stable pattern recognition codes in real time in response to arbitrary sequences of input patterns. Because it is self-stabilizing, ART does not suffer from the local minimum problem. It is an unsupervised paradigm based on competitive learning, capable of finding categories and automatically creating new categories when needed; it can be viewed as a type of incremental clustering paradigm. ART was developed to address the stability problem of the standard feed-forward network: during training, old weights that were captured earlier may be lost when new weights are learned. The problem of preserving the old weights while still acquiring new ones is called the stability-plasticity dilemma, and it is the central concept of this neural network. ART principles have been used to explain parametric behavioural and brain data in areas such as object recognition, visual perception, variable-rate speech and auditory source identification, word recognition and adaptive sensory-motor control [Lev00, Pag00]. It is used to model hypothesized processes such as learning, attention, search, recognition and prediction.
Some of the properties of ART, as listed in [W7], are:
1. Normalization of the total network activity. Biological systems are adaptive to changes in their environment; for example, the human eye adapts itself to changes in the ambient light.
2. Contrast enhancement of the input pattern. A slight difference in the input patterns can matter greatly; for example, telling a hiding lion apart from a resting one can make the difference for survival.
3. Short-Term Memory (STM) storage of the contrast-enhanced pattern. The input patterns are first stored in STM before being decoded; the Long-Term Memory (LTM) is used for classification.
An ART system consists of two layers, F1 and F2, connected through the LTM. The input is presented to the F1 layer and classification is performed at the F2 layer. The first step is characterization by feature extraction, which produces an activation pattern in the feature representation field, after which classification is performed in the category representation field. The expectation stored in the LTM connections is transmitted from F2 back to F1 as the top-down pattern for the current category. The classification is evaluated against this expectation residing in the LTM: if the bottom-up pattern matches the top-down expectation, the expectation is strengthened; otherwise the classification is rejected.
[Figure: the input feeds the feature representation field F1 (STM activity pattern), which is connected through LTM weights in both directions to the category representation field F2 (STM activity pattern).]

Figure 5.1 Architecture of ART
An ART network has two interconnected layers, F1 and F2, as shown in figure 5.1. F1 is the feature (characteristics) representation layer and F2 is the category representation layer. The number of nodes in the F1 layer equals the dimension of the input vector. Each node in the F2 layer represents a category, defined by its top-down and bottom-up weights; these weights are known as the Long Term Memory (LTM) of the network. When new categories of patterns arrive, nodes are dynamically created in F2. The vigilance parameter controls the coarseness of the categories. In the F1 layer, normalization and feature extraction are the operations performed on the input pattern. The resulting F1 pattern code is sent to the F2 layer through the bottom-up weights. The F2 node whose bottom-up weights best match the F1 pattern code is declared the winner, and its top-down weights are sent back to the F1 layer. When the resonance between F1 and F2 stabilizes, the reset assembly compares the F1 pattern code with the input vector. If the resemblance is too low, the winning F2 node is inhibited and a new F2 node, representing a new category, is created. Weight adaptation proceeds according to a differential equation and is applied only to the winning node. The winning node in F2 is the node with the largest inner product between its weights and the F1 pattern code. Reset is decided using an angle measure between the input vector and the F1 pattern code.
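This match-and-reset cycle can be summarized in a few lines of code. The sketch below is a simplified, hypothetical illustration rather than the exact ART2 dynamics used later in this chapter: the winner is chosen by the largest inner product, the match is measured as the cosine of the angle between the input and the winner's weight vector, and a new category node is committed when no existing node passes the vigilance test.

import numpy as np

def art_categorize(x, weights, vigilance=0.9, lr=0.5):
    """One simplified ART presentation: pick a winner by inner product,
    accept it if the angle-based match exceeds the vigilance, otherwise
    inhibit it and search on; commit a new node if every node fails."""
    x = x / (np.linalg.norm(x) + 1e-9)          # normalized input (F1 pattern code)
    inhibited = set()
    while len(inhibited) < len(weights):
        # winner = committed F2 node with the largest inner product
        scores = [(-np.inf if j in inhibited else w @ x) for j, w in enumerate(weights)]
        J = int(np.argmax(scores))
        w = weights[J]
        match = (w @ x) / (np.linalg.norm(w) + 1e-9)  # cosine of the angle
        if match >= vigilance:                        # resonance: adapt the winner
            weights[J] = (1 - lr) * w + lr * x
            return J, weights
        inhibited.add(J)                              # reset: inhibit and keep searching
    weights.append(x.copy())                          # commit a new category node
    return len(weights) - 1, weights

Presenting a sequence of feature vectors one by one in this way grows the list of category nodes only when an input is sufficiently novel.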
ART is a family of neural architectures, of which ART1 is the first network; it learns binary patterns. ART2 [CG87] can learn arbitrary sequences of analog input patterns. Other ART models include ART3 and ARTMAP [CG03]. The ART2 NN is an unsupervised classifier that accepts input vectors and classifies each new vector into the stored pattern that it most resembles. If the input vector is an entirely different pattern, a new cluster corresponding to the input vector is created. This behaviour is controlled by a vigilance threshold ρ [CG03]. Thus the ART2 NN allows the user to control the degree of similarity of patterns placed in the same cluster.
Some advantages of ART2 are:
1. Prompt learning and adaptation in a non-stationary environment,
2. Stability and plasticity,
3. Unsupervised learning of preference behaviour that is not known to the target initially, and
4. Determining the number of clusters exactly and automatically.
5.2 ART2 Network Architecture
Figure 5.2, as in [RP04], illustrates the architecture of the ART2 neural network, including the F1 and F2 layers. The F1 layer contains six sub-layers (w, x, u, v, p, q). The input signal is processed by the F1 layer and then passed through the bottom-up weights b_ij; the resulting signal is the input to the F2 layer. The nodes of the F2 layer compete with each other to produce a winning unit. The winning unit returns a signal to the F1 layer. A match value is then calculated in the F1 layer using the top-down weights t_ji and compared with the vigilance value. The weights b_ij and t_ji are updated if the match value is greater than the vigilance value. Otherwise, a reset signal is sent to the F2 layer and the winning unit is inhibited. After the inhibition, a different winning unit is sought in the F2 layer. If all of the F2 layer nodes are inhibited, the F2 layer produces a new node and generates the corresponding initial weights for it.
Learning Model
The basic ART system is an unsupervised learning model. It usually consists of a comparison field, a recognition field composed of neurons, a vigilance parameter and a reset module. The vigilance parameter has a strong effect on the system: a higher vigilance value results in highly detailed memories (fine-grained categories), while a lower vigilance value produces more general memories (coarser categories). The comparison field receives an input vector (a one-dimensional array of values) and transfers it to its best match in the recognition field, i.e. the single neuron whose weight vector most closely matches the input vector. Each recognition field neuron sends a negative signal (proportional to that neuron's quality of match to the input vector) to all other recognition field neurons, thereby reducing their output. The recognition field thus exhibits lateral inhibition, which allows each neuron to represent a category into which input vectors are classified. The reset module then compares the strength of the recognition match with the vigilance parameter. Training commences only when the vigilance threshold is met. If the match does not satisfy the vigilance parameter, the firing recognition neuron is inhibited until a new input vector is applied; training starts only when the search procedure completes. In the search procedure, recognition neurons are disabled one by one by the reset function until some recognition match satisfies the vigilance parameter. If no committed recognition neuron satisfies the vigilance threshold, an uncommitted neuron is committed and adjusted to match the input vector.
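The lateral inhibition in the recognition field can be illustrated with a small, hypothetical winner-take-all iteration (not part of the ART2 equations used later): each neuron repeatedly subtracts a fraction of its competitors' activity from its own, so only the best-matching neuron keeps a non-zero output.

import numpy as np

def winner_take_all(match_scores, inhibition=0.2, steps=50):
    """Iterative lateral inhibition: every neuron is suppressed by the
    summed activity of its competitors until one neuron remains active."""
    a = np.array(match_scores, dtype=float)
    for _ in range(steps):
        total = a.sum()
        a = np.maximum(0.0, a - inhibition * (total - a))  # subtract rivals' activity
        if np.count_nonzero(a) <= 1:
            break
    return int(np.argmax(a))  # index of the surviving (winning) neuron

# e.g. winner_take_all([0.4, 0.9, 0.3]) returns 1, the best-matching neuron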
Figure 5.2 ART2 Architecture
Training
There are basically two methods of training ART-based neural networks: slow learning and fast learning. In slow learning, the degree to which the recognition neuron's weights are trained towards the input vector is computed with continuous values using differential equations, and it depends on the length of time the input vector is presented. In fast learning, algebraic equations are used to calculate the degree of weight adjustment, and binary values are used. Fast learning is effective and efficient for many tasks, while the slow learning method is more biologically plausible and can be used with continuous-time networks (i.e. when the input vector varies continuously).
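The distinction can be made concrete with a small sketch (illustrative only; the learning rate and presentation time below are hypothetical): fast learning jumps the winner's weight vector straight to its algebraic equilibrium, while slow learning integrates a differential equation for as long as the input is presented.

import numpy as np

def fast_learn(w, x):
    """Fast learning: jump directly to the algebraic equilibrium (here, the input);
    the old weight w is not needed because the update is instantaneous."""
    return x.copy()

def slow_learn(w, x, rate=0.1, presentation_time=1.0, dt=0.01):
    """Slow learning: Euler-integrate dw/dt = rate * (x - w) while the input is shown."""
    steps = int(presentation_time / dt)
    for _ in range(steps):
        w = w + dt * rate * (x - w)   # weight drifts toward the input
    return w

x = np.array([1.0, 0.0])
w0 = np.array([0.0, 1.0])
print(fast_learn(w0, x))   # [1. 0.]  -- full move in one presentation
print(slow_learn(w0, x))   # partial move; depends on how long the input is presented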
5.3 Learning Algorithm
The F1 layer activations of the ART2 network are computed as follows [Lau96]:

u_i = v_i / (e + ||v||)
w_i = s_i + a·u_i
p_i = u_i + d·t_Ji
x_i = w_i / (e + ||w||)
q_i = p_i / (e + ||p||)
v_i = f(x_i) + b·f(q_i)

where e is a small parameter used to prevent division by zero, ||·|| denotes the vector length, a and b are fixed weights in the F1 layer, d is the activation of the winning F2 unit, θ is the noise suppression parameter and α is the learning rate.

The activation function is:

f(x) = 0,  if x < θ
f(x) = x,  if x ≥ θ
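A direct transcription of these F1 equations as a Python sketch is shown below; the variable names follow the symbols above, and the parameter values are placeholders rather than the settings used in the experiments.

import numpy as np

def f(x, theta):
    """Piecewise-linear noise suppression: pass values >= theta, zero out the rest."""
    return np.where(x >= theta, x, 0.0)

def f1_update(s, u, t_J, a=10.0, b=10.0, d=0.9, theta=0.2, e=1e-7):
    """One pass through the F1 sub-layers for input s and the winner's top-down weights t_J."""
    w = s + a * u                          # w sub-layer
    p = u + d * t_J                        # p sub-layer (expectation from winner J)
    x = w / (e + np.linalg.norm(w))        # normalized w
    q = p / (e + np.linalg.norm(p))        # normalized p
    v = f(x, theta) + b * f(q, theta)      # combine, suppressing noise
    u = v / (e + np.linalg.norm(v))        # normalized v feeds the next pass
    return w, x, u, v, p, q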
The input signal to each F2 node is:

T_j = Σ_i b_ij · p_i

and the weights of the winning unit J are updated as:

t_Ji = α·d·u_i + {1 + α·d·(d − 1)}·t_Ji
b_iJ = α·d·u_i + {1 + α·d·(d − 1)}·b_iJ
The ART2 neural network algorithm used in this work is summarized below [Lau96].

Neural network configuration:

Some ART2 neural network parameters must be chosen initially. The chosen parameters are as follows:

a = 10, b = 10, d = 0.9, c = 0.1, e = 0.0
Noise suppression threshold: 0 < θ < 1
Vigilance parameter: 0 < ρ < 1
Error tolerance parameter ETP in the F1 layer: 0 < ETP < 1

Weight initialization of the neural network:

Top-down: t_ji(0) = 0
Bottom-up: b_ij(0) ≤ 1 / ((1 − d)·√M), where M is the number of F1 input units
Operation steps:

1. Initialize the sub-layer and layer outputs to zero and set the cycle counter to one.

2. Apply an input vector I to sub-layer w of the F1 layer. The output of this sub-layer is:

   w_i = I_i + a·u_i        (5.1)

3. Propagate to the x sub-layer:

   x_i = w_i / (e + ||w||)        (5.2)

4. Calculate the v sub-layer output:

   v_i = f(x_i) + b·f(q_i)        (5.3)

   In the first cycle the second term of (5.3) is zero, since the value of q_i is zero. The function f(x) is given by:

   f(x) = 2θx² / (x² + θ²),  if 0 ≤ x < θ
   f(x) = x,                 if x ≥ θ        (5.4)

5. Compute the u sub-layer output:

   u_i = v_i / (e + ||v||)        (5.5)

6. Propagate the previous output to the p sub-layer:

   p_i = u_i + d·t_Ji        (5.6)

   where J is the winning node of the F2 layer. If F2 is inactive, or if the network is in its initial configuration, then p_i = u_i.

7. Calculate the q sub-layer output:

   q_i = p_i / (e + ||p||)        (5.7)

8. Repeat steps (2) to (7) until the values in the F1 layer stabilize according to equation (5.8):

   Error(i) = u_i − u_i*        (5.8)

   where u_i* is the value of u_i from the previous cycle. If Error(i) < ETP, the F1 layer is stable.

9. Calculate the r sub-layer output:

   r_i = (u_i + c·p_i) / (e + ||u|| + c·||p||)        (5.9)
10. Determine whether the reset condition is reached. If ρ > (e + ||r||), then send a reset signal to F2, mark any active F2 node as not eligible for competition, reset the cycle counter to zero and return to step (2). If a reset signal is not sent and the counter is one, then increase the cycle counter and go on to step (11). If a reset signal is not sent and the cycle counter is larger than one, resonance has been reached, so skip to step (14).

11. Calculate the F2 layer input:

   T_j = Σ_i p_i·b_ij        (5.10)

12. Only the F2 winner node has a non-zero output. Any node marked as ineligible by a previous reset signal does not participate in the competition:

   g(T_j) = d,  if T_j = max_k(T_k)
   g(T_j) = 0,  otherwise        (5.11)

13. Repeat steps (6) to (10).

14. Update the bottom-up weights of the F2 winner node:

   b_iJ = u_i / (1 − d)        (5.12)

15. Update the top-down weights of the F2 winner node:

   t_Ji = u_i / (1 − d)        (5.13)

16. Remove the input vector, restore the inactive F2 nodes and return to step (1) with a new input vector.
Figure 5.3 Pseudo-Code of ART2
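The steps of Figure 5.3 can be collected into a compact implementation. The following Python sketch follows equations (5.1)-(5.13) with the fast-learning updates; it is an illustrative reconstruction, and the class name, default parameter values and helper methods are the author's summary rather than a verified reference implementation.

import numpy as np

class ART2:
    def __init__(self, dim, a=10.0, b=10.0, c=0.1, d=0.9, e=0.0,
                 theta=0.2, rho=0.95, etp=1e-4):
        self.dim = dim
        self.a, self.b, self.c, self.d, self.e = a, b, c, d, e
        self.theta, self.rho, self.etp = theta, rho, etp
        self.bottom_up = []   # b_iJ weight vectors, one per committed F2 node
        self.top_down = []    # t_Ji weight vectors, one per committed F2 node

    def _f(self, x):
        # Eq. (5.4): smooth noise suppression below theta, identity above it
        return np.where(x >= self.theta, x,
                        2 * self.theta * x**2 / (x**2 + self.theta**2))

    def _norm(self, x):
        return x / (self.e + np.linalg.norm(x) + 1e-12)

    def _stabilize_f1(self, I, t_J):
        # Steps 2-8: iterate the F1 sub-layers until u stops changing
        u = np.zeros(self.dim)
        q = np.zeros(self.dim)
        while True:
            w = I + self.a * u                      # (5.1)
            x = self._norm(w)                       # (5.2)
            v = self._f(x) + self.b * self._f(q)    # (5.3)
            u_new = self._norm(v)                   # (5.5)
            p = u_new + self.d * t_J                # (5.6)
            q = self._norm(p)                       # (5.7)
            if np.max(np.abs(u_new - u)) < self.etp:   # (5.8)
                return u_new, p
            u = u_new

    def present(self, I):
        I = np.asarray(I, dtype=float)
        disabled = set()
        while True:
            u, p = self._stabilize_f1(I, np.zeros(self.dim))
            scores = [(-np.inf if j in disabled else p @ bu)              # (5.10)
                      for j, bu in enumerate(self.bottom_up)]
            if scores and np.max(scores) > -np.inf:
                J = int(np.argmax(scores))                                # (5.11)
                u, p = self._stabilize_f1(I, self.top_down[J])            # read out expectation
                r = (u + self.c * p) / (self.e + np.linalg.norm(u)
                                        + self.c * np.linalg.norm(p) + 1e-12)  # (5.9)
                if self.rho > self.e + np.linalg.norm(r):                 # reset test (step 10)
                    disabled.add(J)
                    continue
            else:
                # all committed nodes disabled (or none yet): commit a new node
                self.bottom_up.append(np.zeros(self.dim))
                self.top_down.append(np.zeros(self.dim))
                J = len(self.bottom_up) - 1
                u, p = self._stabilize_f1(I, self.top_down[J])
            # fast-learning resonance updates (5.12)-(5.13)
            self.bottom_up[J] = u / (1 - self.d)
            self.top_down[J] = u / (1 - self.d)
            return J

For example, ART2(dim=4).present([0.2, 0.9, 0.1, 0.0]) commits a first category node and returns index 0; subsequent similar vectors are assigned to the same node, while sufficiently different ones create new nodes.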
5.4 Queue Stability using ART2
ART2 [CG87] self-organizes stable recognition categories in response to arbitrary sequences of analog input patterns as well as binary input patterns. The capability of recognizing analog patterns represents a significant enhancement to the system. ART2 also recognizes the underlying similarity of identical patterns superimposed on constant backgrounds of different levels. ART2 includes:
• Allowance for noise suppression.
• Normalization, i.e. contrast enhancement of the significant parts of the vector.
• Comparison of top-down and bottom-up signals for the reset mechanism.
• The ability to deal with real-valued data that may be arbitrarily close to one another.
ART2 RED takes input data from the various environments, categorizes the input internally and recognizes similar data in the future. Its plasticity allows the system to learn new concepts, while its stability preserves the previously learned information without destroying it. ART2's self-regulating control structure allows autonomous recognition and learning. Its speed, stability, feature amplification and noise reduction, together with this balance of plasticity and stability, make it a well-suited training process for the proposed methodology.

Because of this stability, information learned by ART2 is not washed out. The training time is reduced compared to the other mechanisms such as KSOM, MKSOM and LVQ, and the proposed method also reduces noise and the drop ratio compared to KSOM, MKSOM and LVQ.
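One plausible way the ART2 classifier can drive a RED-style queue is sketched below, reusing the ART2 class given after Figure 5.3. This is only an assumed illustration of the coupling: the choice of features (queue length, average queue length, arrival rate), the normalization, and the per-category drop probabilities are hypothetical and not taken from the thesis experiments.

import random

art = ART2(dim=3)            # assumed features, each scaled to [0, 1]
drop_prob_of_category = {}   # mapping: ART2 category index -> drop probability

def on_packet_arrival(queue_len, avg_queue_len, arrival_rate, max_queue=100.0):
    """Classify the current congestion state and decide whether to drop the packet."""
    features = [queue_len / max_queue, avg_queue_len / max_queue,
                min(arrival_rate, 1.0)]
    category = art.present(features)                        # ART2 assigns a congestion category
    p_drop = drop_prob_of_category.setdefault(category,
                                              features[1])  # placeholder: grows with avg queue
    return random.random() < p_drop                         # True -> drop (or mark) the packet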
5.5 Experimental results
The experimental results show the stability of the queue in both traffic scenarios (explained in chapter III): one with a standard RTT and one with a varying RTT. The testbed uses 10000 packets. The results obtained with the ART2 training method are much better than those of the existing KRED. The queue is stabilized in both scenarios, as shown in figure 5.4, and fewer training steps are needed compared to the previous methods.
Figure 5.4 Queue stability in a) KRED and b) ART2 RED
The trained network can be used for any type of traffic. The time taken to train the network is much less than that of KRED. Table 5.1 presents the results obtained for both KRED and ART2 RED, and figure 5.5 shows that ART2 RED trains in considerably less time than KRED.
Table 5.1 Training Time

Methods      Time (ms)
KRED         301.4626
ART2 RED     110.5288

Figure 5.5 Training Time of KRED and ART2 RED
It can be seen from table 5.2 that, compared to KRED, ART2 RED produces a lower queue delay under heavy traffic. The average queue delay is shown in figure 5.6.
Table 5.2 Average Queue Delay

Methods      Queue Delay (ms)
KRED         0.283853
ART2 RED     0.199942

Figure 5.6 Average Queue Delay of KRED and ART2 RED
From table 5.3 and figure 5.7 it is clear that ART2 RED gives a better result than the KRED competitive learning neural network method: ART2 RED exhibits a lower jitter (variance of queue delay) than KRED.
Table 5.3 Variance of Queue Delay
Queue
VARIANCE QUEUE DELAY
Methods
0.004 Delay (ms)
Tlim e(m s)
0.002
0 I ■ I KRED KRED 0.0031342
KRED ART2
lART2 RED
RED
Methods ART2RED 0.0019157
Figure 5.7 Variance of Queue Delay
between KRED and ART2 RED
The number of bytes sent and received in the RTT scenarios with 50 flows is given in table 5.4 and figure 5.8 respectively.
Table 5.4 Number of Bytes Sent vs Bytes Received

Methods      Bytes Sent    Bytes Received
KRED         565365        545740
ART2 RED     576472        558710

Figure 5.8 Bytes Sent vs Bytes Received in KRED and ART2 RED
The packet delivery ratio observed using both KRED and ART2 RED is given below in table 5.5 and figure 5.9.
Table 5.5 Packet Delivery Ratio

Methods      PDR
KRED         0.9731
ART2 RED     0.9691

Figure 5.9 Packet Delivery Ratio between KRED and ART2 RED
5.6 Summary
Adaptive Resonance Theory 2 uses short-term memory to hold the contrast-enhanced pattern and long-term memory for the classification, so that both stability and plasticity are maintained in this type of neural network. Compared to the existing method, the performance is better with regard to average queue delay, jitter and stability maintenance. The training time of ART2 is much lower than that of the KRED mechanism, and the number of bytes sent and received is higher with ART2 RED.
Communicated in “ARPN Journal of Engineering and applied Science”.