
Proceedings of the 2000 IEEE International Workshop on Robot and Human Interactive Communication
Osaka, Japan - September 27-29, 2000

Growing RBF Structures Using Self-organizing Maps


Qingyu XIONG, Kotaro HIRASAWA, Jinglu HU and Junichi MURATA
Dept. of Electrical and Electronic Systems Engineering, Kyushu Univ.
6-10-1 Hakozaki, Higashi-Ward, Fukuoka-City, Japan
(E-mail: xiong@terra.ees.kyushu-u.ac.jp)
Abstract

We present a novel growing RBF network structure using SOM in this paper. It consists of SOM and RBF networks, respectively. The SOM performs unsupervised learning, and the weight vectors belonging to its output nodes are transmitted to the hidden nodes in the RBF networks as the centers of the RBF activation functions; as a result, a one-to-one correspondence relationship is realized between the output nodes in SOM and the hidden nodes in the RBF networks. The RBF networks perform supervised training using the delta rule. Therefore, the current output errors in the RBF networks can be used to determine where to insert a new SOM unit according to a rule. This also makes it possible to make the RBF networks grow until a performance criterion is fulfilled or until a desired network size is obtained. Simulations on the two-spirals benchmark are shown to prove that the proposed networks have good performance.

1 Introduction

The main difficulty faced in the field of feedforward neural networks (FNNs) is model selection. Model selection involves matching the complexity of the function to be approximated with the complexity of the model. FNN model complexity is determined by factors such as the number of weights, the values of the weights, and the weight connection topology. If the model does not have enough complexity to approximate the desired function, underfitting and poor generalization occur. If the model is too complex, then it may overfit the data and also give poor generalization.

In one group of model selection techniques, training begins with an oversize network which is then simplified. Pruning is one such technique [1]. The non-convergent technique of early stopping [2] also uses an oversize network, and works by stopping the training at a point where the performance on a validation set begins to worsen. Since the validation set is representative of the underlying function, the performance on this data set will worsen when a model is formed that is more complex than the underlying function.
But the main disadvantage is the difficulty of specifying a priori what size of network should actually be considered an oversize network. If the initial network selected is too small, it will be unable to have a good solution and hence underfit the data. On the other hand, selecting an initial network that is much larger than required makes training computationally expensive.
Another group of model selection techniques uses constructive methods [3]. These methods start with a minimal-size network and sequentially add units according to some criterion until an appropriate level is reached. Constructive algorithms spend the majority of their time training networks smaller than the final network, compared to algorithms that start training with oversize networks. There are a number of inherent advantages of constructive algorithms over algorithms using oversize networks. But one must define the type of training algorithm and how to connect the new hidden nodes to the current network.
Generally speaking, self-organizing neural network models generate mappings from high-dimensional signal spaces to lower-dimensional topological structures [4]-[6]. These mappings are able to preserve the neighborhood relations in the input data. Most of the work carried out so far on SOMs has concentrated on systems with a single self-organizing layer. This self-organizing layer generally has a fixed number of neurons. However, in recent years, some adaptive/multilayer SOM models have also been proposed. Lee et al. [7] proposed a self-development neural network for adaptive vector quantization. The network has one self-organizing layer and two levels of adaptation, namely the structure and parameter (synaptic vector) levels. Kohonen's topology-preserving mapping algorithm was used for parameter adaptation. Wu et al. [8] investigated a supervised two-self-organizing-layer SOM. However, their method did not employ any structure adaptation scheme.

Cho [9] proposed a structure-adaptive SOM with a single self-organizing layer. Bauer et al. [10] presented a growing self-organizing map (GSOM) algorithm. The GSOM generates a hypercubical shape which is adapted during learning. Fritzke [11] proposed a structure adaptation algorithm for SOM which has one self-organizing layer with sophisticated multidimensional lattice topologies. The algorithm includes cell insertion and removal based on a local counter variable.


According to these, we propose a network named growing RBF using SOM in this paper, which is composed of two kinds of networks: a basic network and a cluster network. In this network structure, an RBF network and a Kohonen SOM network are adopted as the basic network and the cluster network, respectively (see Fig. 1). The relationship between the output nodes in SOM and the hidden nodes in the RBF network is a one-to-one correspondence.

2 Structure of Growing RBF Network Using SOM

2.1 Basic Structure

Let R = (r_1, r_2, ..., r_N) denote the input vector, and Y the output of the basic network. Each node in the output layer of SOM has a weight vector, say W_l, attached to it, where W_l = (w_{l1}, w_{l2}, ..., w_{lN}). W_l is also transmitted to the hidden node in the RBF network as the center of its RBF activation function. The usual procedure for training such a network consists of two consecutive phases, an unsupervised and a supervised one.

This network starts with a minimal-size network, and then sequentially adds neurons according to a criterion. The SOM network performs unsupervised learning. It generates ordered mappings of the input data onto some low-dimensional topological structure, which means that the SOM defines a topology mapping from the input data space onto the output nodes. Then the weight vectors W_l belonging to the output nodes in SOM are transmitted to the hidden nodes in the RBF network as the centers of the RBF activation functions.

The RBF network performs supervised training using the delta rule. The current output errors in the RBF network are used as the values of the current best-matching unit in SOM to determine where to insert a new SOM unit according to a rule. This also makes it possible to make the RBF network grow until a performance criterion is fulfilled or until a desired network size is reached, because of the connecting relationship between the output nodes in SOM and the hidden nodes in the RBF network.
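For concreteness, the coupled structure described above can be sketched in a few lines of Python. This is only an illustrative reading of the paper, not the authors' implementation; the class name, the initial number of units and the fully connected initial neighborhood are assumptions.

```python
# Minimal sketch of the coupled SOM/RBF structure (illustrative assumptions,
# not the authors' code).
import numpy as np

class GrowingRBFSOM:
    def __init__(self, dim, n_init=4, seed=0):
        rng = np.random.default_rng(seed)
        # SOM weight vectors; each row also serves as an RBF center
        # (one-to-one correspondence between SOM output nodes and RBF hidden nodes).
        self.W = rng.normal(size=(n_init, dim))
        # Output-layer weights of the RBF network, one per hidden unit.
        self.w = np.zeros(n_init)
        # Local error counters V_i, one per SOM unit (see Sec. 2.3).
        self.V = np.zeros(n_init)
        # First topological neighbors of each unit, stored as index sets.
        self.neighbors = [set(range(n_init)) - {i} for i in range(n_init)]
```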

2.2 Center adaptation


When an input vector R is given to this network, first, in SOM, the Euclidean distances between each W_l and R are computed. The output nodes of SOM compete with each other, and the node c whose weight vector is closest to the current input vector (minimum distance) wins; it is called the best-matching unit:

$$c = \arg\min_{l} \|R - W_l\| \tag{1}$$

Figure 1: Structure of Growing RBF Network using SOM
Here, let N_c denote the set of the first topological neighbors of the winner c.

After determining the best-matching unit for the current input pattern, this best-matching unit and its topological neighbors should have their matching increased according to the principle proposed by Kohonen [4]-[6]. There are, however, two differences:

- The adaptation strength is constant over time. Specifically, we use constant adaptation parameters α_c and α_n for the best-matching unit and its neighboring units, respectively.
- Only the best-matching unit and its first topological neighbors are adapted.

An update rule in SOM can be formulated as follows:

$$
\begin{aligned}
W_c &\leftarrow W_c + \alpha_c\,(R - W_c) \\
W_l &\leftarrow W_l + \alpha_n\,(R - W_l) \quad &&(\text{for } l \in N_c) \\
W_l &\leftarrow W_l \quad &&(\text{otherwise})
\end{aligned}
\tag{2}
$$
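A minimal sketch of this competitive step and the constant-rate update, assuming the structure sketched in Sec. 2.1; the values of alpha_c and alpha_n are illustrative:

```python
# Sketch of eq. (1) and eq. (2): winner selection and constant-rate adaptation.
import numpy as np

def som_step(net, R, alpha_c=0.1, alpha_n=0.01):
    # Best-matching unit: the unit whose weight vector is closest to R (eq. 1).
    c = int(np.argmin(np.linalg.norm(net.W - R, axis=1)))
    # Move the winner towards R (eq. 2).
    net.W[c] += alpha_c * (R - net.W[c])
    # Move only the first topological neighbors of c; all other units stay unchanged.
    for l in net.neighbors[c]:
        net.W[l] += alpha_n * (R - net.W[l])
    return c
```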

The inputs of the basic network are the same as the ones in SOM. Each Gaussian unit l has an associated vector W_l indicating its center in the input vector space and a standard deviation σ_l defined by using the mean distance to its first neighboring Gaussian units.
For a given input vector R, the activation function of unit l in the RBF network is described by:

$$f_l = \exp\!\left(-\frac{\|R - W_l\|^2}{\sigma_l^2}\right) \tag{3}$$
The Gaussian units are completely connected to the linear output node by weighted connections w_l. Therefore, the value of the output node in the RBF network is computed by:

$$Y = \sum_{l} w_l\, f_l \tag{4}$$
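The forward pass of the basic network could then be sketched as follows, assuming the structure above; the exact scaling inside the Gaussian of eq. (3) is an assumption:

```python
# Sketch of the RBF forward pass (eqs. 3-4), assuming the structure above.
import numpy as np

def rbf_forward(net, R):
    # sigma_l: mean distance from unit l to its first topological neighbors.
    sigma = np.array([
        np.mean([np.linalg.norm(net.W[l] - net.W[n]) for n in net.neighbors[l]])
        for l in range(len(net.W))
    ])
    # Gaussian activations (eq. 3).
    f = np.exp(-np.linalg.norm(net.W - R, axis=1) ** 2 / sigma ** 2)
    # Linear output node (eq. 4).
    Y = float(net.w @ f)
    return f, Y
```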



2.3 Local error measure


The weight w_l of the output layer is trained to produce the desired value T at the output node. The common delta rule is used:

$$w_l \leftarrow w_l + \eta\,(T - Y)\,f_l \qquad (\text{for all } l \in L) \tag{5}$$

where η is the learning rate.

The error caused by the training data is used to determine where to insert a new unit in the input space. Here, for each unit i in SOM, we define a local counter variable V_i, basically containing the errors of the node that is the best-matching unit for the current input.
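One supervised step, combining the delta rule of eq. (5) with the accumulation of the local counter at the best-matching unit, might look as follows, assuming the structure above; using the squared error for the counter is an assumption:

```python
# Sketch of eq. (5) and the local error counter, assuming the structure above.
def supervised_step(net, c, f, Y, T, eta=0.05):
    err = T - Y
    # Delta rule for every weighted connection to the output node (eq. 5).
    net.w += eta * err * f
    # Attribute the error of this pattern to the current best-matching unit c.
    net.V[c] += err ** 2   # squared error is an assumption
    return err
```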


2.4 Insertion of new cell

After a fixed number of adaptation steps M in the SOM and RBF networks, we determine the unit s with the largest value of the local error counter:

$$s = \arg\max_{i}\, V_i \tag{7}$$

Then, we look for the unit with the largest error counter value among the first neighbors N_s. This is a unit f satisfying:

$$f = \arg\max_{l \in N_s}\, V_l \tag{8}$$


And among those common neighbors N_{s∩f} of s and f, we determine the cell u with the largest distance from cell s:

$$u = \arg\max_{l \in N_{s \cap f}} \|W_l - W_s\| \tag{9}$$

After that, we can now insert a new unit l between s and u. The position of the new unit l in the input space is initialized as:

$$W_l = \tfrac{1}{2}\,(W_s + W_u) \tag{10}$$




And to keep the topological relationship in SOM, we should set s, f and those common neighbors N_{s∩f} as l's first neighbors. Then, the original neighboring relation between s and f is cancelled. Doing so, we may say that a local optimum regarding the SOM structure and the basic network's output error is realized. This is due to the fact that new units are only inserted in those regions of the input space where misclassifications still occur and the errors are locally maximum.
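The whole insertion step can be sketched as below, assuming the structure above; the argmax and midpoint forms of eqs. (7)-(10) follow the reconstruction given in this section, and the zero initial output weight anticipates Sec. 2.5:

```python
# Sketch of cell insertion (Sec. 2.4) with a zero initial output weight (Sec. 2.5).
import numpy as np

def insert_cell(net):
    # Unit with the largest local error counter (eq. 7).
    s = int(np.argmax(net.V))
    # Its first neighbor with the largest counter (eq. 8).
    f = max(net.neighbors[s], key=lambda l: net.V[l])
    # Among the common neighbors of s and f, the one farthest from s (eq. 9);
    # falling back to f when there is no common neighbor is an assumption.
    common = net.neighbors[s] & net.neighbors[f]
    u = max(common, key=lambda l: np.linalg.norm(net.W[l] - net.W[s])) if common else f
    # New unit placed midway between s and u (eq. 10), with zero output weight.
    net.W = np.vstack([net.W, 0.5 * (net.W[s] + net.W[u])])
    net.w = np.append(net.w, 0.0)
    net.V = np.append(net.V, 0.0)
    l_new = len(net.W) - 1
    # l's first neighbors are s, f and their common neighbors; the s-f link is cut.
    net.neighbors.append({s, f} | common)
    for m in net.neighbors[l_new]:
        net.neighbors[m].add(l_new)
    net.neighbors[s].discard(f)
    net.neighbors[f].discard(s)
    return l_new
```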

2.5 Determining the initial weight of the new cell

Whenever a new unit l is inserted, a weight w_l connecting it to the output node in the RBF network should be determined. Here, we make the initial w_l equal to 0, simply so as not to influence the output value of the RBF network after the insertion.

2.6 Stopping criterion and the whole learning process

To prevent the network from growing indefinitely, a stopping criterion has to be defined. In the simplest case, one can specify a maximum number of allowed units and halt the process when this number is achieved or surpassed.

The whole learning process of the proposed network is arranged as follows:

- Initialize the SOM's structure.
- Create a weighted connection from each RBF cell to the output unit in the RBF network.
- Associate every cell in the RBF network with a Gaussian function.
- while (an appropriate criterion is not reached)
  - repeat M times
    * Choose an I/O pair from the training data;
    * Determine the best-matching unit c (see eq. 1);
    * Increase matching for c and its first neighbors in SOM (see eq. 2);
    * Compute the activations f_l (see eq. 3);
    * Compute the output Y (see eq. 4);
    * Perform one step of delta-rule learning for the weights w_l (see eq. 5);
    * Add to the local counter variable (see eq. 6);
  - Determine the cell s with the maximum local counter value (see eq. 7);
  - Determine the cell f with the maximum local counter value among s's first neighbors N_s (see eq. 8);
  - Determine the cell u with the largest distance from s among the common neighbors N_{s∩f} (see eq. 9);
  - Insert a new cell l between s and u (see eq. 10);
  - Give the new cell l an initial weight connecting to the output in the RBF network;
- Stop if pre-specified conditions are met.
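Putting the pieces together, the whole loop might be sketched as follows, assuming the helper functions above; the values of m_steps and max_units, and resetting the counters after each insertion, are illustrative assumptions:

```python
# Sketch of the overall learning process (Sec. 2.6).
import numpy as np

def train(net, X, T, m_steps=100, max_units=50, seed=0):
    rng = np.random.default_rng(seed)
    while len(net.W) < max_units:              # simplest stopping criterion
        for _ in range(m_steps):               # repeat M adaptation steps
            i = int(rng.integers(len(X)))      # choose an I/O pair from the training data
            c = som_step(net, X[i])            # eqs. 1-2
            f, Y = rbf_forward(net, X[i])      # eqs. 3-4
            supervised_step(net, c, f, Y, T[i])  # eq. 5 and the local counter
        insert_cell(net)                       # eqs. 7-10, zero initial output weight
        net.V[:] = 0.0                         # resetting the counters is an assumption
```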
