Classification of Symbolic Objects using Adaptive
Auto-Configuring RBF Neural Networks
T N Nagabhushan, Hanseok Ko, Junbum Park
S K Padma, Y S Nijagunarya
Department of Electronics & Computer Engineering
Anam dong, Seongbuk-gu, Seoul 136-713, Korea
Department of Information Science & Engineering
S J College of Engineering, Mysore 570 006, India.
Abstract— Symbolic data represents a general form of classical
data. There has been highly focused research on the analysis
of symbolic data in recent years. Since most future
applications will involve such general forms of data, there is a need
to explore novel methods to analyze them. In this paper
we present two simple novel approaches for the classification of
symbolic objects. In the first step, we show the representation of symbolic
data in binary form and then use a simple Hamming distance
measure to obtain the clusters from the binarised symbolic data.
This gives the class label and the number of samples in each
cluster. In the second part we pick a specific percentage of
significant data samples in each cluster and use them to train
the adaptive auto-configuring neural network. The training
automatically builds an optimal architecture for the presented
samples. The complete data set is used to test the generalization
property of the RBF network. We demonstrate the proposed
approach on the soybean benchmark data set and discuss the
results. It is found that the proposed neural network works
well for symbolic data opening further investigations for data
Key words: Auto-configuring neural networks, Incremental
learning, RBF, Significant patterns.
I. INTRODUCTION
In conventional data analysis the objects are numerical
vectors. Clustering of such numerical vectors is achieved by
minimizing the intra-cluster dissimilarity and maximizing
the inter-cluster dissimilarity. Many different approaches
have been devised to handle this type of data.
Symbolic objects are extensions of classical data types. The
main distinction between these two forms of data is that in
the case of classical data the objects are more "individualized",
whereas in the symbolic framework they are more "unified"
by relationships. Symbolic objects are defined as the logical
conjunction of events linking values and variables.
For example, e1 = [Color = (white, blue)], e2 = [height =
(1.5-2.0)]. Here the variable Color in e1 takes the value white
or blue, whereas the variable height in e2 takes a value between
1.5 and 2.0.
1 This research was supported by the Ministry of Information & Communication (MIC), Korea under the IT Foreign Specialist Programme (ITFSIP)
supervised by the Institute of Information Technology Advancement (IITA).
0-7695-3045-1/07 $25.00 © 2007 IEEE
In general, symbolic data have both quantitative (numeric
or interval) and qualitative attributes. There exist
three types of symbolic data, namely Assertion, Hoard and
Synthetic. Clustering and classification of such data
require specialized schemes, and most of the reported
procedures use similarity and dissimilarity measures. The
similarity and dissimilarity between two symbolic objects are
defined in terms of three attributes, namely position, span
and content. In the context of mining
important information from large complex data types, such as
multimedia data, it becomes imperative to develop methods
that have generalization ability. Most of the data generated
today closely resembles symbolic data. This work presents
some novel methods to deal with the classification of symbolic
objects using machine learning techniques.
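As an illustration of how events such as e1 and e2 above could be turned into a single binary string, the following Python sketch encodes a qualitative event with one bit per admissible value and an interval event with one bit per bin. The encoding rule, the helper names `encode_qualitative` and `encode_interval`, the colour domain and the bin edges are all assumptions made for this example; the paper does not specify its binarisation at this level of detail.

```python
from typing import List, Sequence, Set


def encode_qualitative(values: Set[str], domain: Sequence[str]) -> List[int]:
    """One bit per domain value; the bit is 1 when the event admits that value."""
    return [1 if v in values else 0 for v in domain]


def encode_interval(low: float, high: float, edges: Sequence[float]) -> List[int]:
    """One bit per bin [edges[i], edges[i+1]); the bit is 1 when (low, high) overlaps the bin."""
    return [
        1 if low < edges[i + 1] and high > edges[i] else 0
        for i in range(len(edges) - 1)
    ]


# Hypothetical domain and bin edges, chosen only for this illustration.
color_domain = ["white", "blue", "red"]
height_edges = [1.0, 1.5, 2.0, 2.5]

e1_bits = encode_qualitative({"white", "blue"}, color_domain)  # Color = (white, blue)
e2_bits = encode_interval(1.5, 2.0, height_edges)              # height = (1.5-2.0)

# The symbolic object is the concatenation of the per-event bit groups.
object_bits = e1_bits + e2_bits
print(object_bits)
```

With this kind of encoding every object, whatever mix of qualitative and quantitative attributes it has, ends up as a bit string of the same length, which is what makes a bitwise comparison between objects possible.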
Analysis of symbolic data has been explored and expanded
by several researchers such as Edwin Diday and K. C.
Gowda, Ichino, and D. S. Guru. All of them
have viewed the analysis of symbolic objects from different
mathematical frameworks and reported good results. But
none of the available techniques has the ability to provide
good generalization for the test samples. In other words,
neural computing techniques have not been tried on symbolic
objects, except in a recent work by S. Mitra. In that paper
the authors took the benchmark dataset from the UCI
machine learning repository and proposed schemes to find the
clusters with respect to medoids. The samples were then
trained using fuzzy radial basis function neural networks,
and very good results were reported. The only drawback of
that scheme is the fixed architecture of the network:
the medoids serve as the fixed optimal centers for the RBF
neural network. While this method works well for small data
sets, for larger data sets the algorithm incurs an additional
computational burden since it involves the calculation of
fuzzy membership functions.
In our work, we propose a very simple approach to the
classification of the symbolic objects. The main contributions
of this work are:
1) Conversion of symbolic data into homogeneous binary strings.
2) Using the binary form of the symbolic data sets, we determine the similarity between them using Hamming distance and then, using this similarity index, we obtain the clusters.
3) We then compute the medoids in each cluster of binary symbolic data. We employ the farthest neighbor concept with respect to the medoid within a cluster and select a specified number of samples for training the network.

The adaptive auto-configuring RBF neural network proposed in our earlier work has been used for synthesizing the architecture. The learning and generalization features of the proposed techniques are illustrated on the standard benchmark soybean dataset from the UCI machine learning repository. The entire procedure is very simple and can be easily applied to larger data sets.

The next section introduces symbolic data and its features.

II. A NEW SIMILARITY MEASURE FOR SYMBOLIC OBJECTS

It is well known that symbolic objects exist in a generalized form in several applications. Symbolic objects are expected to be used predominantly in image processing applications. There exist many methods for clustering symbolic objects. In our work we propose a simple approach to compute the similarity between symbolic objects through a homogeneous binary format for both quantitative and qualitative features. Our new method does not require the traditional formulae and instead uses a simple representation of symbolic objects in the form of a concatenated binary string.

We compute the similarity indices for the binary equivalent of symbolic objects as follows. Let $b_i$, $i = 1, 2, \ldots, n$, be the $n$ binary bits and $s_1, s_2, \ldots, s_M$ be the $M$ samples. Consider two strings $s_1 = b^{1}_{1} b^{1}_{2} b^{1}_{3} \ldots b^{1}_{n}$ and $s_2 = b^{2}_{1} b^{2}_{2} b^{2}_{3} \ldots b^{2}_{n}$. The similarity between $s_1$ and $s_2$ is given by

    $S(s_1, s_2) = \frac{1}{n} \sum_{i=1}^{n} \delta_i$    (1)

where $\delta_i = 1$ if $b^{1}_{i} = b^{2}_{i}$ and $\delta_i = 0$ if $b^{1}_{i} \neq b^{2}_{i}$.

Equation 1 constitutes a new measure which defines the similarity between two samples $s_1$ and $s_2$ as the ratio of the number of identical bits in corresponding positions of the two strings to the total number of bits in the strings. It is seen from the above equation that the computation of similarity values is a simple, direct approach when compared to traditional methods. We employ the agglomerative clustering algorithm to cluster the symbolic objects using the similarity indices computed with Equation 1. The above procedure has been applied to the soybean data, and the class labels obtained are in agreement with the benchmark dataset.

III. SELECTION OF SIGNIFICANT PATTERNS

It is known that the training patterns control the dynamics of the neural network architecture. The larger the number of patterns, the longer the training times and the larger the size of the generated network. In many real life situations, the training patterns often have high dimensions besides being voluminous. In such situations, training a neural network is laborious and sometimes frustrating too. Even though incremental learning algorithms offer a better training procedure, they also end up with a large number of training cycles. Therefore it becomes imperative to take a look at the input patterns themselves and choose those which contribute to good learning and generalization. Those samples which aid in good learning and generalization are called informative patterns, significant patterns or representative patterns.

Theoretical procedures to compute the upper bounds on the number of training samples needed for a specified level of generalization are available. A widely used result is the Vapnik-Chervonenkis (VC) dimension. Experimental results have shown that acceptable generalization performance can be obtained with training sets smaller than that specified by the VC dimension.

In the context of classification problems, the neural networks define a classification boundary. While generating the decision boundaries, the neural network often uses all patterns, many of which are significant and many are not. If the training samples are structured and compact, then neural networks can learn fast. On the other hand, if the training patterns are unstructured and noisy, then training on all of them would result in an over-fitting architecture, and that too at the expense of more resources. The non-significant patterns, which may be outliers, often consume the most training cycles during learning and therefore need to be removed to improve the learning performance. In this work, we propose a simple approach to select significant patterns with respect to the medoid in each class.

A. Problem Definition

Given a set V of input vectors, the problem is to obtain a subset of V such that the vectors in the subset achieve the desired generalization level. Let $V = \{v_1, v_2, \ldots, v_n\}$ be $n$ vectors which constitute the samples in the input space, with $V \in R^N$. To obtain a subset $V_s$ of $V$ belonging to $R^N$, select $k$ samples from the given training set such that $V_s = \{v_1, v_2, \ldots, v_k\}$, where $k \le n$ and $v_1, v_2, \ldots, v_k$ constitute significant patterns.
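The similarity measure of Equation 1 in Section II can be sketched in a few lines of Python. This is a minimal illustration, assuming each sample is already available as a sequence of 0/1 values of equal length; the function names `similarity` and `hamming_distance` are chosen for this example and do not come from the paper.

```python
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which the two bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Ratio of matching bit positions to the total number of bits (Equation 1)."""
    if len(a) == 0:
        raise ValueError("bit strings must not be empty")
    return (len(a) - hamming_distance(a, b)) / len(a)


s1 = [1, 1, 0, 0, 1, 0]
s2 = [1, 0, 0, 0, 1, 1]
print(hamming_distance(s1, s2), similarity(s1, s2))
```

Written this way the link to the Hamming distance d is explicit: the similarity is (n - d) / n, so ranking pairs by decreasing similarity is the same as ranking them by increasing Hamming distance.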
B.

Let the entire set of pattern vectors $v_1, v_2, \ldots, v_n$ belong to $c$ classes. For each class $1, 2, \ldots, c$:
a) Determine the medoid $M$.
b) Calculate the distance between each of the input samples in the class and the medoid $M$.
c) Choose a known percentage of samples which are the farthest from the medoid. These are the samples that represent this class in the training set. Also choose the medoid as one of the samples in addition to those chosen.
d) Repeat steps [a] through [c] for all the classes in the given input set.
e) Append all the selected samples from all the classes to form the training set $V_s$.

The above procedure is applied to three benchmark datasets and the patterns derived are shown in Table II.

IV. TRAINING WITH INCREMENTAL LEARNING ADAPTIVE RBF NEURAL NETWORK

As mentioned earlier, there exist many versions of incremental algorithms for synthesizing RBF networks. Each one is complex in its own way and has its own benefits. We have modified the incremental learning algorithm proposed by Fritzke and used it for training the significant samples. The modified algorithm has the ability to adapt its learning parameters, which control the RBF center movement in the input space. Training is conducted with the full dataset as well as with the selected significant patterns. The algorithm for training the network, along with the notation used, is given in the following sections.

A. Notations

The following are the notations used in the algorithm presented below:
$p$ : Pattern index
$\xi$ : Input pattern
$\zeta$ : Desired output
$\sigma$ : Width of RBF
$\eta$ : Learning rate
$\epsilon_1$ : Center adaptation parameter for BMU
$\epsilon_2$ : Center adaptation parameter for non-BMU
$i$ : Output layer
$j$ : Hidden layer
$k$ : Input layer
$O$ : Actual output
$V$ : Activation output
$E$ : Error
$x$ : RBF center
$W$ : Weight between output and hidden neurons
BMU : Best Matching Unit

B. Algorithm for Adaptive RBF (ARBF) Network

1) Select two random vectors from the input space as initial RBF centers. Connect them by an edge. Set their widths to be equal to the distance between them, that is,

    $\sigma = \lVert x_1 - x_2 \rVert_2$    (2)

where $x_1$ and $x_2$ are the coordinates of the two selected RBF units.

2) For a given training pattern $(\xi, \zeta)$, calculate the output of the RBF network using

    $O_i = \sum_{j} W_{ij} V_j$    (3)

where

    $V_j = \exp\left( - \frac{\lVert \xi - x_j \rVert^2}{\sigma_j^2} \right)$    (4)

Calculate the error using

    $E_i = \zeta_i - O_i$    (5)

3) Set $\epsilon_1$ using Table I. Calculate $\epsilon_2$ from $\epsilon_1$ using Equation (6). For every input pattern that is presented, find the best matching unit (BMU) using

    $\mathrm{BMU} = \arg\min_{j} \lVert \xi - x_j \rVert$    (7)

Move the BMU $\epsilon_1$ times the current distance towards the input pattern using

    $x_{bmu}(new) = x_{bmu}(old) + \epsilon_1 (\xi - x_{bmu}(old))$    (8)

Move all the immediate neighbors of the BMU $\epsilon_2$ times their current distance towards the input pattern using

    $x_{neighbor}(new) = x_{neighbor}(old) + \epsilon_2 (\xi - x_{neighbor}(old))$    (9)

4) Update the weights between the hidden and output units using

    $W(new) = W(old) + \eta (\zeta - O) V$    (10)

where $\eta$ is the learning rate, which has a small value between 0 and 1.

5) The widths of the RBF units are computed using 'Age' information. For each input presented, compute the BMU and the next immediate BMU. Connect them by an edge and associate it with an age variable. When an edge is created, its age variable is set to zero. The age of all edges emanating from the BMU is increased at every adaptation step. Edges exceeding an age limit $A_{max}$ are deleted, and so are the nodes having no more emanating edges.

6) An RBF unit is inserted between the unit which has accumulated the maximum error and any of its neighbors. The insertion of a new RBF unit is based on the squared error accumulated across all the output units. Its width is set to the mean distance between the neighboring units. Its weights are set to small random values.

7) The learning rate is decremented linearly by a small value during the convergence cycle.

8) Repeat steps 2 to 7 until the classification error for all the patterns falls below a set value.

It is evident from the algorithm that the RBF units are subjected to movement in the input space during the entire learning phase. Insertion and movement of RBF units are carefully controlled by the adaptation parameters $\epsilon_1$, $\epsilon_2$ and $\eta$. These parameters help to synthesize optimal network architectures.

TABLE I
LOOKUP TABLE FOR CHOOSING $\epsilon_1$
(Columns: S.N, Error, $\epsilon_1$. The table has 19 rows, S.N 1 to 19, each pairing an error range with a value of $\epsilon_1$; the $\epsilon_1$ entries include values from 0.10 to 0.32.)

V. EXPERIMENTS

Since most of the improvements in RBF network construction have been illustrated with well-defined benchmark datasets from the UCI machine learning database repository, we have also used the soybean dataset for our experiments and comparison. The dataset is briefly described below.

A. Dataset

SOYBEAN LARGE data: Soybean data is available in two forms, small and large. We have used the large dataset, which has more samples. The dataset has 307 instances belonging to 19 classes. Each pattern has 35 attributes. But out of the 307 samples, 41 samples have missing attributes. After deleting these 41 samples, the dataset has 266 patterns belonging to 15 classes. We have used the binarised representation of the actual dataset available. After the process of binarization, each pattern is a 105-attribute vector consisting of binary equivalents of the original data. Therefore we have trained and tested on data having 105 attributes belonging to 15 classes.

B. Training set Generation

We have investigated 2 approaches to selecting significant training patterns. In the first approach the Euclidean norm is computed with respect to the mean of the samples in a class. Then a percentage of samples is picked for training. In the second approach the medoid is used as a reference point and samples are picked with respect to the medoid. Nevertheless, results from both approaches are the same. We have tabulated results using the medoid only, as the medoid happens to be an actual sample in the input space whereas the mean is a non-existent sample. The above approaches have been used to select the training samples, whose number is progressively increased to study the optimality of the generated architecture. Table II shows the number of samples selected for training.

TABLE II
SOYBEAN DATA SAMPLES SELECTED FOR TRAINING
Percentage   Number
10           42
20           69
30           95
40           122
50           148
60           175
70           202
75           219
80           228
90           255
100          266

C. Learning Characteristics

The adaptive RBF algorithm is trained with the different proportions of significant patterns shown in Table II. Each set of significant patterns is used to synthesize the optimal RBF architecture. Thus in the proposed study we have obtained 11 different RBF architectures. Table III shows the number of RBF units generated and the epochs taken by the various significant pattern sets. Figures 1 and 2 show the learning curves for 70% significant samples.

D. Testing and Generalization

We have used all the 266 patterns of the Soybean dataset to test the classification accuracy of the RBF architectures generated. Table IV shows the generalization produced by the networks. It can be seen that 70% significant samples have yielded the best results when compared with the remaining sets of patterns.
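The medoid-based selection of significant patterns described in Sections III-B and V-B can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: distances are Hamming distances on the binarised patterns, the medoid is taken as the sample with the smallest total distance to the other samples of its class, and the per-class count is rounded up. None of these details, nor the names `medoid_index`, `select_significant` and `build_training_set`, are confirmed by the paper.

```python
from typing import Dict, List, Sequence

Pattern = Sequence[int]


def hamming(a: Pattern, b: Pattern) -> int:
    """Number of differing bit positions (assumed distance measure)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def medoid_index(samples: List[Pattern]) -> int:
    """Index of the sample with the smallest total distance to all samples in the class."""
    totals = [sum(hamming(s, t) for t in samples) for s in samples]
    return min(range(len(samples)), key=lambda i: totals[i])


def select_significant(samples: List[Pattern], percent: int) -> List[int]:
    """Indices of the medoid plus the given percentage of samples farthest from it."""
    if not samples:
        return []
    m = medoid_index(samples)
    others = [i for i in range(len(samples)) if i != m]
    # Farthest from the medoid first; Python's sort is stable, so ties keep input order.
    others.sort(key=lambda i: hamming(samples[i], samples[m]), reverse=True)
    # Integer ceiling of percent/100 * class size (rounding up is an assumption).
    k = min(len(others), (percent * len(samples) + 99) // 100)
    return [m] + others[:k]


def build_training_set(classes: Dict[str, List[Pattern]], percent: int) -> Dict[str, List[Pattern]]:
    """Apply the per-class selection to every class and collect the chosen patterns."""
    return {
        label: [samples[i] for i in select_significant(samples, percent)]
        for label, samples in classes.items()
    }


# Toy class with five 4-bit patterns (made up for illustration).
toy = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1], [1, 1, 1, 1], [0, 1, 0, 1]]
print(medoid_index(toy), select_significant(toy, 40))
```

Because the medoid is an actual sample, it can be added to the training set directly, which is the reason given above for preferring it over the class mean.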
TABLE III
RESULTS: EPOCHS & RBF UNITS FOR DIFFERENT COMPOSITION OF SIGNIFICANT PATTERNS
%     #     Epochs   RBFs
100   266   2783     68
90    255   2225     60
80    228   2539     67
75    219   2501     59
70    202   2407     68
60    175   2221     62
50    148   1850     53
40    122   1511     47
30    95    1306     44
20    69    1276     39
10    42    853      28

Fig. 1. Epochs Vs RBF units for 70% Soybean Data (70% significant patterns; x-axis: Epochs, y-axis: RBF units).

Fig. 2. Epochs Vs Error for 70% Soybean Data (70% significant patterns; x-axis: Epochs, y-axis: Error).

TABLE IV
% GENERALIZATION ACHIEVED
(Columns: %, #, Generalization, for the same eleven training sets as Table III. The full set, 100% with 266 patterns, gives a generalization of 100.00.)

It is seen that generalization levels are poor in the lower half of the table. This can be attributed to the small number of patterns that are present in some of the clusters; picking significant patterns from a small number does not yield good results. Specifically for this dataset, 70% significant patterns picked using the farthest neighbour principle with respect to the medoid have yielded good results both in terms of network size and training time.

VI. CONCLUSIONS

In this research work, we have shown that symbolic data can be classified using an auto-configuring RBF network with better generalization. More applications in multimedia data mining are under investigation.

REFERENCES

K. C. Gowda and E. Diday, "Symbolic clustering using new dissimilarity measure," Pattern Recognition, vol. 24, no. 6, pp. 567–578, 1991.
K. C. Gowda and T. V. Ravi, "Divisive clustering of symbolic objects using the concepts of both similarity and dissimilarity," Pattern Recognition, vol. 28, no. 8, pp. 1277–1282, 1995.
M. Ichino and H. Yaguchi, "Generalized Minkowski metrics for mixed feature-type data analysis," IEEE Transactions on Systems, Man and Cybernetics, vol. 24, no. 4, 1994.
D. S. Guru, B. B. Kiranagi, and P. Nagabhushan, "Multivalued type proximity measure and concept of mutual similarity value useful for clustering symbolic patterns," Pattern Recognition Letters, vol. 25, pp. 1203–1213, 2004.
K. Mali and S. Mitra, "Symbolic classification, clustering and fuzzy radial basis function network," Fuzzy Sets and Systems, vol. 152, pp. 553–564, 2005.
T. N. Nagabhushan and S. K. Padma, "Adaptive learning in incremental learning RBF networks," ICONIP 2004, pp. 471–476, 2004.
Y. S. Abu-Mostafa, "The Vapnik-Chervonenkis dimension: Information versus complexity in learning," Neural Computation, vol. 1, pp. 312–317, 1989.
E. Baum and D. Haussler, "What size net gives valid generalization?" Advances in Neural Information Processing Systems, vol. 1, pp. 81–90, 1989.
M. Opper, "Learning and generalization in a two-layer neural network: The role of the Vapnik-Chervonenkis dimension," Physical Review Letters, pp. 2133–2166, 1994.
D. Cohn and G. Tesauro, "Can neural networks do better than the Vapnik-Chervonenkis bounds?" Advances in Neural Information Processing Systems, vol. 3, pp. 911–917, 1991.
B. Fritzke, "Supervised learning with growing cell structures," Advances in Neural Information Processing Systems, vol. 6, pp. 255–262, 1994.
C. Blake and C. Merz, "UCI repository of machine learning databases," 1998.