
Modified Self-Organising Maps Neural Network for Arabic

Phonemes Recogniser

SUKKAR FADEL AND HOMSI MASUN


Department of Computer Engineering
University of Aleppo
P.O. Box 6502
ALEPPO, SYRIA

Abstract: This paper presents an approach to Arabic phoneme recognition based on the unsupervised learning of a Modified Self-Organising Map (MSOM), a new version of the self-organising map (SOM). It utilises the learning concept of Adaptive Resonance Theory (ART2), which adds advantages such as the ability to learn from experience and a high computation rate. The approach consists of two stages: a preprocessing stage, in which Arabic phonemes are recorded, segmented and processed by a wavelet transform algorithm that decomposes the signals, and a recognition stage, in which a neural network is employed to recognise one hundred and twelve Arabic phonemes. Results show that system performance depends on a good choice of network parameters, so these are chosen to give the best generalisation in recognition at the lowest computational cost.

Key-Words: Self-Organising, Modified Self-Organising Map (MSOM), Wavelet Transform, Adaptive Resonance Theory (ART2), Arabic Phonemes Recogniser.

1 Introduction
Every language is constructed from a set of basic linguistic units which have the property that if one replaces another in an utterance, the meaning of the word or sentence changes. That is, the information transmitted through speech can be represented as a concatenation of elements called phonemes. A Modified Kohonen Self-Organising Map (MSOM) neural network is used as one of the building blocks of the Arabic phonemes recogniser. It is an unsupervised competitive learning network, able to learn without supervision from a teacher [5]. Its architecture is quite simple: it consists of two layers, an input layer and a Kohonen output layer. The size of the input layer matches the size of the feature vector extracted for each phoneme in the preprocessing stage. The network uses the top-down weight updating equation of Adaptive Resonance Theory (ART2), which represents the Long Term Memory (LTM) of the network. The main advantages of including the ART2 learning concept are the ability to learn from experience and a higher computation rate.
The wavelet transform forms part of the preprocessing stage of the Arabic phonemes recogniser. It is introduced as an alternative technique for time-frequency decomposition. Its main goal is to separate high frequencies from low frequencies, simulating the behaviour of the basilar membrane in the ear of humans and other mammals.
This paper is organised as follows. The basic architecture of MSOM is briefly outlined in section 2. System implementation is described in section 3. A set of experiments summarising the comparative study between SOM and MSOM is presented in section 4. Finally, concluding remarks are provided in section 5.

2 Modified Self-Organising Maps (MSOM)

2.1 MSOM Basic Architecture
The Self-Organising Map (SOM) is a clustering algorithm developed by Teuvo Kohonen of Helsinki University of Technology [6] and is often used to classify inputs into categories. It is a simplified model of the feature-localised-region mapping of the brain, from which it derives its name. It is a competitive, self-organising network which learns from the environment without a teacher. It consists of a group of neurons organised in a single layer of one or more dimensions, which may be used by itself or as a layer of another neural network. Fig. 1 illustrates the basic architecture of SOM, which has two layers, an input layer and a Kohonen output layer [7].
Inputs from the input layer are fed into each of the neurons in the Kohonen layer. The neuron whose weight vector is closest to the input pattern wins the competition. The winning neuron shares its learning experience with its closest neighbours as the learning process is executed, so that nearby neurons tend to move their weights in the same direction as the input pattern, while more distant neurons move their weights in the opposing direction [4].

[Fig. 1: an input layer fully connected to a two-dimensional Kohonen layer, with the winning neuron highlighted.]
Fig. 1, Kohonen Network

MSOM is developed as a modified version of the self-organising map (SOM) neural network. It has the same architecture as SOM, but the weight matrix is updated using the top-down LTM equation of Adaptive Resonance Theory (ART2), developed by Carpenter and Grossberg [2]. Applying this new learning concept to SOM makes the clustering mechanism more flexible and saves time in categorising the input pattern.
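The paper does not write the two update rules side by side, so the contrast below is indicative only. The first line is the classical Kohonen update for a node j in the winner's neighbourhood; the second is one common form of the ART2 top-down LTM rule for the winning node J, where u_i denotes the normalised short-term-memory activity and d the activation of the winning node; these two symbols come from the ART2 literature, not from this paper:

    w_{ij}(t+1) = w_{ij}(t) + \eta(t) \, [ x_i(t) - w_{ij}(t) ]              (SOM)

    \frac{d z_{Ji}}{dt} = d (1-d) \left[ \frac{u_i}{1-d} - z_{Ji} \right]    (ART2 top-down LTM)

MSOM keeps the SOM architecture and neighbourhood mechanism but applies an ART2-style rule of the second kind to its weight matrix.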
2.2 MSOM Algorithm
The training algorithm for the MSOM network is designed to produce clusters of similar input data and to place similar clusters close together in the output map. An MSOM network is trained by presenting the input vectors of the phoneme signals one at a time and defining as the winning node the one whose weights most closely match the values in the current input vector. The weights are then modified to reduce the difference between the weights of the winning node and the features of the input vector. The weights of the winner's nearest neighbours are also modified, to a lesser extent; this feature leads to similar, but not identical, clusters being grouped in the same regions of the output map.

The algorithm consists of the following steps (a code sketch of the training loop follows the list):
1- Initialise the weights from the N inputs to the M output nodes to small values.
2- Set the initial radius of the neighbourhood.
3- Present a new input from the training set.
4- Compute the distance d_j between the input and each output node j using

    d_j = \sum_{i=0}^{N-1} \left( x_i(t) - w_{ij}(t) \right)^2

where x_i(t) is the input to node i at time t and w_{ij}(t) is the weight from input node i to output node j at time t.
5- Select the output node J with minimum distance.
6- Update the weights of node J and its neighbours:

    w_{ij}(t+1) = w_{ij}(t) + \eta(t) \, \left( x_i(t) - w_{ij}(t) \right)

where η(t) is a learning rate that decreases in time, with 0 < η(t) < 1.
7- Go to step 2.
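As referenced above, the following Python fragment is a minimal sketch of steps 1-7 for one training run. It is our illustration, not the authors' code: the function name train_som, the linear decay schedules, and the square (Chebyshev) neighbourhood are assumptions, and the update in step 6 is the standard Kohonen rule; an MSOM implementation would substitute its ART2-based update at the marked line.

    import numpy as np

    def train_som(inputs, grid_side=150, eta0=0.9, radius0=3, epochs=10, seed=0):
        # Steps 1-2: initialise small weights and the neighbourhood radius.
        rng = np.random.default_rng(seed)
        n_inputs = inputs.shape[1]                    # N (128 in this paper)
        m = grid_side * grid_side                     # M output nodes (150*150)
        w = rng.uniform(0.0, 0.1, size=(m, n_inputs))
        # (row, col) position of every output node on the 2-D map.
        pos = np.stack(np.unravel_index(np.arange(m),
                                        (grid_side, grid_side)), axis=1)
        t, t_max = 0, epochs * len(inputs)
        for _ in range(epochs):
            for x in inputs:                          # step 3: present an input
                d = ((x - w) ** 2).sum(axis=1)        # step 4: distances d_j
                J = int(d.argmin())                   # step 5: winning node J
                eta = eta0 * (1.0 - t / t_max)        # decreasing, 0 < eta(t) < 1
                radius = max(1, round(radius0 * (1.0 - t / t_max)))
                # Step 6: update J and its map neighbours (Kohonen rule here;
                # MSOM would apply its ART2 top-down LTM update instead).
                hood = np.abs(pos - pos[J]).max(axis=1) <= radius
                w[hood] += eta * (x - w[hood])
                t += 1                                # step 7: next input
        return w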
3 Arabic Phonemes System Implementation
The system proposed in this paper is designed to recognise Arabic phonemes. It consists of two stages, a preprocessing stage and a recognition stage. Fig. 2 shows a block diagram of the different components of the Arabic phonemes recogniser.

[Fig. 2: Recording -> Noise Suppression -> Segmentation -> Wavelet Transform (preprocessing stage), feeding SOM/MSOM (recognition stage).]
Fig. 2, Block diagram of the Arabic phoneme system

3.1 Preprocessing Stage
In this stage one hundred and twelve Arabic phonemes are recorded: four different static vowels (Fatha, Dammeh, Kasra and Sokoun) are used for each Arabic letter (أ، ب، ت، ث، ج، ح، خ، د، ذ، ر، ز، س، ش، ص، ض، ط، ظ، ع، غ، ف، ق، ك، ل، م، ن، هـ، و، ي), recorded at low quality with 8 bits per sampled amplitude and a sampling rate of 22050 Hz. A noise suppression and segmentation procedure is then applied, the details of which can be found in [3]. Finally, the wavelet transform is applied to separate high frequencies from low frequencies and to reduce the data size of each Arabic phoneme to 128 samples: each signal is decomposed into a set of subsignals, one of which contains the relevant information of the Arabic phoneme [1]. A sketch of this step is given below.
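The paper does not state the mother wavelet or the decomposition depth, so the sketch below (Python, using the PyWavelets package) is only one plausible reading: it assumes a Daubechies-4 wavelet and deepens the decomposition until the low-frequency approximation band fits in 128 samples. The names phoneme_features and target_len are illustrative, not the authors'.

    import numpy as np
    import pywt  # PyWavelets

    def phoneme_features(signal, target_len=128, wavelet="db4"):
        # Deepen the decomposition until the approximation (low-frequency)
        # band is no longer than target_len samples.
        level = 1
        approx = pywt.wavedec(signal, wavelet, level=level)[0]
        while len(approx) > target_len:
            level += 1
            approx = pywt.wavedec(signal, wavelet, level=level)[0]
        # The approximation band plays the role of the subsignal carrying the
        # relevant information; zero-pad it to a fixed-length feature vector.
        out = np.zeros(target_len)
        out[: len(approx)] = approx
        return out

For a phoneme segment recorded at 22050 Hz, only a handful of decomposition levels are needed before the approximation band reaches 128 samples.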
3.2 Recognition Stage
128 samples are obtained for each Arabic phoneme and represent the MSOM input pattern. The self-organising network is constructed of two layers: an input layer of 128 neurons and a two-dimensional output layer of 22500 neurons (150*150), which classifies and distributes the 112 Arabic phonemes, as illustrated below.
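As a usage illustration only, under the same assumptions as the sketch in section 2.2 (the hypothetical train_som function and, since the recorded corpus is not distributed with the paper, random stand-in data):

    import numpy as np

    # Stand-in for the preprocessed corpus: 112 phonemes x 128 wavelet samples.
    phonemes = np.random.default_rng(1).random((112, 128))

    # 128-neuron input layer and a 150*150 = 22500-neuron output map, as above.
    weights = train_som(phonemes, grid_side=150, eta0=0.9, radius0=3)

    # Recall: assign each phoneme to the closest node of the trained map.
    nodes = [int(((p - weights) ** 2).sum(axis=1).argmin()) for p in phonemes]
    print(len(set(nodes)), "distinct output nodes used for 112 phonemes")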
4 Results and Discussion
Results for the two-layered self-organising map neural networks, with 128 neurons in the input layer and 22500 neurons in the output layer, are tabulated in Table 1. The table shows the different parameters used to train the networks. Nine comparative experiments were run to determine the most favourable parameters. In experiments 1, 2, 4, 5, 7 and 8, both neural networks, SOM and MSOM, learn and group the 112 Arabic phonemes onto the two-dimensional matrix of the output layer, and both can also recall them; but it is obvious that MSOM saves time in the learning stage. In this respect, MSOM has a better performance than SOM. When the neighbourhood parameter is smaller, both MSOM and SOM learn in a shorter time; this is illustrated in experiments 2, 4 and 8.
When SOM cannot learn, MSOM can still learn and recall all the Arabic phoneme signals, although it uses the same parameters. This advantage of MSOM over SOM is depicted in experiments 3, 6 and 9.

Experiments on Arabic Phonemes

                              SOM                                   MSOM
 No   η    Neighbour   # Learned  Learning     # Recalled   # Learned  Learning     # Recalled
                       Phonemes   Time (sec.)  Phonemes     Phonemes   Time (sec.)  Phonemes
 1    0.9      3          112        1350         112          112        1220         112
 2    0.9      2          112        1110         112          112        1090         112
 3    0.9      1           98         975          -           112         876         112
 4    0.7      3          112        1310         112          112        1215         112
 5    0.7      2          112        1190         112          112        1112         112
 6    0.7      1           90         995          -           112         899         112
 7    0.4      3          112        1455         112          112        1310         112
 8    0.4      2          112        1195         112          112        1115         112
 9    0.4      1           95         989          -           112         909         112

Table 1, Experimental Results

5 Conclusion
This paper presented a comparative analysis of two neural networks, the self-organising map (SOM) and the modified self-organising map (MSOM). This new version of SOM was developed to be part of an Arabic phoneme signals recogniser. MSOM's performance is characterised by a high degree of accuracy as well as a fast training time, because it introduces the ART2 learning concept to update the weight matrix between the input layer and the output layer.
The Arabic phonemes recogniser is an important step towards building an Arabic word recogniser.

References:
[1] A. Jensen and A. la Cour-Harbo, "Ripples in Mathematics: The Discrete Wavelet Transform", Springer-Verlag, ISBN 3540416625, 2001.
[2] Gail A. Carpenter and Stephen Grossberg, "The ART of Adaptive Pattern Recognition by a Self-Organizing Neural Network", Computer, vol. 21, no. 3, pp. 77-88, 1988.
[3] Masun Homsi and Fadel Sukkar, "Match Adaptive Resonance Theory Neural Network for Arabic Alphabet Recognition", Problems in Applied Mathematics and Computational Intelligence, ISBN 960-8052-30-0, pp. 104-112, 2001.
[4] Klaus Obermayer and Terrence J. Sejnowski (eds.), "Self-Organizing Map Formation: Foundations of Neural Computation (Computational Neuroscience)", MIT Press, ISBN 0262650606, 2001.
[5] Dan Patterson, "Artificial Neural Networks: Theory and Applications", Prentice Hall, Singapore, ISBN 0-13-295353-6, pp. 387-403, 1996.
[6] T. Kohonen, "Self-organized formation of topologically correct feature maps", Biological Cybernetics, vol. 43, no. 1, pp. 59-69, 1982.
[7] T. Kohonen, "Self-Organizing Maps", third edition, Springer-Verlag, ISBN 3540679219, 2001.
