Neural Network
Joern David
Institute of Computer Science / I1
Technical University Munich
D-85748 Garching, Germany
email: david@in.tum.de
B. Formal grammars incorporated by neural networks

Formal languages that are generated by formal grammars can be learned by artificial neural networks. Recurrent neural networks (RNNs) have already been applied to internalize the explicit rules of formal grammars, such that they were capable of classifying or predicting words ω ∈ L of the generated language L (cf. [Smi03]). These words are sequences of terminal symbols over the underlying alphabet Σ. Accordingly, different types of RNNs, such as Elman Simple Recurrent Networks (SRNs, e.g. [JMS]) or Jordan nets, were utilized to classify words in terms of the membership ω ∈ L or ω ∉ L (the word problem) [Cal03]. By contrast, the prediction task is to continue the symbol sequence of a given partial word ω̌ = a1 . . . ak, ai ∈ Σ, with ω = ω̌ω̂, such that the predicted sequence of subsequent symbols ω̂ = ak+1 . . . an conforms to the rules of the grammar.
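As an illustration of the prediction task, the following minimal sketch runs a partial word through an Elman-style SRN and returns a probability distribution over the next symbol. The toy alphabet, the layer sizes and the untrained random weights are assumptions made for illustration; they are not taken from the paper.

# Minimal Elman-style simple recurrent network (SRN) for next-symbol
# prediction over an alphabet Sigma. All names and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sigma = ["a", "b", "c"]            # toy terminal alphabet (assumed)
n_in = n_out = len(sigma)          # one-hot symbol encoding
n_hidden = 8

# Randomly initialized weights; a real SRN would be trained, e.g. by
# backpropagation through time.
W_xh = rng.normal(0, 0.1, (n_hidden, n_in))      # input -> hidden
W_hh = rng.normal(0, 0.1, (n_hidden, n_hidden))  # context -> hidden
W_hy = rng.normal(0, 0.1, (n_out, n_hidden))     # hidden -> output

def one_hot(symbol):
    v = np.zeros(n_in)
    v[sigma.index(symbol)] = 1.0
    return v

def predict_next(partial_word):
    """Feed a partial word through the SRN; return P(next symbol)."""
    h = np.zeros(n_hidden)                       # Elman context layer
    for s in partial_word:
        h = np.tanh(W_xh @ one_hot(s) + W_hh @ h)
    logits = W_hy @ h
    p = np.exp(logits - logits.max())
    return p / p.sum()                           # softmax over Sigma

print(dict(zip(sigma, predict_next("abba").round(3))))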
Many languages that are processed by neural networks originate from the regular grammars (Chomsky type 3), which can be represented by a Finite State Machine (FSM).
C. Information Retrieval

[MS05] proposes a Neural Network Information Retrieval (IR) system that consists of three layers, namely a query, a keyword and a document layer. This system is meant to enable information retrieval from text documents in the Slovak language. The specialty here is the use of neural networks for the “transition of the information” between each of these layers. The goal is to associate actor queries with a set of keywords, which is itself associated with the actual document set, while the associations between the layers are established by two neural networks. Like the approach proposed in the paper at hand, [MS05] relies on a common feed-forward neural network architecture (see figure 1).

Fig. 1. Three layer information retrieval concept enabled by a common feed-forward neural network architecture.
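A minimal sketch of this three-layer idea: a query activation is propagated to a keyword layer and on to a document layer through two associative weight matrices. The keywords, documents and all weights below are invented toy values standing in for the two trained networks of [MS05].

# Query activation propagates to a keyword layer and on to a document
# layer through two associative weight matrices (toy values).
import numpy as np

keywords = ["grammar", "network", "retrieval"]
documents = ["doc1", "doc2"]

W_qk = np.array([[1.0, 0.0, 0.2],    # rows: keywords, cols: query terms
                 [0.1, 1.0, 0.0],
                 [0.0, 0.3, 1.0]])
W_kd = np.array([[0.9, 0.0, 0.4],    # rows: documents, cols: keywords
                 [0.1, 0.8, 0.6]])

query = np.array([0.0, 1.0, 1.0])    # activation of three query terms
keyword_layer = np.tanh(W_qk @ query)
document_layer = np.tanh(W_kd @ keyword_layer)

# Rank documents by their resulting activation.
print(sorted(zip(documents, document_layer), key=lambda t: -t[1]))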
In order to associate sequences of nodes in knowledge bases, we want to model navigation operations on these nodes by a formal grammar.

Def. Components of a formal grammar
• Σ is the alphabet of terminal symbols (lowercase letters).
• V is the set of variables or non-terminal symbols, V ∩ Σ = ∅ (capital letters).
• P is the set of production rules, which is a finite subset P ⊂ (V ∪ Σ)+ × (V ∪ Σ)∗. Here ∗ is the Kleene star, which stands for an arbitrary number 0..n of single symbol occurrences (+ stands for 1..n).

Now a formal grammar is a 4-tuple G = (V, Σ, P, S), where S ∈ V is the start symbol.
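To make the definition concrete, the 4-tuple G = (V, Σ, P, S) can be written down directly as data; the right-linear (type-3) productions below are an assumed toy example, not the grammar used for the actor traces later on.

# G = (V, Sigma, P, S) for the toy regular language a b* c (assumed example).
import random

V = {"S", "B"}
Sigma = {"a", "b", "c"}
S = "S"
P = {                       # right-linear productions: A -> xB or A -> x
    "S": [("a", "B")],
    "B": [("b", "B"), ("c", None)],
}

def generate(rng=random.Random(0)):
    """Derive one word of L(G) by expanding non-terminals left to right."""
    word, symbol = [], S
    while symbol is not None:
        terminal, symbol = rng.choice(P[symbol])
        word.append(terminal)
    return "".join(word)

print(generate())           # e.g. "abc" -- one word of a b* c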
It is proven that neural networks with one hidden layer and a non-linear activation function (e.g. the logistic function or tanh) are at least as powerful and expressive as the general Turing machine.¹ So even context-sensitive rules like αBγ → αβγ, B ∈ V, α, β, γ ∈ (V ∪ Σ)∗, can theoretically be recognized by neural networks with an appropriate learning algorithm. The expressiveness of context-sensitive grammars is very high; thus it is in practice very difficult to conduct context-sensitive language learning. Nevertheless, even very complex grammars are expressible by RNNs. For our purpose it is sufficient to train a recurrent neural network in such a way that it behaves like a Finite State Machine (FSM).

¹ The power of neural networks could even exceed the power of Turing machines, due to the massive parallelism of information processing (MPP) by the interconnected neurons.
An FSM can only generate, or respectively accept, the regular languages, according to the Chomsky hierarchy [Sch01]. These are generated by the grammars of type three in this hierarchy, whose regular production rules – which are exactly equivalent to the regular expressions – are the least expressive ones, but sufficient to describe all relevant actor traces in the recommendation system.
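For instance, the word problem for the toy language a b* c from the sketch above is decided by the following deterministic FSM; a recurrent network trained to behave like an FSM realizes an equivalent transition function in its hidden state. States and transitions are again assumed for illustration.

# Deterministic FSM accepting the toy regular language a b* c.
delta = {                    # transition function: (state, symbol) -> state
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
    ("q1", "c"): "q2",
}
accepting = {"q2"}

def accepts(word, start="q0"):
    state = start
    for symbol in word:
        state = delta.get((state, symbol))
        if state is None:    # undefined transition: reject
            return False
    return state in accepting

print(accepts("abbbc"), accepts("abca"))   # True False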
But even with the restrictions of regular grammars we can leverage the context of an actor trace in the form of the previously visited nodes ab1 b2 . . . bj c, which altogether represent the navigation history for the generic knowledge tree depicted in figure 2. In that case, the target sequence is represented by the symbol sequence d1 . . . dk efg. As a consequence, the path through the knowledge tree generates a word ω that is composed of a sequence of terminal symbols, and ω ∈ L(G), with L(G) ∈ L3, is derived from the following regular grammar.
The corresponding rules can be extracted afterwards by methods similar to those of association rule mining. In this way, a general understanding of the linked knowledge bases can be derived in the form of symbolic association rules.
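In the spirit of such rule extraction, a simple counting sketch over a set of traces yields symbolic "context => next node" rules with confidences. The traces and the context length below are invented for illustration, and plain counting merely stands in for a full association rule miner.

# Count how often each context of length k is followed by each next node
# and emit "context => next" rules with their confidence (toy traces).
from collections import Counter, defaultdict

traces = ["abbcdefg", "abcdefg", "abbbcdefg"]   # assumed navigation histories
k = 1                                           # context length

counts = defaultdict(Counter)
for trace in traces:
    for i in range(k, len(trace)):
        counts[trace[i - k:i]][trace[i]] += 1

for ctx, successors in sorted(counts.items()):
    total = sum(successors.values())
    for symbol, n in successors.most_common():
        print(f"{ctx} => {symbol}  (confidence {n / total:.2f})")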
REFERENCES
[Cal03] Robert Callan. Neuronale Netze im Klartext. Pearson Studium, ISBN 3-8273-7071-X, 2003.
[GSW05] Faustino J. Gomez, Jürgen Schmidhuber, and Daan Wierstra. Modeling Systems with Internal State using Evolino. In Proc. of the 2005 Conference on Genetic and Evolutionary Computation (GECCO), Washington, D.C., pages 1795–1802. ACM Press, New York, NY, USA, 2005.
[JMS] Fergal W. Jones, I. P. L. McLaren, and Rainer Spiegel. The Prediction-Irrelevance Problem in Grammar Learning. University of Cambridge, Department of Experimental Psychology, Downing Site, Cambridge, CB2 3EB, UK.
[Kra91] Klaus Peter Kratzer. Neuronale Netze. Carl Hanser Verlag, München/Wien, 1991.
[MRF04] Rene Mayrhofer, Harald Radi, and Alois Ferscha. Recognizing and Predicting Context by Learning from User Behavior. Institut für Pervasive Computing, Johannes Kepler Universität Linz, 2004.
[MS05] Igor Mokriš and Lenka Skovajsová. Neural Network Model of System for Information Retrieval from Text Documents in Slovak Language. Acta Electrotechnica et Informatica, Vol. 5, No. 3, 2005.
[Rab89] Lawrence R. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 1989.
[Sch01] Uwe Schöning. Theoretische Informatik – kurzgefasst. Spektrum Akademischer Verlag, 2001.
[SLD96] I. Syu, S. D. Lang, and N. Deo. A neural network model for information retrieval using latent semantic indexing. In ICNN '96, The 1996 IEEE International Conference on Neural Networks, pages 1318–1323, vol. 2, 1996.
[Smi03] Andrew Smith. Grammar Inference Using Recurrent Neural Networks. Department of Computer Science, University of California, San Diego, La Jolla, CA 92037, 2003.