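in computer science, a formal grammar is in chomsky normal form if and only if all
production rules are of the form:
A → BC or
A → a or
S → ε
where A, B and C are nonterminal symbols, a is a terminal symbol, S is the start
symbol, and ε is the empty string. neither B nor C may be the start symbol.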
with the exception of the optional rule S → ε (included when the grammar may
generate the empty string), all rules of a grammar in chomsky normal form are
expansive; thus, throughout the derivation of a string, each string of terminals
and nonterminals is always either the same length as, or one element longer than,
the previous such string. the derivation of a nonempty string of length n is always
exactly 2n − 1 steps long: n − 1 steps that each replace one nonterminal by two,
and n steps that each replace one nonterminal by a terminal. furthermore, since all
rules deriving nonterminals transform one nonterminal to exactly two nonterminals,
a parse tree based on a grammar in chomsky normal form is a binary tree, and the
height of this tree is limited to at most the length of the string.
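
the 2n − 1 step count can be checked on a small example. the following python
sketch is only an illustration: the toy grammar (for strings of the shape a^n b^n),
the dictionary encoding of rules and the helper names is_cnf and derivation_length
are all assumptions made for this example, not part of any standard library.

```python
# Hypothetical toy grammar in Chomsky normal form for { a^n b^n : n >= 1 }.
# Keys are nonterminals; every other symbol is treated as a terminal.
# The start symbol S never appears on a right-hand side.
RULES = {
    "S": [("A", "B"), ("A", "T")],
    "X": [("A", "B"), ("A", "T")],
    "T": [("X", "B")],
    "A": [("a",)],
    "B": [("b",)],
}
START = "S"


def is_cnf(rules, start):
    """Check that every rule has the shape A -> BC, A -> a, or S -> empty."""
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                if head != start:          # only the start symbol may vanish
                    return False
            elif len(body) == 1:
                if body[0] in rules:       # a single symbol must be a terminal
                    return False
            elif len(body) == 2:
                # both symbols must be nonterminals other than the start symbol
                if any(sym not in rules or sym == start for sym in body):
                    return False
            else:
                return False
    return True


def derivation_length(rules, start, word):
    """Return the step count of a leftmost derivation of word, or None."""
    target = tuple(word)

    def search(form, steps):
        # Sentential forms never shrink (apart from S -> empty), so prune.
        if len(form) > max(len(target), 1):
            return None
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                    # only terminals are left
            return steps if form == target else None
        if form[:idx] != target[:idx]:     # terminal prefix must match
            return None
        for body in rules[form[idx]]:
            found = search(form[:idx] + body + form[idx + 1:], steps + 1)
            if found is not None:
                return found
        return None

    return search((start,), 0)


print(is_cnf(RULES, START))
print(derivation_length(RULES, START, "aabb"))
```

for the string aabb (n = 4) the search finds a derivation of 7 = 2·4 − 1 steps,
and for ab (n = 1 pair, length 2) one of 3 steps.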
the chomsky normal form is named after noam chomsky, the us linguist who invented
the chomsky hierarchy.
alternative definition
some sources define chomsky normal form in the following slightly different way:
a formal grammar is in chomsky normal form if and only if all production rules are
of the form:
A → BC or
A → a
where A, B and C are nonterminal symbols, and a is a terminal symbol. when using
this definition, B or C may be the start symbol.
this definition differs from the previous one in that it precludes the possibility
that the grammar will generate the empty string, ε. it remains true that any
context-free grammar generating a language L can be efficiently transformed into a
grammar in chomsky normal form that generates L − {ε}. the principal advantage of
this latter definition is that proofs are generally marginally simpler, due to the
fact that no step in a derivation decreases the length of the resulting string. its
disadvantage, of course, is that special consideration is needed if the original
grammar generates ε.
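
one well-known use of chomsky normal form is the cyk (cocke-younger-kasami)
recognition algorithm, and it also shows where the special consideration for ε
comes in. the python sketch below is a minimal illustration under assumptions: the
toy grammar, its dictionary encoding and the generates_empty flag are invented for
this example. the grammar follows the alternative definition, so the start symbol
appears on a right-hand side.

```python
# Hypothetical grammar for { a^n b^n : n >= 1 } under the alternative definition:
# S -> AB | AT,  T -> SB,  A -> a,  B -> b
RULES = {
    "S": [("A", "B"), ("A", "T")],
    "T": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}


def cyk_accepts(rules, start, word, generates_empty=False):
    """CYK recognition; the empty string is handled outside the table."""
    n = len(word)
    if n == 0:
        # The rules cannot derive the empty string, so the caller has to say
        # whether the original language contained it.
        return generates_empty

    # table[length - 1][i] holds the nonterminals deriving word[i:i + length]
    table = [[set() for _ in range(n)] for _ in range(n)]

    for i, ch in enumerate(word):
        for head, bodies in rules.items():
            if (ch,) in bodies:
                table[0][i].add(head)

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[split - 1][i]
                right = table[length - split - 1][i + split]
                for head, bodies in rules.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[length - 1][i].add(head)

    return start in table[n - 1][0]


print(cyk_accepts(RULES, "S", "aabb"))
print(cyk_accepts(RULES, "S", "aab"))
```

the table is filled from short substrings to long ones, which only works because
every rule either produces a single terminal or splits a substring into two
nonempty parts.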
--------------------------------
the syntax definitions on this site use a variant of backus normal form (bnf) that
includes the following:
where there are several optional items and you can choose any, several or all of
them, these are shown in a list, all indented to the same degree:
where there are several optional items and you can choose only one from
the list, they are indented and separated by a blank line
option
option
where the layout makes it possible, braces and the pipe symbol are the preferred
method to show alternative options.
this syntax is not always followed rigorously; e.g. on a page where almost
everything is a variable, some variables may be left as lower case but not italic
to improve readability.
similarly, option [,option] [,...] may be abbreviated to just option ,... in order
to avoid having a surfeit of brackets around the more complex commands.
bold text is used to highlight some key items but does not have any syntactic
importance.
backus–naur form
from wikipedia, the free encyclopedia
the backus–naur form (also known as bnf, the backus–naur formalism, backus normal
form, or panini–backus form) is a metasyntax used to express context-free
grammars: that is, a formal way to describe formal languages.
history
john backus created the notation in order to express the grammar of algol. at the
first world computer congress, which took place in paris in 1959, backus presented
"the syntax and semantics of the proposed international algebraic language of the
zurich acm-gamm conference", a formal description of the ial which was later
called algol 58. the formal language he presented was based on emil post's
production system. generative grammars were an active subject of mathematical
study, e.g. by noam chomsky, who was applying them to the grammar of natural
language.[1] [2]
peter naur later simplified backus's notation to minimize the character set used,
and, at the suggestion of donald knuth[3], his name was added in recognition of
his contribution.
introduction
note that many things (such as the format of a first-name, apartment specifier, or
zip-code) are left unspecified here. if necessary, they may be described using
additional bnf rules.
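
as a rough illustration of what leaving symbols unspecified means, the python
sketch below parses a small address-style bnf fragment and lists the nonterminals
that are used but never defined. the fragment, its rule names and the helpers
parse_bnf and undefined_symbols are assumptions made for this example; they are
not taken from any particular specification.

```python
# Hypothetical BNF fragment; symbols such as <first-name> have no rule of their own.
BNF = """
<postal-address> ::= <name-part> <street-address> <zip-part>
<name-part>      ::= <first-name> <last-name> <EOL>
<street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
<zip-part>       ::= <town-name> "," <state-code> <zip-code> <EOL>
<opt-apt-num>    ::= <apt-num> | ""
"""


def parse_bnf(text):
    """Parse one-line rules of the form '<head> ::= alt | alt' into a dict.

    Assumes no literal contains '|' or whitespace.
    """
    rules = {}
    for line in text.strip().splitlines():
        head, body = line.split("::=")
        rules[head.strip()] = [alt.split() for alt in body.split("|")]
    return rules


def undefined_symbols(rules):
    """Return the sorted nonterminals that appear in a body but have no rule."""
    used = {
        sym
        for alternatives in rules.values()
        for alt in alternatives
        for sym in alt
        if sym.startswith("<") and sym.endswith(">")
    }
    return sorted(used - set(rules))


rules = parse_bnf(BNF)
print(undefined_symbols(rules))
```

each name in the printed list would need an additional rule before the grammar
could be used to generate or check complete strings.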
variants
there are many variants and extensions of bnf, generally either for the sake of
simplicity and succinctness, or to adapt it to a specific application. one common
feature of many variants is the use of regexp repetition operators such as * and
+. the extended backus-naur form (ebnf) is a common one. in fact the example above
is not the pure form invented for the algol 60 report. the bracket notation "[]"
was introduced a few years later in ibm's pl/i definition but is now universally
recognised. abnf is another extension commonly used to describe ietf protocols.
parsing expression grammars build on the bnf and regular expression notations to
form an alternative class of formal grammar, which is essentially analytic rather
than generative in character.
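
the repetition operators add convenience rather than expressive power: a symbol
followed by * or + can be replaced by a fresh nonterminal with a recursive plain
bnf rule. the python sketch below shows one way to do this; the list-of-symbols
encoding, the naming scheme for the fresh nonterminals and the function name
desugar_repetition are assumptions made for this example.

```python
def desugar_repetition(rules):
    """Rewrite symbols ending in '*' or '+' into plain recursive rules.

    rules maps a nonterminal to a list of alternatives, each a list of symbols.
    Assumes the generated names (e.g. 'digit_plus') are not already in use and
    that a bare '*' or '+' is an ordinary terminal.
    """
    out = {}

    def fresh(sym):
        base, op = sym[:-1], sym[-1]
        name = f"{base}_{'star' if op == '*' else 'plus'}"
        if name not in out:
            if op == "*":
                out[name] = [[], [base, name]]      # X_star ::= empty | X X_star
            else:
                out[name] = [[base], [base, name]]  # X_plus ::= X | X X_plus
        return name

    for head, alternatives in rules.items():
        out[head] = [
            [fresh(s) if len(s) > 1 and s[-1] in "*+" else s for s in alt]
            for alt in alternatives
        ]
    return out


ebnf_like = {
    "number": [["digit+"]],
    "args": [["arg", "more*"]],
}
for head, alternatives in desugar_repetition(ebnf_like).items():
    print(head, "::=", alternatives)
```

an empty alternative stands for the empty string, so the starred form may match
nothing while the plus form requires at least one occurrence.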
many bnf specifications found online today are intended to be human readable and
are non-formal. these often include many of the following syntax rules and
extensions:
---------------------------------------------
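a context-free grammar is in greibach normal form (gnf) if all production rules are
of the form:
A → αX
or
S → ε
where A is a nonterminal symbol, α is a terminal symbol, X is a (possibly empty)
sequence of nonterminal symbols, S is the start symbol, and ε is the empty string.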
given a grammar in gnf and a derivable string in the grammar with length n, any
top-down parser will halt at depth n.
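
the depth bound follows because every rule application consumes exactly one
terminal of the input. the python sketch below illustrates this with a simple
backtracking top-down recognizer; the toy grammar, its encoding as (terminal,
nonterminal tuple) pairs and the function name gnf_accepts are assumptions made
for this example.

```python
# Hypothetical GNF grammar for { a^n b^n : n >= 1 }:
# S -> aB | aTB,  T -> aB | aTB,  B -> b
GNF_RULES = {
    "S": [("a", ("B",)), ("a", ("T", "B"))],
    "T": [("a", ("B",)), ("a", ("T", "B"))],
    "B": [("b", ())],
}


def gnf_accepts(rules, start, word):
    """Return (accepted, deepest rule-application depth reached)."""
    max_depth = 0

    def expand(pos, stack, depth):
        nonlocal max_depth
        max_depth = max(max_depth, depth)
        if not stack:                       # nothing left to expand
            return pos == len(word)
        # Every pending nonterminal must still produce at least one terminal.
        if len(stack) > len(word) - pos:
            return False
        head, rest = stack[0], stack[1:]
        for terminal, tail in rules[head]:
            # Each rule application consumes exactly one input symbol,
            # so depth always equals pos and can never exceed len(word).
            if word[pos] == terminal and expand(pos + 1, tail + rest, depth + 1):
                return True
        return False

    accepted = expand(0, (start,), 0)
    return accepted, max_depth


print(gnf_accepts(GNF_RULES, "S", "aabb"))
```

for an accepted string of length n the recursion bottoms out at depth n, and for a
rejected string it never goes deeper than n.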
---------------------------------