This chapter surveys previous work in structured text processing. A common theme in much of this work is a choice between two approaches: the syntactic approach and the lexical approach.
In general terms, the syntactic approach uses a formal, hierarchical definition of text structure, invariably some form of grammar. Syntactic systems generally parse the text into a tree of elements, called a syntax tree, and then search, edit, or otherwise manipulate the tree.
The lexical approach, on the other hand, is less formal and rarely hierarchical. Instead of parsing text into a hierarchical tree, lexical systems treat the text as a sequence of flat segments.
The syntactic approach is generally more expressive, since grammars can capture aspects of hierarchical text structure, particularly arbitrary nesting, that the lexical approach cannot. However, the lexical approach is generally better at handling structure that is only partially described. Other differences between the two approaches will be seen in the sections below.
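The trade-off can be made concrete with a small sketch. It is an illustration only and is not drawn from any of the systems surveyed in this chapter. The toy grammar of words and parenthesized groups, and the function names parse and innermost_groups, are assumptions made for the example. The first function takes the syntactic route and builds a nested tree. The second takes the lexical route and uses a regular expression to pick out flat segments.

```python
import re


def parse(text):
    """Syntactic sketch: parse words and balanced parentheses into a tree.

    Assumed toy grammar:  expr := (WORD | '(' expr ')')*
    The tree is returned as nested Python lists.
    """
    tokens = re.findall(r"[()]|[^()\s]+", text)
    pos = 0

    def expr():
        nonlocal pos
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            tok = tokens[pos]
            pos += 1
            if tok == "(":
                # Recursion is what lets the grammar follow arbitrary nesting.
                items.append(expr())
                if pos >= len(tokens) or tokens[pos] != ")":
                    raise SyntaxError("missing ')'")
                pos += 1
            else:
                items.append(tok)
        return items

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected ')'")
    return tree


def innermost_groups(text):
    """Lexical sketch: a regular expression returns flat segments, not a tree.

    The pattern matches only groups that contain no further parentheses,
    so it sees the innermost level and nothing about the enclosing levels.
    """
    return re.findall(r"\(([^()]*)\)", text)


well_formed = "a (b (c d)) e"
print(parse(well_formed))             # nested tree
print(innermost_groups(well_formed))  # flat list of segments

# Text that the grammar only partially describes (one ')' is missing).
broken = "a (b (c d) e"
print(innermost_groups(broken))       # the pattern still finds what it can
try:
    parse(broken)
except SyntaxError as err:
    print("parse failed:", err)
```

On the well-formed input the parser recovers every level of nesting, while the pattern reports only the innermost group. On the input with a missing parenthesis the parser rejects the whole text, while the pattern still returns the segment it recognizes.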