Cip Gerst EUROLAN2007
Ciprian-Virgil Gerstenberger
Computational Linguistics
University of Saarland, Germany
gerstenb@coli.uni-sb.de
Abstract
6 Examples
This section exemplifies how the modules of the proposed architecture interact, using some phenomena described in the previous sections.
Figure 1: Forming complex SLOPs
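The string-level finishing steps that the examples in this section rely on (joining a clitic to its host, capitalizing the first word form, applying punctuation) can be sketched in Python. This is only an illustrative sketch, not the implementation behind the paper: the function names `attach_variants`, `polish` and `realize`, the table `CLITIC_REDUCED`, and the flat word-list representation are all assumptions introduced here. The sketch covers the Romanian weak-pronoun case, where attachment is optional before the verb and obligatory after it; the phonological check that may reject a Polish variant is left out.

```python
# Hypothetical sketch: every name and the data layout are assumptions
# made for illustration, not part of the architecture described in the paper.

# Reduced form a weak pronoun takes when it is joined to a host word.
CLITIC_REDUCED = {"îl": "l"}


def attach_variants(words, clitic_index, obligatory):
    """Join the clitic at clitic_index to the word preceding it.

    Returns a list of word lists: only the joined variant if attachment
    is obligatory, otherwise the unattached and the joined variant.
    """
    host = words[clitic_index - 1]
    reduced = CLITIC_REDUCED[words[clitic_index]]
    joined = words[:clitic_index - 1] + [host + "-" + reduced] + words[clitic_index + 1:]
    if obligatory:
        return [joined]
    return [list(words), joined]


def polish(words, punctuation="!"):
    """Capitalize the first word form and append the punctuation mark."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + punctuation


def realize(words, verb="faceţi", clitic="îl"):
    """Produce all surface variants for an inflected, linearized word list."""
    clitic_index = words.index(clitic)
    # Attachment is obligatory only when the clitic follows the verb.
    postverbal = clitic_index > words.index(verb)
    variants = attach_variants(words, clitic_index, obligatory=postverbal)
    return [polish(variant) for variant in variants]


if __name__ == "__main__":
    print(realize(["să", "îl", "faceţi"]))
    print(realize(["faceţi", "îl"]))
```

Keeping the attachment decision separate from capitalization and punctuation mirrors the modular ordering of steps in the text, so a language-specific filter (for instance a phonological check) could be slotted in between without touching the other steps.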
Polish. Let us consider the Polish PN-marker described in ex. 9–10. For the given input, the PO-rules form complex SLOPs as in figure 1. The LO-rules for Polish order simple and complex SLOPs according to grammar restrictions as in figure 2. After the inflection morphology step, the final string is processed accordingly: the first word form is capitalized; the PN-marker is joined to the preceding word if this action does not violate the phonological constraints³ imposed by the language (see (Kupść and Tseng, 2005)), and otherwise the current realization variant is rejected; then, punctuation rules are applied.

Figure 2: Final results

Romanian. Given as input SLOPs for realizing a Romanian this-NP, a linearization module imposes no restriction on the positions of demonstrative and noun: both variants are possible (see ex. 1 and 5). Based on the morpho-syntactic specification and the relative position, the value for definiteness is assigned accordingly: for the constellation [DEM, def=minus] ≺ [OM, def=minus], the inflection morphology generates acest om, while for [OM, def=plus] ≺ [DEM, def=plus], it generates omul acesta.

Unlike (Minnen et al., 2000), the surface architecture we propose allows for different variants at the level of morphology. For Romanian weak pronouns (see ex. 12–14), the result of linearization and inflection is să îl faceţi and faceţi îl. Based on the optional weak pronoun postclitization to the subjunction să in preverbal position, the text polishing module generates both Să îl faceţi! and Să-l faceţi!; based on the obligatory weak pronoun postclitization to the verb in postverbal position, only the variant Faceţi-l! is generated.

³ The proposed architecture allows for a modular application of phonological rules that can model even such phenomena as yer vocalisation (see (Crysmann, 2006) for details). Given the constellation mógł+m, phonological rules at this level yield the correct verb form mogłem.

7 Related work

One of the most important concepts of GLM is the use of mereological structure as “generalized constituents”. Dissociating linear order from constituency is not new (see (Reape, 1994), (Pollard et al., 1994), (Goetz and Penn, 1997)), but most of these models are concerned mainly with analysis, and as for generation, many questions are left open. However, even models dedicated mostly to generation employ vague concepts of what the linearization primitives are, how exactly to form complex linearization units, or what to do after the linearization step to obtain the correct surface realization. An example of such a model is (Bohnet, 2004), which is very similar to our model with respect to dependency trees as input structures and to how linearization rules are expressed. However, there are basic differences with respect to both primitive and complex linearization items. While our model proposes mereological-based units that abide by TPC/PPC, (Bohnet, 2004)’s “[p]recedence units roughly represent constituent structures.” Moreover, (Bohnet, 2004) uses only two kinds of LO-relating rules (vertical and horizontal), possibly failing to properly constrain extraposed relative clauses in German (see ex. 23–24). As for the realization steps after linearization, (Bohnet, 2004) appears not to be concerned with them at all.

Topological Hierarchy. In Proceedings of the Association for Computational Linguistics, pages 220–227, Toulouse.

Thilo Goetz and Gerald Penn. 1997. A proposed linear specification language. Technical Report SFB 340, Nr. 134, University Tübingen.