10th International Congress of the German Association for Semiotic Studies (DGS), University of Kassel (Germany), July 19-21, 2002; Section Semiotics and the Computer

A Semiotic Perspective on Natural Language Understanding

Inger Lytje
Aalborg University, Denmark

Abstract

Within computational linguistics, methodological and philosophical issues are rarely discussed. Responding to this situation, I want to propose that traces of methodological considerations and linguistic theory can be found in the software products that implement computational models of linguistic phenomena. Furthermore, a notion of the computer emerges in computer models of natural language understanding, and in this connection I want to suggest that the computer is conceived of as language, or rather as a multiplicity of languages (including programming language, database language, interface language, and knowledge representation language), which, in turn, implement different kinds of signs. I also want to suggest that computer models and language engineering products be interactive, as opposed to automatic procedures. Interactivity might enhance the quality of language engineering products, and it provides the researcher with strong tools for linguistic enquiry. Regarding the computer as interactive multimedia rather than only as a mathematical engine provides for a richer notion of formalisation and implementation of linguistic knowledge. The precedence which generative grammar has had in computational linguistics may come from an understanding of the computer as a mathematical engine. Other aspects, such as seeing the computer as structured persistent memory and interactive multimedia, have, to a certain extent, overruled this notion of the computer. It is along this line of argument that I will present my work on cognitive grammar as a framework for computer modelling of text comprehension. I will suggest that language be seen as interaction between text and grammar; cognitive grammar (according to Langacker) explains and describes this interaction in a specific way. A text corpus represents a language, and so does a grammar.
The linguistic symbol, which may be simple or complex and which unifies phonological form and cognitive semantic content, makes up the basic analytical unit in a cognitive grammar. The lexicon is seen as a database of general word meanings, which are specified through interaction with other linguistic units in various kinds of expressions. The dynamics of a grammar is motivated by valence, which explains the inclination of specific linguistic categories, such as the category action-verb, to merge with other linguistic categories. Computer-based descriptions result in lexical databases, taggers, parsing algorithms, schemas for semantic representation, etc. These are all elements in systems for processing natural language text, and one of the major contributions of my paper is to describe how these linguistic information resources may be organized within the overall framework of cognitive grammar and how they interact with the human user in linguistic enquiry and in software products for different purposes such as language acquisition, information retrieval and automatic translation.
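
As an informal illustration of the ideas above, the following Python sketch shows one way a linguistic symbol (form paired with semantic content), a lexicon treated as a database, and valence-driven merging might be represented. It is a toy under stated assumptions, not the system described in the paper and not Langacker's formalism: the names `Symbol`, `LEXICON` and `merge`, the three sample entries, the subject-verb-object ordering and the result category "clause" are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Symbol:
    """A linguistic symbol: a form paired with semantic content."""
    form: str             # stands in for the phonological form
    meaning: str          # stands in for the cognitive semantic content
    category: str         # e.g. "noun", "action-verb"
    valence: tuple = ()   # categories this symbol is inclined to merge with
    parts: tuple = ()     # component symbols, empty for simple symbols


# Toy lexicon: general word meanings, keyed by form (hypothetical entries).
LEXICON = {
    "dog": Symbol("dog", "DOG", "noun"),
    "ball": Symbol("ball", "BALL", "noun"),
    "chases": Symbol("chases", "CHASE", "action-verb",
                     valence=("noun", "noun")),
}


def merge(head: Symbol, *args: Symbol) -> Optional[Symbol]:
    """Build a complex symbol if the arguments satisfy the head's valence.

    Returns None when the head has no valence or the argument categories
    do not match it exactly.
    """
    if not head.valence or tuple(a.category for a in args) != head.valence:
        return None
    first, rest = args[0], args[1:]
    # Assumed ordering: first argument, head, remaining arguments.
    form = " ".join([first.form, head.form, *(a.form for a in rest)])
    # The general meaning of the head is specified by its arguments.
    meaning = f"{head.meaning}({', '.join(a.meaning for a in args)})"
    return Symbol(form, meaning, "clause", parts=(head, *args))


clause = merge(LEXICON["chases"], LEXICON["dog"], LEXICON["ball"])
print(clause.form, "=>", clause.meaning)
```

In this sketch the general meaning stored for "chases" only becomes specific once it merges with two nouns, which loosely mirrors the claim that word meanings are specified through interaction with other linguistic units. An interactive tool could, for instance, ask the user to choose among candidate merges instead of selecting one automatically.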