Language Generation

(Article 86)

Eduard Hovy
Information Sciences Institute
of the University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292-6695
tel: +1-310-448-8731
fax: +1-310-823-6714
email: hovy@isi.edu

Natural Language Generation (NLG) is the enterprise of automatically producing
text (in English or any other human language) from computer-internal
representations of information. Usually, the process involves three stages:
macroplanning (of content), microplanning (of form), and realisation (of sentences).
Keywords: language processing, generation, realisation, text planning, speech synthesis

1. Introduction
Natural Language Generation (NLG) is the enterprise of automatically producing text (in English
or any other human language) from computer-internal representations of information.
To the extent the computer-internal information can be directly converted into human language,
the NLG problem is simple. But most computer programs that require language output (such host
programs include expert systems, Artificial Intelligence reasoning systems, help information
systems, etc.) format the information to support their own needs. As a result, it is rare for each
individual representation item to correspond exactly to a unique word in the human language, or
for each grouping of information to correspond exactly to a sentence. It is therefore the NLG system’s task to
identify in detail which aspects of the information need to be said, to regroup and reformat that
information into sets that correspond to paragraphs and sentences, to select appropriate phrases
and words, and then to assemble them all into grammatical output.
Originally, the NLG process was decomposed into two stages: what should I say? and how
should I say it? In the past ten years, a third stage has been identified (Reiter, 1994). We discuss
each of the three stages—macroplanning, microplanning, and sentence realisation—in turn.
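In outline, the three-stage pipeline can be sketched in code. The sketch below is an illustrative assumption only, not the design of any actual generator: the knowledge-base format, frame structure, and function names are all invented for exposition.

```python
# A minimal sketch of the three-stage NLG pipeline described above:
# macroplanning -> microplanning -> realisation.
# All data structures and names here are illustrative assumptions.

def macroplan(goal, knowledge_base):
    """Content choice: select the items the communicative goal asks for."""
    return [fact for fact in knowledge_base if fact["about"] == goal["topic"]]

def microplan(facts):
    """Form choice: regroup the chosen items into one frame per sentence."""
    return [{"subject": f["subject"], "verb": f["verb"], "object": f["object"]}
            for f in facts]

def realise(frames):
    """Sentence production: turn each frame into a grammatical string."""
    return " ".join(f"{f['subject']} {f['verb']} {f['object']}." for f in frames)

kb = [
    {"about": "party", "subject": "Quintus", "verb": "left", "object": "the party"},
    {"about": "party", "subject": "He", "verb": "drove to", "object": "the city"},
    {"about": "weather", "subject": "It", "verb": "rained", "object": "all day"},
]
print(realise(microplan(macroplan({"topic": "party"}, kb))))
# -> Quintus left the party. He drove to the city.
```

Real systems are of course far richer at every stage, but the division of labour is the same: selection, regrouping, and wording happen in that order.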

2. Macroplanning (What to say)
Stage 1. Macroplanning (also called text planning) deals with the problem of content choice and
organisation. Generally, the macroplanner’s input is one or more communicative goals
(instructions to Explain some state, Describe some object, Relate some events, etc.). These states, objects, events, etc., are defined in the host system’s knowledge base or database.

The macroplanner’s output is a structure, usually a tree, in which the leaves are sentence-sized groupings of representation items and the internal nodes are inter-sentence relations (such as Cause, Sequence, Background). This tree constitutes the so-called rhetorical organisation of the text, and its function is to ensure coherence, namely that the reader will understand not only each sentence individually but also how all the sentences together compose to a sensible whole.

In order to produce this rhetorical organisation, the macroplanner requires a set of templates describing how (and in what order) the host system’s representation elements can be assembled. Primarily two types of templates have been studied: Schemas (McKeown, 1985; Paris, 1993) and Relation/Plans (Mann and Thompson, 1988; Hovy, 1993; Moore, 1989). Schemas are generally paragraph-sized structures that capture the sequence of topics in stereotypical paragraphs, such as Descriptions in encyclopedias, Comparisons of objects, etc. Relation/Plans generally govern how only two elements relate as Nucleus (the primary element) and Satellite (the secondary one) in various ways (the inter-sentence relations mentioned above). Both Schemas and Relation/Plans can be nested by the macroplanner to form the tree. Macroplanning has been influenced by various theories of text linguistics, including Rhetorical Structure Theory (Mann and Thompson, 1988).

3. Microplanning (How to say it)
Stage 2. Microplanning (also called sentence planning) deals with the sentence-sized groupings of information in the leaves of the text structure tree. Microplanning is required because these groupings seldom include the linguistic information required to produce language. The output of a microplanner is generally a list of frames, each frame specifying the content and organisation of a single sentence.

Microplanning involves a set of semi-independent issues. Sentence scoping decides whether and how to amalgamate two leaves into a single sentence (as in “He left the party because there was no more food” versus “There was no more food at the party. So he left”). Pronominalisation decides whether to use a full referent (as in “Quintus Smith left Mary’s party”) or a pronoun (as in “He left her party” or even “He left it”). Theme and focus control decides how to order the constituents of a sentence (as in “Yesterday, at 3 in the morning, Quintus left the party” or “At 3 in the morning yesterday, Quintus left the party” or “Quintus left the party yesterday at 3 in the morning” or even “The party Quintus left at 3 in the morning yesterday”); while these sentences sound equivalent in isolation, the wrong choice in context can be jarring. Aggregation decides whether to remove repetitious material (transforming “Quintus left the party. Quintus drove to the city. Quintus went to a play” into “Quintus left the party, drove to the city, and went to a play”, or even transforming “He worked Monday, Tuesday, Wednesday, Thursday, and Friday” to “He worked all week”). Several other issues have also been identified and studied, especially in the recent proceedings of the International Natural Language Generation Workshops and Conferences (see Bibliography).

Because of its relative recency and internal complexity, microplanning is less articulated than the other two stages. No general-purpose microplanners have ever been built, although attempts have been made, and aspects of the problem are discussed in (Appelt, 1985; Hovy, 1988; Rambow and Korelsky, 1992) and in numerous papers since then. Despite Meteer’s (1990) development of an expressive notation to support microplanning, nobody has developed an adequate architecture in which the partial decisions made for one purpose (say, theme control) can be acknowledged or overruled by another (say, pronominalisation).
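As a concrete illustration, the aggregation decision can be sketched as a small transformation over sentence frames. The frame format below is an invented simplification, not the notation of any actual microplanner:

```python
# Toy sketch of the microplanning "aggregation" decision: merge adjacent
# sentence frames that share a subject into one frame with a conjoined
# predicate. The frame representation is invented for illustration.

def aggregate(frames):
    """Merge consecutive frames that have the same subject."""
    merged = []
    for f in frames:
        if merged and merged[-1]["subject"] == f["subject"]:
            merged[-1]["predicates"].extend(f["predicates"])
        else:
            merged.append({"subject": f["subject"],
                           "predicates": list(f["predicates"])})
    return merged

def render(frames):
    """Flatten each frame into a sentence, conjoining its predicates."""
    out = []
    for f in frames:
        preds = f["predicates"]
        if len(preds) == 1:
            vp = preds[0]
        else:
            sep = ", and " if len(preds) > 2 else " and "
            vp = ", ".join(preds[:-1]) + sep + preds[-1]
        out.append(f"{f['subject']} {vp}.")
    return " ".join(out)

frames = [
    {"subject": "Quintus", "predicates": ["left the party"]},
    {"subject": "Quintus", "predicates": ["drove to the city"]},
    {"subject": "Quintus", "predicates": ["went to a play"]},
]
print(render(frames))             # unaggregated: three repetitive sentences
print(render(aggregate(frames)))
# -> Quintus left the party, drove to the city, and went to a play.
```

A real microplanner would make this decision jointly with pronominalisation, scoping, and theme control, which is exactly the architectural difficulty noted above.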

4. Realisation (How to say it)
Stage 3. Realisation (also known as sentence generation) deals with the production of sentences in a human language. The realizer converts each input frame of information into grammatical strings of words. To ensure grammaticality, the realizer requires a grammar of the language (the rules that specify word order and word inflection for tense, aspect, number, etc.) as well as a lexicon (the words of the language, in all their inflected forms, with all their multi-word correspondences).

Several quite different grammars and realisation procedures have been developed. Four have been most influential.

1. The phrasal expansion approach is the simplest. Simple generators that produce preformatted (“canned”) text and that fill in one or two blanks are common in computer programs. More sophisticated variants express the grammar as templates (a very natural formulation of phrase structure rules), and occasionally also use templates for the multi-word requirements of individual words in the lexicon. In essence, a template is built for each possible pattern of expression, and the realizer simply fills each slot in the initial template either with material from the input frame or with another template, which specifies more detail, incrementally assembling the final sentence; see Mumble (Meteer et al., 1987).

2. The Penman/KPML system (Matthiessen and Bateman, 1991) represents the grammar in a large network of interrelated options, which the system then traverses under guidance of the input frame. This approach uses the grammar and ideas of Systemic Functional Linguistics, which have played an important role in NLG.

3. The FUF/SURGE unification mechanism (Elhadad, 1992) represents input, grammar, and lexicon all in the same notation, and then uses unification to merge together all the appropriate information and produce a full sentence specification. In this approach the user is free to develop his or her grammar.

4. Recently, statistical techniques have brought a completely new approach to realisation. Nitrogen (Langkilde and Knight, 1998) employs the overgenerate-and-prune strategy that has been successful in automated speech recognition and other natural language applications. Nitrogen starts by using a weak phrase structure grammar to generate literally millions of possible sentences (not all of them wholly grammatical) corresponding to the input frame. These alternatives, packed into a lattice (a data structure that stores shared words once only), are then handed to the ranker, which finds a single path through the lattice by choosing the most fluent sequence at each point, using a local (three-word) window. To determine fluency it consults a table of hundreds of millions of three-word phrases called trigrams, which was built from several years of Wall Street Journal newspaper text. Other statistically trained methods are beginning to appear as well.

5. Lexical Selection
Two further issues require mention. The problem of lexical selection appears at all levels of the generation process. Generally, the closer the host system’s representation items mirror words in the language being generated, the more natural it is to establish a simple one-to-one correspondence. This however requires that closely related representation items (such as “smoke”, “eat”, “drink”, and “breathe”, or “take”, “buy”, “sell”, “trade”, and “swap”) all have different symbols in the host system’s formalism, a step that systems generally prefer not to take. A solution is to have the NL generator’s lexical chooser select the appropriate sense of each word using sets of questions called discrimination trees (Goldman, 1974) or some other mechanism.

A secondary problem concerns where in the process to perform lexical selection. Some words (typically verbs) have to be chosen fairly early, because they help determine the content of the information groupings in the rhetorical tree. Other words and multi-word phrases should be chosen during microplanning, because they help determine the general forms of expression used in the sentences. Other words (adjectives, etc.) can generally be chosen relatively late. There is however no fixed rule about when each type of word should be chosen, and there is no clear understanding of how words co-constrain each other. Even simple cases such as “strong tea” (and not “powerful tea”) require more knowledge than a simple template filling mechanism usually contains.
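A discrimination tree of the kind Goldman described can be sketched as a cascade of questions about the representation item, each answer either narrowing the choice or yielding a word. The feature names and values below are invented for illustration and are not Goldman’s actual notation:

```python
# Toy discrimination tree, in the spirit of Goldman (1974), for choosing
# an English verb for a single generic "ingest" representation item.
# The feature names ("medium", "form") are illustrative assumptions.

def choose_ingest_verb(item):
    """Ask successive questions about the item until a word is reached."""
    if item.get("medium") == "gas":
        return "breathe"
    if item.get("medium") == "liquid":
        return "drink"
    if item.get("form") == "smoke":
        return "smoke"
    return "eat"          # default: solid food

print(choose_ingest_verb({"medium": "liquid"}))  # -> drink
print(choose_ingest_verb({"form": "smoke"}))     # -> smoke
```

The benefit, as noted above, is that the host system can keep a single generic symbol while the generator recovers the lexical distinctions at output time.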

6. Pragmatic Guidance and Style
The second problem is usually finessed by all but the most expressive generators. Any system that can say the ‘same’ thing in various ways requires criteria by which to choose: why should I say it this way? According to many theories of language, the expressive alternatives are not truly equivalent. That is, they usually differ in interpersonal, affective, situational, or other pragmatic connotation. Such expressive systems have to produce conversationally appropriate text (not sound formal at a party or informal at a funeral, for example, or say “terrorist” or “freedom fighter” instead of “guerrilla” to the wrong person). To do so, they require knowledge about the hearer/reader, the conversational setting, and the speaker’s pragmatic intentions, as well as rules embodying style to guide the generator’s choices (content, topical organisation, sentence-level expression, local phrasing, lexical selection, etc.). Systems addressing this problem include Erma (Clippinger, 1974) and Pauline (Hovy, 1988).

References
Appelt, D.E. 1985. Planning English Sentences. Cambridge: Cambridge University Press.
Clippinger, J.H. 1974. A discourse speaking program as a preliminary theory of discourse behavior and a limited theory of psychoanalytic discourse. Ph.D. dissertation, University of Pennsylvania.
Elhadad, M. 1992. Using argumentation to control lexical choice: A functional unification-based approach. Ph.D. dissertation, Columbia University.
Goldman, N.M. 1974. Computer generation of natural language from a deep conceptual base. Ph.D. dissertation, Stanford University. Also in: R.C. Schank (ed), Conceptual Information Processing. Amsterdam: Elsevier North-Holland, 1975.
Hovy, E.H. 1988. Generating Natural Language under Pragmatic Constraints. Hillsdale, NJ: Lawrence Erlbaum Associates.
Hovy, E.H. 1993. Automated discourse generation using discourse structure relations. Artificial Intelligence (Special Issue on Natural Language Processing) 63(1–2): 341–386.
Langkilde, I. and Knight, K. 1998. Generation that exploits corpus-based statistical knowledge. In Proceedings of the Conference of the Association for Computational Linguistics (COLING/ACL), pp. 704–710.
Mann, W.C. and Thompson, S.A. 1988. Rhetorical Structure Theory: Toward a functional theory of text organization. Text 8: 243–281. Also available as USC/Information Sciences Institute Research Report RR-87-190.
Matthiessen, C.M.I.M. and Bateman, J.A. 1991. Text Generation and Systemic-Functional Linguistics. London: Francis Pinter.

McKeown, K.R. 1985. Text Generation: Using Discourse Strategies and Focus Constraints to Generate Natural Language Text. Cambridge: Cambridge University Press.
Meteer, M.W. 1990. The ‘generation gap’: The problem of expressibility in text planning. Ph.D. dissertation, University of Massachusetts at Amherst. Available as BBN Technical Report 7347.
Meteer, M.W., McDonald, D.D., Anderson, S.D., Forster, D., Gay, L.S., Huettner, A.K., and Sibun, P. 1987. Mumble-86: Design and Implementation. COINS Technical Report 87-87, University of Massachusetts at Amherst.
Moore, J.D. 1989. A reactive approach to explanation in expert and advice-giving systems. Ph.D. dissertation, University of California in Los Angeles.
Paris, C.L. 1993. The Use of Explicit Models in Text Generation. London: Francis Pinter.
Rambow, O. and Korelsky, T. 1992. Applied Text Generation. In Proceedings of the Applied Natural Language Processing Conference, pp. 40–44.
Reiter, E. 1994. Has a consensus NL generation architecture appeared, and is it psychologically plausible? In Proceedings of the 7th International Workshop on Natural Language Generation, pp. 163–170.

Bibliography
Overview papers:
• Bateman, J.A. and Hovy, E.H. 1992. An overview of computational text generation. In C. Butler (ed), Computers and Texts: An Applied Perspective. Oxford, England: Basil Blackwell, pp. 53–74.
• De Smedt, K., Horacek, H., and Zock, M. 1996. Architectures for Natural Language Generation: Problems and Perspectives. In G. Adorni and M. Zock (eds), Trends in Natural Language Generation: An Artificial Intelligence Perspective. Heidelberg, Germany: Springer Verlag Lecture Notes in AI number 1036, pp. 17–46.
Edited collections of papers, mostly from biennial workshops:
• Adorni, G. and Zock, M. (eds) 1996. Trends in Natural Language Generation: An Artificial Intelligence Perspective. Heidelberg, Germany: Springer Verlag Lecture Notes in AI number 1036.
• Dale, R., Hovy, E.H., Rösner, D., and Stock, O. (eds) 1992. Aspects of Automated Natural Language Generation. Heidelberg: Springer Verlag Lecture Notes in AI number 587.
• Elhadad, M. and Dahan Netzer, Y. (eds) 2000. Proceedings of the First International Conference on Natural Language Generation. Distributed by the Association for Computational Linguistics.
• Hovy, E.H. (ed) 1998. Proceedings of the Ninth International Workshop on Natural Language Generation. Distributed by the Association for Computational Linguistics.
• Paris, C.L., Swartout, W.R., and Mann, W.C. (eds) 1990. Natural Language Generation in Artificial Intelligence and Computational Linguistics. Boston: Kluwer Academic Publishers.
Textbook for university-level students:
• Reiter, E. and Dale, R. 2000. Building Natural Language Generation Systems. Cambridge: Cambridge University Press.

See also the website of the Special Interest Group on Generation, SIGGEN: http://www.dynamicmultimedia.com.au/siggen/

Glossary
Grammar: The rules that specify the order and inflection of words in sentences
Inflection: The change of a word that signals number (plural), tense (past or future, etc.), and so on
Lexical selection: The process of choosing the appropriate word for one or more items of content represented in the system's internal notation
Macroplanning: The first stage of text generation--the process of selecting from the knowledge base what material should be said, and planning the overall structure of the text
Microplanning: The second stage of text generation--the process of specifying details about sentence length, content, and internal organisation
Pragmatics: The aspects of language that are neither grammatical nor meaning-based (semantic), including the effects of text on the emotions of the hearer/reader, on the relationship between the speaker and the hearer/reader, and so on
Realisation: The third stage of text generation--the process of producing a sentence as a string of words, using a grammar and lexicon
Representation (item): A symbol used by a computer program to stand for some fact, object, event, process, etc., in the domain that the program models or reasons about
Sentence planning: See microplanning
Text planning: See macroplanning
Unification: A general process of matching frames against each other and merging their content when no conflicts occur
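The unification process mentioned in the glossary can be sketched for flat frames in a few lines. This is a deliberately minimal illustration: real FUF-style unification is recursive over nested feature structures and handles variables, which this sketch omits.

```python
# Toy sketch of frame unification: two frames merge when every key they
# share has the same value; otherwise unification fails.
# (Illustrative only; FUF-style unification is recursive and far richer.)

def unify(f1, f2):
    """Return the merged frame, or None if any shared key conflicts."""
    merged = dict(f1)
    for key, value in f2.items():
        if key in merged and merged[key] != value:
            return None          # conflict: unification fails
        merged[key] = value
    return merged

print(unify({"cat": "verb", "tense": "past"}, {"number": "singular"}))
# -> {'cat': 'verb', 'tense': 'past', 'number': 'singular'}
print(unify({"tense": "past"}, {"tense": "present"}))
# -> None
```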