
Summer school: Lexical Semantics and Question-Answering
- Investigate some crucial facets of lexical semantics relevant for QA
- Develop basic aspects of QA: question representation, document indexing, question-document matching, response production

The emergence of computational linguistics, from CS, linguistics, and psycholinguistics


CL was born about 40 years ago.
- First 25 years: experiments in various areas (morphology, parsing, dialogue, machine translation, semantics), in cooperation with linguistics, relations with formal linguistics, etc.
- From about 15 years ago to now: development of large-scale applications, resources and tools, evaluation campaigns (MUC, DUC, TREC, etc.)
But the field is still at a relatively early stage of development, due to the complexity of language (e.g. understanding, reference) and to development costs. New areas have emerged: robustness, cooperativity, integration of services, interoperability, development of large-scale resources, semantic tagging, etc.

The framework
Pluri-disciplinary cooperation: linguistics, psychology, artificial intelligence, computer science, logic, philosophy. Huge problems with multilingualism. Interleaving of scientific and applied facets.

Epistemology of CL (vs. linguistics ?)


Seeks to understand what kinds of computational processes (mechanisms and knowledge) are at stake in understanding, producing or learning a human language.
Relies on the mathematical theory of grammars and parsing, and on logic for modelling inference, in spite of the emergence of statistical models.

The goal is not to reproduce how the human brain works, but rather to simulate human performance: the system just needs to perform, possibly as well as a human, on (simple) tasks that involve language understanding and language production.

A paradox ?
We are potentially capable of generating/understanding an infinite number of situationally appropriate, meaningful utterances, extracting knowledge, etc. just from a few examples. In doing so, we refer to huge amounts of knowledge or subtle categories that we perceive, some of which is postulated to be innate, and some of which has been acquired over the course of our lifetime. We also use other cognitive faculties in understanding and producing language, such as the ability to reason about what we've heard or read.

1. Notion of sense / polysemy


Different levels of polysemy:
- from accidental: bank,
- to indefinitely polysemous: e.g. good (via underspecification and selection of properties),
- via largely polysemous items, like prepositions: against, with.
Different theories, highly contradictory: facets, productive polysemy, etc. (formal views, conceptual views, NLP...). Meanings are complex clusters, where features may evolve in context. In context, a conceptual approach: word meanings are more stable than one thinks, but focus may change, and meaning may undergo some operations: John eats his meal quickly vs. John eats quickly. Syntactic structure has a role to play.

What are senses ??


A notion that no one can really define and capture.
- Depends on perspective: AI, formal linguistics, NLP, psycholinguistics, anthropology, etc.
- No unique solution: sense identification is rather a matter of strategy and constraints (practical, or on a model).
- Same for sense representation: it can be abstract (almost no contents) or rich (built from a series of devices that complement each other).
- What do we want to do with senses and representations?

compositionality
Run has an underspecified meaning ('move along a trajectory'), and it is crucial to capture this fact. Thus run to school, run from the store, etc. all involve the same meaning of run, but with various trajectories: the semantic contribution of a word/concept must be tuned exactly. Compositionality should take care of isolated sense combinations, including modifiers. It should also consider syntax, in particular syntactic alternations. In a large number of cases it is a monotonic operation, but some cases are more complex (they require type coercion, type shifting, or other forms of inference).

Lexical descriptions
At the syntactic as well as the semantic level, lexical descriptions must remain generic: this is an essential property, at least in NLP circles. Dedicated mechanisms then operate when processing sentences to elaborate meaning, derived meanings (e.g. metaphors), or to deal with unexpected situations (e.g. metonymies, sense variations). Crucial aspects: selectional restrictions, and semantic representations that support underspecification and allow for term combinations (types + lambda-calculus is a good solution). Define a sufficiently expressive language to deal with meaning. This depends on objectives (granularity to be adjusted): from basic roles to primitive-based languages (LCS), incrementally complex. Frame-based languages are an intermediate step.
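As a minimal sketch (my own illustration in Python, not the course's formalism) of the "types + lambda-calculus" idea applied to the run examples above: the verb meaning is a function over an underspecified trajectory, and prepositional phrases supply that trajectory.

# illustrative lambda-style composition; names are assumptions, not a standard
run = lambda trajectory: lambda agent: f"move(agent={agent}, trajectory={trajectory})"
to = lambda landmark: f"path(to,{landmark})"
from_ = lambda landmark: f"path(from,{landmark})"

print(run(to("school"))("john"))     # move(agent=john, trajectory=path(to,school))
print(run(from_("store"))("john"))   # move(agent=john, trajectory=path(from,store))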

Lexical semantics part: Course outline


Lexical organization, semantic relations and selectional restrictions: WordNet/EuroWordNet. Semantic representations: roles and frame-based languages, primitive-based languages like the LCS. Syntactic and semantic tagging: dependency parsing. Sense variation: dealing with metonymy (the Generative Lexicon), other efforts (extended principle of composition), dealing with metaphors and with sense variations.

Semantics and understanding


- Goals: in association with syntax, understand an utterance.
- Consider: what to represent? for what (answering questions, indexing, ...)? with what level of granularity (costs, resources)? in what formal framework?

- Frames (Minsky, Schank, etc., 1970s) - Semantic nets, graphs (Winograd, Sowa, ~1975) - Primitive-based languages (Wilks, Jackendoff, 1980s) - Specialized languages: situational semantics, temporal semantics - Dedicated formalisms: DRT, etc. Implementations: logic-based forms, dependencies?

WordNet / EuroWordNet and related concepts (Fellbaum, Miller, Cruse)


WN: a dictionary based on psycholinguistic principles:
- synchronic properties of the mental lexicon that can be exploited in lexicography;
- a lexical system based on conceptual lookup: organizing concepts in a semantic network;
- organizes lexical information in terms of word meaning, rather than word form.
WordNet can also be used as a thesaurus, NOT as an ontology!

Lexical representations: synsets


Constructive: the representation should contain sufficient information to support an accurate construction of the concept. Differential: meanings must be represented so that they enable the theorist to distinguish among them.
In the differential approach, a meaning can be represented by the list of those word forms that can be used to express it (the synset), and by the relations that the synset has with other synsets.

Structure of WN
95,600 word forms (fairly recent figures): 51,500 simple words and 44,100 collocations; 70,100 word meanings.
WordNet relations:
- lexical relations (between word forms): synonymy, antonymy;
- semantic relations (between word meanings): hyponymy/hypernymy, meronymy/holonymy, entailment.

An example
S: (v) dance (move in a graceful and rhythmical way): "The young girl danced into the room"
- direct troponym / full troponym: S: (v) glissade (perform a glissade, in ballet); S: (v) chasse, sashay (perform a chasse step, in ballet); S: (v) capriole (perform a capriole, in ballet)
- verb group: S: (v) dance, trip the light fantastic, trip the light fantastic toe (move in a pattern; usually to musical accompaniment; do or perform a dance): "My husband and I like to dance at home to the radio"
- entailment: S: (v) step (shift or move by taking a step): "step back"
- direct hypernym / inherited hypernym / sister term: S: (v) move (move so as to change position, perform a nontranslational motion): "He moved his hand slightly to the right"
- derivationally related forms: W: (n) dancer [Related to: dance] (a person who participates in a social gathering ...); W: (n) dancer [Related to: dance] (a performer who dances professionally)
- sentence frame: Somebody ----s [Applies to: dance]; "The crowds dance in the streets"; "The streets dance with crowds"
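A sketch of how an entry like the one above can be browsed programmatically with NLTK's WordNet reader (assumes nltk is installed and the wordnet data downloaded; the exact synsets and glosses depend on the WordNet version shipped with NLTK).

from nltk.corpus import wordnet as wn

for s in wn.synsets('dance', pos=wn.VERB):
    print(s.name(), '-', s.definition())
    print('  troponyms  :', [t.name() for t in s.hyponyms()])      # glissade, chasse, ...
    print('  hypernyms  :', [h.name() for h in s.hypernyms()])     # step, move, ...
    print('  entailments:', [e.name() for e in s.entailments()])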

(Cruse 86) : Synonymy


Two words are synonymous if they have the same sense, i.e.: they have the same values for all their semantic features (?), they map to the same concept, they satisfy Leibniz's substitution principle:
if the substitution of one for the other never changes the truth value of a sentence in which the substitution is made.
Cruse: substitution possible in any context. A synset is the set of word forms that share the same sense, in fact usually quasi-synonyms. But: synsets do not explain what the concepts are, they just say that concepts exist.

A hyponym is a word whose meaning contains the entire meaning of another, known as the superordinate (weak is-a relation).
Two words overlap in meaning if they have the same value for some (but not all) of the semantic features (yet to be defined! described here within a paradigmatic perspective). Hyponymy is a special case of overlap where all the features of the superordinate are contained in the hyponym.

Meronymy
A word w1 is a meronym of another word w2 (the holonym) if the relation is-part-of holds between the meanings of w1 and w2. ! Meronymy is transitive and asymmetric. ! A meronym can have many holonyms. Problems: cardinality, optionality of parts, levels of generality, inheritance? Ex.: if beak and wing are meronyms of bird, and if canary is a hyponym of bird, then (by inheritance) beak and wing must be meronyms of canary.

! Limited transitivity (functional domains): a house has a door and a door has a handle; does it follow that a house has a handle?
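The beak/wing inheritance observation above can be reproduced with NLTK's WordNet interface (assumes nltk and the wordnet data; coverage of part relations varies across versions, so the printed lists may differ). WordNet stores meronyms on the general synset, so inheriting them down the hierarchy must be done explicitly.

from nltk.corpus import wordnet as wn

bird = wn.synset('bird.n.01')
# pick the 'canary' sense that is a kind of bird (sense numbering varies)
canary = next(s for s in wn.synsets('canary', pos=wn.NOUN)
              if bird in s.closure(lambda x: x.hypernyms()))

print(bird.part_meronyms())          # parts stored directly on 'bird' (e.g. beak, wing)
inherited = set()
for hyper in canary.closure(lambda s: s.hypernyms()):
    inherited.update(hyper.part_meronyms())
print(sorted(s.name() for s in inherited))   # parts inherited by 'canary'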

Part-whole relations (Winston et al.)


Component-object (branch/tree), Member-collection (tree/forest), Portion-mass (slice/cake), Stuff-object (aluminium/airplane), Feature-activity (paying/shopping), Place-area (Toulouse/Haute-Garonne), Phase-process (adolescence/growing up).

Categories of words
Treated independently in WN, unfortunately!
Nouns: organised as topical hierarchies with lexical inheritance (hyponymy/hypernymy and meronymy/holonymy).
Verbs: organised by a variety of entailment relations.
Adjectives: organised on the basis of bipolar opposition (antonymy relations) and synonymy.
Adverbs: organised e.g. according to scales and opposition.
Function words: currently omitted, stored separately as part of syntactic elements.

The noun hierarchies cover distinct conceptual and lexical domains: 25 unique beginner hierarchies (in addition to domains):

{act, activity} {animal, fauna} {artifact} {attribute} {body} {cognition, knowledge} {communication} {event, happening} {feeling, emotion} {food} {group, grouping} {location} {motivation, motive} {natural object} {natural phenomenon} {person, human being} {plant, flora} {possession} {process} {quantity, amount} {relation} {shape} {state} {substance} {time}

The Top nodes

adjectives
19,500 adjective forms; 10,000 word meanings (synsets). Main types:
! Descriptive adjectives (scalar or boolean): clusters based on antonymy; used to modify / give attribute values of a noun. Relevance: X is Adj presupposes there is an attribute A s.t. A(x) = Adj.
! Relational adjectives: similar to nouns used as modifiers.
! Reference-modifying adjectives, negative adjectives. Ex. former, alleged, ...

Antonymy of adjectives
Two words are antonyms if their meanings differ only in the value of a single semantic feature: dead/alive, above/below, hot/cold, fat/skinny.
! Binary antonyms (dead/alive: [+/- living])
! Gradable antonyms (scalar, notion of scale): hot, warm, cool, cold
Antonyms w.r.t. a context: arrive: stay/leave; charge: accept/contest (reversives).

Verbs
21,000 verb word forms, of which 13,000 are unique strings; 8,400 word meanings (synsets). Includes phrasal verbs. Divided into 15 semantic domains, e.g. verbs of body care, change, cognition, communication, competition, consumption, contact, creation, emotion, motion, perception, possession, social interaction, and weather verbs.

Event or action and State distinctions

Apparent verb synonyms


Apparent verb synonyms exhibit some meaning differences:
* different selectional restrictions: walk, tiptoe;
* verb synsets often contain periphrastic expressions rather than lexicalised synonyms, e.g. {hammer, (hit with a hammer)}.
The gloss breaks a verb down into an entire VP that indicates the basic action, e.g. {whiten, (turn white)}: changes expressed as become + adjective; {swim, (travel through water)}: manner elaborations over a basic verb.

Verbs cannot easily be arranged into the kind of tree structure onto which nouns are mapped, but they can be related by semantic relations such as: entailment, temporal inclusion, causation.

Verb top organization


Some semantic fields must be represented by several independent trees.
- Motion verbs have two top nodes: {move, (make a movement)} and {move, travel}.
- Possession verbs can be traced up to the verbs {give, transfer}, {take, receive} and {have, hold}.
- Verbs of body care and functions consist of a number of independent hierarchies that form a coherent semantic field. Most of these verbs (wash, comb, shampoo, make-up) select for the same kinds of noun argument (body parts).
- Communication verbs are headed by the verb communicate but immediately divide into verbs of verbal and nonverbal communication.

Lexical entailment
A verb V1 logically entails a verb V2 when the sentence "Someone V1" (logically) entails the sentence "Someone V2". ! Ex. snore lexically entails sleep; the first sentence presupposes the second.
Negation reverses the direction of entailment: ! Ex. not sleeping entails not snoring. Lexical entailment is a non-symmetric relation: ! only synonymous verbs can be mutually entailing, e.g. A defeated B and A beat B.
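The snore/sleep entailment above is encoded in WordNet and can be read off with NLTK (a sketch assuming the wordnet data is installed; sense numbers may vary across versions).

from nltk.corpus import wordnet as wn

snore = wn.synset('snore.v.01')
print(snore.entailments())        # expected to contain a sleep synset
sleep = wn.synset('sleep.v.01')
print(sleep.entailments())        # non-symmetric: snore is not expected here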

Verb Temporal aspects


A verb V1 will be said to temporally include a verb V2 if there is some stretch of time during which the activities denoted by the two verbs co-occur, but no time during which V2 occurs and V1 does not. Ex.: snore entails sleep and is properly included by it.
If V1 entails V2 and a temporal inclusion relation holds between V1 and V2, then people will accept a part-whole statement relating V2 and V1.

Troponymy
The troponymy relation between two verbs V1 and V2 can be expressed by the formula: ! to V1 is to V2 in some particular manner. Ex.: troponyms of communication verbs like say, tell may encode the speaker's intention (examine, confess, preach, ...) or the medium of communication (fax, email, phone, telex).
! Troponymy is a particular kind of entailment: every troponym V1 of a (more general) verb V2 also entails V2. The activity referred to by a troponym and its more general hypernym are always temporally co-extensive. Snore is not a troponym of sleep (because of the lack of co-extensive temporal inclusion).

Causation
The causation relation relates two verb concepts: ! causative (like give) ! resultative (like have). Constraints: (1) The subject of the causative verb usually has an object that is distinct from the subject of the resultative verb. (2) The subject of the resultative verb must be the object of the causative verb (which is therefore necessarily transitive). Causation is anti-symmetric: For someone to have something does not entail that he was given it. Causation is a specific case of entailment: ! If V1 necessarily causes V2, then V1 also entails V2. ! Causal entailment lacks temporal inclusion.

A summary of entailment relations for verbs

Sentence verb frames

Semantic relations in WN

Selectional restrictions
The synset hierarchy can be used to define selectional restrictions by means of objects viewed as types (but this is not sufficient): componential analysis. Typing objects: simple types, complex types (e.g. dot objects); a simple logical language is often needed to express restrictions. Relational/predicative terms may have their own type, but also express the type of argument they expect. Eat: verb of consumption; NP1: human; NP2: concrete object + edible + solid. Contrast with drink: NP2: + liquid. A sketch of such a check is given below.
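A minimal sketch of the eat/drink restriction check just described; the tiny type hierarchy and lexicon are illustrative assumptions, not WordNet itself.

TYPE_PARENTS = {"human": "animate", "animate": "concrete", "solid": "edible",
                "liquid": "edible", "edible": "concrete", "concrete": "entity"}

def subtype(t, expected):
    # walk up the hierarchy until we hit the expected type or run out of parents
    while t is not None:
        if t == expected:
            return True
        t = TYPE_PARENTS.get(t)
    return False

LEXICON = {
    "eat":   {"pred": "eat",   "arg_types": ["human", "solid"]},
    "drink": {"pred": "drink", "arg_types": ["human", "liquid"]},
    "John":  {"const": "john",  "type": "human"},
    "soup":  {"const": "soup",  "type": "liquid"},
    "bread": {"const": "bread", "type": "solid"},
}

def predicate(verb, *args):
    v = LEXICON[verb]
    consts = []
    for expected, a in zip(v["arg_types"], args):
        entry = LEXICON[a]
        if not subtype(entry["type"], expected):
            return None   # restriction violated: candidate for coercion/metonymy
        consts.append(entry["const"])
    return f"{v['pred']}({', '.join(consts)})"

print(predicate("drink", "John", "soup"))   # drink(john, soup)
print(predicate("eat", "John", "soup"))     # None: 'soup' is liquid, eat expects solid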

Unexpected situations
Unexpected situations abound: new usages of a term; metaphors and related images; metonymies; other kinds of meaning change, e.g. via co-composition, co-predication, etc., where an argument affects the meaning of the predicate (several views, many debates!). These situations are very difficult to predict, they are not as regular and systematic as sometimes claimed, and they need specific forms of semantic interpretation.

Metonymies and the inference rule of type coercion


Various forms of metonymies (see the Generative Lexicon) can often be resolved by means of:
- for each object, a graph of metonymic relations, produced via expansion related to a precise function (part-of, object functions, etc.), that specifies the generative expansion of a term, e.g. via the Qualia structure of the GL;
- a type coercion mechanism: V expects an argument of type A; in a proposition it is combined with an element of type B; then (1) if B is a subtype of A, there is agreement; (2) if there is a type-shifting operation S(B) that produces a type C subsumed by A, the combination is correct but may need interpretation (the bank rejected the loan); (3) in any other case the structure is ill-formed.
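A sketch of the three-case coercion rule above. The type hierarchy and the table of type-shifting operations are illustrative assumptions (Qualia-like metonymic expansions), not the GL itself.

SUBTYPES = {"financial_institution": "institution", "institution": "organization",
            "organization": "agentive_entity", "building": "concrete"}

def is_subtype(t, expected):
    while t is not None:
        if t == expected:
            return True
        t = SUBTYPES.get(t)
    return False

# metonymic expansions: actual type -> list of (shift name, resulting type)
SHIFTS = {
    "building": [("institution_housed_in", "financial_institution")],
    "book":     [("informational_content", "information")],
}

def coerce(actual, expected):
    if is_subtype(actual, expected):
        return ("agreement", actual)                      # case (1)
    for name, result in SHIFTS.get(actual, []):
        if is_subtype(result, expected):
            return ("coerced via " + name, result)        # case (2)
    return ("ill-formed", None)                           # case (3)

# 'The bank rejected the loan': 'reject' expects an agentive_entity subject,
# so a 'building' reading of bank must be shifted to the institution reading.
print(coerce("building", "agentive_entity"))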

Metaphors and types


Metaphors are much more difficult than metonymies because there is no means to construct a priori a graph of generative expansion. Metaphors are not so regular, depend on domains, and require some interpretation effort: they are not just a stylistic figure. However, there are some quite regular situations across languages: time is money (waste time, ...); life as a trip; linguistic expressions as containers; ideas as objects; orientation metaphors: happy is up, etc. (see Lakoff and Johnson, Metaphors We Live By).
Nevertheless, some works have proposed metaphor rules, with application restrictions, which have a quite large scope. These can be treated via the inference rule of type coercion.

2. Structures using tags or labels


Type roles, rhetorical relations, etc. Identifies head-complement and adjunct roles. Syntactic dependency annotation (EXERCISES):

Thematic roles
Large number of types, relates a predicate and one of its arguments. Granularity to adjust. Allows for partial parsing. Problematic to capture inter-argument relations.

A few thematic roles


Agent: direct, of cause. Theme: holistic / incremental (beneficiary / victim) / cause / consequence. Localisation: source / destination / fixed position. Manner. Accompaniment. Quantity. Instrument. Roles are specified a priori, but may change in sentences.
EXERCISES: semantic dependency tagging

Thematic roles as a bridge between syntax and semantics


Relation between grammatical functions and thematic roles. Several proposals, among which: agent < experiencer < goal, source, location < theme, patient. Questions:
- where to specify thematic roles?
- how stable are they? Not stable under alternations: load hay onto the truck [agent, theme, goal] vs. load the truck with hay??
- too coarse: hides cross-classifications of verbs w.r.t. their meaning components.

Proto-roles
Defined by means of clusters of properties.
Proto-agent:
* volitional involvement
* sentience or perception
* causes an event or change of state
* causes a movement
Proto-patient:
* undergoes a change of state
* incremental theme
* causally affected by another participant
* stationary relative to movement
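A sketch of how such property clusters can be used in practice (Dowty-style argument selection): count how many proto-agent vs. proto-patient properties hold of each argument. The property names and the example annotations are illustrative assumptions.

PROTO_AGENT = {"volitional", "sentient", "causes_event", "causes_movement"}
PROTO_PATIENT = {"change_of_state", "incremental_theme",
                 "causally_affected", "stationary"}

def classify(properties):
    a = len(properties & PROTO_AGENT)
    p = len(properties & PROTO_PATIENT)
    return "proto-agent" if a > p else "proto-patient" if p > a else "unclear"

# "John broke the vase"
print(classify({"volitional", "sentient", "causes_event"}))   # proto-agent
print(classify({"change_of_state", "causally_affected"}))     # proto-patient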

Frame Semantics
Declarative representations: frames (Marvin Minsky), schemata (David Rumelhart), scripts (Roger Schank, Abelson). Procedural representations (productions): conditionals that specify actions to be performed if certain conditions are met.

A frame is to be understood as a cognitive structuring device. Frames characterise situations or states of affairs; they are in principle independent of their linguistic realisation. Parts of frames are connected to specific words/constructions; since verbs refer to whole situations, they are most closely associated with frames.

Verbs refer to whole situations but focus on different aspects of the same frame: e.g. buy, sell, pay, spend all evoke the same frame. To know the meaning of a verb, one has to know the frame as a whole: this brings these verbs together in one semantic group.

Thus frames introduce a perspective on situations, even if the same arguments are realised differently syntactically: buy X from Y for Z (perspective of the buyer); sell X to Y for Z (perspective of the seller).

The essence of frames


Frames are supposed to serve as a kind of interface between purely linguistic and conceptual knowledge; thus the description of a verb comprises the frame it evokes plus linking information: the grammatical realisation of the frame elements (obligatory vs. optional), procedural attachments.
Frames are hierarchically structured and include subframes. They capture prototypicality in some way.

Frames for understanding texts


Frames are used in theories of text understanding: first, the general topic of the text is recognised and encoded in an appropriate frame; then the facets of the text's meaning are modelled as the filling-in of the elements of a subframe.
Subsections of the text are then treated similarly and inserted into the main frame; the result is a complex tree (or similar structure) representing the articulation of the text.

A lexical frame: FrameNet II


Creation of an on-line lexical resource for English, based on frame semantics and supported by corpus evidence. The project consists of the FrameNet database itself: lexical entries for individual word senses, descriptions of frames and frame elements, annotated subcorpora. Uses: WSD, MT, IE, QA.

structure
The database serves as a dictionary: definitions (from the Concise Oxford Dictionary, 10th Edition, Oxford University Press, or written by a FrameNet staff member); tables showing how frame elements are syntactically expressed in sentences containing each word, including a complete characterisation of the headword's grammatical and combinatorial properties; annotated examples from the corpus; alphabetical indexes. It is also used as a thesaurus.

There are three layers of annotation on a tagged constituent: the frame element realisation consists of a frame element (say, Patient), a grammatical function (say, Object), and a phrase type (e.g. NP). Valence descriptions of predicating words are generalisations over such structures.
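A sketch of these three layers as a data structure, for one annotated sentence of a buying frame; the example sentence and field names are illustrative assumptions, not FrameNet's internal format.

annotation = {
    "target": "bought",
    "frame": "Commerce_buy",
    "constituents": [
        {"span": "Kim",      "frame_element": "Buyer",  "gf": "Ext", "pt": "NP"},
        {"span": "the book", "frame_element": "Goods",  "gf": "Obj", "pt": "NP"},
        {"span": "from Pat", "frame_element": "Seller", "gf": "Dep", "pt": "PP"},
    ],
}

# a valence pattern generalises over such annotations
valence = [(c["frame_element"], c["gf"], c["pt"]) for c in annotation["constituents"]]
print(valence)   # [('Buyer', 'Ext', 'NP'), ('Goods', 'Obj', 'NP'), ('Seller', 'Dep', 'PP')]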

FrameNet examples
Base frame: Change_direction
A THEME that is in motion assumes a new DIRECTION in which it moves.
Core arguments: THEME and DIRECTION. Non-core elements: ANGLE, CONSECUTIVE, CO-THEME, DEPICTIVE, etc. (about 30 elements).
Words: bear.v, cut.v, left.n, right.n, swing.v, turn.n, turn.v, veer.v

Judgment_communication: a COMMUNICATOR communicates a judgment of an EVALUEE to an ADDRESSEE.
Core: COMMUNICATOR (semantic type), EVALUEE (the person or thing being judged), EXPRESSOR (body part that conveys the judgment), MEDIUM (mode of expression: telephone, etc.), REASON, TOPIC.
Non-core: ADDRESSEE, DEGREE, FREQUENCY, GROUNDS, etc.
Words: acclaim.n, acclaim.v, accusation.n, accuse.v, belittle.v, belittlement.n, belittling.n, blame.v, ...
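This frame can be inspected with NLTK's FrameNet corpus reader (a sketch assuming nltk and the framenet_v17 data are installed; contents depend on the FrameNet release).

from nltk.corpus import framenet as fn

f = fn.frame('Judgment_communication')
print(f.definition)
print(sorted(f.FE.keys()))        # Communicator, Evaluee, Addressee, Medium, ...
print(sorted(f.lexUnit.keys()))   # acclaim.v, accuse.v, belittle.v, ...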

Annotated text: frame identification and their meaning:
It CAN [Capability] be HOPED [Desiring] that Spanish PRIME MINISTER [Leadership] Felipe Gonzalez will draw the RIGHT [Suitability] CONCLUSION [Coming_to_believe] from his NARROW [Clarity_of_resolution] ELECTION [Change_of_leadership] VICTORY [Finish_competition] Sunday. A STRONG [Exertive_force] CHALLENGE [Competition] from the FAR ...
Detail: Spanish [ORIGIN] Prime Minister [IDENT] Felipe Gonzalez, etc.; MINISTER: Felipe Gonzalez

VerbNet
Links syntactic classes, WN, PropBank and FrameNet, applied to verbs (M. Palmer et al. 97). 5,245 verbs of English, about 5,000 links, 237 main classes, 3,412 links towards FrameNet. Extensions to Portuguese and Korean.

Example: abandon
Class: Leave-51.2 (sense number)
Roles: Theme [+animate], Source [+location & -region]
Frame: transitive, basically locative, with drop of the preposition "from": "We abandoned the area."
Syntax: Theme V Source
Semantics: motion(during(E), Theme), location(start(E), Theme, Source), not(location(end(E), Theme, Source)), direction(during(E), from, Theme, Source)
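The class of abandon can be looked up with NLTK's VerbNet reader (a sketch assuming nltk and the verbnet data; class identifiers and the XML layout may differ across VerbNet releases).

from nltk.corpus import verbnet as vn

print(vn.classids('abandon'))                 # expected to include 'leave-51.2'
cls = vn.vnclass('leave-51.2')                # XML element with roles and frames
print([t.attrib['type'] for t in cls.findall('THEMROLES/THEMROLE')])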

PrepNet: abstract notions, families and facets


Quantity: numerical / frequency / proportion
Accompaniment: adjunction / simultaneity / inclusion / exclusion
Manner: means / manners and attitudes / imitation or analogy
Localisation: source / destination / via / fixed position
Choice and exchange: exchange / choice or alternative / substitution
Causality: cause / goal or consequence / intention
Opposition
Ordering: priority / subordination / hierarchy / ranking / degree of importance
Instrument (see below)
Minor elements: about, in spite of, comparison

The case of via


[1]: VIA, generic: 'an entity X moving via a location Y'. X: concrete entity; ACTION: movement verb; Y: location. Representation: X : via(loc, Y). French synset: {par, via}. Example: Jean rentre par la porte.

Sense stratifications
Stratification 2:
[1.2.1] VIA UNDER, from generic: 'an entity X moving via under a location Y'. X: concrete entity; ACTION: movement verb; Y: location with a form of passage under it. Representation: X : via(loc, under(loc, Y)). French synset: {par dessous}. Example: Jean passe par dessous le pont.

[1.2.2] VIA ABOVE from generic etc.

(Figure) Language realization: a multi-level partitioning of the realizations of a lower-level frame SFi according to usage norms: direct uses, each constrained by restrictions (restr1, restr2, restr3, ...), and indirect uses (derived types), mapped onto synsets (synset1, synset3, ...), with associated frequency measures.

Second part: QA technology


Definition, situation and foundational aspects. Processing questions: focus and type; various forms of representations, from keywords to semantics; matching questions with texts.

Situation and definition


QA is a language-based, possibly intelligent, device that operates on top of search engines and information retrieval systems. It offers: (1) easier access to information via NL/flexible queries, (2) advanced tools to sort, filter and select information (plus fusion and reasoning), (3) a response adequate for the user, possibly in NL, possibly cooperative.

Some contemporary classes of QA


Boolean; factoid (when, who, where?, etc.); definition; causes/consequences; procedural (+ instrumental); evaluation, comparison; opinions.

Some external parameters


QA systems vary along several external dimensions: domain-specific vs. open domain; and the nature of the searched material: structured data, free text, text databases, the Web, or a unique, coherent document.

(Figures) A map of the field: the early days, the renewal, and the current situation.
A global architecture (Surdeanu & Pasca)

2.1 Dealing with Questions


Question categories, type and focus. How to represent the contents of questions for QA, to allow matching with texts produced by search engines:
keywords, bags of words; templates; syntactic structures and dependencies; applications: FrameNet, primitives; unification between Q and R; inference.

question contents
Several levels: question conceptual category; expected response type; representation of the question body.

Who invented the telephone ? In logical form: (entity, X: person, invent(X, telephone)).
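A sketch of these levels as a data structure for this question; the field names are illustrative assumptions, not a standard QA schema.

question = {
    "category": "factoid",                    # conceptual category of the question
    "expected_type": "person",                # expected response type
    "body": ("invent", "?X", "telephone"),    # predicate-argument form of the body
}
print(question["expected_type"], question["body"])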

(a) Extraction of keywords/bag of words


Look for heavy terms in the question, plus predicates and named entities.
How to change my mother board? → change, mother board.
Identify compound terms, proper nouns, etc. Lexical expansion to normalize forms and improve search: useful for search engines to get appropriate sets of responses (to be processed later).
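A minimal bag-of-words extractor in this spirit: drop stop words, keep content terms, and keep multiword terms together. A sketch assuming nltk with the punkt and stopwords data; the compound list is an illustrative stand-in for a real terminology lexicon.

import nltk
from nltk.corpus import stopwords

COMPOUNDS = {("mother", "board"): "mother_board"}   # hypothetical compound lexicon
STOP = set(stopwords.words('english'))

def keywords(question):
    tokens = [t.lower() for t in nltk.word_tokenize(question) if t.isalpha()]
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in COMPOUNDS:
            merged.append(COMPOUNDS[(tokens[i], tokens[i + 1])]); i += 2
        else:
            merged.append(tokens[i]); i += 1
    return [t for t in merged if t not in STOP]

print(keywords("How to change my mother board ?"))   # ['change', 'mother_board']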

(b) Simple Templates


QALC: 80 categories for questions: When was Rosa Parks born? → When-BE-NP-born. Webclopedia: same strategy: Who was Lincoln's state secretary? → Template: Who be <entity>'s <role>. Who is the author of Eugene Onegin? → Who be <role> of <entity>

On the response side, unify the Q body with text fragments, but the forms are quite often quite different:
Vazquez, coach of Johnny → <person>, <role> of <entity>
In Tchaikovsky's Eugene Onegin → Prep <person>'s <entity>
D. Shostakovitch, official composer of the Soviet Union → <person>, <role> of <entity>

In TextMap:
1. Query generation: How did Mahatma Gandhi die? → Mahatma Gandhi die <HOW>; Mahatma Gandhi die of <HOW>; Mahatma Gandhi lost his life in <WHAT>. 550 patterns, grouped into 105 equivalence classes.
2. Response extraction: When was Mozart born? → P=1: <PERSON> (<BIRTHDATE> - DATE); P=.69: <PERSON> was born on <BIRTHDATE>
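A sketch of pattern-based response extraction in this spirit: surface patterns, each with a confidence, matched against candidate sentences. The regexes and scores are illustrative assumptions, not the actual TextMap resources.

import re

PATTERNS = [
    (1.00, r"(?P<PERSON>[A-Z][\w .]+) \((?P<BIRTHDATE>\d{4}) - "),
    (0.69, r"(?P<PERSON>[A-Z][\w .]+) was born on (?P<BIRTHDATE>[\w ,0-9]+)"),
]

def extract_birthdate(sentence):
    for score, pattern in PATTERNS:
        m = re.search(pattern, sentence)
        if m:
            return score, m.group("PERSON").strip(), m.group("BIRTHDATE").strip()
    return None

print(extract_birthdate("Wolfgang Amadeus Mozart (1756 - 1791) was a composer."))
print(extract_birthdate("Mozart was born on 27 January 1756."))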

(c) Key words + syntactic dependencies: PIQASso (Attardi et al. 02)

Response Extraction : a synthesis

(d) Labelled predicate-argument structures


PropBank (1 million words) (www.cis.upenn.edu/~ace). 1 to 5 arguments: ARG0 = agent; ARG1 = direct object = theme; ARG2 = indirect object = destination, or instrument. Plus functional labels (oblique args.): ARGM-LOC = locative; ARGM-TMP = temporal; ARGM-DIR = direction.

What [Arg1: kind of nuclear materials] were [Predicate:stolen] [Arg2: from the Russian Navy]?

Text in which the response is found: [ArgM-TMP(Predicate 2): in 1/96], [Arg1(Predicate 2): approximately 7 kg of HEU] was [ArgM-ADV(Predicate 2) reportedly] [Predicate 2: stolen] [Arg2(Predicate 2): from a naval base] [Arg3(Predicate 2): in Sovetskawa Gavan]

About 7kg of HEU
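A sketch of the argument-level matching behind this answer: align arguments of the same predicate and return the argument filling the question's queried slot. The structures are hand-built to mirror the example above; a real system would obtain them from an SRL parser.

question = {
    "predicate": "stolen",
    "args": {"ARG1": "kind of nuclear materials",   # the queried slot
             "ARG2": "from the Russian Navy"},
    "queried": "ARG1",
}

candidate = {
    "predicate": "stolen",
    "args": {"ARG1": "approximately 7 kg of HEU",
             "ARG2": "from a naval base",
             "ARG3": "in Sovetskawa Gavan",
             "ARGM-TMP": "in 1/96"},
}

def answer(q, c):
    if q["predicate"] != c["predicate"]:
        return None
    # crude: return whatever fills the queried argument slot in the text
    return c["args"].get(q["queried"])

print(answer(question, candidate))   # approximately 7 kg of HEU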

Using FrameNet (Narayanan and Harabagiu)

Predicate: stimulate. Argument 0 (role = agent): ANSWER. Argument 1 (role = theme): India's missile programs. Argument 2 (role = instrument): ANSWER.
Q/A in FrameNet (2 main facets): Frame: STIMULATE; Frame Element CIRCUMSTANCES: ANSWER; Frame Element EXPERIENCER: India's missile programs; Frame Element STIMULUS: ANSWER.
Additional information (2 additional facets): Frame: SUBJECT STIMULUS; Frame Element CIRCUMSTANCES: ANSWER; Frame Element COMPARISON SET: ANSWER; Frame Element EXPERIENCER: India's missile programs; Frame Element PARAMETER: nuclear/biological proliferation.

Semantic dependencies: thematic roles versus LCS primitives, rhetorical operators (IRIT-ILPL)

Question: What is the first university of Thailand?
(1) Thematic roles: Theme = the first university of Thailand, with locative-temporal and locative-spatial dependencies on "university of Thailand".
(2) Basic LCS elements: BE (What) is (the first university of Thailand); BE, FROM+LOC.

Response 1: "Kasetsart University has been recognized as the first university of Thailand."
(1) Roles: Theme. (2) LCS: AT+TEMP, BE, FROM+LOC.

Response 2: "... the first university in Thailand, namely Chulalongkorn University."
(1) Roles: Loc, Ident. (2) LCS + rhetorical structure: BE, IN+LOC, AT+TEMP, Value, Elaboration.

Response 3: "When the king Vajiravudh (Rama VI) founded the first university in Thailand, in commemoration of his father, King Chulalongkorn, ..."
(1) Roles: Agent, Theme, Goal. (2) LCS + RST: AT+TEMP, IN+LOC, Cause, Goal, Value.

Application to rice diseases: document level indexing


Bakanae Pathogen: Gibberella fujikuroi (Fusarium moniliforme) (Reviewed 4/04, updated 4/04) In this Guideline: Symptoms , Comments on the disease , Management , Publication

Index: disease-name(bakanae), symptoms(bakanae, [list of major symptoms]), origin(disease: bakanae, place: California, date: 1999), spreading(disease: bakanae, period: winter, medium: [soil, water]), treatment(disease: bakanae, product: XX).
SYMPTOMS Symptoms of bakanae first appear about a month after planting. Infected seedlings appear to be taller, more slender, and slightly chlorotic when compared to healthy seedlings. The rapid elongation of infected plants is caused by the pathogen's production of the plant hormone, gibberellin. Plants with bakanae are often visible arching above healthy rice plants; infected plants senesce early and eventually die before reaching maturity. If they do survive to heading, they produce mostly empty panicles. COMMENTS ON THE DISEASE Bakanae is one of the oldest known diseases of rice in Asia but has only been observed in California rice since 1999 and now occurs in all California rice-growing regions. While very damaging in Asia, the extent to which bakanae may effect California rice production is unknown. As diseased plants senesce and die, mycelium of the fungus may emerge from the nodes and may be visible above the water level. After the water is drained, the fungus sporulates profusely on the stems of diseased plants. The sporulation appears as a cottony mass and contaminates healthy seed during harvest. The bakanae pathogen overwinters as spores on the coat of infested seeds. It can also overwinter in the soil and plant residue. However, infested seed is the most important source of inoculum. MANAGEMENT The most effective means of control for this disease is the use of noninfested seed. Also, when possible, burning plant residues with known infection in fall may help limit the disease. Research is under way to identify effective seed treatments. Field trials indicate that a seed treatment with sodium hypochlorite (Ultra Clorox Germicidal Bleach) is effective at reducing the incidence of this disease. Using a thoroughly premixed solution of 5 gallons of bleach to 100 gallons of water, seed is soaked for 2 hours, then drained and soaked in fresh water.

A simple example: intelligent unification


How to control Bakanae? <question type="PROC or SqE" focus="control bakanae"> How to <action> control <theme> Bakanae </theme> </action>? </question>
Control in the query lexically entails management; knowing that Bakanae is a disease (ontology), the match occurs with the last section, MANAGEMENT: "The most effective means of control for this disease is the use of noninfested ..." Tagged as: <action> management of <theme> this disease </theme> </action>. The response is the whole section, which is indeed a procedure.
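A sketch of this matching step: the question's action term is expanded via lexical entailment, the focus is typed via a small ontology, and the guideline section whose heading matches an entailed term is returned as the (procedural) answer. The entailment table and ontology are hand-built stand-ins for WordNet-style resources.

ENTAILS = {"control": {"control", "manage", "management"}}   # hypothetical lexical entailments
ONTOLOGY = {"bakanae": "disease"}                            # hypothetical domain ontology

SECTIONS = {
    "SYMPTOMS": "Symptoms of bakanae first appear about a month after planting ...",
    "MANAGEMENT": "The most effective means of control for this disease is ...",
}

def answer(action, focus):
    if ONTOLOGY.get(focus) != "disease":
        return None
    terms = ENTAILS.get(action, {action})
    for heading, text in SECTIONS.items():
        if any(t in heading.lower() for t in terms):
            return heading, text          # the whole section is the procedure
    return None

print(answer("control", "bakanae"))       # ('MANAGEMENT', ...)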

With lexical inference


How does Tilletia barclayana spread? <question type="SqE" focus="Tilletia barclayana spread"> How does <agent> Tilletia barclayana </agent> <action> spread </action>? </question>

spread → {germinate, flower, infest, etc.}, which are sub-events of spread. The matching is then done via as many of these sub-events as possible, given that the disease has already been identified. We are looking for an SqE (sequence of events), annotated as below: In spring as the fields are flooded, chlamydospores float, <action> germinate, and produce other spore and mycelial stages </action>. <action> At flowering (heading), secondary airborne spores (sporidia) infect individual florets or kernels. </action>

So the response is the above text, where the relevant actions have been identified.

With more advanced (but realistic) inferencing


<question type="PROC or SqE" focus="destroy rice"> How can <agent> the rice thrips </agent> <action> destroy <theme> the rice </theme> </action>? </question>
If the response is taken from a text, it is VERY difficult to match the question with this answer. In bold italics: the directly matching elements. <text> ... <agent> The rice thrips </agent> <action> will suck the sap <source> from the young rice </source> </action> ... </text> To match the action destroy in the question with this text portion, you need the inference: suck sap ⟹ probably destroy. This is what we call lexical inference, so it is more complex than plain term matching.
