Information Extraction and knowledge modelling from oral communication notes (Fabrice EVEN - PhD The

 
 
 
 
 
Value               This Doc    Scribd Average
Pages               252         43
Words               92373       13640
Characters          581210      81678
Lines               3630        623
Letters per word    6.29        5.99
Words per line      25.45       21.89
Words per page      366.56      317.21


Document Information

237 Reads | 0 Comments

Description

In spite of the rise of Information Extraction and the development of many applications in the
last twenty years, this task encounters problems when it is carried out on atypical texts such as
oral communication notes.
Oral communication notes are texts written during an oral communication (meeting,
talk, etc.) with the aim of synthesizing its informative content. The drafting
constraints (speed and a limited amount of writing) lead to linguistic characteristics
to which the traditional methods of Natural Language Processing and Information Extraction are
poorly suited. Although these notes are rich in information, they are not exploited by systems
which extract information from texts.
In this thesis, we propose an extraction method adapted to oral communication notes. This
method, called MEGET, is based on an ontology which depends on the information to be
extracted (“extraction ontology”). This ontology is obtained by the unification of an “ontology of
needs”, which describes the information to be found, with an “ontology of terms”, which
conceptualizes the terms of the corpus which are related to the required information. The
ontology of terms is elaborated from terminology extracted from texts and enriched by terms
found in specialized documents. The extraction ontology is formalized by a set of rules which are
provided as a knowledge base for the extraction system SYGET. This system (1) labels
each instance of every element of the extraction ontology and (2) extracts the
information. This approach is validated in several corpora
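
The pipeline described above (unify an ontology of needs with an ontology of terms, then label and extract) can be illustrated with a minimal Python sketch. It is not the MEGET method or the SYGET system: the ontologies, the regular-expression rules, the sample note, and the function names `unify`, `label` and `extract` are all hypothetical, chosen only to show the general shape of such a two-step process.

```python
import re

# Hypothetical "ontology of needs": the information to be found.
ONTOLOGY_OF_NEEDS = {"meeting": ["date", "contact"]}

# Hypothetical "ontology of terms": corpus terms grouped by concept,
# written here as regular expressions.
ONTOLOGY_OF_TERMS = {
    "date": [r"\d{2}/\d{2}/\d{4}"],
    "contact": [r"(?:mr|mrs)\.? [A-Z][a-z]+"],
    "product": [r"loan"],  # not requested by any need, so it is dropped
}


def unify(needs, terms):
    """Build extraction rules: keep only term patterns that a need asks for."""
    rules = {}
    for slots in needs.values():
        for slot in slots:
            if slot in terms:
                rules[slot] = [re.compile(p, re.IGNORECASE) for p in terms[slot]]
    return rules


def label(text, rules):
    """Step 1: label every instance of every concept found in the text."""
    labels = []
    for concept, patterns in rules.items():
        for pattern in patterns:
            for match in pattern.finditer(text):
                labels.append((match.start(), match.end(), concept, match.group()))
    return sorted(labels)  # ordered by position in the text


def extract(labels, needs):
    """Step 2: fill one record per need with the first labelled instance."""
    records = {}
    for need, slots in needs.items():
        record = {slot: None for slot in slots}
        for _, _, concept, surface in labels:
            if concept in record and record[concept] is None:
                record[concept] = surface
        records[need] = record
    return records


if __name__ == "__main__":
    note = "met mr Dupont 12/03/2008 re loan"  # invented sample note
    rules = unify(ONTOLOGY_OF_NEEDS, ONTOLOGY_OF_TERMS)
    print(extract(label(note, rules), ONTOLOGY_OF_NEEDS))
```

In this sketch the term "loan" is never labelled, because no need refers to the `product` concept; this mirrors the idea that the extraction ontology depends on the information to be extracted.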

252 Pages


Date Added

06/19/2009

Copyright

Attribution Non-commercial
