Information Extraction and knowledge modelling from oral communication notes (Fabrice EVEN - PhD Thesis)

Published by godai_44880
Despite the rise of Information Extraction and the development of many applications over the
last twenty years, the task remains difficult when it is carried out on atypical texts such as
oral communication notes.
Oral communication notes are texts that result from an oral communication (meeting,
talk, etc.) and aim to summarize its informative content. The constraints under which they
are written (speed and a limited amount of writing) give them linguistic characteristics to
which traditional methods of Natural Language Processing and Information Extraction are
poorly suited. Although these notes are rich in information, they are not exploited by systems
that extract information from texts.
In this thesis, we propose an extraction method adapted to oral communication notes. This
method, called MEGET, is based on an ontology that depends on the information to be
extracted (the “extraction ontology”). This ontology is obtained by unifying an “ontology of
needs”, which describes the information to be found, with an “ontology of terms”, which
conceptualizes the terms of the corpus that are related to the required information. The
ontology of terms is built from terminology extracted from the texts and enriched with terms
found in specialized documents. The extraction ontology is formalized as a set of rules, which are
provided as a knowledge base to the extraction system SYGET. This system (1) labels
each instance of every element of the extraction ontology and (2) extracts the
information. This approach is validated on several corpora.
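
The pipeline described in the abstract (unify two ontologies into rules, label instances, then extract) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the concepts `Amount`, `Date`, and `Client`, the regular-expression rule format, and the function names `unify`, `label`, and `extract` are hypothetical and are not taken from the thesis, which does not describe the actual MEGET ontologies or SYGET rule language on this page.

```python
import re
from dataclasses import dataclass

# Hypothetical "ontology of needs": concepts describing the information to find.
NEEDS = {"Amount": "a sum of money", "Date": "a calendar date"}

# Hypothetical "ontology of terms": corpus term patterns grouped by concept.
TERMS = {
    "Amount": [r"\d+(?:\.\d+)?\s?(?:EUR|euros?)"],
    "Date": [r"\d{1,2}/\d{1,2}/\d{4}"],
    "Client": [r"\bclt\b"],  # a corpus term that no need refers to
}


def unify(needs, terms):
    """Build an illustrative 'extraction ontology' as compiled rules.

    Only term patterns attached to a needed concept are kept.
    """
    return {c: [re.compile(p) for p in terms.get(c, [])] for c in needs}


@dataclass(frozen=True)
class Label:
    concept: str
    start: int
    end: int
    text: str


def label(note, rules):
    """Step 1: label each instance of every element of the rule set."""
    found = [
        Label(concept, m.start(), m.end(), m.group())
        for concept, patterns in rules.items()
        for pattern in patterns
        for m in pattern.finditer(note)
    ]
    return sorted(found, key=lambda lab: lab.start)


def extract(labels):
    """Step 2: collect the labelled instances per concept."""
    result = {}
    for lab in labels:
        result.setdefault(lab.concept, []).append(lab.text)
    return result


note = "mtg 12/03/2004 clt wants loan 15000 euros, call back 19/03/2004"
rules = unify(NEEDS, TERMS)
info = extract(label(note, rules))
```

Separating labelling from extraction, as the abstract does, means the second step only has to reason over typed spans rather than raw text, which is one plausible reason for the two-phase design; the sketch above does not attempt to handle the abbreviations and telegraphic syntax that make real notes hard.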

More info:

Categories:Types, Research, Science
Published by: godai_44880 on Jun 19, 2009
Copyright:Attribution Non-commercial

École Centrale de Nantes, Université de Nantes, École des Mines de Nantes
Doctoral school STIM (« Sciences et Technologie de l’Information et des Matériaux »)
Year 2005
Information Extraction and knowledge modelling from oral communication notes
(Extraction d’Information et modélisation de connaissances à partir de Notes de Communication Orale)
THESIS
submitted for the degree of
DOCTOR OF THE UNIVERSITÉ DE NANTES
Discipline: COMPUTER SCIENCE
presented and publicly defended by
Fabrice EVEN
on 5 October 2005 at the UFR Sciences et Techniques, Université de Nantes
before the following jury
President: Alexandre DIKOVSKY, Professor, LINA, Université de Nantes
Reviewers: Pierre ZWEIGENBAUM, Professor, INSERM, Hôpitaux de Paris
François ROUSSELOT, Associate Professor (Maître de conférences), LIIA, INSA Strasbourg
Examiners: Noureddine MOUADDIB, Professor, LINA, Université de Nantes
Chantal ENGUEHARD, Associate Professor (Maître de conférences), LINA, Université de Nantes
Pascal MUCKENHIRN, Crédit Mutuel LACO
Thesis supervisor: Professor Noureddine MOUADDIB
Co-supervisor: Associate Professor (Maître de conférences) Chantal ENGUEHARD
Laboratory: Laboratoire d’Informatique de Nantes Atlantique (LINA), CNRS-FRE 2729
No. ED 366-210
