VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY
--------
Nguyễn Quốc Đại
Major: Information Technology
Supervisor: Dr. Phạm Bảo Sơn
HANOI 2009
To My Family
Acknowledgements
Hanoi, 24 May 2009
Nguyễn Quốc Đại
Abstract
Table of Contents
Acknowledgements
Abstract
Table of Contents
List of Abbreviations
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Overview of Question Answering
2.1 Overview of Ontology-based question answering systems
2.2 Question Answering (QA) concepts
2.2.1 Definition of question answering
2.2.2 Question answering architecture
2.2.3 Question answering methods
2.3 Closed-domain natural language interfaces
2.4 Open-domain QA systems
2.5 Ontologies in QA
Chapter 3. Ontology - Sesame
3.1 Ontology concepts
3.1.1 Definition
3.1.2 Overview of Ontology
3.1.2.1 Components of an Ontology
3.1.2.2 Individuals
3.1.2.3 Classes
3.1.2.4 Attributes
List of Abbreviations
No.  Abbreviation  Meaning
1    NLP           Natural language processing
2    QA            Question answering
3    API           Application programming interface
Chapter 1.
Introduction
Chapter 2.
Overview of Question Answering
Following the architecture in Figure 2.1, the Ontology identifies the relevant concepts: a hybrid approach, combining syntactic and statistical methods, is used to extract concepts from the documents.
Syntax-based methods detect patterns of compound nouns and domain-specific structures. Pattern detection complements other concept extraction methods. Statistical approaches to concept extraction typically measure how often words occur in one or more given document collections: a concept that occurs frequently in a document is considered significant and is extracted.
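The statistical approach described above can be sketched as a simple frequency count. This is only a minimal illustration, not the method of any system cited in this chapter: the tokenizer, the stop-word list, the sample sentences, and the `min_count` threshold are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list: function words carry no domain information.
STOP_WORDS = {"the", "a", "of", "in", "and", "is", "to"}

def extract_concepts(documents, min_count=2):
    """Return candidate concepts: words that occur at least min_count times."""
    counts = Counter()
    for text in documents:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return sorted(word for word, count in counts.items() if count >= min_count)

# Made-up sample documents.
docs = [
    "The student studies in the class of the university.",
    "A student is a member of a class.",
]
print(extract_concepts(docs))  # ['class', 'student']
```

A real extractor would combine such counts with the syntactic patterns and part-of-speech filtering discussed in this section.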
Part-of-speech tagging is used to improve concept recognition. Tagged words are selected for extraction according to priority. Other parts of speech, such as linking prepositions and identifying prepositions, are excluded because they carry no information about the concept domain. Formalized concepts are well suited to easy structuring [...]
[...] databases based on formal semantics [19], which makes a clear distinction between the front end and the back end of natural language (NL) processing. The front end provides a mapping from English sentences to semantic expressions, and the back end maps these expressions onto expressions that are meaningful in the question domain. TEAM [41] is an experimental, flexible natural language interface to databases (NLIDB) developed in the 1980s. It consists of two main components: (1) a component that maps NL expressions to formal representations, and (2) a component that converts these representations into statements for the database. TEAM [41] thus separates linguistic processing from the mapping onto the knowledge base (KB).
PRECISE [47] maps questions to the corresponding SQL queries by recognizing classes of simple questions. A question is treated as a set of attribute-value pairs together with a relation. Each attribute in the database is associated with a wh-value (what, when, where, ...). PRECISE uses a lexicon to find synonyms. However, finding a lexicon-based mapping to the database requires every word to be distinct from the others. The system cannot analyze the semantics of a question that contains words it does not know, so it cannot process such a question. In other words, PRECISE does not answer questions that contain words missing from its dictionary.
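The limitation described above, that a question is rejected as soon as it contains a word outside the dictionary, can be illustrated with a small sketch. This is not PRECISE's actual algorithm; the lexicon entries and the function name `map_question` are hypothetical.

```python
# Hypothetical lexicon mapping known words to database elements.
LEXICON = {
    "what": ("wh-value", "what"),
    "price": ("attribute", "price"),
    "restaurant": ("relation", "restaurant"),
}

def map_question(tokens):
    """Map every token to a database element, or return None if any token is unknown."""
    mapped = []
    for token in tokens:
        if token not in LEXICON:
            return None  # unknown word: the whole question is rejected
        mapped.append(LEXICON[token])
    return mapped

print(map_question(["what", "price", "restaurant"]) is not None)  # True
print(map_question(["what", "menu"]))                             # None
```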
Given a question, LASSO automatically identifies: (a) the question type (what, why, who, how, where), (b) the answer type (person, location, ...), and (c) the question focus, which is the main information the question asks for. In addition, the question type hierarchy helps identify the keywords in the question. Often, many words in the question do not reappear in the answer. Open-domain systems therefore try to find synonyms, together with morphological variants of those synonyms, for the terms or keywords.
In TREC-9 [18], the FALCON system described by Harabagiu et al. [25] maps the semantic answer type using a named entity recognizer. If a concept in the question indicates the answer type, FALCON maps the question into the answer type taxonomy. All nouns (and their morphological variants) related to the concepts that determine the answer type are treated as keywords. FALCON returns a cached answer if a similar question has been asked before.
START [33] focuses on questions about geography and about the MIT laboratory. START uses triples of the form object-property-value. It is a large system that is highly regarded among question answering (QA) systems for its ability to analyze and combine questions. For a complex question, START splits the question into smaller questions, each of which can be answered directly from the database, and then combines the answers to those smaller questions. START is also particularly effective at finding answers thanks to its strong semantic analysis.
Litkowski [38] presents the DIMAP system, which extracts semantic relation triples after a document has been parsed and its parse tree examined. The triples are stored in a database and used to answer questions. A semantic relation triple consists of an entity (SUBJ, OBJ, TIME, NUM, ADJMOD), a semantic relation that characterizes the entity's role, and a word in the sentence to which the entity is related. A triple usually corresponds to a logical form. The entities are the key components of DIMAP triples; the important elements (head nouns, main verbs, and any modifying adjectives or nouns) are determined for each question type. The system divides questions into six types: time, location, who, what, size, and number.
Chapter 3.
Ontology - Sesame
[...] that objects (and classes) can have.
Relations: ways in which classes and individuals can be related to other classes (or individuals).
3.1.2.3 Classes
Classes can be defined as an extension or as an intension [58]. Under an extensional definition, classes are abstract groups, sets, or collections of objects. Under an intensional definition, classes are abstract objects defined by the values of aspects that act as constraints for membership in the class. The extensional definition leads to Ontologies in which a class is a subclass of collection. The intensional definition leads to Ontologies in which collections and classes are more fundamentally different. Classes can classify individuals, other classes, or a combination of both. Examples of classes:
-
<có_tên> Nguyễn_Quốc_Đại (has name)
<có_quê> Hà_Nội (has hometown)
<học> k50_khoa_học_máy_tính (studies in)
Figure 3.2. Example of a class [...]
3.3 Sesame
3.3.1 Sesame concepts
As introduced in Section 3.2, the Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. RDF is used to describe concepts or to model information.
RDF Schema (variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is an extensible knowledge representation language that provides the basic elements for describing Ontologies (also called RDF vocabularies) and is intended to structure RDF resources.
Sesame is an open-source Java framework for storing, querying, and reasoning over RDF and RDF Schema [59]. Sesame can be used as a database for RDF and RDF Schema, or as a Java library for applications that need to work with RDF internally.
Suppose an application needs to read a large RDF file, find the relevant information, and use it. Sesame provides the tools needed to parse, interpret, query, and store all of this information, embedded in the application itself. More generally, Sesame provides a toolbox of useful features for working with RDF.
3.3.1.1 Sesame Server
Sesame can be used as a server with which client applications communicate over HTTP. Sesame can be deployed as a Java Servlet application in Apache Tomcat, a web server that supports Java Servlets and JSP.
The Storage And Inference Layer (SAIL) API is an internal Sesame application programming interface (API) that provides reasoning support (Figure 3.4). SAIL implementations also provide functionality such as caching and concurrent access handling. Each Sesame repository has its own SAIL object that represents it.
In the Sesame architecture (Figure 3.4), Sesame's functional modules sit on top of the SAIL: the SeRQL, RQL, and RDQL query engines, the administration module, and the RDF export module. These modules are accessed through Sesame's Access APIs, which consist of two separate parts: the Repository API and the Graph API. The Repository API provides high-level access to Sesame repositories, such as querying, storing RDF files, and extracting RDF. The Graph API provides finer-grained support for RDF manipulation, such as adding and removing individual statements and creating small RDF models directly from code. The two APIs complement each other and are often used together in applications.
The Access APIs provide direct access to Sesame's functional modules, either to a client program (for example, a desktop application that uses Sesame as a library) or to the next component in the architecture, the Sesame server. The Sesame server is a component that provides HTTP-based access to Sesame's APIs. On the remote HTTP client side, the Access APIs appear again and are used to communicate with Sesame.
3.3.2 Installing Sesame
Sesame can be deployed in several ways. The two most common are deployment as a Java library and deployment as a server.
3.3.2.1 Installing the Sesame library
The Sesame library consists of the following files:
- Sesame.jar
3.3.2.3 Server administration
Changing the system configuration
Sesame's configuration is stored in the file [SESAME_DIR]/WEB-INF/system.conf. To change the configuration file, use the Configure Sesame! tool available in [SESAME_DIR]/WEB-INF/bin/. To start it, run configSesame.bat (on Windows) or configSesame.sh (on UNIX) (Figure 3.5).
Loading the system configuration
Open the Users tab.
The toolbar at the top of the screen shows user information and repository information, and lets the user choose among the operations available on the repository. The operations are grouped into read operations (such as queries) and write operations (adding and removing data).
3.3.3.2 Adding data to a repository
The web interface offers three options for adding data to a Sesame repository: Add file, Add (www), and Add (copy-paste).
The Add file and Add (www) options are straightforward: the first selects an RDF document on disk to add to the Sesame repository, and the second adds an RDF document to the repository from a URL.
The Add (copy-paste) option uploads data to Sesame by typing (or copying and pasting) it into a text area. The text entered must be a valid RDF/XML document.
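Before pasting text into the Add (copy-paste) area, a quick sanity check can catch malformed input. The sketch below uses only Python's standard library and merely verifies that the text is well-formed XML with an rdf:RDF root element; it does not perform full RDF validation, which Sesame does on upload. The `ex:` namespace, the resource names, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RDF/XML document; the ex: namespace and resource names are made up.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/uet#">
  <rdf:Description rdf:about="http://example.org/uet#nguyen_quoc_dai">
    <ex:study rdf:resource="http://example.org/uet#k50_computer_science"/>
  </rdf:Description>
</rdf:RDF>
"""

def looks_like_rdf_xml(text):
    """Check that text is well-formed XML whose root element is rdf:RDF."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    return root.tag == "{%s}RDF" % RDF_NS

print(looks_like_rdf_xml(SAMPLE))       # True
print(looks_like_rdf_xml("<rdf:RDF>"))  # False: unbound prefix and unclosed tag
```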
Chapter 4.
Ontology-based Vietnamese Question Answering System
Section 4.1 introduces the overall architecture of the system and its components. Section 4.2 describes how the system processes the input question [1]. Section 4.3 presents our approach to Ontology design, together with an experimental Ontology for an organization, namely the University of Engineering and Technology. Sections 4.4 and 4.5 describe how the system extracts answers using the Ontology mapping component and the answer extraction component. Questions are posed within a specific application domain based on the Ontology designed in Section 4.3, from which we return the best possible semantic answer to the user.
[Diagram: natural language question processing component, intermediate representation, answer retrieval component, Sesame server, OWL Ontology, semantic answer]
Figure 4.1. Overall architecture of the Ontology-based Vietnamese question answering system
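[Diagram: Sesame server, OWL, intermediate representation, pre-processing, string distance algorithm, Ontology mapping, interaction, description triples compatible with the Ontology, answer extraction, semantic answer, user]
Figure 4.2. Architecture of the answer retrieval component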
Figure 4.2 shows the architecture of the answer retrieval component. The intermediate representation of the question, after pre-processing, is the input to the Ontology mapping. The Ontology mapping combines it with data from the Ontology stored on the Sesame server, using a string distance algorithm. The string distance algorithm is used to find suitable corresponding terms in the Ontology, and the Ontology mapping can also interact with the user to obtain terms that fit the Ontology. The Ontology mapping produces description triples that are compatible with the Ontology. The answer extraction component then uses these triples to return the most semantically appropriate answer to the user.
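Examples:
sinh viên nào học lớp khoa học máy tính của trường đại học công nghệ? (Which students study in the computer science class of the University of Engineering and Technology?)
(sinh viên, học, lớp khoa học máy tính, trường đại học công nghệ)
trường của Nguyễn Quốc Đại của Nguyễn Quốc Đạt là gì? (What is the school of Nguyễn Quốc Đại of Nguyễn Quốc Đạt?)
(?, trường, Nguyễn Quốc Đại, Nguyễn Quốc Đạt).
The above are the question types that we pose to the system. After designing the Ontology (Section 4.3), we give an overview of the Ontology mapping and answer extraction components (Sections 4.4 and 4.5) based on these question types. We then evaluate the system on the questions asked.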
Designing the Ontology is therefore extremely important for representing knowledge about a specific application domain.
Our Ontology-based Vietnamese question answering system can be applied to many application domains. In this thesis, however, we design an experimental Ontology for one organization, the University of Engineering and Technology. We store the designed Ontology on a Sesame server and, on that basis, describe the Ontology mapping and answer extraction components in detail (Sections 4.4 and 4.5).
We designed the experimental Ontology for the University of Engineering and Technology with Protege 3.3.1 [68]. Developing an Ontology involves the following steps:
-
[...] which, with the objects: hà_nội, hải_dương, bắc_giang, ...
Figure 4.3 shows the classes designed in the Ontology with Protege.
While analyzing and studying the relevant relationships, we designed a number of relations described through properties (Figure 4.4).
After designing the classes, the properties, and the objects of each class, we filled in the corresponding link values for each object according to its properties. For example:
nguyễn_quốc_đại học k50_khoa_học_máy_tính (nguyễn_quốc_đại studies in k50_khoa_học_máy_tính)
nguyễn_quốc_đại có_quê hà_nội (nguyễn_quốc_đại has hometown hà_nội), ...
Designing an Ontology is a long process that requires studying the concepts and analyzing the relationships accurately. We then apply this experimental Ontology to the system and evaluate the system's behavior on the questions posed.
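The two example facts above can be viewed as (subject, property, object) triples. The sketch below holds them in a plain Python set, with identifiers written with their Vietnamese diacritics; it is only an in-memory illustration, not how Sesame stores the Ontology, and the helper name `objects_of` is an assumption.

```python
# Each fact is a (subject, property, object) triple, as in the examples above.
TRIPLES = {
    ("nguyễn_quốc_đại", "học", "k50_khoa_học_máy_tính"),
    ("nguyễn_quốc_đại", "có_quê", "hà_nội"),
}

def objects_of(subject, prop):
    """Return every object linked to subject through the property prop."""
    return {o for (s, p, o) in TRIPLES if s == subject and p == prop}

print(objects_of("nguyễn_quốc_đại", "có_quê"))  # {'hà_nội'}
```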
4.4 Ontology mapping
Ontology mapping is the foundation of our Vietnamese question answering system. The question processing component [1] analyzes the input question into an intermediate representation triple, which is the input to the Ontology mapping. The Ontology mapping produces description triples of concepts, objects, and relations that are compatible with the Ontology. The answer extraction component uses these description triples to return the best possible semantic answer to the user.
To build a triple describing the concepts, relations, and objects that correspond to the Ontology, the Ontology mapping first takes the intermediate representation triple produced by [1] and uses a synonym set for each component of the triple. The terms are then matched against the concepts and objects in the Ontology. If matching fails, the Ontology mapping uses the string distance algorithm to find similar concepts and objects in the Ontology. If the string distance algorithm returns more than one result, the meaning of the term is ambiguous, and the system asks the user to choose the appropriate concept or object.
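The text does not name the string distance measure it uses. As one common choice, the sketch below implements the Levenshtein edit distance and a normalized similarity score; treating this as the measure used by the Ontology mapping is an assumption, and the function names are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

def similarity(a: str, b: str) -> float:
    """Normalize the distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

print(levenshtein("kitten", "sitting"))  # 3
```

A similarity score makes it easy to compare candidates of different lengths against a single threshold.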
Once the terms denoting the corresponding concepts and objects in the Ontology have been found, the Ontology mapping uses them to search for relations that match the input relation. If no relation matches, the Ontology mapping uses the string distance algorithm or interacts with the user. When the Ontology mapping finds the corresponding relation in the Ontology, the system forms description triples of concepts, objects, and relations that fit the Ontology. These triples are the input to the answer extraction component, which returns the most semantically appropriate answer possible.
The Ontology mapping component handles each case separately: each question type is processed differently. Questions fall into two groups, simple questions and complex questions. Simple questions are classified as follows:
- sinh viên nào học lớp k50 khoa học máy tính (which students study in the k50 computer science class) -> NORMAL: (sinh viên, học, lớp k50 khoa học máy tính) -> Ontology mapping -> (sinh_viên, học, k50_khoa_học_máy_tính).
- m của Nguyễn Quốc Đại là gì -> UNKN_TERM: (?, m, Nguyễn Quốc Đại) -> Ontology mapping -> (m, c_m, nguyễn_quốc_đại).
(sinh_viên, học, k50_khoa_học_máy_tính)
(sinh_viên, học, đại_học_công_nghệ).
Below we describe how the Ontology mapping component handles each specific question type. We first present the handling of simple questions and then describe how the Ontology mapping works for complex questions.
4.4.1 Ontology mapping for simple questions
After being analyzed by the natural language question processing component [1], questions are assigned to the corresponding types. This section covers the types NORMAL, UNKN_TERM, UNKN_REL, and AFFIRM_NEG. Each such question is represented by a triple expressing a binary relation between two terms, and this triple is the input to the Ontology mapping. Depending on the question type, the Ontology mapping searches for the appropriate terms in the Ontology in a different way. Figure 4.5 shows how the Ontology mapping handles these simple question types.
[Figure 4.5: the intermediate representation triple (Term 1, Relation, Term 2) is mapped to the Ontology terms Term_1 and Term_2 and the relation Relation]
If matching fails, the Ontology mapping uses the string distance algorithm to find similar concepts (or objects) in the Ontology. If the similarity between two concepts (or two objects) exceeds a given threshold, the algorithm returns the corresponding concept (or object). If the algorithm returns more than one result, the meaning of the term is still ambiguous. For example, suppose the string distance algorithm compares lớp khoa học máy tính (computer science class) with the objects in the Ontology. Ambiguity arises when the algorithm returns both the object k50_khoa_học_máy_tính, an instance of the class lớp (class), and the object khoa_học_máy_tính, an instance of the class bộ_môn (department). The Ontology mapping then asks the user to choose the corresponding term. Once the user has responded, the system has the term it needs, one that fits the Ontology. Such a term is either a concept term that denotes a class in the Ontology or an object term that belongs to some class in the Ontology.
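The threshold test and the ambiguity case from the example above can be sketched as follows. Python's standard `difflib.SequenceMatcher` stands in here for the string distance algorithm, which the text does not specify; the threshold of 0.8, the object list, and the `ask_user` callback are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Object names taken from the example above; the threshold value is an assumption.
ONTOLOGY_OBJECTS = ["k50_khoa_học_máy_tính", "khoa_học_máy_tính", "hà_nội"]
THRESHOLD = 0.8

def candidates(term, names=ONTOLOGY_OBJECTS, threshold=THRESHOLD):
    """Return the Ontology names whose similarity to term exceeds the threshold."""
    query = term.replace(" ", "_")
    return [name for name in names
            if SequenceMatcher(None, query, name).ratio() > threshold]

def resolve(term, ask_user):
    """Exact match first, then similarity; ask the user only when ambiguous."""
    query = term.replace(" ", "_")
    if query in ONTOLOGY_OBJECTS:
        return query
    found = candidates(term)
    if not found:
        return None
    if len(found) == 1:
        return found[0]
    return ask_user(found)  # ambiguity: more than one result

# The lambda stands in for the interactive prompt and simply picks the first option.
print(resolve("lớp khoa học máy tính", ask_user=lambda options: options[0]))
```

With these inputs both k50_khoa_học_máy_tính and khoa_học_máy_tính pass the threshold, so the callback is invoked, which mirrors the interaction step described above.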
Based on the terms just found, the Ontology mapping finds all relations between the two terms. If the relation cannot be matched, the Ontology mapping uses the string distance algorithm to find the corresponding relation. If ambiguity arises because the string distance algorithm returns more than one result, the system asks the user to choose. After this step, the Ontology mapping has found the appropriate relation between the two terms in the Ontology. The outcome depends on the question type, however. For an UNKN_TERM question (where the first term is missing), for example, the relation is sometimes found through the string distance algorithm, but in other cases the phrase describing the relation in the question is itself a concept term that denotes a class in the Ontology.
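The relation lookup described above can be sketched as a search over property declarations. The declarations, the class name địa_chỉ, and the substring test (a crude stand-in for the string distance algorithm) are all hypothetical; only the property names học and có_địa_chỉ and the class sinh_viên come from the examples in this chapter.

```python
# Hypothetical property declarations: (domain class, property, range class).
PROPERTIES = [
    ("sinh_viên", "học", "lớp"),
    ("sinh_viên", "có_địa_chỉ", "địa_chỉ"),
]

def relations_between(class_1, class_2):
    """All properties declared between the two classes, in either direction."""
    return [prop for (domain, prop, rng) in PROPERTIES
            if (domain, rng) in ((class_1, class_2), (class_2, class_1))]

def match_relation(word, class_1, class_2):
    """Return the property between the classes that equals word, or the single
    property whose name contains word; None if there is no unique match."""
    options = relations_between(class_1, class_2)
    if word in options:
        return word
    partial = [prop for prop in options if word in prop]
    return partial[0] if len(partial) == 1 else None

print(match_relation("học", "sinh_viên", "lớp"))          # học
print(match_relation("địa_chỉ", "sinh_viên", "địa_chỉ"))  # có_địa_chỉ
```

The second call illustrates the UNKN_TERM case in which the phrase used for the relation is also the name of a class.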
We give a concrete example of each case to show how the Ontology mapping component handles it. At the end of the Ontology mapping process, a description triple of concepts, objects, and relations that fits the Ontology has been formed. This description triple is the input to the answer extraction component, which returns the best possible semantic answer to the user. To illustrate how the Ontology mapping component works, we present several examples of each question type and how they are handled.
sinh viên có địa chỉ ở đâu? (Where do students have their addresses?)
địa chỉ của sinh viên là gì? (What is the address of the students?)
[Diagram: the intermediate representation (Term 1, Relation, Term 2, Term 3) is mapped to the Ontology terms Term_1, Term_2, and Term_3 and the set of relations Relation_1 and Relation_2]
sinh_viên
[Diagram: Ontology mapping, Term_1 (set of objects), Relation, Term_2 (object), answer]
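For example, for the question
sinh viên nào học lớp khoa học máy tính? (Which students study in the computer science class?)
the Ontology mapping produces the description triple (sinh_viên, học, k50_khoa_học_máy_tính), which corresponds to the Ontology. The answer extraction component searches all objects of the class sinh_viên in the Ontology and, for each object, uses the relation học in the Ontology to check whether it links to the individual [...]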
[Diagram: Ontology mapping, Relation, Term_2 (object), answer]
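However, for the question:
sinh viên có địa chỉ ở đâu? (Where do students have their addresses?)
As analyzed above, the system obtains the description triple (?, có_địa_chỉ, sinh_viên). The answer extraction component finds all individuals of the class sinh_viên in the Ontology, follows the relation có_địa_chỉ from each of them, and returns the individuals it links to as the answer to the user (Figure 4.17).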
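The extraction step for a triple of the form (?, relation, class) can be sketched as below. The class members and especially the address values are invented for illustration, and the dictionaries stand in for queries against the Ontology on the Sesame server.

```python
# Hypothetical Ontology content; the address values are invented for illustration.
CLASS_MEMBERS = {"sinh_viên": ["nguyễn_quốc_đại", "nguyễn_quốc_đạt"]}
LINKS = {
    ("nguyễn_quốc_đại", "có_địa_chỉ"): ["hà_nội"],
    ("nguyễn_quốc_đạt", "có_địa_chỉ"): ["hải_dương"],
}

def answer_unknown_term(relation, class_name):
    """For a triple (?, relation, class_name), collect what each member of the
    class links to through the relation."""
    answers = {}
    for member in CLASS_MEMBERS.get(class_name, []):
        targets = LINKS.get((member, relation), [])
        if targets:
            answers[member] = targets
    return answers

print(answer_unknown_term("có_địa_chỉ", "sinh_viên"))
```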
[Diagram: Ontology mapping, Term_1, Relation, Term_2, answer]
Figure 4.18. Answer extraction component for AFFIRM_NEG questions
[Diagram: Ontology mapping, Term_1 (set of objects), Relation_1, Term_2 (object), Relation_2, Term_3 (object), sets of objects, answer]
Figure 4.20. Answer extraction component for THREETERM questions
For the question:
sinh viên nào học lớp khoa học máy tính của trường đại học công nghệ? (Which students study in the computer science class of the University of Engineering and Technology?)
the Ontology mapping forms two triples, (sinh_viên, học, k50_khoa_học_máy_tính) and (sinh_viên, học, đại_học_công_nghệ). The answer extraction component finds all objects for each triple: the objects of the class sinh_viên linked by the relation học to k50_khoa_học_máy_tính, and the objects of the class sinh_viên linked by the relation học to đại_học_công_nghệ. It then takes the intersection of the two sets of objects and returns it as the answer to the user (Figure 4.21).
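The intersection step for a THREETERM question can be sketched as follows. The fact set is hypothetical (sinh_viên_b is an invented second student), and the in-memory sets stand in for queries against the Ontology.

```python
# Hypothetical facts; sinh_viên_b is an invented second student.
FACTS = {
    ("nguyễn_quốc_đại", "học", "k50_khoa_học_máy_tính"),
    ("nguyễn_quốc_đại", "học", "đại_học_công_nghệ"),
    ("sinh_viên_b", "học", "đại_học_công_nghệ"),
}
CLASS_MEMBERS = {"sinh_viên": {"nguyễn_quốc_đại", "sinh_viên_b"}}

def subjects(class_name, relation, target):
    """Members of class_name linked to target through relation."""
    members = CLASS_MEMBERS.get(class_name, set())
    return {s for (s, p, o) in FACTS
            if s in members and p == relation and o == target}

def answer_three_term(triples):
    """Intersect the answer sets of all description triples."""
    result = None
    for class_name, relation, target in triples:
        found = subjects(class_name, relation, target)
        result = found if result is None else result & found
    return result or set()

question = [
    ("sinh_viên", "học", "k50_khoa_học_máy_tính"),
    ("sinh_viên", "học", "đại_học_công_nghệ"),
]
print(answer_three_term(question))  # {'nguyễn_quốc_đại'}
```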
System evaluation    Number of questions    Percentage
[...]                25                     50%
[...]                10                     20%
[...]                35                     70%
No.  System evaluation                  Number of questions  Percentage
1    Errors due to Ontology mapping     10                   20%
2    Errors due to answer extraction    5                    10%
     Total number of failed questions   15                   30%
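The percentages in the evaluation tables can be cross-checked with a few lines of arithmetic. The total of 50 test questions is an assumption inferred from the tables (10 questions correspond to 20%); it is not stated explicitly in the surviving text.

```python
TOTAL_QUESTIONS = 50  # assumption: implied by 10 questions = 20%

def percentage(count, total=TOTAL_QUESTIONS):
    """Share of the test questions, as a whole-number percentage."""
    return round(100 * count / total)

mapping_errors = 10
total_errors = 15
extraction_errors = total_errors - mapping_errors  # the remaining failures

print(percentage(mapping_errors))     # 20
print(percentage(extraction_errors))  # 10
print(percentage(total_errors))       # 30
```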
Chapter 6. Conclusion
Appendix A
References
[1].
[2].
[3].
[4]. AKT Reference Ontology, http://kmi.open.ac.uk/projects/akt/ref-onto/index.html.
[5]. [...] 327-330.
[6]. I. Androutsopoulos, G.D. Ritchie, P. Thanisch, Natural language interfaces to databases - an introduction, Nat. Lang. Eng. 1 (1) (1995) 29-81.
[7].
[8].
[9].
[10]. T. Berners-Lee, J. Hendler, O. Lassila, The semantic web, Sci. Am. 284 (5) (2001).
[11]. J. Burger, C. Cardie, V. Chaudhri, et al., Tasks and Program Structures to Roadmap Research in Question & Answering (Q&A), NIST Technical Report, 2001. http://www.ai.mit.edu/people/jimmylin/%0Apapers/Burger00Roadmap.pdf.
[12]. R.D. Burke, K.J. Hammond, V. Kulyukin, Question answering from frequently-asked question files: experiences with the FAQ finder system, Tech. Rep. TR-
[13].
[14]. P. Clark, J. Thompson, B. Porter, A knowledge-based approach to question-answering, in: In the AAAI Fall Symposium on Question-Answering Systems, CA, AAAI, 1999, pp. 43-51.
[15]. [...] 2.cs.cmu.edu/wcohen/postscript/ijcai-ws-2003.pdf.
[16]. A. Copestake, K.S. Jones, Natural language interfaces to databases, Knowl. Eng. Rev. 5 (4) (1990) 225-249.
[17].
[18].
[19].
[20]. Discourse Representation Theory. Jan Van Eijck, to appear in the 2nd edition of the Encyclopedia of Language and Linguistics, Elsevier, 2005.
[21].
[22]. EasyAsk: http://www.easyask.com.
[23].
[24].
[25].
[26].
[27].
[28].
[29].
[30].
[31].
[32].
[33].
[34].
[35].
[36]. D. Klein, C.D. Manning, Fast Exact Inference with a Factored Model for Natural Language Parsing, Adv. Neural Inform. Process. Syst. 15 (2002).
[37].
[38].
[39].
[40].
[41].
[42].
[43].
[44].
[45].
[46].
[47].
[48]. RDF: http://www.w3.org/RDF/.
[49].
[50].
[51].
[52].
[53].
[54]. [...] 500.
[55]. Z. Zheng, The answer bus question answering system, in: Proceedings of the Human Language Technology Conference (HLT2002), San Diego, CA, March 24-27, 2002.
[56]. http://en.wikipedia.org/wiki/Ontology_(computer_science).
[57]. http://en.wikipedia.org/wiki/Web_ontology_language.
[58]. http://en.wikipedia.org/wiki/Ontology_components.
[59].
[60]. http://en.wikipedia.org/wiki/Resource_Description_Framework.
[61]. http://www.w3.org/TR/rdf-concepts/.
[62]. http://www.w3.org/TR/rdf-schema/.
[63]. http://en.wikipedia.org/wiki/RDF_Schema.
[64]. http://en.wikipedia.org/wiki/Question_answering
[65]. http://www.w3.org/TR/owl-guide/
[66]. http://www.w3.org/TR/owl-ref/
[67]. http://www.w3.org/TR/owl-features/
[68].
[69]. http://protege.stanford.edu/doc/owl/getting-started.html.
[70]. http://protege.stanford.edu/conference/2005/slides/T2_OWLTutorialI_Drummond_final.pdf.