You are on page 1of 12

K yu Hi ngh Quc gia ln th VIII v Nghin cu c bn v ng dng Cng ngh thng tin (FAIR); H Ni, ngy 9-10/7/2015

DOI: 10.15625/vap.2015.000210

THUT TON MI V SO KHP ONTOLOGY


Hunh Nht Pht1, Hong Hu Hnh2, Phan Cng Vinh3
1
i hc Hu
2
i hc Hu
3
Trng i hc Nguyn Tt Thnh TP. HCM
huynhnhutphat@yahoo.com, hhhanh@hueuni.edu.vn, pcvinh@ntt.edu.vn

TM TT So khp ontology (ontology matching) l mt phn quan trng trong k ngh ontology ca Web ng ngha vi
mc tiu tm kim cc so khp (alignment) gia cc thc th ca cc ontology cho. Trong nghin cu ny chng ti xut thut
ton mi v mt cng c da trn thut ton ny tm s tng ng gia cc thc th ca cc ontology u vo. Thut ton
xut ny s dng o mi v s tng ng ca t vng v cng s dng thng tin v cu trc ca cc ontology xc nh thc
th tng ng ca chng. o s tng ng v t vng to ra mt tp t cho mi thc th da trn nhn v thng tin m t ca
chng. Cch tip cn v cu trc to thnh mt mng li cho mi nt trong cc ontology. S kt hp ca phng php tip cn v
t vng v cu trc to thnh ma trn ng dng gia ontology ngun v ontology ch. Thut ton xut ny c th
nghim da trn cc chun c cng nhn v cng c so snh vi cc thut ton khc hin nay. Kt qu thc nghim ca
chng ti cho thy thut ton xut rt hiu qu v nhanh hn so vi cc thut ton khc.
T kho lexical similarity, ontology matching, similarity measure, structure similarity.
I. GII THIU
Cc ontology l cc m hnh khi nim chia s v mt min ng dng, cho nhiu ngi tham gia. Cc ontology
cho Web ng ngha c biu din bi RDFS hoc OWL, ng mt vai tr rt quan trng trong Web ng ngha. So
khp ontology c xut nh l mt gii php tt cho vic chia s v ti s dng kin thc, bng cch cung cp mt
c ch hnh thc xc nh ng ngha ca d liu. N ng mt vai tr quan trng trong vic m rng v s dng cc
ng dng da trn Web ng ngha [1]. Nhng nm gn y c rt nhiu ontology c to ra vi cc min khc nhau.
So khp ontology cho php d liu v tri thc c biu din trong cc ngn ng khc nhau v cc nh dng c
chia s. Chng hn, vi hai ontology u vo v to ra cc so khp (alignment) u ra. Cc so khp ny l mt tp
hp lin quan gia cc thc th tng ng v ng ngha, nh l cc lp c t tn, cc thuc tnh i tng v cc
thuc tnh d liu ca cc ontology u vo.
Trong bi bo ny chng ti trnh by phng php t hp tm s tng ng gia cc thc th (v d: cc lp
c t tn, cc thuc tnh i tng v cc thuc tnh d liu) ca cc ontology, da trn s tng ng v t vng
v cu trc ca chng. Trc tin, chng ti xc nh s tng ng v t vng trong s cc lp c t tn (cc nt
trong ontology), cc thuc tnh i tng (cc cnh trong ontology) v cc thuc tnh d liu (cc gi tr), s dng
o khong cch mi, n to ra mt ti t (bag of words) cho mi thc th trong cc ontology cho. Cc ti sau ,
c so snh vi nhau tnh ton s tng ng v t vng gia hai thc th vi kiu ging nhau (v d nh hai lp
c t tn) t hai ontology. V vy, vic s dng o khong cch ny, chng ti a ra ba ma trn khc nhau v
t vng, n tng ng vi cc c im ging nhau ca cc lp c t tn, cc thuc tnh i tng v cc thuc
tnh d liu ca ontology ngun v ontology ch. Th hai, chng ti tm kim im tng ng gia cc nt (cc lp
c t tn) ca ontology ngun v ontology ch da trn cc cu trc ontology ca chng. Ni cch khc, chng ti
to ra mt mng li gm nhiu nt, vi mi nt s dng cc nt ln cn ca n trong ontology ngun v ontology
ch so snh v mt cu trc gia chng. Cui cng, trong giai on ny, ba ma trn c c cc giai on trc
tng ng v t vng v tng ng v cu trc s c kt hp v s dng thm k thut to ra ma trn tng
ng ton din.
Chng ta bit rng so khp ontology to ra s tng ng gia cc thc th ca hai ontology. Trong bi bo ny
chng ti trnh by thm cng c OMReasoner, n to ra mt khung ng dng c th m rng v s kt hp ca nhiu
cng c so khp ring l v t in WordNet cng nh logic m t c s dng trong vic phn tch so khp
ontology. N x l so khp ontology c hai cp t v ng ngha, v n s dng phn ng ngha cng nh cu trc
ca OWL-DL. Chng ti trnh by kt qu t c ca OMReasoner vi OAEI 2014 theo ba phng php:
Benchmark, Conference v MultiFarm.
Bi bo c t chc nh sau: Phn I gii thiu. Phn II kho st cc nghin cu lin quan. Phn III tho lun
vi thut ton xut chi tit. Phn IV a ra mt v d minh ha. Phn V trnh by cc kt qu thc nghim v cng
c c trin khai da trn thut ton m t. Phn VI trnh by cng c OMReasoner c th m rng v s kt hp
nhiu cng c so khp ring l, Phn VII cc kt lun v nhn xt.
II. CC NGHIN CU LIN QUAN
Trong phn ny chng ti kho st cc phng php tip cn c xut cho vic so khp ontology [3-10].
Nhng phng php tip cn ny c th c chia thnh bn loi: t vng, ng ngha, cu trc v t hp.
696 THUT TON MI V SO KHP ONTOLOGY

Phng php tip cn v t vng l phng php da trn chui nhn dng cc thc th tng ng nhau
trong cc ontology cho. Phng php ny c th c dng nhn dng cc lp tng ng trong cc ontology
ngun v ontology ch da trn s ging nhau v nhn hoc v vic miu t ca chng [3]. Cc k thut ny xem cc
chui c trnh t ca cc ch ci. Chng da trn s nhn bit sau y: cc chui ging nhau nhiu hn, nhiu kh
nng chng biu th cc khi nim ging nhau. Vic so snh cc k thut so khp chui khc nhau, t cc vn v
khong cch n cc vn da trn token c th tm thy trong [4]. Mt s v d ca cc k thut da trn chui c
s dng rng ri trong cc h thng so khp l tin t, hu t, chnh sa khong cch v n-gram.
Cc phng php tip cn tin t v hu t ca hai chui u vo v kim tra xem chui th nht c bt u (kt
thc) so vi chui th hai hay khng. Phng php tip cn ny hiu qu trong vic so khp cc chui c cng ngun
gc v cc t vit tt tng t nhau (v d int v integer). Cc phng php tip cn chnh sa khong cch ca hai
chui u vo c tnh ton chnh sa khong cch gia chng. Chnh sa khong cch l s cc k t c chn
vo, xa i hay thay th chuyn i mt chui ny thnh mt chui khc, c chun ha theo chiu di ca chui
di hn. V d, MLMA+algorithm [9] s dng khong cch Levenshtein [6] chnh sa khong cch v tnh ton s
tng ng v t vng gia hai thc th. Phng php tip cn da trn N-gram ca hai chui u vo v tnh ton s
lng n-grams (tc l trnh t ca n k t) gia chng. V d, 3-grams ca chui nikon l nik, iko, kon. Vy,
khong cch gia nkon v nikon da trn 3-grams s l 1/3.
Vi phng php tip cn ng ngha theo cch thng thng l mt hoc nhiu ti nguyn v ngn ng nh t
vng v t in chuyn ngnh c s dng xc nh cc thc th ng ngha [3]. Nhng phng php tip cn ny
thng s dng kin thc ph thng hoc t in thuc min c th so khp cc t da trn cc mi quan h ngn
ng gia chng (v d: cc t ng ngha, cc t c ngha hp so vi t khi qut). Trong trng hp ny, tn cc thc
th ca ontology c xem nh cc t ca ngn ng t nhin. Mt s phng php tip cn s dng t in kin thc
ph thng c c ngha ca cc thut ng s dng trong cc ontology. V d, WordNet [7] l mt c s d liu
in t v t vng ting Anh (v cc ngn ng khc), trong cc t c ngha ring bit c t vo cc b t ng
ngha. Cc quan h gia cc thc th ontology c th c tnh ton lin quan n cc cc rng buc v ngha trong
WordNet [8-9]. Cc phng php tip cn khc s dng t in v tn min c th thng lu tr mt s kin thc v
min c th, n khng c sn trong b t in kin thc ph thng (v d nh tn ring), nh truy cp vi t ng
ngha, cc t c ngha hp so vi t khi qut v cc mi quan h khc.
Cc phng php tip cn v cu trc nhn dng cc lp ging nhau (cc nt) bng cch quan st cc i snh
ca chng vi cc lp khc da vo ontology c cp v cc thuc tnh ca chng na. tng chnh vi hai lp
ca ontology ngun v ontology ch l tng ng nu chng c nhng ln cn ging nhau (cc cu trc) v cc thuc
tnh ging nhau [2]. V d, GMO l mt thut ton v cu trc, n s dng th hai bn tch bit (bipartite graphs)
miu t cc ontology. N o cc th tng ng v cu trc bi mt php o mi. Tuy nhin, GMO c mt tp cc
cp i snh, chng thng c tm thy trc bi cc phng php tip cn khc, vi d liu nhp vo t bn ngoi
trong qu trnh so khp ca n. Mt s so khp khc v cu trc c s dng so snh cc ontology, chng so khp
da trn cc nt con, cc nt l v cc mi quan h. Trong trng hp i snh, hai thc th khng phi nt l c
xem l tng ng nu chng c cc nt con hoc cc nt l ln lt tng ng nhau. Trong quan h i snh, vic
tnh ton gia cc nt tng ng cng c th da vo cc mi quan h (thuc tnh i tng) ca chng [10].
Cc phng php tip cn v t hp, chng kt hp hai hoc nhiu cc phng php tip cn ni trn (tc l s
kt hp v t vng, ng ngha v phng php tip cn v cu trc) c c kt qu tt hn. MLMA+algorithm [5]
v phin bn ci tin ca n l t hp cc phng php tip cn, n s dng mt k thut tm kim ln cn, n thc
hin hai cp . cp u tin, o s tng ng v hai t vng ca hai ontology u vo. Trng hp ny l o
s tng ng theo tn, n s dng khong cch Levenshtein [6] v o s tng ng v t vng, n s dng
WordNet [7]. cp th hai, cc thut ton c p dng o s tng ng v cu trc tm cc gii php so khp
tt nht.
III. PHNG PHP T HP I SNH CC ONTOLOGY
Trong phn ny chng ti miu t phng php t hp ca chng ti trong so khp cc ontology da trn s
tng ng v t vng v s tng ng v cu trc ca chng. Phng php xut ny bao gm ba giai on tm
cc i snh gia ontology ngun v ontology ch. giai on th nht v giai on th hai, cc ontology u vo c
so khp tng ng v t vng v v cu trc. Sau , giai on th ba cc kt qu tip nhn t hai giai on trc c
kt hp li to ra kt qu tng th. Cc chi tit ca ba giai on ny c gii thch trong cc mc tip theo.
A. Cc thc th so khp v t vng gia hai ontology
Cc phng php tip cn so khp tng ng v t vng l cc phng php da trn chui ca cc thc th
tng ng c xc nh trong cc ontology cho. y, chng ti tm kim s tng ng v t vng c tch bit
gia cc thc th (cc lp c t tn, cc thuc tnh i tng v cc thuc tnh d liu) ca ontology ngun v
ontology ch. V vy, giai on ny chng ti s a ra ba ma trn ring bit tng ng v t vng nh l u ra ca n.
Chng ti gii thiu o mi tng ng v khong cch xc nh s tng ng v t vng ca cc
ontology u vo. Gi s chng ti mun tnh ton s tng ng v t vng gia chui s v chui t. Cc chui ny c
th l mt t hoc mt vn bn cha mt s cc pht biu (statements). Lc u, chng ti chuyn mi chui k t
Hunh Nht Pht, Hong Hu Hnh, Phan Cng Vinh 697

thnh chui cc token bng cch s dng cc du phn cch, sau khi chuyn i thnh cc token s a vo mt ti t.
Bt k k t trong chui cho khng thuc bng ch ci s c xem nh l mt du phn cch. V d, nu chui k
t s cha hai k t khng thuc bng ch ci th chng ti xem hai k t nh l hai du phn cch v loi b chng ra
khi chui s. Kt qu l, chui s s chuyn i thnh ba token tc l ba t. Mi chui s v t sau khi c chuyn i
thnh cc token s cho vo mi ti t tng ng, mi t m chung cho hai ti s b loi b khi hai ti. Sau , nu
khng cn g trong ti th nht v ti th hai, khong cch gia hai chui u vo s l zero. Mt khc, tt c cc t
cn li trong mi ti s c kt ni v dn n khong cch tng ng Levenshtein [6] c tnh ton gia hai t.
Sau khi tnh ton khong cch gia chui s v t, th s tng ng ca chng s l phng trnh sau y:
Lexical_Similarity s, t 1 , _ , (1)
trong distance l khong cch gia chui s v t, v max_len l di ti a ca chui s v t.
Hy xt v d sau y:
s = Part Of
t = is_part_of
bag_of_words(s) = {Part, Of}
bag_of_words(t) = {is, part, of}
Sau khi to cc ti t, chng ti loi b hai t part v of ra khi hai ti t. n y, chng ti s c cc ti
sau:
bags_of_words(s) = {}
bags_of_words(t) = {is}
Cui cng, s tng ng gia chui s v t s l: Levenshtein_distance(, is) = 2
, _ "", " " 2
_ , 1 1 1 10 0.80
max _ , max 7,10
Chng ti tnh ton ring bit s tng ng v t vng trong s cc lp c t tn, cc thuc tnh i tng
v cc thuc tnh d liu ca hai ontology u vo, s dng o ni trn v sau to ra ba ma trn ring bit tng
ng v t vng.
B. Cc thc th so khp v mt cu trc gia hai ontology
Trong phn ny, chng ti a ra phng php so khp v mt cu trc gia hai ontology u vo, trong Hnh 1
cho thy s ca phng php ny (c phc tp thut ton O(n2)). Sau y l cc bc to ra cu trc ma trn
tng ng gia ontology ngun v ontology ch:
1. To mt ma trn ln cn i vi mi ontology.
2. To mt dy cc danh sch lin kt i vi mi ontology da trn ma trn ln cn ca n.
3. Tnh ton s tng ng v cu trc trong s cc nt ca ontology ngun v ontology ch bng cch s dng
danh sch lin kt ca chng v to ra ma trn khi to (ban u) tng ng v cu trc.
4. Ci thin ma trn khi to tng ng v cu trc bng cch s dng ba thao tc b sung (bc 4, Hnh 1).
Cc bc trn sau y c m t mt cch chi tit. so snh cu trc hai ontology u vo, chng ti to ra
mt mng li gm cc nt ca mi ontology. Nhng nt ny s c so snh vi mi nt ca ontology khc da trn
mng li ca chng. Li ny c m phng bng cch s dng mt dy cc danh sch lin kt (tc l mt dy hai
chiu). im quan trng ca phng php ny l s nt ln cn ca mt nt, cch thc chng lin quan vi nhau v vi
nt ny (Hnh 2).
Chng ti xem hai nt trong mt ontology l ln cn nhau nu chng c lin quan vi nhau thng qua quan h
is-a (lp con hoc lp cha), equivalent to hoc disjoint with hoc thng qua quan h v thuc tnh i tng. V
d, nu lp A l mt lp con ca lp A, th A v A l ln cn nhau. Mt v d khc, mt i tng c p thuc tnh,
cc nt min v vng ca n s l ln cn nhau (v d nt A c hai thuc tnh D v C, hai thuc tnh ny c quan h
vng min, nn D v C l ln cn nhau). Da trn gi nh ny, chng ti tnh ton ma trn ln cn cho mt ontology
bt k. Mi phn t ca ma trn ln cn l 1 hoc 0, n cho thy cc nt tng ng v hng v ct ca ma trn l c ln
cn hay khng (Hnh 3).
698
6 THU
UT TON MII V SO KHP ONTOLOGY
O

Nhp: ontology ngun, on


ntology
ch,
Ngng, lch

Ln cn ngun to ma trn ln cn (ontology ngun)


Bc 1

Ln cn ch to ma trn ln cn
n (ontology ch)

n to danh sch lin kt (Ln cn nggun )


Danhh sch lin kt ngu
Bc 2

Danh sch lin kt ch to danh sch lin kt (Ln cn ch )


Bc 3

i to to ma trn tng ng (Danh sch lin kt ngun, Danh sch lin kkt ch)
Ma trrn tng ng kh

n tng ng 1 C
Ma trn Ci thin 1 (Ma trn tng ng khi to, Ln cn ngun, Ln cn ch, Ng
ng, lch)
Bc 4

Ma trn tng ng 2 Ci thin 2 (Ma trn tng ng 1, Ln cn ngun, Lnn cn ch, Ngng,, lch)

n tng ng cui ccng Ci thin 3 (Ma trn tng n


Ma trn ng 2, Ln cn ngunn, Ln cn ch, Ng
ng, lch)

Xut: Ma
M trn tng ng cui cng

Hnh 1. S
thut ton tnh
t ton s t
ng ng v cuu trc.
T ontoology ngun hhoc ontologyy ch, chng ti tnh ton mng
m li caa mi nt bnng cch s dn
ng mt ma
trrn ln cn ca ontology . Mng li ca mi nt c
miu t bi
b mt dy cc danh sch llin kt. Mi hng trong
ma
m trn ln cnn ca mt onttology tng ng vi mt nt
n ca ontolo ogy . S 1 trrong mi hngg cho bit s nt
n ln cn
vi
v n. Trngg hp nu nt A trong ma trrn ln cn c n nt ln cn
n th dy cc ddanh sch lin kt tng ng
g ca n s
c
c n hng (theeo Hnh 2, ntt A c 3 nt ln cn l ntn B, nt C v v nt D, nn ddy cc danh sch lin kt tng ng
ca
c nt A s c 3 hng, hng 1 l ca ntt B, hng 2 l ca nt C v hng h 3 l ca nt D, xem HHnh 3b). Mi hng trong
dy
d cc danh sch s lin kt nny miu t cc ln cn ca nt .

nh 2. V d ontoology cho thy cch


Hn c thc so kh
hp s tng ng v cu trc
Ct uu tin ca dyy cc danh scch lin kt caa nt A cho bit s ln cnn ca nt ln ccn vi A (v d ct u
tin ca ma trn trong Hnh 3b gm 3 3 5 ngha l ntt B c 3 ln cn, nt C c 3 ln cn v nt D c 5 ln cn). c Ct 2
trr i cho bit s ln cn ca mi ln cnn tng ng vi ln cn hng
h i j = 1,2, ..., nn) ca dy cc danh sch
th j (vi
lin kt ca nt A (v d ctt 2 3 4 ca hng 1 tng n ng l 1 1 3 ngha l ln cnn th nht caa B l B1 v B1B ny c 1
ln cn vi n, ln cn th hhai ca B l B2 v B2 ny c c 1 ln cn vi n, cui cnng ln cn th ba ca B l A v A ny
c
c 3 ln cn v i n).
Trong bc
b th ba ca mc ny, cchng ti tnh ton ma trn khi to tnng ng v cuu trc gia cc ontology
u
vo bng cch
c s dng dy cc danh sch lin kt ca chng. Mi phn t caa ma trn khi to tng n ng vi mc

tng ngg v cu trc ggia mt nt cca ontology ngun


n v mt nt
n ca ontoloogy ch. Chng ti so snh h nt A ca
Hunh
H Nht Pht,, Hong Hu Hnnh, Phan Cng Viinh 699

ontology
o ngun vi tt c ccc nt A caa ontology cch vi num_o
of_neighbors(A
A) num_of__neighbors(A) 1. Vi
gii
g php thay th, chng ti c th so snnh tt c cc nt
n ca ontologgy ngun vi ttt c cc nt ca ontology ch.
so snh
s mt nt A ca ontologgy ngun c n ln cn vi mt nt A ca ontology ch, chng t i tnh ton
kh
k nng so snh c th xyy ra l n + 1. Kh nng u u tin, c g i l p0, c tnh ton daa trn s so s nh cc ln
c ca hai nt ny (ct uu tin ca mii dy nt). Kh nng pi (vi i = 1, 2, ..., nn) c tnh bng cch so s
cn nh cc ln
c ca nhngg ln cn ny. Mi hng i ca dy u tin (nt A)
cn c so snh vi tt c cc hng ca dy th h hai (nt
A)
A v theo hng so khpp tt nht c kkh nng cao nht
n c xem m l kh nngg pi. Trong th c t, pi cho biit so khp
c
c ln cn tt nht vi nt A
A l ln cn th i ca nt A. Khi tt c cc kh nngg c tnh toon, trung bnh h cng ca
k nng so snh n + 1 ny hhnh thnh s
kh tng ng vv cu trc caa hai nt.
Sau khii tt c cc phhn t ca maa trn tng ng
c tnh
h ton, ba thaao tc di y c s d
ng tnh
ton ci thin ma
m trn khi tto:
Nu hai
h nt t onttology ngun v ontology ch l tng ng, th s tng ng vv ln cn ca a chng s
c tng ln bi sai lch (biaas) c xc nh
trc.
Nu hai
h nt t ontoology ngun vv ontology ch c n nt ln cn c i snh, th s tng ng ca chng
i v
i sai lch c xc nhh trc s c tng ln bii n sai lcch.
Nu hai
h nt t ontoology ngun vv ontology ch c n thucc tnh vi kiuu d liu ph bin, th s t ng ng
ca chhng i vi sai lch
c xc nh trrc s c tng
t ln bi n sai lch..
Sau khii tin hnh mt s th nghim chng ti tm c gi trr thch hp chho yu t lch l 0.1. V vy chng
ti s dng gi tr ny trongg cc th nghiim ca chng g ti. Vic p dng gi tr nngng (thresshold) trn ma a trn khi
to tng nng v cu trcc, chng ti xc nh c cc nt ca ontology
o ngun so khp vvi cc nt ca ontology
ch.
Ni chunng, trong qu trnh to ma trn cui cn ng tng ng g v cu trc,, bn ma trnn tng ng khc nhau
c
to ra. Ma
M trn khi to tng ngg v cu trc l l ma trn u u tin v ba m
ma trn khc c to ra bng cch p
dng
d ba thao tc
t c xc nh trn tngg bc ci thin ma trn n khi to. MMa trn c to ra cui cn ng s c
xem
x l ma trnn cui cng t ng ng v ccu trc.

(a) (b)
Hnh 3 (a) Ma trn lnn cn, v (b) dy cc danh sch
h lin kt ca nt A ca ontoloogy
C.
C S kt hpp v t vng vv cc kt qu tng ng v
v cu trc
Trong mcm 3.1 v 3.2 cc chi titt ca phng php xutt cho vic tnhh ton nhng im tng ng
v t
vng
v v cu trrc gia cc thhc th ca onntology ngun v ontology ch
c bbn n.
Nh trnh by, phhng php xut a ra ba ma trn tng ng v tt vng ca ccc lp c t
tn, cc
th
huc tnh i tng v ccc thuc tnh d liu v ma trn tng ng v cu trc cho cc lpp c t tn
n. By gi,
chng
c ta tm hiu
h vic tip nnhn tt c ccc kt qu tn
ng ng ca cc ma trn ktt hp ny.
xc nh cc lp tng ng c t tn ca ontology ngun v ontollogy ch, chng ti ly trung bnh c
trrng s ca ma
m trn u tin tng ngg v t vng v ma trn tn
ng ng v cu trc (tc l ma trn tngg ng ca
cc
c lp c t tn) nh saau:
Named
dClasses_Similaarity _ _ _ (2)

trong NamedClassses_Similarityy l tt c s tng


t ng tro
ong s cc lpp c t tnn ca ontology
y ngun v
ontology
o ch,, Lexical_NC__Matrix l maa trn tng g ca cc lp c t tn, Structural_Matrix l ma
ng v t vng
trrn tng nng v cu trc, v l ccc trng s c gn tng ng
cho ma trrn v t vngg v ma trn v v cu trc.
Trong
T th nghiim ca chngg ti, nu hai oontology cho vi tng ng v t vng nhiu hnn tng ng vv cu trc
h h s v s c gi tr tng ng l 0.6 v 0.4, tr
th ng hp cn li th = = 0.6.
Ma trnn th hai v thh ba tng ng v t vnng tng ng vi cc thucc tnh i tnng v thuc tn nh d liu.
V mi cp thhuc tnh i tng hoc tthuc tnh d liu ca ontology ngun vv ontology ch, nu chng
Vi g c gi tr
ng ng v t vng bngg hoc ln hnn 0.50 th cc thao tc di y c s ddng ci thhin hai ma tr
t n ny:
700 THUT TON MI V SO KHP ONTOLOGY

Nu chng c cc min c i snh, s tng ng ca chng s c tng ln bi sai lch c xc


nh trc.
Nu chng c cc vng c i snh (hoc tng ng vi cc thuc tnh d liu), s tng ng ca
chng s c tng ln bi sai lch c xc nh trc.
Sau tc v ny, cc ma trn s phn nh mi s tng ng tng ng trong s cc thuc tnh i tng v cc
thuc tnh d liu. Chng ti da vo ma trn NamedClasses_Similarity quyt nh cc min v vng ca thuc tnh
i tng v thuc tnh d liu c i snh.
IV. V D MINH HA
V d minh ha trnh by trong phn ny lm r hn phng php so khp tng ng v cu trc vi thut
ton c xut. Hy xem xt ontology mu trong Hnh 2.
Trong Hnh 2 mi tn lin tc hin th cc mi quan h is-a, equivalent hay disjoint v cc mi tn khng
lin tc hin th cc quan h v thuc tnh i tng. Khi ontology c 10 nt, ma trn ln cn ca ontology ny s l
mt ma trn 10 10, xem Hnh 3(a). Vic s dng ma trn ln cn tnh ton, mt dy cc danh sch lin kt c
to ra i vi mi nt ch ra cc ln cn vi n v nhng ln cn ca cc ln cn v cc kt ni ca chng. Vi
ma trn ln cn trnh by trong Hnh 3(a), v d, hng u tin tng ng vi nt A ca ontology. N cho thy nt A c
ba nt ln cn B, C v D tng ng vi ba hng trong dy danh sch lin kt. Hnh 3(b) cho thy y l dy danh sch
lin kt c tnh ton i vi nt A. Ct u tin ca dy danh sch lin kt ny hm rng ba nt B, C v D tng
ng c s nt ln cn l 3 3 v 5. Hng u tin ca dy danh sch lin kt ny hm rng ln cn u tin ca nt A
l nt B, nt B ny c ba ln cn l nt B1, B2 v A, vi mi nt ny, c s nt ln cn tng ng l 1 1 v 3.
Tng t cho hng th hai v th ba ca dy danh sch lin kt ny.
By gi, nu em so snh cu trc nt A ca ontology ngun vi mt nt A ca ontology ch, chng ti cn
phi tnh ton bn kh nng so snh pi c th xy ra (vi i = 0, 1, 2, 3). tnh ton p0, ct u tin ca dy danh sch
lin kt ca nt A c so snh vi ct u tin ca dy danh sch lin kt ca nt A ca ontology ch. V c bn
y l s so snh v cu trc ca hai nt da trn cc ln cn ca chng. tnh pi (vi i = 1, 2, 3), hng th i (khng
bao gm phn t u tin) ca dy danh sch lin kt ca A c so snh vi tt c cc hng (khng bao gm cc phn
t u tin) ca dy danh sch lin kt ca nt A, hng so khp tt nht vi dy danh sch lin kt ca nt A s xc
nh gi tr ca pi. Cui cng, vi mc trung bnh ca bn kh nng so snh c tnh ton xc nh s tng ng
v cu trc gia nt A ca ontology ngun v nt A ca ontology ch.
S tng ng v cu trc trong s tt c cc nt ca ontology ngun v tt c cc nt ca ontology ch to
thnh ma trn khi to tng ng v cu trc. Sau khi tnh ton ma trn ny, ba thao tc c gii thiu trong mc 3.2
c p dng to ra ma trn cui cng tng ng v cu trc.
V. CC TH NGHIM V KT QU
Vic nh gi hiu sut ca cc thut ton so khp ontology i vi cc ontology th nghim v cc kt qu so
snh, ngi ta a ra hai o precision and recall c ngun gc t truy hi thng tin [2].
nh ngha 1 (i snh - Alignment)
Cho hai ontology O v O, mt i snh gia O v O l mt tp cc b tng ng (gm 4 phn t): <e, e, r, n>
vi e O v e O ca hai thc th so khp nhau, r l mi quan h rng buc gia e v e v n l tin cy [0..1]
tng ng.
Thut ton so khp tr v i snh A, n c so snh vi i snh tn ti R.
V d trong Hnh 4 trnh by hai ontology cng vi hai i snh A v R. V mc ch ca vic n gin ha,
mi quan h lun lun l = v tin cy lun lun l 1.0.
i snh A c xc nh nh sau:
<o1:Vehicle,o2:Thing,=,1.0>
<o1:Car,o2:Porsche,=,1.0>
<o1:hasSpeed,o2:hasProperty,=,1.0>
<o1:MotorKA1,o2:MarcsPorsche,=,1.0>
<o1:250kmh,o2:fast,=,1.0>
i snh tn ti (R) c xc nh nh sau:
<o1:Object,o2:Thing,=,1.0>
<o1:Car,o2:Automobile,=,1.0>
<o1:Speed,o2:Characteristic,=,1.0>
<o1:250kmh,o2:fast,=,1.0>
<o1:PorscheKA123,o2:MarcsPorsche,=,1.0>
Hunh
H Nht Pht,, Hong Hu Hnnh, Phan Cng Viinh 701

nh
ngha 2 (precision, reccall)
Vi i snnh tn ti R, o precisionn ca i snh
h A cho bi:
|R A|
P A,, R
|A|
v o reecall cho bi:
|R A|
R A, R
|R|

Hnh 4. Hai ontology


o i snh nhau
Chng ti nh gi hhiu sut thutt ton do chn
ng ti xut bng vic s dng h s c trng. V ccc kt qu
i
snh, chnng ti so snhh cc kt qu ca chng t i vi su thuut ton c trnh by tronng Ontology Alignment
Evaluation
E Innitiative 2008 (OAEI-08) v cng vi ML LMA + algoriithm. OAEI l mt hot ng thng nin cho cc
h
h thng so khhp ontology xc nh im mnh v im yu caa chng. y l mt sng kkin quc t phi p hp t
chc
c nh gi v qu trnh ppht trin ca cc h thng so khp ontology. Mc tiuu chnh ca n l so snh cc h thng
v
v cc thut toon trn c s tng ng v cho php btb k ai a ra kt lun v cc chin l c so khp t t nht. S
nh
gi ca cc
c t chc a ra b th nnghim chun c h thng viv cp ontoloogy i snh ccng nh cc kt qu d
kin.
k Cc ontoology ny c m t bi OWL-DL v c nh dng bng RDF F/XML. Cc i snh d kin
k a ra
nh
dng chuun c din t bng RDF F/XML. Chng g ti pht trin
t mt cngg c da trn thut ton c xut
v p dng n vo b th nghim chun OAEI-08. Ch
v hng ti s dnng cc s liuu thng k chuun v truy h
i thng tin
nh gi cc kt qu th nghim:

| |

| |
(3)
| |

| |

2

Chng ti phn loi ccc trng hp th nghim m vo nm nh
m: l cc nnhm sau #1001-104, #201-210, #221-
247,
2 o ca preccision, recall vv f-measure nhn c t mi nhm
#248-2666 v #301-3044. Cc gi tr trrung bnh v
bng
b cch s dng
d thut ton xut tth hin trong Bng 1.
Bng 1. H
Hiu sut trung bnh thut ton
n c xut trn
t b th nghiim chun OAE
EI-08

#101104 #2201210 #2
221247 #248266
# #301304 Trung bnh
Preecision 0.98 0.96 0.93 0.46 0.87 0.84
Reecall 0.95 0.92 0.88 0.41 0.84 0.80
F-mmeasure 0.96 0.93 0.90 0.43 0.85 0.81

Chng ti s dng ccng c Jena phn tch cc ontology ca b th nnghim chunn OAEI-08 v xt gi tr
ngng
n th = 0.70
0 cng nh
sai lch bbias = 0.1. Tro
ong trng hp p th nghim
m #101-104, d liu v t v
ng v cu
trrc ph hp, v
v vy chng ti nhn c kt qu t g trng hp tth nghim #2201-210 ontology ngun
t nht. Trong
v
v ontology ch c cu trc tng ngg v trong tr ng hp th nghim
n #221-2247 d liu vv t vng ph hp gia
ontology
o ngun v ontologyy ch. V vyy, chng ti c kt qu tt trong
t trng hhp th nghim ny. Vi trng
t hp
702 THUT TON MI V SO KHP ONTOLOGY

th nghim #248-266 d liu v t vng v cu trc khng y , chng ti thu c cc kt qu khng thch hp.
Trong trng hp th nghim #301-304 vi bn ontology, cng vi thut ton c xut, n cho ra kt qu tt.
Chng ti so snh nhiu cp v cch tip cn so khp ontology t hp ca chng ti vi mt s h thng nh
CIDER, DSSim, GeRoMe, MapPSO, SPIDER, TaxoMap, gm nhng ngi tham gia b th nghim chun OAEI-08
cng vi MLMA + algorithm [5]. Cc kt qu ca su thut ton trn b th nghim chun OAEI-08. Kt qu so snh
c trnh by trong Bng 2.
Bng 2. So snh cc gi tr trung bnh ca precision v recall bi phng php tip cn ca chng ti vi mt s h thng
tham gia vo t chc OAEI-08
Our
TaxoMap MapPSO GeRoMe SPIDER CIDER DSSim MLMA+ approach
System
test Prec Rec. Prec Re. Prec Rec. Prec Rec. Pre. Rec Prec Rec. Pre. Rec. Pre. Rec.

1xx 1.0 0.34 0.92 1.0 0.96 0.79 0.99 0.99 0.99 0.99 1.0 1.0 0.91 0.89 0.98 0.95
2xx 0.95 0.21 0.48 0.53 0.56 0.52 0.97 0.57 0.97 0.57 0.97 0.64 0.57 0.52 0.78 0.74
3xx 0.92 0.21 0.49 0.25 0.61 0.40 0.15 0.81 0.90 0.75 0.90 0.71 0.68 0.65 0.87 0.84
Average 0.91 0.22 0.51 0.54 0.60 0.58 0.81 0.63 0.97 0.62 0.97 0.67 0.69 0.65 0.86 0.83

F-measure 0.35 0.52 0.58 0.70 0.75 0.79 0.66 0.84

Cc th nghim chun trn h thng c phn thnh ba loi: 1xx, 2xx v 3xx. Bng 2 cho thy gi tr trung
bnh precision v recall ca tng loi, trung bnh tng (hay trung bnh iu ha) v f-measure ca ba loi ny. Xem
phng trnh (3).
Khi tham kho t cc kt qu c trnh by trong Bng 2, thut ton xut ca chng ti c o f-measure
tt hn cc h thng khc v hm rng n hiu qu hn cc h thng khc. Thut ton xut cng t c
o recall tt hn so vi cc h thng khc. Nhng, n c o precision thp hn cc h thng TaxoMap, CIDER v
DSSim. Tuy nhin, cc h thng ny gn nh t ti o recall v o f-measure khng thch hp trong tt c cc
thut ton. Trong thc t chng loi b o recall c c o precision tt hn.
VI. CNG C OMREASONER
A. Trnh by v h thng
So khp ontology tm kim s tng ng gia cc thc th lin quan n ng ngha ca cc ontology. N ng
mt vai tr quan trng trong nhiu min ng dng.
Cc phng php so khp ontology c xut: vic thc hin so khp c th s dng nhiu thut ton so
khp hoc cc cng c i snh, v cc tiu ch phn loi ch yu sau y c xem xt [11-13].
Nhiu phng php tp trung vo cc kha cnh c php thay th cho ng ngha. OMReasoner thc hin vic so
khp bi quy trnh s dng mt s t in bn ngoi v cc k thut suy din. Tuy nhin, phng php ny bao gm
chin lc ca vic phi hp (ch yu c php) nhiu cng c so khp (v d, cng c so khp EditDistance).
1. nh ngha v phn tch h thng
Qu trnh so khp c th c nh ngha l mt hm f.
A = f(O1, O2, A, p, r)
Trong O1 v O2 l mt cp ca cc ontology nh l u vo i snh, A l i snh u vo gia cc
ontology v A l i snh mi u ra gia cc ontology, p l mt tp cc thng s (v d, trng s w v ngng ) v
r l mt tp cc ngun ti nguyn.
Cc i snh biu th s tng ng gia hai thc th. Mt tng ng phi th hin hai thc th v mi quan h
gia chng. Cho hai ontology, mt s tng ng l b 5 phn t: <id, e1, e2, R, n>, trong :
id l mt nh danh duy nht ca s tng ng;
e1 v e2 l cc thc th ca ontology th nht v ontology th hai tng ng;
R l mt quan h (v d, tng ng (=), ln hn (>), nh hn (<), khng tng ng ()) gia e1 v e2. Theo
OAEI, quan h tng ng l ct li;
n l tin cy (thng trong khong [0 1]) vi s tng ng gia e1 v e2.
OMReasoner thc hin i snh ontology vi ba bc nh sau (Hnh 5):
1. Phn tch c php: chng ti c th thu c cc lp v cc thuc tnh ca cc ontology bng cch s dng
API ontology: Jena.
2. Kt hp gia cc cng c so khp ring l: tng ng v t c th c sinh ra bng cch s dng nhiu
thut ton so khp hoc cc cng c i snh, v d phng php tng ng v chui (tin t, hu t, chnh sa
khong cch) bng cch da trn chui, cc k thut da trn rng buc. Trong khi , mt s ng ngha tng ng c
Hunh
H Nht Pht,, Hong Hu Hnnh, Phan Cng Viinh 703

th
h thc hin bng
b cch s dng t inn bn ngoi nh
h WordNet. Sau , nhiuu kt qu so kkhp c ktt hp bng
cch
c s dng chin
c lc c th.
Khung ng dng s hh tr v vic kt hp cc cng c i sn
nh, to iu ki
kin thun li ccho cc cng c so khp
ring
r l.
3. Qu trnh suy diin: ng nghha tng ng g c th c suy din bbng cch s dng logic m t DL
(Description Logic),
L trong cc tng ng v t
c sinh ra b
c 2 c xeem l u vo..
Cui cng, chng tii nh gi ccc kt qu da vo cc i snh
s lin quann, v tnh tonn hai o: prrecision v
recall.
r
Vi OM MReasoner, khhung ng dnng rt linh hot i vi cc cng c so khhp ring l. H
Hin nay, nhi
u cng c
so
s khp ring l bao gm EdditDistance v WordNet (H
Hnh 5).

nh 5. So khp ontology trong OMReasoner


Hn

Minh ho v cc cng c so kh
Hnh 6. M p trong OMReaasoner
2.
2 Cc k thuut s dng c th
a)
a Ngng
Ngngg rt cn thitt i vi nhiuu cng c so khp
k (c bitt l c php) vv s tng ng. V d, kh
hong cch
chnh
c sa ca book v booklet l 3/77 (tc l, cc
o tin cy tng
t ng l 1-3/7=0.57). Nu ngng l 0.55, th
hai
h thc th l tng ng ((vi do tin cy 0.57); ng gc li nu nggng l 0.6, th chng khng tng n ng. V vy,
chng
c ti phi iu chnh cng c so khpp thng qua ngng.
b)
b Kt hp o tin cy
Mi cng c so khpp ring l c th to ra ccc o tin cy
y tng ng. Tt c cc o tin cy ny s c
chun
c ho tr
c khi kt hpp. OMReasoneer bao gm cc chin lc liinh hot sau y kt hpp cc kt qu i
snh:
Thutt ton tng h
p trng s (W
WeightSum)
tin cy c th c tng hp bbng thut ton tng ng g v trng s ((cng thc 1),, trong wk l trng s
cho
c mt cng c so khp k c th v simmk(e1, e2) l
tin cy ca s
s tng ngg.
sim e , e w sim e , e , trong w 1.0 (1)
Ph
ng php cc i (Max)
o tin
t cy cc i c chn trrong s n cng
g c so khp (cng
( thc 2).
sim(e1, e2) = max(siim1(e1, e2),
, simn(e1, e2
2)) (2)
c)
c So khp ngg ngha
OMReaasoner s dnng cc phngg php so khp
p ng ngha nh
n cng c soo khp WordN
Net v vic su
uy din bi
logic m t (D
DL - Descriptioon Logic).
704
7 THU
UT TON MII V SO KHP ONTOLOGY
O

WordNNet l mt c ss d liu in t v t vnng ting Anh, trong cc ngha khc nnhau (cc ngha c th l
mt
m t hay cm m t) ca ccc t c sp xp to thnhh cc b t
ng ngha. Cc quan h gia cc thc th ontology
c
tnh tonn vi cc thutt ng rng buuc v ngha ca WordNet. Cng c so kkhp ring l nny s dng mt
m t in
bn
b ngoi nh WordNet t c s t ng ng v ng
n ngha.
OMReaasoner s dnng logic m t DL c cun ng cp bi Jen
na. OMReasonner bao gm ccc lut suy diin v vic
so
s khp ontoloogy. Tuy nhin, kh nng ssuy din mt nhiu
n thi gian
n v ch gp m
mt phn nh cho cc kt qu.
q Trong
phin
p bn ny,, kh nng suyy din c b qua.
B.
B Kt qu c
a OMReason
ner theo tng pphng php
p thc hin
Trong phn
p ny, chng ti trnh by cc kt qu t c t OMReasoner
O vvi OAEI 20114. N thc hiin theo ba
phng
p php: Benchmark, Conference vv MultiFarm.. Cc th ngh him c tinn hnh trn mmt my tnh ang chy
Windows
W Servver 2008 R2 Standard vi b vi x l Inteel Core i5 chy y 2.8 Ghz v 16 GB RAMM.
1.
1 Phng phhp Benchmarkk
Vi ph ng php nyy, cc ontologgy c th cc chia thnh 3 loi (Bng 3)). Trong nhm m 1, thng tin t vng
c
thay i thay th cc nhn hoc nh danh v chng. S thaay i ny baoo gm vic thaay th cc nh n hoc cc
nh
danh vi cc tn khc ttheo mt quy c t tn c th, mt tn
n ngu nhin, mmt tn sai chhnh t hoc mt
m t nc
ngoi.
n Trong nhm
n 2 c ccc ontology thuu hp h thngg phn cp, m rng h thnng phn cp hhoc tt c uu khng c
phn
p cp. Tronng nhm 3 cc ontology c thch thc ln nht v i snh ontoology. y cc nhn c trn sao
cho
c php honn v ca cc tt c chiu ddi c th. Chng ti iu chnh c cng c bng cch ss dng ng ng T (wd:
ngng
n ca WordNet,
W ed: nngng ca EdditDitance) v kt hp chinn lc S, sau nhn cc cc kt qu tt hn (wd
= 0.95, ed = 0..9; S = Max). Cc kt qu t c t OM MReasoner theeo Benchmarkk c tm tt trong Bng 4. 4
Bng 3. Phn loi theo chu
un 2014

Bng 4. Cc kt qa t c theo Benchmark


B 20114

2 Phng phhp Conferencce


2.
Tp d liu tin cy bbao gm cc oontology thc t. Chng ti s dng chinn lc kt hpp thc thi cng
c c h
th
hng ca chng ti theo phhng php CConference. Cc kt qu tt c t OM MReasoner c tm tt trong Bng 5
(wd = 0.9, ed = 0.8; S = Maax).
Bn
ng 5. Kt qu t c theo Co
onference 2014

3.
3 Phng phhp MultiFarm
m
Phngg php MultiF
Farm bao gm m mt tp con ca tp d liu kt hp, c dch vi tm ngn ng khc nhau.
Vi
V phng php
p ny, cc ontology c th c chiaa thnh 2 loii. Trong nhm m 1 cc i snh ontology u ging
nhau.
n Trong nhhm 2 cc ii snh ontologgy u khc nh
hau.
Trc ht,
h chng tii s dng t in dch cc c ngn ng khc nhau saang ting Anhh. Sau , tin ng Anh
c
dch s a
vo cc cng c so kh p bng cch ss dng chin lc Max. Cuui cng chnng ti nhn c cc kt
qu.
q Chng tii iu chnh cng c ca chhng ti bng cch c s dng ngng v cc kt qu c tth hin th tro
ong Bng 6
(wd = 0.8, ed = 0.6; S = Maax), trong ccho thy cc
o F-Measu ures ca cc i snh ontology nhm 2 l r rng
km
k hn so v i cc i snnh ontology trrong nhm 1. Chng ti thy rng nhngg l do m OM MReasoner khng c
th
hit k tt so khp vi cc ontology kkhc l v chnng c vit bng
b cc ngn ng hon tonn khc nhau.
Bng 6. Cc kt qu
q i vi Mu
ultiFarm 2014

chn ngng ttt hn, chng ti so snh cc kt qu (Bng


( 7) trnn mt s ngng theo phng php
Conference.
C T nhin, chng ti vn s
Tuy ng php Maxx thc hinn cng c ca chng ti.
dng chin lc v ph
Hunh
H Nht Pht,, Hong Hu Hnnh, Phan Cng Viinh 705

T
T cc kt quu, chng ti thy rng khii ngng wd = 0.9, ed = 0.8,
0 cng c ca chng ti thc hin t t nht (F-
measure
m = 0.6647). V vy m
m chng ti s dng ngng wd = 0.9, ed = 0.8 theeo phng phhp Conferenc
ce. Vic s
dng
d phng php Conferrence, chng ti nhn c cc ng ng tt hn sso vi phnng php Benc chmark v
MultiFarm.
M
Bng 7. So snhh kt qu vi cc ngng khc nhau ca Confe
B ference 2014

C.
C Nhn xt chung
c
1.
1 Tho lun v
v cch thc ci thin h thng xut
Thc hin vic suy ddin da trn ccc tng ngg v t l rt kh
k khn, v vvy cc kt quu chnh xc c a ra
t
cc cng c so khp rinng l s nng ccao cc kt qu
u ca chng ti.
t Mt s cch ci thin cng c caa chng ti
c
lit k nhh sau:
a) p dng nhiu chiin lc linh hhot hn trong vic kt hp
p nhiu cng c so khp thhay v ch tn
ng hp da
trn trrng s.
b) Thmm mt s tin xx l (Hnh 66), chng hn nh
n loi b c tnh c th (v d, '-', '_') hoc tch c
c t ghp,
trc khi a vo ccc cng c i snh.
c) c nhn xt v thng tin v nhn ca ontology tnh ton, c bit kkhi tn ca khhi nim ny l
Ly cc v ngha.
d) Xem xt li vic s
dng gi tr nngng thch hp ti u ha chnh xc.
e) v khc trrong cng c ca chng ti l b qua th
Mt vn hng tin v cu trc bao gm ontology giai on
hin nay.
n V chngg ti s ci thin n trong t
ng lai.
2.
2 xut cc bin php m
mi
Chng ti thy rng O
OMReasoner c th ci tin
n rt nhiu. Mt s cch mi c xut nh sau:
a) Lm phong
p ph cc t in ng ngha v WorrdNet khng phi
p l mt t in chuyn nnghip, n kh
hng th c
c cc khi nim
m ng ngha toon din.
b) Tnh n s phn ccp cc khi niim ng nghaa thay v ch tnh n tt c cc khi nim
m v thuc tnh
h.
c) Tm cc
c t ng nggha theo ph
ng php kt hp.
h
d) Tm cc
c t in ngn ng khc nnhau cho MulttiFarm.
e) Ci thhin thut tonn ca mt s ccng c i snh.
f) Bao gm
g nhiu cnng c so khp khc nhau.

VII. KT LUN
Trong bi
b bo ny chhng ti trnhh by thut ton so khp onntology tm raa s tng nng trong s c c thc th
ca
c cc ontoloogy cho daa trn thng tiin v t vng v cu trc c a chng. Thuut ton ny thhc hin ba giai on:
t
vng, cu trc,
t v t hpp. i vi vic xc nh s tng ng v t vng giia cc thc tth, chng tii gii thiu
mt
m php o mi m v s tnng ng, tronng cc thn ng tin v t v
ng ca mi thhc th, chnng hn nh nhn hoc s
m t, c chuyn i v cho vo mtt ti t, sau chng cc s dng cho vic tm kim
miu m s tng ng ca cc
th
hc th. Tronng giai on u tin, chnng ti thu c ba ma trn tng
t ng vv t vng bnng cch so snh cc lp
c
t tn, cc
c thuc tnhh i tng v cc thuc tnnh d liu caa hai ontologyy. Trong giai on th hai, so snh
cu
c trc cc onntology, chngg ti to ra mt mng li cho mi nt trrong ontologyy ngun v ont ntology ch v sau so
snh
s chng v i nhau da trn mng li ca chng. Mi
M mng li ca mi nt c to ra bng cch s d ng cc ln
cn
c ca nt v cc ln ccn ca cc ln cn n ng thi c thh hin bi mmt mng hai chiu. Ma tr n khi to
t
ng ng v cu trc c tnh bng cch so snh cc mng ny. Sau khi to raa ma trn ny, n c ci thin bng
cch
c p dng ba c m t trongg phn III mc B. Cui cn
b thao tc ng, trong giai on th ba, cchng ti tnh ton gi tr
trrung bnh c trng
t s ca ccc kt qu v t vng v cu
c trc. Chn ng ti thc hin thut ton ca chng ti trn b
th
h nghim chhun ca OAE EI-08 v c cc kt qu kh quan. Ngoii ra chng ti so snh thut t ton ca chng ti vi
mt
m s h thnng tham giia vo t chcc OAEI-08 v nh trong Bng 2 cho thyy thut ton ca chng ti c c o f-
measre
m tt hn
n. Ngoi ra, chng ti trnnh by thm cc kt qu ca c h thng OMReasonerr cho vic i snh cc
ontology
o theo ba phng phhp: Benchmaark, Conferencce v MultiFaarm. Chin lc kt hp caa nhiu cng c c so khp
706 THUT TON MI V SO KHP ONTOLOGY

ring l v s suy din logic m t DL bao hm c trong cch tip cn ca chng ti. Cc kt qu t c chng ti
thy vn cha tha mn v s tip tc ci tin n trong tng lai.
VIII. TI LIU THAM KHO
[1] N. Arch-Int and P. Sophatsathit, A semantic information gathering approach for heterogeneous information
sources on WWW, Journal of Information Science 29 (2003) 357374.
[2] M. Ehrig and J. Euzenat. Relaxed precision and recall for ontology matching, K-Cap 2005 Workshop on
Integrating Ontologies2005 (Banff, Alberta, Canada) 2532.
[3] L. S. Xiao and R. Ellen, Automated schema mapping techniques: an exploratory study, Research Letters
Information Science4 (2003) 113136.
[4] W. Cohen, P. Ravikumar and S. Fienberg, A comparison of string metrics for matching names and records,
Proceedings of the Workshop on Data Cleaning and Object Consolidation at the International Conference on
Knowledge Discovery and Data Mining (KDD)(2003).
[5] A. Alasoud, V. Haarslev and N. Shiri, An empirical comparison of ontology matching techniques, Journal of
Information Science35(4) (2009) 379397.
[6] V. I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics
Doklady10 (1966) 707710.
[7] G. A. Miller, WordNet: A lexical database for english, Communications of the ACM38 (1995) 3941.
[8] P. Bouquet, L. Serafini and S. Zanobini, Peer-to-peer semantic coordination, Journal of Web Semantics 2(1)
(2004) 8197.
[9] G. Pirro, A semantic similarity metric combining features and intrinsic information content, Journal of Data and
Knowledge Engineering 68 (2009) 12891308.
[10] A. Maedche and S. Staab, Measuring similarity between ontologies, In Proceedings of the International
Conference on Knowledge Engineering and Knowledge Management(2002) 251263.
[11] Rahm, E. and Bernstein, P.: A survey of approaches to automatic schema matching. The VLDB Journal, ,10(4):
334--350(2001).
[12] Shvaiko, P. and Euzenat, J.: A survey of schema-based matching approaches. Journal on Data Semantics (JoDS)
IV, 146--171(2005).
[13] Kalfoglou, Y. and Schorlemmer, M.: Ontology mapping: the state of the art. The Knowledge Engineering
Review Journal, 18(1):1--31, (2003).

A NEW ALGORITHM FOR ONTOLOGY MATCHING


Huynh Nhut Phat, Hoang Huu Hanh, Phan Cong Vinh
ABSTRACT Ontology matching is an importance in ontology technology of the Semantic Web with a goal of finding alignments
among the entities of given ontologies. Ontology matching is a necessary step for establishing interoperation and knowledge sharing
among Semantic Web applications. In this study we present an algorithm and a tool developed based on this algorithm to find
correspondences among entities of input ontologies. The proposed algorithm uses a new lexical similarity measure and also utilizes
structural information of ontologies to determine their corresponding entities. The lexical similarity measure generates a bag of
words for each entity based on its label and description information. The structural approach creates a grid for each node in the
ontologies. The combination of lexical and structural approaches creates the similarity matrix between the source and target
ontologies. The proposed algorithm was tested on a well known benchmark and also compared to other algorithms presented in the
literature. Our experimental results show the proposed algorithm is effective and outperforms other algorithms.

You might also like