Computation of Semantic Similarity Among Cross Ontological Concepts For Biomedical Domain

JOURNAL OF COMPUTING, VOLUME 2, ISSUE 8, AUGUST 2010, ISSN 2151-9617
HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/
WWW.JOURNALOFCOMPUTING.ORG
K.Saruladha, Dr.G.Aghila, and A.Bhuvaneswary
Abstract— Based on Amos Tversky's psychological contrast model, this paper proposes a corpus-independent, information content based similarity computation method to assess similarity between biomedical concepts belonging to multiple ontologies. Ontologies have been widely used in many domains including database integration, bioinformatics, and the Semantic Web to facilitate the sharing of heterogeneous information. Semantic similarity techniques are becoming important components in most intelligent knowledge-based and semantic information retrieval (SIR) systems. This paper discusses the limitations of existing semantic similarity methods for computing similarity between concepts of a single ontology and concepts belonging to different ontologies. The proposed approach exploits the informativeness of concepts as a factor for computing the amount of specific and shared features between the concepts. Identifying the most specific common abstraction (MSCA) between concepts belonging to different ontologies is a challenge, and we propose a methodology to identify the MSCA by forming a virtual root which connects the root concepts of the considered ontologies. The proposed idea is tested using the MeSH and SNOMED-CT biomedical ontologies.
Index Terms — Biomedical domain, Information retrieval, Ontology, Similarity Methods, UMLS.
——————————  ——————————
1 INTRODUCTION
Assessing semantic similarity between concepts is a central issue in many research areas such as Linguistics, Cognitive Science, Biomedicine, and Artificial Intelligence. Semantic similarity techniques are becoming important components in most intelligent knowledge-based and semantic information retrieval (SIR) systems [1], [2]. With the growing access to heterogeneous and independent data repositories, the differences in the structure and semantics of the data stored in those repositories play a major role in information systems. Semantic similarity relates to computing the similarity between conceptually similar but not necessarily lexically similar terms. Typically, semantic similarity is computed by mapping terms to an ontology and by examining their relationships (hyponymy, hypernymy, meronymy and homonymy) in that ontology. Semantic similarity approaches fall under four different categories: ontology based approach, information content based approach, feature based approach and hybrid approach. The basic qualitative properties that a semantic similarity measure should consider are the commonality, difference, identity, symmetry and depth properties.

COMMONALITY PROPERTY - The similarity between A and B is related to their commonality. The more commonality they share, the more similar they are.

DIFFERENCE PROPERTY - The similarity between A and B is related to the differences between them. The fewer differences they have, the more similar they are.

IDENTITY PROPERTY - The maximum similarity between A and B is reached when A and B are identical, no matter how much commonality they share.

SYMMETRIC PROPERTY - The similarity between concepts (A, B) is equal to the similarity between the concepts (B, A).

DEPTH PROPERTY - The distance between A and B is represented by an edge between the concepts and is influenced by the depth of the location of that edge in the ontology.

This paper discusses the proposed method to compute semantic similarity among cross ontological concepts. Section 2 discusses the classification of various semantic similarity methods based on a single ontology. Section 3 discusses the classification of similarity methods based on cross ontology. Section 4 discusses the architectural design and algorithm of the similarity method for cross ontological concepts in the biomedical domain.

————————————————
• Mrs. K. Saruladha is with the Computer Science Department, Pondicherry Engineering College, Puducherry, Pin 605014, India.
• Dr. G. Aghila is with the Computer Science Department, Pondicherry University, Puducherry, Pin 605014, India.
• Ms. A. Bhuvaneswary is with the Computer Science Department, Pondicherry Engineering College, Puducherry, Pin 605014, India.

2 SEMANTIC SIMILARITY METHODS FOR SINGLE ONTOLOGY
Semantic similarity methods are broadly classified into single ontology similarity methods and cross ontological similarity methods. Various approaches can be used to find the similarity between two concepts in an ontology. Similarity methods for a single ontology can be broadly classified into four main approaches:
• Ontology Based Approaches
• Information Content (Corpus) Based Approaches
• Hybrid Based Approaches
• Feature Based Approaches

Fig. 1. Semantic Similarity Methods for Single Ontology

2.1 Ontology Based Approach
Ontology means "Specification of a Conceptualization". It is a description of the concepts and relationships that can exist for a domain. The ontology based approach [4] requires consistent and rich ontologies to assess semantic similarity between two concepts. The ontology based approaches are classified under two categories. The path length approach [5] computes similarity by counting the number of nodes/edges between two concepts in terms of the shortest path in the taxonomy. The depth relative approach [6], [7], [8] takes into account the depth of the taxonomy by calculating the depth from the root to the target concept. It adheres to the basic properties such as difference and identity.

2.2 Information Based Approach (Corpus)
Information theoretic approaches [9], [10], [11] usually employ the notion of Information Content (IC), which can be considered as a measure quantifying the amount of information a concept expresses in the taxonomy. The IC values are obtained by calculating the probability of occurrence of the words mapped to each concept in a given corpus. These probabilities are cumulative as we go up the taxonomy from specific concepts to more abstract ones. The IC value is obtained by considering the negative log likelihood of the probability of the concept in a given corpus and is given by

$$ IC(c) = -\log p(c) \tag{1} $$

where c is a concept in the considered ontology and p(c) is the probability of encountering c in a given corpus. The IC value of each concept decreases monotonically as we move from the leaves of the taxonomy to its root. The root node of the IS-A hierarchy has the maximum frequency count, since it includes the frequency counts of every other concept in the hierarchy. This approach adheres to the basic properties such as commonality, symmetry and difference.

2.3 Hybrid Based Approach
The hybrid approach [12], [13] combines different information sources such as the information content of the concept, depth and shortest path to assess the similarity or distance between concepts. This approach adheres to the basic properties such as commonality, symmetry and difference.

2.4 Feature Based Approach
In the feature based approach [3], [14], [16] the similarity considers the features that are common to two concepts and also the differentiating features specific to each. The similarity of a concept C1 to a concept C2 is a function of the features that are common to both C1 and C2, those in C1 but not in C2 and those in C2 but not in C1. According to Tversky [3] the similarity function is

$$ Sim_{tvr}(C_1, C_2) = \alpha \cdot F(\Psi(C_1) \cap \Psi(C_2)) - \beta \cdot F(\Psi(C_1) / \Psi(C_2)) - \gamma \cdot F(\Psi(C_2) / \Psi(C_1)) \tag{2} $$

where F is some function that reflects the salience of a set of features, and α, β and γ are parameters that allow for differences in focus on the different components. Ψ(C1) ∩ Ψ(C2) represents the set of features that the two concepts have in common. Ψ(C1) / Ψ(C2) and Ψ(C2) / Ψ(C1) represent the differentiating features specific to each concept. Similarity is not symmetric (i.e. Sim(C1, C2) ≠ Sim(C2, C1)). This approach adheres to the basic properties such as commonality and difference.

3 SEMANTIC SIMILARITY METHODS FOR CROSS ONTOLOGY
Semantic similarity among multiple ontologies is classified under two categories: 1) ontology based approach, 2) feature based approach.
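As a rough illustration of the corpus-based IC of Equation (1) in Section 2.2, the following Python sketch propagates concept counts up a small IS-A hierarchy and takes the negative log of the resulting probabilities. The taxonomy, the counts and the function names are invented for illustration only; they are assumptions and do not come from any corpus or ontology used in this paper.

```python
import math

# Hypothetical IS-A taxonomy: child -> parent (None marks the root).
PARENT = {
    "disease": None,
    "infection": "disease",
    "pyelonephritis": "infection",
    "anemia": "disease",
}

# Hypothetical corpus counts of direct mentions of each concept.
DIRECT_COUNTS = {"disease": 10, "infection": 20, "pyelonephritis": 5, "anemia": 15}


def cumulative_counts(parent, direct):
    """Add each concept's direct count to itself and to all of its ancestors."""
    totals = {concept: 0 for concept in parent}
    for concept, count in direct.items():
        node = concept
        while node is not None:
            totals[node] += count
            node = parent[node]
    return totals


def information_content(parent, direct):
    """Return IC(c) = -log p(c), with p(c) = cumulative count / root count.

    Assumes every concept has a positive cumulative count; a zero count
    would make the logarithm undefined.
    """
    totals = cumulative_counts(parent, direct)
    root_total = max(totals.values())  # the root accumulates every count
    return {c: -math.log(totals[c] / root_total) for c in totals}


ic = information_content(PARENT, DIRECT_COUNTS)
```

In this toy hierarchy the root has probability 1 and therefore zero information content, while the leaf `pyelonephritis` is the most informative concept, which matches the monotonic behaviour described in Section 2.2.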
Fig. 2. Semantic Similarity Methods for Cross Ontology.

3.1 Path Length Approach for Cross Ontology
The ontology based approach for multiple ontologies differs from the one used for a single ontology in that one ontology is considered as primary and the other as secondary. The semantic similarity between cross ontological concepts is measured by joining the two ontologies at a common node, which is considered as the bridge node. According to the method of Al-Mubaid et al. [2], the semantic similarity between concepts in a single ontology and in multiple ontologies is measured by an ontology-structure-based technique for the biomedical domain (MeSH). Al-Mubaid et al. have proposed that semantic similarity can be measured in three different cases: 1) similarity method for a single primary ontology, 2) similarity method for cross ontology and 3) similarity method within secondary ontologies.
The semantic similarity measure for cross ontology is based on three features:
• A common specificity of concepts in the ontology
• Cross modified path length between two concepts
• A local granularity of both ontologies.
For cross-ontology semantic similarity, the common specificity feature between two concepts C1 and C2 takes into account the depth of the least common subsumer (LCS) of the two concepts and the depth D of the ontology.

$$ CSpec(C_1, C_2) = D - Depth(LCS(C_1, C_2)) \tag{3} $$

The smaller the CSpec value, the more information the two concepts share. In this case, the two concepts belong to two different ontologies, one identified as the primary ontology and the other, with the smaller number of concepts, as the secondary ontology. Using the bridge node, the least common subsumer of the two concepts (C1, C2) is obtained by considering the LCS of the first node C1 in the primary ontology and the bridge node,

$$ LCS(C_1, C_2) = LCS(C_1, bridge_n) \tag{4} $$

Thus the path length is calculated by adding d1 = d(C1, bridge) and d2 = d(C2, bridge). In order to scale the path length and CSpec features in the secondary ontology to the primary ontology, the path rate is given by

$$ PathRate = \frac{2 D_1 - 1}{2 D_2 - 1} \tag{5} $$

where D1 and D2 represent the depths of the primary and secondary ontologies. According to the path feature scale of the primary ontology, the cross modified path length between the two concept nodes is calculated as given in (6).

$$ Path(C_1, C_2) = d_1 + PathRate \times d_2 - 1 \tag{6} $$

Since there may be many bridge nodes between two concepts, there can be more than one path length, i.e. {path_i}, and the semantic distance, SemDist, between two concept nodes is given as follows

$$ CSpec_i(C_1, C_2) = D_1 - Depth(LCS(C_1, Bridge_i)) \tag{7} $$

$$ SemDist(C_1, C_2) = \log((path_i - 1) \times CSpec_i) + K \tag{8} $$

3.2 Feature Based Approach for Cross Ontology
According to Rodriguez & Egenhofer, the semantic similarity among multiple ontologies is measured by considering three important features: 1) matching process, 2) semantic neighborhoods, 3) distinguishing features.
In [16], each concept is considered as an entity class. The similarity between entity classes is given as

$$ S(a^p, b^q) = W_w S_w(a^p, b^q) + W_u S_u(a^p, b^q) + W_n S_n(a^p, b^q) \tag{9} $$

where Ww, Wu, Wn are the respective weights of the similarity of each component, each with a value greater than 0. The functions Sw, Su, and Sn are the similarities between synonym sets, features, and semantic neighborhoods. The entity class a belongs to ontology p and b belongs to ontology q. Each of these similarities between entity classes is calculated as

$$ S(a, b) = \frac{|A \cap B|}{|A \cap B| + \alpha(a, b)\,|A / B| + (1 - \alpha(a, b))\,|B / A|} \tag{10} $$

where A and B are the description sets of a and b, and α is the function representing the depth of the ontology, with a value ranging from 0 to 1. The function α is given in (11), (12).
When depth(C1^O1) ≤ depth(C2^O2)

$$ \alpha(C_1, C_2) = \frac{depth(C_1^{O_1})}{depth(C_1^{O_1}) + depth(C_2^{O_2})} \tag{11} $$

When depth(C1^O1) > depth(C2^O2)

$$ \alpha(C_1, C_2) = 1 - \frac{depth(C_1^{O_1})}{depth(C_1^{O_1}) + depth(C_2^{O_2})} \tag{12} $$

Word matching (Sw) is determined by considering the set of common words and different words in the synonym sets that denote the entity classes [14]. Feature matching (Su) applies a matching process which classifies features into parts (Sp), functions (Sf), and attributes (Sa). The feature similarity using word matching is given by

$$ S_u(a^p, b^q) = W_p S_p(a^p, b^q) + W_f S_f(a^p, b^q) + W_a S_a(a^p, b^q) \tag{13} $$
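A minimal Python sketch of the ratio model in Equations (10) to (12), assuming each entity class is described by a plain set of feature strings and that the depth of each concept in its own ontology is already known. The feature sets, depths and function names below are hypothetical and serve only to show how the depth-based weight makes the measure asymmetric.

```python
def alpha(depth_a, depth_b):
    """Depth-based weight of Equations (11) and (12).

    Assumes at least one of the two depths is positive.
    """
    ratio = depth_a / (depth_a + depth_b)
    return ratio if depth_a <= depth_b else 1.0 - ratio


def ratio_similarity(features_a, features_b, depth_a, depth_b):
    """Equation (10): shared features over shared plus weighted distinct features."""
    common = len(features_a & features_b)
    only_a = len(features_a - features_b)
    only_b = len(features_b - features_a)
    a = alpha(depth_a, depth_b)
    denominator = common + a * only_a + (1.0 - a) * only_b
    return common / denominator if denominator else 0.0


# Hypothetical description sets for two entity classes from different ontologies.
kidney_disorder = {"organ", "kidney", "infection"}
pyelonephritis = {"infection", "urinary", "bacterial", "kidney"}

forward = ratio_similarity(kidney_disorder, pyelonephritis, depth_a=2, depth_b=4)
backward = ratio_similarity(pyelonephritis, kidney_disorder, depth_a=4, depth_b=2)
```

With these made-up inputs the two directions give different scores, because the shallower concept's distinguishing features receive the smaller weight in both calls.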
The weights Wp, Wf and Wa in (13) are non-negative. Semantic-neighborhood matching (Sn) compares entity classes a^p and b^q of ontologies p and q, respectively, with radius r. The semantic neighborhood similarity is given by

$$ S_n(a^p, b^q, r) = \frac{|a^p \cap_n b^q|}{|a^p \cap_n b^q| + \alpha(a^p, b^q)\,\delta(a^p, a^p \cap_n b^q, r) + (1 - \alpha(a^p, b^q))\,\delta(b^q, a^p \cap_n b^q, r)} \tag{14} $$

The intersection over semantic neighborhoods is approximated by the similarity of entity classes across neighborhoods, where S is the semantic similarity of entity classes; n and m are the numbers of entity classes in the corresponding semantic neighborhoods [16].

TABLE 1
LIMITATIONS OF EXISTING SIMILARITY METHODS

APPROACH         MEASURE                           LIMITATIONS
Ontology based   Rada et al. [4]                   Requires a consistent ontology.
Ontology based   Leacock and Chodorow [8]          Only for a specific information source.
IC-based         Resnik [9], Lin [10], J&C [11]    Considers only the most specific common abstraction, and IC depends on corpora. Time-consuming analysis of the corpus. Considers only IS-A relations.
Hybrid based     Li et al. [12]                    Requires parameters to be settled.
Hybrid based     OSS [13]                          Tuning is required.
Feature based    Pirró and Seco [14]               Considered the WordNet ontology and the MeSH ontology separately. Issues related to finding cross ontological similarity are not addressed.
Feature based    Rodriguez & Egenhofer method [16] Considers only hypernymy / holonymy relations among concepts.

4 THE PROPOSED SIMILARITY METHOD FOR CROSS ONTOLOGY
Pirró [14], [15] has mentioned in his work that his proposed method could be extended to compute semantic similarity between concepts belonging to different ontologies if the problem of finding the most specific common abstraction (MSCA) is solved. He has also mentioned that the method used by Rodriguez [16] may be adopted to find the MSCA. The main challenge of extending the P&S [14] metric to the computation of cross ontology similarity rests on the following two issues: 1) finding the most specific common abstraction between concepts Ci and Cj, where Ci belongs to ontology O1 and Cj belongs to ontology O2; 2) in the P&S [14], [15] metric the IC value is calculated for a single ontology, but when we extend the P&S [14], [15] metric to cross ontologies the IC value of a concept should be computed based on both the ontologies. The following principles were kept in mind for designing the new similarity measure.
• The proposed measure should be based on human psychological models, as all of the existing semantic similarity methods are evaluated against human judgments.
• The proposed method should be based on the information content method, because most of the IC based methods achieve the highest correlations against human judgments.
• The information content calculation should be corpus independent, as corpus dependent IC calculations are time-consuming and require tagged corpora.
• The depth property of the semantic similarity should not be ignored, as the deeper a concept is in a hierarchy, the more specific it is.
The proposed method computes semantic similarity among cross ontological biomedical concepts using feature and information content based methods. The feature matching approach uses common and different characteristics between concepts to compute semantic similarity. This work is motivated by the need for new tools that can improve the retrieval, integration and mapping of information. For this work we chose the UMLS framework [17], as it is populated with many biomedical ontologies. The proposed idea is to be tested using the MeSH [18] and SNOMED-CT [19] biomedical ontologies.

Fig. 3. Architecture for Computation of Semantic Similarity of Cross Ontological Concepts
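The corpus-independence principle can be illustrated with the intrinsic IC of Seco et al. [15], which this paper states as Equation (17): IC(c) = 1 - log(hypo(c) + 1) / log(max_con). The sketch below is a minimal Python version under the assumption that the taxonomy is a tree stored as a child-list dictionary; the toy taxonomy and the function names are hypothetical.

```python
import math

# Hypothetical toy taxonomy: concept -> list of direct children.
CHILDREN = {
    "disease": ["infection", "anemia"],
    "infection": ["pyelonephritis", "appendicitis"],
    "anemia": [],
    "pyelonephritis": [],
    "appendicitis": [],
}


def hyponym_count(children, concept):
    """Count all descendants (direct and indirect) of a concept.

    Assumes a tree; in a DAG with multiple parents, shared descendants
    would be counted more than once.
    """
    return sum(1 + hyponym_count(children, child) for child in children[concept])


def intrinsic_ic(children, concept):
    """Equation (17): IC(c) = 1 - log(hypo(c) + 1) / log(max_con)."""
    max_con = len(children)  # total number of concepts in the taxonomy
    hypo = hyponym_count(children, concept)
    return 1.0 - math.log(hypo + 1) / math.log(max_con)


ic_root = intrinsic_ic(CHILDREN, "disease")
ic_inner = intrinsic_ic(CHILDREN, "infection")
ic_leaf = intrinsic_ic(CHILDREN, "anemia")
```

No corpus is consulted here: the root, which subsumes every other concept, gets an IC of zero and every leaf gets an IC of one, so the informativeness of a concept depends only on the structure of the taxonomy.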

The existing approach for computation of semantic similarity among cross ontological concepts, proposed by Al-Mubaid et al. [2], is a path based approach, whereas our proposed method is an information content based approach. Based on the Tversky formula, the intersection of the features quantifies the amount of commonality that exists between the compared concepts. This quantification is conceived as IC(MSCA). Pirró [14] has calculated commonality using IC(MSCA) for a single ontology (MeSH). He has proposed the ratio based formulation of the Tversky [3] model, but he has not taken the depth property into consideration. We have considered the depth property (α) for computation of semantic similarity of concepts belonging to different ontologies (SNOMED-CT and MeSH).

4.1 Proposed Similarity Method
The similarity value can be calculated by assuming a virtual root that connects the subcategories of both biomedical ontologies of the concepts. This measure is computed based on information content, shared features and depth of the taxonomy. The similarity measure is defined by

$$ Sim(C_1, C_2) = \frac{IC(MSCA(c))}{IC(MSCA(c)) + \alpha(C_1, C_2) \cdot IC(C_1) + (1 - \alpha(C_1, C_2)) \cdot IC(C_2)} \tag{15} $$

where IC(MSCA(c)) is the information content of the most specific common abstraction of both the concepts. It can be calculated by considering a virtual root that connects the subcategories of the two different ontologies. From the virtual root, the synonym set of the hypernym concepts of the primary ontology is matched with the synonym set of the hypernym concepts of the secondary ontology using the word matching feature, and the MSCA can be calculated from the matched set.

$$ IC(MSCA(c)) = 1 - \frac{\log(\min(hypo(O_1(C_1), O_2(C_2))) + 1)}{\log(\mathrm{max\,min}_{con})} \tag{16} $$

where the function min(hypo(O1(C1), O2(C2))) represents the taxonomy which has the minimum hyponymy of the concept. The measure also considers the depth (α) in the hierarchy of both the concepts, which can be calculated by (11), (12). IC(C1) and IC(C2) are the specific information content values for each concept in their corresponding hierarchy. The IC value is defined [14] as

$$ IC(c) = 1 - \frac{\log(hypo(c) + 1)}{\log(max_{con})} \tag{17} $$

where the function hypo represents the number of hyponyms of a given concept c and max_con represents the total number of concepts in the considered taxonomy. If a concept has many hyponyms, then it has more of a chance of appearing in the taxonomy; hence it conveys less information content compared to the concepts that are leaves.

4.2 Refined Information Content Approaches
The semantic similarity measures such as Resnik's [9] measure, Lin's [10] measure and Jiang & Conrath's [11] measure use the information content (IC) value to compute similarity between two terms in the ontology. The main drawbacks of this approach are:
• It is corpus dependent
• Time-consuming analysis of the corpus
• The information content for cross ontology is not addressed.
Thus the refined formula for the information content based approach overcomes the drawbacks by considering the corpus independent information content for cross ontological concepts. Seco [15] has proposed a method to compute information content which is corpus independent. The information content based approaches have been refined for computing cross ontological concept similarity. All of these approaches calculate the information content value using Seco's [15] formula.

• Refined Resnik's Measure
Semantic similarity between concepts (C1, C2) belonging to two different ontologies (O1 and O2) is given by (18). Information content for the concept, IC(C), is given by (17). The refined Resnik [9] measure is given by

$$ Sim_{res}(C_1, C_2) = \max_{CS(O_1(C_1), O_2(C_2))} [IC(C)] \tag{18} $$

where max CS(O1(C1), O2(C2)) represents the ancestor concept which has the maximum information content among the two ontologies O1 and O2.

• Refined Jiang & Conrath's Measure
The refined semantic distance between any two concepts C1 and C2 belonging to two different ontologies is given by

$$ Sim_{J\&C}(C_1, C_2) = \frac{IC(C_1) + IC(C_2) - 2 \max_{CS(O_1(C_1), O_2(C_2))} (IC(C))}{2} \tag{19} $$

The information content value should consider the specific concepts belonging to ontology O1 and ontology O2 and also the IC value of the concept that maximally subsumes both the concepts.

• Refined Lin's Measure
The refined Lin similarity method for cross ontological concepts (C1, C2) is given by

$$ Sim_{Lin}(C_1, C_2) = 2 \times \frac{MSCA(O_1(C_1), O_2(C_2))}{IC(C_1) + IC(C_2)} \tag{20} $$

Lin's measure is therefore the ratio of the information shared in common (i.e. MSCA(O1(C1), O2(C2))) to the total amount of information possessed by two concepts in two different ontologies.

4.3 Proposed Algorithm
Let (O1, O2, ..., On) be the multiple ontologies available in the UMLS framework. Among the available ontologies, designate one ontology as primary and the other as secondary
based on the granularity of the concepts they possess, and then identify the concepts for which the semantic similarity is to be calculated. Let O1(Ci) and O2(Cj) be the concepts belonging to the corresponding ontologies and r1 and r2 be the root nodes of the selected ontologies. Create a virtual root (VR) which connects the root nodes r1 and r2. For our experiments we have considered the datasets (36 concept pairs) used by [2] and [14]. For the biomedical concepts of the datasets, XML files are generated using CliniClue and the Dragon toolkit. Each XML input file contains the hypernymy and hyponymy relations of each concept. It also contains the depth and synonym set of each concept. The created XML files of the biomedical concepts serve as input to the algorithm, and the semantic similarity is calculated.

SS_Score Algorithm (Cross Sim(XML file1, XML file2))
// SS_Score represents Semantic Similarity Score.

Step 1: Get the input XML file for the concepts from the repository.

Step 2: Compute ancestor list and corresponding hypo number for each concept C1 and C2 // hypo - number of hyponyms
    While concept C1 not found in ontology O1
        {Return (ancestor list (S1) found for the concept C1 and their corresponding hyponym numbers in the ontology O1)}
    While concept C2 not found in ontology O2
        {Return (ancestor list (S2) found for the concept C2 and their corresponding hyponym numbers in the ontology O2)}

Step 3: Compare (S1 and S2) until common ancestor is found
    If one or more common ancestors found, create list of common ancestors.
        mscalist = conceptlist [(c1,h1), (c2,h2), (c3,h3) .. (cn,hn)]
        go to Step 4
    else
        {Return ("There is no most specific common ancestor for the concept pair (C1,C2). The similarity value cannot be calculated")}

Step 4: Calculate Most Specific Common Abstraction of both concepts (MSCA(C1,C2))
    // From the mscalist, the concept which has the higher level in the taxonomy is considered as the MSCA(C1,C2), and the information content of the msca concept is calculated by
    $$ IC(MSCA(c)) = 1 - \frac{\log(\min(hypo(O_1(C_1), O_2(C_2))) + 1)}{\log(\mathrm{max\,min}_{con})} $$
    go to Step 5

Step 5: Calculate information content for the specific concepts, i.e. IC(C1) and IC(C2)
    $$ IC(c) = 1 - \frac{\log(hypo(c) + 1)}{\log(max_{con})} $$
    go to Step 6

Step 6: Calculate the depth of the concepts in both ontologies by (11), (12)
    go to Step 7

Step 7: Calculate semantic similarity between the concept pair (C1,C2) by
    $$ Sim(C_1, C_2) = \frac{IC(MSCA(c))}{IC(MSCA(c)) + \alpha(C_1, C_2) \cdot IC(C_1) + (1 - \alpha(C_1, C_2)) \cdot IC(C_2)} $$
    go to Step 8

Step 8: Calculate semantic similarity for cross ontology using the refined information content approaches (Resnik using (18), J&C using (19), Lin using (20)).

Step 9: Collect human judgments for the concept pairs for which the similarity rating is to be calculated.

Step 10: Check user integrity by a rating coefficient (i.e., Rc) defined as
    $$ R_c = \sum_{i=0}^{n} |C_i - avg_i| $$
    where n represents the number of concept pairs.
    Eliminate human judgments which are incorrect using Rc.

Step 11: Calculate correlation coefficient using the Pearson correlation coefficient.

Step 12: Compare the performance of the proposed approach.

Step 13: End.

4.4 Sample Computation among Biomedical Concepts

Fig. 4. Connecting two ontology fragments
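Steps 4 to 7 of the algorithm can be sketched in Python as follows. This is an illustrative reading of Equations (15) and (16) together with the depth weight of (11) and (12), not the authors' implementation. In particular, the parameter `max_con` stands in for the "max min con" term of Equation (16), which is assumed here to be the concept count of the taxonomy selected by the min rule; all numeric inputs are made up.

```python
import math


def ic_msca(hypo_o1, hypo_o2, max_con):
    """Equation (16): IC of the MSCA from the smaller of its two hyponym counts.

    max_con is an assumed reading of the 'max min con' term.
    """
    return 1.0 - math.log(min(hypo_o1, hypo_o2) + 1) / math.log(max_con)


def alpha(depth_c1, depth_c2):
    """Depth weight of Equations (11) and (12); assumes positive depths."""
    ratio = depth_c1 / (depth_c1 + depth_c2)
    return ratio if depth_c1 <= depth_c2 else 1.0 - ratio


def cross_similarity(ic_shared, ic_c1, ic_c2, depth_c1, depth_c2):
    """Equation (15): shared IC over shared IC plus depth-weighted specific IC."""
    a = alpha(depth_c1, depth_c2)
    denominator = ic_shared + a * ic_c1 + (1.0 - a) * ic_c2
    return ic_shared / denominator if denominator else 0.0


# Hypothetical inputs: the MSCA has 3 hyponyms in O1 and 7 in O2,
# and the taxonomy selected by the min rule holds 100 concepts.
shared = ic_msca(hypo_o1=3, hypo_o2=7, max_con=100)
score = cross_similarity(shared, ic_c1=0.9, ic_c2=1.0, depth_c1=3, depth_c2=5)

# If the only common ancestor carries no information (IC of zero),
# the similarity collapses to zero.
unrelated = cross_similarity(0.0, ic_c1=0.9, ic_c2=1.0, depth_c1=3, depth_c2=5)
```

Under these assumptions a pair whose only shared ancestor is uninformative scores zero, which is the kind of outcome reported for an unrelated pair such as Anemia and Appendicitis in Table 2.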
The similarity between concepts a4 and b3, which belong to two different ontologies, is measured by connecting the sub-roots (a1 and b1) of the concepts to the virtual root (VR). The common ancestors that exist between a4 and b3 among the different ontologies are a3 in O1 and b3 in O2. Thus MSCA(a4, b3) is calculated by choosing the ontology which has the minimum number of hyponyms of the MSCA concept, and the information content value can be calculated using (16). Information content for the specific concept is measured by using (17). The depth of the msca concept from the virtual root is calculated using the formulas (11) and (12). Thus the similarity value among cross ontological concepts is calculated using (15).

TABLE 2
SIMILARITY RATING FOR BIOMEDICAL CONCEPTS

Concept 1                 Concept 2              Similarity rating
Anemia                    Appendicitis           0
Antibiotics               Antibacterial agent    0.736
Urinary tract infection   Pyelonephritis         0.373
Migraine                  Headache               0.433

5 CONCLUSION
This paper has discussed the various semantic similarity approaches that could be used for finding similar concepts of a single ontology and concepts belonging to different ontologies. It also describes a new semantic similarity computation method between biomedical concepts belonging to multiple ontologies, based on corpus independent information content. Work is also under way to investigate how this measure influences retrieval effectiveness in information retrieval applications and to study the influence of relations in the computation of the semantic similarity score.

REFERENCES
[1] A. Budanitsky and G. Hirst, "Evaluating WordNet-based measures of semantic distance," Comput. Linguistics, vol. 32, no. 1, pp. 13–47, 2006.
[2] H. A. Nguyen and H. Al-Mubaid, "Measuring Semantic Similarity Between Biomedical Concepts Within Multiple Ontologies," IEEE Trans. on Systems, Man, and Cybernetics, vol. 39, no. 4, pp. 339–398, 2009.
[3] A. Tversky, "Features of similarity," Psychological Review, vol. 84, no. 2, pp. 327–352, 1977.
[4] R. Rada, H. Mili, M. Bicknell, E. Blettner, "Development and application of a metric on semantic nets," IEEE Trans. on Systems, Man, and Cybernetics, vol. 19, pp. 17–30, 1989.
[5] G. Hirst, D. St-Onge, "Lexical Chains as Representations of Context for the Detection and Correction of Malapropisms," chapter in WordNet: An Electronic Lexical Database, MIT Press, 1998.
[6] Z. Wu and M. Palmer, "Verb semantics and lexical selection," Proc. 32nd Ann. Meeting Assoc. Comput. Linguistics, pp. 133–138, 1994.
[7] Michael Sussna, "Word sense disambiguation for free-text indexing using a massive semantic network," Proc. Second International Conference on Information and Knowledge Management, pp. 67–74, 1993.
[8] Claudia Leacock and Martin Chodorow, "Combining local context and WordNet similarity for word sense identification," in Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database, pp. 265–283, 1998.
[9] P. Resnik, "Information content to evaluate semantic similarity in taxonomy," Proc. of IJCAI, pp. 448–453, 1995.
[10] D. Lin, "An information-theoretic definition of similarity," in Proc. of Conference on Machine Learning, pp. 296–304, 1998.
[11] J. Jiang, D. Conrath, "Semantic similarity based on corpus statistics and lexical taxonomy," Proc. of ROCLING X, 1997.
[12] Y. Li, D. McLean, Z. Bandar, J. O'Shea, K. Crockett, "Sentence similarity based on semantic nets and corpus statistics," IEEE Trans. on Knowledge and Data Engineering, vol. 18, no. 8, pp. 1138–1150, 2006.
[13] V. S. Zuber, B. Faltings, "OSS: A semantic similarity function based on hierarchical ontologies," Proc. of IJCAI, pp. 551–556, 2007.
[14] G. Pirró, N. Seco, "A new semantic similarity metric combining features and intrinsic information content," in ODBASE, pp. 1271–1288, 2009.
[15] N. Seco, T. Veale, J. Hayes, "An intrinsic information content metric for semantic similarity in WordNet," in Proc. of ECAI, pp. 1089–1090, 2004.
[16] M. Rodriguez, M. Egenhofer, "Determining semantic similarity among entity classes from different ontologies," IEEE Trans. on Knowledge and Data Engineering, vol. 15, no. 2, pp. 442–456, 2003.
[17] UMLS (2010). [Online]. Available: Http://www.nlm.nih.gov/research/umls/
[18] MeSH Browser (2010). Available: http://www.nlm.nih.gov/mesh/MBrowser.html
[19] SNOMED-CT (2010). Available: http://www.snomed.org/index.html
[20] Angelos Hliaoutakis, "Semantic Similarity Measure in MeSH Ontology and their application to Information Retrieval on Medline," 2005.
[21] Giuseppe Pirro and Jerome Euzenat, "A Feature and Information Theoretic Framework for Semantic Similarity and Relatedness," 2010.
K. Saruladha is working as Assistant Professor in Pondicherry Engineering College, India. She has a total of 20 years of teaching experience. She graduated from Pondicherry University, Puducherry, India. She is a member of the Indian Society of Technical Education, India. She has published nearly 20 research papers in distributed computing, information security and ontology based information retrieval. She is currently pursuing her Ph.D. in ontology based information retrieval systems.

Dr. G. Aghila is working as Professor in Pondicherry University, India, and has a total of 20 years of teaching experience. She graduated from Anna University, Chennai, India. She has published nearly 40 research papers in web crawlers and ontology based information retrieval. She is currently a supervisor guiding 8 Ph.D. scholars. She was in receipt of schrneiger award. She is an expert in ontology development. Her areas of interest include artificial intelligence, text mining and semantic web technologies.

A. Bhuvaneswary is a postgraduate student pursuing her M.Tech in distributed computing systems.
