
Chapter 3: Ontology

3.1 Introduction
The word ontology comes from two Greek words: "Onto", which means existence, or
being real, and "Logia", which means science, or study. The word is used both in a
philosophical and non-philosophical context. In philosophy, ontology is the study of
what exists, in general. Examples of philosophical, ontological questions are: What are
the fundamental parts of the world? How are they related to each other? Are physical
parts more real than immaterial concepts?
▪ For example, are physical objects such as shoes more real than the concept of
walking? In terms of what exists, what is the relationship between shoes and
walking?
▪ Philosophers use the concept of ontology to discuss challenging questions to
build theories and models, and to better understand the ontological status of
the world.
An ontology is a formal description of knowledge as a set of concepts within a domain
and the relationships that hold between them. To enable such a description, we need
to formally specify components such as individuals (instances of objects), classes,
attributes and relations as well as restrictions, rules and axioms. As a result,
ontologies do not only introduce a sharable and reusable knowledge representation
but can also add new knowledge about the domain.
Moving from linguistics to Information Systems, Nirenburg and Raskin describe the role of
ontology as follows: “Ontological Semantics is a theory of meaning in Natural
Language and an approach to NLP (Natural Language Processing) which uses an
ontology as a central resource for extracting and representing meaning of natural
language texts, reasoning about knowledge derived from the texts as well as
generating natural language texts based on representations of their meaning”.
The ontology data model can be applied to a set of individual facts to create
a knowledge graph – a collection of entities, where the types and the relationships
between them are expressed by nodes and edges between these nodes. By describing
the structure of the knowledge in a domain, the ontology sets the stage for the
knowledge graph to capture the data in it.
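As a rough illustration of this idea, the following Python sketch stores a handful of facts as (subject, predicate, object) triples and reads them as nodes and labelled edges. The entity and relation names (Insulin, instance_of, has_function and so on) are invented for the example and are not taken from any particular ontology.

```python
# A tiny knowledge graph as a set of (subject, predicate, object) triples.
# All names here are hypothetical and chosen only for illustration.
triples = {
    ("Insulin", "instance_of", "Protein"),
    ("Protein", "is_a", "Macromolecule"),
    ("Insulin", "has_function", "Hormone"),
}

def nodes(graph):
    """Every subject or object that appears in a triple is a node."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}

def edges_from(graph, subject):
    """Outgoing labelled edges of one node, as sorted (predicate, object) pairs."""
    return sorted((p, o) for s, p, o in graph if s == subject)

print(sorted(nodes(triples)))
print(edges_from(triples, "Insulin"))
```

In this sketch the ontology's job would be to say which predicates and types are allowed; the triples themselves are the data that the graph captures.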
There are, of course, other methods that use formal specifications for knowledge
representation such as vocabularies, taxonomies, thesauri, topic maps and logical
models. However, unlike taxonomies or relational database schemas, for example,
ontologies express relationships and enable users to link multiple concepts to other
concepts in a variety of ways.
As one of the building blocks of Semantic Technology, ontologies are part of the W3C
standards stack for the Semantic Web. They provide users with the necessary structure to link
one piece of information to other pieces of information on the Web of Linked Data. Because
they are used to specify common modeling representations of data from distributed and
heterogeneous systems and databases, ontologies enable database interoperability, cross-
database search and smooth knowledge management.
The Artificial-Intelligence literature contains many definitions of an ontology; many of these
contradict one another. For the purposes of this guide an ontology is a formal explicit
description of concepts in a domain of discourse (classes (sometimes called concepts)),
properties of each concept describing various features and attributes of the concept (slots
(sometimes called roles or properties)), and restrictions on slots (facets (sometimes called
role restrictions)). An ontology together with a set of individual instances of classes
constitutes a knowledge base. In reality, there is a fine line where the ontology ends and the
knowledge base begins. Classes are the focus of most ontologies. Classes describe concepts
in the domain.
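The definition above can be sketched in Python as follows. This is a minimal illustration, not a real ontology language: the type names Slot, OntologyClass and Instance, the two facets shown, and the Protein example data are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    """A property of a class, together with facets that restrict its values."""
    name: str
    value_type: type          # facet: allowed type of each value
    max_cardinality: int = 1  # facet: maximum number of values

@dataclass
class OntologyClass:
    """A concept in the domain, described by its slots."""
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot: Slot) -> None:
        self.slots[slot.name] = slot

@dataclass
class Instance:
    """An individual of a class, holding a list of values per slot name."""
    of: OntologyClass
    values: dict

def violations(instance: Instance) -> list:
    """Return a message for every slot value that breaks a facet."""
    problems = []
    for slot_name, vals in instance.values.items():
        slot = instance.of.slots.get(slot_name)
        if slot is None:
            problems.append(f"unknown slot: {slot_name}")
            continue
        if len(vals) > slot.max_cardinality:
            problems.append(f"too many values for {slot_name}")
        if not all(isinstance(v, slot.value_type) for v in vals):
            problems.append(f"wrong value type for {slot_name}")
    return problems

# Hypothetical ontology: one class with two slots and their facets.
protein = OntologyClass("Protein")
protein.add_slot(Slot("accession_number", str))
protein.add_slot(Slot("length", int))

# The ontology together with individual instances forms a knowledge base.
knowledge_base = [
    Instance(protein, {"accession_number": ["P00001"], "length": [110]}),
    Instance(protein, {"accession_number": ["A", "B"], "length": ["long"]}),
]

for individual in knowledge_base:
    print(violations(individual))
```

The first instance respects both facets; the second breaks the cardinality facet on one slot and the type facet on the other.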

3.2 Main Components of Ontology – General / Computing


The main components of an ontology in general are concepts, relations, instances and
axioms.
A concept represents a set or class of entities or 'things' within a domain. Protein is a concept
within the domain of molecular biology. Concepts fall into two kinds:
1. Primitive concepts are those which only have necessary conditions (in terms of their
properties) for membership of the class. For example, a globular protein is a kind of
protein with a hydrophobic core, so all globular proteins must have a hydrophobic core,
but there could be other things that have a hydrophobic core that are not globular
proteins.
2. Defined concepts are those whose description is both necessary and sufficient for a
thing to be a member of the class. For example, eukaryotic cells are kinds of cells that
have a nucleus. Not only does every eukaryotic cell have a nucleus, but every
nucleus-containing cell is eukaryotic.
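The difference between the two kinds of concept can be made concrete with a small Python sketch. Things are modelled as plain dictionaries of properties; the property names and the three sample things are assumptions made for illustration only.

```python
# Things are modelled as dicts of properties; all names are hypothetical.

def may_be_globular_protein(thing):
    """Primitive concept: a hydrophobic core is necessary but not sufficient.
    False rules membership out; True only means 'not ruled out'."""
    return thing.get("has_hydrophobic_core", False)

def is_eukaryotic_cell(thing):
    """Defined concept: being a cell with a nucleus is necessary and sufficient,
    so the test decides membership outright."""
    return thing.get("kind") == "cell" and thing.get("has_nucleus", False)

micelle = {"kind": "aggregate", "has_hydrophobic_core": True}
yeast_cell = {"kind": "cell", "has_nucleus": True}
bacterium = {"kind": "cell", "has_nucleus": False}

print(may_be_globular_protein(micelle))  # passes the necessary condition only
print(is_eukaryotic_cell(yeast_cell))
print(is_eukaryotic_cell(bacterium))
```

The micelle passes the necessary condition without being a globular protein, which is exactly why a primitive concept cannot be used to classify things automatically, whereas a defined concept can.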
Relations describe the interactions between concepts or a concept's properties. Relations
also fall into two broad kinds:
1. Taxonomies that organise concepts into sub-concept/super-concept tree structures. The most
common forms of these are:
• Specialisation relationships, commonly known as the 'is a kind of'
relationship. For example, an Enzyme is a kind of Protein, which in turn is a
kind of Macromolecule.
• Partitive relationships describe concepts that are part of other concepts
- Protein has Component Modification Site.
2. Associative relationships that relate concepts across tree structures. Commonly
found examples include the following:
• Nominative relationships describe the names of concepts - Protein has
AccessionNumber (in the context of bioinformatics) and Gene has Name
GeneName.
• Locative relationships describe the location of one concept with respect to
another - Chromosome has Subcellular Location Nucleus.
• Associative relationships that represent, for example, the functions,
processes a concept has or is involved in, and other properties of the
concept - Protein has Function Receptor, Protein is Associated With
Process Transcription and Protein has Organism Classification Species.
• Many other types of relationships exist, such as 'causative' relationships.
Instances are the 'things' represented by a concept. For example, Atom is a concept and
'potassium' is an instance of that concept.
Finally, axioms are used to constrain values for classes or instances. In this sense the
properties of relations are kinds of axioms. Axioms also, however, include more general rules.
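The specialisation relationship from the Enzyme, Protein and Macromolecule example can be sketched in Python as below. The sketch assumes a strict tree, in which each concept has at most one direct super-concept, and the function names are invented for the example.

```python
# Direct 'is a kind of' links, taken from the example in the text.
# Assumes a tree: each concept has at most one direct super-concept.
IS_A = {
    "Enzyme": "Protein",
    "Protein": "Macromolecule",
}

def ancestors(concept):
    """Follow 'is a kind of' links upward; the relation is transitive."""
    found = []
    while concept in IS_A:
        concept = IS_A[concept]
        found.append(concept)
    return found

def is_kind_of(concept, candidate):
    """True if candidate is a direct or indirect super-concept of concept."""
    return candidate in ancestors(concept)

print(ancestors("Enzyme"))
print(is_kind_of("Enzyme", "Macromolecule"))
```

Transitivity here plays the role of a simple axiom: it is a general rule about the relation rather than a fact about any single concept.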

A computational ontology consists of a number of different components, such
as Classes, Individuals and Relations.
Concept : Concepts, also called Classes, Types or Universals, are a core component of most
ontologies. A Concept represents a group of different Individuals, that share common
characteristics, which may be more or less specific.
For example, (most) humans share certain characteristics, such as related DNA, a set of
specific body parts, the ability to speak a complex language. Likewise, all mammals share
these characteristics, except for the ability to speak.
Individual : Individuals, also known as instances or particulars, are the base unit of an
ontology; they are the things that the ontology describes or potentially could describe.
Individuals may model concrete objects such as people, machines or proteins; they may also
model more abstract objects such as this article, a person’s job or a function.
Individuals are a formal part of an ontology and are one way of describing the entities of
interest. Perhaps more common within bioinformatics is the development of ontologies
consisting only of Concepts which are then used to annotate data records directly.
Relation : Relations in an ontology describe the way in which individuals relate to each other.
Relations can normally be expressed directly between individuals (this article has author
Phillip Lord) or between Concepts (an article has author a person); in the latter case, this
describes a relationship between all individuals of the Concepts.
Although it depends on the ontology language, it is often possible to express different
categories of relationships between Concepts. Consider, for example, “person has father
person”. This is an existentially quantified relationship: every person has a
father, and that individual is also a person. This can be contrasted with “person is father
of person”, which is a universally quantified relationship. It is true that every individual who is
father of a person is themselves a person; however, it would be wrong to assert that every
person is the father of another.
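The two kinds of quantification can be checked over a small set of individuals, as in the Python sketch below. It uses the article and author example from the preceding paragraph rather than the father example, because a finite sample cannot record a father for every person. The function names some and only follow the usual description-logic reading of the two quantifiers, which is an assumption about the intended formalism; the individuals are invented.

```python
# Hypothetical individuals; names are for illustration only.
articles = {"article1", "article2"}
persons = {"phillip", "maria"}
has_author = {
    ("article1", "phillip"),
    ("article2", "maria"),
    ("article2", "acme_lab"),  # acme_lab is not a person
}

def some(members, relation, target):
    """Existential reading: each member has at least one related individual in target."""
    return all(
        any(s == m and o in target for s, o in relation)
        for m in members
    )

def only(members, relation, target):
    """Universal reading: every individual related to a member is in target.
    Vacuously true for members with no related individuals."""
    return all(o in target for s, o in relation if s in members)

print(some(articles, has_author, persons))  # does every article have a person as author?
print(only(articles, has_author, persons))  # are all authors of articles persons?
```

With this data the existential statement holds while the universal one does not, which shows that the two readings are independent claims.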

3.3 Ontology Types


Uschold and Gruninger (1996) have classified ontologies, based on their formality and
complexity, along a continuum into the following major categories:
1. Highly Informal: Ontologies that are expressed loosely in natural language.
2. Semi-Informal: Ontologies expressed in a restricted and structured form of natural
language.
3. Semi-Formal: Ontologies expressed in an artificial, formally defined language, like the
Ontolingua version of the Enterprise Ontology (Uschold, King, Moralee & Zorgios 1995).
4. Rigorously Formal: Ontologies with clearly defined terms, formal semantics, theorems and
proofs, like the TOVE Enterprise Ontology (Fox 1992).
Guarino (1997) proposes a classification of ontologies under three headings, as follows:
1. By the level of detail.
- Reference (off-line) ontologies.
- Shareable (on-line) ontologies.
2. By the level of dependence on a particular task or point of view.
- Top-level ontologies.
- Domain ontologies.
- Task ontologies.
- Application ontologies.
3. Representation ontologies.
Of Guarino's classification groups above, the second, based on the level of dependence
on a particular task or point of view, is of interest to the current research.

3.4 Ontology Architectures


The methodology for defining an ontology begins with identifying its scope and intended use.
The domain knowledge is then classified and its concepts identified. An ontology is defined
as a set of classes arranged in a hierarchy or taxonomy, where real-world concepts are
modeled as classes, their characteristics as attributes, and inter-object relationships as
relationships, properties or axioms.
Ontology Architecture proposed by Guarino :
Top level ontologies: Describe general concepts like time, space, matter, and event
that are independent of domain or a particular problem.
Domain Ontologies and Task Ontologies: Describe concepts pertaining to a specific
domain or task.
Application Ontologies: Describe concepts that depend upon both a domain and a
particular task, usually being specializations of both ontologies.

Figure : Ontology Architecture proposed by Guarino


Guarino proposes a bottom-up approach to design. He suggests first identifying the most
specialized concepts needed in the application ontology, then those of the domain ontology and task
ontology. Finally, Guarino recommends abstracting the generic concepts into the top-level
ontology. In essence, his suggested approach is valid in those cases where the ontology is to
be designed from scratch; it does not take into consideration previously existing
ontologies or knowledge bases.

3.5 Ontology Design Principles


Gruber (1993a, 1993b) formulated a set of criteria for the design of formal ontologies, mostly
for artificial intelligence purposes, that have since been widely accepted.
• Clarity: An ontology should effectively communicate its intended
meaning to its users.
• Coherence: An ontology should support inferences that are consistent with its
definitions.
• Extendibility: An ontology should be designed to anticipate the uses of the shared
vocabulary. One should be able to define new terms based on the existing
definitions.
• Minimal encoding bias: According to Gruber, the conceptualization should be
specified at the knowledge level without depending upon any particular symbol or
language encoding.
• Minimal ontological commitment: Finally, Gruber recommends that an ontology
should not unnecessarily restrict the domain being modeled, allowing users the freedom
to specialize and instantiate the ontology as required.
These criteria have become something like commandments for any ontology designer,
in AI and in Information Systems alike. Note that they place requirements
only on the ontology artifact that is to be designed and developed: they aim only to ensure that
the ontology is correct, cohesive and true.

3.6 Ontology Design Methodologies


There are several prominent ontology design and development methodologies, namely:
1. Uschold and Gruninger’s Skeletal Method.
2. Gruninger and Fox Method.
3. Noy and McGuinness’s 101 Ontology Design Methodology.
4. UPON.
5. Methontology.
1. Uschold and Gruninger’s Skeletal Method : Uschold & Gruninger (1996) provide guidelines
for ontology design based on their experience in designing the Enterprise Ontology
(Uschold et al. 1995), which may be summarized as follows:
Phase 1- Identify Purpose: Why the ontology is being built and what its intended use is, and
who the targeted users are.
Phase 2- Building the ontology: In this phase the actual design of the ontology is carried out,
following the steps below:
- Ontology capture: Uschold and Gruninger suggest the following:
(i) identify the key concepts and relationships in the domain of interest;
(ii) produce precise and unambiguous textual descriptions of the identified
concepts;
(iii) identify terminology to name the identified concepts and relationships; and
finally
(iv) reach a consensus on the concepts, relationships and their names.
- Coding: This step involves explicitly representing the knowledge/conceptualization
captured in the previous step using some chosen formal language. For this, Uschold
& Gruninger (1996) propose that the designer should commit to the basic terms
identified in the previous step, choose a formal representation language and
thereafter code the ontology.
- Integrating existing ontologies: Uschold and Gruninger propose the use of existing
ontologies in the ontology capture step, the coding step, or both.
Phase 3- Evaluation: Uschold and Gruninger agree that evaluation of the produced ontology is
vital and refer the reader to other related research done in the same domain.
Phase 4- Documentation: Similarly, on the issue of documentation for ontology, they refer to
the documentation facilities supported,
2. Gruninger and Fox Method : Gruninger & Fox (1995) propose a more formal design
approach than Uschold’s skeletal method. They used their methodology to design
more formal and extensive ontologies such as the TOVE ontology (Fox 1992). (TOVE is a set
of formal ontologies for different aspects of the business enterprise, such as the Resource
Ontology and the Time Ontology.)
3. Noy and McGuinness’s 101 Ontology Design Methodology: This approach reads more like a
user manual for designing an ontology specifically with the Protégé ontology editor.
In simple steps the authors illustrate the process of capturing the concepts, the slots and the role
restrictions. On analysis, however, their basic design methodology is similar to
the Gruninger-Fox and Uschold-Gruninger methods. Noy and
McGuinness propose a knowledge-engineering method for building ontologies. They
advocate an iterative process of refinement and propose three fundamental rules
to guide the ontology developer’s design decisions.
4. UPON : The fourth ontology design methodology which we review here is UPON (Unified
Process for ONtology building), proposed by A. Nicola, M. Missikoff et al. (2005). Their process
builds on the accepted Unified Process and uses UML. The design methodology closely
follows the Unified Process and has the following phases:
Inception Phase. Requirements are captured and the use cases are modeled.
Elaboration Phase. Requirements are analysed, and fundamental concepts are
identified and loosely captured.
Construction Phase. Based on the loosely identified concepts, a skeleton for the
ontology is designed. Successive iterations of the first three phases lead to
refinement and, ultimately, to a more stable version of the ontology.
Transition Phase. The ontology is subjected to rigorous testing, documented
and finally released for public use.
5. Methontology : The final design methodology that we review here is
METHONTOLOGY (Fernandez et al. 1997), which is used for building ontologies from scratch,
from other existing ontologies, or by a process of re-engineering. The methodologies
reviewed so far basically deal with designing ontologies from scratch, that is, where no previous
version, knowledge base or data model exists.
Phase 1: Planify: The designer should plan the entire development process, including the
tasks and the allocation of time and resources.
Phase 2: Specification: Just as one never starts a trip without knowing the destination
and purpose of the travel, the designer should never start the ontology design and
development process without establishing the purpose and scope of the ontology.
Phase 3: Acquire existing Knowledge: The authors also advocate the use of existing
knowledge bases and knowledge acquisition using techniques such as those proposed by
Uschold & Gruninger (1996). This phase is vital if the designer has to acquire ample
knowledge about a domain.
Phase 4: Conceptualize: Following knowledge acquisition, the designer needs to
conceptualize the knowledge using some conceptual knowledge modeling technique.
Phase 5: Formalize: The next step recommended can be quoted as “To transform the
conceptual model into a formal or semi-computable model, you need to formalize it
using frame-oriented or description logic representation systems.”
Phase 6: Integration: Ontologies are intended to be reused; therefore Gómez-Pérez
suggests integrating relevant existing ontologies wherever possible.
Phase 7: Machine Readable: To make the ontology ‘machine-readable’ we need to
select a formal, machine-processable implementation language.
Phase 8: Evaluation: Gómez-Pérez, Juristo & Pazos (1995) stress the need to
evaluate the designed ontology, so as to rule out any erroneous definitions and
discrepancies in the ontology.
Phase 9: Documentation: The authors stress that proper documentation is
vital, as in any software development project, not only for easy reuse and modification
but also for configuration management and change traceability.
Phase 10: Maintenance: Finally, they stress that an ontology, once designed and
developed, cannot be forgotten; it needs to be constantly maintained.

3.7 Ontologies for Better Data Management


Some of the major characteristics of ontologies are that they ensure a common
understanding of information and that they make explicit domain assumptions. As a result,
the interconnectedness and interoperability of the model make it invaluable for addressing
the challenges of accessing and querying data in large organizations. Also, by
improving metadata and provenance, and thus allowing organizations to make better sense
of their data, ontologies enhance data quality.
The OWL Standard and Ontology Modelling
In recent years, ontologies have increasingly been expressed in ontology languages
such as the Web Ontology Language (OWL). OWL is a Semantic Web language based on
computational logic, designed to represent rich and complex knowledge about things and the
relations between them. It also provides detailed, consistent and meaningful distinctions
between classes, properties and relationships.
By specifying both object classes and relationship properties as well as their hierarchical
order, OWL enriches ontology modeling in semantic graph databases, also known as RDF
triplestores. OWL, used together with an OWL reasoner in such triplestores, enables
consistency checks (to find any logical inconsistencies) and satisfiability checks (to
find whether there are classes that cannot have instances).
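To give a feel for what these two checks mean, here is a deliberately simplified Python sketch. It is not an OWL reasoner: it handles only subclass links and disjointness axioms, and the class names, the LipidEnzyme class and the function names are all assumptions made for the example.

```python
# Hypothetical mini-ontology: subclass links and one disjointness axiom.
SUBCLASS_OF = {
    "Enzyme": {"Protein"},
    "Protein": {"Macromolecule"},
    "Lipid": {"Macromolecule"},
    "LipidEnzyme": {"Enzyme", "Lipid"},  # deliberately problematic class
}
DISJOINT = {frozenset({"Protein", "Lipid"})}

def superclasses(cls):
    """All classes reachable via subclass links, including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen

def is_satisfiable(cls):
    """In this sketch a class is unsatisfiable if it falls under two disjoint classes."""
    supers = superclasses(cls)
    return not any(pair <= supers for pair in DISJOINT)

def is_consistent(assertions):
    """assertions maps each individual to its set of asserted classes.
    The data is inconsistent if any individual ends up in two disjoint classes."""
    for classes in assertions.values():
        inferred = set().union(*(superclasses(c) for c in classes))
        if any(pair <= inferred for pair in DISJOINT):
            return False
    return True

print(is_satisfiable("Enzyme"))
print(is_satisfiable("LipidEnzyme"))
print(is_consistent({"x1": {"Enzyme", "Lipid"}}))
```

An unsatisfiable class such as LipidEnzyme does not by itself make the ontology inconsistent; the inconsistency appears only once some individual is asserted to belong to it or to two disjoint classes.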
Also, OWL comes equipped with means for defining equivalence and difference between
instances, classes and properties. These relationships help users match concepts even if
various data sources describe these concepts somewhat differently. They also ensure the
disambiguation between different instances that share the same names or descriptions.
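The matching and disambiguation described here can be sketched with a small union-find structure in Python. The identifiers, the source prefixes and the function names are hypothetical; the sketch only mimics the effect of declaring individuals to be the same or different and does not use OWL syntax.

```python
# Hypothetical identifiers from three data sources.
same_as = [
    ("src1:Aspirin", "src2:acetylsalicylic_acid"),
    ("src2:acetylsalicylic_acid", "src3:ASA"),
]
different_from = [("src1:Aspirin", "src1:Paracetamol")]

parent = {}

def find(x):
    """Return the representative of x's equivalence group (union-find)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps chains short
        x = parent[x]
    return x

def union(a, b):
    """Merge the groups of a and b."""
    parent[find(a)] = find(b)

for a, b in same_as:
    union(a, b)

def same(a, b):
    """True if a and b have been matched, directly or through a chain."""
    return find(a) == find(b)

def conflicts():
    """Pairs declared different that were nevertheless merged as the same."""
    return [(a, b) for a, b in different_from if same(a, b)]

print(same("src1:Aspirin", "src3:ASA"))
print(conflicts())
```

Equivalence is treated as transitive, so records from the first and third source are matched even though no statement links them directly; the difference statements act as a guard against merging too much.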

3.8 The Benefits of Using Ontologies


One of the main features of ontologies is that, by having the essential relationships between
concepts built into them, they enable automated reasoning about data. Such reasoning is easy
to implement in semantic graph databases that use ontologies as their semantic schemata.
What’s more, ontologies function like a ‘brain’. They ‘work and reason’ with concepts and
relationships in ways that are close to the way humans perceive interlinked concepts. In
addition to the reasoning feature, ontologies provide more coherent and easy navigation as
users move from one concept to another in the ontology structure.
Another valuable feature is that ontologies are easy to extend as relationships and concept
matching are easy to add to existing ontologies. As a result, this model evolves with the
growth of data without impacting dependent processes and systems if something goes wrong
or needs to be changed.
Ontologies also provide the means to represent any data formats, including unstructured,
semi-structured or structured data, enabling smoother data integration, easier concept and
text mining, and data-driven analytics.

3.9 Limitations of Ontologies


While ontologies provide a rich set of tools for modeling data, their usability comes with
certain limitations. One such limitation is the set of available property constructs. For example, while
providing powerful class constructs, the most recent version of the Web Ontology Language,
OWL 2, has a somewhat limited set of property constructs.
Another limitation comes from the way OWL employs constraints. They serve to specify how
data should be structured and prevent adding data inconsistent with these constraints. This,
however, is not always beneficial. Often, data imported from a new source into the RDF
triplestore would be structurally inconsistent with the constraints set using OWL.
Consequently, this new data would have to be modified before being integrated with what is
already loaded in the triplestore.
A novel alternative to using ontologies to model data is using the Shapes Constraint Language
(SHACL) for validating RDF graphs against a set of constraints. A shape specifies metadata
about a type of resource – how it is used, how it should be used and how it must be used.
As such, similarly to OWL, SHACL can be applied to validate data. Unlike OWL, however,
SHACL can be applied to validate data that is already available in the triplestore.
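The idea of validating data that is already loaded against a shape can be sketched in plain Python as follows. This is not SHACL syntax and does not use a SHACL engine: the dictionary layout only loosely mirrors SHACL's target class and minimum/maximum count constraints, and the ex: names and the validate function are assumptions made for the example.

```python
# Data already loaded, as (subject, predicate, object) triples. Hypothetical names.
graph = [
    ("ex:p1", "rdf:type", "ex:Protein"),
    ("ex:p1", "ex:accessionNumber", "P12345"),
    ("ex:p2", "rdf:type", "ex:Protein"),
]

# A shape written as a plain dict, loosely mirroring SHACL's target class
# and min/max count ideas. This is NOT SHACL syntax.
protein_shape = {
    "target_class": "ex:Protein",
    "properties": {"ex:accessionNumber": {"min_count": 1, "max_count": 1}},
}

def validate(graph, shape):
    """Return (node, message) pairs for nodes that do not conform to the shape."""
    report = []
    targets = sorted(
        s for s, p, o in graph
        if p == "rdf:type" and o == shape["target_class"]
    )
    for node in targets:
        for prop, rule in shape["properties"].items():
            count = sum(1 for s, p, _ in graph if s == node and p == prop)
            if count < rule["min_count"]:
                report.append((node, f"too few values for {prop}"))
            elif count > rule["max_count"]:
                report.append((node, f"too many values for {prop}"))
    return report

print(validate(graph, protein_shape))
```

The data stays in the store unchanged; validation only produces a report of the nodes that do not conform, which can then be fixed or accepted case by case.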

3.10 Ontology Use Cases


Since ontologies define the terms used to describe and represent an area of knowledge, they
are used in many applications to capture relationships and boost knowledge management.
The adoption of ontologies supports early hypothesis testing in the pharmaceutical industry by
mapping explicitly identified relationships to a causality relation ontology. Ontologies also
enrich semantic web mining, the mining of health records for insights, fraud detection and semantic
publishing.
In a nutshell, ontologies are frameworks for representing shareable and reusable knowledge
across a domain. Their ability to describe relationships and their high interconnectedness
make them the basis for modeling high-quality, linked and coherent data.
