Ontologies in Bioinformatics

Robert Stevens
Department of Computer Science
University of Manchester
Robert.stevens@cs.man.ac.uk

http://img.cs.man.ac.uk/stevens

Introduction
What is knowledge?
What is an ontology?
Relationships between the two communities
The last decade of bio-ontologies
The future


What is Knowledge?
Knowledge -- all information and an understanding to carry out tasks and to infer new information
Information -- data equipped with meaning
Data -- un-interpreted signals that reach our senses
[Figure: one example shown as data, information and knowledge; side labels read BIOLOGY, Conf and ISMB]
Data: Michael Ashburner, Professor, University of Cambridge, UK
Information: Name, Job, Institution, Country
Knowledge: man; academic, senior; ancient university, 5 rated; European; important figure in biology
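
The distinction between data and information in the example above can be sketched in a few lines of Python. The variable names are illustrative assumptions; the values and field labels are those shown on the slide.

```python
# Data: un-interpreted values, as they appear on the slide.
data = ("Michael Ashburner", "Professor", "University of Cambridge", "UK")

# Information: the same data equipped with meaning by naming each field.
fields = ("Name", "Job", "Institution", "Country")
information = dict(zip(fields, data))
```

Knowledge, in the slide's sense, would go further and let us infer new statements from such a record; that step is not modelled here.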

Things, Symbols & Concepts


Humans require words (or at least symbols) to communicate efficiently. The mapping of words to things is only indirectly possible. We do it by creating symbols that stand for things. The relation between symbols and things has been described in the form of the meaning triangle:
[Figure: the meaning triangle, relating the symbol "Jaguar", the concept, and the thing it stands for]
[Ogden, Richards, 1923]

Representing Knowledge
Language uses symbols and rules (natural language) to communicate knowledge
Need human intelligence to deal with pragmatics
NLP notoriously difficult
Need to capture knowledge in a computationally amenable manner
Ontology: a conceptual model
Ontology plus lexicon is a terminology
Primary aim of creating a shared understanding of a domain and the relationships within that domain
Common symbols for the things within a domain
Capturing domain knowledge with fidelity and precision

Sharing Info, Sharing Meaning

Metadata
Data describing the content and meaning of resources and services
But everyone must speak the same language

Terminologies
Shared and common vocabularies
For search engines, agents, curators, authors and users
But everyone must mean the same thing

Ontologies
Shared and common understanding of a domain
Essential for search, exchange and discovery

[Figure: five service providers]




What is an Ontology?
Concepts: units of thought
Classes and individuals: Protein, Gene, DNA, Hexokinase, glycolysis, ...
Terms: labels for concepts
Protein, Gene, ...
Relationships: semantic links between concepts
Is-a-kind, is-a, part-of, name-of, ...
Taxonomy: backbone of ontology
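
As a rough illustration of how concepts, terms and relationships fit together, the Python sketch below stores a few of the concepts named above with labels and typed links. The class `Concept`, the helper `is_a_ancestors`, the extra concept "Macromolecule" and the particular links chosen are all assumptions made for this example; they are not taken from any ontology cited here.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A unit of thought, with terms (labels) and typed links to other concepts."""
    name: str
    terms: list = field(default_factory=list)   # labels for the concept
    links: list = field(default_factory=list)   # (relationship, target concept name)


# Hypothetical miniature ontology; the links are illustrative only.
ontology = {
    "Macromolecule": Concept("Macromolecule", terms=["Macromolecule"]),
    "Protein": Concept("Protein", terms=["Protein"],
                       links=[("is-a", "Macromolecule")]),
    "DNA": Concept("DNA", terms=["DNA"],
                   links=[("is-a", "Macromolecule")]),
    "Hexokinase": Concept("Hexokinase", terms=["Hexokinase"],
                          links=[("is-a", "Protein")]),
    "Gene": Concept("Gene", terms=["Gene"],
                    links=[("part-of", "DNA")]),
}


def is_a_ancestors(name, ontology):
    """Follow only the is-a links upward: the taxonomy backbone."""
    found = []
    stack = [name]
    while stack:
        current = stack.pop()
        for relationship, target in ontology[current].links:
            if relationship == "is-a" and target not in found:
                found.append(target)
                stack.append(target)
    return found
```

Keeping the relationship type on every link lets the taxonomy (is-a) be walked separately from other links such as part-of.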


So What Counts as an Ontology?

[Figure: a spectrum from least to most formal; Deborah McGuinness, Stanford]
Catalog/ID
Terms/glossary
Thesauri
Informal is-a
Formal is-a
Formal instance
Frames (properties)
Value restrictions
Disjointness, inverse, part-of
General logical constraints

Examples placed along the spectrum: Gene Ontology, Mouse Anatomy, EcoCyc, PharmGKB, TAMBIS, Arom

The art of ranking things in genera and species is of no small importance and very much assists our judgment as well as our memory. You know how much it matters in botany, not to mention animals and other substances, or again moral and notional entities as some call them. Order largely depends on it, and many good authors write in such a way that their whole account could be divided and subdivided according to a procedure related to genera and species. This helps one not merely to retain things, but also to find them. And those who have laid out all sorts of notions under certain headings or categories have done something very useful.
Gottfried Wilhelm Leibniz, New Essays on Human Understanding


The Gene Ontology


Bio-Ontologies in the Past Decade


Explicit use of ontologies fairly recent
EcoCyc and RiboWeb using frame-based systems to create knowledge bases
An area in which the CS community can test their technology
Large, complex and dynamic
A knowledge-based discipline
The post-genomic era encourages the need for shared understanding
Cross-genome comparisons need structured, controlled vocabularies
Moved from a small niche to a much bigger niche
Biologists are building ontologies

Uses of Bio-Ontologies
Controlled vocabularies for annotation
Describing schema and the content of schema
Domain maps
Query mechanisms
Resolution of semantic heterogeneity
Text analysis


The Gene Ontology


Tutorial and the first Bio-Ontologies meeting at ISMB 1998 in Montreal
Fly, mouse and yeast get together to develop GO
First release some 3,500 terms covering Molecular Function, Biological Process and Cellular Component
Now some 15,000 terms and growing
Gene Ontology Consortium covers some 15 organism databases plus SWISS-PROT and others
Synonyms, abbreviations and associations to gene products: access to names, genes etc.
A common understanding across a community

GO DAG for heparin biosynthesis


GO:0003673 : Gene_Ontology (46199)
  GO:0008150 : biological_process (30188)
    GO:0008151 : cell growth and/or maintenance (20547)
      GO:0008152 : metabolism (14693)
        GO:0016051 : carbohydrate metabolism (267)
          GO:0006023 : aminoglycan metabolism (18)
            GO:0030203 : glycosaminoglycan metabolism
              GO:0030202 : heparin metabolism (3)
                GO:0030210 : heparin biosynthesis (3)
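
The path above can be held as a small parent map and walked upward. The sketch below is a minimal illustration in Python: the term identifiers and names are copied from the listing, while the assumption that each term is a child of the one listed before it, and the names `GO_PATH`, `parents` and `ancestors`, are ours.

```python
# Term IDs and names copied from the listing; the nesting (each term is a
# child of the term before it) is an assumption about the original layout.
GO_PATH = [
    ("GO:0003673", "Gene_Ontology"),
    ("GO:0008150", "biological_process"),
    ("GO:0008151", "cell growth and/or maintenance"),
    ("GO:0008152", "metabolism"),
    ("GO:0016051", "carbohydrate metabolism"),
    ("GO:0006023", "aminoglycan metabolism"),
    ("GO:0030203", "glycosaminoglycan metabolism"),
    ("GO:0030202", "heparin metabolism"),
    ("GO:0030210", "heparin biosynthesis"),
]

names = dict(GO_PATH)

# Child -> list of parents. A list is used because a term in a DAG may have
# more than one parent, even though this single path never needs it.
parents = {child[0]: [parent[0]] for parent, child in zip(GO_PATH, GO_PATH[1:])}
parents[GO_PATH[0][0]] = []  # the root has no parents


def ancestors(term_id):
    """Return every term reachable by following parent links upward."""
    seen = []
    queue = list(parents[term_id])
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(parents[current])
    return seen


for term_id in ancestors("GO:0030210"):
    print(term_id, names[term_id])
```

A walk of this kind is what allows a gene product annotated with a specific term, such as heparin biosynthesis, to be retrieved by a query on any of the more general terms above it.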


Open Bio-Ontologies (OBO)

GO, though large, is narrow
Sequence Ontology
Chemical Ontology
Promotes a common ontology format, tools and house-style
Micro-array community a further boost, avoiding mistakes of previous bioinformatics resources
Need ontologies for phenotype, tissues, anatomies, etc.


Two Communities
Computer Scientists
Building ontologies
KR
Reasoning

Biologists
Ontology content
Domain knowledge

[Figure: the two communities combine to give Better Ontologies]

What are We Saying?


[Figure: Man is-a Person; Woman is-a Person]

Are all instances of Man instances of Person?
Can an instance of Person be both an instance of Man and an instance of Woman?
Can there be any more kinds of Person?
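
One way to see why these questions matter is to encode only what the diagram states and check what follows. The Python sketch below is a hypothetical illustration: the names `is_a`, `disjoint_pairs`, `covering` and the three helper functions are assumptions, and the reading that anything not stated is left open is one common convention rather than the only possible one.

```python
# Only the two is-a links from the diagram are stated.
is_a = {"Man": "Person", "Woman": "Person"}

# Extra statements a richer ontology language would let us make explicitly.
# They start empty because the diagram does not assert them.
disjoint_pairs = set()   # e.g. {frozenset({"Man", "Woman"})}
covering = {}            # e.g. {"Person": {"Man", "Woman"}}


def subsumed_by(child, parent):
    """True if every instance of child must be an instance of parent."""
    current = child
    while current in is_a:
        current = is_a[current]
        if current == parent:
            return True
    return False


def may_overlap(a, b):
    """True unless the two classes are explicitly declared disjoint."""
    return frozenset({a, b}) not in disjoint_pairs


def may_have_other_kinds(parent):
    """True unless the listed subclasses are declared to cover the parent."""
    return parent not in covering


print(subsumed_by("Man", "Person"))    # question 1
print(may_overlap("Man", "Woman"))     # question 2
print(may_have_other_kinds("Person"))  # question 3
```

With the diagram alone, only the first question has a definite answer; the second and third stay open until disjointness or covering is stated.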

This Year's Meeting

A theme of text analysis and ontology
First time talks have matched theme
Ontologies and indexing
Integrating ontologies into NLP systems
Ontologies in information retrieval
Developing terminologies
GO in NLP
New ontologies
Semantic similarity


Opportunities
Ontologies to help text analysis
Text analysis to help build ontologies
Biology community steadily building a large number of large domain ontologies
CS community can help build computationally amenable ontologies
Vast quantities of domain knowledge in natural language forms in literature and databanks
Opportunities for language and ontology communities
