
Web Technologies

Semantic Web and Web 3.0

Prof. Beat Signer

Department of Computer Science


Vrije Universiteit Brussel

beatsigner.com

2 December 2005
The Semantic Web
I have a dream for the Web [in which computers] become capable of analyzing
all the data on the Web – the content, links, and transactions between
people and computers. A 'Semantic Web', which should make this possible,
has yet to emerge, but when it does, the day-to-day mechanisms of trade,
bureaucracy and our daily lives will be handled by machines talking to
machines. The 'intelligent agents' people have touted for ages will
finally materialize.
Weaving the Web - The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor,
Tim Berners-Lee, Harper San Francisco, September 1999

December 2, 2021 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 2


The Semantic Web ...
The Semantic Web is a vision: the idea of having data on
the Web defined and linked in a way that it can be used by
machines not just for display purposes, but for automation,
integration and reuse of data across various
applications. Metadata provides a means to make
statements and create machine-readable statements.
W3C, 2003



The Semantic Web ...
▪ Meaning of data on the Web can not only be inferred by
people but also discovered by machines without (or with
less) human intervention
▪ Web of Data instead of Web of Documents
▪ the Web as a huge decentralised database (knowledge base)
▪ machine-accessible data
▪ data may be interconnected similar to today's webpages
▪ machine-readable metadata for existing web content
▪ combination of data from different sources to derive new facts
▪ machines (agents) may use logical reasoning to infer facts that
are not explicitly recorded
▪ Crucial component of Web 3.0 or Giant Global Graph
Video: The Future Internet



Semantic Web Stack
▪ The Semantic Web Stack (or Semantic Web Cake) describes the architecture
of the Semantic Web
▪ URI/IRI
- unique identification of semantic web resources
▪ Unicode
- representing/manipulating text in different languages
▪ XML
- interchange of structured data over the Web
[Figure: Semantic Web Stack, from bottom to top: Identifiers (URI/IRI) and Character set (UNICODE); Syntax (XML and XML Namespaces); Data interchange (RDF); Taxonomies (RDFS); Querying (SPARQL), Ontologies (OWL) and Rules (RIF/SWRL); Unifying Logic; Proof; Trust; User interface and applications; Cryptography alongside the layers. Based on [http://en.wikipedia.org/wiki/File:Semantic-web-stack.png]]



Semantic Web Stack ...
▪ XML Namespaces
- uniquely qualify markup from multiple sources (integration)
▪ Resource Description Framework (RDF)
- define RDF triples and represent resource information in a graph structure
▪ RDF Schema (RDFS)
- create hierarchies of classes and properties
[Figure: Semantic Web Stack, based on [http://en.wikipedia.org/wiki/File:Semantic-web-stack.png]]



Semantic Web Stack ...
▪ Web Ontology Language (OWL)
- language to define vocabularies
- extends RDFS with more advanced features (e.g. cardinality)
- enables reasoning based on description logic
▪ SPARQL
- query language to query any RDF-based data
▪ Rule Interchange Format (RIF) and Semantic Web Rule Language (SWRL)
- describe relations that cannot be described in OWL
[Figure: Semantic Web Stack, based on [http://en.wikipedia.org/wiki/File:Semantic-web-stack.png]]



Semantic Web Stack ...
▪ Unifying Logic
- logical reasoning (infer new facts and check consistency)
▪ Proof
- explain logical reasoning steps
▪ Cryptography
- protect RDF data via encryption
- validate the source of facts by digitally signing RDF data
▪ Trust
- authentication of sources and trustworthiness of derived facts
▪ User Interface
- user interfaces for semantic web applications
[Figure: Semantic Web Stack, based on [http://en.wikipedia.org/wiki/File:Semantic-web-stack.png]]
Resource Description Framework
▪ The Resource Description Framework (RDF) has
been designed to describe
▪ data and metadata about specific subjects
▪ structure of data sets
▪ relationships between bits of data
▪ An RDF statement (triple) consists of three parts
▪ subject
▪ predicate (property)
▪ object (value)
{person-1, name, "Niklaus Wirth"}

subject predicate object



Resource Description Framework ...
▪ Subjects, predicates and objects are all resources
▪ subject is either a URI reference or a blank node
▪ predicate is a URI reference defining the relationship
▪ object is either a URI reference, a literal or a blank node
▪ RDF data is often stored in relational databases or
so-called triplestores such as Apache Jena (TDB)
▪ up to billions of triples
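The positional rules for subjects, predicates and objects listed above can be sketched in a few lines of Python. The classes `URIRef`, `Literal` and `BlankNode`, the helper `make_triple` and the `http://example.org/` identifiers are hypothetical names chosen for illustration; they are not the API of a particular RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


def make_triple(subject, predicate, obj):
    """Return (subject, predicate, object) after checking the RDF position rules."""
    if not isinstance(subject, (URIRef, BlankNode)):
        raise TypeError("subject must be a URI reference or a blank node")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URI reference")
    if not isinstance(obj, (URIRef, Literal, BlankNode)):
        raise TypeError("object must be a URI reference, a literal or a blank node")
    return (subject, predicate, obj)


# Hypothetical identifiers modelled on the {person-1, name, "Niklaus Wirth"} triple
person = URIRef("http://example.org/person-1")
name = URIRef("http://example.org/name")
triple = make_triple(person, name, Literal("Niklaus Wirth"))
```

A literal in the subject position or a blank node in the predicate position is rejected with a `TypeError`.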



RDF Graph
▪ A set of RDF statements can be represented as a
directed labelled graph
▪ note that in RDF we can only define statements about specific
instances but not about generic concepts
- RDFS/ontologies have to be used to define statements about generic concepts

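[Figure: RDF graph with the following two statements]
{https://wise.vub.ac.be/beat-signer, w:hasGivenName, "Beat"}
{https://wise.vub.ac.be/beat-signer, w:hasFamilyName, "Signer"}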



RDF Graph ...
[Figure: RDF graph connecting http://wise.vub.ac.be, https://wise.vub.ac.be/beat-signer and https://wise.vub.ac.be/lode-hoste via w:hasDirector, w:isMember and w:isColleague edges. Both persons have w:hasGivenName and w:hasFamilyName literals (Beat Signer, Lode Hoste). A w:hasOffice edge leads to an unlabelled office node with w:room 10F733 and w:phone 026293306.]
▪ Anonymous resources have no explicit identifier
▪ in the example, the "office" is an anonymous resource
▪ anonymous resources are also called blank nodes or bnodes
▪ blank nodes can only be used as subjects or objects
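A blank node can be pictured as a node that carries a graph-local label instead of a URI. The following sketch assumes plain Python tuples for triples and the `_:` label prefix known from N-Triples and Turtle; the helpers `objects` and `is_blank` are hypothetical. The room and phone values come from the figure, while attaching the office to lode-hoste is an assumption based on the layout of the figure.

```python
# Triples as (subject, predicate, object) tuples; "_:office1" is a blank node
# label that is only meaningful inside this graph.
LODE = "https://wise.vub.ac.be/lode-hoste"

graph = {
    (LODE, "w:hasGivenName", "Lode"),
    (LODE, "w:hasFamilyName", "Hoste"),
    (LODE, "w:hasOffice", "_:office1"),
    ("_:office1", "w:room", "10F733"),
    ("_:office1", "w:phone", "026293306"),
}


def objects(graph, subject, predicate):
    """Return all objects of the triples matching the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def is_blank(node):
    """Blank nodes carry a local '_:' label instead of a URI."""
    return node.startswith("_:")


# Follow the path lode-hoste -> office (blank node) -> room
office = objects(graph, LODE, "w:hasOffice")[0]
room = objects(graph, office, "w:room")
```

The blank node only appears in subject and object positions, never as a predicate.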



RDF Reification
[Figure: RDF graph in which the isColleague statement between https://wise.vub.ac.be/beat-signer and https://wise.vub.ac.be/lode-hoste is reified. A statement node (rdf:type rdf:Statement) refers to beat-signer via rdf:subject, to lode-hoste via rdf:object and to isColleague as its predicate, and carries the additional property w:forYears with the value 1. The w:hasDirector, w:isMember, w:hasGivenName and w:hasFamilyName edges are unchanged.]
▪ An RDF triple is not a resource and can therefore not
become subject of another statement
▪ we have to reify the original statement
- make a resource out of the statement
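Reification can be sketched with the standard RDF reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`). The helper `reify`, the blank node label `_:stmt1` and the tuple representation are assumptions made for this illustration; the `w:forYears` value follows the figure.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BEAT = "https://wise.vub.ac.be/beat-signer"
LODE = "https://wise.vub.ac.be/lode-hoste"


def reify(statement_id, triple):
    """Describe a triple with the RDF reification vocabulary."""
    subject, predicate, obj = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


original = (BEAT, "w:isColleague", LODE)
graph = reify("_:stmt1", original)
# The reified statement is a resource and can now be the subject of another statement.
graph.append(("_:stmt1", "w:forYears", "1"))
```

One original triple becomes four describing triples, which is why reification is usually applied only where statements about statements are really needed.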



Advantages of RDF
▪ Simple
▪ Enables the combination (merging) of data from
different data models
▪ not easily possible in a relational database (different schemas)
▪ The same resource can be annotated by different people
▪ resource referenced by URI
▪ separation of data and metadata
▪ Well-defined standard
▪ many tools available
- triplestores, parsers, editors, frameworks, ...



RDF Schema (RDFS)
▪ Vocabulary description language for RDF
▪ domain vocabulary and structure
▪ Define common concepts and relationships
▪ classes (rdfs:Class) and subclasses (rdfs:subClassOf)
▪ properties and sub-properties (rdfs:subPropertyOf)
▪ domain (rdfs:domain) and range (rdfs:range) of a property
▪ rdfs:seeAlso, rdfs:isDefinedBy (utility properties)
▪ rdfs:label, rdfs:comment
▪ ...
▪ Provides the basic elements for the definition of
ontologies
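The kind of type-hierarchy reasoning that rdfs:subClassOf enables can be sketched as a small fixed-point computation. The function `infer_types`, the prefixed class names and the extra class `w:Agent` are hypothetical; a real RDFS reasoner covers more entailment rules (for example rdfs:domain and rdfs:range).

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Add rdf:type triples implied by rdfs:subClassOf until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = {(s, o) for s, p, o in inferred if p == SUBCLASS_OF}
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


BEAT = "https://wise.vub.ac.be/beat-signer"
triples = {
    ("w:Researcher", SUBCLASS_OF, "w:Person"),
    ("w:Person", SUBCLASS_OF, "w:Agent"),  # w:Agent is a hypothetical extra class
    (BEAT, RDF_TYPE, "w:Researcher"),
}
closure = infer_types(triples)
```

Only the explicit type `w:Researcher` is stored; the types `w:Person` and `w:Agent` are derived.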



RDF Schema Example
[Figure: RDF Schema example graph]
▪ Person is an rdfs:Class (rdf:type) and Researcher is an rdfs:subClassOf Person
▪ isColleague is an rdf:Property (rdf:type) with an rdfs:domain and an rdfs:range
▪ https://wise.vub.ac.be/beat-signer and https://wise.vub.ac.be/lode-hoste are of rdf:type Researcher and connected via w:isColleague
▪ the w:hasGivenName and w:hasFamilyName values (Beat, Signer, Lode, Hoste) are of rdf:type rdfs:Literal



Advantages of RDFS
▪ With RDFS we have a richer expressiveness
(e.g. subClassOf) than with RDF
▪ Simple reasoning (e.g. type hierarchy)
▪ Many existing tools to deal with RDFS
▪ However, some things cannot be expressed; for example
▪ "a person must have a family name"
▪ "a person can have at most one family name" (cardinality)
▪ "if Beat is a colleague of Lode then Lode is a colleague of Beat"
(symmetry)

→ these issues are addressed by the Web Ontology Language (OWL)
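To make the symmetry and cardinality examples concrete, the sketch below shows what a reasoner could do once such constraints are declared. The functions `apply_symmetry` and `violates_max_cardinality` and the short `w:` names are hypothetical, and the cardinality check is a closed-world simplification of what an OWL reasoner actually does.

```python
def apply_symmetry(triples, symmetric_properties):
    """Add the mirrored triple for every property declared symmetric."""
    result = set(triples)
    for s, p, o in triples:
        if p in symmetric_properties:
            result.add((o, p, s))
    return result


def violates_max_cardinality(triples, prop, maximum):
    """Return the subjects that have more than `maximum` distinct values for `prop`."""
    values_by_subject = {}
    for s, p, o in triples:
        if p == prop:
            values_by_subject.setdefault(s, set()).add(o)
    return sorted(s for s, values in values_by_subject.items() if len(values) > maximum)


facts = {
    ("w:beat", "w:isColleague", "w:lode"),
    ("w:beat", "w:hasFamilyName", "Signer"),
    ("w:lode", "w:hasFamilyName", "Hoste"),
    ("w:lode", "w:hasFamilyName", "Hosté"),  # deliberately inconsistent test data
}
expanded = apply_symmetry(facts, {"w:isColleague"})
offenders = violates_max_cardinality(expanded, "w:hasFamilyName", 1)
```

The mirrored colleague statement is derived rather than stored, and the subject with two family names is reported.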



RDF(S) / XML Serialisation
{https://wise.vub.ac.be/beat-signer, isColleague,
https://wise.vub.ac.be/lode-hoste}

<!-- the namespace URI bound to the w prefix is a placeholder -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:w="http://example.org/wise#">
  <rdf:Description rdf:about="https://wise.vub.ac.be/beat-signer">
    <w:isColleague rdf:resource="https://wise.vub.ac.be/lode-hoste"/>
    <w:hasGivenName>Beat</w:hasGivenName>
    ...
  </rdf:Description>
  ...
</rdf:RDF>

▪ Syntax not so easy to learn
▪ many different ways to construct the same statement
▪ long URIs are hard to read



RDF Notation 3 (N3)
▪ Short non-XML serialisation
▪ separate predicates with a semicolon
▪ finish subject definition with a full stop
<https://wise.vub.ac.be/beat-signer> w:isColleague <https://wise.vub.ac.be/lode-hoste> ;
    ...
    w:hasGivenName "Beat" .

▪ Note that the N3 notation offers more features than are
necessary for RDF(S) serialisation
▪ e.g. support for RDF-based rules
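The two punctuation rules (semicolon between predicates of the same subject, full stop at the end of a subject) can be illustrated with a small serialiser. The functions `format_term` and `serialise` are hypothetical, the term classification is a rough heuristic, and prefix declarations are left out; this is a sketch and not a complete N3 or Turtle writer.

```python
from collections import defaultdict


def format_term(term):
    """Wrap absolute URIs in <>, keep prefixed names, quote everything else as a literal."""
    if term.startswith(("http://", "https://")):
        return f"<{term}>"
    if ":" in term and " " not in term:
        return term  # prefixed name such as w:isColleague (heuristic)
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def serialise(triples):
    """Group triples by subject; ';' separates predicates and '.' ends a subject."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))
    blocks = []
    for subject, pairs in by_subject.items():
        body = " ;\n    ".join(f"{format_term(p)} {format_term(o)}" for p, o in pairs)
        blocks.append(f"{format_term(subject)} {body} .")
    return "\n".join(blocks)


BEAT = "https://wise.vub.ac.be/beat-signer"
text = serialise([
    (BEAT, "w:isColleague", "https://wise.vub.ac.be/lode-hoste"),
    (BEAT, "w:hasGivenName", "Beat"),
])
print(text)
```

Both statements share one subject, so the subject URI is written only once.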



RDF Turtle Notation
▪ Terse RDF Triple Language
▪ Subset of N3 language
▪ only describes RDF features (RDF graph model)
▪ Syntax looks similar to Notation 3
▪ https://www.w3.org/TeamSubmission/turtle/
▪ Many RDF frameworks (e.g. Jena) offer Turtle parser
and serialisation features



RDF Applications
▪ Annotea project
▪ defines an RDF schema for the types of annotations that can be
used to annotate webpages
▪ RSS
▪ some RSS versions use RDF(S) / XML serialisation
▪ Dublin Core
▪ widely used to describe digital media (also in standard HTML)
- bibliographic metadata such as title, creator, description, ...
▪ uses RDF(S) / XML serialisation as one possible representation
<head>
...
<meta name="DC.Subject" content="Cross-Media, Technology, Interactive Paper, ..."/>
<meta name="DC.Description" content="Beat Signer's research on ..."/>
</head>



SPARQL Query Language
▪ RDF query language which can be used to
▪ extract information as URIs, literals, blank nodes or subgraphs
▪ SPARQL SELECT queries return variable bindings
▪ SPARQL querying relies on graph pattern matching
▪ Example
▪ get the name and mbox of all subjects that have both of these
properties defined
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE { ?x foaf:name ?name .
        ?x foaf:mbox ?mbox }
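Graph pattern matching for the query above can be sketched as follows: every triple pattern is matched against all triples, and variable bindings are carried from one pattern to the next. The functions `match_pattern` and `select` and the example data are hypothetical; a real SPARQL engine supports far more than this basic graph pattern.

```python
def match_pattern(pattern, triple, bindings):
    """Try to extend `bindings` so that `pattern` matches `triple`; return None on failure."""
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # variable: bind it or compare with the earlier binding
            if new_bindings.setdefault(term, value) != value:
                return None
        elif term != value:  # constant: must match exactly
            return None
    return new_bindings


def select(variables, patterns, graph):
    """Evaluate a basic graph pattern and project the requested variables."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in graph
            if (extended := match_pattern(pattern, triple, bindings)) is not None
        ]
    return sorted(tuple(solution[v] for v in variables) for solution in solutions)


# Hypothetical example data
graph = [
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:name", "Bob"),  # no mbox, so Bob is not part of the result
]
rows = select(
    ["?name", "?mbox"],
    [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")],
    graph,
)
```

Because both patterns share the variable `?x`, only subjects with both properties produce a result row.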



Web Ontology Language (OWL)
▪ OWL evolved from DAML+OIL
▪ DAML is the DARPA Agent Markup Language
▪ OIL stands for Ontology Inference Layer
▪ There exist 3 different OWL sublanguages (flavours) with
different expressiveness
▪ OWL Full
- maximum expressiveness (full language)
- no computational guarantee
▪ OWL DL
- maximal OWL Full subset that is still computationally decidable
▪ OWL Lite
- classification hierarchy and simple constraints (limited cardinality constraints)
- weakest of the three variants



Jena Semantic Web Framework
▪ Open source Semantic Web framework for Java
▪ create and access data from RDF graphs via an RDF API
▪ offers an OWL API
▪ data can be stored in files, databases or accessed via URLs
▪ https://jena.apache.org
▪ RDF graphs can be serialised into different formats
▪ RDF/XML
▪ Notation 3
▪ Turtle
▪ relational database
▪ SPARQL query interface
▪ Multiple reasoners
Protégé
▪ Free open source platform
to create, manipulate and
visualise ontologies
▪ Two modelling tools
▪ Protégé-Frames editor
- build and populate frame-based
ontologies
- Java API for plug-ins
▪ Protégé-OWL editor
- build Semantic Web ontologies



Friend of a Friend (FOAF)
▪ First social Semantic Web
application
▪ Miller and Brickley, 2000
▪ Describe a social network
without a central database
▪ links can be followed by
spiders (data mining)
▪ no unique identifier
- identification by description
(predicates and objects)
▪ "six degrees of separation" or
"small world phenomenon"
▪ FOAFNaut browser
Friend of a Friend (FOAF) ...
▪ Personal information and connections to friends in RDF
▪ http://www.foaf-project.org
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Beat Signer</foaf:name>
    <foaf:title>Prof.</foaf:title>
    <foaf:givenname>Beat</foaf:givenname>
    <foaf:family_name>Signer</foaf:family_name>
    <foaf:nick>Beat</foaf:nick>
    <foaf:mbox_sha1sum>ce6d419869307d57839feef6445a9d64f784eb36</foaf:mbox_sha1sum>
    ...
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Moira C. Norrie</foaf:name>
        <foaf:mbox_sha1sum>4cb61b36a6feaa48c78acbb51fcce7cb356afdd6</foaf:mbox_sha1sum>
        <rdfs:seeAlso rdf:resource="http://www.globis.ethz.ch/people/norrie.rdf"/>
      </foaf:Person>
    </foaf:knows>
    ...
  </foaf:Person>
</rdf:RDF>
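A spider following FOAF links first has to extract who knows whom. The sketch below reads a shortened version of the document above with Python's standard XML parser; the function `knows` is hypothetical, and a plain XML parser only handles this particular nesting, whereas an RDF parser would accept every valid RDF/XML form.

```python
import xml.etree.ElementTree as ET

FOAF_DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Beat Signer</foaf:name>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Moira C. Norrie</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""

NS = {"foaf": "http://xmlns.com/foaf/0.1/"}


def knows(xml_text):
    """Return (person, acquaintance) name pairs from nested foaf:knows elements."""
    root = ET.fromstring(xml_text)
    pairs = []
    for person in root.findall("foaf:Person", NS):
        name = person.findtext("foaf:name", namespaces=NS)
        for friend in person.findall("foaf:knows/foaf:Person", NS):
            pairs.append((name, friend.findtext("foaf:name", namespaces=NS)))
    return pairs


pairs = knows(FOAF_DOC)
```

Each pair corresponds to one foaf:knows edge in the social graph.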



Semantic Wikis
▪ Use Semantic Web
technologies to provide
machine-processable
Wiki content
▪ page content
▪ link metadata
▪ Ontology reasoning
▪ much richer query interface
▪ Existing semantic Wikis
▪ DBpedia
▪ Semantic MediaWiki
▪ ...
Linked Data
▪ Link different data sources (URIs) on the Web
▪ provide metadata about the resources via RDF/XML, N3, etc.
▪ provide links to resources in other data sets on the Web
▪ Linked Open Data (LOD) cloud project
▪ RDF triples from currently 1269 datasets (DBpedia, GeneID, ...)
▪ more than 30 billion triples with more than 500 million links

https://lod-cloud.net



Linked Open Data



Semantic Desktops
▪ Apply Semantic Web technologies to personal information
management (PIM)
▪ inter-application data sharing
▪ enhancement of limited filesystem functionality
- add document metadata

▪ Examples
▪ Haystack
▪ Nepomuk
[Figure: Nepomuk Integration with Dolphin (KDE 4.0)]



GoodRelations
▪ Lightweight ontology for expressing
product information in e-commerce web applications
▪ Product features
▪ offers
▪ prices
▪ units
▪ ...
▪ Adopted by various companies
▪ Yahoo
▪ BestBuy
▪ ...
▪ Leads to enhanced product search functionality
Microformats
▪ Add semantics to (X)HTML pages
▪ Makes use of specific (X)HTML tag attributes
▪ class and rel attributes
- e.g. rel="nofollow" for search engines

▪ Specific microformats
▪ hCard: contact information
▪ hCalendar: event information
▪ hProduct: product information



hCard Microformat Example
<head profile="http://www.w3.org/2006/03/hcard">
...
</head>
...
<div class="vcard">
<div class="fn">Lode Hoste</div>
<div class="org">Vrije Universiteit Brussel</div>
<div class="tel">32 2629 3306</div>
<a class="url" href="http://wise.vub.ac.be/members/lode-hoste">
http://wise.vub.ac.be/members/lode-hoste</a>
</div>

▪ Some search engines (e.g. Google and Yahoo) pay
attention to different types of microformats
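How a crawler might read the hCard markup above can be sketched with Python's standard `html.parser` module. The class `HCardParser` and the property set are assumptions for this illustration; the sketch ignores void elements such as `<br>` and does not check that the properties are nested inside a `vcard` element.

```python
from html.parser import HTMLParser

HCARD_PROPERTIES = {"fn", "org", "tel", "url"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is a known hCard property."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        prop = self._stack[-1] if self._stack else None
        if prop and data.strip():
            self.card[prop] = self.card.get(prop, "") + data.strip()


markup = """
<div class="vcard">
  <div class="fn">Lode Hoste</div>
  <div class="org">Vrije Universiteit Brussel</div>
  <div class="tel">32 2629 3306</div>
</div>
"""
parser = HCardParser()
parser.feed(markup)
card = parser.card
```

The result is a plain dictionary keyed by the hCard property names.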



RDF in Attributes (RDFa)
▪ Add a set of attribute extensions to (X)HTML for
embedding RDF metadata
▪ Different vocabularies
▪ FOAF, video, audio, commerce, …
▪ Search engines (e.g. Yahoo and Google) process certain
RDFa metadata (e.g. product information)
<p xmlns:dc="http://purl.org/dc/elements/1.1/"
   about="http://www.amazon.com/...">
  and the will to live. <span property="dc:creator">Simpson</span>
  dedicates the book <cite property="dc:title">Touching the Void</cite> to
  the... The book was published in <span property="dc:date"
  content="1989-12-01">December 1989</span>.
</p>



Microdata
▪ Add machine readable metadata (semantics) to
HTML5 documents in the form of key/value pairs
▪ W3C Working Group Note
▪ can be used by crawlers, search engines (SEO) and browsers to
provide a richer browsing experience
▪ alternative to Microformats and RDFa
<section itemscope itemtype="http://data-vocabulary.org/Person">
Hello, my name is <span itemprop="name">Beat Signer</span> and I am a
<span itemprop="title">Professor</span> at the
<span itemprop="affiliation">Vrije Universiteit Brussel. </span>
<section itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">My address is:
<span itemprop="street-address">Pleinlaan 2</span>,
<span itemprop="postal-code">1050 </span>
<span itemprop="locality">Brussels</span>,
<span itemprop="country-name">Belgium</span>.
</section>
</section>
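The key/value structure of the Microdata example can be made visible by turning the `itemscope` and `itemprop` attributes into nested dictionaries. The class `MicrodataParser` is a hypothetical, simplified sketch built on Python's standard `html.parser`; it does not implement the full Microdata extraction algorithm (for example `itemref` or URL-valued properties).

```python
from html.parser import HTMLParser


class MicrodataParser(HTMLParser):
    """Build nested dicts from itemscope/itemprop attributes (simplified)."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items
        self._scopes = []  # stack of (element depth, item dict) for open itemscopes
        self._props = []   # stack of (element depth, property name) for open itemprops
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        self._depth += 1
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype")}
            if prop and self._scopes:
                self._scopes[-1][1][prop] = item  # nested item is the property value
            else:
                self.items.append(item)
            self._scopes.append((self._depth, item))
        elif prop:
            self._props.append((self._depth, prop))

    def handle_endtag(self, tag):
        if self._scopes and self._scopes[-1][0] == self._depth:
            self._scopes.pop()
        if self._props and self._props[-1][0] == self._depth:
            self._props.pop()
        self._depth -= 1

    def handle_data(self, data):
        if self._props and self._scopes and data.strip():
            self._scopes[-1][1][self._props[-1][1]] = data.strip()


markup = """
<section itemscope itemtype="http://data-vocabulary.org/Person">
  My name is <span itemprop="name">Beat Signer</span>.
  <section itemprop="address" itemscope itemtype="http://data-vocabulary.org/Address">
    <span itemprop="locality">Brussels</span>
  </section>
</section>
"""
parser = MicrodataParser()
parser.feed(markup)
person = parser.items[0]
```

Text outside of `itemprop` elements is ignored, so only the annotated values end up in the dictionary.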



Exercise 9
▪ Semantic Web



References
▪ Tim Berners-Lee, James Hendler and Ora Lassila, The Semantic Web, Scientific American Magazine, May 2001
▪ https://www.scientificamerican.com/article.cfm?id=the-semantic-web
▪ The Future Internet: Service Web 3.0
▪ https://www.youtube.com/watch?v=off08As3siM
▪ Resource Description Framework (RDF)
▪ https://www.w3.org/RDF/
▪ Thomas B. Passin, Explorer's Guide to the Semantic Web, Manning Publications, March 2004



References ...
▪ The Linked Open Data Cloud
▪ https://lod-cloud.net



Next Lecture
Web Search and SEO

2 December 2005
