Web of Data Tutorial - Part 1
1 SAMT 2009 – Tutorial Web of Data in the Context of Multimedia (WoDMM) Graz, Austria - 2 Dec 2009
Agenda
Introduction Multimedia Interlinking
Introduction: The Web of Data Vision
Web of Data Vision
There is lots of data about the movie “The Shining”
available on the Web…
Starring: Jack Nicholson, Shelley Duvall, Danny Lloyd, …
Produced by: Stanley Kubrick
Web of Data Vision
…but only in a human-readable representation
(HTML)
Web of Data Vision
The Web is successful because it provides
Uniform encoding (HTML)
Uniform addressing (URI)
Uniform transportation (HTTP)
for the exchange of documents.
Web of Data Vision
The Linked Data Principles
The Enabling Technologies
URI
Uniform Resource Identifiers (URI) identify things
Use dereferenceable HTTP URIs in the Web of Data
RDF
A data model for representing metadata on the Web
Several statements (triples) form a graph
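As a minimal sketch of how several triples form a graph, the Python below indexes (subject, predicate, object) statements by subject. The example.org URIs and property names are illustrative assumptions, not terms from a real vocabulary.

```python
from collections import defaultdict

# Each RDF statement is a (subject, predicate, object) triple.
# All URIs below are made up for illustration.
TRIPLES = [
    ("http://example.org/film/the-shining",
     "http://example.org/vocab/director",
     "http://example.org/person/stanley-kubrick"),
    ("http://example.org/film/the-shining",
     "http://example.org/vocab/starring",
     "http://example.org/person/jack-nicholson"),
    ("http://example.org/person/stanley-kubrick",
     "http://example.org/vocab/name",
     "Stanley Kubrick"),
]


def build_graph(triples):
    """Index triples by subject: each subject is a node with labelled outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(TRIPLES)
```

Because the object of one statement can be the subject of another, following the edges from the film node leads on to the description of its director.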
RDF
Links are an intrinsic RDF feature
RDF/XML, N3, Turtle, etc
RDFS
A language for describing vocabularies in a machine-
understandable way
OWL
A more expressive language for expressing
vocabularies and/or ontologies in a machine-
understandable way
SKOS
A language for describing controlled vocabularies
SPARQL
A query language and protocol for accessing RDF
data on the Web
Linked Data Implementation Best
Practices
How to publish vocabularies
Hash-based URIs
E.g., http://example.com/example1#ClassA
Suited to group the description of a moderate number of related
terms into one document
Agent can retrieve terms with a single HTTP request
Slash-based URIs
E.g., http://example.com/example1/ClassB
Suited to split the descriptions of terms in large vocabularies into
one document per term
No need for the agent to download a massive document to find
the description of a term
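The difference between the two styles can be sketched in Python: an HTTP client never sends the fragment, so all hash-based terms resolve to one document, while each slash-based term is its own document. The term names PropertyX and PropertyY are hypothetical additions to the slide's examples.

```python
from urllib.parse import urldefrag


def document_to_fetch(term_uri):
    """Return the URL an HTTP client requests for a term URI.

    The fragment (the part after '#') is stripped before the request is sent.
    """
    url, _fragment = urldefrag(term_uri)
    return url


hash_terms = [
    "http://example.com/example1#ClassA",
    "http://example.com/example1#PropertyX",  # hypothetical second term
]
slash_terms = [
    "http://example.com/example1/ClassB",
    "http://example.com/example1/PropertyY",  # hypothetical second term
]

hash_docs = {document_to_fetch(t) for t in hash_terms}
slash_docs = {document_to_fetch(t) for t in slash_terms}
```

Here `hash_docs` contains a single URL and `slash_docs` contains one URL per term, which mirrors the trade-off described above.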
How to publish vocabularies
E.g., extended configuration for hash namespace
How to publish vocabularies
E.g., extended configuration for hash namespace
How to publish Linked Data
Distinguish between
Non-information resource
http://dbpedia.org/resource/The_Shining_%28film%29
Information resource
http://dbpedia.org/page/The_Shining_%28film%29 (HTML)
http://dbpedia.org/data/The_Shining_%28film%29 (RDF)
The Linking Open Data Project
Some clarifications
Open Data: a philosophy, practice, or policy holding that data
should be freely available to everyone, without restrictions
from copyright, patents, and so on.
As of October 2007
Available Tools - Overview
RDF APIs
Jena Semantic Web Framework (Java)
http://jena.sourceforge.net/
Sesame (Java)
ARC (PHP)
http://arc.semsol.org/
Redland RDF – Ruby interface (Ruby)
http://librdf.org/docs/ruby.html
RDFlib (Python)
http://www.rdflib.net/
Triple Stores
Jena Semantic Web Framework (Java)
http://jena.sourceforge.net
Sesame (Java)
http://www.openrdf.org/
OpenLink Virtuoso
http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/
ARC (PHP)
http://arc.semsol.org/
…
References
RDF Primer: http://www.w3.org/TR/rdf-primer/
OWL 2 Overview: http://www.w3.org/TR/2009/REC-owl2-primer-20091027/
Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/TR/swbp-vocab-pub/
How to Publish Linked Data on the Web: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
Pedantic Web Group: http://pedantic-web.org/
References
Linking Open Data Project
http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
Agenda
Introduction Multimedia Interlinking
Linked Data Publishing Steps
The basic tenets of Linked Data are to:
use the RDF data model to publish structured data
on the Web
use RDF links to interlink data from different data
sources
Principles
Resources
all items of interest are called resources
Resource Identifiers
Uniform Resource Identifiers (URIs) are used, use of
HTTP URIs is strongly suggested
Representation
of an information resource is a stream of bytes
(HTML, JPG, RDF, …)
Content Negotiation
Example:
http://dbpedia.org/resource/Graz
(URI identifying the non-information resource Graz)
http://dbpedia.org/data/Graz (information resource with an RDF/XML representation describing Graz)
http://dbpedia.org/page/Graz (information resource with an HTML representation describing Graz)
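The redirect behaviour behind this example can be sketched as a small Python function. This is a simplified illustration, assuming the server only checks whether RDF/XML is mentioned in the Accept header; a real server would parse media types and q-values properly.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"


def negotiate(resource_uri, accept_header):
    """Return (status, location) for a request on a non-information resource URI.

    Simplification (assumption): RDF/XML wins whenever it appears in the
    Accept header; every other client is sent to the HTML page.
    """
    if not resource_uri.startswith(RESOURCE_PREFIX):
        raise ValueError("not a /resource/ URI")
    name = resource_uri[len(RESOURCE_PREFIX):]
    if "application/rdf+xml" in accept_header:
        return 303, "http://dbpedia.org/data/" + name
    return 303, "http://dbpedia.org/page/" + name


rdf_answer = negotiate("http://dbpedia.org/resource/Graz", "application/rdf+xml")
html_answer = negotiate("http://dbpedia.org/resource/Graz", "text/html")
```

Both answers are 303 redirects: the thing itself (the city of Graz) cannot be sent over HTTP, so the client is pointed to a document that describes it.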
Content Negotiation with RDFa
When RDF is embedded in another representation
(e.g. via RDFa) no content negotiation is needed
(X)HTML
RDF
XHTML+RDFa representation
RDFa example
Plain (X)HTML:
... All content on this site is licensed under
<a href="http://creativecommons.org/licenses/by/3.0/">
a Creative Commons License </a>.
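The slide's annotated counterpart is not preserved in this text. As a hedged sketch, assume the link is marked up with rel="license", as is common in RDFa introductions. The Python below is not a conforming RDFa processor; it only shows how such an attribute lets a program read a statement out of the page. The page URI is hypothetical.

```python
from html.parser import HTMLParser

# Assumed RDFa-style variant of the plain link: rel names the relationship.
ANNOTATED = (
    'All content on this site is licensed under '
    '<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">'
    'a Creative Commons License</a>.'
)


class RelLinkExtractor(HTMLParser):
    """Collect (page, rel, href) statements from <a> elements carrying a rel attribute."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.statements = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "rel" in attributes and "href" in attributes:
            self.statements.append(
                (self.page_uri, attributes["rel"], attributes["href"])
            )


extractor = RelLinkExtractor("http://example.org/page")  # hypothetical page URI
extractor.feed(ANNOTATED)
statements = extractor.statements
```

The single extracted statement says that the page is related to the Creative Commons URI by a "license" relationship, which is information the plain link does not expose to machines.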
Choosing URIs
Resources are named with URI references
Choosing URIs
Use HTTP URIs for everything
Define your URIs in an HTTP namespace under your
control
Keep implementation details out of your URIs.
Short, mnemonic names are better
http://dbpedia.org/resource/Graz vs.
http://www.confuseme.com:2020/demos/xyz/cgi-bin/resources.php?id=Graz
Try to keep your URIs stable and persistent
Cool URIs don’t change!
Use some kind of primary key inside your URIs
(e.g. when dealing with books use the ISBN, etc.)
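A minimal sketch of these guidelines in Python: URIs are built from a namespace the publisher controls, a resource kind, and a primary key, with no server or script details in the path. The namespace and the ISBN-like key are placeholders, not real identifiers.

```python
from urllib.parse import quote

# Hypothetical HTTP namespace under the publisher's control.
NAMESPACE = "http://example.org/resource/"


def mint_uri(kind, key):
    """Build a short, stable URI from a resource kind and its primary key."""
    return NAMESPACE + quote(kind, safe="") + "/" + quote(str(key), safe="")


book_uri = mint_uri("book", "9780000000002")  # placeholder ISBN-style key
city_uri = mint_uri("city", "New York")       # characters unsafe in URIs are percent-encoded
```

Because the URI depends only on the key, it stays the same when the underlying software or database changes.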
Which vocabularies to use?
Reuse terms from well-known vocabularies wherever
possible
Only define new terms yourself if you can’t find
required terms in existing vocabularies
Some well-known vocabularies
Friend-of-a-Friend (FOAF), vocabulary for describing
people.
Dublin Core (DC) defines general metadata attributes.
Semantically-Interlinked Online Communities (SIOC),
vocabulary for representing online communities.
Description of a Project (DOAP), vocabulary for describing
projects.
Simple Knowledge Organization System (SKOS),
vocabulary for representing taxonomies and loosely
structured knowledge.
Music Ontology provides terms for describing artists,
albums and tracks.
Review Vocabulary, vocabulary for representing reviews.
Creative Commons (CC), vocabulary for describing
license terms.
More best practices
You can mix terms from different vocabularies, e.g.
rdfs:label and foaf:depiction
Defining your own terms
If you need to define your own terms, use RDFS or OWL.
What should the RDF contain?
What triples should go into the RDF representation that
is returned (after a 303 redirect) in response to
dereferencing a URI identifying a non-information
resource?
Description
Backlinks
(Related descriptions)
Metadata
Syntax: RDF/XML (+ maybe other serializations)
Serving linked data
Things must be identified with dereferenceable HTTP
URIs
Serving static RDF files
RDF files generated manually or
generated by some software
Serving relational databases
Several tools exist to generate RDF from relational
databases
Linked Data Publishing Steps
For more information:
RDB2RDF
W3C Incubator Group in 2008 & early 2009 to
examine existing approaches for generating RDF
from relational databases
http://www.w3.org/2005/Incubator/rdb2rdf/
RDB2RDF Working Group
W3C Working Group started in 2009 to standardize
a language for mapping relational data and relational
database schemas into RDF and OWL, tentatively
called the RDB2RDF Mapping Language, R2RML
http://www.w3.org/2001/sw/rdb2rdf/
Tools for Publishing Linked Data
Pubby Linked Data Frontend
Provides Linked Data from SPARQL endpoint
data must be available in RDF already
if not: use a wrapper
Originally developed for DBpedia
Richard Cyganiak, Chris Bizer - FU Berlin
Provides dereferenceable HTTP-URIs
Simple HTML interface for browsing
Handles 303 redirects correctly
Content Negotiation (HTML, RDF/XML, N3)
Java Web application
Pubby - Architecture
Wrapping Relational Databases
D2RQ-Map and D2R-Server (FU Berlin)
http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/index.html
Triplify (Uni Leipzig)
http://triplify.org
OpenLink Virtuoso RDF Views
http://virtuoso.openlinksw.com/wiki/main/Main/
D2RQ-Map and D2R-Server
Java
wraps any relational database reachable via JDBC as RDF
2 Components
D2RQ-Map (wrapping component): dumps + virtual
D2R-Server (adds SPARQL endpoint)
can be used in Jena applications (Assembler)
Automatic generation of mapping file (simple)
shell script: “generate-mapping”
D2RQ-Map and D2R-Server
D2R-Server How-To
Download/Extract
Generate mapping file automatically first
generate-mapping -o mapping.n3 -d driver.class.name
-u db-user -p db-password jdbc:url:...
Inspect the generated mapping
Model your desired target graph
you should always know what you want...
also study your source database model
Adjust the mapping
Mapping language
d2rq:Database
d2rq:ClassMap
d2rq:PropertyBridge
Rather expressive
Joins
Conditions
Value-translations
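The roles of these three constructs can be sketched conceptually in Python. This is not D2RQ's mapping syntax or API; it only illustrates the idea that a class map turns each row into a resource and each property bridge turns a column into a property. The table, URI pattern, and vocabulary URIs are made up for illustration.

```python
import sqlite3

# Plays the role of the database description: a hypothetical in-memory source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.execute("INSERT INTO films VALUES (1, 'The Shining', 1980)")

# Plays the role of a class map: one resource per row, typed with a class.
CLASS_MAP = {
    "table": "films",
    "uri_pattern": "http://example.org/film/{id}",
    "rdf_class": "http://example.org/vocab/Film",
}
# Plays the role of property bridges: one property per mapped column.
PROPERTY_BRIDGES = [
    {"column": "title", "property": "http://example.org/vocab/title"},
    {"column": "year", "property": "http://example.org/vocab/year"},
]
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_triples(connection, class_map, bridges):
    """Turn every row of the mapped table into (subject, predicate, object) triples."""
    columns = ["id"] + [b["column"] for b in bridges]
    query = "SELECT " + ", ".join(columns) + " FROM " + class_map["table"]
    triples = []
    for row in connection.execute(query):
        record = dict(zip(columns, row))
        subject = class_map["uri_pattern"].format(id=record["id"])
        triples.append((subject, RDF_TYPE, class_map["rdf_class"]))
        for bridge in bridges:
            triples.append((subject, bridge["property"], record[bridge["column"]]))
    return triples


triples = rows_to_triples(conn, CLASS_MAP, PROPERTY_BRIDGES)
```

Joins, conditions, and value translations would extend this picture by changing which rows are selected and how column values are rewritten before they become objects.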
Example Mapping
Wrapping Spreadsheets
Excel2RDF, RDF123, TopBraid Composer
XLWrap (A. Langegger)
http://xlwrap.sourceforge.net/
supports cross tables, repetitive patterns in spreadsheets
arbitrary target graphs
powerful expressions (extensible, user-defined functions)
XLWrap Example
Example Template Graph
[ xl:uri "'http://example.org/revenue_' & URLENCODE(SHEETNAME(A1) & '_' & B2 & '_' & A4)"^^xl:Expr ] a ex:Revenue ;
  ex:country "DBP_LOCALITY(SHEETNAME(A1))"^^xl:Expr ;
  ex:year "DBP_YEAR(B2)"^^xl:Expr ;
  ex:product "A4"^^xl:Expr ;
  ex:itemsSold "B4"^^xl:Expr ;
  ex:revenue "C4"^^xl:Expr .
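To make the xl:uri expression more concrete, the Python sketch below mimics how it might expand for one set of cell values. It assumes that '&' concatenates strings and that URLENCODE percent-encodes its argument; the sheet name and cell contents are hypothetical, and this is not XLWrap's own evaluator.

```python
from urllib.parse import quote


def revenue_uri(sheet_name, b2, a4):
    """Mimic the template's URI expression: join the parts with '_' and URL-encode them."""
    return "http://example.org/revenue_" + quote(f"{sheet_name}_{b2}_{a4}", safe="")


# Hypothetical values: sheet named "Austria", B2 = "2008", A4 = "Product 1".
uri = revenue_uri("Austria", "2008", "Product 1")
```

Each combination of sheet, year cell, and product cell yields a distinct resource URI, which is what allows the template to be applied repeatedly across a cross table.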
Agenda
Introduction Multimedia Interlinking
Existing Relevant
Linked Data Sets
The Linked Data Cloud
(a success story, 2007-2009)
[Figures: Linked Data cloud diagrams, including snapshots dated 10/2007 and 03/2008]
Some numbers
Data Set                  # of RDF Triples
ACM RKB                   > 12,000,000
AudioScrobbler            > 600,000,000
BBC Music + Programmes    > 20,000,000
Bio2RDF                   > 2,000,000,000
data.gov                  > 5,000,000,000
DBpedia                   > 470,000,000
Freebase                  > 100,000,000
Geonames                  > 90,000,000
Linked Geo Data           > 3,000,000,000
MusicBrainz               > 60,000,000
RDF Book Mashup           > 100,000,000
US Census Data            > 1,000,000,000
Source: http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics
DBpedia: the Linked Data Hub
DBpedia [Auer07] is a Linked Data representation of
Wikipedia content.
(Semi-)structured data are extracted (mostly from
infoboxes) and are published as RDF.
DBpedia 3.4 (Nov. 2009) describes 2.9 million things,
including persons, places, organizations, ...
Each thing has a URI; most of them have a label
and an abstract, in up to 91 languages.
DBpedia provides > 8 million links to other data sets
(web pages and RDF), and 75,000 YAGO
categories.
DBpedia Vocabularies
BBC Linked Data
BBC is continually increasing their Linked Data
services.
BBC Programmes publishes all broadcast
programmes and provides URIs for them.
BBC Music provides data about music artists and
links to DBpedia and MusicBrainz.
BBC uses Linked Data technology to link their
internal, heterogeneous data sets [Kobi09].
Used vocabularies: RDF(S), FOAF, DC, Music
Ontology, ...
DBtune
DBtune [Rai08] is a collection of music-related
Linked Data sets, amongst others:
Jamendo (creative commons music site)
MusicBrainz (community music metadata)
AudioScrobbler (last.fm playcounts)
MySpace
URIs for entities can be retrieved by lookup services
or through links to DBpedia and other data sets.
Used vocabularies: RDF(S), FOAF, DC, Music
Ontology, ...
Linked Movie Data Base
LMDB [Hass09] is an RDF representation of parts of
the data from the Internet Movie Database (IMDb).
Data about ~40,000 films, ~30,000 actors, ~8,000
directors, etc.
Links to
DBpedia
YAGO
flickr
RDF book mashup
MusicBrainz
Geonames
Vocabularies: RDF(S), FOAF, Movie Ontology,
SKOS, DC
There is more ...
flickr wrapper: generates links to flickr pictures on
demand, represents them as RDF
Geonames: information about places (cities,
countries)
Linked Geo Data: RDF representation of
OpenStreetMap (geo data)
revyu.com: provides reviews for any kind of entities
(including movies and music)
New York Times: > 5,000 concepts from their archive
and many more ...
Web of Data in the Context of Multimedia
Part 1: Linked Open Data: Vision, Concepts and Technologies
Linking Data
Linking Data
RDF links enable Linked Data browsers and crawlers
to navigate between data sources and to discover
additional data.
Linking Data
Example: Equivalent URIs for
http://data.linkedmdb.org/resource/film/2014
owl:sameAs
1. http://dbpedia.org/resource/The_Shining_(film)
2. http://data.linkedmdb.org/resource/film/2014
3. http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000046c3da
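A consumer typically wants all URIs that denote the same thing. The Python sketch below groups URIs connected by owl:sameAs links into equivalence classes using a small union-find structure; it treats owl:sameAs as symmetric and transitive. The pairing of the links is one reading of the example above, not data fetched from the services.

```python
def same_as_closure(links):
    """Group URIs into equivalence classes from pairwise owl:sameAs links."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps chains short
            uri = parent[uri]
        return uri

    for a, b in links:
        parent[find(a)] = find(b)

    classes = {}
    for uri in list(parent):
        classes.setdefault(find(uri), set()).add(uri)
    return list(classes.values())


LINKS = [
    ("http://data.linkedmdb.org/resource/film/2014",
     "http://dbpedia.org/resource/The_Shining_(film)"),
    ("http://data.linkedmdb.org/resource/film/2014",
     "http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000046c3da"),
]
classes = same_as_closure(LINKS)
```

Although no link connects the DBpedia and Freebase URIs directly, they end up in the same class because both are linked to the LinkedMDB URI.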
Setting RDF links manually
Usually only done in very small datasets like
personal FOAF profiles (e.g. stating who you know
by setting foaf:knows links)
Tools for data interlinking
Input: 2 (or more) datasets + linkage specification
Output: links between the datasets
Domain-independent tools:
SILK
ODD-Linker
RDF-AI
KnoFuss
Domain-specific tools:
LD-Mapper (Music Ontology)
RKB co-reference resolution system (publications)
Overview
Tool         Data access
RKB CRS      API
LD-Mapper    local copy
ODD          ODBC
RDF-AI       local copy
Silk         SPARQL
KnoFuss      local copy
Specification formats
Silk-LSL
datasets: SPARQL endpoint, graphs name
resources: resources to interlink, resources type
links: link condition (for each resource)
matching: string matching, matchers combination
KnoFuss
datasets: local copy
resources: (SPARQL query)
links: fusion method
matching: string matching (for each resource)
RDF-AI
datasets: local copy
resources: resource descriptions
links: link description
matching: fuzzy string, wordnet
SILK
Silk - A Linking Framework for the Web of Data
http://www4.wiwiss.fu-berlin.de/bizer/silk/
SILK
Input: 2 datasets behind SPARQL endpoint + LSL
specification
<Interlink id="cities">
<LinkType>owl:sameAs</LinkType>
<SourceDataset dataSource="dbpedia" var="a">
<RestrictTo>
?a rdf:type dbpedia:City
</RestrictTo>
</SourceDataset>
<TargetDataset dataSource="geonames" var="b">
<RestrictTo>
?b rdf:type gn:P
</RestrictTo>
</TargetDataset>
<LinkCondition>
<AVG>
<Compare metric="jaroSimilarity">
<Param name="str1" path="?a/rdfs:label" />
<Param name="str2" path="?b/gn:name" />
</Compare>
<Compare metric="numSimilarity">
<Param name="num1" path="?a/dbpedia:populationTotal" />
<Param name="num2" path="?b/gn:population" />
</Compare>
</AVG>
</LinkCondition>
<Thresholds accept="0.9" verify="0.7" />
<Output acceptedLinks="accepted_links.n3"
verifyLinks="verify_links.n3"
mode="truncate" />
</Interlink>
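The logic of this link condition can be sketched in Python: two similarity scores are averaged and the result is compared against the accept and verify thresholds. This is not Silk's implementation. difflib's ratio is swapped in for the Jaro metric, the numeric similarity formula is an assumption, and the population figures are made up.

```python
from difflib import SequenceMatcher

ACCEPT, VERIFY = 0.9, 0.7  # thresholds taken from the specification above


def string_similarity(a, b):
    """Stand-in for the Jaro metric: difflib's ratio, which also lies in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def num_similarity(x, y):
    """Assumed numeric similarity: 1.0 for equal values, approaching 0 as they diverge."""
    if x == y:
        return 1.0
    return 1.0 - abs(x - y) / max(abs(x), abs(y))


def classify(label_a, population_a, label_b, population_b):
    """Average both scores (the AVG element) and apply the thresholds."""
    score = (string_similarity(label_a, label_b)
             + num_similarity(population_a, population_b)) / 2
    if score >= ACCEPT:
        return "accept"
    if score >= VERIFY:
        return "verify"
    return "reject"


# Made-up population figures, for illustration only.
exact = classify("Graz", 250000, "Graz", 250000)
unsure = classify("Graz", 250000, "Graz", 150000)
different = classify("Graz", 250000, "Linz", 10)
```

Accepted pairs would be written to the accepted-links file and pairs in the verify band to the file reserved for manual checking, matching the two outputs named in the specification.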
Web of Data in the Context of Multimedia
Part 1: Linked Open Data: Vision, Concepts, and Technologies:
Consuming Linked Data
Agenda
URI Discovery
Data Discovery
Data Set Discovery
URI Discovery
1. Querying specific data sources,
e.g., http://lookup.dbpedia.org
2. Using dedicated search engines,
e.g.,
Falcons
http://iws.seu.edu.cn/services/falcons/
Sindice
http://sindice.com
SWSE
http://www.swse.org
Watson
http://watson.kmi.open.ac.uk
Example Search Engine:
Falcons
(1) Entry of keywords
Sameas.org
Sindice.org
Data Set Discovery
1. Manually, i.e., by browsing and selecting a data set
from http://esw.w3.org/topic/SparqlEndpoints
2. (Semi-)automatically, i.e., by exploiting voiD, the
Vocabulary of Interlinked Datasets
Describing Datasets
The problem:
Only human-comprehensible descriptions of datasets are available
This makes it impossible to automate tasks such as
Efficient & effective search
Selection of datasets (for apps, interlinking targets)
Generation of maps, etc.
cf. K. Alexander, R. Cyganiak, M. Hausenblas, and J. Zhao "Describing Linked Datasets" Proceedings of LDOW 2009, 2009.
voiD – Core concepts
A dataset is a set of RDF triples that are published, maintained or
aggregated by a single provider.
A dataset is authoritative with respect to a certain URI namespace
if it contains information about resources named by URIs in this
namespace, and is published by the URI owner.
A linkset LS is a set of RDF triples where for all triples
t_i = ⟨s_i, p_i, o_i⟩ ∈ LS, the subject is in one dataset, i.e. all s_i are
described in DS_1, and the object is in another dataset, i.e. all o_i are
described in DS_2.
cf. K. Alexander, R. Cyganiak, M. Hausenblas, and J. Zhao "Describing Linked Datasets" Proceedings of LDOW 2009, 2009.
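The linkset definition can be checked mechanically. In the Python sketch below, each dataset is reduced to the set of resources it describes, and a set of triples counts as a linkset if every subject is described in the first dataset and every object in the second. The example.org URIs are hypothetical.

```python
def is_linkset(triples, described_in_ds1, described_in_ds2):
    """True if every triple has its subject described in DS1 and its object in DS2."""
    return all(
        s in described_in_ds1 and o in described_in_ds2
        for s, _p, o in triples
    )


SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DS1 = {"http://example.org/ds1/film/1", "http://example.org/ds1/film/2"}
DS2 = {"http://example.org/ds2/movie/a", "http://example.org/ds2/movie/b"}

links = [
    ("http://example.org/ds1/film/1", SAME_AS, "http://example.org/ds2/movie/a"),
    ("http://example.org/ds1/film/2", SAME_AS, "http://example.org/ds2/movie/b"),
]
# A triple whose object also lives in DS1 violates the definition.
internal = links + [
    ("http://example.org/ds1/film/1", SAME_AS, "http://example.org/ds1/film/2"),
]

links_ok = is_linkset(links, DS1, DS2)
internal_ok = is_linkset(internal, DS1, DS2)
```

The second call returns False because one triple stays inside a single dataset instead of crossing from DS1 to DS2.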
voiD Vocabulary
cf. K. Alexander, R. Cyganiak, M. Hausenblas, and J. Zhao "Describing Linked Datasets" Proceedings of LDOW 2009, 2009.
voiD Usage Example
cf. K. Alexander, R. Cyganiak, M. Hausenblas, and J. Zhao "Describing Linked Datasets" Proceedings of LDOW 2009, 2009.
Tools and Applications to
Consume Linked Data
Linked Data browsers
To explore things and datasets and to navigate between them.
Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE), OpenLink RDF
Browser (OpenLink, UK), Zitgist DataViewer (Zitgist, USA), Disco
Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland)
Linked Data mashups
Sites that mash up (i.e., combine) Linked Data
Revyu.com (KMI, UK), DBtune Slashfacet (Queen Mary, UK), DBpedia
Mobile (FU Berlin, DE), Semantic Web Pipes (DERI, Ireland)
Search engines
To search for Linked Data.
Falcons (IWS, China), Sindice (DERI, Ireland), MicroSearch (Yahoo,
Spain), Watson (Open University, UK), SWSE (DERI, Ireland), Swoogle
(UMBC, USA)
Example Linked Data Browser:
Marbles
Server-based Linked Data browser.
Formats RDF for XHTML using Fresnel.
Unique feature: Indicates the origin of displayed data
using colored dots.
Support for different views:
Full view: all available data is displayed.
Summary view: returns a short textual summary about a resource.
Photo view: provides a photo for a given resource.
Retrieves data from multiple sources by (a) issuing
parallel queries to multiple Linked Data search
engines and (b) by following owl:sameAs and
rdfs:seeAlso links.
Example Linked Data Browser:
Marbles
[Screenshot callouts: (1) Entry of query URL, (3) Sources]
Example Linked Data Browser:
gFacet
cf. Heim, P., Ziegler, J., and Lohmann, S. "gFacet: A Browser for the Web of Data" In
Proceedings of IMC-SSW 2008, 2008. Try yourself: http://gFacet.org
Example Mashup: Revyu.com
What’ll come next
Part 2: Linked Multimedia