24/10/08

ebXML Day (Barcelona, 23.5.2002)
Implementing ebXML Registry Information Model

Peter Burgess

TietoEnator©2002

Some background...

• Technical Architect (Media and Telecom section, TietoEnator in Brussels, Belgium).
• Specialising in system architecture and development of Java/XML solutions.
• Open-source evangelist.
• New committer on the ebxmlrr open-source project.
• Previously worked for Nokia (Finland) and IBM Global Services (Belgium).


Some background...
TietoEnator

• Staff of 10,000 and annual net sales of 1.1 billion euros.
• IT services organisation with a strong base in Scandinavia, especially Finland and Sweden.
• Consulting, systems development and integration, operation and support, product development services, and software services.
• In Belgium, working with both commercial and public sector clients.
• http://www.tietoenator.com/


Our Project

• Implementation of the MIReG metadata model and framework
  – MIReG = Managing Information Resources for eGovernment
• Sponsored by the European Commission's IDA initiative
  – IDA = Interchange of Data between Administrations
• IDA's mission: 'using advances in information and communications technology to support rapid electronic exchange of information between Member State administrations'
• http://europa.eu.int/ISPO/ida/


Project Goal

• To implement a system for managing metadata about information resources, documents and services.
• To implement a system that facilitates:
  – content interoperability
  – simplification of administrative processes
  – improved information flows
• To allow users to:
  – locate and track documents, metadata and versions
  – search and manage content
  – search and manage administrative metadata


What is Metadata?

• Data about data.
• Metadata describes a resource, e.g.:
  – Name
  – Title
  – Subject
  – Date issued
  – Version
  – Date modified
  – Identifier
• The Dublin Core standard is a simple standard for describing a wide range of networked resources. See http://dublincore.org


Dublin Core Elements
Creator, Contributor, Coverage, Date, Description, Format, Identifier, Language, Publisher, Relation, Rights, Subject, Source, Title, Type


MIReG Metadata Model & Framework

• The metadata management system manages metadata about information resources, documents and services.
• Describes citizens, enterprises, public servants and long-lived information (e.g. archived documents).
• Dublin Core + MIReG extensions:
  – administrative metadata, to describe how the resource should be managed and processed
  – access rights
  – security classification
  – disposal
  – long-term preservation
  – etc.


Functional requirements
The metadata management system should support:
• Exporting documents and their metadata
• Converting existing metadata to Dublin Core/RDF
• Adding or updating administrative metadata
• Storing metadata
• Providing metadata search capability
• Importing documents and their metadata

Why ebXML?

The Metadata Management System should:
• be flexible and evolutionary;
• facilitate 'content interoperability', i.e. information exchange between organisations;
• be standards compliant and open;
• provide well-defined interfaces, allowing Creation, Update, Retrieval and Deletion of metadata and content.


Non-functional requirements

• All open-source solution, adopting 'best of breed' components (e.g. Apache WS, Apache Tomcat, Apache Xindice, Castor Java-XML binding).
• Co-operate with the open-source community wherever possible.
• Delivered system based on standards such as ebXML, W3C Schema, RDF, Dublin Core.
• Total XML solution, from database to user interface.


Tools and APIs: Requirements

• Open-source! Ability to see the code and make changes if necessary.
• Support: the open-source community responds quickly to bug reports and questions.
• Don't reimplement the ebXML Registry from scratch; co-operate with existing open-source team(s).
• Don't reinvent the wheel; reuse the best existing solutions.
• Use stable and well-adopted tools (e.g. Apache Web Server 1.3, Apache Tomcat 4.0+).

Tools and APIs: Problems
 

• Steep learning curve: many new tools and APIs to master.
• Support for the full W3C Schema standard not available in Castor Java-XML binding.
• No concrete JAXB implementation available that supports W3C Schema (only DTD).
• Xindice 1.0 only really supports US-ASCII (a UTF-8 patch is now available in Xindice 1.1 development).
• Xindice XPath contains() search is slow. Must use equality tests to gain the benefits of indexing.
• Xindice's transaction support not yet available.


Standards & Technologies (1)
Standards:
• JAXP – Java API for XML Processing
• JAXB – Java Architecture for XML Binding
• JAXR – Java API for XML Registries
• JAXM – Java API for XML Messaging
• SOAP Version 1.1
• W3C XML Schema
• RDF & RDF Schema
• XSLT (Version 1.0)
• XPath (Version 1.0)
• XML:DB
• Java Servlet specification (Sun Microsystems), version 2.3
• JSP specification (Sun Microsystems), version 1.2


Standards & Technologies (2)
Open-source technologies & tools:
• Apache Web Server 1.3
• Apache Tomcat 4.0.3 servlet engine
• Apache SOAP (XML messaging API)
• Apache Xerces XML parser
• Apache Xalan XSLT processor
• Apache Xindice (DBXML) native XML database
• Castor open-source framework for Java-XML binding

All software written using the Java programming language (Java versions 1.3 and 1.4)


Architecture Overview (High Level)


Architecture Overview (ebXML Service Layer)


Architecture Overview (XML database layer)


Xindice v Relational

Relational database model:
– Tables
– Views
– Data is structured, based on a pre-defined schema
– Standardised queries via SQL (SELECT, INSERT, UPDATE, DELETE etc.)
– Most RDBMS support JOIN operations
– Possible to map XML to relational (e.g. IBM DB2 XML Extender)


Xindice v Relational
Xindice database model:
– Hierarchical organisation of data
– The root of the hierarchy is a database instance
– Data managed as XML documents
– Insert the data as XML and retrieve it as XML
– Sets of documents form a Collection (similar idea to a file system folder)
– Queries use standard XPath (query engine built around Apache Xalan)
– Indexing system speeds up XPath query performance


Mapping ebXML RIM to Xindice
Main Concepts:

• All ebXML RIM components stored as separate XML documents
• In Xindice, the ebXML RegistryObject id is used as the document id
• Use Association to link two RegistryObjects, e.g.:

<rim:ObjectRef id="urn:uuid:b2345678-1234-1234-123456789077"/>
<rim:ObjectRef id="urn:uuid:c2345678-1234-1234-123456789012"/>
<!-- Association describes relationship between these two objects -->
<rim:Association associationType="Packages"
    sourceObject="urn:uuid:b2345678-1234-1234-123456789077"
    targetObject="urn:uuid:c2345678-1234-1234-123456789012"/>


Mapping ebXML RIM to Xindice

• All XML data is wrapped in a custom <RegistryData> wrapper.
• The <RegistryData> wrapper contains the namespace declarations:

<RegistryData xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0"
    xmlns:rim="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0">

• XPath queries include the namespace prefix, e.g. //rim:ExtrinsicObject
• All ebXML RIM components are stored in the same collection.


ebXML RIM and Dublin Core

Metadata mapped to ExtrinsicObject slots, e.g. creator='Arthur C. Clarke' maps to:

<Slot name="creator" slotType="dc-metadata">
  <ValueList>
    <Value>Arthur C. Clarke</Value>
  </ValueList>
</Slot>

Sub-set of ebXML RIM implemented in short-term – User, Slot, ExtrinsicObject, AuditableEvent, Association, ExternalLink
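The creator mapping above can be sketched in a few lines of Java. This is only an illustration of the idea, not the project's code: the class name DublinCoreSlots, the helper names and the second sample value are assumptions, and the Slot is rendered without a namespace prefix, as on the slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DublinCoreSlots {

    /** Escapes the characters that may not appear literally in XML text or attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Renders one Dublin Core name/value pair as a RIM-style Slot element. */
    static String toSlot(String name, String value) {
        return "<Slot name=\"" + escape(name) + "\" slotType=\"dc-metadata\">"
             + "<ValueList><Value>" + escape(value) + "</Value></ValueList>"
             + "</Slot>";
    }

    public static void main(String[] args) {
        // Illustrative metadata only; the date value is a made-up example.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("creator", "Arthur C. Clarke");
        metadata.put("date", "2002-05-23");
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            System.out.println(toSlot(entry.getKey(), entry.getValue()));
        }
    }
}
```

Each Dublin Core element becomes one Slot, so extending the metadata set only means adding entries to the map.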


Querying Xindice with XPath (1)
 

• W3C standard XPath
• Advanced path-like expressions, allowing node selection and filtering

Example:

<rim:ExtrinsicObject id="urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339">
  <rim:Name>
    <rim:LocalizedString value="ebXML RIM Schema metadata"/>
  </rim:Name>
  <rim:Description>
    <rim:LocalizedString value="metadata about ebXML RIM schema"/>
  </rim:Description>
  <!-- metadata here as slots -->
  <rim:Slot name="title" slotType="schema-metadata">
    <rim:ValueList>
      <rim:Value>ebXML RIM W3C Schema</rim:Value>
    </rim:ValueList>
  </rim:Slot>
  etc . . .


Querying Xindice with XPath (2)
• Select ExtrinsicObject with id 'urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339':

//rim:ExtrinsicObject[@id='urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339']

• Case sensitive
• Select ExtrinsicObject with a Slot whose name is 'title' and whose value list entry contains the word 'ebXML':

//rim:ExtrinsicObject[rim:Slot[@name='title']/rim:ValueList/rim:Value[contains(.,'ebXML')]]
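The two queries above are plain XPath 1.0, so they can be tried outside Xindice. The sketch below runs them with the XPath engine bundled in the JDK (javax.xml.xpath) against a cut-down copy of the example object inside a RegistryData wrapper. It does not use the Xindice API; the class name RimXPathSelect and its helpers are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RimXPathSelect {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";

    // Cut-down copy of the example ExtrinsicObject, wrapped in RegistryData.
    static final String SAMPLE =
          "<RegistryData xmlns='" + RIM_NS + "' xmlns:rim='" + RIM_NS + "'>"
        + "<rim:ExtrinsicObject id='urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339'>"
        + "<rim:Slot name='title' slotType='schema-metadata'>"
        + "<rim:ValueList><rim:Value>ebXML RIM W3C Schema</rim:Value></rim:ValueList>"
        + "</rim:Slot>"
        + "</rim:ExtrinsicObject>"
        + "</RegistryData>";

    static final String BY_ID =
        "//rim:ExtrinsicObject[@id='urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339']";
    static final String BY_TITLE =
        "//rim:ExtrinsicObject[rim:Slot[@name='title']/rim:ValueList/rim:Value[contains(.,'ebXML')]]";

    /** Binds the "rim" prefix used in the queries to the RIM namespace URI. */
    static final NamespaceContext RIM_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return RIM_NS.equals(uri) ? "rim" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return RIM_NS.equals(uri)
                ? Collections.singletonList("rim").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    /** Counts the nodes selected by an XPath 1.0 expression over the given XML text. */
    static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed queries to match
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(RIM_CONTEXT);
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("by id:    " + count(SAMPLE, BY_ID));
        System.out.println("by title: " + count(SAMPLE, BY_TITLE));
        // contains() is case sensitive, so a lower-case search term finds nothing.
        System.out.println("lower:    " + count(SAMPLE, BY_TITLE.replace("'ebXML'", "'ebxml'")));
    }
}
```

The prefix in the query only has to map to the same namespace URI as the stored documents; it does not have to match the prefix used in the XML itself.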


Querying Xindice with XPath (3) XPath JOIN

One Xindice collection can be queried as one large document. Example:

<rim:ExternalLink id="acmeLink2">
  <rim:Name>
    <rim:LocalizedString value="Link #2"/>
  </rim:Name>
  <rim:Description>
    <rim:LocalizedString value="ACME's Link #2"/>
  </rim:Description>
</rim:ExternalLink>
<rim:Association id="acmeLink2-alreadySubmittedCPP-Assoc"
    associationType="ExternallyLinks"
    sourceObject="acmeLink2"
    targetObject="urn:uuid:a2345678-1234-1234-123456789012"/>


Querying Xindice with XPath (4) XPath JOIN
XPath:
• Get the RegistryObject whose id equals the targetObject of the Association whose sourceObject is 'acmeLink2':

//*[@id=//rim:Association[@sourceObject='acmeLink2']/@targetObject]
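The JOIN works because XPath compares @id against a node-set: the predicate is true when any selected targetObject value equals the id. A minimal sketch with the JDK's javax.xml.xpath, treating one wrapper document as a stand-in for a Xindice collection, might look as follows. The class name RimXPathJoin, the two ExtrinsicObject entries and the 'urn:uuid:unrelated-object' id are hypothetical additions so that the join has something to find and something to skip.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RimXPathJoin {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";
    static final String TARGET_ID = "urn:uuid:a2345678-1234-1234-123456789012";

    // Stand-in for a collection: every registry object sits under one wrapper element.
    static final String COLLECTION =
          "<RegistryData xmlns='" + RIM_NS + "' xmlns:rim='" + RIM_NS + "'>"
        + "<rim:ExternalLink id='acmeLink2'/>"
        + "<rim:ExtrinsicObject id='" + TARGET_ID + "'/>"          // hypothetical target object
        + "<rim:ExtrinsicObject id='urn:uuid:unrelated-object'/>"  // hypothetical, must not match
        + "<rim:Association id='acmeLink2-alreadySubmittedCPP-Assoc'"
        + " associationType='ExternallyLinks' sourceObject='acmeLink2'"
        + " targetObject='" + TARGET_ID + "'/>"
        + "</RegistryData>";

    static final String JOIN =
        "//*[@id=//rim:Association[@sourceObject='acmeLink2']/@targetObject]";

    /** Binds the "rim" prefix used in the query to the RIM namespace URI. */
    static final NamespaceContext RIM_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return RIM_NS.equals(uri) ? "rim" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return RIM_NS.equals(uri)
                ? Collections.singletonList("rim").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    /** Returns the id attribute of every element selected by the expression. */
    static List<String> selectIds(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(RIM_CONTEXT);
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                // "//*" selects elements only, so the cast is safe here.
                ids.add(((Element) hits.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectIds(COLLECTION, JOIN));
    }
}
```

The inner path is evaluated from the document root for every candidate element, which is why the whole collection has to be visible to the query as one document.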


Querying Xindice with XPath (5) XPath JOIN
Example:

<ExtrinsicObject id="urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484"
    status="Submitted"
    xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0">
  …
</ExtrinsicObject>
<AuditableEvent id="urn:uuid:724719b2-6b4f-41ca-b910-af5219ebcdd9"
    objectType="AuditableEvent"
    eventType="Created"
    registryObject="urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484"
    timestamp="2002-05-15T11:38:56.980"
    user="urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7"
    xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0"/>


Querying Xindice with XPath (6) XPath JOIN
XPath:
• Get all RegistryObjects created by the user with id 'urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7':

//*[@id=//rim:AuditableEvent[@eventType='Created' and @user='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']/@registryObject]


XUpdate
• XML:DB initiative specification: http://www.xmldb.org/xupdate
• Batch modifications against an XML document set. Example:

<xupdate:update select="//rim:User[@id='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']/rim:EmailAddress/@address">peter.burgess@tietoenator.com</xupdate:update>
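To make the effect of the xupdate:update above concrete, the sketch below performs the equivalent change with the JDK's DOM and XPath APIs: select the address attribute and overwrite its value. It is not an XUpdate processor and does not use Xindice; the class name EmailUpdateSketch, the update helper and the starting address old@example.org are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EmailUpdateSketch {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";
    static final String USER_ID = "urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7";

    // Same select expression as in the xupdate:update example.
    static final String SELECT =
        "//rim:User[@id='" + USER_ID + "']/rim:EmailAddress/@address";

    // Hypothetical stored document; the old address is a placeholder.
    static final String BEFORE =
          "<RegistryData xmlns='" + RIM_NS + "' xmlns:rim='" + RIM_NS + "'>"
        + "<rim:User id='" + USER_ID + "'>"
        + "<rim:EmailAddress address='old@example.org'/>"
        + "</rim:User>"
        + "</RegistryData>";

    /** Binds the "rim" prefix used in the select expression to the RIM namespace URI. */
    static final NamespaceContext RIM_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return RIM_NS.equals(uri) ? "rim" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return RIM_NS.equals(uri)
                ? Collections.singletonList("rim").iterator()
                : Collections.<String>emptyIterator();
        }
    };

    /**
     * Overwrites every node selected by the expression with newValue, then
     * re-reads the first selected node from the document and returns its value
     * (an empty string when nothing was selected).
     */
    static String update(String xml, String select, String newValue) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(RIM_CONTEXT);
            NodeList targets = (NodeList) xpath.evaluate(select, doc, XPathConstants.NODESET);
            for (int i = 0; i < targets.getLength(); i++) {
                targets.item(i).setNodeValue(newValue); // attribute nodes accept setNodeValue
            }
            return xpath.evaluate(select, doc);
        } catch (Exception e) {
            throw new IllegalStateException("update failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(update(BEFORE, SELECT, "peter.burgess@tietoenator.com"));
    }
}
```

With XUpdate the same selection and replacement are expressed declaratively and applied by the database, so no document has to be fetched and rewritten by the client.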


Castor Java-XML Binding (1)
• Implements the majority of the W3C Schema recommendation (e.g. no Union)
• Unmarshal a java.io.Reader into a Java object:

StringReader stringReader = new StringReader(extObjXML);
ExtrinsicObject extObj = ExtrinsicObject.unmarshal(stringReader);

• Marshal a Java object to a java.io.Writer:

StringWriter stringWriter = new StringWriter();
extObj.marshal(stringWriter);
String extObjXML = stringWriter.toString();


Castor Java-XML Binding (2)
• Fast, reliable, performant
• Uses SAX
• High-level interface
• Manipulate XML document as Java object
• No need to walk the DOM tree or build custom SAX handlers


Lessons Learned
• Reuse of existing solutions saves much time in the long term.
• Access to all software sources was invaluable: we could make our own bug fixes on the spot.
• Open-source is a two-way street. Use others' solutions and also contribute your own.
• Solid architecture because we took time to carefully design the system (plus prototyping, learning new APIs).
• XML databases offer a very realistic solution for projects with XML data storage needs.
• XPath is very powerful: it is even possible to implement a JOIN in Xindice.


References
• ebxmlrr project: http://sourceforge.net/projects/ebxmlrr
• Apache Xindice: http://xml.apache.org/xindice
• XML:DB initiative: http://www.xmldb.org
• Castor Java-XML Binding: http://castor.exolab.org/
• IDA (European Commission): http://europa.eu.int/ISPO/ida
• TietoEnator: http://www.tietoenator.com
• Peter Burgess: peter.burgess@tietoenator.com
