Corey Chong1, Angela Goh1 and Puay Siew Tan2

In a supply chain environment, interoperability has been a longstanding issue for application
developers and e-Business supply chain collaborators. A major problem lies with semantic
differences in business terms used by e-business transactions. This paper focuses on how
interoperability is enhanced through the use of ontologies. Leveraging existing efforts to
standardize supply chain transactions (specifically, RosettaNet), an experimental ontology is built
using W3C’s semantic language OWL. To demonstrate the use of the ontology, a schema
matching system was designed that derives relationships between business terms using WordNet
and their context structure in the schema.

Keywords: web services, ontology, RosettaNet, interoperability, schema matching

1. Introduction

In the supply chain domain, one of the main challenges for developers is interoperability between
applications used in facilitating daily business processes. Traditionally, supply chain business is
carried out using snail mail, telephones and facsimile systems to exchange information and process
transactions. Early attempts to automate supply chain collaboration include Electronic Data
Interchange (EDI) [18] and information hubs. It is reported in [8] that “inflexibility of EDI in
representing business processes made it limited to the largest 20% of trading partners”. In order to
achieve full automation in the supply chain domain, collaborating partners must agree to a standard
protocol to exchange information and execute business transactions. Standardization efforts such as
RosettaNet [15] and ebXML [4] emerged as a result. The RosettaNet consortium was formed to
tackle the longstanding issue of interoperability between supply chain partners. RosettaNet’s
mission is to drive collaborative development and rapid deployment of e-business standards and
services, creating a common language and open processes that provide measurable business
benefits for global trading networks. One difficulty in implementing RosettaNet standards is the
complexity involved in understanding, developing and testing the RosettaNet Interface Framework
(RNIF) and Partner Interface Processes (PIPs) [15]. To date, there are only about 500 large
corporations (e.g. Fujitsu, Microsoft, IBM, etc.) that are RosettaNet compliant partners. In an
automated environment, RosettaNet uses servers to exchange information over the Internet. XML
[22] functions as the alphabet, and electronic commerce applications serve as the vehicle through
which e-business processes are transmitted. The lack of agreement on the words, grammar and
dialog that constitute e-business processes illustrates the need for standards. RosettaNet
dictionaries provide the words, the RosettaNet Implementation Framework (RNIF) acts as the
grammar (predefined protocol) and RosettaNet Partner Interface Processes (PIPs) form the dialog.
The collaborative decision support solutions (DSS) used by trading parties will need to be aligned
with their business protocol. RosettaNet Partner companies realize that consortium PIPs only
specify the process at the point of interface, and the true value lies in aligning internal decision
systems with the PIP specifications.

1 School of Computer Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798
2 Web Services Programme, Singapore Institute of Manufacturing Technology (SIMTech), 71 Nanyang Drive, Singapore 638075

In this paper, we attempt to embed business concepts and terms in an ontology [5]. OWL [10] is the
Web Ontology Language recommended by W3C [19]. With OWL constructs such as
Class, subClassOf, equivalentClass, Datatype Properties, Object Properties, and a host of other
features, business terms and their relationships can be captured. These OWL-stored concepts can be
manipulated, matched, reasoned over and queried according to different business needs and
application requirements. Thus, the ontology will reduce the limitations RosettaNet’s PIPs face in
representing business concepts in XML (both the domain structure and the document structure of the
data). The RosettaNet consortium is aware of these limitations and plans to create
machine-readable schemas in its current work plans. Representing concepts in a semantic language allows
knowledge sharing and extension of concepts across several ontologies, and provides mapping
capability for overlapping concepts in distinct ontologies. Moreover, semantically difficult queries
can be answered via inference or aggregation through the ontology.

Hence, the motivation of this paper is to address interoperability problems between supply chain
partners that arise from the different terminology used in e-Business transactions. Enterprises
commonly have their own set of business terms, terminology and meanings based on local context.
The main issue being addressed is the construction of ontologies from PIPs. To demonstrate the use of
the ontology, a schema matching application has been designed and tested. The system is Web
Services based and is capable of matching inbound business schemas with ontologies stored in local

The following section introduces RosettaNet, the basis upon which the ontology is created. Section
3 describes the approach taken to build the ontology. A scenario is presented in section 4, which
illustrates the use of the ontology. Section 5 briefly describes the schema matching algorithm used,
followed by test results and conclusions in sections 6 and 7 respectively.

2. RosettaNet and Motivation for Building an Ontology

RosettaNet’s PIP® Specification Package presents concepts and knowledge in three interdependent
forms: RosettaNet PIP Message Guidelines, RosettaNet XML Message Schema in Document Type
Definition (DTD) format [21] and the RosettaNet Implementation Framework (RNIF) guideline. The
Specifications provide the business performance controls (also known as the choreography of the
exchange) and define the purpose of the business process and the roles that participate in the
process. The Message Guidelines define the cardinality, vocabulary, structure, and allowable data
element values and value types for each message exchanged during the execution of a PIP. The
DTD provides the order or sequence of the elements, element naming, composition, and attributes.
To implement a RosettaNet business exchange, all of the above must be adhered to. However,
these monolithic PIPs have limitations. The Message Guidelines define the RosettaNet
Message structure using a hierarchical or “tree” presentation, and the DTD is based on information
from the Message Guidelines. Due to limitations of DTD, point-to-point consistency cannot be
captured by the DTD alone. For example, if an element is used twice within the Message
Guideline with different sub-element cardinalities, the DTD cannot express this constraint.
Therefore, the DTD will present the less restrictive cardinality to support both occurrences. The
business knowledge embedded in the PIPs’ Message Guidelines and XML Message Schemas will
be leveraged. This data will be stored within a single OWL ontology, depending on the
functionality of the PIP. For example, the Request Quote PIP has two data files in its Specification
Package: a RosettaNet XML Message Guidelines HTML file and a Message Schema DTD file,
depicting the usage/definition and the document structure of the business terms respectively.
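
The cardinality limitation described above can be made concrete with a small sketch. The Python below is illustrative only: the (min, max) tuples, the function names and the sample cardinalities are assumptions made for this example and are not taken from any RosettaNet PIP. It merges two context-specific cardinalities of the same element into the single, less restrictive declaration that a DTD would have to use.

```python
from typing import Optional, Tuple

# (min_occurs, max_occurs); max_occurs=None means unbounded.
Cardinality = Tuple[int, Optional[int]]


def least_restrictive(a: Cardinality, b: Cardinality) -> Cardinality:
    """Merge two cardinalities into one that admits both usages."""
    lo = min(a[0], b[0])
    hi = None if a[1] is None or b[1] is None else max(a[1], b[1])
    return (lo, hi)


def dtd_indicator(card: Cardinality) -> str:
    """Closest DTD occurrence indicator: '' (exactly one), '?', '*' or '+'."""
    lo, hi = card
    if lo == 0:
        return "?" if hi == 1 else "*"
    return "" if hi == 1 else "+"


# Hypothetical usages of one sub-element in two places of a Message Guideline.
in_context_a: Cardinality = (1, 1)      # mandatory, single occurrence
in_context_b: Cardinality = (0, None)   # optional, repeatable

merged = least_restrictive(in_context_a, in_context_b)
print(merged, repr(dtd_indicator(merged)))  # (0, None) '*'
```

The stricter constraint of the first context (exactly one occurrence) is lost in the merged declaration, which is the kind of information an ontology can keep per context.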

An ontology stores RosettaNet’s information as a compact knowledge base in the form of a single
file, complete with all descriptive text, cardinalities and the elements’ hierarchical information. For
example, Entity Instances provide a list of possible values and their descriptions for a specific
business term. An information system administrator will usually insert these values into the
backend database system. The values can be displayed to users when completing an online form.
For example, if the term in question is Global Country Code, a list of codes recognized globally
will be displayed along with their descriptions. Entity information is extracted from RosettaNet’s
PurchaseOrder-Notification Message Guideline and inserted into the OWL ontology. OWL allows a
Class element to have direct instances, each of which can in turn carry a comment describing the
instance.
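
As a minimal sketch of how entity instances and their descriptions might be stored, the Python below builds a small RDF/XML fragment with the standard library. The namespace http://example.org/rosettanet#, the identifier scheme and the two country codes are assumptions chosen for illustration; this is not the output of the extraction module described in this paper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/rosettanet#"  # hypothetical namespace

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def class_with_instances(class_name, definition, instances):
    """Return RDF/XML for one owl:Class plus one individual per entity instance."""
    root = ET.Element(f"{{{RDF}}}RDF")
    cls = ET.SubElement(root, f"{{{OWL}}}Class",
                        {f"{{{RDF}}}about": BASE + class_name})
    ET.SubElement(cls, f"{{{RDFS}}}comment").text = definition
    for code, description in instances.items():
        # Each instance is typed with the class and carries its own comment.
        ind = ET.SubElement(root, f"{{{RDF}}}Description",
                            {f"{{{RDF}}}about": f"{BASE}{class_name}_{code}"})
        ET.SubElement(ind, f"{{{RDF}}}type",
                      {f"{{{RDF}}}resource": BASE + class_name})
        ET.SubElement(ind, f"{{{RDFS}}}comment").text = description
    return ET.tostring(root, encoding="unicode")


# Illustrative values only, not copied from a Message Guideline.
owl_xml = class_with_instances(
    "GlobalCountryCode",
    "Code identifying a country.",
    {"SG": "Singapore", "US": "United States"},
)
print(owl_xml)
```

Keeping the description on the instance itself means an application can show the code and its meaning together without a separate lookup table.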

Other information stored in the Entity Instances section requires domain knowledge in order to
complete a RosettaNet-compliant transaction. To automate the process, it is
necessary for the server to provide all possible instances. This prevents errors and misunderstandings
from occurring, leading to greater supply chain collaboration. If a
transaction has compulsory fields left uncompleted, the system can therefore extract
the RosettaNet instances from the ontology and display them as choices in the web application
interface for the client to choose from.

3. Building the Ontology

The RosettaNet PIPs Extraction module is used to extract data from PIPs and to
insert them into a preliminary OWL file. The extraction module gathers information from
RosettaNet files, sorts the data and stores them in memory. Thereafter, these in-memory data are
processed and inserted in the correct order into an OWLWriter. The flowchart in Figure 1 depicts
the program flow of this extraction module. The preliminary OWL file requires trimming and some
minor modifications to convert it into a well-formed expression. This is because concepts that are
meant to be expressed as OWL Object or Datatype Properties require human interpretation and
cannot be handled satisfactorily by the system alone. A richer expression and definition of a concept
term can be provided through manual modification. Examples of modifications which cannot be
achieved by automated means include changing Classes into Properties and mapping
concepts such as FreeFormText to the XML Schema string datatype. The Protégé
OWL Editor [13], with its graphical user interface, is used to aid in this manual task.

The methodology for creating RosettaNet’s OWL ontology from its PIPs requires the use of XML
Parsers / Converters and OWL Parsers / Reasoners. First, the pre-processing procedure converts
RosettaNet DTD files into XML Schemas. This facilitates the generation of a tree structure,
which in turn is accessed to construct the ontology. Although there are open-source tools
(OWL-API [11] and Jena 2.1 [7]), their functionalities are limited and they lack the ability to create a
new ontology and insert data into it.

[Figure 1 diagram, "Creating Ontology from RosettaNet PIP Package": the PIP Package (XML DTD file and Message Guidelines) is parsed by the Extractors into an in-memory representation (business properties, fundamental business data entities), which the OWL Writer emits through Add Class, Add SubClassOf, Add Cardinality, Add Instance and Add Comments.]

Figure 1. Phases in Ontology Construction

The module therefore produces a ‘raw’ ontology file, built by writing the large amount of PIP
data into the OWL file without considering tricky semantic language issues. These issues are
tackled manually. We transfer the in-memory data obtained from preprocessing into a
sequential writer that prints data according to OWL’s definition syntax. Upon obtaining the
untrimmed version of the RosettaNet ontology, we proceed with manual modification to obtain a
well-formed ontology. This manual process is carried out using the Protégé-2000 OWL ontology editor,
as shown in Figure 2. The Asserted Hierarchy on the left of the GUI displays the
Class Hierarchy structure of RosettaNet business concepts, with each Class element marked by an
icon. All Classes are subclasses of the OWL Class ‘Thing’, and the hierarchy structure depicts the
class-subclass relationship between concepts. The Properties Frame in the upper right of the GUI
contains information regarding the Class element’s Object Properties and Datatype Properties,
and ‘rdfs:comment’ (lower right of the GUI) contains the definitions of the Class elements obtained
from the extraction of RosettaNet files. It is through this GUI that the ontology is modified
to include Datatype Properties and Object Properties linking to their domain terms. Clearly, this
process is difficult to automate fully. We are, however, able to eliminate the laborious
task involved in constructing an ontology by inserting all the RosettaNet Class element names,
comments, data types, and instances using the OWLWriter. Conversion of business terms to
Datatype and Object Properties is then done manually.
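
The sequential writing step can be sketched as follows. This is not the OWLWriter used in this work: the nested-dictionary tree, the function name write_classes and the sample terms and comments are assumptions for illustration, and cardinalities and instances are omitted for brevity. The sketch walks an in-memory tree depth-first and prints one owl:Class per term, linked to its parent by rdfs:subClassOf.

```python
from xml.sax.saxutils import escape, quoteattr


def write_classes(tree, parent=None, out=None):
    """Depth-first walk emitting RDF/XML lines for each term in the tree."""
    out = [] if out is None else out
    for name, node in tree.items():
        out.append(f"<owl:Class rdf:ID={quoteattr(name)}>")
        if parent is not None:
            out.append(f"  <rdfs:subClassOf rdf:resource={quoteattr('#' + parent)}/>")
        if node.get("comment"):
            out.append(f"  <rdfs:comment>{escape(node['comment'])}</rdfs:comment>")
        out.append("</owl:Class>")
        write_classes(node.get("children", {}), name, out)
    return out


# Hypothetical in-memory tree; names and comments are illustrative.
pip_tree = {
    "PurchaseOrder": {
        "comment": "Top-level order concept.",
        "children": {
            "OrderQuantity": {"comment": "Quantity ordered.", "children": {}},
        },
    },
}

lines = write_classes(pip_tree)
print("\n".join(lines))
```

The output is only a fragment (the enclosing rdf:RDF element and namespace declarations are left out), and every term is written as a Class, which mirrors why the raw file still needs the manual conversion of some terms into properties.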

4. Scenario

There are more than 200 RosettaNet PIP Specifications in existence, with new ones still
being created. Therefore, as a test case, we have selected Purchase Order and Delivery Order
(equivalent to Shipping Order in the RosettaNet context) transactions in a simulated environment.
This smaller scope allows us to analyze the accuracy of the matching system using
RosettaNet PIPs as our designated source schema.

Figure 2. Editing the Ontology

The scenario described in Figure 3 is used to illustrate the system that has been developed. It
comprises an electronic communication between a Multi-National Corporation (MNC) that is a
RosettaNet compliant partner and a Small-Medium Enterprise (SME). The scenario starts with (1)
the MNC issuing a Purchase Order (PO) request. The RosettaNet Server intercepts this request on
behalf of the SME and matches the PO request with stored RosettaNet ontologies. It then (2) sends
a query to the SME that includes information that the SME administrator must provide (3) so that
the RosettaNet Server is able to extract a subset correctly. Thereafter, a modified PO (4) is sent
to the SME. A Delivery Order (DO) is assumed to be issued by the SME and directed to an
appropriate Delivery Party, also through the RosettaNet Server, though this is not shown in Figure 3. The PIP
3A13: Notify of Purchase Order Information [16] and PIP 3B11: Notify of Shipping Order [17]
specifications from RosettaNet are used, and are converted into W3C’s OWL ontologies to employ
machine translation for the business exchanges.

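[Figure 3 diagram: the Multi-National Corporation sends the original PO (1) to the RosettaNet Server, which performs the mapping, sends a Query (2) to the Small Medium Enterprise, receives its Response (3), and forwards the PO (4) to the Small Medium Enterprise.]

Figure 3. Scenario in the supply chain environment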

Figure 4 illustrates the flow of messages in a simulated scenario involving the Purchase Order
business process. A customer (whom we assume is an MNC) issues a RosettaNet compliant
Purchase Order to a designated supplier (an SME). Once the supplier is able to fulfill the order, a
Delivery Order is issued to a designated Delivery Party for transfer of goods to the customer. The
gateway for translating business schemas is a RosettaNet Server that utilizes the Java API for XML
Messaging (JAXM) with a schema matcher module. In this scenario, we assume that MNCs use
RosettaNet compliant systems, while SMEs have their own set of business schemas and business
taxonomies. The RosettaNet Server acts as an intermediary, translating and mapping business
terms to suit each individual party.

Figure 4. Scenario Implementation Diagram

5. Schema Matching

The scenario described in section 4 is a vehicle to illustrate the use of ontologies in a supply-chain
environment. The ontology, which is based on the RosettaNet PIPs, is compared with incoming
non-RosettaNet schemas. This is done by computing their semantic distances in terms of their
schema structure and taxonomy similarities. These generated distances are used to produce a set of
classification rules based on decision tree induction. The schema matcher process uses two
approaches, namely direct and indirect methods, which are adapted from Jackman [6] and Xu
and Embley [23]. Figure 5 shows the components of the system and their relationships.

5.1 Direct Method

The direct matcher uses the structure of the word or sentence, looking for similarity in strings. It is
used in two ways. Firstly, direct matching is adopted when the RosettaNet Server accesses and
retrieves the SOAP header upon receipt of a SOAP message from the client. The header
contains information regarding the content of its message. For example, if the header contains
‘PurchaseOrder’, the system will search in a predefined ontology directory for the stored OWL files
and retrieve ‘PurchaseOrderInformation-Notification.owl’, which is the closest match among
the files in that directory. Secondly, direct matching is used when simple straightforward
matching is possible, for example when comparing term pairs’ names with the word nouns stored in
the Matching Table, which is described in further detail below.

Figure 5. Components of the Schema Matching System

Table 1 shows the heuristic values used; the Direct Threshold is set at 0.5 (determined
empirically). In File Matching, if the file retrieved has the OWL file extension (*.owl) and the final
confidence score exceeds the threshold, the target ontology file is retrieved and loaded. For
example, ‘PurchaseOrderNotification’ matched against ‘PurchaseOrder’ will generate a value of
0.6. This value takes into account the hit ratio of alphanumeric characters, which is 0.4 in this
case (‘purchase’ and ‘order’ each contribute 0.2), and the confidence increases by 0.2 when they
appear in the same sequential order in both sources. Since the score obtained is greater than
Direct Threshold 1, this file is chosen.
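
The worked example above can be reproduced with a short sketch. The code is a simplified reading of the scoring, not the system's implementation: it assumes that terms are split into words on CamelCase boundaries, that each shared word adds the increment of 0.20, and that the sequential bonus of 0.20 applies when at least two shared words keep their order. The function names are hypothetical.

```python
import re

INCREMENT = 0.20         # confidence added per matched word (Table 1)
SEQUENTIAL = 0.20        # bonus when matched words keep their order (Table 1)
DIRECT_THRESHOLD = 0.50  # empirically determined threshold (Table 1)


def split_words(term):
    """Split a CamelCase or hyphenated term into lowercase words."""
    pattern = r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+"
    return [w.lower() for w in re.findall(pattern, term)]


def direct_score(query, candidate):
    """Score how well the words of `query` are found in `candidate`."""
    q, c = split_words(query), split_words(candidate)
    hits = [w for w in q if w in c]
    score = INCREMENT * len(hits)
    positions = [c.index(w) for w in hits]
    # Assumption: order is only meaningful with two or more matched words.
    if len(hits) > 1 and positions == sorted(positions):
        score += SEQUENTIAL
    return round(score, 2)


score = direct_score("PurchaseOrder", "PurchaseOrderNotification")
print(score, score > DIRECT_THRESHOLD)  # 0.6 True
```

Under these assumptions, the same two words in reversed order would score only 0.4 and fall below the threshold.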

Table 1. Thresholds for Direct Matching

                Direct        Direct        Increment    Sequential
                Threshold 1   Threshold 2   Confidence   Value        Remarks
File Matching   0.50          NA            +0.20        +0.20        Used in comparing ontology files with SOAP header to pinpoint current business transaction.
Term Matching   NA            0.50          +0.20        +0.20        Used in simple straightforward matching of word terms. (i.e. match table names with ontology’s Class

Direct matching is generally used with WordNet [9] for operations involving retrieval of word nouns
from the database. A database table holds individual noun pairs, with each pair given a WordNet
score. Therefore, in order to compute the average WordNet score for a business term like
“OrderQuantity”, the system locates the entries for the corresponding words (i.e. ‘order’ and
‘quantity’) in the table ‘matching’ that contain each noun. Direct matching is also used as follows.
Suppose the ‘action-quantity’ term pair has a positive WordNet score and ‘quantity’ comes from a
RosettaNet data field. All the OntClasses that have names containing ‘quantity’ are then retrieved.
One such OntClass is ‘OrderQuantity’ in RosettaNet, and this will be involved in Context
matching. Clearly, these two usages require only straightforward character matching, so Direct
Matching is used. More details on the WordNet and Context methods are given in the next section.

5.2 Indirect Method

Under indirect matching, two confidence scores for source to target element matching are
computed, namely the WordNet score and the Context score. The first method provides estimated
mappings while the second method confirms the final source to target mapping. RosettaNet is the
source schema, and the target schema is a sample representing organizations that are not RosettaNet
compliant. The former will be referred to as source object sets (SOS), while the latter is termed
target object sets (TOS).

The WordNet Method computes confidence scores for terms used in SOS and TOS based on their
hypernym hierarchy. The hypernym hierarchy contains concepts that define the more general
classes of entities of the original term. Word sense defines the various meanings that a term can
have in the English language. For example, the word ‘company’ has meanings like ‘an institution’,
which is an organization founded and united for a specific purpose, or in a more general sense, ‘an
organization’, which is a group of people who work together. Each distinct word sense has its own
hypernym hierarchy in WordNet. The system has adopted metrics from Jackman [6], including
NumberOfRootTerms, XYSenseCount, MinSenseCount, MaxSenseCount, etc. More details on
WordNet can be found in [2].

In the Structural Context method, a TOS and a SOS match only if the values of their adjacent object
sets around the object schema element are similar. With reference to Figure 6, TOS refers to the
‘Supplier’ business term and Adjacent TOS refers to the subclasses under it that further define the
business concept. This similarity is measured by computing metrics on their relationship. Further
details can be found in [6].
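
The idea of comparing adjacent object sets can be illustrated with a simple stand-in. The metrics of [6] are not reproduced here; the sketch below swaps in a plain Jaccard overlap of the child names, and the function name and the sample child names are assumptions made for illustration.

```python
def context_score(source_adjacent, target_adjacent):
    """Jaccard overlap of adjacent object set names (a stand-in metric)."""
    s = {name.lower() for name in source_adjacent}
    t = {name.lower() for name in target_adjacent}
    if not s or not t:
        # No adjacent object sets on one side: nothing to compare.
        return 0.0
    return len(s & t) / len(s | t)


# Hypothetical adjacent object sets around a 'Supplier' element.
sos_children = ["Name", "Address", "ContactInformation"]
tos_children = ["name", "address", "Phone"]

print(context_score(sos_children, tos_children))  # 0.5
print(context_score(sos_children, []))            # 0.0
```

Any measure of this kind returns zero for an element without children, and it gives identical scores to elements whose children are the same, whatever the elements themselves are called.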

Figure 6. Object Set Diagram

6. Results and Discussion

To investigate the effectiveness of the system developed, testing was conducted and the results
were assessed using the quality measures, Recall and Precision [3]. The WordNet method requires
the computation of a confidence score from the metrics mentioned in section 5. In order to build
decision trees, we have to provide training data from RosettaNet. Testing was carried out using the
following four XML Schemas:

papiNet [12]: papiNet, the Global Transaction Standard for the Paper and Forest Supply Chain
standard provides a small XML Schema to support business transactions in this specific vertical
XML Industry Project [20]: The XML Working Group under the National IT Standards
Committee proposed XIP to aid companies, especially Small and Medium Enterprises (SMEs),
in revamping their business processes with XML technology. The aim of the project is to encourage
SMEs to employ XML technology and benefit from doing so.
Quote Messaging Standard (QMS) [14]: The QMS Quote is a large schema used by the automotive
industry for quotation and invoicing purposes.
BizTalk [1]: The business schema taken from Microsoft’s BizTalk server site is another example of
attempts at standardizing information exchange.

Results are obtained with the assumption that a human expert has classified the training file. Given
the number of direct and indirect matches N determined by a human expert, the number of correct
direct and indirect matches C and the number of incorrect matches I, the metrics are computed as
follows: recall ratio R = C/N and precision ratio P = C/(C + I).
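
As a quick check of the two ratios, the sketch below computes them for one set of counts. The counts are hypothetical and chosen only to show the arithmetic; they are not measurements from the tests reported here.

```python
def recall_precision(n_expert, correct, incorrect):
    """Return (R, P) with R = C/N and P = C/(C + I)."""
    if n_expert <= 0 or correct + incorrect <= 0:
        raise ValueError("N and C + I must be positive")
    return correct / n_expert, correct / (correct + incorrect)


# Hypothetical counts: N = 10 expert matches, C = 7 correct, I = 7 incorrect.
recall, precision = recall_precision(10, 7, 7)
print(recall, precision)  # 0.7 0.5
```

Recall depends only on how many of the expert's matches were found, while precision drops as the matcher proposes more incorrect mappings.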









[Figure 7 chart: results for the four test schemas papiNet, XIP, QMS and BizTalk]

Figure 7. Performance results

As seen in Figure 7, the results range from moderate (70% recall and 50% precision) to poor. This
is mainly due to a large number of incorrect mappings. It should be noted that mapping decisions
depend largely on the type of schemas used.

As seen from the testing results, match performance is better when the two schemas being
compared are of similar size. This may be because schemas of similar size tend to have
similar tree structures and nesting of leaf nodes. Thus, the results are better when matching the two
large schemas of RosettaNet and QMS Quote. However, performance deteriorates when matching
schemas of different sizes. This may be due to the huge number of synonyms being generated for
the source elements. Furthermore, the structures of small and large schemas differ, resulting in low
scores for structural matching.

Another problem is that the system does not take into consideration mappings for leaf node elements.
Class elements such as Country, Street and City, which are leaf node sub-Classes of the Address Class,
naturally map to the source schema’s PhysicalLocation sub-Classes, but such mapping decisions
are not generated in the results. The system is also unable to differentiate between mappings of the shipTo,
BillTo and SoldBy Classes, which all have PartnerDescription as their only sub-Class. This is
related to the earlier point regarding schemas of different structures.

The limitation within context matching arises when there is a lack of ‘child nodes’ (adjacent object
sets) to do comparison. Clearly, this will generate a zero Context score for the object sets

In the supply chain context, acronyms and abbreviations are widely used in business
schemas. For example, UOM, which stands for unit of measure, can be found widely in BizTalk’s
PO schema. The use of acronyms makes machine matching difficult and nullifies the results of the
WordNet and Context matching methods. To solve this problem, a data dictionary containing
domain acronyms could be used during machine translation.
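
A minimal sketch of such a dictionary lookup is given below. Only the UOM entry comes from the text; the field name ItemUOM, the function name and the tokenizing rule are assumptions made for illustration. The idea is to expand acronyms into ordinary words before the terms are passed to the matching methods.

```python
import re

# UOM is taken from the text; further domain entries would be added here.
ACRONYMS = {"UOM": "unit of measure"}


def expand_acronyms(term, dictionary=ACRONYMS):
    """Split a schema term into words, expanding known acronyms."""
    tokens = re.findall(r"[A-Z]{2,}(?![a-z])|[A-Z]?[a-z]+|\d+", term)
    words = []
    for token in tokens:
        if token.upper() in dictionary:
            words.extend(dictionary[token.upper()].split())
        else:
            words.append(token.lower())
    return words


print(expand_acronyms("ItemUOM"))        # ['item', 'unit', 'of', 'measure']
print(expand_acronyms("OrderQuantity"))  # ['order', 'quantity']
```

After expansion, words such as ‘unit’ and ‘measure’ can be looked up like any other noun, whereas the raw token ‘UOM’ would have no entry.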

7. Conclusion

An OWL ontology based on RosettaNet Notify of Purchase Order Information (3A13) PIP® has
been built successfully. The method and tools used can be readily applied to any other PIPs to
create ontologies in other business areas. When a new XML schema is received by an organisation,
the availability of an ontology allows easy extraction of Class element information and parsing of
the ontology. A demonstration of the use of the created ontology was given. This involved schema
matching between two organisations in a supply chain scenario. Future work includes the creation
of ontologies based on other XML standards in various domains.

[1] BizTalk

[2] DIDION, J. and BARTON, G. Java WordNet Library API.

[3] DO, H.H. and RAHM, E. COMA – A System for Flexible Combination of Schema Matching Approach. 28th
International Conference on Very Large Data Bases, Hong Kong, 2002.

[4] ebXML Specifications. OASIS Consortium.

making of a Web Ontology Language. 13th International WWW Conference, New York, USA, 2004

[6] JACKMAN, D. Mapping Target Schemas to Source Schemas Using WordNet Hierarchies and Structure
Context. Department of Computer Science, Brigham Young University. 2002

[7] JENA Version 2.1: Java Framework for Building Semantic Web Applications. HP Labs Semantic Web

[8] KAK, R. and SOTERO, D. Implementing RosettaNet E-Business Standards for Greater Supply Chain
Collaboration and Efficiency, 2002.

[9] MILLER, G. WordNet: a lexical database for English. Communications of the ACM 38 (11), 1995

[10] OWL Web Ontology Language

[11] OWL-API: High-level view of an OWL ontology based on the OWL Abstract Syntax

[12] papiNet. Global Transaction Standard for the Paper and Forest Supply Chain standard.

[13] Protégé OWL GUI Editor.

[14] QMS Quote Messaging Standard

[15] RosettaNet

[16] RosettaNet PIP 3A13: Notify of Purchase Order Information

[17] RosettaNet PIP 3B11: Notify of Shipping Order

[18] SOKOL, P.K. From EDI to Electronic Commerce: A Business Initiative, 2nd Ed, Mcgraw-Hill, 1995.

[19] W3C World Wide Web Consortium

[20] XIP ITSC-XML Working Group’s XML Industrial Project.

[21] XML DTD Specification

[22] XML Extensible Markup Language

[23] XU, L. and EMBLEY, D. W. Discovering Direct and Indirect Matches for Schema Elements. 8th International
Conference on Database Systems for Advanced Applications (DASFAA'03), Kyoto, Japan, 2003.