
2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Semantic Grid for Chemical Domain Knowledge Service

Xuelin Shi
School of Information Science and Technology
Beijing University of Chemical Technology
Beijing, 100029, China
e-mail: shixl@mail.buct.edu.cn

Ying Zhao
School of Information Science and Technology
Beijing University of Chemical Technology
Beijing, 100029, China
e-mail: zhaoy@mail.buct.edu.cn

Abstract-Chemical Grid is a collaborative grid platform based on CGSP (China Grid Support Platform) that provides a grid computing environment for various chemical applications. On this platform we constructed a knowledge service architecture using Semantic Web technology, in particular RDF (Resource Description Framework), to integrate heterogeneous chemical information, generate domain knowledge and provide knowledge services for researchers. Furthermore, we designed a Chemical Knowledge Representation Model consisting of a stack of languages, and proposed an RDF/S-Graph model to represent the semantic properties and relationships of these data. All extracted chemical domain knowledge, i.e. the RDF data, together with the service mechanism composes a Semantic Grid, which is a practical experience on a grid computing platform and an innovative effort toward the Semantic Web.

Keywords-Grid Computing; Semantic Web; Chemical; Knowledge service; RDF

I. INTRODUCTION

The objective of Grid computing is to bring a variety of computational and data resources together to create new capabilities. It now represents a new trend in distributed computing and Internet applications [1]. In a grid environment, the question of how to integrate large-scale heterogeneous resources and provide knowledge services to users is attracting more and more attention. In the past few years there have been some efforts to provide the knowledge and semantic framework to enable the automated management and sharing of complex resources using Semantic Web technology.

The Semantic Web was proposed, with the rapid growth of the Web, in order to provide enhanced information services by applying machine-processable meta-information [2]. It is the idea of having data on the Web defined and linked in a way that it can be used for more effective discovery, automation, integration and reuse across various applications. The Web can reach its full potential if it becomes a place where data can be processed by automated tools as well as people. Based on grid computing and the Semantic Web, the Semantic Grid [3, 4, 5, 6] was proposed as an infrastructure where complex applications and services can be deployed with minimal manual intervention. It is characterized by an open system, with a high degree of automation, which supports flexible collaboration and computation on a global scale. Fig. 1 shows the progress of the Semantic Web.

[Figure 1: diagram placing the Semantic Web and the Semantic Grid along a horizontal "Computing Performance" axis; the vertical axis label is not recoverable.]

Figure 1. The progress of Semantic Grid.

This paper reports on our experiences in building a Semantic Grid infrastructure, the Chemical Knowledge Service Semantic Grid (CKSSG), for chemical research as part of the Chemical Grid Project of Beijing University of Chemical Technology (BUCT). Chemical Grid is a grid computing platform based on the China Grid Support Platform (CGSP) [7]. It integrates most of the chemical research resources of BUCT and provides a highly collaborative grid computing platform for chemical applications. Our work is one application on ChinaGrid. Based on Chemical Grid, the CKSSG aims in particular to provide semantic retrieval services for chemical domain knowledge. It is concerned with the way that knowledge is acquired, used, retrieved, published and maintained to assist chemical researchers in achieving their particular goals and objectives. To implement this function, we designed a knowledge service architecture that is easy to deploy on the grid platform. We then propose a chemical domain knowledge representation model (CKRM), the key of which is the RDF/S-Graph.

This paper is organized as follows. Section 2 introduces the background of Chemical Grid and the knowledge service architecture of CKSSG. In Section 3, we describe the representation model for chemical domain knowledge based on a stack of languages, and in particular introduce the RDF/S-Graph. The implementation of the CKSSG and its application scenario are given in Section 4. Finally, concluding remarks and future work are described in Section 5.

978-1-4244-7237-6/10/$26.00 ©2010 IEEE

II. KNOWLEDGE SERVICE ARCHITECTURE OF CKSSG

In this section, we introduce the background of the Chemical Grid platform and present the knowledge service architecture of CKSSG.

A. Chemical grid

Chemical Grid aims to provide a unified and collaborative environment to access different chemical applications across different physical domains and security firewalls. It is a collaborative computing framework based on CGSP. CGSP is a grid middleware developed to build the ChinaGrid, which integrates all kinds of resources in education and research environments, makes the heterogeneous and dynamic nature of resources transparent to the users, and provides high-performance, highly reliable, secure, convenient and transparent grid services for scientific computing and engineering research. CGSP is based on the core of the Globus Toolkit.

Fig. 2 gives the framework of Chemical Grid [8]:

[Figure 2: layer diagram, from top to bottom: Domain Specific Service Layer (Chemical Engineering Applications); Collaboration Support Service and Resource Sharing Service; CGSP; Globus Toolkit.]

Figure 2. Framework of Chemical Grid

The framework consists of 3 tiers: domain specific service layer, common service layer and grid middleware layer. We used CGSP as the lowest layer, the grid middleware. Above the 3 layers there is the portal, a web interface that is the entrance for the end user to use grid services.

The first layer is CGSP acting as the grid middleware layer, which implements the grid characteristic of collaborative resource sharing. The second layer is the common service layer, which includes the collaboration support service and the resource sharing service. This layer builds on the grid middleware layer to provide a collaborative chemical grid computing environment. We used some professional software to construct the second layer. The third is the application layer, which is oriented to a specific application domain. Many chemical engineering applications are deployed on this layer.

B. Knowledge service architecture

Based on Chemical Grid, we construct the knowledge service architecture CKSSG for chemical researchers. CKSSG is designed to improve current information service systems and help users get what they want by providing semantic search and knowledge classification. Fig. 3 shows the architecture of CKSSG.

[Figure 3: layer diagram, from top to bottom: Knowledge Services; Meta-knowledge; RDF Data Repository; RDF Modeling; Extraction; semi-structured Data Source (WWW).]

Figure 3. CKSSG architecture

The architecture of CKSSG consists of three layers: extraction layer, knowledge layer and knowledge services layer, which implement the functions of all the key stages of the knowledge management lifecycle, i.e. knowledge acquisition, knowledge representation and knowledge use.

The bottom is the extraction layer, which integrates digital resources on chemical engineering. At present, the major source is Web pages, which are immense, heterogeneous and dynamic, and whose content is semi-structured, such as HTML, XML or other semi-structured text. This layer extracts these semi-structured data to implement knowledge acquisition.

After knowledge has been acquired from these semi-structured data, it must be represented in a model that makes semantic retrieval easy to implement. This is the function of the second layer, the knowledge layer, which consists of two sub-layers: the RDF data repository and meta-knowledge.

The bottom of the knowledge layer is the RDF data repository, which stores knowledge from the extraction layer. We used RDF as the knowledge representation model, making data machine-understandable and enhancing the retrieval services.

The top level of the knowledge layer is meta-knowledge. Meta-knowledge is a process of critical thinking, reasoning and understanding of knowledge, through which it turns into a kind of skill for problem-solving and decision making. Meta-knowledge serves not only knowledge management,


but more importantly the provision of knowledge services, problem-solving methods, and decision making processes. Meta-knowledge is the analysis, evaluation, comprehension, inference, and interpretation of inherent knowledge and is the exploitation, analysis and manifestation of users' wisdom and experience [9].

In short, just as metadata describes data, meta-knowledge is a description of knowledge. In our CKSSG, the meta-knowledge is created, maintained and populated in the following way. First, the knowledge engineer constructs a category tree of chemical domain knowledge, and the RDF data is classified into different categories based on key words. Second, while CKSSG is in operation, the meta-knowledge is improved based on user feedback.

The top is the knowledge services layer, which provides knowledge sharing, searching, browsing, summarizing, visualizing and organizing services for users. Importantly, the knowledge service interface component records basic operations when a user uses the system. These logs, which reveal the interests of the user in certain information resources, are the major source of meta-knowledge. Furthermore, as meta-knowledge accumulates, the RDF data repository is updated with new meta-knowledge rules.

This architecture is concerned with the way knowledge is acquired, represented, used, retrieved and maintained to assist chemical scientists in achieving their particular goals and objectives. It extracts knowledge from the WWW and represents it in RDF format. In order to provide users with a knowledge navigation service, it constructs meta-knowledge. Users can then access the knowledge through the knowledge service interface on the top layer.

III. CHEMICAL KNOWLEDGE REPRESENTATION MODEL

CKSSG acquires chemical engineering domain knowledge from the Internet, such as Web sites, open chemical resource libraries and so on. These data are characterized as semi-structured, i.e. they are often semi-structured texts, HTML documents and XML documents. Furthermore, the data often include some descriptive information, i.e. metadata, such as subject, updating time, source, and so on. Based on such features, it is effective to use RDF as the knowledge representation model. Using such a model makes data machine-understandable and enhances the retrieval services.

Therefore, in our CKSSG we designed a knowledge model, the Chemical Knowledge Representation Model (CKRM), based on RDF to represent knowledge and provide users with more flexible semantic retrieval services. In the following, we introduce the layered structure of our chemical domain knowledge model, then describe the RDF/S-Graph model, and finally present the retrieval mechanism.

A. Layered structure of CKRM

According to its requirements, CKRM consists of a stack of languages and data models, like a pyramid. The layers of CKRM are shown in Fig. 4.

[Figure 4: layer stack, from top to bottom: Retrieval Mechanism; Domain Knowledge Category; RDFS (RDF Schema); RDF (Resource Description Framework); XML (eXtensible Markup Language); Semi-Structured Data.]

Figure 4. Layer of CKRM

As shown in Fig. 4, the bottom layer is XML (semi-structured data is the information source and is not included in CKRM). XML is already widely known and is designed for markup in documents of arbitrary structure. In our CKRM, XML is used as the serialization syntax of RDF/S data.

The second layer is RDF, which is a W3C recommendation for the formulation of meta-data on the World Wide Web [10]. RDF is an XML application customized for adding meta-information to Web documents. When it comes to semantic interoperability, RDF has significant advantages over XML. The basic building block in RDF is an object-attribute-value triple, commonly written as A(O, V). That is, an object O has an attribute A with value V. For example, the following is a relationship expressed in A(O, V) format:

    hasName('http://202.4.130.205/periodList', 'periodic table of the elements')

The above triple can be serialized in XML as follows:

    <rdf:Description rdf:about="http://202.4.130.205/periodList">
      <hasName rdf:resource="periodic table of the elements"/>
    </rdf:Description>

Above RDF is RDFS. Just as XML Schema provides a vocabulary definition facility for XML, RDF Schema provides a similar facility for RDF. RDF Schema extends this standard with the means to specify domain vocabulary and object structures [11]. These techniques will enable the enrichment of the Web with machine-understandable semantics, thus giving rise to what has been dubbed the Semantic Web.

The upper layer is the chemical domain knowledge category, which not only is used for knowledge classification and knowledge navigation, but also facilitates the processing of users' feedback. Currently the domain knowledge category is constructed by the knowledge engineer according to the chemical part of the Chinese Book Category, i.e. TP1 to TP2.
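The object-attribute-value triple of the RDF layer and its XML serialization can be illustrated with a short Python sketch. It is not part of CKSSG: the Triple type, the to_rdf_xml function, the ch namespace URI and the example.org resource address are assumptions introduced only for illustration, and the value is written here as literal element text.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CH_NS = "http://example.org/chem#"  # hypothetical vocabulary namespace


class Triple(NamedTuple):
    """An object-attribute-value statement A(O, V)."""
    obj: str        # O: the resource being described
    attribute: str  # A: the attribute name
    value: str      # V: a literal value


def to_rdf_xml(triple: Triple) -> str:
    """Serialize one triple as an rdf:Description element inside rdf:RDF."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ch", CH_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": triple.obj}
    )
    # The attribute becomes a child element; the value becomes its text.
    prop = ET.SubElement(description, f"{{{CH_NS}}}{triple.attribute}")
    prop.text = triple.value
    return ET.tostring(root, encoding="unicode")


# hasName(O, V) for a placeholder resource address (assumption).
example = Triple(
    obj="http://example.org/periodList",
    attribute="hasName",
    value="periodic table of the elements",
)
print(to_rdf_xml(example))
```

Keeping the triple as a plain value type separates the data model from its serialization, which mirrors the split between the RDF layer and the XML layer of CKRM.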


The top layer is the retrieval mechanism for knowledge in CKRM format. In fact it only needs a query mechanism for RDF/S data, which should implement querying at the semantic level. There are many query mechanisms for RDF, such as RQL [12], RDQL [13] and so on. In our CKSSG, we used a simple retrieval mechanism for RDF/S-Graph [14].

B. RDF/S-Graph

As described above, the basic building block in RDF is an object-attribute-value triple, commonly written as A(O, V). That is, an object O has an attribute A with value V. The triple can also be presented as a graph: objects (resources) are presented as nodes and attributes are denoted by labeled edges between nodes, i.e. [O] -A-> [V]. Usually one object can be the value of another object, so such relations are clearly presented using a graph.

RDFS defines classes of RDF resources and the range of attributes. RDFS documents are themselves RDF documents, so RDFS can also be presented by a graph: nodes represent classes and edges denote types of attributes. Therefore, RDF/S-Graph can be used as a knowledge representation model. An RDF/S-Graph is a Directed Acyclic Graph (DAG) with labeled edges, which can be presented as RG = (V, E, D, l), where V denotes the set of nodes, E denotes the set of edges, D is the domain of nodes, and l is the mapping from edges to nodes.

Fig. 5 and Fig. 6 give an example of using RDF/S-Graph to represent chemical knowledge:

    <rdf:RDF>
      <rdf:Description about="catalyst for pyrolysis oil production">
        <ch:molecule>Fe2O3</ch:molecule>
        <ch:physical_prop>
          dark-red sediment
        </ch:physical_prop>
        <ch:chemical_prop>
          oil yield can be improved about 76.7% when using Fe2O3 as catalyst
        </ch:chemical_prop>
        <url_addr>http://......</url_addr>
      </rdf:Description>
    </rdf:RDF>

Figure 5. An Example of RDF Document

[Figure 6: graph of the RDF document in Fig. 5. Recoverable labels: node "&r3"; values "Dark-red sediment" and "Pyrolysis oil"; edge labels "is-a" and "ch:ao". Legend: r1: catalyst for oil production; r2: Fe2O3; r3: CuSO4.]

Figure 6. RDF/S-Graph of RDF Document

Fig. 5 is an RDF document serialized in XML, which describes the molecule type, physical property and chemical property of a catalyst for oil production (Fe2O3), including in particular the information source (i.e. URL address) of the digital object. Fig. 6 gives the graph of this RDF document based on the above RDF/S-Graph definition.

It is clear that RDF/S-Graph is well suited to querying at the semantic level, because it represents the relationships among properties and objects. Importantly, such data are easy to store in an RDBMS (Relational Database Management System). In CKSSG we used MySQL.

IV. IMPLEMENTATION AND APPLICATION

As an application deployed on the Chemical Grid platform, CKSSG provides chemical domain knowledge, knowledge retrieval and navigation services, and it can be accessed by all users of our Chemical Grid. When a user needs to access these knowledge services, he or she just submits a request on the Chemical Grid Web portal; the grid job scheduling system then manages the job execution and balances the load across widespread computational resources.

A. Implementation on Chemical Grid platform

Based on Chemical Grid, we implement CKSSG. From the users' view, CKSSG is also a service provided by Chemical Grid: users submit a request in the portal, and the Chemical Grid system then activates CKSSG. When CKSSG receives a request, it begins to search the RDF data repository and meta-knowledge. The results are then presented to users. Fig. 7 shows the system architecture of CKSSG.
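The search step described above, in which CKSSG looks up an RDF data repository held in a relational database, can be sketched as follows. This is only an illustration under stated assumptions, not the CKSSG retrieval mechanism of [14]: Python's built-in sqlite3 module stands in for MySQL, and the single-table layout and the build_repository and search functions are hypothetical. The first three sample rows restate the RDF document of Fig. 5.

```python
import sqlite3

# Each row is one labeled edge of the graph: resource -attribute-> value.
SAMPLE_TRIPLES = [
    ("catalyst for pyrolysis oil production", "ch:molecule", "Fe2O3"),
    ("catalyst for pyrolysis oil production", "ch:physical_prop", "dark-red sediment"),
    ("catalyst for pyrolysis oil production", "ch:chemical_prop",
     "oil yield can be improved about 76.7% when using Fe2O3 as catalyst"),
    # Hypothetical extra resource, added only so the search has something to exclude.
    ("hypothetical solvent", "ch:physical_prop", "colourless liquid"),
]


def build_repository(triples):
    """Create an in-memory triple table (assumed layout) and load the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triple (resource TEXT, attribute TEXT, value TEXT)")
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def search(conn, keyword):
    """Return every statement about resources whose name or any value contains keyword."""
    pattern = f"%{keyword}%"
    query = (
        "SELECT resource, attribute, value FROM triple "
        "WHERE resource IN ("
        "  SELECT resource FROM triple WHERE resource LIKE ? OR value LIKE ?"
        ") ORDER BY resource, attribute"
    )
    return conn.execute(query, (pattern, pattern)).fetchall()


conn = build_repository(SAMPLE_TRIPLES)
for resource, attribute, value in search(conn, "Fe2O3"):
    print(resource, attribute, value)
```

Returning all statements about a matching resource, rather than only the matching row, reflects the point that the graph keeps the relationships among an object and its properties together. A keyword containing % or _ would act as a LIKE wildcard in this sketch.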


[Figure 7: system overview diagram. Recoverable labels: Portal; RDBMS; RDF Data Repository.]

Figure 7. System Overview of CKSSG

As shown in Fig. 7, the ChemExtract module extracts semi-structured data from the WWW. These data are then processed and stored in the database. At present ChemExtract is an executable application developed in Java.

The ChemSearch module is the core component of the system. It provides not only the knowledge query service but also knowledge navigation. Furthermore, it is designed as a WSDL-compatible and SOAP-compatible web service deployed on the grid platform.

The web portal is the interface by which users can access the knowledge services of CKSSG. Unlike typical web subject portals, a grid portal may also provide access to grid resources.

B. Application scenario

A scenario of CKSSG application is derived from the requirements of one chemical researcher, who wants to get some information about catalysts for oil production.

The researcher logs into the Grid platform through the Web portal, where he can choose a service. By looking through the services list, he decides to access the CKSSG service.

He may first use knowledge navigation. From the Domain Knowledge Category he finds the catalyst category, and he descends through the sub-categories until he gets what he wants.

He can also use the knowledge retrieval service by submitting a query request. When the ChemSearch module returns the results, he can get what he wants.

Different chemical researchers may express different interests in knowledge. By recording users' feedback, these logs become an important source for improving the knowledge.

V. CONCLUSION AND FUTURE WORK

In this paper we report our CKSSG application on the grid platform. It is concerned with the way that knowledge is acquired, used, retrieved, published and maintained to assist chemical researchers in achieving their particular goals and objectives. To implement this function, we designed a knowledge service architecture that is easy to deploy on the grid platform. We then proposed a chemical domain knowledge representation model (CKRM), the key of which is the RDF/S-Graph. Based on the architecture and the knowledge model, we implemented the CKSSG system.

At present CKSSG has integrated only a small amount of chemical digital resources from the WWW. However, the grid metaphor intuitively gives rise to the view of the CKSSG as a set of services that are provided by particular individuals or institutions for consumption by others. On the other hand, a grid is useless without appliances to plug in [3]. Our CKSSG is a useful knowledge service that can be deployed on a grid computing platform and provide services using distributed computing resources. All extracted chemical domain knowledge, i.e. the RDF data, together with the service mechanism composes a Semantic Grid, which is a practical experience on a grid computing platform and an innovative effort toward the Semantic Web.

REFERENCES

[1] I. Foster, C. Kesselman. The Grid: Blueprint for a Future Computing Infrastructure. Morgan Kaufmann Publishers, 1999.
[2] John Davies, Dieter Fensel, Frank van Harmelen. Towards the Semantic Web: Ontology-Driven Knowledge Management. John Wiley & Sons, Ltd, 2003.
[3] David De Roure, Nicholas Jennings, Nigel Shadbolt. Research Agenda for the Semantic Grid: A Future e-Science Infrastructure. Technical Report, URL: http://www.semanticgrid.org/v1.9/semgrid.pdf
[4] K. R. Taylor, J. W. Essex, et al. The Semantic Grid and Chemistry: Experiences with CombeChem. Web Semantics: Science, Services and Agents on the World Wide Web 4 (2006), 84-101.
[5] D. De Roure, N. Jennings, N. R. Shadbolt. The Semantic Grid: Past, Present, and Future. Proc. IEEE 93 (2005), 669-681.
[6] C. A. Goble, D. De Roure, et al. Enhancing services and applications with knowledge and semantics. The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 2004, 431-458.
[7] Jin Hai. China Grid: making grid computing a reality. International Collaboration and Cross-Fertilization (ICADL 2004), 2005, 13-24.
[8] Ying Zhao, Xuelin Shi. Collaborative Computational Chemical Grid Based on CGSP. 2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007), 199-202.
[9] Xiaoxing Zhang. Knowledge Service and Digital Library: A Roadmap for the Future. 7th International Conference on Asian Digital Libraries (ICADL 2004), Shanghai, China, December 13-17, 2004.
[10] O. Lassila, Ralph Swick. Resource Description Framework (RDF) Model and Syntax Specification. URL: http://www.w3.org/TR/REC-rdf-syntax/.
[11] D. Brickley, R. V. Guha. RDF Vocabulary Description Language 1.0: RDF Schema. URL: http://www.w3.org/TR/2004/REC-rdf-schema-20040210/.
[12] G. Karvounarakis, A. Magganaraki, et al. Query the Semantic Web with RQL. Computer Networks 42 (2003), pp. 617-640.
[13] A. Seaborne. RDQL-A Query Language for RDF. W3C Member Submission. http://www.w3c.org/Submission/RDQL.
[14] Xuelin Shi, Ying Zhao. RDF based integrated information retrieval in grid computing environment. 2008 International Joint Conference on Neural Networks (IJCNN 2008).
