A Proposed Ontology Based Architecture to Enrich the Data Semantics Syndicated by RSS Techniques in Egyptian Tax Authority

Published by: ijcsis on Jan 20, 2011
Copyright:Attribution Non-commercial


 
(IJCSIS) International Journal of Computer Science and Information Security,Vol. 8, No. 9 December, 2010
A Proposed Ontology Based Architecture to Enrich the Data Semantics Syndicated by RSS Techniques in Egyptian Tax Authority
 
Ibrahim M El-Henawy (1), Mahmoud M Abd El-latif (2), Tamer E Amer (2)

(1) Faculty of Computer and Informatics, Zagazig University, Zagazig, Egypt
(2) Faculty of Computer and Information, Mansoura University, Mansoura, Egypt

Henawy2000@yahoo.com, drmmlatif@yahoo.com, Tameramer1@yahoo.com
 
Abstract: RSS (RDF Site Summary) is a web content format used to provide extensible metadata description and syndication for large-scale sharing, distribution and reuse across various applications. The metadata provided by RSS may be too little to describe the web resource. This paper provides a framework that uses RSS not just for syndicating a little information about news but also for further classification, filtering operations and answering many questions about that news by modeling an RSS ontology. The proposed architecture will be applied to handle announcements in the Egyptian Tax Authority.
Keywords: Semantic Web; RDF; Ontology; RSS; OWL; Protégé; Egyptian Tax
I. INTRODUCTION
The Egyptian tax authority consists of 39 Tax Regions distributed all over Egypt, which manage 227 tax offices [22]. All tax offices are connected via a large computer network on a single domain, called the GTAX Domain, managed by the central management of computer in Cairo. There are 14 IT (Information Technology) branches that support IT work in all tax regions. Besides the computer network, the Tax authority uses a large IP telephone network based on VoIP (Voice over IP) technology to support communication between remote offices.

This centralization poses a great challenge. For example, when the central management of computer in Cairo wants to announce a specific event, a meeting, or a new version of a specific application in the authority, it puts a written announcement (.doc format) on the main FTP server and telephones all 14 IT branches over IP telephone; the 14 IT branches then call the rest of the 227 remote tax offices. This manual announcing protocol is very time consuming, whereas using RSS to syndicate data published in different places would facilitate data exchange between them.

RSS can be read as an acronym for RDF Site Summary; it is an RDF (Resource Description Framework) vocabulary that provides lightweight, multipurpose, extensible metadata to describe and syndicate any information that consists of discrete items [1, 15, 16]. Hence it allows the key elements of websites, such as headlines, to be transmitted; devoid of all elaborate graphics and layouts, such minimalist headlines are quite easily incorporated into other websites.

Besides the ability of RSS to solve many problems that webmasters face, such as increasing traffic and gathering and distributing news, RSS can also be the basis for additional content distribution services. Apart from the speed of viewing many different sites as a single coherent whole, the democratic manner of news distribution, which enables the user to choose the feeds he wants and makes him a potential news provider, can be considered the most significant benefit of using RSS [2].

Many advantages can be achieved by using RSS, but all the data gathered in the RSS file is shown directly by any RSS aggregator. What if someone wants to classify the data presented in RSS? For example, if the training management of the Tax authority announces a training course in "Soft Skills", does this announcement belong to a specific department or to all? Is it for a specific tax region according to a specific schedule, or for all? What if someone wants to know some information about the writer or publisher of the published article? There are clearly many questions in the chain, and the little metadata description presented in RSS cannot answer them.

The semantic web extends the current web by giving information published on the web a well-defined meaning, better enabling computers and people to work in cooperation [3]. For RSS to be able to answer the questions above, "well-defined meaning" must exist from the perspective of RSS, yet it may not be expressible via the terminology of RSS alone. The only way to express "well-defined meaning" in RSS is to extend RSS itself by enabling it to link and interact with other ontologies, thus enriching the semantics provided by RSS.

The contribution of this paper is to deal with data published by RSS as a domain ontology and to enable it to interact with other vocabularies such as Dublin Core metadata, the FOAF (Friend Of A Friend) ontology and a tax ontology. This enables further operations on RSS data, such as classification, reasoning, or answering the chain of questions above. The presented ontology is modeled in Protégé.
The outline of this paper is as follows: section 1 provides a background on the Egyptian tax authority and its current way of announcement. Section 2 illustrates what RSS is and how it is related to RDF. The proposed architecture and the implementation of the RSS ontology are presented in section 3; finally, we conclude this paper in section 4.

http://sites.google.com/site/ijcsis/ ISSN 1947-5500

II. LITERATURE ON RSS

An RSS file uses XML-based syntax; it has the application/xml MIME (Multipurpose Internet Mail Extensions) type. The preferred file extension for RSS version 1.0 is (.rdf). Care should be taken here because RSS will be discussed in the scope of RDF: RSS 0.9 and 1.0 are the only RSS specifications that use RDF vocabularies; the other specifications (RSS 0.91, 0.92, 0.93, 0.94, and 2.0) do not [1, 16] and are more basic XML implementations. An RSS 1.0 file mainly uses the following two namespaces, declared as two attributes within the <rdf:RDF> tag:
 
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://purl.org/rss/1.0/"
 
A. The Anatomy of an RSS File
Each RSS file consists of a single channel that contains the information gathered from many different sites; it is represented by the <channel> element. The rdf:about attribute is used within the <channel> tag to describe the location and the name of the RSS file.

Some required tags within the channel element describe the channel itself:

• <title> element to describe the title of the channel.
• <link> element to describe the URL of the parent site or the news page.
• <description> element to provide a brief description of the channel contents, function, etc.

As shown in Figure.1, the channel contains a number of items (<items>) listed in an ordered collection described by the RDF container <rdf:Seq>. The items listed in the channel are described outside the channel, after the closing </channel> tag, each in its own <item> element using the same <title>, <link> and <description> tags. The following block diagram illustrates the anatomy of the RSS file.
Figure.1 Anatomy of RSS File
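The anatomy described above can be illustrated with a minimal sketch that reads an RSS 1.0 document using Python's standard xml.etree.ElementTree module. The feed content, the example.org URLs and the helper names (qname, parse_feed) are assumptions made for illustration only; they are not taken from the tax authority's feeds.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"

# Hypothetical feed: all URLs and texts below are invented for illustration.
SAMPLE_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/announcements.rdf">
    <title>Announcements</title>
    <link>http://example.org/</link>
    <description>Sample announcement channel</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/a/1"/>
        <rdf:li rdf:resource="http://example.org/a/2"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/a/1">
    <title>Soft Skills course</title>
    <link>http://example.org/a/1</link>
    <description>Training course announcement</description>
  </item>
  <item rdf:about="http://example.org/a/2">
    <title>New application version</title>
    <link>http://example.org/a/2</link>
    <description>Release announcement</description>
  </item>
</rdf:RDF>"""


def qname(namespace, local_name):
    """Build the '{namespace}local' form that ElementTree uses for names."""
    return "{%s}%s" % (namespace, local_name)


def parse_feed(xml_text):
    """Return the channel metadata and its items in rdf:Seq order."""
    root = ET.fromstring(xml_text)
    channel = root.find(qname(RSS_NS, "channel"))

    # The <items> element lists item URIs inside an ordered rdf:Seq container.
    ordered_uris = [
        li.get(qname(RDF_NS, "resource"))
        for li in channel.iter(qname(RDF_NS, "li"))
    ]

    # Each item is described outside the channel, keyed by its rdf:about URI.
    described = {}
    for item in root.findall(qname(RSS_NS, "item")):
        described[item.get(qname(RDF_NS, "about"))] = {
            "title": item.findtext(qname(RSS_NS, "title")),
            "link": item.findtext(qname(RSS_NS, "link")),
            "description": item.findtext(qname(RSS_NS, "description")),
        }

    return {
        "about": channel.get(qname(RDF_NS, "about")),
        "title": channel.findtext(qname(RSS_NS, "title")),
        "items": [described[uri] for uri in ordered_uris if uri in described],
    }


feed = parse_feed(SAMPLE_FEED)
for entry in feed["items"]:
    print(entry["title"], entry["link"])
```

The sketch treats the file as plain XML, which is enough to show the channel, the ordered item list and the item descriptions; a full RDF parser would be needed to treat the same file as a graph of statements.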
B. The Relationship of RSS to RDF
Earlier versions of RSS did not include any RDF vocabularies; they are just a syntactic XML representation of the published news. Although XML is a universal metalanguage for defining markup [7], it is worth mentioning that XML has some deficits: for example, it does not provide a satisfactory representation of the semantics embedded in statements, and there is no standard way to assign meaning to the nesting of XML elements.

Expressing RSS 1.0 in a language described in RDF Concepts and Abstract Syntax makes it conform to the RDF/XML syntax specification, which has a precise formal semantics defined in RDF Semantics; RSS 1.0 thus interoperates easily with other RDF languages and can be read and processed by machines [13].

The foundation of RSS 1.0 serves the purpose of this paper, which intends to extend RSS 1.0 for use beyond strict news and announcement syndication by focusing on a generic means of exchanging structured metadata [4] and on how it can be combined with other RDF ontologies through a simple modular extension mechanism that accommodates new vocabularies.

III. SYSTEM ARCHITECTURE
The main purpose of the proposed architecture is to extend the RSS data gathered from different resources beyond mere syndication by building an ontology that can interact with other ontologies, giving tight and well-defined metadata about news and announcements. It creates a collaborative space in which everything is linked. Classification can be performed, and many questions can be answered.

The framework presented in this research can be considered an integrated semantic web architecture. The word "integrated" means that the architecture consists of more than one component, each with a specific task; the word "semantic" means that the architecture is based on semantic web technologies to build the presented ontology. Figure.2 shows the schematic diagram of this architecture.
[Figure.1 blocks: RDF declaration and used namespaces; Channel; List of items (Item 1, Item 2, Item 3, ..., Item n); Description of each item in the channel; RDF file closing.]

Figure.2 The schematic diagram of the architecture
[Layers: Application Layer; Extending and Inference Layer; RDF Layer; Source and Storage Layer. Labeled components: UI, presentation, browsing, query composition, information request, data request, API, OWL, RSS ontology, other ontologies, rules, reasoner, RDF/XML, RDF query, inference, RDF storage/query engine, generator, RSS files (structured data), DB, XML, semi-structured data, unstructured data.]
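The idea of letting RSS data interact with other vocabularies can be sketched with a tiny in-memory set of (subject, predicate, object) triples. This is only an illustration under stated assumptions, not the paper's implementation: the "ex:" resources and the "tax:targetRegion" property are hypothetical stand-ins for the tax ontology, while dc:creator and foaf:name are existing Dublin Core and FOAF terms. The helper names match, announcements_for_region and publisher_name are also invented here.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples that fit the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Statements that a plain RSS 1.0 file can already express.
rss_triples = {
    ("ex:a1", "rdf:type", "rss:item"),
    ("ex:a1", "rss:title", "Soft Skills course"),
    ("ex:a2", "rdf:type", "rss:item"),
    ("ex:a2", "rss:title", "New application version"),
}

# Statements contributed by other vocabularies.
# "tax:targetRegion" and every "ex:" resource are hypothetical.
extension_triples = {
    ("ex:a1", "dc:creator", "ex:trainingDept"),
    ("ex:trainingDept", "foaf:name", "Training Management"),
    ("ex:a1", "tax:targetRegion", "ex:region7"),
    ("ex:a2", "tax:targetRegion", "ex:region3"),
}

# Linking the vocabularies amounts to merging the two sets of statements.
graph = rss_triples | extension_triples


def announcements_for_region(triples, region):
    """Titles of RSS items whose (hypothetical) target region matches."""
    titles = []
    for item, _, _ in match(triples, p="tax:targetRegion", o=region):
        if match(triples, s=item, p="rdf:type", o="rss:item"):
            titles.extend(t for _, _, t in match(triples, s=item, p="rss:title"))
    return sorted(titles)


def publisher_name(triples, item):
    """Follow dc:creator, then foaf:name, to name the publisher of an item."""
    for _, _, creator in match(triples, s=item, p="dc:creator"):
        for _, _, name in match(triples, s=creator, p="foaf:name"):
            return name
    return None


print(announcements_for_region(graph, "ex:region7"))
print(publisher_name(graph, "ex:a1"))
```

Questions such as "which region is this announcement for" or "who published it" cannot be answered from the RSS triples alone; they become simple pattern matches once the extra statements are merged in. A real deployment would use an RDF store and query language rather than Python sets.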
A. The Ontology of RSS Data
Ontology is the heart of semantic web applications. An ontology is defined [5] as the explicit and formal specification of a conceptualization of a domain of interest. It is increasingly seen as a key technology for enabling semantics-driven knowledge processing. Communities establish ontologies, or shared conceptual models, to provide a framework for sharing a precise meaning of symbols exchanged during communication and to enable programs to reason about different worlds and environments; they enable us to say "our world looks like this" [6, 10].

The presented ontology is modeled in Protégé [17], one of the most famous and widely used ontology editing environments [22, 23]. The conceptual model of the proposed ontology consists of the hierarchy of all classes, subclasses, properties and subproperties and how they relate to each other. The ontology is written in OWL (Web Ontology Language), a very powerful tool for describing complex relationships and characteristics of resources. OWL allows rules to be asserted on classes and properties. Rules represent the logic that enhances the ontology language [7]; applied to a set of facts, they infer new facts that are not explicitly stated.

Treating the RSS data as an ontology helps when searching for a concept, making it easy to locate not only that concept but also other concepts that are semantically related to it. Although RSS 1.0 is expressed in the language described in RDF Concepts and Abstract Syntax, and RDF provides an ideal encoding for making ontologies available to semantic web applications, RDF offers a limited set of semantic primitives and therefore cannot meet the requirements of a markup language for the semantic web. Extending the RSS semantics with more primitives encoded in OWL, which offer appealing inference capabilities, therefore yields tightly defined vocabularies that describe the concepts in the ontology and also significantly influences searching for information about those concepts: the degree to which terminologies are semantically precise has a direct impact on the degree to which relevant information can be found [8, 9].

The RDF schema should be considered when talking about the RSS 1.0 ontology because it shapes and describes that ontology, which is described formally in the RDF schema of RSS 1.0 [18]. It consists of the classes and properties summarized in Table I and Table II.
TABLE I. THE CLASS SPECIFICATION OF RSS IN RDF SCHEMA

Class | Definition | URL
Channel | An RSS information channel | http://purl.org/rss/1.0/channel
Image | An RSS image | http://purl.org/rss/1.0/image
Item | An RSS item | http://purl.org/rss/1.0/item
TextInput | An RSS text input | http://purl.org/rss/1.0/textinput
TABLE II. THE PROPERTY SPECIFICATION OF RSS IN RDF SCHEMA

Property | Definition | URL | SubProperty Of
Items | List of rss:item elements that are members of the subject channel | http://purl.org/rss/1.0/items | -
Title | A descriptive title for the channel | http://purl.org/rss/1.0/title | Dublin Core title element; dc:title [19]
Link | The URL to which an HTML rendering of the subject will link | http://purl.org/rss/1.0/link | Dublin Core identifier element; dc:identifier [19]
url | The URL of the image to be used in the 'src' attribute of the channel's image tag when rendered as HTML | http://purl.org/rss/1.0/url | dc:identifier
Description | A short text description of the subject | http://purl.org/rss/1.0/description | Dublin Core description element; dc:description [19]
Name | The text input field's (variable) name | http://purl.org/rss/1.0/name | -

Figures 3 and 4 show the node and arc diagrams for the classes and properties in Table I and Table II.
Figure.3 Node and Arc Diagram for RSS Classes
[Nodes: rdfs:Class, Channel, Image, Item, TextInput]
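The SubProperty Of column of Table II is what lets an application read RSS statements as Dublin Core statements. A minimal sketch of that inference follows, using the standard RDFS rule that if p is a subproperty of q and (s, p, o) holds, then (s, q, o) holds. The prefixed names mirror Table II; the "ex:a1" resource, the sample facts and the function name infer_superproperties are assumptions for illustration.

```python
# SubProperty Of relations taken from Table II (prefixed names for brevity).
SUBPROPERTY_OF = {
    "rss:title": "dc:title",
    "rss:link": "dc:identifier",
    "rss:url": "dc:identifier",
    "rss:description": "dc:description",
}


def infer_superproperties(triples, subproperty_of):
    """Add (s, q, o) for every (s, p, o) where p is a subproperty of q.

    The loop repeats until no new triple appears, so chains of
    subproperties would also be followed.
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            q = subproperty_of.get(p)
            if q is not None and (s, q, o) not in inferred:
                inferred.add((s, q, o))
                changed = True
    return inferred


# Hypothetical facts about one announcement.
facts = {
    ("ex:a1", "rss:title", "Soft Skills course"),
    ("ex:a1", "rss:link", "http://example.org/a/1"),
}

closure = infer_superproperties(facts, SUBPROPERTY_OF)
for triple in sorted(closure):
    print(triple)
```

After the inference step, a tool that only understands Dublin Core can still find the title and identifier of the announcement, even though the feed itself only used RSS properties.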