
A Domain Integrity Constraint Control for XML Documents

Alexandre Tagliari Lazzaretti¹, Ronaldo dos Santos Mello²

¹ Instituto de Ciências Exatas e Geociências, Universidade de Passo Fundo (UPF), Passo Fundo, RS, Brazil.

² Centro Tecnológico, Departamento de Informática e Estatística, Universidade Federal de Santa Catarina (UFSC), Florianópolis, SC, Brazil.
lazzaretti@upf.br, ronaldo@inf.ufsc.br

Abstract. Integrity constraints for XML documents are mainly defined through XML schemas: DTD and XML Schema. However, XML schemas do not support all the kinds of domain integrity constraints available in SQL databases. This paper presents a control called XDC that supports the domain integrity constraints that XML schemas cannot handle. It is composed of an XML language for domain integrity constraint specification, called XDCL, and an XDCL parser that validates such constraints. XDCL allows specifying conditions and executing actions, as in SQL databases. XDC is an independent mechanism that can be used by users, applications or XML databases to validate domain integrity constraints.

1. Introduction
XML (eXtensible Markup Language) is a current standard for the representation and interchange of data on the Web [XML 2005]. An XML document is a collection of data, like a database. However, XML technology is not equivalent to database technology, because there are no effective solutions for all aspects of XML data management, such as integrity constraint control and transaction management [Chaudhri et al. 2003]. As in database systems, XML documents also need rules that impose constraints on data, given that applications that interchange XML data require consistent data.

In XML technology, the W3C recommendations for XML schemas (DTD and XML Schema) define the structure, content and semantics of XML documents [W3C XML Schema 2003]. However, they are not able to deal with all kinds of database integrity constraints, such as Domain Integrity Constraints (DICs), which are the focus of this work.

Some taxonomies of DICs have been proposed for the relational model [Date 2003, Elmasri and Navathe 2003, Silberschatz et al. 2003]. However, these taxonomies are not homogeneous. For this reason, we define a taxonomy for DICs based on the existing ones, and we use it as the basis for analyzing the resources available in XML technology for supporting DICs.

Some work in the literature deals with DIC control for XML data [Ogbuji 2001, Benedikt 2002, Bailey et al. 2002, Provost 2002, Bonifati et al. 2002]. However, most of these treatments are not efficient when compared to the DIC support provided by databases.

Besides, these works have some drawbacks, such as extending the XML schema recommendations to define integrity constraints (which restricts the use of the extended XML schema to the approach proposed by the work) and developing complex constraint engines to control integrity constraints automatically.

This paper presents a DIC control for XML documents that overcomes the existing limitations of the W3C XML schema recommendations as well as of related work. This control is called XDC (XML Domain Constraint control), and it is composed of a DIC specification language called XDCL and an XDCL parser that validates these constraints on an XML document. In our approach, an XML document that must be validated in terms of DICs has a reference to an XDC document that contains XDCL specifications. The structure of the XML data considered in the XDCL specifications respects the structure of the schema defined for the XML document, so that the XDCL parser can validate it correctly. Thus, the validation of the schema and the validation of the DICs of an XML document are performed separately, i.e., it is not necessary to extend an XML schema recommendation or an XML parser in order to support the control of DICs.

This paper is organized in five more sections. Section 2 outlines the existing resources in XML schema recommendations and related work for DIC control on XML documents, providing the motivation for our approach. Section 3 gives an overview of our approach, called XDC. Section 4 details the XDCL language and Section 5 briefly describes the XDCL parser. Finally, Section 6 is dedicated to the conclusion.

2. Domain Integrity Constraints for XML


DICs in databases restrict the allowed values for stored data in a specific application domain. In a relational database, they may be classified into the following kinds of DICs [Lazzaretti 2005]: (i) type, if the allowed values for data are restricted to their datatypes; (ii) attribute, which defines specific allowed values for the data, like an enumeration or a fixed value; (iii) tuple, if the constraint considers the values of two or more attributes in a tuple; (iv) database, if the constraint considers the values of two or more tuples in the database (in the same table or in different tables); and (v) state transition, which guarantees valid transitions of values for data.

XML data are semistructured data [Abiteboul et al. 2000]. For this reason, the taxonomy proposed above for database DICs must be adapted to the XML model. In this paper, we assume the following: (i) a type DIC encloses structural constraints, like sub-element sequences or choices, and sub-element cardinality constraints; (ii) an attribute DIC defines the allowed values for simple elements or attributes; (iii) a tuple DIC considers the validation of the content of elements or attributes that belong to the same hierarchical level in an XML document, i.e., all the elements (and their attributes) that are direct descendants of an element E, including the attributes of E; (iv) a database DIC considers the validation of elements or attributes that belong to different hierarchical levels in an XML document; and (v) a state transition DIC considers valid value transitions for simple elements and attributes.

The W3C recommendations for XML schemas are DTD (Document Type Definition) and XSD (XML Schema Definition) [W3C XML Schema 2005]. Table 1 compares the existing DIC controls in XML schemas with the DIC taxonomy for relational databases [Lazzaretti 2005].
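Table 1. DIC Taxonomy for Databases vs. DIC Controls in XML Schemas.

DIC controls in XML schemas       | Type | Attribute | Tuple | Database | State Transition
----------------------------------+------+-----------+-------+----------+-----------------
Element content                   |  X   |     -     |   -   |    -     |        -
Sub-element sequences and choices |  X   |     -     |   -   |    -     |        -
Sub-element cardinality           |  X   |     -     |   -   |    -     |        -
Attribute and element type        |  X   |     -     |   -   |    -     |        -
Attribute cardinality             |  X   |     -     |   -   |    -     |        -
Attribute content                 |  X   |     -     |   -   |    -     |        -
Attribute defaults                |  -   |     X     |   -   |    -     |        -
Element defaults                  |  -   |     X     |   -   |    -     |        -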

According to Table 1, XML schemas are able to specify type and attribute DICs. The element content¹, sub-element sequences and choices, sub-element cardinality, attribute content², and attribute cardinality controls are related to type DICs, because they define the structure of the elements and attributes in an XML schema. A type specification in an XSD schema is also related to a type DIC, because it allows the definition of constraints on the datatype of an element or an attribute³. The other element and attribute controls are classified as attribute DICs. A default in Table 1 is a pattern value or a set of constant values allowed for an element or an attribute.

¹ We assume that an element content may hold a structure or a predefined datatype.
² We assume that an attribute content respects a predefined datatype.
³ The "Attribute and element type" line in Table 1 includes simple type constraints (facets) as well as derivation by restriction or extension in XML Schema [W3C XML Schema 2005].

Based on the information in Table 1, we conclude that XML schemas do not support tuple, state transition and database DICs, i.e., XML schemas are not expressive enough to describe and control all kinds of relational database DICs. Because of this, some proposals in the literature extend XML technology in order to deal with such constraints. These related works are described in the following.

2.1 Related Work

Schematron is a language that defines a schema for XML documents [Ogbuji 2001]. It extends an XSD specification with a set of instructions for DIC definition. A Schematron parser validates an XML document against its schema and DICs. Although a more comprehensive integrity control for XML data is provided, Schematron produces complex XSD schemas that require a proprietary parser for validating XML documents.

In [Benedikt 2002], a language for DIC definition is proposed, based on XPath [XML Path Language 2005] and XQuery [W3C XML Query 2005] expressions. It allows the specification of several kinds of DICs, like allowed values for attributes and elements, referential and cardinality constraints, and constraints based on aggregation functions. In [Bailey et al. 2002], DICs are treated through ECA (Event-Condition-Action) rules that perform automatic actions in response to an event. However, only events related to the insertion or deletion of elements in an XML document are considered. Another drawback is that neither proposal deals with state transition DICs.

The approach presented in [Provost 2002] supports DIC specification through a language based on XSLT (Extensible Stylesheet Language Transformations) [XSL 2005]. The problem is that DIC specifications are usually extensive, and the processing required for DIC validation becomes complex, because XSLT is a language designed to format the presentation of XML data, and not a language designed for the specification of integrity constraints.

Active XQuery extends XQuery with a syntax for trigger definition based on SQL [Bonifati et al. 2002]. In order to guarantee trigger control, a constraint management system is proposed for executing Active XQuery queries and updates, as well as for firing triggers related to specific update operations on XML data. Despite providing a comprehensive trigger syntax and semantics based on SQL (which supports, for example, after and before trigger firing, ordering of trigger executions, and state transition constraints), Active XQuery is dependent on a specific compiler and on a specific and complex engine for processing constraints.

The limitations highlighted in related work and in XML schemas motivated the development of the DIC control for XML documents proposed in this paper. The proposal is presented in the next section.

3. XDC
XDC (XML Domain Constraint control) is an approach for handling DICs on XML documents [Lazzaretti 2005]. Three main objectives were considered in the development of XDC: (i) to focus on the treatment of tuple, state transition and database DICs, because these kinds of constraints are not supported by XML schemas; (ii) to eliminate the drawbacks of related work, through a validation of DICs that is independent of the validation of XML schemas, as well as a constraint control that is not complex and is not tightly coupled to the update engine of an application or XML database; and (iii) to define a DIC definition language for XML documents based on SQL, considering that SQL provides several resources for integrity constraint specification in relational databases, like check clauses and triggers, which deal with all kinds of relational database DICs.

In order to allow the validation of DICs on XML documents, we apply the same approach followed by some related work, i.e., we extend the integrity constraint control existing in XML schemas with a language for defining DICs, called XDCL (XML Domain Constraint Language), and a parser that validates these constraints for a given XML document, called the XDCL parser.

Figure 1 shows a context for the design and validation of XML documents in which XDC may be inserted. This context may be applied to a set of applications that manipulate XML documents or to an XML database. The numbered flows (or arrows) indicate the steps followed for the definition and validation of XML documents.
[Diagram: numbered flows (1 to 19) linking the analysis of data requirements, the analysis of the application domain, the conceptual schema, the domain integrity constraints, the XML schema document, the XDCL specifications (XDC document), the XML documents of the applications, the XML database, the XML instances, the XML parser, the XDCL parser and the validation results.]

Figure 1. A Design and Validation Context for XML Documents.

Flows 1 to 5 are related to the design of XML documents for applications or XML databases. Flow 1 shows that, based on an application domain, an analysis of data requirements is performed, generating a conceptual schema. Flow 2 indicates that an analysis of DICs is also carried out for these data. The schemas of the XML documents are designed from the conceptual schema (flow 3), also taking into consideration the DICs that may be defined in the schema specification (flow 4). The DICs that cannot be defined by XML schemas are treated by XDC: they are specified in XDCL and stored in a document with XML syntax and the extension XDC (flow 5).

An XML document or a native XML database whose data must comply with the DICs of an XDC document must reference it (flows 6 and 7), in the same way that they reference a schema for their XML data (flows 8 and 9). When an application is running, the XDCL parser may be called to validate XDCL constraints (flows 10 and 11) against one or more XML instances that belong to an XML document or an XML database (flows 12 and 13). Once the validation is concluded, the XDCL parser generates a parsing result (flow 14), which may be a notification that the validation succeeded (with or without update of data) or messages reporting inconsistencies in the XML instances.

An XDC document is an XML document that makes reference to an XML schema⁴. Thus, an XML parser may be used to check whether the XDC document is valid and well-formed, presenting the result of this validation (flows 15 to 17). In the same way, the XML parser also validates XML instances against an XML schema (flows 18 and 19).

⁴ XDC documents are written in the XDCL language. See Section 4 for details.

In order to further illustrate the DIC control considered in our approach, Figure 2 shows an XML document for a health care domain. In this domain, we have a system for the management of health care plans, in which XML documents about medical consults are interchanged among associated health care companies, and consistent data are always required. An example of a constraint to be considered in this document is that date_registration cannot occur after date_consult in the same consult. This case characterizes a tuple DIC. Another example is that consults_number must be equal to the sum of the quantity values, in the scope of consults. This case characterizes a database DIC.

The DIC control provided by XDC is entirely dependent on the integrity constraint rules specified in the XDCL language, which is detailed in the next section. The XML document of Figure 2 is used as the basis for exemplifying DICs in XDCL.

<?xml version="1.0" encoding="UTF-8"?>
<dates>
  <code_responsible>124</code_responsible>
  <name_responsible>Prontoclinicas</name_responsible>
  <companies>
    <company>1452</company>
    <date_generation monthgeneration="04">2004/04/29</date_generation>
    <total_payment>54.00</total_payment>
    <consults>
      <consults_number>2</consults_number>
      <consults_total currency="real">54.00</consults_total>
      <consult>
        <authorization>813321</authorization>
        <date_registration monthreg="04">2004/04/14</date_registration>
        <patient>14578</patient>
        <date_consult monthcon="04">2004/04/12</date_consult>
        <quantity>1</quantity>
        <value_consult currency="real">27.00</value_consult>
      </consult>
      <consult>
        <authorization>81341</authorization>
        <date_registration monthreg="04">2004/04/12</date_registration>
        <patient>1245</patient>
        <date_consult monthcon="04">2004/04/20</date_consult>
        <quantity>1</quantity>
        <value_consult currency="real">27.00</value_consult>
      </consult>
    </consults>
  </companies>
</dates>

Figure 2. An Example of an XML Document.
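As a rough illustration of the two example constraints stated for this document (date_registration must not occur after date_consult, and consults_number must equal the sum of the quantity values), the following Python sketch checks them directly with the standard library, independently of XDC. The fragment is a trimmed copy of Figure 2, and the function names are hypothetical; they are not part of XDC or XDCL.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of the Figure 2 document (only the elements needed here).
DOC = """
<dates>
  <companies>
    <consults>
      <consults_number>2</consults_number>
      <consult>
        <date_registration>2004/04/14</date_registration>
        <date_consult>2004/04/12</date_consult>
        <quantity>1</quantity>
      </consult>
      <consult>
        <date_registration>2004/04/12</date_registration>
        <date_consult>2004/04/20</date_consult>
        <quantity>1</quantity>
      </consult>
    </consults>
  </companies>
</dates>
"""


def parse_date(text):
    """Parse the YYYY/MM/DD dates used in Figure 2."""
    return datetime.strptime(text.strip(), "%Y/%m/%d")


def tuple_dic_violations(consults):
    """Return the 1-based positions of consult elements registered after the consult date."""
    bad = []
    for pos, consult in enumerate(consults.findall("consult"), start=1):
        registered = parse_date(consult.findtext("date_registration"))
        consulted = parse_date(consult.findtext("date_consult"))
        if registered > consulted:
            bad.append(pos)
    return bad


def database_dic_holds(consults):
    """Check that consults_number equals the sum of all quantity values."""
    total = sum(int(q.text) for q in consults.findall("consult/quantity"))
    return total == int(consults.findtext("consults_number"))


root = ET.fromstring(DOC)
consults = root.find("companies/consults")
violations = tuple_dic_violations(consults)
count_ok = database_dic_holds(consults)
print(violations, count_ok)
```

The tuple check only compares siblings inside one consult element, while the database check has to combine values from different levels of the hierarchy, which is the distinction the taxonomy draws between the two kinds of DIC.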

4. XDCL
XDCL allows the specification of DICs for XML documents⁵. Its syntax is based on the SQL instructions for integrity constraint definition. In order to simplify the understanding of XDCL, an example of an XDC document is initially presented in Figure 3. This document defines DICs for the XML data described in the XML document of Figure 2.

⁵ The XDCL grammar is shown in the Appendix.

According to Figure 3, the first condition (xdcl_condition element) characterizes a tuple DIC, which tests whether the content of the date_registration element is greater than the content of the date_consult element. The second condition characterizes a database DIC, checking whether the sum of the contents of the quantity elements is different from the content of the consults_number element. The last condition exemplifies a state transition DIC, verifying whether the content of the name_responsible element in the updatable XML document is different from the content of the name_responsible element in the old XML document.

As shown in Figure 3, an XDC document begins with an xdcl_constraints element, which contains one or more xdcl_constraint elements. These elements are responsible for the specification of DICs and are identified by the xdcl_name attribute. An xdcl_constraint element has two sub-elements: xdcl_on and xdcl_statements. The xdcl_on element denotes the element or attribute of the XML document to which the DIC defined in the xdcl_constraint element applies. Its content is an XPath 1.0 expression that indicates the respective element or attribute. The xdcl_on element is necessary only if there are conditions to be verified.
<?xml version="1.0" encoding="UTF-8"?>
<xdcl_constraints xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xsi:noNamespaceSchemaLocation="C:\xdcl\xdcl.xsd">
  <xdcl_constraint xdcl_name="VALIDATE_DOCUMENT">
    <xdcl_on>/dates/companies/consults/*</xdcl_on>
    <xdcl_statements>
      <xdcl_set_conditions>
        <!-- FIRST CONDITION -->
        <xdcl_condition xdcl_operator="&gt;">
          <xdcl_operand1 type_condition="ExpDt">date_registration</xdcl_operand1>
          <xdcl_operand2 type_condition="ExpDt">date_consult</xdcl_operand2>
        </xdcl_condition>
        <AND/>
        <!-- SECOND CONDITION -->
        <xdcl_condition xdcl_operator="&lt;&gt;">
          <xdcl_operand1 type_condition="ExpFuncNr">
            sum(/dates/companies/consults/consult/quantity)
          </xdcl_operand1>
          <xdcl_operand2 type_condition="ElementNr">consults_number</xdcl_operand2>
        </xdcl_condition>
        <AND/>
        <!-- THIRD CONDITION -->
        <xdcl_condition xdcl_operator="&lt;&gt;">
          <xdcl_old>
            <xdcl_old_filexml>\docs\+Old.xml</xdcl_old_filexml>
            <xdcl_old_identifier type_id="ElementNr">code_responsible</xdcl_old_identifier>
          </xdcl_old>
          <xdcl_operand1 type_condition="ElementSt" old="TRUE">name_responsible</xdcl_operand1>
          <xdcl_operand2 type_condition="ElementSt">name_responsible</xdcl_operand2>
        </xdcl_condition>
      </xdcl_set_conditions>
      <xdcl_actions>
        <!-- FIRST ACTION -->
        <xdcl_insert>
          <insert type="Element" name_element="date_registration" type_place="Before">message</insert>
          <insert-value>CHECK DATES AND RESPONSIBLE NAME</insert-value>
        </xdcl_insert>
        <!-- SECOND ACTION -->
        <xdcl_update>
          <update type_value="Function" name_element="consults_number">
            sum(/dates/companies/consults/consult/quantity)
          </update>
        </xdcl_update>
        <!-- THIRD ACTION -->
        <xdcl_message>INCORRECT VALUES!</xdcl_message>
      </xdcl_actions>
    </xdcl_statements>
  </xdcl_constraint>
</xdcl_constraints>

Figure 3. An Example of an XDC Document Specified in XDCL.
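To make the role of xdcl_on more concrete, the sketch below shows one possible way of resolving its path against an XML document in Python. It is only an approximation under stated assumptions: Python's xml.etree.ElementTree implements a small subset of XPath 1.0 and does not accept absolute paths on an element, so the hypothetical helper select_targets strips the root step by hand. Returning no targets when xdcl_on is absent is also an assumption, since that case is not detailed here. Both documents are trimmed versions of Figures 2 and 3.

```python
import xml.etree.ElementTree as ET

# Trimmed XDC document: only the parts needed to locate the target instances.
XDC = """
<xdcl_constraints>
  <xdcl_constraint xdcl_name="VALIDATE_DOCUMENT">
    <xdcl_on>/dates/companies/consults/*</xdcl_on>
    <xdcl_statements/>
  </xdcl_constraint>
</xdcl_constraints>
"""

# Trimmed XML document to be validated.
DOC = """
<dates>
  <companies>
    <consults>
      <consults_number>2</consults_number>
      <consult><quantity>1</quantity></consult>
      <consult><quantity>1</quantity></consult>
    </consults>
  </companies>
</dates>
"""


def select_targets(root, xdcl_on):
    """Resolve a simple absolute path such as /dates/companies/consults/*.

    Only plain child steps and '*' are handled (ElementTree subset).
    """
    steps = xdcl_on.strip().strip("/").split("/")
    if steps[0] != root.tag:
        return []  # the path does not start at this document's root
    relative = "/".join(steps[1:])
    return root.findall(relative) if relative else [root]


def targets_by_constraint(xdc_root, doc_root):
    """Map each xdcl_name to the elements selected by its xdcl_on path."""
    result = {}
    for constraint in xdc_root.findall("xdcl_constraint"):
        name = constraint.get("xdcl_name")
        on = constraint.findtext("xdcl_on")
        # Assumption: a constraint without xdcl_on selects nothing in this sketch.
        result[name] = select_targets(doc_root, on) if on else []
    return result


targets = targets_by_constraint(ET.fromstring(XDC), ET.fromstring(DOC))
tags = [element.tag for element in targets["VALIDATE_DOCUMENT"]]
print(tags)
```

A full XPath 1.0 engine would be needed for predicates, attribute steps or functions such as sum(); this sketch deliberately leaves them out.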

The xdcl_statements element is composed of the xdcl_set_conditions and xdcl_actions sub-elements, which denote, respectively, a predicate to be tested and a set of actions to be executed on XML documents if the predicate is true. This definition is similar to the condition-action part of a trigger definition in SQL. The only difference is that a trigger in SQL is an ECA integrity constraint rule whose event part is a database operation that automatically fires the trigger. In our approach, the invocation of the XDCL parser to validate an XML document corresponds to the event part. We work this way because one of the objectives of XDC is to provide an independent DIC control that may be activated by an application or an XML database when necessary.

4.1 XDCL Conditions

The xdcl_set_conditions element defines a predicate composed of one or more conditions. The set of conditions that comprises a predicate is able to access and compare XML data from any part of an XML document. This is similar to the assertion and check integrity constraint resources in SQL, which are able to relate and compare data from any relation in a relational database. If there is more than one condition, the empty elements <AND/>, <OR/> or <NOT/> are inserted after the </xdcl_condition> tag, providing the appropriate connection of the predicate conditions. In the example of Figure 3, three conditions are defined, and the use of the <AND/> element denotes that all conditions must be true in order to execute the actions.

Each XDCL condition is specified in an xdcl_condition element, which contains the xdcl_operator attribute, the required xdcl_operand1 and xdcl_operand2 sub-elements, and the optional xdcl_old sub-element. The xdcl_operator attribute defines the comparison operator that is used to test the contents of xdcl_operand1 and xdcl_operand2.
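The following Python sketch illustrates how a predicate made of xdcl_condition elements joined by <AND/> or <OR/> could be evaluated for one XML instance. It is not the XDCL parser: the operator table beyond ">" and "<>" (the only operators shown in Figure 3), the left-to-right evaluation order, and the mapping from the type_condition suffixes (Dt, Nr, St) to Python types are all assumptions. <NOT/>, the xdcl_old sub-element and XPath function operands are not handled, and the second condition (quantity versus value_consult) is a made-up example rather than one taken from Figure 3.

```python
import operator
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumed operator table; Figure 3 only shows ">" and "<>".
OPERATORS = {
    "=": operator.eq, "<>": operator.ne,
    ">": operator.gt, "<": operator.lt,
    ">=": operator.ge, "<=": operator.le,
}


def convert(text, type_condition):
    """Convert element content according to the (assumed) type suffix."""
    text = text.strip()
    if type_condition.endswith("Dt"):
        return datetime.strptime(text, "%Y/%m/%d")
    if type_condition.endswith("Nr"):
        return float(text)
    return text


def eval_condition(condition, context):
    """Evaluate one xdcl_condition whose operands name children of context."""
    compare = OPERATORS[condition.get("xdcl_operator")]
    values = []
    for tag in ("xdcl_operand1", "xdcl_operand2"):
        operand = condition.find(tag)
        raw = context.findtext(operand.text.strip())
        values.append(convert(raw, operand.get("type_condition")))
    return compare(values[0], values[1])


def eval_predicate(set_conditions, context):
    """Combine the conditions left to right using the AND/OR elements."""
    result, connective = None, None
    for child in set_conditions:
        if child.tag == "xdcl_condition":
            value = eval_condition(child, context)
            if result is None:
                result = value
            elif connective == "AND":
                result = result and value
            elif connective == "OR":
                result = result or value
        elif child.tag in ("AND", "OR"):
            connective = child.tag
    return bool(result)


CONDITIONS = ET.fromstring("""
<xdcl_set_conditions>
  <xdcl_condition xdcl_operator="&gt;">
    <xdcl_operand1 type_condition="ExpDt">date_registration</xdcl_operand1>
    <xdcl_operand2 type_condition="ExpDt">date_consult</xdcl_operand2>
  </xdcl_condition>
  <AND/>
  <xdcl_condition xdcl_operator="&lt;&gt;">
    <xdcl_operand1 type_condition="ElementNr">quantity</xdcl_operand1>
    <xdcl_operand2 type_condition="ElementNr">value_consult</xdcl_operand2>
  </xdcl_condition>
</xdcl_set_conditions>
""")

LATE = ET.fromstring(
    "<consult><date_registration>2004/04/14</date_registration>"
    "<date_consult>2004/04/12</date_consult>"
    "<quantity>1</quantity><value_consult>27.00</value_consult></consult>"
)
ON_TIME = ET.fromstring(
    "<consult><date_registration>2004/04/12</date_registration>"
    "<date_consult>2004/04/20</date_consult>"
    "<quantity>1</quantity><value_consult>27.00</value_consult></consult>"
)

late_result = eval_predicate(CONDITIONS, LATE)
on_time_result = eval_predicate(CONDITIONS, ON_TIME)
print(late_result, on_time_result)
```

Note that the XML parser already decodes the &gt; and &lt; entities, so the operator table is keyed by the plain characters.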
The xdcl_operand1 and xdcl_operand2 elements have the same attributes, and their contents may be the name of an element⁶ or a numeric XPath 1.0 function. The type_condition attribute defines the content type of the respective element. A content type can be a date, a number, a text or an XPath 1.0 expression. If one of the operands is an attribute of the element declared in xdcl_operand1 or xdcl_operand2, its name is indicated in the name_attr attribute.

⁶ Only elements that are descendants of the element to be verified by the DIC are permitted.

xdcl_operand1 and xdcl_operand2 may also hold the old attribute. If used, this attribute acts as the old clause in an SQL trigger, denoting that the old value of an updated data item must be verified in the rule condition. In the context of XML document validation, this condition is considered only if the element or attribute to be checked exists in the old XML document (the XML document before the data update). The use of this attribute characterizes a state transition DIC.

When a state transition DIC is defined, besides the declaration of the old attribute in xdcl_operand1 or xdcl_operand2, the xdcl_old element must be specified as a sub-element of xdcl_condition. This element holds the xdcl_old_filexml and xdcl_old_identifier sub-elements, which indicate, respectively, the location plus the pattern adopted for the names of old XML documents⁷, and an XPath 1.0 expression for the element or attribute that identifies the XML instance in the old XML document. In the third condition of the XDC document of Figure 3, the content of xdcl_old_filexml is \docs\+Old.xml, denoting that old XML documents are kept in the docs directory and that the names of the old documents are the concatenation of the original name of the XML document with the string Old.xml. Besides, xdcl_old_identifier says that the code_responsible element identifies the XML instance, allowing a correct comparison between the old and new values of the instance.

⁷ In the current version of XDC, a path in a DOS file system is supported.

We establish, with such treatment of state transition DICs, that: (i) all old XML document names follow the same pattern; (ii) the creation and maintenance of old XML documents is out of the scope of XDC, being managed by the application or XML database; and (iii) every XML instance has an identifier, i.e., we assume that XML instances usually represent conventional database data or file records, which always hold an identification like a primary key or an object identity.

4.2 XDCL Actions

The xdcl_actions element allows the specification of several actions to be executed against XML instances in order to guarantee data integrity. XDCL provides the usual insert, update and delete actions supported by SQL triggers, applied to XML elements and attributes, and, in addition, the possibility of renaming elements and attributes, as well as showing a validation message to the user⁸.

In the XDC document of Figure 3, three actions are performed. The first one inserts a message element before the date_registration element, with the content described in the insert-value element. The second one updates the content of the consults_number element with the result of the XPath expression described in the update element. Each update is performed in the context of a consults element (specified in the xdcl_on element). Finally, the last action shows a message, which is described in the content of the xdcl_message element.

Each kind of action has a representative element in XDCL, as exemplified above for the last action. The xdcl_delete element is used to remove elements or attributes from an XML document. Its content specifies the name of the element to be removed. If the name_attr attribute is declared, the named attribute of the specified element is removed.
The xdcl_insert element is composed of the insert sub-element, which may hold, in turn, the insert-value sub-element. The insert element encloses the name of the element or attribute that has to be inserted. Three attributes provide detailed information about the data to be inserted and its correct position in the XML document:

type: indicates whether an element or an attribute will be inserted;

name_element: indicates the name of the reference element, i.e., the element that is used as the basis for properly positioning the new element or attribute. It also denotes the element that will hold the new attribute, if an attribute is inserted;

type_place: it is declared only if a new element is inserted. It indicates whether the new element must be inserted after the last sub-element of the reference element (append), or before the reference element (before).

The content of the new element or attribute, if it exists, is defined in the insert-value sub-element.

The xdcl_update element defines an update action. The update sub-element holds the new content of the element or attribute to be updated. This content may be a constant or the result of a numeric XPath 1.0 function. These types of content (constant or function) are specified in the type_value attribute in order to instruct the XDCL parser to directly store the content or to execute the defined function. If the update action has to be executed on an element, its name is specified in the name_element attribute of the update element. Otherwise, the name_attr attribute holds the name of the attribute to be updated.

The xdcl_rename element allows the renaming of an element or attribute. The content of the rename sub-element indicates the new name of the element or attribute. The mutually exclusive attributes name_element and name_attr define, respectively, the name of the element and the name of the attribute that will be modified.

⁸ Currently, the XDCL parser is only able to output messages in an operating system terminal.
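The element-level actions described in this section can be approximated in a few lines of Python. The sketch below is an illustration only, with hypothetical helper names, and covers elements but not attributes. It follows the two positions described for type_place (before the reference element, or appended as the last sub-element of the reference element). The sum of the quantity values is computed in Python because xml.etree.ElementTree does not evaluate XPath functions.

```python
import xml.etree.ElementTree as ET


def insert_element(context, reference_name, new_name, value, place):
    """Insert <new_name>value</new_name> relative to the reference child.

    place == "before": new sibling placed right before the reference element.
    place == "append": new last sub-element of the reference element.
    """
    for index, child in enumerate(context):
        if child.tag == reference_name:
            new = ET.Element(new_name)
            new.text = value
            if place == "before":
                context.insert(index, new)
            else:
                child.append(new)
            return new
    return None  # reference element not found


def update_element(context, name_element, value):
    """Replace the content of the first child called name_element."""
    target = context.find(name_element)
    if target is not None:
        target.text = str(value)
    return target


def rename_element(context, name_element, new_name):
    """Rename the first child called name_element."""
    target = context.find(name_element)
    if target is not None:
        target.tag = new_name
    return target


def delete_element(context, name_element):
    """Remove every direct child called name_element; return how many were removed."""
    children = context.findall(name_element)
    for child in children:
        context.remove(child)
    return len(children)


consults = ET.fromstring(
    "<consults><consults_number>5</consults_number>"
    "<consult><date_registration>2004/04/14</date_registration>"
    "<quantity>1</quantity></consult>"
    "<consult><date_registration>2004/04/12</date_registration>"
    "<quantity>1</quantity></consult>"
    "</consults>"
)
first = consults.find("consult")

# Insert action: a message element before date_registration.
insert_element(first, "date_registration", "message",
               "CHECK DATES AND RESPONSIBLE NAME", "before")

# Update action: consults_number receives the sum of the quantity values.
total = sum(int(q.text) for q in consults.findall("consult/quantity"))
update_element(consults, "consults_number", total)

first_tags = [child.tag for child in first]
new_number = consults.findtext("consults_number")
print(first_tags, new_number)
```

Because an ElementTree element has no pointer to its parent, every helper receives the context element explicitly, which matches the idea that actions run in the context selected by xdcl_on.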

5. XDCL Parser
The XDCL parser is the program that controls the consistency of data in an XML document with respect to the DICs specified in an XDC document [Lazzaretti 2005]. It basically analyzes XDCL conditions and executes XDCL actions to correct and/or indicate inconsistent XML data. The parsing activity may be summarized in an algorithm that executes the following steps:

1. Read the XML document d_XML and the XDC document d_XDC that specifies the DICs to be checked on d_XML;
2. For each XDCL constraint c_XDC in d_XDC do
   a. Fetch the XML instances of d_XML that must be checked by c_XDC into a set S_XML;
   b. If there is a set of conditions S_C then
      i. Fetch S_C;
      ii. Fetch the set of actions S_A;
      iii. For each i_XML ∈ S_XML do
           If S_C is TRUE for i_XML then
              Execute each action a_i ∈ S_A on i_XML;
      Else
      i. Fetch the set of actions S_A;
      ii. For each i_XML ∈ S_XML do
          Execute each a_i ∈ S_A on i_XML.

Two APIs for XML are used in the coding of the XDCL parser: SAX [SAX 2005] and DOM [W3C DOM 2005]. SAX is used to read and process an XDC document. SAX methods are applied to verify the existence of elements and attributes, to check for and fetch their contents, to verify conditions, and to execute actions. We use SAX for these tasks, instead of DOM, because DOM brings into memory all the fragments of an XML document that are read. This is not efficient, especially for large XML documents. DOM is used to search for the XML instances specified through XPath expressions in the xdcl_on element of XDC documents. In this case, DOM methods are more adequate because they allow fetching only the fragment of the XML document, corresponding to the XML instance in the XPath expression, that must be processed by the XDCL parser.
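The control flow of step 2 of the algorithm can be sketched in Python as follows. This is a simplified model rather than the actual parser: the hypothetical Constraint class stands in for a parsed xdcl_constraint, with plain Python callables playing the roles of the xdcl_on selection, the set of conditions S_C and the set of actions S_A. The whole document is loaded in memory with xml.etree.ElementTree instead of combining SAX and DOM.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Constraint:
    """Hypothetical in-memory form of one xdcl_constraint."""
    name: str
    select: Callable[[ET.Element], List[ET.Element]]         # role of xdcl_on
    predicate: Optional[Callable[[ET.Element], bool]] = None  # role of S_C
    actions: List[Callable[[ET.Element], None]] = field(default_factory=list)  # S_A


def validate(document, constraints):
    """Run every constraint; return, per constraint, how many instances triggered the actions."""
    fired = {}
    for constraint in constraints:                 # step 2
        instances = constraint.select(document)    # step 2.a
        count = 0
        for instance in instances:
            # Step 2.b: without conditions, the actions always run.
            if constraint.predicate is None or constraint.predicate(instance):
                for action in constraint.actions:
                    action(instance)
                count += 1
        fired[constraint.name] = count
    return fired


DOC = ET.fromstring(
    "<consults>"
    "<consult><date_registration>2004/04/14</date_registration>"
    "<date_consult>2004/04/12</date_consult></consult>"
    "<consult><date_registration>2004/04/12</date_registration>"
    "<date_consult>2004/04/20</date_consult></consult>"
    "</consults>"
)


def registered_late(consult):
    # Dates in the fixed YYYY/MM/DD format compare correctly as strings.
    return consult.findtext("date_registration") > consult.findtext("date_consult")


def add_message(consult):
    message = ET.Element("message")
    message.text = "CHECK DATES"
    consult.insert(0, message)


constraints = [
    Constraint("CHECK_DATES", lambda doc: doc.findall("consult"),
               registered_late, [add_message]),
]
fired = validate(DOC, constraints)
flagged = [c.find("message") is not None for c in DOC.findall("consult")]
print(fired, flagged)
```

Keeping selection, predicate and actions as separate callables mirrors the separation between xdcl_on, xdcl_set_conditions and xdcl_actions, and makes the unconditional branch of step 2.b a simple special case.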

6. Conclusion
This paper presents an integrity constraint control for XML documents called XDC. XDC is a contribution to the problem of managing XML data because it imposes a more rigorous DIC control on XML data. Of course, there are other kinds of integrity constraints that are handled by relational databases, like key and referential constraints, and some works in the literature concentrate their efforts on the management of these constraints for XML data [Buneman 2001, Chen et al. 2002, Deutsch and Tannen 2001, Fan and Simon 2000]. However, we decided to focus on DIC control at this time because it is an important control for applications, which need data that are consistent with the domain abstraction they work on. Besides, related work and XML schemas present several drawbacks in the management of DICs, as discussed before.

Some observations are important to demonstrate the strengths of XDC. First of all, we see in Table 1 that XML schemas are able to specify only type and attribute DICs. On the other hand, XDC supports the other kinds of constraints in the DIC taxonomy: tuple, database and state transition. Therefore, the combined use of XDC and XML schemas provides a sufficient infrastructure to handle all kinds of DICs in XML documents.

Other advantages of XDC may be highlighted in a comparison with related work. XDC provides a completely independent DIC control, i.e., it does not extend an XML schema recommendation with integrity constraint instructions and does not impose additional programming code on XML parsers or XML management systems in order to handle DICs. It is only necessary to invoke a specific parser for DIC validation based on previous rule definitions. Rules for DICs are specified in a simple and short language called XDCL. XDCL is based on SQL, which is a reference language for integrity constraint definition in relational databases. Similar to XDCL is Active XQuery, which also adopts an SQL-like syntax for incorporating trigger instructions in XQuery. However, such support required an extension of the XQuery compiler as well as an additional and non-trivial trigger control in its XQuery engine.

We had developed a first version of XDC at the time of this paper. Therefore, several pieces of future work remain. A completeness and correctness analysis of XDCL must be performed in order to eliminate some drawbacks and provide a more efficient integrity control. One point to be considered is how to prevent an XDCL action from firing and violating other DICs controlled by XDC, or even schema constraints imposed by DTDs or XSDs. With respect to completeness, XDCL currently does not have the same expressive power as SQL, because it does not support some resources like after and before triggering, and arithmetic expressions in conditions and actions. We intend to extend XDCL to reach at least the expressive power of SQL.

We also plan to support XUpdate [XUpdate 2005] actions in XDC, considering the increasing use of XUpdate for updating XML documents. Due to validation problems with existing XUpdate parsers, it was not considered in this version of XDC, although the XDC syntax for actions is strongly based on XUpdate. Other relevant future work comprises the control of other kinds of integrity constraints, like the ones pointed out before, and the development of a tool for creating and validating XDC documents.

References
ABITEBOUL, S.; BUNEMAN, P.; SUCIU, D. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann, 2000.

BAILEY, J.; PAPAMARKOS, G.; POULOVASSILIS, A.; WOOD, P. An Event-Condition-Action Language for XML. Web Dynamics - Adapting to Change in Content, Size, Topology and Use, Springer-Verlag, 2004. p.223-248.

BENEDIKT, M.; BRUNS, G.; GIBSON, J.; KUSS, R.; AMY, N. Automated Update Management for XML Integrity Constraints. In: Workshop on Programming Languages for XML (PLAN-X), 2002.

BONIFATI, A.; BRAGA, D.; CAMPI, A.; CERI, S. Active XQuery. In: 18th International Conference on Data Engineering, San Jose, CA, USA, 2002 (ICDE2002). p.403-412.

BUNEMAN, P.; DAVIDSON, S.; FAN, W.; HARA, C.; TAN, W. Keys for XML. In: 10th International Conference on World Wide Web, Hong Kong, Hong Kong, 2001. p.201-210.

CHAUDHRI, A.; RACHID, A.; ZICARI, R. XML Data Management. Addison-Wesley, 1st Ed. 2003.

CHEN, Y.; DAVIDSON, S.; ZHENG, Y. Validating Constraints in XML. Technical Report MS-CIS-02-03, Department of Computer and Information Science, University of Pennsylvania, 2002.

DATE, C. J. An Introduction to Database Systems. Addison-Wesley, 8th Ed. 2003.

DEUTSCH, A.; TANNEN, V. Containment and Integrity Constraints for XPath. In: 8th International Workshop on Knowledge Representation meets Databases, Rome, Italy, 2001 (KRDB 2001).

ELMASRI, R.; NAVATHE, S. B. Fundamentals of Database Systems. Addison-Wesley, 4th Ed. 2003.

Extensible Stylesheet Language Family (XSL). Available at: http://www.w3c.org/Style/XSL, April 2005.

FAN, W.; SIMON, J. Integrity Constraints for XML. In: Symposium on Principles of Database Systems, Dallas, Texas, USA, 2000 (PODS2000). p.23-34.

LAZZARETTI, A. T. XDC: Uma Proposta de Controle de Restrições de Integridade de Domínio em Documentos XML. Master's Thesis, PPGCC - UFSC, 2005 (in Portuguese).

OGBUJI, C. Validating XML with Schematron. Available at: http://www.xml.com/pub/a/2000/11/22/schematron.html, April 2005.

PROVOST, W. Beyond W3C XML Schema. Available at: http://www.xml.com/pub/a/2002/04/10/beyondwxs.html, April 2005.

SAX. Available at: http://www.saxproject.org/, July 2005.

SILBERSCHATZ, A.; KORTH, H. F.; SUDARSHAN, S. Database System Concepts. McGraw-Hill, 4th Ed. 2003.

W3C DOM - Document Object Model. Available at: http://www.w3c.org/DOM, July 2005.

W3C XML Query. Available at: http://www.w3c.org/xml/Query, July 2005.

W3C XML Schema. Available at: http://www.w3.org/XML/Schema, July 2005.

XML - eXtensible Markup Language. Available at: http://www.w3c.org/xml, July 2005.

XML Path Language. Available at: http://www.w3.org/TR/XPath, July 2005.

XML:DB Initiative: XUpdate - XML Update Language. Available at: http://xmldborg.sourceforge.net/xupdate/, April 2005.

Appendix XDCL Grammar


<xdcl_constraints>
  {<xdcl_constraint xdcl_name={name_constraint}>
    [<xdcl_on>{XPath expression}</xdcl_on>]
    {<xdcl_statements>
      [{<xdcl_set_conditions>
        {<xdcl_condition xdcl_operator={= | >= | <= | > | < | <>}>
          [<xdcl_old>
            <xdcl_old_filexml>{path to old XML document}</xdcl_old_filexml>
            <xdcl_old_identifier type_id={ElementDt|ElementNr|ElementSt} [attr_identifier={attribute}]>
              {element}
            </xdcl_old_identifier>
          </xdcl_old>]
          {<xdcl_operand1 type_condition={ElementDt|ElementNr|ElementSt|ExpDt|ExpNr|ExpSt|ExpFuncNr} [name_attr={attribute}] [old={true}]>
            {expression XPath | element}
          </xdcl_operand1>
          {<xdcl_operand2 type_condition={ElementDt|ElementNr|ElementSt|ExpDt|ExpNr|ExpSt|ExpFuncNr} [name_attr={attribute}] [old={true}]>
            {expression XPath | element}
          </xdcl_operand2>
        </xdcl_condition>
        [{<AND/> | <OR/> | <NOT/>}][n]}[...n]
      </xdcl_set_conditions>}]
      {<xdcl_actions>
        {[<xdcl_delete>
            <delete [name_attr={name_attribute}]>{name delete}</delete>
          </xdcl_delete>
        | <xdcl_insert>
            {<insert type={Element | Attribute} name_element={element_father} [type_place={Append | Before}]>
              {name insert}
            </insert>
            [<insert-value>{value insert}</insert-value>]}
          </xdcl_insert>
        | <xdcl_update>
            <update name_element={name element} [name_attr={name attribute}] type_value={Constant | Function}>
              {function XPath | constant}
            </update>
          </xdcl_update>
        | <xdcl_rename>
            <rename name_element={name_element} [name_attr={name attribute}]>{new name}</rename>
          </xdcl_rename>
        | <xdcl_message>{message}</xdcl_message>]}[1...n]
      </xdcl_actions>}
    </xdcl_statements>}
  </xdcl_constraint>}[...n]
</xdcl_constraints>
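
As an illustration of how the grammar above may be instantiated, the following Python sketch embeds a small XDCL document and evaluates it against a sample XML document. It is not the XDC parser described in this paper. The element names employee, salary and max_salary, the constraint name, and the function check are hypothetical. The sketch assumes that ElementNr denotes an element with numeric content, that operand text names a child element of each node selected by xdcl_on, and that actions fire when the condition holds. Only a single condition and the xdcl_message action are handled, and xdcl_on is limited to the XPath subset supported by Python's xml.etree.ElementTree.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical XDCL document written against the appendix grammar:
# a tuple constraint comparing two elements of the same employee.
XDCL = """
<xdcl_constraints>
  <xdcl_constraint xdcl_name="salary_within_limit">
    <xdcl_on>.//employee</xdcl_on>
    <xdcl_statements>
      <xdcl_set_conditions>
        <xdcl_condition xdcl_operator="&gt;">
          <xdcl_operand1 type_condition="ElementNr">salary</xdcl_operand1>
          <xdcl_operand2 type_condition="ElementNr">max_salary</xdcl_operand2>
        </xdcl_condition>
      </xdcl_set_conditions>
      <xdcl_actions>
        <xdcl_message>salary exceeds max_salary</xdcl_message>
      </xdcl_actions>
    </xdcl_statements>
  </xdcl_constraint>
</xdcl_constraints>
"""

# Hypothetical XML document to be validated.
DATA = """
<company>
  <employee><name>Ana</name><salary>900</salary><max_salary>1000</max_salary></employee>
  <employee><name>Rui</name><salary>1500</salary><max_salary>1200</max_salary></employee>
</company>
"""

# The six comparison operators allowed in xdcl_operator.
OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt, "<>": operator.ne}


def check(xdcl_text, data_text):
    """Return one (constraint name, message) pair per node whose condition holds.

    Assumption: a condition that evaluates to true fires the actions.
    """
    rules = ET.fromstring(xdcl_text)
    data = ET.fromstring(data_text)
    fired = []
    for constraint in rules.findall("xdcl_constraint"):
        name = constraint.get("xdcl_name")
        on = constraint.findtext("xdcl_on", default=".").strip()
        cond = constraint.find("xdcl_statements/xdcl_set_conditions/xdcl_condition")
        compare = OPS[cond.get("xdcl_operator")]
        left = cond.findtext("xdcl_operand1").strip()
        right = cond.findtext("xdcl_operand2").strip()
        message = constraint.findtext("xdcl_statements/xdcl_actions/xdcl_message")
        for node in data.findall(on):
            # ElementNr operands are read as numeric content of child elements.
            if compare(float(node.findtext(left)), float(node.findtext(right))):
                fired.append((name, message.strip()))
    return fired


if __name__ == "__main__":
    for name, message in check(XDCL, DATA):
        print(name + ": " + message)
```

In this sample only the second employee satisfies the condition, so the message action is reported once. Note that the > operator is written as &gt; in the attribute value; the < and <> operators must be escaped in the same way for the XDCL document to remain well-formed XML.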
