
Iran University of Science and Technology

Fast Query Answering
XML data sources approach

Amir Saman Memaripour


12/19/2010

Z-Label RDBMS
XQuery
XPath
(XPathMark dataset)
XQuery
(XML-aware) XML
Full-text
(D-nodes)
(E-nodes)


XML
.

.
.

.
XML . XML

.
.

Z-Label RDBMS

Z-Label
XML .
XQuery
index . labeling 1
XQuery
RDBMS .
.

n XML S(n)
.
tag 2 3 :
/season/league/player

Q [[Q]] .
4 n XML S(n) Z n .
Z .
Q SZ :
[[Q]] = {n | S(n) LIKE SZ}

Axis
Child
3
Descendant
4
Label
2

Z
Z . LIKE
CONCAT NOT OR AND SQL .

1: // → %.
2: / → 0.
3: XML tag (TagID) → TagID.
4: //ti → %.i.
5: //ti/tj → %.i.j.
6: //ti//tj → %.i.%.j.
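The six translation rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the method of the cited work: the tag-to-ID table `TAG_IDS` is hypothetical, rule 2 is read as applying to the leading `/` only, and the in-memory SQLite table `node` merely stands in for whatever RDBMS schema holds the labels.

```python
import re
import sqlite3

# Hypothetical tag-to-ID table; the text only states that every tag has a TagID.
TAG_IDS = {"season": 1, "league": 2, "player": 3}

def xpath_to_zlabel(xpath, tag_ids=TAG_IDS):
    """Translate a predicate-free XPath into a LIKE pattern (rules 1-6)."""
    tokens = re.findall(r"//|/|[^/]+", xpath)
    parts = []
    for pos, tok in enumerate(tokens):
        if tok == "//":
            parts.append("%.")               # rule 1: // becomes %.
        elif tok == "/":
            if pos == 0:
                parts.append("0.")           # rule 2 (assumed: leading / only)
            # an inner / adds nothing, as in rule 5
        else:
            parts.append(f"{tag_ids[tok]}.") # rule 3: tag becomes TagID.
    return "".join(parts)

def matching_labels(pattern, labels):
    """Return the stored labels that satisfy `label LIKE pattern`."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE node (label TEXT)")
    con.executemany("INSERT INTO node VALUES (?)", [(l,) for l in labels])
    rows = con.execute(
        "SELECT label FROM node WHERE label LIKE ? ORDER BY rowid", (pattern,)
    )
    return [r[0] for r in rows]

print(xpath_to_zlabel("/season/league/player"))
print(matching_labels(xpath_to_zlabel("//league/player"),
                      ["0.1.2.3.", "0.1.2.", "0.1.4.3."]))
```

The point of the design is that query evaluation reduces to a single string comparison that an ordinary SQL engine can execute.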
.
Z P Q
. :

. % Q Q P

) P Q( ZP = ZQ.
-

. ZQ ZQ .%.

Q P ZP LIKE ZQ .
-

i .%. ZQ .%.
Z :
-1 .%. null

-2 .%.
Q 2i Z .
Q P :
) (
OR . i=2

:

. Z Q P Q true

Z Update :
-1 tag ) (tagID Z

tag .
-2 tag Z
.

Bohme .
.

01.00/01.01 01.00/01 01.02.01/01


. / ASCII .

.
.
XML .


predicate XPath . predicate
XPath .
predicate . /a[b]/c
/a/b /a /a/c . join
SQL RDBMS .

Indexing schema


.
.


Z-Label XML
XPath . ) ( XML
.
XML
.


XQuery

XPath XML . XQuery XSLT

XPath . heuristic
XPath .
XPath 2.0 XPath
. XPath ) (
XPath 350
.

XPath XML . XPath


standalone W3C XSLT W3C XQuery
. XPath
6 . XPath
.
XPath
XPath
intersect except .
XPath 2.0
. .
.
XPath .
XPath 2.0 XPath .
XPath
. XPath 2.0

Redundant constructs


caching . Q
Q cache . cache
.
XPath
XPath . ({}) .
XP1 XP2 XPath XP1 INTERSECT XP2 = {}
XP1 EXCEPT XP2 = {} XP1 XP2 7
.
XML data source c XPath XC

C
Q . XC INTERSECT Q = {} .
XC INTERSECT Q = {} Q
C remote
.
) X(Q XPath XML Q

Q cache
XPath C . X(Q) C

.
:
-

XPath

8 XPath E E-1 . E-1


context E .

XPath 2.0 XPath


XPath

7 Logical satisfiability test
8 Reverse pattern

XPath 2.0 XPath


XPath


XPath 9
10 .
.

XPath
.
XPath
.
.


E XPath .
E E-1 $d

)] ($d[E-1 $d
E . XPath XPath
.
) 1 :(XPath XPath
.
.
child descendant .
11 descendant-or-self .
attribute namespace

Reverse axis
Location step
11
Current context node
10


. ancestor-or-self ancestor parent


) (.

XPath

child attribute namespace parent .



XPath .
) 2 ( : XPath
.
) 3 ( : XPath :.

XPath .
4 5 . XPath )"|"(
. 12
.
.
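Relative XPath expression (Erelative):
axis1::test1 [F1,1] ... [F1,n1] / axis2::test2 [F2,1] ... [F2,n2] / ... / axism::testm [Fm,1] ... [Fm,nm]

Absolute XPath expression (Eabsolute):
/ axis1::test1 [F1,1] ... [F1,n1] / axis2::test2 [F2,1] ... [F2,n2] / ... / axism::testm [Fm,1] ... [Fm,nm]

Here each axisi is an XPath axis, each testi is a node test, and each Fi,j is a filter (predicate) expression.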


:
self::testm [Fm,1] ... [Fm,nm] Tm / (raxism,1::testm-1 | ... | raxism,pm::testm-1) [Fm-1,1] ... [Fm-1,nm-1] Tm-1 / ... / (raxis2,1::test1 | ... | raxis2,p2::test1) [F1,1] ... [F1,n1] T1 / (raxis1,1::node() | ... | raxis1,p1::node()) Troot

Troot is "[self::node() is root()]" for an absolute expression and "[self::node() is $c]" for a relative expression, where $c is the context node.


context raxis i 1 raxis i pi axis i Ti axis i
. i Ti .
child::object
→ self::object [self instance of element()*] / parent::node() [self::node() is $c]

/
→ self::node() [self::node() is root()]


/child::contains / child::object [position() = 1] | ancestor::object [attribute::name = cockpit]

→ self::object [position() = 1] [self instance of element()*] / parent::contains [self instance of element()*] / parent::node() [self::node() is root()] | self::object [attribute::name = cockpit] [self instance of element()*] / (descendant::node() | descendant-or-self::node() / attribute::node() | descendant-or-self::node() / namespace::node()) [self::node() is $c]
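The reverse-pattern construction can be illustrated with a small Python sketch. It is deliberately restricted to absolute paths whose steps all use the child axis, which is the only case whose output can be checked against the examples in this text; the function names and the (test, filters) step representation are hypothetical, and the general definition with several reverse axes per step is not covered.

```python
def reverse_pattern_child_only(steps):
    """Build the reverse pattern of an absolute, child-axis-only path.

    `steps` is a list of (node_test, [filter, ...]) pairs, outermost first,
    e.g. [("a", []), ("b", ["child::c"])] for /child::a/child::b[child::c].
    """
    parts = []
    last = len(steps) - 1
    for i in range(last, -1, -1):
        test, filters = steps[i]
        axis = "self" if i == last else "parent"   # parent is the reverse of child
        filt = "".join(f" [{f}]" for f in filters)
        parts.append(f"{axis}::{test}{filt} [self instance of element()*]")
    parts.append("parent::node() [self::node() is root()]")  # Troot, absolute case
    return " / ".join(parts)

def eliminate_intersect(xp1, xp2_steps):
    """Rewrite 'XP1 intersect XP2' as XP1[XP2-1]."""
    return f"{xp1} [{reverse_pattern_child_only(xp2_steps)}]"

def eliminate_except(xp1, xp2_steps):
    """Rewrite 'XP1 except XP2' as XP1[not(XP2-1)]."""
    return f"{xp1} [not({reverse_pattern_child_only(xp2_steps)})]"

print(reverse_pattern_child_only([("contains", []), ("object", ["position() = 1"])]))
print(eliminate_intersect("/child::a /child::b", [("a", []), ("b", ["child::c"])]))
```

The sketch only produces the unsimplified rewriting; shortening the result is a separate step.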

12 Disjunctive operator

:1 XPath E $d ] $d [E-1 E-1

E .
:
.


XPath 2.0 XPath
XP1 XP2 " "XP1 intersect XP2
XP1 XP2 . XPath
.
) 4 :(XPath XP1 XP2 XP1 XP2
XML context .
.
Proposition 2: "XP1 intersect XP2" is equivalent to XP1[XP2-1], where XP1 and XP2 are XPath expressions and XP2-1 is the reverse pattern of XP2.
: XP1 .
XP2 XP2 .
XP2-1 XP1 )] .(XP1[XP2-1 "XP1
" intersect XP2 "XP1 intersect XP2

"] XP1[XP2-1 .


XPath 2.0 . XP1 XP2
XPath " "XP1 except XP2 XP1 XP2
.

Proposition 3: "XP1 except XP2" is equivalent to XP1[not(XP2-1)], where XP1 and XP2 are XPath expressions and XP2-1 is the reverse pattern of XP2.
: XP1 .
XP2 XP1 ]) XP1[not(XP2-1
XP1 XP2 . " "XP1 except XP2
XP1 XP2 :
XP1 except XP2 ≡ XP1[not(XP2-1)]

XPath :4 13 . .
:1 XPath XML .
Gall = /descendant-or-self::node() | /descendant-or-self::node() /attribute::node() | /descendant-or-self::node() /namespace::node()
Proof: every node of an XML document except the attribute and namespace nodes is selected by /descendant-or-self::node(). The attribute nodes are selected by /descendant-or-self::node() /attribute::node() and the namespace nodes by /descendant-or-self::node() /namespace::node().


XML . XML

.
Complement of Q: "Gall except Q", which is equivalent to "Gall [not(Q-1)]".
XPath . XPath 1.0
(context) . [self::node() is root()]

[self::node() is $c] context .
.
extension XPath
.

13 Complementation


] XP1[XP2-1 ]) XP1[not(XP2-1 XPath
XP1 XP2-1 ) not(XP2-1
. XPath .
:
-

XP1 .

XML

heuristic :
-

XPath XPath
.

)( not .

14 self
.

XPath

XPath .
XPath .
XPath
.

14 Location step

XPath


more restrictive .
:(more restrictive) 5

15

t1 t2 t1

<< t2 .
t1 t2 . self::t1 context self::t2 context
context .
name test :(more restrictive) 5 * attribute() * * )({text(), comment(), element

})( document-node(), processing-instruction )( element

{name, *, element(), attribute(), text(),

})( comment(), document-node(), processing-instruction )( node .


: .

15 Node test

t1 })( t2 {element(), attribute })( t1 {element(), attribute t2


t2

t1 . ) not(t1 << t2 << not(t2

) t1 t1 t2 .
XPath .
.
. XPath
.
.
.
XPath
.
p3 p2 p1 p p4 t1 t2 a1 a2 .
F )( ) (predicate .
] P[P1 P )(
) (/ P1 XPath . P[P1] P

] self::node() [P1 .

) .(P1 / / P2
) P | P .( | P P
self .


not .


OR AND .

)( not .

.

.


:( )2
Let XP1 = /child::a /child::b, let XP2 = /child::a /child::b [child::c]. Then XP1 intersect XP2 is equivalent to /child::a /child::b [self::b [child::c] [self instance of element()*] /parent::a [self instance of element()*] /parent::node() [self::node() is root()]]

4
.
/child::a /child::b [child::c]

:( )3
Let XP1 = /child::node() /self::a /child::node() /self::b, let XP2 = /descendant-or-self::c /ancestor-or-self::b. Then XP1 intersect XP2 is equivalent to /child::node() /self::a /child::node() /self::b [self::b /descendant-or-self::c /ancestor-or-self::node() [self::node() is root()]]

4
.
/child::a /child::b [descendant::c]

:( )4
Let XP1 = /child::a /child::b, let XP2 = /child::a /child::b [child::c]. Then XP1 except XP2 is equivalent to /child::a /child::b [not(self::b [child::c] [self instance of element()*] /parent::a [self instance of element()*] /parent::node() [self::node() is root()])]

5
.
/child::a /child::b [not(child::c)]

:( )5
Let XP1 = /child::node() /self::a /child::node() /self::b, let XP2 = /descendant-or-self::c /ancestor-or-self::b. Then XP1 except XP2 is equivalent to /child::node() /self::a /child::node() /self::b [not(self::b /descendant-or-self::c /ancestor-or-self::node() [self::node() is root()])]

5
.
/child::a /child::b [not(descendant::c)]
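The equivalences claimed in the examples above can be checked on a toy document. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a small XPath subset, so the `intersect` and `except` operators are emulated with node-identity sets and `not(child::c)` with a Python filter. The sample document and its `id` attributes are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the id attributes only make results readable.
DOC = "<a><b id='1'><c/></b><b id='2'/><b id='3'><d><c/></d></b></a>"
root = ET.fromstring(DOC)  # the root element is <a>

def ids(nodes):
    return [n.get("id") for n in nodes]

xp1 = root.findall("b")      # /child::a /child::b
xp2 = root.findall("b[c]")   # /child::a /child::b [child::c]

# Emulate the XPath 2.0 set operators by node identity.
xp2_keys = {id(n) for n in xp2}
intersect_result = [n for n in xp1 if id(n) in xp2_keys]
except_result = [n for n in xp1 if id(n) not in xp2_keys]

# Simplified forms without set operators.
simplified_intersect = root.findall("b[c]")                                # [child::c]
simplified_except = [n for n in root.findall("b") if n.find("c") is None]  # [not(child::c)]

print(ids(intersect_result), ids(simplified_intersect))
print(ids(except_result), ids(simplified_except))
```

On this document both pairs select the same nodes, which is what the rewriting relies on.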


.
16
.
XPathMark Benchmark XPath


1/7 Intel Pentium 1 RAM
Windows XP Java VM 1.4.2 Saxon 8.0 Qizx 0.4
.



. XML <a> <b>
. <b> <c> <d> . XML
<b>
<c> <d> .

. 2 4 0.88
16 Dataset

1.105 . XPath
.
(XP1 intersect XP2) 3 Qizx

. .

XP1 intersect XP2 3 Qizx

3 Qizx


3 Qizx

(XP1 intersect XP2) 3


Saxon .

XP1 intersect XP2 3 Saxon


3 Saxon

3 Saxon

5 Qizx Saxon
. Saxon XML .
93 .
Qizx 30%
.


5 Qizx

5 Saxon


(XPathMark dataset)
XPathMark Benchmark
XPath .
XPathMark 0.116 11.597 .

)
XPathMark( .
Pi = //keyword (/parent::node() /child::keyword)^i, where A^i denotes i repetitions of A.
Si = //keyword (/self::keyword)^i
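The two query families can be generated mechanically. The helper names below are hypothetical; the sketch only expands the repetition in the definitions of Pi and Si.

```python
def p_query(i):
    """Pi: //keyword followed by i repetitions of /parent::node()/child::keyword."""
    return "//keyword" + "/parent::node()/child::keyword" * i

def s_query(i):
    """Si: //keyword followed by i repetitions of /self::keyword."""
    return "//keyword" + "/self::keyword" * i

print(p_query(2))
print(s_query(3))
```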
)( self .
49 .
P1 .. P5 E16 E15 E17 S10 .. S100
Saxon Qizx .
XPath ) E16 E17 (E15 ) P1 (P5
) S10 (S100 .
12% 50%
. 1.3 30%
. Qizx 900% .


P1 E16 E15 P5 E17 Saxon

P1 E16 E15 P5 E17 Qizx


S10 S100 Saxon

S10 S100 Qizx


XPath
. XPath 2.0

.

XPath .
intersect except XPath .
XPath
350 .
.
XPath XQuery
XQuery .


XQuery

.
.
.
. XML XQuery


.
MarkLogic .


SQL

.
.
.


.
. XML
.


. markup

.
. .
.

. XML .
markup
.

.
.

.

full-text :


} { } { } {.

) ( .
XML
XML
XML . XML
.


.
:
-

. 1
.

.
.


load .

Web

2.0 metadata .
) ( ) (

.
online .
-


. full-text

...


.
.
-

.
markup
.


.
.


)( ... .
.
.

XML .
XML .
XML XML
.

XQuery
. MarkLogic Server
.



.
.
XQuery XML .


(XML-aware) XML
XQuery XML .
) (
XML 17 .

Full-text

full-text .
.
full-text XQuery .



metadata
. .
XQuery .


.
XQuery
. XQuery
.

17 Syntax


XQuery .

. XQuery
. fn:trace()
fn:error() .

.
.
XQuery Lazy .

. full-text
10

.


XQuery
.
. XQuery
XML Schema
. item
XML XML
XML Schema .
.


Schema
.
. XQuery
.
XQuery
.



.
. MarkLogic
XQuery
.
10
.
MarkLogic .
(E-node)
(D-node) . E D
. load balancer caching proxy
E .

(D-nodes)
XML
.
E .



.
. .


.
.


.
.



Universal
posting .
... . .
posting
.
full-text XQuery posting

. .
simple

example .
posting simple
example .
. index resolution .

.

.
stand 18log .
.
Journaling 19 .
. stand
.
Concurrency .

.
18 Log structured
19 Crash

stand
.
stand
. sequential
sequential .

(E-nodes)
client ) XQuery
( . HTTP listener
XQuery .

.
.

. lazy
.

posting
.
20

Application Server . HTTP port


HTTP
XQuery . XQuery

20 Application server

. XHTML browser
XQuery .


10
.

-1 HTTP client .
. :
import module namespace my="http://marklogic.com/example" at "/MarkLogic/example.xqy";
for $result in cts:search( //SCENE, "to be or not to be" ) [fn:position() = (1 to 10)]
return my:render-result($result)

-2 XQuery
. :
AND (SCENE,"to","be","or","not")


-3 posting .
21 .
.
posting posting
) 8 4 2 1 ... (.
-4 .
posting
)
SCENE ( . not to be seen or heard

.
-5 XQuery .
lazy
. 10
SCENE my:render-result . ACT
filtering SCENE .
10 .
-6 HTTP
.
cache .
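The evaluation steps listed above (an AND over term posting lists, removal of false positives, lazy delivery of the first 10 results) can be sketched generically in Python. This is not MarkLogic's implementation: the data layout, the function names and the toy documents are assumptions made for illustration, and the phrase check is a plain substring test.

```python
from itertools import islice

def build_postings(docs):
    """Map each term to a sorted list of the ids of documents containing it."""
    postings = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            postings.setdefault(term, set()).add(doc_id)
    return {term: sorted(doc_ids) for term, doc_ids in postings.items()}

def intersect_sorted(a, b):
    """Intersect two sorted posting lists with a linear merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def candidates(postings, phrase):
    """AND of all terms: may still contain false positives."""
    terms = sorted(set(phrase.lower().split()),
                   key=lambda t: len(postings.get(t, [])))
    result = postings.get(terms[0], [])
    for term in terms[1:]:
        result = intersect_sorted(result, postings.get(term, []))
    return result

def search(docs, postings, phrase, limit=10):
    """Filter candidates lazily and stop after `limit` confirmed hits."""
    hits = (d for d in candidates(postings, phrase)
            if phrase.lower() in docs[d].lower())
    return list(islice(hits, limit))

docs = {
    1: "To be or not to be that is the question",
    2: "not to be seen or heard",
    3: "something else entirely",
}
postings = build_postings(docs)
print(candidates(postings, "to be or not to be"))
print(search(docs, postings, "to be or not to be"))
```

Document 2 contains every term but not the phrase, so it survives the index step and is only discarded by the filter, which is the false-positive case the text mentions.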



.
XML XQuery

21 False positive

22
. full-text
.

22 Fine grained

1. Angela Bonifati, Gregory Leighton, Veli Makinen, Sebastian Maneth. An In-Memory XQuery/XPath Engine over a Compressed Structured Text Representation. Dagstuhl Seminar Proceedings.
2. Ashish Virmani, Suchit Agarwal, Rahul Thathoo, Shekhar Suman, Sudip Sanyal. A Fast XPATH Evaluation Technique with the Facility of Updates. ACM.
3. Liang Huai Yang, Mong Li Lee, Wynne Hsu, Decai Huang, Limsoon Wong. Efficient mining of frequent XML query patterns with repeating-siblings. Elsevier.
4. Mariano P. Consens, Flavio Rizzolo. Fast Answering of XPath Query Workloads on [...]. Springer.
5. Mary Holstege, Mark Logic Corporation. Big, Fast XQuery: Enabling Content Applications. IEEE Computer Society Technical Committee on Data Engineering.
6. Sven Groppe, Stefan Böttcher, Jinghua Groppe. XPath Query Simplification with regard to the Elimination of Intersect and Except Operators. [...]nd International Conference on Data Engineering Workshops. IEEE Computer Society Washington.

