SQL 2005 XML Best Practice
Introduction
Microsoft SQL Server 2000 and SQLXML Web releases provide powerful XML data management capabilities. These features focus on the mapping between relational and XML data. XML views of relational data can be defined using annotated XSD (AXSD) to provide an XML-centric approach that supports bulk load of XML data, and query and update capabilities on XML data. Transact-SQL extensions provide a SQL-centric approach for mapping relational query results to XML (using FOR XML) and generating relational views from XML (using OpenXML).
Microsoft SQL Server 2005 provides extensive support for XML data processing. XML values can be stored natively in an XML data type column, which can be typed according to a collection of XML schemas, or left untyped. You can index the XML column. Furthermore, fine-grained data manipulation is supported using XQuery and XML DML, the latter being an extension for data modification. In addition, the SQLXML, FOR XML and OpenXML features have been extended in SQL Server 2005. Together with the newly added native XML support, SQL Server 2005 provides a powerful platform for developing rich applications for semi-structured and unstructured data management.
With all the added functionality, users have more design choices for their data storage and application development. This article provides guidelines for XML data modeling and usage in SQL Server 2005. It is divided into the following two topics:
Data modeling
XML data can be stored in multiple ways in SQL Server 2005, for example, using native XML data type and XML shredded into tables. This topic provides guidelines for making the appropriate choices for modeling your XML data. It also covers indexing of XML data, property promotion, and typing of XML instances.
Usage
This topic discusses usage-related topics, such as loading XML data into the server and type inference during query compilation, explains and differentiates closely related features, and suggests appropriate use of these features. The ideas are illustrated with examples.
Data Modeling
This section outlines reasons for using XML in SQL Server 2005, provides guidelines for choosing between native XML storage and XML view technology, and gives data modeling suggestions.
An XML data model is a good choice under the following conditions:
Your data is sparse, you do not know the structure of the data, or the structure of your data may change significantly in the future.
Your data represents containment hierarchy (as opposed to references among entities) and may be recursive.
Order of markup and values is inherent in your data.
You want to query into the data or update parts of it based on its structure.
If none of these conditions are met, you should use the relational data model. For example, if your data is in XML format but your application merely uses the database to store and retrieve the data, an [n]varchar(max) column is all you need. Storing the data in an XML column brings additional benefits: the engine checks that the data is well-formed or valid according to a prespecified XML schema, and supports fine-grained query and update of the XML data.
Storing XML data in the database server is appropriate under the following conditions:
You want to use administrative functionality of the database server for managing your XML data (for example, backup, recovery and replication).
You want to share, query, and modify your XML data in an efficient and transacted way. Fine-grained data access is important to your application. For example, you may want to extract some of the sections within an XML document, or you may want to insert a new section without replacing your whole document.
You have relational data and SQL applications, and you want interoperability between relational and XML data within your application. You need language support for query and data modification for cross-domain applications.
You want the server to guarantee well-formed data, and optionally validate your data according to XML schemas.
You want indexing of XML data for efficient query processing and good scalability, and the use of a first-rate query optimizer.
You want SOAP, ADO.NET and OLE DB accesses to the XML data.
If none of these conditions are satisfied, you may be better off storing your data as a non-XML, large object type, such as [n]varchar(max) or varbinary(max).
XML Storage Options
The storage options for XML in SQL Server 2005 are as follows:
Native storage as XML data type. The data is stored in an internal representation that preserves the XML content of the data, such as containment hierarchy, document order, element and attribute values, and so on. Specifically, the InfoSet content of the XML data is preserved (for more information on InfoSet, see http://www.w3.org/TR/xml-infoset). It may not be an exact copy of the text XML, since the following information is not retained: insignificant white spaces, order of attributes, namespace prefixes, and XML declaration. For typed XML data type (that is, XML data type bound to XML schemas), the type-relevant information of the post-schema validation Infoset (PSVI), which adds type information to the Infoset, is encoded in the internal representation. This improves parsing speed significantly. (For more information, see the W3C XML Schema specifications at http://www.w3.org/TR/xmlschema-1 and http://www.w3.org/TR/xmlschema-2 and the XQuery 1.0 and XPath 2.0 Data Model working draft at http://www.w3.org/TR/2005/WD-xpath-datamodel-20050211.)
Mapping between XML and relational storage. Using annotated schema (AXSD), the XML is decomposed into columns in one or more tables, preserving fidelity of the data at the relational level: hierarchical structure is preserved, while order among elements is ignored. The schema cannot be recursive.
Large object storage ([n]varchar(max) and varbinary(max)). An exact copy of the data is stored. This is useful for special-purpose applications such as legal documents. Most applications do not require an exact copy, and are satisfied with the XML content (Infoset fidelity).
In general, you may need to use a combination of these approaches. For example, you may want to store your XML data in an XML data type column and promote properties from it into relational columns. Or, you may want to use mapping technology and store non-recursive parts in non-XML columns and only the recursive parts in XML data type columns.
Choice of XML Technology
The choice of XML technology (native XML versus XML view) generally depends upon the following factors:
Storage options: Your XML data may be more suitable for large object storage (for example, a product manual), or more amenable to storage in relational columns (for example, line items converted to XML). Each storage option preserves document fidelity to a different extent.
Query capabilities: You may find one storage option more suitable than others, based on the nature of your queries and on the extent to which you query your XML data. Fine-grained query of your XML data (for example, predicate evaluation on XML nodes) is supported to varying degrees in the two storage options.
Indexing XML data: You may want to index the XML data to speed up XML query performance. Indexing options vary with the storage options; you need to make the appropriate choices to optimize your workload.
Data modification capabilities: Some workloads involve fine-grained modification of XML data (for example, adding a new section within a document), while others do not (for example, Web content). Data modification language support may be important for your application.
Schema support: Your XML data may be described by a schema, which may or may not be an XML schema document. The support for schema-bound XML depends upon the XML technology.
Needless to say, different choices have different performance characteristics.
Native XML Storage
You can store your XML data in an XML data type column at the server. This is a suitable choice if:
You want a straightforward way of storing your XML data at the server while preserving document order and document structure.
You may or may not have a schema for your XML data.
You want to query and modify your XML data.
You want to index the XML data for faster query processing.
Your application needs system catalog views to administer your XML data and XML schemas.
Native XML storage is useful when you have XML documents with a wide range of structures, or XML documents conforming to different or complex schemas that are too hard to map to relational structures.
Example: Modeling XML Data Using XML Data Type
Consider a product manual in XML format, consisting of a separate chapter for each topic, and multiple sections within each chapter. A section can contain sub-sections, so that <section> is a recursive element. Product manuals contain plenty of mixed content, diagrams and technical material; the data is semi-structured. Users may want to perform contextual search for topics of interest (for example, the section on "clustered index" within the chapter on "indexing"), and query technical quantities.
A suitable storage model for your XML documents is an XML data type column. This preserves the Infoset content of your XML data. Indexing the XML column benefits query performance.
Example: Retaining Exact Copies of XML Data
Suppose government regulations require you to retain exact textual copies of your XML documents (for example, signed documents, legal documents, or stock transaction orders). You may want to store your documents in an [n]varchar(max) column.
For querying, convert the data to XML data type at runtime and execute XQuery on it. The runtime conversion may be expensive, especially when the document is large. If you query often, you can redundantly store the documents in an XML data type column and index it, while you return exact document copies from the [n]varchar(max) column.
The XML column may be a computed column based on the [n]varchar(max) column. However, you cannot create an XML index on a computed XML column, nor can an XML index be built on [n]varchar(max) or varbinary(max) columns.
XML View Technology
By defining a mapping between your XML schemas and the tables in your database, you create an "XML view" of your persistent data. XML bulk load can be used to populate the underlying tables using the XML view. You can query the XML view using XPath 1.0; the query is translated into SQL queries on the tables. Similarly, updates are propagated to those tables as well. This technology is useful when:
You want to have an XML-centric programming model using XML views over your existing relational data.
You have a schema (XSD, XDR) for your XML data, which an external partner may have provided.
Order is not important in your data, your queryable data is not recursive, or the maximal recursion depth is known in advance.
You want to query and modify the data through the XML view using XPath 1.0.
You want to bulk-load XML data and decompose it into the underlying tables using the XML view.
Examples include relational data exposed as XML for data exchange and web services, and XML data with fixed schema. For more information, see http://msdn.microsoft.com/SQLXML.
You can also publish XML from relational and XML data stored at the server using FOR XML. For more information, refer to Using FOR XML to Generate XML from Rowsets in this article.
Example: Modeling Data Using Annotated XML Schema (AXSD)
Suppose you have existing relational data (for example, customers, orders, and line items) that you would like to manipulate as XML. Define an XML view using AXSD over the relational data. The XML view allows you to bulk-load XML data into your tables, and query and update the relational data using the XML view. This model is useful if you need to exchange data with XML markup with other applications while your SQL applications work uninterrupted.
Hybrid Model
Quite often, a combination of relational and XML data type columns is appropriate for data modeling. Some of the values from your XML data can be stored in relational columns, and the rest, or the entire XML value, stored in an XML column. This may yield better performance (for example, you have full control over the indexes created on the relational columns) and locking characteristics. However, you undertake greater responsibility for managing your data storage.
The values to store in relational columns depend on your workload. For example, if you retrieve entire XML values based on the path expression /Customer/@CustId, then promoting the value of the CustId attribute into a relational column and indexing it may yield faster query performance. On the other hand, if your XML data is extensively and non-redundantly decomposed into relational columns, the reassembly cost may be significant.
For highly structured XML data (for example, the content of a table has been converted into XML), you can map all values to relational columns, possibly using XML view technology.
Keep the XML data type column in the same table as the other columns if one of the following conditions is true:
Your application performs data retrieval on the XML column and does not require an XML index on the XML column, or
You want to build an XML index on the XML data type column, and the primary key of the main table is the same as its clustering key. See the section on Indexing an XML Data Type Column for more details.
Create the XML data type column in a separate table if the following conditions are true:
You want to build an XML index on the XML data type column, but the primary key of the main table is not the same as its clustering key, or the main table does not have a primary key, or the main table is a heap (that is, no clustering key). This may be true if the main table already exists.
You do not want table scans to slow down due to the presence of the XML column in the table, which takes up space whether it is stored in-row or out-of-row.
Use untyped XML under the following conditions:
You do not have a schema for your XML data.
You have schemas but you do not want the server to validate the data. This is sometimes the case when an application performs client-side validation before storing the data at the server, or temporarily stores XML data invalid according to the schema, or uses XML schema features not supported at the server (for example, key/keyref).
Use typed XML under the following conditions:
You have schemas for your XML data and you want the server to validate your XML data according to the XML schemas.
You want to take advantage of storage and query optimizations based on type information.
You want to take better advantage of type information during compilation of your queries, such as static type errors.
Typed XML columns, parameters and variables can store XML documents or content, which you have to specify as a flag (DOCUMENT or CONTENT, respectively) at the time of declaration. Furthermore, you have to provide one or more XML schemas. Specify DOCUMENT if each XML instance has exactly one top-level element; otherwise, use CONTENT. The query compiler uses the DOCUMENT flag in type checks during query compilation to infer singleton top-level elements.
In addition to typing an XML column, you can use relational (column or row) constraints on typed or untyped XML data type columns. Use constraints under the following conditions:
Your business rules cannot be expressed in XML schemas. For example, the delivery address of a flower shop must be within 50 miles of its business location, which can be written as a constraint on the XML column. The constraint may involve XML data type methods within scalar (as opposed to table-valued) user-defined functions.
Your constraint involves other XML or non-XML columns in the table. An example is the enforcement of the ID of a Customer (/Customer/@CustId) found in an XML instance to match the value in a relational CustomerID column.
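The second condition can be sketched as follows. This is a minimal illustration, not the article's own code: the function, table and constraint names (dbo.udf_CustIdFromXml, dbo.CustomerDocs, ck_CustId_matches) are hypothetical, and the XML is assumed to carry the ID in /Customer/@CustId as an integer. The XML data type method is wrapped in a scalar user-defined function, which the CHECK constraint then calls.

```sql
-- Hypothetical names throughout; a sketch under the assumptions stated above.
CREATE FUNCTION dbo.udf_CustIdFromXml (@x xml)
RETURNS int
AS
BEGIN
    -- (...)[1] gives value() the singleton it requires
    RETURN @x.value('(/Customer/@CustId)[1]', 'int')
END
GO

CREATE TABLE dbo.CustomerDocs (
    CustomerID int PRIMARY KEY,
    xCol       xml,
    -- Row constraint: the ID inside the XML must match the relational column
    CONSTRAINT ck_CustId_matches
        CHECK (dbo.udf_CustIdFromXml(xCol) = CustomerID)
)
GO
```

With such a constraint in place, an INSERT whose XML carries a different CustId than the CustomerID column would be rejected.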
Document Type Definition (DTD)
XML data type columns, variables, and parameters can be typed using XML schema, but not using DTD. You can convert DTDs to XML schema documents using third-party tools, and load the XML schemas into the database. Inline DTD can be used for both untyped and typed XML instances to supply default values and to replace entity references with their expanded form.
INSERT INTO docs VALUES (1,
'<book genre="security" publicationdate="2002" ISBN="0-7356-1588-2">
   <title>Writing Secure Code</title>
   <author>
      <first-name>Michael</first-name>
      <last-name>Howard</last-name>
   </author>
   <author>
      <first-name>David</first-name>
      <last-name>LeBlanc</last-name>
   </author>
   <price>39.99</price>
</book>')
The stored size in bytes of the XML instances in the XML column can be found using the DATALENGTH() function.
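A minimal sketch of such a size query, assuming the docs table and xCol column used throughout this article (the column alias is only illustrative):

```sql
-- Size in bytes of the stored (internal) representation of each XML instance
SELECT DATALENGTH(xCol) AS xmlBytes
FROM   docs
```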
In-Row and Out-of-Row Storage
Small XML data type instances are stored within the rows of a table. Larger values that cannot be accommodated within a disk page are stored out of row with an in-row pointer of 16 bytes. Storing XML values in-row reduces the record density and slows down table scans over the non-XML columns in the table. In such cases, the 'large value types out of row' option can be specified in the system stored procedure sp_tableoption to store all large data types off-row.
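As a sketch, the option could be turned on for the docs table used in this article like this (the table name is the only assumption):

```sql
-- Store xml and other large value types out of row for table docs
EXEC sp_tableoption 'docs', 'large value types out of row', 1
```

Passing 0 instead of 1 would restore the default behavior, in which values that fit are kept in-row.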
Indexing an XML Data Type Column
An XML index on the XML column benefits your application under the following conditions:
Queries on XML columns are common in your workload. XML index maintenance cost during data modification must be taken into account.
Your XML values are relatively large and the retrieved parts are relatively small. Building the index avoids parsing the whole data at runtime and benefits index lookups for efficient query processing.
The first index on an XML column is the "primary XML index". Using it, three types of secondary XML indexes can be created on the XML column to speed up common classes of queries, as described below.
Primary XML Index
This indexes all tags, values and paths within the XML instances in an XML column. The base table (that is, the table in which the XML column occurs) must have a clustered index on the primary key of the table; the primary key is used to correlate index rows with the rows in the base table. Full XML instances (for example, SELECT *) are retrieved from the XML column itself. Queries that return scalar values or XML subtrees use the primary XML index.
Example: Creating Primary XML Index
A primary XML index called idx_xCol can be created on the XML column xCol of the table docs with a CREATE PRIMARY XML INDEX statement.
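A sketch of that statement, using the index, table and column names given in the text (it assumes docs already has a clustered primary key):

```sql
CREATE PRIMARY XML INDEX idx_xCol ON docs (xCol)
```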
If your workload uses path expressions heavily on XML columns, the PATH secondary XML index is likely to speed up your workload. The most common case is the use of the exist() method on XML columns in the WHERE clause of Transact-SQL.
If your workload retrieves multiple values from individual XML instances using path expressions, clustering paths within each XML instance in the PROPERTY index may be helpful. This typically occurs in a property bag scenario when properties of an object are fetched and its relational primary key value is known.
If your workload involves querying for values within XML instances without knowing the element or attribute names that contain those values, you may want to create the VALUE index. This typically occurs with descendant axes lookups, such as //author[last-name="Howard"], where <author> elements can occur at any level of the hierarchy and the search value ("Howard") is more selective than the path. It also occurs in "wildcard" queries, such as /book [@* = "novel"], where the query looks for <book> elements with some attribute having the value "novel".
Example: Path-Based Lookup
Suppose the query below is common in your workload:
SELECT xCol
FROM   docs
WHERE  xCol.exist('/book[@genre = "security"]') = 1
The path expression /book/@genre and the value "security" correspond to the key fields of the PATH index. Consequently, a secondary XML index of type PATH is helpful for this workload:
CREATE XML INDEX idx_xCol_Path on docs (xCol)
   USING XML INDEX idx_xCol FOR PATH
Example: Fetching Properties of an Object
Consider the query below that retrieves the first and last names of the authors of a book from each row in table docs:
-- (...)[1] guarantees the singleton that value() requires
SELECT ref.value('(first-name)[1]', 'nvarchar(64)'),
       ref.value('(last-name)[1]', 'nvarchar(64)')
FROM   docs CROSS APPLY xCol.nodes('/book/author') R(ref)
The PROPERTY index is useful in this case and is created as follows:
CREATE XML INDEX idx_xCol_Property on docs (xCol)
   USING XML INDEX idx_xCol FOR PROPERTY
Example: Value-Based Query
When a query specifies a partial path using //, the lookup based on the value of ISBN benefits from the use of the VALUE index.
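A query of this kind might look like the following sketch. It assumes the docs table and the ISBN attribute from the sample book inserted earlier; the exact query text is an illustration rather than the article's own.

```sql
-- Partial path (//): the element hierarchy above <book> is not spelled out,
-- so the selective ISBN value drives the lookup
SELECT xCol
FROM   docs
WHERE  xCol.exist('//book/@ISBN[. = "0-7356-1588-2"]') = 1
```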
The VALUE index is created as follows:
CREATE XML INDEX idx_xCol_Value on docs (xCol)
   USING XML INDEX idx_xCol FOR VALUE
XML Index on Multiple File Groups
XML indexes are collocated with the base table; that is, XML index rows are stored in the same file groups and table partitions as the corresponding base table rows. This may sometimes require large file groups for XML blobs and their collocated XML indexes.
The TEXTIMAGE_ON <filegroup> specification in the CREATE TABLE statement stores the XML blobs in the specified filegroup; the XML index rows are still collocated with the base table, while large XML node values are in the same file group as the XML blobs. This reduces the size of the individual file groups and provides more convenience for data management. For example, when the non-XML data in the row is small relative to the size of the XML data, this technique can distribute the storage more evenly.
Full-Text Index on XML Column
You can create a full-text index on XML columns; this indexes the content of the XML values while ignoring the XML markup. Attribute values are not full-text indexed (since they are considered part of the markup) and element tags are used as token boundaries. You can combine full-text search with XML index usage in some scenarios:
Filter the XML values of interest using SQL full-text search.
Query those XML instances, which uses the XML index on the XML column.
Example: Combining Full-Text Search with XML Querying
The steps for creating a full-text index on an XML column are identical to those for other SQL type columns. The DDL statements are as follows, in which PK__docs__023D5904 is the single-column primary key index of the table:
CREATE FULLTEXT CATALOG ft AS DEFAULT
CREATE FULLTEXT INDEX ON dbo.docs (xCol) KEY INDEX PK__docs__023D5904
Once the full-text index has been created on the XML column, a query can check that an XML instance contains the word "Secure" in the title of a book by combining the full-text CONTAINS() predicate with the exist() method.
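A sketch of such a combined query, assuming the docs table and the /book/title structure of the sample document (the exact path expression is an assumption):

```sql
SELECT *
FROM   docs
WHERE  CONTAINS(xCol, 'Secure')                                        -- full-text filter
AND    xCol.exist('/book/title/text()[contains(., "Secure")]') = 1     -- XQuery context check
```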
The CONTAINS() method uses the full-text index to subset the XML instances that contain the word "Secure" anywhere in the document. The exist() clause ensures that the word "Secure" occurs in the title of a book.
Full-text search using CONTAINS() and XQuery contains() have different semantics. The latter is a substring match, while the former is a token match using stemming. Thus, if the search is for the string "run" in the title, then "run", "runs" and "running" all match, since both the full-text CONTAINS() and the XQuery contains() are satisfied. However, this query does not match the word "UnSecured" in the title (the full-text CONTAINS() fails but the XQuery contains() is satisfied). In general, for a pure substring match, the full-text CONTAINS() clause should be removed. This difference is illustrated in the next example.
Example: Full-Text Search on XML Values Using Stemming
The XQuery contains() check in Example: Combining Full-Text Search with XML Querying cannot be eliminated in general. Consider a query that uses only the full-text CONTAINS() predicate.
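Such a query could be sketched as follows. The search term "run" is taken from the discussion above; the use of FORMSOF(INFLECTIONAL, ...) to request inflectional forms explicitly is an assumption of this sketch, not necessarily how the original query was written.

```sql
-- Full-text predicate only: no XQuery check on where the match occurs
SELECT *
FROM   docs
WHERE  CONTAINS(xCol, 'FORMSOF(INFLECTIONAL, run)')
```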
The word "ran" in the document matches a search for "run" owing to stemming. Furthermore, the search context is not checked using XQuery.
When XML is decomposed using AXSD into relational columns that are full-text indexed, XPath queries over the XML view do not perform full-text search on the underlying tables.
Support for Different Languages in Full-Text Index on XML Column
Unlike nvarchar or varchar columns that can have only one word breaker for the entire column, an XML data type column supports multiple language word breakers using the xml:lang attribute on XML elements. The word breaker for the specified language is used on the content of that element. A sub-element can specify a different language in an xml:lang attribute. Thus, not only different XML instances but also a single XML instance can involve multiple word breakers.
This gives rise to interesting possibilities. For example, a Word 2003 document may contain sections in different languages. The document in WordML XML representation can be stored in an XML data type column, and the appropriate language word breakers are used for full-text indexing. A full-text query can specify the language to use.
Example: Full-Text Search Specifying a Language
A query can specify that the full-text search should be performed for the German language.
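One way to express this is the LANGUAGE clause of CONTAINS(). The sketch below assumes the docs table and reuses the search word "Secure" purely for illustration:

```sql
SELECT *
FROM   docs
WHERE  CONTAINS(xCol, 'Secure', LANGUAGE 'German')
```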
Property Promotion
If queries are made principally on a small number of element and attribute values (for example, find customers based on customer ID, that is, the value of /Customer/@CustId is specified), you may want to promote those values into relational columns. This is helpful when queries are issued on a small part of the XML data while the entire XML instance is retrieved. Creating an XML index on the XML column is overkill; instead, the promoted column can be indexed. Queries must be written to use the promoted column (that is, the query optimizer does not retarget queries on the XML column to the promoted column).
The promoted column can be a computed column in the same table or a separate, user-maintained column in a table. This is adequate when singleton values (that is, single-valued properties) are promoted from each XML instance. However, for multivalued properties, you have to create a separate table for the property, as described in the following section.
Computed Column Based on XML Data Type
A computed column can be created using a user-defined function (UDF) that invokes XML data type methods. The type of the computed column can be any SQL type, including XML. This is illustrated in the following example.
Example: Computed Column Based on XML Data Type Method
Create the user-defined function for ISBN of books:
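-- Function and parameter names are illustrative
CREATE FUNCTION udf_get_book_ISBN (@xData xml)
RETURNS varchar(20)
BEGIN
   DECLARE @ISBN varchar(20)
   SELECT @ISBN = @xData.value('(/book/@ISBN)[1]', 'varchar(20)')
   RETURN @ISBN
END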
Add a computed column to the table for ISBN.
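A minimal sketch of that step, assuming a scalar UDF such as dbo.udf_get_book_ISBN(xml) exists in the database (the function name and the column name ISBN are illustrative):

```sql
-- Computed column whose value is extracted from the XML column by the UDF
ALTER TABLE docs
ADD ISBN AS dbo.udf_get_book_ISBN(xCol)
```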
The computed column can be indexed in the usual way.
Example: Queries on Computed Column Based on XML Data Type Methods
To obtain the <book> whose ISBN is 0-7356-1588-2, the query on the XML column can be rewritten to use the computed column.
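The two forms could look like the following sketch. It assumes the docs table, the xCol column, and a computed column named ISBN as in the preceding step; both statements are illustrations of the rewrite, not the article's verbatim code.

```sql
-- Query formulated on the XML column
SELECT xCol
FROM   docs
WHERE  xCol.exist('/book/@ISBN[. = "0-7356-1588-2"]') = 1

-- Equivalent query rewritten to use the promoted (computed) column
SELECT xCol
FROM   docs
WHERE  ISBN = '0-7356-1588-2'
```

The second form lets an ordinary relational index on the ISBN column serve the lookup.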
You can create a user-defined function to return XML data type and create a computed column using the UDF. However, you cannot create an XML index on the computed XML column.
Creating Property Tables
You may want to promote some of the multivalued properties from your XML data into one or more tables, create indexes on those tables, and retarget your queries to use them. A typical scenario is one in which a small number of properties cover most of your query workload. You can do the following:
Create one or more tables to hold the multivalued properties. You may find it convenient to store one property per table, and to duplicate the primary key of the base table in the property tables for back join with the base table. If you want to maintain the relative order of the properties, you need to introduce a separate column for the relative order.
Create triggers on the XML column to maintain the property table(s). Within the triggers, do one of the following:
    Use XML data type methods, such as nodes() and value(), to insert and delete rows of the property table(s). (See the section value(), nodes(), and OpenXML() for more discussion of the nodes() method.)
    Create streaming table-valued function(s) in CLR to insert and delete rows of the property table(s).
Write queries for SQL access to the property tables and XML access to the XML column in the base table, with joins between the tables using their primary key.
Example: Create Property Table
Suppose you want to promote the first names of authors. Books have one or more authors, so that first name is a multivalued property. Each first name is stored in a separate row of a property table. The primary key of the base table is duplicated in the property table for back join.
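A sketch of such a property table, using the table and column names that appear in the triggers and queries of this section (tblPropAuthor, propPK, propAuthor); the column types are assumptions:

```sql
-- One row per (book, author first name); propPK repeats the base table's primary key
CREATE TABLE tblPropAuthor (propPK int, propAuthor varchar(max))
```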
The following user-defined function generates a rowset of author first names from an XML instance:
CREATE FUNCTION udf_XML2Table (@pk int, @xCol xml)
RETURNS table
WITH SCHEMABINDING
AS RETURN (
   select @pk as PropPK,
          nref.value('.', 'varchar(max)') as propAuthor
   from   @xCol.nodes('/book/author/first-name') R(nref)
)
Example: Create Triggers to Populate Property Table
Insert trigger: inserts rows into the property table:
create trigger trg_docs_INS on docs for insert
as
begin
   insert into tblPropAuthor
   select p.* from inserted as I
      CROSS APPLY dbo.udf_XML2Table(I.pk, I.xCol) as P
end
Delete trigger: deletes rows from the property table based on the primary key value of deleted rows:
create trigger trg_docs_DEL on docs for delete
as
begin
   delete tblPropAuthor
   where  propPK IN (select p.PropPK from deleted as D
                     CROSS APPLY dbo.udf_XML2Table(D.pk, D.xCol) as P)
end
Update trigger: deletes the property rows of the updated XML instances, then inserts the new rows:
create trigger trg_docs_UPD on docs for update
as
if update(xCol) or update(pk)
begin
   delete tblPropAuthor
   where  propPK IN (select p.PropPK from deleted as D
                     CROSS APPLY dbo.udf_XML2Table(D.pk, D.xCol) as P)
   insert into tblPropAuthor
   select p.* from inserted as I
      CROSS APPLY dbo.udf_XML2Table(I.pk, I.xCol) as P
end
Example: Find XML Instances Whose Authors Have the First Name "David"
The query can be formulated on the XML column. Alternatively, it can search the property table for first name "David" and perform a back join with the base table to return the XML instance, as shown here:
SELECT xCol
FROM   docs JOIN tblPropAuthor ON docs.pk = tblPropAuthor.propPK
WHERE  tblPropAuthor.propAuthor = 'David'
Example: Solution Using CLR Streaming Table-Valued Function
This solution consists of the following steps:
1. Define a CLR class CXmlStreamingTVF that implements IEnumerator and contains a method InitMethod to generate a streaming table-valued output by applying a simple path expression on an XML instance.
2. Create an assembly and a Transact-SQL user-defined function (UDF) to invoke the CLR class.
3. Define insert, update and delete triggers using the UDF to maintain the property table(s).
First, create the streaming CLR function shown below. XML data type is exposed as a managed class SqlXml in ADO.NET; it supports the method CreateReader() that returns an XmlReader:
using System;
using System.Xml;
using System.IO;
using System.Data;
using System.Data.Sql;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections;

public class CXmlStreamingTVF : IEnumerator
{
   private XmlReader m_reader;
   private SqlXml m_doc;
   private string m_name;
   private string[] m_path;
   private int m_pathLoc;

   public CXmlStreamingTVF (SqlXml doc, string simplePath) {
      m_doc = doc;
      m_reader = m_doc.CreateReader();
      m_path = simplePath.Split(new char[]{'/'});
      m_pathLoc = m_path.Length-1;
   }

   //Three IEnumerator methods.
   //Custom code for navigating the document for a simple path.
   public bool MoveNext () {
      bool new_row = false;
      while (!new_row && !m_reader.EOF) {
         m_reader.Read();
         if (m_reader.LocalName==m_path[m_pathLoc] &&
             m_pathLoc==m_path.Length-1 &&
             m_reader.NodeType==XmlNodeType.Element) {
            // Leaf element of the path found: emit its text as a new row
            m_name = m_reader.ReadString();
            new_row = true;
         } else if (m_reader.LocalName==m_path[m_pathLoc] &&
                    m_reader.NodeType==XmlNodeType.Element &&
                    m_reader.IsEmptyElement==false) {
            if (m_pathLoc==1 && m_reader.Depth!=0)
               continue;
            m_pathLoc++;
         } else if (m_pathLoc!=1 &&
                    m_reader.LocalName==m_path[m_pathLoc-1] &&
                    m_reader.NodeType==XmlNodeType.EndElement) {
            m_pathLoc--;
         }
      }
      return new_row;
   }

   public Object Current {
      get { return this; }
   }

   public void Reset() {
      m_reader = m_doc.CreateReader();
   }

   [SqlFunctionAttribute (FillRowMethodName="CLROpenXml")]
   public static IEnumerator InitMethod (SqlXml doc, string simplePath) {
      return new CXmlStreamingTVF(doc, simplePath);
   }

   // Fill-row method: copies the current row's value into the output column
   public static void CLROpenXml(Object obj, out string name) {
      CXmlStreamingTVF stream = (CXmlStreamingTVF) obj;
      name = stream.m_name;
   }
}
Next, create an assembly and a Transact-SQL user-defined function SQL_streaming_xml_tvf corresponding to the CLR method InitMethod().
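This step could be sketched as follows. The assembly name, the DLL path and the output column name FirstName are assumptions made for illustration; only the function name SQL_streaming_xml_tvf, the class CXmlStreamingTVF and the method InitMethod come from the text. CLR integration is assumed to be enabled on the server.

```sql
-- Hypothetical assembly name and file path
CREATE ASSEMBLY streaming_xml_tvf FROM 'C:\temp\tvf1.dll'
GO

-- Transact-SQL wrapper for the CLR method InitMethod();
-- the parameters mirror InitMethod(SqlXml doc, string simplePath)
CREATE FUNCTION dbo.SQL_streaming_xml_tvf (@xml xml, @simplePath nvarchar(4000))
RETURNS TABLE (FirstName nvarchar(4000))
AS EXTERNAL NAME streaming_xml_tvf.CXmlStreamingTVF.InitMethod
GO
```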
create function CLR_udf_XML2Table (@pk int, @xCol xml)
returns @ret_Table table (FK int, FirstName varchar(max))
with schemabinding
as
begin
create trigger CLR_trg_docs_INS on docs for insert
as
begin
    insert into tblPropAuthor
    select p.*
    from   inserted as I CROSS APPLY
An XML schema collection C types an XML column xCol according to multiple XML schemas. Additionally, the flag DOCUMENT or CONTENT specifies whether XML trees or fragments, respectively, can be stored in column xCol. For DOCUMENT, each XML instance specifies the target namespace of its top-level element in the instance, according to which it is validated and typed. For CONTENT, on the other hand, each top-level element can specify any one of the target namespaces in C. The XML instance is validated and typed according to all the target namespaces occurring in an instance.

Schema Evolution

An XML schema collection is used to type XML columns, variables and parameters. It provides a mechanism for XML schema evolution. Suppose you add an XML schema with target namespace BOOK-V1 to an XML schema collection C. An XML column xCol typed using C can store XML data conforming to the BOOK-V1 schema. Suppose an application wants to extend the XML schema with new schema components, such as complex type definitions and top-level element declarations. These new schema components can be added to the BOOK-V1 schema and do not require revalidation of the existing XML data in column xCol. Suppose later the application wants to provide a new version of the XML schema, for which it chooses the target namespace BOOK-V2. This XML schema can be added to C. The XML column can store instances of both BOOK-V1 and BOOK-V2, and execute queries and data modification on XML instances conforming to these namespaces.

Lax Validation Disallowed in Wildcard Sections

The XML schema processor does not support lax validation in wildcard sections (xs:any and xs:anyAttribute) and xs:anyType. For wildcard sections, the XML schema can specify either processContents = "strict" or processContents = "skip". For xs:anyType, only strict validation is supported.
Strict validation ensures that more precise type information regarding the XML nodes instantiating these schema components is known during validation and used during query compilation. Skip semantics loses the typing information and the corresponding nodes are treated as untyped (xdt:untyped in the case of elements and xdt:untypedAtomic in the case of attributes). If skip semantics for xs:anyType is desired, then introduce a new complex type that uses xs:any and xs:anyAttribute with processContents = "skip" as shown below:
<xs:complexType name="skipAnyType" mixed="true">
    <xs:sequence>
        <xs:any processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute processContents="skip"/>
</xs:complexType>
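The schema evolution steps described earlier in this section can be sketched in Transact-SQL as follows. This is only an illustration: the namespace URIs urn:BOOK-V1 and urn:BOOK-V2 stand in for BOOK-V1 and BOOK-V2, and the table docsTyped and the element declarations are hypothetical.

```sql
-- Sketch only: namespace URIs, table name and element declarations are assumptions.
CREATE XML SCHEMA COLLECTION C AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="urn:BOOK-V1" elementFormDefault="qualified">
    <xs:element name="book" type="xs:string"/>
  </xs:schema>'
GO
CREATE TABLE docsTyped (pk int PRIMARY KEY, xCol xml (DOCUMENT C))
GO
INSERT INTO docsTyped VALUES (1, N'<book xmlns="urn:BOOK-V1">Title A</book>')
GO
-- Extend the existing namespace with a new top-level element declaration.
ALTER XML SCHEMA COLLECTION C ADD
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="urn:BOOK-V1" elementFormDefault="qualified">
    <xs:element name="magazine" type="xs:string"/>
  </xs:schema>'
GO
-- Add a new version of the schema under its own target namespace.
ALTER XML SCHEMA COLLECTION C ADD
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="urn:BOOK-V2" elementFormDefault="qualified">
    <xs:element name="book" type="xs:string"/>
  </xs:schema>'
GO
-- The same column now accepts instances of either version.
INSERT INTO docsTyped VALUES (2, N'<book xmlns="urn:BOOK-V2">Title B</book>')
```

The row inserted before the ALTER statements stays in place, which is the point of adding components rather than replacing the collection.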
Using xs:dateTime, xs:date, and xs:time

Values of type xs:dateTime, xs:date, and xs:time must be specified in ISO 8601 format and include a time zone. Otherwise, the data validation for these values fails. Thus, 2005-05-27T14:11:00.943Z is valid as a value of type xs:dateTime, but the following are not: 2005-05-27 14:11:00.943Z (missing date and time separator 'T'), 2005-05-27T14:11:00.943 (missing time zone) and 2005-05-27 14:11:00.943 (missing time separator and time zone). Similarly, 2005-05-27Z is a valid xs:date value but 2005-05-27 is not since no time zone is specified.
Untyped XML data may contain date, time, and datetime values that an application may wish to convert to the SQL types datetime or smalldatetime. These date, time and datetime values may not conform to ISO 8601 format or contain a time zone. Similarly, typed XML may contain such values as types other than xs:date, xs:time, and xs:dateTime (for example, xs:string). In both cases, the values should be converted first to [n]varchar and then to SQL datetime or smalldatetime, as the following example illustrates.

Example: Extracting datetime Value from Untyped XML

To obtain the value of the CreationTime attribute from the following data:
declare @var xml
select @var = '<QueryExecutionStats>
    <GeneralStats ExecutionCount="1"
                  LastExecutionTime="2005-05-19 14:11:00.943"
                  CreationTime="2005-05-19 14:11:00.913"/>
    <WorkerTime Total="3361" Last="3361" Min="3361" Max="3361"/>
    <PhysicalReads Total="0" Last="0" Min="0" Max="0"/>
    <PhysicalWrites Total="0" Last="0" Min="0" Max="0"/>
    <LogicalReads Total="0" Last="0" Min="0" Max="0"/>
</QueryExecutionStats>'
a value() method is used to retrieve the value as nvarchar(64), which is then cast to the SQL datetime type:
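A minimal sketch of such an extraction follows, using a shortened copy of the instance so that it runs on its own. The path expression and the CONVERT style 121 (ODBC canonical, chosen here so the result does not depend on the session's language settings) are assumptions rather than the article's exact statement.

```sql
-- Sketch only: shortened instance; path and CONVERT style are assumptions.
DECLARE @var xml
SET @var = '<QueryExecutionStats>
    <GeneralStats ExecutionCount="1" CreationTime="2005-05-19 14:11:00.913"/>
</QueryExecutionStats>'

-- Step 1: value() returns the attribute as nvarchar(64).
-- Step 2: the string is converted to the SQL datetime type.
SELECT CONVERT(datetime,
               @var.value('(//@CreationTime)[1]', 'nvarchar(64)'),
               121) AS CreationTime
```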
Usage
Loading XML Data
Transferring XML Data from SQL Server 2000 to SQL Server 2005

You can transfer XML data to SQL Server 2005 in multiple ways. We discuss a few options:
If you have your data in an [n]text or image column in a SQL Server 2000 database, import the table using, say, DTS, into a SQL Server 2005 database. Change the column type to [n]varchar(max) or varbinary(max), respectively, and then to XML using the ALTER TABLE statement.
You can bulk-copy your data from SQL Server 2000 using bcp out, and bulk-insert into the SQL Server 2005 database using bcp in.
If you have data in relational columns in a SQL Server 2000 database, create a new table with an [n]text column and optionally a primary key column for a row identifier. Use client-side programming to retrieve XML generated at the server with FOR XML, and write it into the [n]text column. Then use the above-mentioned techniques to transfer data to a SQL Server 2005 database. You may choose to write the XML into an XML column in the SQL Server 2005 database directly.
Example: Changing Column Type to XML

Suppose you want to change the type of an [n]text, [n]varchar, varbinary, or XML column XYZ in table R to XML typed using the XML schema collection bookCollection. The following statement performs this type change:
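A statement of the following shape would carry out the change just described. It is a sketch that assumes table R, column XYZ and the schema collection bookCollection already exist in the current database.

```sql
-- Sketch only: R, XYZ and bookCollection are the names used in the text above.
-- Every existing value is validated against bookCollection during the change.
ALTER TABLE R ALTER COLUMN XYZ xml (bookCollection)
```

Because existing values are parsed and validated as part of the type change, the statement fails if any stored value is not well-formed XML or does not conform to the collection.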
If your text XML is in Unicode (UCS-2, UTF-16), assigning it to an XML column, variable or parameter does not pose any problems. If the encoding is not Unicode and is implicit (due to the source code page), the string code page in the database should be the same as or compatible with the code points that you want to load (use COLLATE if necessary). If no such server code page exists, you have to add an explicit XML declaration to specify the proper encoding.
To use an explicit encoding, either use the varbinary type, which has no interaction with code pages, or use a string type of the appropriate code page. Then assign the data to an XML column, variable or parameter.
Thus, if you want to pass UTF-8, it is safest to pass it in as varbinary(max). UTF-16 data can be passed in as nvarchar(max) where no byte order mark is required, or as varbinary(max) with the byte order mark 0xFFFE as the first two bytes to indicate UTF-16 encoding.

Bulk-Loading XML Data

You can bulk-load XML data into the server using SQL Server's bulk-loading capabilities, such as BCP, OPENROWSET, and BULK INSERT. OPENROWSET allows you to load data into an XML column from files. The following example illustrates this point.

Example: Loading XML from Files

This example shows how to insert a row in table docs. The value of the XML column is loaded from file C:\temp\xmlfile.xml as binary LOB (BLOB), and the pk column is supplied the value 10. The file is loaded as a BLOB (instead of a CLOB or NCLOB) to accept any encoding that the XML document may be encoded in.
INSERT INTO docs
SELECT 10, xCol
FROM   (SELECT * FROM OPENROWSET
           (BULK 'C:\temp\xmlfile.xml', SINGLE_BLOB) AS xCol) AS R(xCol)
Non-Binary Collations

The XML collation used for the XML data type is a binary collation and is case-sensitive (the so-called Unicode code point collation). Applications may have a different requirement, such as case-insensitive searches. This can be achieved by promoting the appropriate string values into a computed column of type varchar with the appropriate collation. Query the computed column for collation-dependent operations. Furthermore, suppose the XML column contains German and Chinese data strings. You can use operations specific to each of these collations on two computed columns, one for each of these languages.
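The promotion described above could be sketched as follows. The function name udf_get_book_title, the column name TitleCI, the path /book/title and the collation Latin1_General_CI_AS are illustrative assumptions; the docs table with XML column xCol is the one used throughout this article.

```sql
-- Sketch only: function name, column name, path and collation are assumptions.
-- XML data type methods are wrapped in a scalar UDF for use in a computed column.
CREATE FUNCTION dbo.udf_get_book_title (@xData xml)
RETURNS varchar(200)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @xData.value('(/book/title)[1]', 'varchar(200)')
END
GO
-- Promote the title into a computed column with a case-insensitive collation.
ALTER TABLE docs
    ADD TitleCI AS (dbo.udf_get_book_title(xCol) COLLATE Latin1_General_CI_AS)
GO
-- Collation-dependent search runs against the computed column, not the XML.
SELECT pk
FROM   docs
WHERE  TitleCI = 'writing secure code'
```

A second computed column with a different collation could serve a second language in the same way.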
Explicit casting to the proper type allows users to work around static errors, although runtime cast errors will be transformed to empty sequences. The following subsections discuss type checking in greater detail.

Singleton Checks

Location steps, function parameters, and operators (for example, eq) requiring singletons return an error if the compiler cannot determine whether a singleton is guaranteed at runtime. The problem arises often with untyped data and sometimes with typed data. For example, lookup of an attribute requires a singleton parent element; an ordinal selecting a single parent node is adequate. Evaluation of the nodes()-value() combination (see the section value(), nodes(), and OpenXML()) to extract attribute values may not require the ordinal specification, as shown in the next example, since the nodes() method emits singleton context items.

Example: Known Singleton

In this example, the nodes() method generates a separate row for each <book> element. (See the section value(), nodes(), and OpenXML() for a more detailed description of the nodes() method.) The value() method evaluated on a <book> node extracts the value of @genre, which, being an attribute, is a singleton.
SELECT nref.value('@genre', 'varchar(max)') Genre
FROM   docs CROSS APPLY xCol.nodes('//book') AS R(nref)
XML schema is used for type checking of typed XML. If a node is specified as singleton in the XML schema, the compiler uses that information and no error occurs. Otherwise, an ordinal selecting a single node is required. In particular, the use of descendant axis, such as in /book//title, loses singleton cardinality inference for the <title> element even if the XML schema specifies it to be so. Rewrite it as (/book//title)[1].

It is important to keep the distinction between //first-name[1] and (//first-name)[1] in mind for type checking. The former returns a sequence of <first-name> nodes in which each node is the leftmost <first-name> node amongst its siblings. The latter returns the first, singleton <first-name> node in document order in the XML instance.

Example: Use of value()

The query below on an untyped XML column results in a static, compilation error since value() expects a singleton node as the first argument and the compiler cannot determine whether only one <last-name> node will occur at runtime:
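A minimal sketch of the situation described, assuming the docs table with the untyped XML column xCol used throughout this article; the exact paths are illustrative. The first statement is rejected at compile time because the path may return several <last-name> nodes. The second adds an ordinal to the last step only, which may look like a fix:

```sql
-- Sketch only: the paths are illustrative assumptions.
-- Rejected at compile time: the path is not guaranteed to yield a singleton.
SELECT xCol.value('//author/last-name', 'nvarchar(50)') LastName
FROM   docs

-- Tempting variation: [1] applies to <last-name> within each <author> only.
SELECT xCol.value('//author/last-name[1]', 'nvarchar(50)') LastName
FROM   docs
```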
However, this does not rectify the error since multiple <author> nodes may occur in each XML instance. The following rewrite works:
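One way to write the working form, again as a sketch over the assumed docs table, is to parenthesize the whole path before applying the ordinal, so that a single node is selected from the entire result sequence:

```sql
-- Sketch only: (path)[1] selects one node from the whole sequence.
SELECT xCol.value('(//author/last-name)[1]', 'nvarchar(50)') LastName
FROM   docs
```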
This query returns the value of the first <last-name> element in each XML instance.

Parent Axis

If the type of a node cannot be determined, it becomes xs:anyType, which is not implicitly cast to any other type. This occurs most notably during navigation using the parent axis (for example, xCol.query('/book/@genre/../price')); the parent node type is determined to be xs:anyType. An element may also be defined as xs:anyType in an XML schema. In both cases, the loss of more precise type information often leads to static type errors, and requires explicit cast of atomic values to their specific type.

data(), text(), and string() Accessors

XQuery has a function fn:data() to extract scalar, typed values from nodes, a node test text() to return text nodes, and the function fn:string() that returns the string value of a node. Their usages are sometimes confusing. Guidelines for their proper use in SQL Server 2005 are as follows. Consider the XML instance <age>12</age>.
Untyped XML: The path expression /age/text() returns the text node "12". The function
A search for the value "MichaelHowardDavidLeBlanc" in the query below returns an empty result since the search value does not equal that of any single text node under an <author> element:
Functions and
Union types require careful handling owing to type checking. Two of the problems are illustrated in the following examples.

Example: Function over Union Type

Consider an element definition for <r> of a union type
The addition operation '+' requires precise types of the operands, so that the expression (//r)[1] + 1 returns a static error with the above type definition for element <r>. One rewrite to fix the problem is xs:int((//r)[1]) + 1.
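The behavior described here can be tried with a sketch like the following. The collection name unionDemo and the member types of the union are assumptions made for illustration; only the element name <r> and the two expressions come from the text above.

```sql
-- Sketch only: collection name and union member types are assumptions.
CREATE XML SCHEMA COLLECTION unionDemo AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="r">
      <xs:simpleType>
        <xs:union memberTypes="xs:int xs:float xs:double"/>
      </xs:simpleType>
    </xs:element>
  </xs:schema>'
GO
DECLARE @x xml (unionDemo)
SET @x = N'<r>41</r>'

-- Per the text, this form raises a static type error on the union type:
--   SELECT @x.query('(//r)[1] + 1')

-- Casting the operand to one member type gives '+' a precise operand type.
SELECT @x.query('xs:int((//r)[1]) + 1')
```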
OpenXML()
You can use multiple value() methods on XML data type in a SELECT clause to generate a rowset of extracted values. The nodes() method yields an internal reference for each selected node which can be used to query further. The nodes() method can operate over an XML column. The combination of nodes() and value() methods can be more efficient in generating the rowset when it has many columns and perhaps the path expressions used in its generation are complex.

The nodes() method yields instances of a special XML data type, each of which has its context set to a different selected node. Such an XML instance supports query(), value(), nodes() and exist() methods, and can be used in count(*) aggregations and IS NULL checks. All other uses result in error.

Example: Use of nodes()

Suppose you want to extract first and last names of authors, whose first name is not "David", as a rowset consisting of two columns, FirstName and LastName. Using nodes() and value() methods, you can achieve this as follows:
SELECT nref.value('first-name[1]', 'nvarchar(50)') FirstName,
       nref.value('last-name[1]', 'nvarchar(50)') LastName
FROM   docs CROSS APPLY xCol.nodes('//author') AS R(nref)
WHERE  nref.exist('.[first-name != "David"]') = 1
In this example, nodes('//author') yields a rowset of references to <author> elements for each XML instance. The first and last names of authors are obtained by evaluating value() methods relative to those references.

SQL Server 2000 provides a facility for generating a rowset from an XML instance using OpenXml(). You can specify the relational schema for the rowset and how values inside the XML instance map to columns in the rowset.

Example: Use of OpenXml() on XML Data Type
We can rewrite the query from the previous example using OpenXml() as shown below, by creating a cursor, reading each XML instance into an XML variable, and applying OpenXML to it:
-- Cursor over the XML instances in the docs table
DECLARE name_cursor CURSOR
FOR
    SELECT xCol
    FROM   docs
OPEN name_cursor
DECLARE @xmlVal XML
DECLARE @idoc int
FETCH NEXT FROM name_cursor INTO @xmlVal

WHILE (@@FETCH_STATUS = 0)
BEGIN
    EXEC sp_xml_preparedocument @idoc OUTPUT, @xmlVal
    SELECT *
    FROM   OPENXML (@idoc, '//author')
           WITH (FirstName varchar(50) 'first-name',
                 LastName  varchar(50) 'last-name') R
    WHERE  R.FirstName != 'David'

    EXEC sp_xml_removedocument @idoc
    FETCH NEXT FROM name_cursor INTO @xmlVal
END
CLOSE name_cursor
DEALLOCATE name_cursor
OpenXml() creates an in-memory representation and uses work tables instead of the query processor. Its parsing procedure sp_xml_preparedocument requires a well-formed XML document and does not accept XML fragments. OpenXML() relies on the XPath 1.0 processor of MSXMLSQL, which is a private version of the MSXML 3.0 processor used by the database engine, instead of the XQuery engine. The work tables are not shared among multiple calls to OpenXml() even on the same XML instance. This limits its scalability. OpenXml() allows you to access an edge table format for the XML data when the WITH clause is not specified. Also, it allows you to use the remainder of the XML value in a separate, "overflow" column.

The combination of nodes() and value() functions uses XML indexes effectively. Thus, this combination can exhibit greater scalability than OpenXml.

Example: Use of OpenXml on a Single XML Instance
OpenXml is often used to shred a single XML instance into a relational form, for example, when the XML data is received on the wire. In this case, no cursor is required. This example shows a stored procedure that accepts a single XML instance and shreds it the same way as the cursor example above.
CREATE PROCEDURE SHRED_SINGLE_XML @xmlVal nvarchar(max)
AS
BEGIN
    DECLARE @idoc INT
    EXEC sp_xml_preparedocument @idoc OUTPUT, @xmlVal
    SELECT *
    FROM   OPENXML (@idoc, '//author')
           WITH (FirstName varchar(50) 'first-name',
                 LastName  varchar(50) 'last-name') R
    WHERE  R.FirstName != 'David'

    -- Release the in-memory document before returning
    EXEC sp_xml_removedocument @idoc
END
DECLARE @xVal XML
SET @xVal = (SELECT xCol FROM docs WHERE pk=1)
information can be found in the MSDN article What's New in FOR XML in Microsoft SQL Server 2005 [ http://technet.microsoft.com/en-us/library/ms345137(printer).aspx ].

Example: SQL View Returning Generated XML Data Type

The following SQL view definition creates an XML view over a relational column (pk) and book authors retrieved from an XML column:
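One possible shape for such a view is sketched below. The view name V, the column name xmlVal, the element names doc and docs, and the use of FOR XML PATH with the TYPE directive are assumptions for illustration, not the article's exact definition. The follow-up query shows the view's XML column being queried like any other XML data type instance.

```sql
-- Sketch only: view, column and element names are assumptions.
-- The inner SELECT aggregates pk and the <author> elements of every row
-- into a single XML instance (TYPE returns it as the XML data type).
CREATE VIEW V (xmlVal) AS
SELECT (SELECT pk, xCol.query('/book/author')
        FROM   docs
        FOR XML PATH('doc'), ROOT('docs'), TYPE)
GO
-- Query the generated XML with an embedded XQuery expression.
SELECT xmlVal.query('//author[first-name = "David"]')
FROM   V
```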
The query execution materializes the XML instance before executing the query() method on it. Hence, this approach does not perform or scale well except when the aggregated XML instance is small.

SQL view definitions are somewhat analogous to XML views created using annotated schemas. However, there are important differences. The SQL view definition is read-only and must be manipulated with embedded XQuery; not so for XML views using annotated schema. Furthermore, the SQL view materializes the XML result before applying the XQuery expression, while XPath queries on XML views evaluate SQL queries on the underlying tables.
You can write row or column constraints to enforce domain-specific constraints during insertion and modification of XML data. Constraints using XML data type methods are allowed only within a scalar user-defined function.
You can write a trigger on the XML column that fires when you insert or update values in the column. The trigger can contain domain-specific validation rules or populate property tables.
You can write SQLCLR functions in managed code to which you pass XML values, and use XML processing capabilities provided by the System.Xml namespace. An example is to apply XSL transformation to XML data, as shown below. Alternatively, you can deserialize the XML into one or more managed classes and operate on them using managed code.
You can write Transact-SQL stored procedures and functions that invoke processing on the XML column for your business needs.
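The first option in the list above, a constraint that wraps an XML data type method in a scalar user-defined function, might look like the following sketch. The function name, the constraint name and the rule itself (every document must contain a /book/title element) are hypothetical; docs and xCol are the table and column used throughout this article.

```sql
-- Sketch only: function name, constraint name and the rule are assumptions.
-- exist() cannot appear directly in a CHECK constraint, so wrap it in a UDF.
CREATE FUNCTION dbo.udf_has_title (@xData xml)
RETURNS bit
AS
BEGIN
    RETURN @xData.exist('/book/title')
END
GO
ALTER TABLE docs
    ADD CONSTRAINT ck_docs_has_title CHECK (dbo.udf_has_title(xCol) = 1)
```

With this constraint in place, an INSERT or UPDATE that stores a document without a <title> under <book> is rejected.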
Consider a CLR function TransformXml() that accepts an XML data type instance and an XSL transformation stored in a file, applies the transformation to the XML data and returns the transformed XML in the result. A skeleton function written in C# is as follows:
using System;
using System.Data.SqlTypes;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public class TransformXml
{
    public static SqlXml ApplyXslTransform (SqlXml XmlData, string xslPath)
    {
        // Load XSL transformation
        XslCompiledTransform xform = new XslCompiledTransform();
        xform.Load (xslPath);

        // Create an editable in-memory document to receive the output
        // (navigators over XPathDocument are read-only and cannot be appended to)
        XmlDocument resultDoc = new XmlDocument();
        XPathNavigator nav = resultDoc.CreateNavigator ();

        // Apply the transformation
        // using makes sure that we flush the writer at the end
        using (XmlWriter writer = nav.AppendChild())
        {
            xform.Transform(XmlData.CreateReader(), writer);
        }

        // Return the transformed value
        SqlXml retSqlXml = new SqlXml (nav.ReadSubtree());
        return (retSqlXml);
    }
}
Once the assembly is registered and a user-defined Transact-SQL function SqlXslTransform() corresponding to ApplyXslTransform() is created, the function can be invoked from Transact-SQL as in the following query:
FROM   docs
WHERE  xCol.exist('/book/title/text()[contains(.,"Secure")]') =1
The query result contains a rowset of the transformed XML.

SQLCLR opens up a whole new world that can be used for decomposing XML data into tables or property promotion, and querying XML data using managed classes in the System.Xml namespace. More information can be found in SQL Server 2005 and Visual Studio "Whidbey" books online.
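For orientation, the registration and invocation steps for the TransformXml class might be sketched as follows. The assembly name, both file paths and the parameter names are hypothetical; EXTERNAL_ACCESS is assumed because the function reads the stylesheet from the file system, which in turn requires the database or assembly to be trusted for that permission set.

```sql
-- Sketch only: assembly name, file paths and parameter names are assumptions.
CREATE ASSEMBLY TransformXmlAsm
FROM 'C:\temp\TransformXml.dll'
WITH PERMISSION_SET = EXTERNAL_ACCESS
GO
-- Bind a T-SQL function to the static method ApplyXslTransform.
CREATE FUNCTION SqlXslTransform (@xmlData xml, @xslPath nvarchar(260))
RETURNS xml
AS EXTERNAL NAME TransformXmlAsm.TransformXml.ApplyXslTransform
GO
-- Transform every book whose title contains "Secure".
SELECT dbo.SqlXslTransform (xCol, N'C:\temp\transform.xsl')
FROM   docs
WHERE  xCol.exist('/book/title/text()[contains(.,"Secure")]') = 1
```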
You can use sql:variable() to use the value of a SQL variable in your XQuery or XML DML expression. You can use sql:column() to use values from a relational column in your XQuery or XML DML expression.
This approach allows applications to parameterize queries, as shown in the example below. However, XML and user-defined types are not permitted in sql:variable() and sql:column().

Example: Data Binding Using sql:variable()

The query below is a modified version of the one shown in Example: Queries on Computed Column Based on XML Data Type Methods. In this version, the ISBN of interest is passed in using a SQL variable @isbn. By replacing the constant with sql:variable(), the query can be used to search for any ISBN, not just the one whose ISBN is 0-7356-1588-2.
DECLARE @isbn varchar(20)
SET     @isbn = '0-7356-1588-2'
SELECT  xCol
FROM    docs
WHERE   xCol.exist ('/book[@ISBN = sql:variable("@isbn")]') = 1
sql:column() can be used in a similar way and provides additional benefits. Indexes over the column may be used for efficiency as decided by the cost-based query optimizer. Furthermore, the computed column may store a promoted property, as discussed in Computed Column Based on XML Data Type.
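As a small illustration of sql:column(), the sketch below binds the relational pk column into an element constructed by XQuery. The element name entry and the table alias D are assumptions; docs, pk and xCol are the names used throughout this article.

```sql
-- Sketch only: the element name <entry> and the alias D are assumptions.
-- sql:column() injects the relational pk value into the constructed XML.
SELECT D.xCol.query('
           <entry pk="{ sql:column("D.pk") }">
               { /book/title }
           </entry>')
FROM   docs D
```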
Catalog views exist to provide metadata information regarding XML usage. A few of these are discussed below.
XML Indexes
XML index entries appear in the catalog view sys.indexes with the index "type" 3. The "name" column contains the name of the XML index.

XML indexes are also recorded in the catalog view sys.xml_indexes, which contains all the columns of sys.indexes and a few special ones meaningful for XML indexes. The value NULL in the column "secondary_type" indicates a primary XML index; the values 'P', 'R' and 'V' stand for PATH, PROPERTY and VALUE secondary XML indexes, respectively.

Space usage of XML indexes can be found in the table-valued function sys.dm_db_index_physical_stats(). It provides information such as the number of disk pages occupied, average row size in bytes, number of records and other information for all index types, including XML indexes. This information is available for each database partition; XML indexes use the same partitioning scheme and partitioning function of the base table.

Example: Space Usage of XML Indexes
SELECT sum (page_count)
FROM   sys.dm_db_index_physical_stats (db_id(), object_id('docs'),
                                       DEFAULT, DEFAULT, 'DETAILED') SDPS
       JOIN sys.xml_indexes SXI
       ON (SXI.object_id = SDPS.object_id AND SXI.index_id = SDPS.index_id)
WHERE  SXI.name = 'idx_xCol_Path'
This yields the number of disk pages occupied by the XML index idx_xCol_Path in table docs across all partitions. Without the sum() function, the result would return the disk page usage per partition.
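The catalog columns described in this section can also be read directly. The following sketch lists the XML indexes on the docs table and labels each as primary or by its secondary kind; the output column alias is an assumption.

```sql
-- Sketch only: lists XML indexes on docs using sys.xml_indexes.
-- secondary_type is NULL for a primary XML index; otherwise 'P', 'R' or 'V'.
SELECT name,
       CASE WHEN secondary_type IS NULL THEN 'PRIMARY'
            ELSE secondary_type_desc
       END AS xml_index_kind
FROM   sys.xml_indexes
WHERE  object_id = object_id('docs')
```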
Write Transact-SQL queries on the appropriate catalog views for XML schema collections. Use the built-in function XML_SCHEMA_NAMESPACE(). You can apply XML data type methods on the output of this function. However, you cannot modify the underlying XML schemas.
Example: Enumerate XML Namespaces in XML Schema Collection

Use the following query for XML schema collection "myCollection":
SELECT XSN.name
FROM   sys.xml_schema_collections XSC JOIN sys.xml_schema_namespaces XSN
       ON (XSC.xml_collection_id = XSN.xml_collection_id)
WHERE  XSC.name = 'myCollection'
Example: Enumerate Contents of an XML Schema Collection

The following statement enumerates the contents of the XML schema collection "myCollection" within (relational) schema dbo.
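A statement of this kind can be written with the built-in function XML_SCHEMA_NAMESPACE() mentioned earlier; the sketch below assumes that the collection myCollection exists in schema dbo.

```sql
-- Sketch only: assumes the collection dbo.myCollection exists.
-- Returns the schemas in the collection as an XML data type instance.
SELECT XML_SCHEMA_NAMESPACE (N'dbo', N'myCollection')
```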
The following statement outputs the XML schema with target namespace "http://www.microsoft.com/books" from the XML schema collection "myCollection" within (relational) schema dbo.
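With XML_SCHEMA_NAMESPACE(), a single schema is selected by passing its target namespace as the optional third argument. As before, this is a sketch that assumes the collection and namespace named in the text exist.

```sql
-- Sketch only: assumes dbo.myCollection contains this target namespace.
SELECT XML_SCHEMA_NAMESPACE (N'dbo', N'myCollection',
                             N'http://www.microsoft.com/books')
```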
Write Transact-SQL queries on catalog views for XML schema namespaces. Create a table containing an XML data type column to store your XML schemas, in addition to loading them into the XML type system. You can query the XML column using the XML data type methods. Furthermore, you can build an XML index on this column. However, maintaining consistency between the XML schemas stored in the XML column and the XML type system is left to the application. For example, if you drop the XML schema namespace from the XML type system, you have to drop it also from your table to preserve consistency.