
2007 Microsoft Corporation. All rights reserved.

Microsoft SQL Server 9.0 Technical Articles

XML Best Practices for Microsoft SQL Server 2005


Shankar Pal, Vishesh Parikh, Vasili Zolotov, Leo Giakoumakis, Michael Rys

Microsoft Corporation

April 2004

Revised April 2007

Applies to: Microsoft SQL Server 2005

Summary: Learn about the guidelines for XML data modeling and usage in Microsoft SQL Server 2005, and see illustrative examples. To get the most from this article, you should have a basic understanding of XML features in SQL Server; for background material, see XML Support in Microsoft SQL Server 2005 [http://technet.microsoft.com/en-us/library/ms345117(printer).aspx] on the Microsoft Developer Network.

Contents

Introduction
Data Modeling
Data Modeling Using XML Data Type
Usage
Data Binding in Queries
Catalog Views for Native XML Support

Introduction
Microsoft SQL Server 2000 and SQLXML Web Releases provide powerful XML data management capabilities. These features focus on the mapping between relational and XML data. XML views of relational data can be defined using annotated XSD (AXSD) to provide an XML-centric approach that supports bulk load of XML data, and query and update capabilities on XML data. Transact-SQL extensions provide a SQL-centric approach for mapping relational query results to XML (using FOR XML) and generating relational views from XML (using OpenXML).

Microsoft SQL Server 2005 provides extensive support for XML data processing. XML values can be stored natively in an XML data type column, which can be typed according to a collection of XML schemas, or left untyped. You can index the XML column. Furthermore, fine-grained data manipulation is supported using XQuery and XML DML, the latter being an extension for data modification. In addition, the SQLXML, FOR XML and OpenXML features have been extended in SQL Server 2005. Together with the newly added native XML support, SQL Server 2005 provides a powerful platform for developing rich applications for semi-structured and unstructured data management. With all the added functionality, users have more design choices for their data storage and application development.

This article provides guidelines for XML data modeling and usage in SQL Server 2005. It is divided into the following two topics:

Data modeling

XML data can be stored in multiple ways in SQL Server 2005 (for example, using native XML data type and XML shredded into tables). This topic provides guidelines for making the appropriate choices for modeling your XML data. It also covers indexing of XML data, property promotion, and typing of XML instances.

Usage

This topic discusses usage-related topics, such as loading XML data into the server and type inference during query compilation, explains and differentiates closely related features, and suggests appropriate use of these features. The ideas are illustrated with examples.

Data Modeling
This section outlines reasons for using XML in SQL Server 2005, provides guidelines for choosing between native XML storage and XML view technology, and gives data modeling suggestions.

Relational or XML Data Model


If your data is highly structured with a known schema, the relational model is likely to work best for data storage. Microsoft SQL Server provides the necessary functionality and the tools you may need. On the other hand, if the structure is flexible (semi-structured or unstructured) or unknown, you have to give due consideration to modeling such data. XML is a good choice if you want a platform-independent model to ensure portability of the data using structural and semantic markup. Furthermore, it is an appropriate option if some of the following properties are satisfied:

Your data is sparse, you do not know the structure of the data, or the structure of your data may change significantly in the future.

Your data represents containment hierarchy (as opposed to references among entities) and may be recursive.

Order of markups and values is inherent in your data.

You want to query into the data or update parts of it based on its structure.

If none of these conditions are met, you should use the relational data model. For example, if your data is in XML format but your application merely uses the database to store and retrieve the data, an [n]varchar(max) column is all you need. Storing the data in an XML column brings additional benefits; the engine checks that the data is well-formed or valid according to a prespecified XML schema, and supports fine-grained query and update of the XML data.

Reasons for Storing XML Data in SQL Server 2005


Here are some reasons for using native XML features in SQL Server 2005 as opposed to managing your XML data in the file system:

You want to use administrative functionality of the database server for managing your XML data (for example, backup, recovery and replication).

You want to share, query, and modify your XML data in an efficient and transacted way. Fine-grained data access is important to your application. For example, you may want to extract some of the sections within an XML document, or you may want to insert a new section without replacing your whole document.

You have relational data and SQL applications, and you want interoperability between relational and XML data within your application. You need language support for query and data modification for cross-domain applications.

You want the server to guarantee well-formed data, and optionally validate your data according to XML schemas.

You want indexing of XML data for efficient query processing and good scalability, and the use of a first-rate query optimizer.

You want SOAP, ADO.NET and OLE DB accesses to the XML data.

If none of these conditions are satisfied, you may be better off storing your data as a non-XML, large object type, such as [n]varchar(max) or varbinary(max).

XML Storage Options

The storage options for XML in SQL Server 2005 are as follows:

Native storage as XML data type:

The data is stored in an internal representation that preserves the XML content of the data, such as containment hierarchy, document order, element and attribute values, and so on. Specifically, the Infoset content of the XML data is preserved (for more information on Infoset, see http://www.w3.org/TR/xml-infoset). It may not be an exact copy of the text XML, since the following information is not retained: insignificant white spaces, order of attributes, namespace prefixes, and XML declaration. For typed XML data type (that is, XML data type bound to XML schemas), the type-relevant information of the post-schema validation Infoset (PSVI), which adds type information to the Infoset, is encoded in the internal representation. This improves parsing speed significantly. (For more information, see the W3C XML Schema specifications at http://www.w3.org/TR/xmlschema-1 and http://www.w3.org/TR/xmlschema-2 and the XQuery 1.0 and XPath 2.0 Data Model working draft at http://www.w3.org/TR/2005/WD-xpath-datamodel-20050211.)

Mapping between XML and relational storage:

Using annotated schema (AXSD), the XML is decomposed into columns in one or more tables, preserving fidelity of the data at the relational level: hierarchical structure is preserved, while order among elements is ignored. The schema cannot be recursive.

Large object storage ([n]varchar(max) and varbinary(max)):

An exact copy of the data is stored. This is useful for special-purpose applications such as legal documents. Most applications do not require an exact copy, and are satisfied with the XML content (Infoset fidelity).

In general, you may need to use a combination of these approaches. For example, you may want to store your XML data in an XML data type column and promote properties from it into relational columns. Or, you may want to use mapping technology and store non-recursive parts in non-XML columns and only the recursive parts in XML data type columns.

Choice of XML Technology

The choice of XML technology (native XML versus XML view) generally depends upon the following factors:

Storage options:

Your XML data may be more suitable for large object storage (for example, a product manual), or more amenable to storage in relational columns (for example, line items converted to XML). Each storage option preserves document fidelity to a different extent.

Query capabilities:

You may find one storage option more suitable than others, based on the nature of your queries and on the extent to which you query your XML data. Fine-grained query of your XML data (for example, predicate evaluation on XML nodes) is supported to varying degrees in the two storage options.

Indexing XML data:

You may want to index the XML data to speed up XML query performance. Indexing options vary with the storage options; you need to make the appropriate choices to optimize your workload.

Data modification capabilities:

Some workloads involve fine-grained modification of XML data (for example, adding a new section within a document), while others do not (for example, Web content). Data modification language support may be important for your application.

Schema support:

Your XML data may be described by a schema, which may or may not be an XML schema document. The support for schema-bound XML depends upon the XML technology.

Needless to say, different choices have different performance characteristics.

Native XML Storage

You can store your XML data in an XML data type column at the server. This is a suitable choice if:

You want a straightforward way of storing your XML data at the server while preserving document order and document structure.

You may or may not have a schema for your XML data.

You want to query and modify your XML data.

You want to index the XML data for faster query processing.

Your application needs system catalog views to administer your XML data and XML schemas.

Native XML storage is useful when you have XML documents with a wide range of structures, or XML documents conforming to different or complex schemas that are too hard to map to relational structures.

Example: Modeling XML Data Using XML Data Type

Consider a product manual in XML format, consisting of a separate chapter for each topic, and multiple sections within each chapter. A section can contain sub-sections, so that <section> is a recursive element. Product manuals contain plenty of mixed content, diagrams and technical material; the data is semi-structured. Users may want to perform contextual search for topics of interest (for example, the section on "clustered index" within the chapter on "indexing"), and query technical quantities.

A suitable storage model for your XML documents is an XML data type column. This preserves the Infoset content of your XML data. Indexing the XML column benefits query performance.

Example: Retaining Exact Copies of XML Data

Suppose government regulations require you to retain exact textual copies of your XML documents (for example, signed documents, legal documents, or stock transaction orders). You may want to store your documents in an [n]varchar(max) column.

For querying, convert the data to XML data type at runtime and execute XQuery on it. The runtime conversion may be expensive, especially when the document is large. If you query often, you can redundantly store the documents in an XML data type column and index it, while you return exact document copies from the [n]varchar(max) column.

The XML column may be a computed column based on the [n]varchar(max) column. However, you cannot create an XML index on a computed XML column, nor can an XML index be built on [n]varchar(max) or varbinary(max) columns.

XML View Technology

By defining a mapping between your XML schemas and the tables in your database, you create an "XML view" of your persistent data. XML bulk load can be used to populate the underlying tables using the XML view. You can query the XML view using XPath 1.0; the query is translated into SQL queries on the tables. Similarly, updates are propagated to those tables as well.

This technology is useful when:

You want to have an XML-centric programming model using XML views over your existing relational data.

You have a schema (XSD, XDR) for your XML data, which an external partner may have provided.

Order is not important in your data, your queryable data is not recursive, or the maximal recursion depth is known in advance.

You want to query and modify the data through the XML view using XPath 1.0.

You want to bulk-load XML data and decompose them into the underlying tables using the XML view.

Examples include relational data exposed as XML for data exchange and web services, and XML data with fixed schema. For more information, see http://msdn.microsoft.com/SQLXML.

You can also publish XML from relational and XML data stored at the server using FOR XML. For more information, refer to Using FOR XML to Generate XML from Rowsets in this article.

Example: Modeling Data Using Annotated XML Schema (AXSD)

Suppose you have existing relational data (for example, customers, orders, and line items) that you would like to manipulate as XML. Define an XML view using AXSD over the relational data. The XML view allows you to bulk-load XML data into your tables, and query and update the relational data using the XML view. This model is useful if you need to exchange data with XML markup with other applications while your SQL applications work uninterrupted.

Hybrid Model

Quite often, a combination of relational and XML data type columns is appropriate for data modeling. Some of the values from your XML data can be stored in relational columns, and the rest, or the entire XML value, stored in an XML column. This may yield better performance (for example, you have full control over the indexes created on the relational columns) and locking characteristics. However, you undertake greater responsibility for managing your data storage.

The values to store in relational columns depend on your workload. For example, if you retrieve entire XML values based on the path expression /Customer/@CustId, then promoting the value of the CustId attribute into a relational column and indexing it may yield faster query performance. On the other hand, if your XML data is extensively and non-redundantly decomposed into relational columns, the reassembly cost may be significant.

For highly structured XML data (for example, the content of a table has been converted into XML), you can map all values to relational columns, possibly using XML view technology.

Data Modeling Using XML Data Type


This section discusses data modeling topics for native XML storage. These include indexing XML data, property promotion, and typed XML data type.

Same or Different Table


An XML data type column can be created in a table containing other relational columns, or in a separate table with a foreign key relationship to a main table.

Create an XML data type column in the same table when one of the following conditions is true:

Your application performs data retrieval on the XML column and does not require an XML index on the XML column, or

You want to build an XML index on the XML data type column, and the primary key of the main table is the same as its clustering key. See the section on Indexing an XML Data Type Column for more details.

Create the XML data type column in a separate table if the following conditions are true:

You want to build an XML index on the XML data type column, but the primary key of the main table is not the same as its clustering key, or the main table does not have a primary key, or the main table is a heap (that is, no clustering key). This may be true if the main table already exists.

You do not want table scans to slow down due to the presence of the XML column in the table, which takes up space whether it is stored in-row or out-of-row.
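
The separate-table layout can be sketched as follows. This is only an illustration under stated assumptions: the tables orders and orderDocs, their columns, and the index name are hypothetical and do not appear elsewhere in this article.

```sql
-- Hypothetical main table: it has no primary key and stays a heap
-- (the UNIQUE constraint creates a nonclustered index by default),
-- so an XML index could not be built on an XML column placed here.
CREATE TABLE orders (
    orderId   INT NOT NULL UNIQUE,
    customer  NVARCHAR(64) NOT NULL
)
GO

-- Hypothetical side table holding the XML. Its primary key is clustered,
-- which is what a primary XML index requires of its base table.
CREATE TABLE orderDocs (
    docId    INT NOT NULL PRIMARY KEY CLUSTERED,
    orderId  INT NOT NULL REFERENCES orders (orderId),
    xCol     XML NOT NULL
)
GO

CREATE PRIMARY XML INDEX idx_orderDocs_xCol ON orderDocs (xCol)
GO
```

Scans over orders no longer touch the XML values, at the price of a join on orderId whenever the XML is needed.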

Granularity of XML Data


The granularity of the XML data stored in an XML column is critical for locking and update characteristics. SQL Server employs the same locking mechanism for both XML and non-XML data. Thus, row-level locking causes all XML instances in the row to be locked. When the granularity is large, locking large XML instances for updates causes throughput to decline in a multiuser scenario. To improve concurrency in a high update scenario, the XML data can be shredded into relational rows in one or more tables. Severe decomposition of this kind may lose object encapsulation and the structure of the XML data, and raises reassembly cost. Updates to an XML instance are performed in-place and incrementally; in other words, without replacing the entire XML instance in most cases. Thus, updating the value of a single attribute is efficient and fairly independent of the size of the XML instance. A balance between data modeling requirements and locking characteristics is important for good design.
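
Incremental updates of this kind are expressed with the modify() method of the XML data type. The sketch below assumes a table docs (pk INT PRIMARY KEY, xCol XML) like the one used in the examples later in this article, with a <book> instance in the row pk = 1; the <edition> element is a hypothetical addition made up for illustration.

```sql
-- Change a single attribute value in place.
UPDATE docs
SET xCol.modify('replace value of (/book/@genre)[1] with "computing"')
WHERE pk = 1

-- Add a new element without resupplying the rest of the document.
-- The <edition> element is hypothetical.
UPDATE docs
SET xCol.modify('insert <edition>2</edition> as last into (/book)[1]')
WHERE pk = 1
```

Both statements still take the usual row lock on the row being updated, so the concurrency considerations above continue to apply.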

Untyped, Typed, and Constrained XML Data Type


The SQL Server 2005 XML data type implements the ISO SQL-2003 standard XML data type. As such, it can store well-formed XML 1.0 documents as well as so-called XML content fragments with text nodes and an arbitrary number of top-level elements in an untyped XML column. The system checks for the well-formedness of the data, does not require the column to be bound to XML schemas, and rejects data that is not well-formed in the extended sense. This is true also of untyped XML variables and parameters.

If you have XML schemas describing your XML data, you can associate the schemas with the XML column to yield typed XML. The XML schemas are used to validate the data, perform more precise type checks during compilation of query and data modification statements than untyped XML, and optimize storage and query processing.

Use untyped XML data type under the following conditions:

You do not have a schema for your XML data.

You have schemas but you do not want the server to validate the data. This is sometimes the case when an application performs client-side validation before storing the data at the server, or temporarily stores XML data invalid according to the schema, or uses XML schema features not supported at the server (for example, key/keyref).

Use typed XML data type under the following conditions:

You have schemas for your XML data and you want the server to validate your XML data according to the XML schemas.

You want to take advantage of storage and query optimizations based on type information.

You want to take better advantage of type information during compilation of your queries, such as static type errors.
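
A minimal sketch of the two flavors follows. The schema collection bookSchemas, the table typedDocs, and the schema content are hypothetical and chosen only for illustration; DOCUMENT restricts each instance to a single top-level element.

```sql
-- Untyped XML accepts a content fragment: text plus several top-level elements.
DECLARE @frag XML
SET @frag = N'some text<a/><b/>'
GO

-- Hypothetical schema collection describing a <book> element with a title.
CREATE XML SCHEMA COLLECTION bookSchemas AS
N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="book">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="title" type="xsd:string"/>
        </xsd:sequence>
        <xsd:attribute name="ISBN" type="xsd:string"/>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>'
GO

-- Typed column bound to the schema collection.
CREATE TABLE typedDocs (
    pk    INT PRIMARY KEY,
    xCol  XML (DOCUMENT bookSchemas)
)
GO

-- Accepted: the instance is valid according to the schema.
INSERT INTO typedDocs
VALUES (1, N'<book ISBN="0-7356-1588-2"><title>Writing Secure Code</title></book>')

-- Would be rejected by validation: <price> is not declared in the schema.
-- INSERT INTO typedDocs VALUES (2, N'<book><title>T</title><price>1</price></book>')
```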

Typed XML columns, parameters and variables can store XML documents or content, which you have to specify as a flag (DOCUMENT or CONTENT, respectively) at the time of declaration. Furthermore, you have to provide one or more XML schemas. Specify DOCUMENT if each XML instance has exactly one top-level element; otherwise, use CONTENT. The query compiler uses the DOCUMENT flag in type checks during query compilation to infer singleton top-level elements.

In addition to typing an XML column, you can use relational (column or row) constraints on typed or untyped XML data type columns. Use constraints under the following conditions:

Your business rules cannot be expressed in XML schemas. For example, the delivery address of a flower shop must be within 50 miles of its business location, which can be written as a constraint on the XML column. The constraint may involve XML data type methods within scalar (as opposed to table-valued) user-defined functions.

Your constraint involves other XML or non-XML columns in the table. An example is the enforcement of the ID of a Customer (/Customer/@CustId) found in an XML instance to match the value in a relational CustomerID column.
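
The CustomerID case can be sketched with a scalar user-defined function wrapped in a CHECK constraint. The function name udf_custIdMatches, the table customers, and the constraint name are hypothetical; only the path /Customer/@CustId and the CustomerID column come from the text above.

```sql
-- Scalar UDF wrapping an XML data type method; the constraint calls the UDF
-- rather than the XML method itself.
CREATE FUNCTION udf_custIdMatches (@x XML, @id INT)
RETURNS BIT
AS
BEGIN
    -- exist() returns 1 when a <Customer> element carries the expected CustId.
    RETURN @x.exist('/Customer[@CustId = sql:variable("@id")]')
END
GO

CREATE TABLE customers (
    CustomerID  INT PRIMARY KEY,
    xCol        XML,
    CONSTRAINT ck_custId CHECK (dbo.udf_custIdMatches(xCol, CustomerID) = 1)
)
GO

-- Accepted: the attribute matches the relational column.
INSERT INTO customers VALUES (7, N'<Customer CustId="7"/>')

-- Would violate ck_custId: the attribute says 8, the column says 7.
-- INSERT INTO customers VALUES (7, N'<Customer CustId="8"/>')
```

Because the function runs for every inserted or updated row, a constraint like this adds cost to data modification in proportion to the complexity of the path expression.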

Document Type Definition (DTD)

XML data type columns, variables, and parameters can be typed using XML schema, but not using DTD. You can convert DTDs to XML schema documents using third-party tools, and load the XML schemas into the database. Inline DTD can be used for both untyped and typed XML instances to supply default values and to replace entity references with their expanded form.
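
As a small sketch of inline DTD handling, the statement below relies on CONVERT with style 2, which enables limited processing of an internal DTD subset; the entity name co and its value are made up for illustration.

```sql
DECLARE @x XML

-- Style 2 enables limited internal DTD subset processing, so the
-- entity reference &co; can be resolved while the text is parsed.
SET @x = CONVERT(XML,
    N'<!DOCTYPE doc [<!ENTITY co "Contoso">]><doc>&co;</doc>',
    2)

SELECT @x
```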

Internal Storage of XML Data


The XML data supplied by a user is stored internally in a binary format, which can be parsed faster than the textual representation of the XML data. This binary format yields some compression in the general case, and is limited to 2 GB per instance. The amount of compression depends upon the length and number of repeating tags, and the types of the values occurring in the XML data. The following example shows how the size of the stored XML data can be computed.

Example: Computing Stored XML Size

We use table docs (pk INT PRIMARY KEY, xCol XML) with an untyped XML column in most of our examples; these can be extended to typed XML in a straightforward way (see SQL Server 2005 Books Online for information on the use of typed XML).

CREATE TABLE docs (pk INT PRIMARY KEY, xCol XML)


For ease of exposition, queries are described for XML data instances such as the following:

INSERT INTO docs VALUES (1,
'<book genre="security" publicationdate="2002" ISBN="0-7356-1588-2">
   <title>Writing Secure Code</title>
   <author>
      <first-name>Michael</first-name>
      <last-name>Howard</last-name>
   </author>
   <author>
      <first-name>David</first-name>
      <last-name>LeBlanc</last-name>
   </author>
   <price>39.99</price>
</book>')

The stored size in bytes of the XML instances in the XML column can be found using the DATALENGTH() function:

SELECT DATALENGTH (xCol) FROM docs

In-Row and Out-of-Row Storage

Small XML data type instances are stored within the rows of a table. Larger values that cannot be accommodated within a disk page are stored out of row with an in-row pointer of 16 bytes. Storing XML values in-row reduces the record density and slows down table scans over the non-XML columns in the table. In such cases, the "large value types out of row" option can be specified in the system stored procedure sp_tableoption to store all large data types off-row.
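
For the docs table used in this article, setting the option might look like the following sketch.

```sql
-- Store large value types (including XML) off-row for the docs table.
EXEC sp_tableoption N'docs', 'large value types out of row', 1

-- Setting the option back to 0 allows in-row storage of small values again.
-- EXEC sp_tableoption N'docs', 'large value types out of row', 0
```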

Indexing an XML Data Type Column


XML indexes can be created on XML data type columns. An XML index covers all tags, values and paths over the XML instances in the column and benefits query performance. Your application may benefit from an XML index under the following conditions:

Queries on XML columns are common in your workload. XML index maintenance cost during data modification must be taken into account.

Your XML values are relatively large and the retrieved parts are relatively small. Building the index avoids parsing the whole data at runtime and benefits index lookups for efficient query processing.

The first index on an XML column is the "primary XML index". Using it, three types of secondary XML indexes can be created on the XML column to speed up common classes of queries, as described below.

Primary XML Index

This indexes all tags, values and paths within the XML instances in an XML column. The base table (that is, the table in which the XML column occurs) must have a clustered index on the primary key of the table; the primary key is used to correlate index rows with the rows in the base table. Full XML instances are retrieved from the XML columns (for example, SELECT *). Queries use the primary XML index, returning scalar values or XML subtrees using the index.

Example: Creating Primary XML Index

The following statement creates a primary XML index called idx_xCol on the XML column xCol of the table docs:

CREATE PRIMARY XML INDEX idx_xCol on docs (xCol)


Secondary XML Indexes

Once the primary XML index has been created, you may want to create secondary XML indexes to speed up different classes of queries within your workload. Three types of secondary XML indexes (PATH, PROPERTY, and VALUE) benefit path-based queries, custom property management scenarios, and value-based queries, respectively. The PATH index builds a B+-tree on the (path, value) pair of each XML node in document order over all XML instances in the column. The PROPERTY index creates a B+-tree clustered on the (PK, path, value) pair within each XML instance, where PK is the primary key of the base table. Finally, the VALUE index creates a B+-tree on the (value, path) pair of each node in document order across all XML instances in the XML column.

Here are some guidelines for creating one or more of these indexes:

If your workload uses path expressions heavily on XML columns, the PATH secondary XML index is likely to speed up your workload. The most common case is the use of the exist() method on XML columns in the WHERE clause of Transact-SQL.

If your workload retrieves multiple values from individual XML instances using path expressions, clustering paths within each XML instance in the PROPERTY index may be helpful. This scenario typically occurs in a property bag scenario when properties of an object are fetched and its relational primary key value is known.

If your workload involves querying for values within XML instances without knowing the element or attribute names that contain those values, you may want to create the VALUE index. This typically occurs with descendant axes lookups, such as //author[last-name="Howard"], where <author> elements can occur at any level of the hierarchy and the search value ("Howard") is more selective than the path. It also occurs in "wildcard" queries, such as /book[@* = "novel"], where the query looks for <book> elements with some attribute having the value "novel".

Example: Path-Based Lookup

Suppose the query below is common in your workload:

SELECT pk, xCol
FROM docs
WHERE xCol.exist ('/book[@genre = "security"]') = 1

The path expression /book/@genre and the value "security" correspond to the key fields of the PATH index. Consequently, a secondary XML index of type PATH is helpful for this workload:

CREATE XML INDEX idx_xCol_Path on docs (xCol) USING XML INDEX idx_xCol FOR PATH

Example: Fetching Properties of an Object

Consider the query below that retrieves the first and last names of the authors of a book from each row in table docs:

-- value() requires a singleton, hence the (...)[1] around each path
SELECT ref.value ('(first-name)[1]', 'nvarchar(64)'),
       ref.value ('(last-name)[1]', 'nvarchar(64)')
FROM docs CROSS APPLY xCol.nodes ('/book/author') R(ref)

The property index is useful in this case and is created as follows:

CREATE XML INDEX idx_xCol_Property on docs (xCol) USING XML INDEX idx_xCol FOR PROPERTY

Example: Value-Based Query

In the following query, a partial path is specified using //, so that the lookup based on the value of ISBN benefits from the use of the VALUE index:

SELECT xCol
FROM docs
WHERE xCol.exist ('//book/@ISBN[. = "0-7356-1588-2"]') = 1

The VALUE index is created as follows:

CREATE XML INDEX idx_xCol_Value on docs (xCol) USING XML INDEX idx_xCol FOR VALUE
XML Index on Multiple File Groups

XML indexes are collocated with the base table; that is, XML index rows are stored in the same file groups and table partitions as the corresponding base table rows. This may sometimes require large file groups for XML blobs and their collocated XML indexes. The TEXTIMAGE_ON <filegroup> specification in the CREATE TABLE statement stores the XML blobs in the specified file group; the XML index rows are still collocated with the base table, while large XML node values are in the same file group as the XML blobs. This reduces the size of the individual file groups and provides more convenience for data management. For example, when the non-XML data in the row is small relative to the size of the XML data, this technique can distribute the storage more evenly.
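
A table definition using this specification might look like the sketch below. The file group name XmlData and the table name docsFg are hypothetical; the file group is assumed to have been added to the database already and to contain at least one file.

```sql
-- Rows go to the PRIMARY file group; XML blobs go to the
-- hypothetical XmlData file group.
CREATE TABLE docsFg (
    pk    INT PRIMARY KEY,
    xCol  XML
) ON [PRIMARY] TEXTIMAGE_ON XmlData
GO
```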

Full-Text Index on XML Column

You can create a full-text index on XML columns; this indexes the content of the XML values while ignoring the XML markup. Attribute values are not full-text indexed (since they are considered part of the markup) and element tags are used as token boundaries. You can combine full-text search with XML index usage in some scenarios:

Filter the XML values of interest using SQL full-text search.

Query those XML instances, which uses the XML index on the XML column.

Example: Combining Full-Text Search with XML Querying

The steps for creating a full-text index on an XML column are identical to those for other SQL type columns. The DDL statements are as follows, in which PK__docs__023D5A04 is the single-column primary key index of the table:

CREATE FULLTEXT CATALOG ft AS DEFAULT
CREATE FULLTEXT INDEX ON dbo.docs (xCol) KEY INDEX PK__docs__023D5A04

Once the full-text index has been created on the XML column, the following query checks that an XML instance contains the word "Secure" in the title of a book:

SELECT *
FROM docs
WHERE CONTAINS(xCol,'Secure')
AND xCol.exist('/book/title/text()[contains(.,"Secure")]') = 1

The CONTAINS() method uses the full-text index to subset the XML instances that contain the word "Secure" anywhere in the document. The exist() clause ensures that the word "Secure" occurs in the title of a book.

Full-text search using CONTAINS() and XQuery contains() have different semantics. The latter is a substring match, while the former is a token match using stemming. Thus, if the search is for the string "run" in the title, then "run", "runs" and "running" all match, since both the full-text CONTAINS() and the XQuery contains() are satisfied. However, the query above does not match the word "UnSecured" in the title (the full-text CONTAINS() fails but the XQuery contains() is satisfied). Furthermore, full-text search employs word stemming, while XQuery contains() is a literal match. In general, for a pure substring match, the full-text CONTAINS() clause should be removed. This difference is illustrated in the next example.

Example: Full-Text Search on XML Values Using Stemming

The XQuery contains() check in Example: Combining Full-Text Search with XML Querying cannot be eliminated in general. Consider the query:

SELECT *
FROM docs
WHERE CONTAINS(xCol,'run')

The word "ran" in the document matches the search condition owing to stemming. Furthermore, the search context is not checked using XQuery.

When XML is decomposed using AXSD into relational columns that are full-text indexed, XPath queries over the XML view do not perform full-text search on the underlying tables.

Support for Different Languages in Full-Text Index on XML Column

Unlike nvarchar or varchar columns that can have only one word breaker for the entire column, an XML data type column supports multiple language word breakers using the xml:lang attribute on XML elements. The word breaker for the specified language is used on the content of that element. A sub-element can specify a different language in an xml:lang attribute. Thus, not only can different XML instances but also a single XML instance can involve multiple word breakers.

This gives rise to interesting possibilities. For example, a Word 2003 document may contain sections in different languages. The document in WordML XML representation can be stored in an XML data type column, and the appropriate language word breakers are used for full-text indexing. A full-text query can specify the language to use, as shown in the example below.

Example: Full-Text Search Specifying a Language

The query below specifies that the full-text search should be performed for the German language.

SELECT *
FROM docs
WHERE contains (xCol, 'visionen', LANGUAGE 'German')
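
For illustration, an instance that mixes languages through xml:lang could be loaded as in the sketch below. The <doc> and <section> element names, the text, and the key value 2 are hypothetical; only the docs table and the xml:lang mechanism come from the article.

```sql
-- The outer element declares English; the second section overrides it
-- with German, so its content is tokenized by the German word breaker.
INSERT INTO docs VALUES (2,
N'<doc xml:lang="en">
    <section>The vision for the product</section>
    <section xml:lang="de">Die Visionen des Produkts</section>
  </doc>')
```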

Property Promotion

If queries are made principally on a small number of element and attribute values (for example, find customers based on customer ID; that is, the value of /Customer/@CustId is specified), you may want to promote those values into relational columns. This is helpful when queries are issued on a small part of the XML data while the entire XML instance is retrieved. Creating an XML index on the XML column is overkill; instead, the promoted column can be indexed. Queries must be written to use the promoted column (that is, the query optimizer does not retarget queries on the XML column to the promoted column).

The promoted column can be a computed column in the same table or a separate, user-maintained column in a table. This is adequate when singleton values (that is, single-valued properties) are promoted from each XML instance. However, for multivalued properties, you have to create a separate table for the property, as described in the following section.

Computed Column Based on XML Data Type

A computed column can be created using a user-defined function (UDF) that invokes XML data type methods. The type of the computed column can be any SQL type, including XML. This is illustrated in the following example.

Example: Computed Column Based on XML Data Type Method

Create the user-defined function for ISBN of books:

CREATE FUNCTION udf_get_book_ISBN (@xData xml)
RETURNS varchar(20)
WITH SCHEMABINDING
BEGIN
   DECLARE @ISBN varchar(20)
   SELECT @ISBN = @xData.value('/book[1]/@ISBN', 'varchar(20)')
   RETURN @ISBN
END

Add a computed column to the table for ISBN:

ALTER TABLE docs ADD ISBN AS dbo.udf_get_book_ISBN(xCol)

The computed column can be indexed in the usual way.

Example: Queries on Computed Column Based on XML Data Type Methods

To obtain the <book> whose ISBN is 0-7356-1588-2, the query:

SELECT pk, xCol
FROM   docs
WHERE  xCol.exist('/book[@ISBN = "0-7356-1588-2"]') = 1

on the XML column can be rewritten to use the computed column as follows:

SELECT pk, xCol
FROM   docs
WHERE  ISBN = '0-7356-1588-2'
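The rewritten query benefits only if the computed column is actually indexed. The following is a minimal sketch, assuming the docs table and the ISBN computed column defined above. The index name idx_docs_ISBN is hypothetical, and the statement succeeds only if SQL Server treats the underlying function as deterministic and precise and the session SET options required for indexes on computed columns are in effect.

```sql
-- Hypothetical index name; assumes docs.ISBN is the computed column added above.
CREATE INDEX idx_docs_ISBN ON docs (ISBN)
```

With such an index in place, the equality predicate on ISBN can be answered by an index seek instead of evaluating exist() on every XML instance.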

You can create a user-defined function that returns XML data type and create a computed column using the UDF. However, you cannot create an XML index on the computed XML column.

Creating Property Tables

You may want to promote some of the multivalued properties from your XML data into one or more tables, create indexes on those tables, and retarget your queries to use them. A typical scenario is one in which a small number of properties cover most of your query workload. You can do the following:

Create one or more tables to hold the multivalued properties. You may find it convenient to store one property per table, and to duplicate the primary key of the base table in the property tables for back join with the base table.

If you want to maintain the relative order of the properties, you need to introduce a separate column for the relative order.

Create triggers on the XML column to maintain the property table(s). Within the triggers, do one of the following:

   Use XML data type methods, such as nodes() and value(), to insert and delete rows of the property table(s). (See the section value(), nodes(), and OpenXML() for more discussion of the nodes() method.)

   Create streaming table-valued function(s) in CLR to insert and delete rows of the property table(s).

Write queries for SQL access to the property tables and XML access to the XML column in the base table, with joins between the tables using their primary key.

Example: Create Property Table

Suppose you want to promote the first name of authors. Books have one or more authors, so first name is a multivalued property. Each first name is stored in a separate row of a property table. The primary key of the base table is duplicated in the property table for back join.

CREATE TABLE tblPropAuthor (propPK int, propAuthor varchar(max))


Example: Create User-Defined Function to Generate a Rowset from XML Instance

The table-valued user-defined function udf_XML2Table below accepts a primary key value and an XML instance. It retrieves the first name of all authors of <book> elements and returns a rowset of (primary key, first name) pairs. Indexing a computed column based on XML data type methods without using a wrapper user-defined function is not supported in SQL Server 2005.

CREATE FUNCTION udf_XML2Table (@pk int, @xCol xml)
RETURNS table
WITH SCHEMABINDING
AS RETURN (
   select @pk as PropPK, nref.value('.', 'varchar(max)') as propAuthor
   from   @xCol.nodes('/book/author/first-name') R(nref)
)

Example: Create Triggers to Populate Property Table

Insert trigger (inserts rows into the property table):

CREATE TRIGGER trg_docs_INS ON docs FOR INSERT
AS
BEGIN
   insert into tblPropAuthor
   select p.*
   from   inserted as I
          CROSS APPLY dbo.udf_XML2Table(I.pk, I.xCol) as P
END
Delete trigger (deletes rows from the property table based on the primary key value of deleted rows):

create trigger trg_docs_DEL on docs for delete
as
begin
   delete tblPropAuthor
   where  propPK IN (select p.PropPK
                     from   deleted as D
                            CROSS APPLY dbo.udf_XML2Table(D.pk, D.xCol) as P)
end

Update trigger (deletes existing rows in the property table corresponding to the updated XML instance and inserts new rows into the property table):

create trigger trg_docs_UPD on docs for update
as
if update(xCol) or update(pk)
begin
   delete tblPropAuthor
   where  propPK IN (select p.PropPK
                     from   deleted as D
                            CROSS APPLY dbo.udf_XML2Table(D.pk, D.xCol) as P)

   insert into tblPropAuthor
   select p.*
   from   inserted as I
          CROSS APPLY dbo.udf_XML2Table(I.pk, I.xCol) as P
end
Example: Find XML Instances Whose Authors Have the First Name "David"

The query can be formulated on the XML column. Alternatively, it can search the property table for the first name "David" and perform a back join with the base table to return the XML instance, as shown here:

SELECT xCol
FROM   docs JOIN tblPropAuthor
       ON docs.pk = tblPropAuthor.propPK
WHERE  tblPropAuthor.propAuthor = 'David'
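For comparison, the same search formulated directly on the XML column might look like the following sketch. It assumes the docs table and the /book/author/first-name structure used throughout this section.

```sql
-- Search the XML column itself instead of the property table.
SELECT xCol
FROM   docs
WHERE  xCol.exist('/book/author/first-name[. = "David"]') = 1
```

Note one difference in results: this form returns each qualifying XML instance once, whereas the back join returns an instance once per matching row in the property table.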

Example: Solution Using CLR Streaming Table-Valued Function

This solution consists of the following steps:

1. Define a CLR class CXmlStreamingTVF that implements IEnumerator and contains a method InitMethod to generate a streaming table-valued output by applying a simple path expression on an XML instance.
2. Create an assembly and a Transact-SQL user-defined function (UDF) to invoke the CLR class.
3. Define insert, update and delete triggers using the UDF to maintain the property table(s).

First, create the streaming CLR function shown below. XML data type is exposed as a managed class SqlXml in ADO.NET; it supports the method CreateReader() that returns an XmlReader:

using System;
using System.Xml;
using System.IO;
using System.Data;
using System.Data.Sql;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections;

public class CXmlStreamingTVF : IEnumerator
{
   private XmlReader m_reader;
   private SqlXml m_doc;
   private string m_name;
   private string[] m_path;
   private int m_pathLoc;

   public CXmlStreamingTVF (SqlXml doc, string simplePath)
   {
      m_doc = doc;
      m_reader = m_doc.CreateReader();
      m_path = simplePath.Split(new char[]{'/'});
      m_pathLoc = m_path.Length-1;
   }

   //Three IEnumerator methods.
   //Custom code for Navigating the document for a simple path.
   public bool MoveNext ()
   {
      bool new_row = false;
      while (!new_row && !m_reader.EOF)
      {
         m_reader.Read();
         if (m_reader.LocalName==m_path[m_pathLoc] &&
             m_pathLoc==m_path.Length-1 &&
             m_reader.NodeType==XmlNodeType.Element)
         {
            m_name = m_reader.ReadString();
            new_row = true;
         }
         else if (m_reader.LocalName==m_path[m_pathLoc] &&
                  m_reader.NodeType==XmlNodeType.Element &&
                  m_reader.IsEmptyElement==false)
         {
            if (m_pathLoc==1 && m_reader.Depth!=0) continue;
            m_pathLoc++;
         }
         else if (m_pathLoc!=1 &&
                  m_reader.LocalName==m_path[m_pathLoc-1] &&
                  m_reader.NodeType==XmlNodeType.EndElement)
         {
            m_pathLoc--;
         }
      }
      return new_row;
   }

   public object Current
   {
      get { return this; }
   }

   public void Reset ()
   {
      m_reader.Close();
      m_reader = m_doc.CreateReader();
   }

   [SqlFunctionAttribute (FillRowMethodName="CLROpenXml")]
   public static IEnumerator InitMethod (SqlXml doc, string simplePath)
   {
      return new CXmlStreamingTVF(doc, simplePath);
   }

   public static void CLROpenXml(Object obj, out string name)
   {
      CXmlStreamingTVF stream = (CXmlStreamingTVF) obj;
      name = stream.m_name;
   }
}
Next, create an assembly and a Transact-SQL user-defined function SQL_streaming_xml_tvf corresponding to the CLR method InitMethod().

CREATE ASSEMBLY CLRXML FROM 'C:\temp\StreamingTVF.dll'
WITH PERMISSION_SET = SAFE
GO

CREATE FUNCTION SQL_streaming_xml_tvf (@xData XML, @xPath nvarchar(max))
RETURNS table (FirstName nvarchar(max))
AS EXTERNAL NAME [CLRXML].[CXmlStreamingTVF].[InitMethod]
GO

The UDF is used to define the table-valued function CLR_udf_XML2Table for rowset generation:

create function CLR_udf_XML2Table (@pk int, @xCol xml)
returns @ret_Table table (FK int, FirstName varchar(max))
with schemabinding
as
begin
   insert into @ret_Table
   select @pk, FirstName
   FROM   dbo.SQL_streaming_xml_tvf (@xCol, '/book/author/first-name')
   return
end

Finally, define triggers as in "Example: Create Triggers to Populate Property Table" with the function CLR_udf_XML2Table replacing udf_XML2Table. Thus, the insert trigger is as follows:

create trigger CLR_trg_docs_INS on docs for insert
as
begin
   insert into tblPropAuthor
   select p.*
   from   inserted as I
          CROSS APPLY dbo.CLR_udf_XML2Table(I.pk, I.xCol) as P
end


The delete and the update triggers are similar to the non-CLR ones and are obtained by merely replacing the function udf_XML2Table() with CLR_udf_XML2Table().

Pros and Cons of These Two Alternatives

When the function udf_XML2Table() used to generate, remove and modify the rows in the property table is CPU intensive, the CLR-based approach is generally faster. This includes XML data whose structure is so complex that XML parsing is computationally expensive. When the evaluation of the function udf_XML2Table() is cheap, the difference diminishes. For small XML sizes and simple path expressions, the cost of context switching may hurt the CLR-based solution more.

Unlike the Transact-SQL and XQuery-based solution, the path expression in the CLR-based solution is hard-coded. This works well as long as the path expressions are known ahead of time. In all other cases, the Transact-SQL and XQuery-based solutions are the only viable ones.

XML Schema Collections

An XML schema collection is a meta-data entity, scoped by a relational schema, which contains one or more XML schemas that may be related (for example, through <xs:import>) or unrelated. Individual XML schemas within an XML schema collection are identified using their target namespace.

An XML schema collection is created using CREATE XML SCHEMA COLLECTION syntax and providing one or more XML schemas. More XML schema components can be added to an existing XML schema, and more schemas can be added to an XML schema collection using ALTER XML SCHEMA COLLECTION syntax. XML schema collections can be secured like any SQL object using SQL Server 2005's security model.

Multityped Column

An XML schema collection C types an XML column xCol according to multiple XML schemas. Additionally, the flag DOCUMENT or CONTENT specifies whether XML trees or fragments, respectively, can be stored in column xCol.

For DOCUMENT, each XML instance specifies the target namespace of its top-level element in the instance, according to which it is validated and typed. For CONTENT, on the other hand, each top-level element can specify any one of the target namespaces in C. The XML instance is validated and typed according to all the target namespaces occurring in an instance.

Schema Evolution

An XML schema collection is used to type XML columns, variables and parameters. It provides a mechanism for XML schema evolution.

Suppose you add an XML schema with target namespace BOOK-V1 to an XML schema collection C. An XML column xCol typed using C can store XML data conforming to the BOOK-V1 schema.

Suppose an application wants to extend the XML schema with new schema components, such as complex type definitions and top-level element declarations. These new schema components can be added to the BOOK-V1 schema and do not require revalidation of the existing XML data in column xCol.

Suppose later the application wants to provide a new version of the XML schema, for which it chooses the target namespace BOOK-V2. This XML schema can be added to C. The XML column can store instances of both BOOK-V1 and BOOK-V2, and execute queries and data modification on XML instances conforming to these namespaces.

Lax Validation Disallowed in Wildcard Sections

The XML schema processor does not support lax validation in wildcard sections (xs:any and xs:anyAttribute) and xs:anyType. For wildcard sections, the XML schema can specify either processContents = "strict" or processContents = "skip". For xs:anyType, only strict validation is supported.

Strict validation ensures that more precise type information regarding the XML nodes instantiating these schema components is known during validation and used during query compilation. Skip semantics loses the typing information, and the corresponding nodes are treated as untyped (xdt:untyped in the case of elements and xdt:untypedAtomic in the case of attributes).

If skip semantics for xs:anyType is desired, then introduce a new complex type that uses xs:any and xs:anyAttribute with processContents = "skip" as shown below:

<xs:complexType name="skipAnyType" mixed="true">
   <xs:sequence>
      <xs:any processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
   </xs:sequence>
   <xs:anyAttribute processContents="skip"/>
</xs:complexType>
Using xs:dateTime, xs:date, and xs:time

Values of type xs:dateTime, xs:date, and xs:time must be specified in ISO 8601 format and include a time zone. Otherwise, the data validation for these values fails. Thus, 2005-05-27T14:11:00.943Z is valid as a value of type xs:dateTime, but the following are not: 2005-05-27 14:11:00.943Z (missing date and time separator "T"), 2005-05-27T14:11:00.943 (missing time zone) and 2005-05-27 14:11:00.943 (missing time separator and time zone). Similarly, 2005-05-27Z is a valid xs:date value but 2005-05-27 is not, since no time zone is specified.

Untyped XML data may contain date, time, and datetime values that an application may wish to convert to the SQL types datetime or smalldatetime. These date, time and datetime values may not conform to ISO 8601 format or contain a time zone. Similarly, typed XML may contain such values as types other than xs:date, xs:time, and xs:dateTime (for example, xs:string). In both cases, the values should be converted first to [n]varchar and then to SQL datetime or smalldatetime, as the following example illustrates.

Example: Extracting datetime Value from Untyped XML

To obtain the value of the CreationTime attribute from the following data:

declare @var xml
select @var = '<QueryExecutionStats>
   <GeneralStats ExecutionCount="1" LastExecutionTime="2005-05-19 14:11:00.943" CreationTime="2005-05-19 14:11:00.913"/>
   <WorkerTime Total="3361" Last="3361" Min="3361" Max="3361"/>
   <PhysicalReads Total="0" Last="0" Min="0" Max="0"/>
   <PhysicalWrites Total="0" Last="0" Min="0" Max="0"/>
   <LogicalReads Total="0" Last="0" Min="0" Max="0"/>
</QueryExecutionStats>'

the value() method is used to retrieve the value as nvarchar(max), which is then cast to the SQL datetime type:

select cast (@var.value('(/QueryExecutionStats/GeneralStats/@CreationTime)[1]', 'nvarchar(max)') AS datetime) as creation_time

Usage

Loading XML Data

Transferring XML Data from SQL Server 2000 to SQL Server 2005

You can transfer XML data to SQL Server 2005 in multiple ways. We discuss a few options:

If you have your data in an [n]text or image column in a SQL Server 2000 database, import the table using, say, DTS, into a SQL Server 2005 database. Change the column type to [n]varchar(max) or varbinary(max), respectively, and then to XML using the ALTER TABLE statement.

You can bulk-copy your data from SQL Server 2000 using bcp out, and bulk-insert into the SQL Server 2005 database using bcp in.

If you have data in relational columns in a SQL Server 2000 database, create a new table with an [n]text column and optionally a primary key column for a row identifier. Use client-side programming to retrieve XML generated at the server with FOR XML, and write it into the [n]text column. Then use the above-mentioned techniques to transfer data to a SQL Server 2005 database. You may choose to write the XML into an XML column in the SQL Server 2005 database directly.
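The first option above can be sketched as two successive type changes. The table name T and the column name xmlText are hypothetical; the column is assumed to arrive as ntext, and every row is assumed to hold text that parses as XML, since the second statement fails otherwise.

```sql
-- Step 1: ntext -> nvarchar(max) (hypothetical table T, column xmlText)
ALTER TABLE T ALTER COLUMN xmlText nvarchar(max)
GO

-- Step 2: nvarchar(max) -> untyped XML
ALTER TABLE T ALTER COLUMN xmlText xml
GO
```

For an image column, the intermediate type would be varbinary(max) instead of nvarchar(max).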

Example: Changing Column Type to XML

Suppose you want to change the type of an [n]text, [n]varchar, varbinary, or XML column XYZ in table R to XML typed using the XML schema collection bookCollection. The following statement performs this type change:

ALTER TABLE R ALTER COLUMN XYZ XML (bookCollection)

The target is untyped XML if no XML schema collection is specified.

Text Encoding

SQL Server 2005 stores XML data in Unicode (UTF-16). XML data retrieved from the server comes out in UTF-16 encoding; if you want a different encoding for data retrieval, your application needs to perform the necessary conversion on the retrieved UTF-16 data.

When converting a string type to XML data type, SQL Server 2005 uses the code page of the collation of the source string to determine the encoding. If XML encoding information is specified using the "encoding" attribute in the XML declaration (for example, <?xml version="1.0" encoding="windows-1256"?>), the encoding must be compatible with the string's code page. The string data can be parsed correctly by the XML parser as long as the two are compatible. Otherwise, an error may be raised or invalid data may be loaded. The same behavior also occurs when a client application sends a string value to the server for conversion to XML data type.

Sometimes, you may have XML data in different encodings, or have no advance knowledge about the encodings. The recommendation in such situations is to provide the XML data as a binary data type (for example, varbinary(max)). The server derives the encoding from the byte-order mark of the data stream (0xFFFE indicates UTF-16) or, if present, the XML declaration.

Consequently, the easiest way of avoiding XML encoding mismatch in an XML parameter is to send the XML data from the client as native XML (using the SqlXml class in ADO.NET) or binary type, or to convert from [var]binary data type to XML at the server. To summarize, the rules are:

If your text XML is in Unicode (UCS-2, UTF-16), assigning it to an XML column, variable or parameter does not pose any problems.

If the encoding is not Unicode and is implicit (due to the source code page), the string code page in the database should be the same as or compatible with the code points that you want to load (use COLLATE if necessary). If no such server code page exists, you have to add an explicit XML declaration to specify the proper encoding.

To use an explicit encoding, either use the varbinary type, which has no interaction with code pages, or use a string type of the appropriate code page. Then assign the data to an XML column, variable or parameter.
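The varbinary route in the last rule can be sketched as follows. The sample documents are made up for illustration. The first statement relies on the string containing only ASCII characters, so that its bytes are the same in UTF-8 as in the source code page; the exact behavior is worth verifying on your own server.

```sql
-- UTF-8 bytes with an explicit XML declaration, supplied as varbinary(max).
DECLARE @utf8 varbinary(max)
SET @utf8 = CAST('<?xml version="1.0" encoding="utf-8"?><book><title>Sample</title></book>' AS varbinary(max))
SELECT CAST(@utf8 AS xml) AS fromUtf8

-- UTF-16 bytes prefixed with the byte order mark 0xFFFE.
DECLARE @utf16 varbinary(max)
SET @utf16 = 0xFFFE + CAST(N'<book><title>Sample</title></book>' AS varbinary(max))
SELECT CAST(@utf16 AS xml) AS fromUtf16
```

In both cases the conversion to XML happens on binary data, so the collation of a string column or variable plays no part in how the bytes are interpreted.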

Thus, if you want to pass UTF-8, it is safest to pass it in as varbinary(max). UTF-16 data can be passed in as nvarchar(max), where no byte order mark is required, or as varbinary(max) with the byte order mark 0xFFFE as the first two bytes to indicate UTF-16 encoding.

Bulk-Loading XML Data

You can bulk-load XML data into the server using SQL Server's bulk-loading capabilities, such as BCP, OPENROWSET, and BULK INSERT. OPENROWSET allows you to load data into an XML column from files. The following example illustrates this point.

Example: Loading XML from Files

This example shows how to insert a row in table docs. The value of the XML column is loaded from file C:\temp\xmlfile.xml as binary LOB (BLOB), and the pk column is supplied the value 10. The file is loaded as a BLOB (instead of a CLOB or NCLOB) to accept any encoding that the XML document may be encoded in.

INSERT INTO docs
SELECT 10, xCol
FROM   (SELECT * FROM OPENROWSET
           (BULK 'C:\temp\xmlfile.xml', SINGLE_BLOB) AS xCol) AS R(xCol)
Non-Binary Collations

The XML collation used for XML data type is a binary collation and is case-sensitive (the so-called Unicode code point collation). Applications may have a different requirement, such as case-insensitive searches. This can be achieved by promoting the appropriate string values into a computed column of type varchar with the appropriate collation. Query the computed column for collation-dependent operations.

Furthermore, suppose the XML column contains German and Chinese data strings. You can use operations specific to each of these collations on two computed columns, one for each of these languages.
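As a sketch of this technique, the following promotes a book title into a computed column that carries a case-insensitive collation. The function name udf_get_book_title, the column name Title, the path /book/title, the collation Latin1_General_CI_AS, and the search string are all assumptions chosen for illustration, not part of the original examples.

```sql
-- Hypothetical wrapper function that extracts the first <title> of a <book>.
CREATE FUNCTION udf_get_book_title (@xData xml)
RETURNS varchar(200)
WITH SCHEMABINDING
BEGIN
   RETURN @xData.value('(/book/title)[1]', 'varchar(200)')
END
GO

-- Computed column with an explicit, case-insensitive collation.
ALTER TABLE docs
   ADD Title AS (dbo.udf_get_book_title(xCol) COLLATE Latin1_General_CI_AS)
GO

-- The comparison is evaluated under the computed column's collation,
-- so it ignores case.
SELECT pk, xCol
FROM   docs
WHERE  Title = 'writing secure code'
```

A second computed column with a different collation can be added in the same way when the data mixes languages.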

XQuery and Type Inference

XQuery [ http://www.w3.org/TR/xquery/ ] embedded in Transact-SQL is the language supported for querying XML data type. The language is under development (currently in last call) by the World Wide Web Consortium (W3C) with the participation of all major database vendors including Microsoft. It includes XPath 2.0 as navigation language. Language constructs for data modification are available on XML data type as well. See Books Online for information on the XQuery constructs, functions and operators supported in SQL Server 2005.

Error Model

Compilation errors are returned from syntactically incorrect XQuery expressions and XML DML statements. The compilation phase checks static type correctness of XQuery expressions and DML statements, and uses XML schemas for type inferences in case of typed XML. It raises static type errors if an expression could fail at runtime due to a type safety violation. Examples of static errors are addition of a string to an integer and querying for a non-existent node for typed data.

As a deviation from the W3C standard, XQuery runtime errors are converted into empty sequences, which may propagate as empty XML or NULL to the query result depending upon the invocation context.

Explicit casting to the proper type allows users to work around static errors, although runtime cast errors will be transformed to empty sequences. The following subsections discuss type checking in greater detail.

Singleton Checks

Location steps, function parameters, and operators (for example, eq) requiring singletons return an error if the compiler cannot determine whether a singleton is guaranteed at runtime. The problem arises often with untyped data and sometimes with typed data. For example, lookup of an attribute requires a singleton parent element; an ordinal selecting a single parent node is adequate. Evaluation of the nodes()-value() combination (see the section value(), nodes(), and OpenXML()) to extract attribute values may not require the ordinal specification, as shown in the next example, since the nodes() method emits singleton context items.

Example: Known Singleton

In this example, the nodes() method generates a separate row for each <book> element. (See the section value(), nodes(), and OpenXML() for a more detailed description of the nodes() method.) The value() method evaluated on a <book> node extracts the value of @genre, which, being an attribute, is a singleton.

SELECT nref.value('@genre', 'varchar(max)') Genre
FROM   docs CROSS APPLY xCol.nodes('//book') AS R(nref)

XML schema is used for type checking of typed XML. If a node is specified as singleton in the XML schema, the compiler uses that information and no error occurs. Otherwise, an ordinal selecting a single node is required. In particular, the use of the descendant axis, such as in /book//title, loses singleton cardinality inference for the <title> element even if the XML schema specifies it to be so. Rewrite it as (/book//title)[1].

It is important to keep the distinction between //first-name[1] and (//first-name)[1] in mind for type checking. The former returns a sequence of <first-name> nodes in which each node is the leftmost <first-name> node amongst its siblings. The latter returns the first, singleton <first-name> node in document order in the XML instance.

Example: Use of value()

The query below on an untyped XML column results in a static, compilation error since value() expects a singleton node as the first argument and the compiler cannot determine whether only one <last-name> node will occur at runtime:

SELECT xCol.value('//author/last-name', 'nvarchar(50)') LastName
FROM   docs

It is tempting to try the following fix:

SELECT xCol.value('//author/last-name[1]', 'nvarchar(50)') LastName
FROM   docs

However, this does not rectify the error since multiple <author> nodes may occur in each XML instance. The following rewrite works:

SELECT xCol.value('(//author/last-name)[1]', 'nvarchar(50)') LastName
FROM   docs

This query returns the value of the first <last-name> element in each XML instance.

Parent Axis

If the type of a node cannot be determined, it becomes xs:anyType, which is not implicitly cast to any other type. This occurs most notably during navigation using the parent axis (for example, xCol.query('/book/@genre/../price')); the parent node type is determined to be xs:anyType. An element may also be defined as xs:anyType in an XML schema. In both cases, the loss of more precise type information often leads to static type errors, and requires explicit cast of atomic values to their specific type.

data(), text(), and string() Accessors

XQuery has a function fn:data() to extract scalar, typed values from nodes, a node test text() to return text nodes, and the function fn:string() that returns the string value of a node. Their usages are sometimes confusing. Guidelines for their proper use in SQL Server 2005 are as follows. Consider the XML instance <age>12</age>.

Untyped XML: The path expression /age/text() returns the text node "12". The function fn:data(/age) returns the string value "12" and so does fn:string(/age).

Typed XML: In SQL Server 2005, the expression /age/text() returns a static error for any simple typed <age> element. On the other hand, fn:data(/age) returns integer 12, while fn:string(/age) yields the string "12".

Used within a query() method, the results of each of these functions and node tests are converted to text nodes and serialized as a single XML data type instance. Text nodes are visually represented by their string value, and the serialization of multiple text nodes appears as a concatenation of their string values. However, a search for the concatenated string or a part of it yields an empty result whenever it does not come from a single text node. This difference is illustrated in the example below.

Example: Serialization of Text Nodes

In the following query, the text nodes under all the <author> elements are retrieved within a query() method. There are four such text nodes in the example used in Example: Creating Primary XML Index, and the serialized output appears as "MichaelHowardDavidLeBlanc" in SQL Server Management Studio.

SELECT xCol.query ('//author/*/text()') LastName
FROM   docs

A search for the value "MichaelHowardDavidLeBlanc" in the query below returns an empty result since the search value does not equal that of any single text node under an <author> element:

SELECT xCol.query('//author/*/text()[. = "MichaelHowardDavidLeBlanc"]')
FROM   docs
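By contrast, a search value that matches one individual text node does produce a result. The following sketch assumes the same sample data, in which "David" is the entire content of a single text node under an <author> element.

```sql
-- Matches because "David" equals the value of one text node on its own.
SELECT xCol.query('//author/*/text()[. = "David"]')
FROM   docs
```

The predicate is evaluated against each text node separately, which is why only values held by a single node can match.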

Functions and Operators over Union Types

Union types require careful handling owing to type checking. Two of the problems are illustrated in the following examples.

Example: Function over Union Type

Consider an element definition for <r> of a union type:

<xs:element name="r">
   <xs:simpleType>
      <xs:union memberTypes="xs:int xs:float xs:double"/>
   </xs:simpleType>
</xs:element>

Within XQuery context, the "average" function fn:avg (//r) returns a static error since the XQuery compiler cannot sum values of different types (xs:int, xs:float, or xs:double) for the <r> elements in the argument of fn:avg(). To get around this, rewrite the function invocation as fn:avg(for $r in //r return xs:double ($r)).

Example: Operator over Union Type

The addition operation '+' requires precise types of the operands, so that the expression (//r)[1] + 1 returns a static error with the above type definition for element <r>. One rewrite to fix the problem is xs:int((//r)[1]) + 1.

value(), nodes(), and OpenXML()

You can use multiple value() methods on XML data type in a SELECT clause to generate a rowset of extracted values. The nodes() method yields an internal reference for each selected node, which can be used to query further. The nodes() method can operate over an XML column. The combination of nodes() and value() methods can be more efficient in generating the rowset when it has many columns and perhaps the path expressions used in its generation are complex.

The nodes() method yields instances of a special XML data type, each of which has its context set to a different selected node. Such an XML instance supports the query(), value(), nodes() and exist() methods, and can be used in count(*) aggregations and IS NULL checks. All other uses result in an error.

Example: Use of nodes()

Suppose you want to extract first and last names of authors whose first name is not "David", as a rowset consisting of two columns, FirstName and LastName. Using nodes() and value() methods, you can achieve this as follows:

SELECT nref.value('first-name[1]', 'nvarchar(50)') FirstName,
       nref.value('last-name[1]', 'nvarchar(50)') LastName
FROM   docs CROSS APPLY xCol.nodes('//author') AS R(nref)
WHERE  nref.exist('.[first-name != "David"]') = 1

In this example, nodes('//author') yields a rowset of references to <author> elements for each XML instance. The first and last names of authors are obtained by evaluating value() methods relative to those references.

SQL Server 2000 provides a facility for generating a rowset from an XML instance using OpenXml(). You can specify the relational schema for the rowset and how values inside the XML instance map to columns in the rowset.

Example: Use of OpenXml() on XML Data Type

We can rewrite the query from the previous example using OpenXml() as shown below, by creating a cursor, reading each XML instance into an XML variable, and applying OpenXML to it:

DECLARE name_cursor CURSOR
FOR
   SELECT xCol
   FROM   docs
OPEN name_cursor
DECLARE @xmlVal XML
DECLARE @idoc int
FETCH NEXT FROM name_cursor INTO @xmlVal

WHILE (@@FETCH_STATUS = 0)
BEGIN
   EXEC sp_xml_preparedocument @idoc OUTPUT, @xmlVal
   SELECT *
   FROM   OPENXML (@idoc, '//author')
          WITH (FirstName varchar(50) 'first-name',
                LastName  varchar(50) 'last-name') R
   WHERE  R.FirstName != 'David'

   EXEC sp_xml_removedocument @idoc
   FETCH NEXT FROM name_cursor INTO @xmlVal
END
CLOSE name_cursor
DEALLOCATE name_cursor
OpenXml() creates an in-memory representation and uses work tables instead of the query processor. Its parsing procedure sp_xml_preparedocument requires a well-formed XML document and does not accept XML fragments. OpenXML() relies on the XPath 1.0 processor of MSXMLSQL, which is a private version of the MSXML 3.0 processor used by the database engine, instead of the XQuery engine. The work tables are not shared among multiple calls to OpenXml() even on the same XML instance. This limits its scalability.

OpenXml() allows you to access an edge table format for the XML data when the WITH clause is not specified. Also, it allows you to use the remainder of the XML value in a separate, "overflow" column.

The combination of nodes() and value() functions uses XML indexes effectively. Thus, this combination can exhibit greater scalability than OpenXml.

Example: Use of OpenXml on a Single XML Instance

OpenXml is often used to shred a single XML instance into a relational form, for example, when the XML data is received on the wire. In this case, no cursor is required. This example shows a stored procedure that accepts a single XML instance and shreds it in the same way as the cursor example above.

CREATE PROCEDURE SHRED_SINGLE_XML @xmlVal XML
AS
BEGIN
   DECLARE @idoc INT
   EXEC sp_xml_preparedocument @idoc OUTPUT, @xmlVal
   SELECT *
   FROM   OPENXML (@idoc, '//author')
          WITH (FirstName varchar(50) 'first-name',
                LastName  varchar(50) 'last-name') R
   WHERE  R.FirstName != 'David'

   EXEC sp_xml_removedocument @idoc
END

The stored procedure can be invoked as shown here:

DECLARE @xVal XML
SET @xVal = (SELECT xCol FROM docs WHERE pk=1)

EXEC SHRED_SINGLE_XML @xVal
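The same shredding can also be expressed without OpenXml by applying nodes() and value() directly to an XML variable. This is a sketch under the same assumptions as the invocation above, namely a docs table containing a row whose pk is 1.

```sql
DECLARE @xVal XML
SET @xVal = (SELECT xCol FROM docs WHERE pk = 1)

-- One row per <author> whose first name is not "David".
SELECT nref.value('first-name[1]', 'nvarchar(50)') FirstName,
       nref.value('last-name[1]', 'nvarchar(50)') LastName
FROM   @xVal.nodes('//author') AS R(nref)
WHERE  nref.exist('.[first-name != "David"]') = 1
```

This form needs no document handle, so there is no counterpart to the sp_xml_preparedocument and sp_xml_removedocument calls.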

'sing 7 R XML to /enerate XML from Ro)sets


You can generate an XML data type instance from a rowset using FOR XML with the new TYPE directive. The result can be assigned to an XML data type column, variable or parameter. Furthermore, FOR XML can be nested to generate any hierarchical structure. This makes nested FOR XML much more convenient to write than FOR XML EXPLICIT, but it may not perform as well for deep hierarchies. FOR XML also introduces a new PATH mode that specifies the path in the XML tree where a column's value should appear.

The new FOR XML TYPE directive can be used to define read-only XML views over relational data with SQL syntax. The view can be queried with SQL statements and embedded XQuery, as the following example shows. For instance, you can refer to such SQL views in stored procedures. More information can be found in the MSDN article What's New in FOR XML in Microsoft SQL Server 2005 [ http://technet.microsoft.com/en-us/library/ms345137(printer).aspx ].

Example: SQL View Returning Generated XML Data Type

The following SQL view definition creates an XML view over a relational column (pk) and book authors retrieved from an XML column:

CREATE VIEW V (xmlVal) AS
SELECT pk, xCol.query('/book/author')
FROM   docs
FOR XML AUTO, TYPE


The view V contains a single row with a single column xmlVal of XML type. It can be queried like a regular XML data type instance. For example, the following query returns the author whose first name is "David":

SELECT xmlVal.query('//author[first-name = "David"]')
FROM   V

The query execution materializes the XML instance before executing the query() method on it. Hence, this approach does not perform or scale well except when the aggregated XML instance is small.

SQL view definitions are somewhat analogous to XML views created using annotated schemas. However, there are important differences. The SQL view definition is read-only and must be manipulated with embedded XQuery; not so for XML views using annotated schema. Furthermore, the SQL view materializes the XML result before applying the XQuery expression, while XPath queries on XML views evaluate SQL queries on the underlying tables.
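
The nesting and PATH mode described in this section can be sketched as follows. The Customers and Orders tables and their columns are hypothetical and serve only as an illustration; they are not part of the article's schema.

```sql
-- Sketch only: nested FOR XML with the TYPE directive and PATH mode.
-- Customers(CustomerID, Name) and Orders(OrderID, CustomerID, OrderDate)
-- are assumed tables used purely for illustration.
SELECT c.CustomerID AS '@id',           -- attribute on <customer>
       c.Name       AS 'name',          -- child element <name>
       (SELECT o.OrderID   AS '@id',
               o.OrderDate AS 'date'
        FROM   Orders o
        WHERE  o.CustomerID = c.CustomerID
        FOR XML PATH('order'), TYPE) AS 'orders'   -- nested XML instance
FROM   Customers c
FOR XML PATH('customer'), ROOT('customers'), TYPE
```

The inner query returns an XML data type instance because of TYPE, so the outer query embeds it as a subtree under the orders element rather than as escaped text.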

Adding Business Logic


Your business logic can be added to XML data in several ways:

You can write row or column constraints to enforce domain-specific constraints during insertion and modification of XML data. Constraints using XML data type methods are allowed only within a scalar user-defined function.

You can write a trigger on the XML column that fires when you insert or update values in the column. The trigger can contain domain-specific validation rules or populate property tables.

You can write SQLCLR functions in managed code to which you pass XML values, and use the XML processing capabilities provided by the System.Xml namespace. An example is to apply an XSL transformation to XML data, as shown below. Alternatively, you can deserialize the XML into one or more managed classes and operate on them using managed code.

You can write Transact-SQL stored procedures and functions that invoke processing on the XML column for your business needs.
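
The first option in the list above, a constraint that calls an XML data type method through a scalar user-defined function, might look like the following sketch. The function name, the constraint name, and the rule itself (every book must have a title) are assumptions made for illustration; only the docs table and xCol column come from this article.

```sql
-- Sketch only: wrap the XML method in a scalar user-defined function,
-- because XML data type methods cannot appear directly in a constraint.
-- dbo.udf_HasTitle and CK_docs_HasTitle are hypothetical names.
CREATE FUNCTION dbo.udf_HasTitle (@x XML)
RETURNS BIT
AS
BEGIN
    RETURN @x.exist('/book/title')   -- 1 when a title element is present
END
GO

ALTER TABLE docs
ADD CONSTRAINT CK_docs_HasTitle CHECK (dbo.udf_HasTitle(xCol) = 1)
```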

Example: Applying XSL Transformation

Consider a CLR function TransformXml() that accepts an XML data type instance and an XSL transformation stored in a file, applies the transformation to the XML data and returns the transformed XML in the result. A skeleton function written in C# is as follows:

using System;
using System.Data.SqlTypes;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public class TransformXml
{
    public static SqlXml ApplyXslTransform (SqlXml XmlData, string xslPath)
    {
        // Load XSL transformation
        XslCompiledTransform xform = new XslCompiledTransform();
        xform.Load (xslPath);

        // Load XML data
        XPathDocument xDoc = new XPathDocument (XmlData.CreateReader());
        XPathNavigator nav = xDoc.CreateNavigator ();

        // Apply the transformation
        // using makes sure that we flush the writer at the end
        using (XmlWriter writer = nav.AppendChild())
        {
            xform.Transform(XmlData.CreateReader(), writer);
        }

        // Return the transformed value
        SqlXml retSqlXml = new SqlXml (nav.ReadSubtree());
        return (retSqlXml);
    }
}

Once the assembly is registered and a user-defined Transact-SQL function SqlXslTransform() corresponding to ApplyXslTransform() is created, the function can be invoked from Transact-SQL as in the following query:

SELECT SqlXslTransform (xCol, 'C:\temp\xsltransform.xsl')
FROM   docs
WHERE  xCol.exist('/book/title/text()[contains(.,"Secure")]') =1

The query result contains a rowset of the transformed XML.

SQLCLR opens up a whole new world that can be used for decomposing XML data into tables or property promotion, and querying XML data using managed classes in the System.Xml namespace. More information can be found in SQL Server 2005 and Visual Studio "Whidbey" Books Online.

Data Binding in Queries


When your data resides in a combination of relational and XML data type columns, you may want to write queries that combine relational and XML data processing. For example, you can convert the data in relational and XML columns into an XML data type instance using FOR XML and query it using XQuery. Conversely, you can generate a rowset from XML values (see Usage) and query it using Transact-SQL. A more convenient and efficient way of writing cross-domain queries is to use the value of a SQL variable or column within XQuery or XML DML expressions:

You can use sql:variable() to use the value of a SQL variable in your XQuery or XML DML expression.

You can use sql:column() to use values from a relational column in your XQuery or XML DML expression.

This approach allows applications to parameterize queries, as shown in the example below. However, XML and user-defined types are not permitted in sql:variable() and sql:column().

Example: Data Binding Using sql:variable()

The query below is a modified version of the one shown in Example: Queries on Computed Column Based on XML Data Type Methods. In this version, the ISBN of interest is passed in using a SQL variable @isbn. By replacing the constant with sql:variable(), the query can be used to search for any ISBN, not just the one whose ISBN is 0-7356-1588-2.

DECLARE @isbn varchar(20)
SET     @isbn = '0-7356-1588-2'
SELECT  xCol
FROM    docs
WHERE   xCol.exist ('/book[@ISBN = sql:variable("@isbn")]') = 1

sql:column() can be used in a similar way and provides additional benefits. Indexes over the column may be used for efficiency as decided by the cost-based query optimizer. Furthermore, the computed column may store a promoted property, as discussed in Computed Column Based on XML Data Type.
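
A use of sql:column() might be sketched as follows. The WantedBooks table and its isbn column are hypothetical and stand in for any relational column that holds the values to match; only docs and xCol come from this article.

```sql
-- Sketch only: bind a relational column into an XQuery predicate.
-- WantedBooks(isbn varchar(20)) is an assumed table for illustration.
SELECT d.pk, d.xCol
FROM   docs d
JOIN   WantedBooks w
  ON   d.xCol.exist('/book[@ISBN = sql:column("w.isbn")]') = 1
```

Each row of WantedBooks supplies one ISBN, so the query returns the books whose ISBN attribute matches any value stored in the relational column.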

Catalog Views for Native XML Support

Catalog views exist to provide meta-data information regarding XML usage. A few of these are discussed below.

XML Indexes

XML index entries appear in the catalog view sys.indexes with the index "type" 3. The "name" column contains the name of the XML index. XML indexes are also recorded in the catalog view sys.xml_indexes, which contains all the columns of sys.indexes and a few special ones meaningful for XML indexes. The value NULL in the column "secondary_type" indicates a primary XML index; the values 'P', 'R' and 'V' stand for PATH, PROPERTY and VALUE secondary XML indexes, respectively.

Space usage of XML indexes can be found in the table-valued function sys.dm_db_index_physical_stats(). It provides information such as the number of disk pages occupied, average row size in bytes, number of records and other information for all index types, including XML indexes. This information is available for each database partition; XML indexes use the same partitioning scheme and partitioning function of the base table.

Example: Space Usage of XML Indexes


SELECT sum (page_count)
FROM   sys.dm_db_index_physical_stats (db_id(), object_id('docs'), DEFAULT, DEFAULT, 'DETAILED') SDPS
       JOIN sys.xml_indexes SXI ON (SXI.index_id = SDPS.index_id)
WHERE  SXI.name = 'idx_xCol_Path'

This yields the number of disk pages occupied by the XML index idx_xCol_Path in table docs across all partitions. Without the sum() function, the result would return the disk page usage per partition.
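
The secondary_type encoding described above can be read back with a query along the following lines. This is a small sketch that assumes the docs table from the earlier examples; the output column name xml_index_kind is made up for illustration.

```sql
-- Sketch only: list the XML indexes on table docs and decode secondary_type.
SELECT name,
       CASE WHEN secondary_type IS NULL THEN 'PRIMARY'
            WHEN secondary_type = 'P'   THEN 'PATH'
            WHEN secondary_type = 'R'   THEN 'PROPERTY'
            WHEN secondary_type = 'V'   THEN 'VALUE'
       END AS xml_index_kind
FROM   sys.xml_indexes
WHERE  object_id = object_id('docs')
```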

Retrieving XML Schema Collections

XML schema collections are enumerated in the catalog view sys.xml_schema_collections. The XML schema collection "sys" is defined by the system and contains predefined namespaces that can be used in all user-defined XML schema collections without having to load them explicitly. This list contains the namespaces for xml, xs, xsi, fn, and xdt. Two other catalog views worth mentioning are: sys.xml_schema_namespaces, which enumerates all namespaces within each XML schema collection; and sys.xml_schema_components, which enumerates all XML schema components within each XML schema.

The built-in function XML_SCHEMA_NAMESPACE(schemaName, XmlSchemacollectionName, namespace-uri) yields an XML data type instance containing XML schema fragments for schemas contained in an XML schema collection, except for the predefined XML schemas. You can enumerate the contents of an XML schema collection in the following ways:

Write Transact-SQL queries on the appropriate catalog views for XML schema collections.

Use the built-in function XML_SCHEMA_NAMESPACE(). You can apply XML data type methods on the output of this function. However, you cannot modify the underlying XML schemas.

These are illustrated in the examples that follow.

Example: Enumerate XML Namespaces in XML Schema Collection

Use the following query for XML schema collection "myCollection":

SELECT XSN.name
FROM   sys.xml_schema_collections XSC
       JOIN sys.xml_schema_namespaces XSN ON (XSC.xml_collection_id = XSN.xml_collection_id)
WHERE  XSC.name = 'myCollection'

Example: Enumerate Contents of an XML Schema Collection

The following statement enumerates the contents of the XML schema collection "myCollection" within (relational) schema dbo.

SELECT XML_SCHEMA_NAMESPACE (N'dbo', N'myCollection')


Individual XML schemas within the collection can be obtained as XML data type instances by specifying the target namespace as the third argument to XML_SCHEMA_NAMESPACE(), as shown below.

Example: Output a Specified Schema from an XML Schema Collection

The following statement outputs the XML schema with target namespace "http://www.microsoft.com/books" from the XML schema collection "myCollection" within (relational) schema dbo.

SELECT XML_SCHEMA_NAMESPACE (N'dbo', N'myCollection', N'http://www.microsoft.com/books')
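
Because the function returns an XML data type instance, XML data type methods can be applied to its output. The sketch below selects the same schema with the query() method instead of the third argument; it assumes the collection "myCollection" from the examples above and relies on the xs prefix being predeclared for XQuery in SQL Server.

```sql
-- Sketch only: filter the output of XML_SCHEMA_NAMESPACE with query().
SELECT XML_SCHEMA_NAMESPACE (N'dbo', N'myCollection').query('
    /xs:schema[@targetNamespace = "http://www.microsoft.com/books"]')
```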

Querying XML Schemas

If you want to query XML schemas that you have loaded into XML schema collections, you may do so in the following ways:

Write Transact-SQL queries on catalog views for XML schema namespaces.

Create a table containing an XML data type column to store your XML schemas, in addition to loading them into the XML type system. You can query the XML column using the XML data type methods. Furthermore, you can build an XML index on this column. However, maintaining consistency between the XML schemas stored in the XML column and the XML type system is left to the application. For example, if you drop the XML schema namespace from the XML type system, you have to drop it also from your table to preserve consistency.
