
JOIN Execution Mechanisms

Vũ Huy Tâm

Following the article "Các Loại JOIN Trong SQL Server" (Types of JOIN in SQL Server), this article introduces the internal mechanisms SQL Server uses to process JOIN queries. Basically, when executing a JOIN, SQL Server walks through the two participating tables, takes pairs of rows to compare, and returns a pair in the result set if it satisfies the JOIN condition or discards it if it does not. SQL Server implements several different algorithms, each suited to different situations (for example, whether many or few rows need to be compared, and whether the JOIN column is indexed). These algorithms are Nested Loop Join, Merge Join, and Hash Join.

Nested Loop Join


This algorithm is very simple and also very efficient for small data sets. It takes each row of one table (called the outer table) and compares it with every row of the other table (called the inner table) to find the rows that match. The algorithm can be written in pseudo-code as follows:

for each row R1 in the outer table
    for each row R2 in the inner table
        if R1.join_column = R2.join_column
            return (R1, R2)

(pseudo-code copied from Craig Freedman's blog)

Example 1:

USE AdventureWorks
GO
SELECT a.SalesOrderDetailID, b.Name
FROM Sales.SalesOrderDetail a
JOIN Production.Product b ON a.ProductID = b.ProductID
WHERE a.SalesOrderID = 43659

In the execution plan for this query you see two Clustered Index Seek operations. The table in the upper operation is always the outer table, and the table below it is always the inner table. Because the number of rows returned is small, the Optimizer chooses Nested Loop Join. In general, Nested Loop Join is a good fit when the outer table has few rows and the inner table is sorted on the JOIN column (for example, when that column has a clustered index). In that case the scan of the inner table (the inner loop) becomes an index seek. When building the execution plan, the Optimizer automatically picks one table as the outer table and the other as the inner table in whichever way gives the lowest cost.

Nested Loop Join is the join operator you most want to see in an execution plan, because it indicates that the query runs efficiently and at low cost. When the number of rows grows, or the inner table is not sorted (because no index supports it), the Optimizer considers Merge Join or Hash Join, since those algorithms may then be more efficient than Nested Loop Join.

These algorithms must be distinguished from the JOIN types you use when writing code, such as INNER JOIN and OUTER JOIN. In Microsoft's terminology, the JOIN types are logical operators, while the algorithms above are physical operators. When writing code, you use logical operators to express what you need; the system then evaluates and picks a suitable physical operator (one of the three algorithms) to execute the statement.
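The pseudo-code above can be turned into a small runnable sketch. The Python below is only an illustration of the idea, not SQL Server's implementation: rows are plain dicts, the sample data is made up, and the dictionary `product_index` is a hypothetical stand-in for an index on the inner table's join column.

```python
from typing import Iterable, Iterator

Row = dict  # a row is modeled as a plain dict; purely illustrative


def nested_loop_join(outer: Iterable[Row], inner: list, key: str) -> Iterator[tuple]:
    """Naive nested loop join: compare every outer row with every inner row."""
    for r1 in outer:
        for r2 in inner:
            if r1[key] == r2[key]:
                yield r1, r2


def index_nested_loop_join(outer: Iterable[Row], inner_index: dict, key: str) -> Iterator[tuple]:
    """Same join, but the inner scan is replaced by a lookup (stand-in for an index seek)."""
    for r1 in outer:
        for r2 in inner_index.get(r1[key], []):
            yield r1, r2


# Made-up sample rows (not real AdventureWorks data).
order_details = [
    {"SalesOrderDetailID": 1, "ProductID": 10},
    {"SalesOrderDetailID": 2, "ProductID": 20},
    {"SalesOrderDetailID": 3, "ProductID": 10},
]
products = [
    {"ProductID": 10, "Name": "Product A"},
    {"ProductID": 20, "Name": "Product B"},
    {"ProductID": 30, "Name": "Product C"},
]

# Hypothetical "index" on the inner table's join column.
product_index = {}
for p in products:
    product_index.setdefault(p["ProductID"], []).append(p)

naive = [(a["SalesOrderDetailID"], b["Name"])
         for a, b in nested_loop_join(order_details, products, "ProductID")]
seek = [(a["SalesOrderDetailID"], b["Name"])
        for a, b in index_nested_loop_join(order_details, product_index, "ProductID")]
```

Both functions return the same pairs. The second one does one lookup per outer row instead of scanning the whole inner table, which is why a small outer input combined with an indexed inner table is the favorable case.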

Merge Join
This technique requires both tables to be sorted on the JOIN column. It reads a row from each table and compares the pair. If they match, the pair is sent to the result set. If not, it discards the row with the smaller JOIN value, reads the next row from that table, and continues. With this algorithm, both tables are read from the beginning and advance in parallel. The pseudo-code of Merge Join is as follows:

get first row R1 from table 1
get first row R2 from table 2
while not at the end of either table
begin
    if R1.join_column = R2.join_column
    begin
        return (R1, R2)
        get next row R2 from table 2
    end
    else if R1.join_column < R2.join_column
        get next row R1 from table 1
    else
        get next row R2 from table 2
end

(pseudo-code copied from Craig Freedman's blog)

Example 2: the same query as in the previous section, but without the WHERE clause.

SELECT a.SalesOrderDetailID, b.Name
FROM Sales.SalesOrderDetail a
JOIN Production.Product b ON a.ProductID = b.ProductID
--WHERE a.SalesOrderID = 43659

For this statement Merge Join becomes appropriate, because many rows are returned and both inputs are sorted (more precisely, for the SalesOrderDetail table SQL Server only needs to read the index on ProductID, and an index is of course already sorted). The number of comparisons, and therefore the cost of the algorithm, is proportional to the sum of the row counts of the two tables. This is why Merge Join outperforms Nested Loop Join as the number of rows grows. In many cases the algorithm finishes as soon as it has scanned the smaller table, because scanning the rest of the other table could not produce any more matches. In that case the number of comparisons is only about twice the row count of the smaller table.

If one of the two tables is not already sorted, the Optimizer has two options: (1) sort that table on the JOIN column and then apply Merge Join, or (2) switch to Hash Join. Whichever option is cheaper is chosen.
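A minimal Python sketch of the merge step described above, under the same assumption as the pseudo-code: table 1 (`unique_side`) has no duplicate join values, and both inputs are already sorted on the join column. The function name, the sample rows, and the way comparisons are counted (one per loop step) are illustrative choices, not SQL Server internals.

```python
def merge_join(unique_side, many_side, key):
    """Merge join of two inputs that are already sorted on `key`.

    Assumes `unique_side` has no duplicate join values (the one-to-many case).
    Returns the matched pairs and the number of loop steps taken.
    """
    result, comparisons = [], 0
    i = j = 0
    while i < len(unique_side) and j < len(many_side):
        comparisons += 1
        r1, r2 = unique_side[i], many_side[j]
        if r1[key] == r2[key]:
            result.append((r1, r2))
            j += 1  # keep r1: the next many-side row may match it too
        elif r1[key] < r2[key]:
            i += 1  # r1 is too small to match anything further on
        else:
            j += 1  # r2 has no partner on the unique side
    return result, comparisons


# Made-up sample rows, both lists sorted on ProductID.
products = [{"ProductID": pid, "Name": f"Product {pid}"} for pid in (1, 2, 3, 4)]
details = [{"SalesOrderDetailID": n, "ProductID": pid}
           for n, pid in enumerate((1, 1, 2, 4, 4, 4), start=1)]

pairs, comparisons = merge_join(products, details, "ProductID")
```

Each loop step advances at least one of the two cursors, so the step count can never exceed the sum of the two row counts, which matches the cost argument in the text.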

Hash Join
This algorithm works best with large amounts of data that are not already sorted. It runs in two phases: build and probe. In the build phase, it scans one table (called the build table), hashes each row on the JOIN column, and builds a hash table in memory. In the probe phase, it reads the second table (called the probe table), hashes each row on the JOIN column in the same way, and uses the hash value to look up the hash table. Each time a match is found, the corresponding pair of rows is sent to the result set. Pseudo-code:

for each row R1 in the build table
begin
    calculate hash value on R1.join_column
    insert R1 into the appropriate hash bucket
end
for each row R2 in the probe table
begin
    calculate hash value on R2.join_column
    for each row R1 in the corresponding hash bucket
        if R1.join_column = R2.join_column
            return (R1, R2)
end

(pseudo-code copied from Craig Freedman's blog)

Example 3: the same as Example 2, but with the OrderQty column added to the SELECT list.

SELECT a.SalesOrderDetailID, a.OrderQty, b.Name
FROM Sales.SalesOrderDetail a
JOIN Production.Product b ON a.ProductID = b.ProductID
--WHERE a.SalesOrderID = 43659

In this example Hash Join is used, even though the number of rows returned is the same as in Example 2. Note that in Example 2, reading the index on ProductID was enough for the SalesOrderDetail table, and because the join input was already sorted, Merge Join was used. In Example 3, because the OrderQty column is added, the index on ProductID alone is no longer enough and the system has to read the table itself as well. The input to the join is then no longer sorted, so Hash Join becomes the better fit.

In situations like this Hash Join has an advantage over Merge Join because building a hash table is faster than sorting a table, and it only has to be done for one of the two tables. The Optimizer usually picks the smaller table to build the hash table so that it does not take up too much memory. Even so, compared with Nested Loop Join and Merge Join, this algorithm requires a lot of CPU and memory.

When Hash Join appears in an execution plan, it is a sign that the amount of data to process is fairly large (because there is no WHERE clause, either on purpose or by omission) and that no index supports the join. In OLTP systems, which are characterized by many short transactions, a Hash Join most likely means the statement is not being executed optimally. In a data warehouse environment, by contrast, operations typically process large amounts of data, so Hash Join is used very frequently.
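The build and probe phases can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: everything fits in memory, Python's built-in `hash` stands in for whatever hash function the engine uses, and the function name and sample rows are made up.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Two-phase hash join: build a hash table on one input, probe it with the other."""
    # Build phase: put each build row into the bucket for its join value's hash.
    buckets = defaultdict(list)
    for r1 in build_rows:
        buckets[hash(r1[key])].append(r1)

    # Probe phase: hash each probe row and compare only within the matching bucket.
    result = []
    for r2 in probe_rows:
        for r1 in buckets.get(hash(r2[key]), []):
            if r1[key] == r2[key]:  # re-check: different values can share a bucket
                result.append((r1, r2))
    return result


# Made-up sample rows; neither input is sorted on ProductID.
products = [{"ProductID": pid, "Name": f"Product {pid}"} for pid in (3, 1, 2)]
details = [
    {"SalesOrderDetailID": 1, "ProductID": 2, "OrderQty": 5},
    {"SalesOrderDetailID": 2, "ProductID": 1, "OrderQty": 1},
    {"SalesOrderDetailID": 3, "ProductID": 2, "OrderQty": 7},
    {"SalesOrderDetailID": 4, "ProductID": 9, "OrderQty": 2},
]

joined = hash_join(products, details, "ProductID")
rows = [(d["SalesOrderDetailID"], d["OrderQty"], p["Name"]) for p, d in joined]
```

The smaller input (`products`) is used as the build side, mirroring the Optimizer's usual preference described above, so the in-memory table stays small while the larger input is read only once.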

Conclusion

This article has introduced the algorithms SQL Server uses to execute JOIN statements. In practice the algorithms are more complex and have many variants that the Optimizer tunes for each specific situation. However, the deepest level you can see into the system is knowing which algorithm is used for a statement, since Microsoft hides all the details underneath. Understanding how the algorithms work gives you one more tool for optimizing statements.

For example, with the query from the Hash Join section, when you look at the execution plan and see Hash Join being used, you know this may be a sign that the statement is not running optimally. You then try to change things so that the system switches to Merge Join (because many rows are returned, Nested Loop Join is certainly not suitable). To use Merge Join the input has to be sorted, so you can modify the index on ProductID to also cover the OrderQty column. The statement is then executed with Merge Join and performance improves noticeably:

--Create a copy of Sales.SalesOrderDetail
SELECT * INTO Sales.SalesOrderDetail_2 FROM Sales.SalesOrderDetail
GO

--Create the indexes on the copy
CREATE CLUSTERED INDEX IX_SalesOrderDetail_SalesOrderID_SalesOrderDetailID_2
ON Sales.SalesOrderDetail_2 (SalesOrderID, SalesOrderDetailID)
GO
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_ProductID_2
ON Sales.SalesOrderDetail_2 (ProductID) INCLUDE (OrderQty)
GO

--Compare the two statements:
--Query 1: the statement on the original table
SELECT a.SalesOrderDetailID, a.OrderQty, b.Name
FROM Sales.SalesOrderDetail a
JOIN Production.Product b ON a.ProductID = b.ProductID

--Query 2: the statement on the copy
SELECT a.SalesOrderDetailID, a.OrderQty, b.Name
FROM Sales.SalesOrderDetail_2 a
JOIN Production.Product b ON a.ProductID = b.ProductID

1. Shared Locks (S)
- Shared Lock = Read Lock. When a unit of data is read, SQL Server automatically places a Shared Lock on it (except when NOLOCK is used).
- A Shared Lock can be placed on a table, a page, a key, or a row.
- Several transactions can hold a Shared Lock on the same unit of data at the same time.
- An Exclusive Lock cannot be placed on a unit of data that currently has a Shared Lock.
- A Shared Lock is normally released as soon as the data has been read, unless the transaction is configured to hold shared locks until it ends.

2. Exclusive Locks (X)
- Exclusive Lock = Write Lock.
- When a write operation (insert, update, delete) is performed on a unit of data, SQL Server automatically places an Exclusive Lock on it.
- An Exclusive Lock is always held until the end of the transaction.
- At any moment, at most one transaction can hold an Exclusive Lock on a unit of data.
- An Exclusive Lock cannot be placed on a unit of data that currently has a Shared Lock.

3. Update Locks (U)
- Update Lock = Intent-to-update Lock.
- An Update Lock is used when reading data with the intention of writing back to the same unit of data afterwards.
- An Update Lock is an intermediate lock mode between Shared Lock and Exclusive Lock.

Shared Lock                                  Update Lock
Compatible with Shared Lock                  Compatible with Shared Lock
Used when reading data                       Used when reading data
Many Shared Locks can exist on the same      At most one Update Lock can exist on a
unit of data at a time                       unit of data at a time

- An Update Lock does not prevent other Shared Locks from being placed on the same unit of data, so Update Lock is compatible with Shared Lock.
- An Update Lock helps avoid deadlocks when a Shared Lock has to be converted to an Exclusive Lock on a unit of data (because at any moment there is at most one Update Lock on a unit of data).

In summary, the lock types are compatible as follows (two lock types x and y are called compatible if two transactions can hold them on the same unit of data at the same time):

                  Shared Lock   Update Lock   Exclusive Lock
Shared Lock            +             +              -
Update Lock            +             -              -
Exclusive Lock         -             -              -
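The compatibility rules for Shared, Update, and Exclusive locks can be written down as a small lookup table. The Python below is a sketch for reasoning about those rules only; `COMPATIBLE` and `can_grant` are made-up names and do not correspond to any SQL Server API.

```python
# (held, requested) -> can both be held on the same unit of data at once?
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


many_readers = can_grant("S", ["S", "S"])      # readers do not block each other
update_with_reader = can_grant("U", ["S"])     # an update lock can join readers
second_update = can_grant("U", ["S", "U"])     # but only one update lock at a time
write_over_reader = can_grant("X", ["S"])      # a writer waits for readers
write_on_free_row = can_grant("X", [])         # nothing held: granted
```

The `second_update` case is the one that matters for deadlock avoidance: because a second Update Lock is refused up front, two transactions can never both be holding a read-type lock while each waits to convert it to an Exclusive Lock.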

Isolation Levels
Vũ Huy Tâm

The isolation level is a property of a transaction. It determines how isolated the transaction is from data that is being updated by another transaction. While an updating transaction is in progress, part of the data is changing (for example, some rows of a table are modified or deleted and some new ones are added). So what happens to other transactions or queries that run at the same time and touch the same rows? Do they have to wait until the first transaction finishes, or can they run in parallel, and do they see the data as it is during the update or after it? You can control this behavior by setting the isolation level of each transaction. SQL Server provides the following isolation levels, in increasing order of isolation: Read Uncommitted, Read Committed, Repeatable Read, and Serializable. Starting with version 2005, a new level called Snapshot was added. The rest of this article goes into the details of each level.

1. Read Uncommitted
When a transaction runs at this level, its queries can still access rows that are being updated by another transaction and receive the data as it is at that moment, even though that data has not been committed (uncommitted data). If for some reason the original transaction rolls back its updates, the data returns to its old values, and the second transaction has then read incorrect data. Consider the following example:

CREATE TABLE dbo.Item (id INT, name VARCHAR(50))
INSERT INTO dbo.Item SELECT 1, 'a'
INSERT INTO dbo.Item SELECT 2, 'b'
INSERT INTO dbo.Item SELECT 3, 'c'
SELECT * FROM dbo.Item

Now open two windows in Management Studio. In the first window, enter:

BEGIN TRAN
UPDATE dbo.Item SET name = 'x' WHERE id > 2
WAITFOR DELAY '00:00:10' --wait for 10 seconds
ROLLBACK

And in the second window, enter:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT * FROM dbo.Item

Now run the code in the first window and quickly switch to the second window and run its code. The second window returns row 3 with name = 'x'. However, the transaction in the first window is then rolled back, and once both transactions have finished, row 3 is back to its original value name = 'c'. The transaction in the second window therefore read incorrect data, because that data was never committed. This phenomenon is called an uncommitted read, also known as a dirty read.

The advantage of this isolation level is higher concurrency in the database: readers do not have to wait for writers to finish and can get data right away. Put simply, the reader is saying "I do not care whether the data is being updated, just give me whatever is there right now". Whether you can use this level depends on your application. If incorrect reads like the one above are unacceptable, you need a higher isolation level. If they can be tolerated, this level helps improve read performance.

Note that this isolation level is equivalent to the NOLOCK hint when querying a table. The code in the second window is equivalent to:
SELECT * FROM dbo.Item WITH (NOLOCK)
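The two-window experiment can also be mimicked with a toy in-memory model. The Python below is purely a thought aid: `ToyTable` and `WouldBlock` are made-up names, there is a single writer, and "blocking" is represented by raising an exception instead of actually waiting. It does not reflect how SQL Server implements locking.

```python
class WouldBlock(Exception):
    """Raised where a real reader would have to wait for the writer."""


class ToyTable:
    def __init__(self, rows):
        self.committed = dict(rows)  # row id -> committed value
        self.pending = {}            # uncommitted changes of the one open writer

    def update(self, row_id, value):
        self.pending[row_id] = value

    def rollback(self):
        self.pending.clear()

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def read(self, row_id, level):
        if row_id in self.pending:
            if level == "READ UNCOMMITTED":
                return self.pending[row_id]  # dirty read: value may be rolled back
            raise WouldBlock("reader would wait for the writer to finish")
        return self.committed[row_id]


table = ToyTable({1: "a", 2: "b", 3: "c"})
table.update(3, "x")                            # window 1: UPDATE inside an open transaction
dirty = table.read(3, "READ UNCOMMITTED")       # window 2 sees the uncommitted value

try:
    table.read(3, "READ COMMITTED")
    blocked = False
except WouldBlock:
    blocked = True                              # a stricter reader would have waited

table.rollback()                                # window 1: ROLLBACK
after = table.read(3, "READ COMMITTED")         # the value the row really ends up with
```

The value in `dirty` never existed in any committed state of the table, which is exactly what makes the read "dirty".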

2. Read Committed
This is the default isolation level; if you do not set anything, the transaction runs at this level. The transaction cannot read data that is being updated and has to wait until the update is finished. It therefore avoids the dirty reads of the previous level. Now change the code in the second window to:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
SELECT * FROM dbo.Item WHERE id > 2

Run both windows again in the same order as before. The second window no longer returns immediately; it waits until the first window has finished, and this time it returns correct data. However, if the second transaction inserts a row that falls within the range being updated by the first transaction, it is still allowed to do so, and this interferes with the first transaction. Now change the code in the two windows to:

Window 1:

BEGIN TRAN
UPDATE dbo.Item SET name = 'x' WHERE id > 2
WAITFOR DELAY '00:00:10' --wait for 10 seconds
--ROLLBACK
COMMIT
SELECT * FROM dbo.Item WHERE id > 2

Window 2:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
INSERT INTO dbo.Item SELECT 5, 'e'

After running both windows, you will see that the result contains row 5 with name = 'e'. This is unexpected, because according to the order of the statements in the first window, every row with id > 2 has been updated. In this scenario, row 5 appeared after the table was updated but before the transaction ended. It is therefore called a phantom row.

3. Repeatable read
This isolation level works like Read Committed but goes one step further: it prevents a transaction from writing to data that is being read by another transaction until that other transaction finishes. Back to the two windows:

Window 1 (the reader, which is the transaction that must run at Repeatable Read for its reads to be protected):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT * FROM dbo.Item
WAITFOR DELAY '00:00:10' --wait for 10 seconds
SELECT * FROM dbo.Item
COMMIT

Window 2:

UPDATE dbo.Item SET name = 'x' WHERE id > 2
SELECT * FROM dbo.Item

When you run the code in the two windows one right after the other, the two SELECT statements in window 1 return the same result, because window 2 has to wait until window 1 has finished before it can proceed. This isolation level guarantees that reads within the same transaction return the same result; in other words, data that has been read is protected from updates by other transactions. However, it does not protect against newly inserted rows: if you replace the UPDATE in the second window with an INSERT, the two SELECT statements in the first window return different results. So this level still does not avoid phantom rows.

4. Serializable
This isolation level goes one step further and locks the entire range of rows that could be affected by another transaction, whether by an UPDATE/DELETE of existing rows or an INSERT of new rows. Replace the code in window 1 with:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT * FROM dbo.Item
WAITFOR DELAY '00:00:10' --wait for 10 seconds
SELECT * FROM dbo.Item
COMMIT

and the code in window 2 with:
INSERT INTO dbo.Item SELECT 4,'d'

Window 2 is blocked until window 1 has finished, and the two SELECT statements in window 1 return the same result.
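The difference between Repeatable Read and Serializable comes down to what the reader protects: only the rows it has read, or also the gaps in the range it has read. The Python function below is a rough sketch of that rule, with made-up names and a simplified view in which the range is a pair of key bounds. It is not a model of SQL Server's actual key-range locking.

```python
def writer_blocked(reader_level, rows_read, key_range, op, row_id):
    """Return True if a concurrent writer must wait for the reader's transaction.

    reader_level: "REPEATABLE READ" or "SERIALIZABLE" (the only levels modeled here)
    rows_read:    ids of the rows the reader has already read
    key_range:    (low, high) bounds of the range the reader scanned
    """
    low, high = key_range
    if op in ("UPDATE", "DELETE"):
        # Both levels keep their locks on rows that were already read.
        return row_id in rows_read
    if op == "INSERT":
        # Only SERIALIZABLE also protects the gaps inside the scanned range.
        return reader_level == "SERIALIZABLE" and low <= row_id <= high
    raise ValueError(f"unknown operation: {op}")


rows_read = {1, 2, 3}                          # SELECT * FROM dbo.Item read these ids
whole_table = (float("-inf"), float("inf"))    # an unfiltered scan covers every key

update_rr = writer_blocked("REPEATABLE READ", rows_read, whole_table, "UPDATE", 3)
insert_rr = writer_blocked("REPEATABLE READ", rows_read, whole_table, "INSERT", 4)
insert_ser = writer_blocked("SERIALIZABLE", rows_read, whole_table, "INSERT", 4)
```

Under these assumptions `insert_rr` is the phantom case: the insert goes through, so a second read of the same range would see an extra row.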

5. Snapshot
This level prevents the same read phenomena as Serializable, but it works differently. When a transaction selects rows, it does not lock them; instead it reads from a copy (snapshot) of the data. Inserts and updates made by other transactions on those rows therefore do not affect the original transaction. The effect is less blocking between transactions while data integrity is still preserved. The price is extra space to store the row versions, which SQL Server keeps in tempdb; with many concurrent transactions this can grow very large. To use this level you first need to enable it as a database option (a transaction then requests it with SET TRANSACTION ISOLATION LEVEL SNAPSHOT):
ALTER DATABASE TestDB SET ALLOW_SNAPSHOT_ISOLATION ON

When to use each isolation level

Isolation levels 1 to 4 above are listed in increasing order of isolation, which increases data integrity and transaction consistency. At the same time, they increase the time transactions spend waiting for each other. The higher the level, the stricter the integrity requirement and the more situations there are in which one transaction prevents others from accessing the data it is working on. This increases locking and blocking in the database (except for Snapshot, which instead increases the space needed for row versions), and system performance drops as a result.

In general, Read Committed (the default level) is suitable for most applications. A few critical functions (for example, an admin page that updates data affecting the whole system) may need stronger integrity and therefore a higher isolation level. Other functions may prioritize speed and can tolerate slightly inconsistent data, in which case you can drop down to Read Uncommitted. The table below summarizes the behavior of each isolation level.

Isolation level     Dirty read   Nonrepeatable read   Phantom read
Read Uncommitted    Yes          Yes                  Yes
Read Committed      No           Yes                  Yes
Repeatable Read     No           No                   Yes
Serializable        No           No                   No
Snapshot            No           No                   No
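The summary table can be encoded as data and used to pick the least restrictive level that still rules out the phenomena an application cannot tolerate. This is a sketch: `PHENOMENA` and `weakest_level_preventing` are made-up names, and Snapshot is left out because it works by versioning rather than locking and so does not fit the same ordering.

```python
# Phenomena each lock-based level still allows, from weakest to strongest level.
PHENOMENA = {
    "READ UNCOMMITTED": {"dirty read", "nonrepeatable read", "phantom read"},
    "READ COMMITTED": {"nonrepeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}


def weakest_level_preventing(unwanted):
    """Return the first (weakest) level that allows none of the unwanted phenomena."""
    unwanted = set(unwanted)
    for level, allowed in PHENOMENA.items():  # dicts preserve insertion order
        if not (allowed & unwanted):
            return level
    return None  # not reached with the table above: SERIALIZABLE allows nothing


no_dirty = weakest_level_preventing({"dirty read"})
stable_rows = weakest_level_preventing({"dirty read", "nonrepeatable read"})
no_phantoms = weakest_level_preventing({"phantom read"})
anything_goes = weakest_level_preventing(set())
```

Choosing the weakest acceptable level follows the trade-off described above: every step up buys consistency at the cost of more blocking.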

Clustered Index: Which Column to Choose

Vũ Huy Tâm

Because of the characteristics of a clustered index, there are a few points to keep in mind when choosing the column for it in order to get the best results. A good candidate for the clustered index should meet the following criteria:

Small size: In general, for any kind of index you should pick a small column to keep the index small. For a clustered index this matters even more, because its key is used in every other (nonclustered) index of the table as the pointer to the row. For example, a VARCHAR(100) column, or a column with an approximate data type such as FLOAT, should probably be reconsidered. An integer column (INT or BIGINT) is best, because searching on integers is generally faster than searching on character strings. And although a clustered index can contain several columns (a composite index), you should use just one, again to keep the index small.

Ever-increasing: When new values of the clustered index column always increase, new rows are always added at the end of the table. If the values arrive in arbitrary order, new rows may be inserted in the middle of the table. This leads to fragmentation, meaning rows that are logically adjacent are not stored next to each other (they sit on different pages). Fragmentation makes the system do more I/O to read the data, especially when retrieving a range of rows.

Static: The clustered index column should not be updated frequently; once a value is in the table it should stay unchanged. When it is updated, the clustered index itself has to be updated to move the row to its new position in the sort order, and all nonclustered indexes have to be updated as well, because their pointers must now hold the new value. Updating the clustered index column is therefore very expensive, and if it happens often it also fragments the clustered index.

An auto-increment (IDENTITY) column is in many cases a very good fit for the clustered index because it meets all of the requirements above: it is small (INT or BIGINT), ever-increasing, and static (once a row is inserted you rarely care about its value and rarely need to update it). You can start by using the IDENTITY column as the clustered index, and switch to another column later if it turns out not to be suitable.
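The effect of ever-increasing versus scattered key values can be illustrated with a toy model of sorted, fixed-size pages. The Python below is a deliberately simplified sketch: the page capacity of 4 rows, the 50/50 split rule, and the name `insert_keys` are assumptions made for illustration, and physical page placement is not modeled at all.

```python
import bisect

PAGE_CAPACITY = 4  # rows per page; tiny on purpose so splits are easy to see


def insert_keys(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into sorted fixed-size pages; return (pages, page_splits)."""
    pages = [[]]
    splits = 0
    for k in keys:
        # Target page: the first page whose largest key is >= k, else the last page.
        idx = next((i for i, p in enumerate(pages) if p and p[-1] >= k), len(pages) - 1)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, k)
        elif idx == len(pages) - 1 and k > page[-1]:
            pages.append([k])  # insert at the very end: just start a new page
        else:
            half = capacity // 2  # full page in the middle: split it in two
            pages[idx:idx + 1] = [page[:half], page[half:]]
            splits += 1
            target = pages[idx] if k <= pages[idx][-1] else pages[idx + 1]
            bisect.insort(target, k)
    return pages, splits


# Ever-increasing keys, like an IDENTITY column.
pages_seq, splits_seq = insert_keys(range(1, 21))

# The same 20 keys arriving out of order (all odd values first, then the even ones).
scattered = list(range(1, 21, 2)) + list(range(2, 21, 2))
pages_scat, splits_scat = insert_keys(scattered)
```

In this model the increasing keys never split a page and leave every page full, while the out-of-order keys force splits and end up spread over more pages, which is the fragmentation cost the "ever-increasing" criterion is meant to avoid.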
