SDD-1 Algorithm

* SDD-1 algorithm
  - objective: minimize total communication cost
  - input:
    » QG: query graph with n relations
    » statistics for each relation
  - output: ES: execution strategy
  - Initial selection of beneficial semijoins
    » identify all possible semijoins, and determine the beneficial ones
  - Ordering beneficial semijoins, with updates of the set of beneficial semijoins
    » choose the most beneficial semijoin iteratively
  - Assembly site selection
    » select the site that requires the least transmission cost
  - Post-optimization
    » remove useless semijoins
    » adjust the order of semijoins

* Beneficial semijoins
  - R ⋈_A S = (R ⋉_A S) ⋈_A S
  - R ⋉_A S: Π_A(S) is sent from the site of S to the site of R
  - Cost(R ⋉_A S) = size(Π_A(S)) * C_TR + C_MSG
  - Benefit(R ⋉_A S) = size(R) * (1 - SF_SJ(S.A)) * C_TR
  - BS: a set of beneficial semijoins
    » a set of semijoins SJ such that Cost(SJ) < Benefit(SJ)

* Ordering of beneficial semijoins, with updates of the set of beneficial semijoins
  - choose the most beneficial SJ from BS, and append it to ES
  - modify database statistics accordingly
  - modify BS by computing new benefit/cost values
    » delete non-beneficial semijoins from BS
    » add new beneficial semijoins to BS
  - if BS ≠ ∅, repeat this process
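The cost and benefit formulas above can be sketched in a few lines of Python. This is only an illustration, not the SDD-1 implementation: the `Semijoin` class and the names `PROFILE`, `SIZE`, `cost`, and `benefit` are hypothetical, the statistics are the ones from the DB profile of the example in this section, and C_TR = 1 is an assumption (it is what the cost figures of the example imply). "Most beneficial" is interpreted here as the largest benefit minus cost, which is also an assumption.

```python
from dataclasses import dataclass

C_MSG = 0  # fixed cost per message, as in the example
C_TR = 1   # cost per unit of data transmitted (assumed value)

# Relation sizes from the example's DB profile.
SIZE = {"R1": 1500, "R2": 3000, "R3": 2000}

# (relation, attribute) -> (semijoin selectivity SF_SJ, size of the projection)
PROFILE = {
    ("R1", "A"): (0.3, 36),
    ("R2", "A"): (0.8, 320),
    ("R2", "B"): (1.0, 400),
    ("R3", "B"): (0.4, 80),
}


@dataclass(frozen=True)
class Semijoin:
    """target ⋉_attr source: `target` is reduced using `source`."""
    target: str
    source: str
    attr: str


def cost(sj: Semijoin) -> float:
    # Cost = size(Π_A(S)) * C_TR + C_MSG
    _, proj_size = PROFILE[(sj.source, sj.attr)]
    return proj_size * C_TR + C_MSG


def benefit(sj: Semijoin) -> float:
    # Benefit = size(R) * (1 - SF_SJ(S.A)) * C_TR
    sf, _ = PROFILE[(sj.source, sj.attr)]
    return SIZE[sj.target] * (1 - sf) * C_TR


# All possible semijoins over the query graph R1 -A- R2 -B- R3.
CANDIDATES = [
    Semijoin("R2", "R1", "A"),
    Semijoin("R2", "R3", "B"),
    Semijoin("R1", "R2", "A"),
    Semijoin("R3", "R2", "B"),
]

# BS: semijoins whose cost is lower than their benefit.
beneficial = [sj for sj in CANDIDATES if cost(sj) < benefit(sj)]

# Assumed reading of "most beneficial": largest net gain.
best = max(beneficial, key=lambda sj: benefit(sj) - cost(sj))

for sj in CANDIDATES:
    tag = "beneficial" if sj in beneficial else "non-beneficial"
    print(f"{sj.target} ⋉_{sj.attr} {sj.source}: "
          f"benefit={benefit(sj):.0f}, cost={cost(sj):.0f} ({tag})")
```

With these statistics only the two semijoins that reduce R2 pass the Cost < Benefit test, and the one on attribute A is picked first.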
* Assembly site selection
  - find the site where the largest amount of data resides and select it as the assembly site

* Post-optimization
  - this phase is necessary because the assembly site is chosen after all the semijoins have been ordered
  - remove useless semijoins
    » i.e., semijoins that affect only relations stored at the assembly site
  - permute the order of semijoins if doing so would improve the total cost of ES

(Ex) SDD-1 Algorithm

  - query graph: R1 (Site 1) --A-- R2 (Site 2) --B-- R3 (Site 3)
  - C_MSG = 0, C_TR = 1
  - DB statistics (DB profile)

    relation | card | tuple size | relation size
    R1       |   30 |         50 |          1500
    R2       |  100 |         30 |          3000
    R3       |   50 |         40 |          2000

    attribute | SF_SJ | size(Π_attribute)
    R1.A      |   0.3 |                36
    R2.A      |   0.8 |               320
    R2.B      |   1.0 |               400
    R3.B      |   0.4 |                80

(1) Initial selection of beneficial semijoins
  - Beneficial semijoins:
    » SJ1 = R2 ⋉_A R1: benefit is 2100 = (1 - 0.3) * 3000, and cost is 36
    » SJ2 = R2 ⋉_B R3: benefit is 1800 = (1 - 0.4) * 3000, and cost is 80
  - Non-beneficial semijoins:
    » SJ3 = R1 ⋉_A R2: benefit is 300 = (1 - 0.8) * 1500, and cost is 320
    » SJ4 = R3 ⋉_B R2: benefit is 0 = (1 - 1.0) * 2000, and cost is 400

(2) Ordering beneficial semijoins
  - after choosing the most beneficial semijoin,
    » update statistics for the reduced relation
      · cardinality of the relation
      · number of distinct values in the attributes
      · semijoin selectivity is also updated
    » update the set of beneficial semijoins
      · all the semijoins involving the reduced relation need to be considered
  - Iteration 1
    » choose SJ1 (i.e., R2 ⋉_A R1), and append it to ES
    » update statistics
      · size(R2') = 900 (= 3000 * 0.3)
      · size(Π_A(R2')) = 96 (= 320 * 0.3)
      · SF_SJ(R2'.A) = ~0.8 * 0.3 = ~0.24
      · size(Π_B(R2')) = ?
      · SF_SJ(R2'.B) = ?
  - Iteration 2
    » update the set of beneficial semijoins
      · SJ2 = R2' ⋉_B R3: benefit is 540 = (1 - 0.4) * 900, and cost is 80
      · SJ3 = R1 ⋉_A R2': benefit is 1140 = (1 - 0.24) * 1500, and cost is 96
    » choose SJ3, and append it to ES
    » update statistics
      · size(R1') = 360 (= 1500 * 0.24) ?
      · size(Π_A(R1')) = 8.64 (= 36 * 0.24) ?
      · SF_SJ(R1'.A) = ~0.3 * 0.24 = ~0.072 ?
  - Iteration 3
    » update the set of beneficial semijoins
      · no change in SJ2
      · no new beneficial semijoins
    » choose the remaining beneficial semijoin SJ2 (i.e., R2' ⋉_B R3), and append it to ES
    » update statistics
      · size(R2'') = 360 (= 900 * 0.4)
  - ES = (R2 ⋉_A R1, R1 ⋉_A R2', R2' ⋉_B R3)

(3) Assembly site selection
  - Site 1: size(R1') = 360 (= 1500 * 0.24)
  - Site 2: size(R2'') = 360 (= 3000 * 0.3 * 0.4)
  - Site 3: size(R3) = 2000 (no reduction)
  - Site 3 will be chosen as the assembly site

(4) Post-optimization
  - remove useless semijoins
    » no semijoins are removed
  - adjust the order of semijoins
    » send (R2 ⋉ R1) ⋉ R3 to Site 3
    » send R1 ⋉ (R2 ⋉ R1) to Site 3
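The arithmetic of the example can be checked with a short Python sketch. It does not re-derive the ordering; it replays the execution strategy ES given above, applies the simple update rule used in the iterations (the size, the projection size, and the selectivity of the join attribute are all scaled by the selectivity of the reducing relation), and then picks the assembly site. All names (`apply_semijoin`, `shipping_cost`, `site_of`) are hypothetical, C_TR = 1 is assumed, and statistics of non-join attributes are left untouched because the example marks them as unknown.

```python
C_TR = 1  # cost per unit of data transmitted (assumed value)

# Initial statistics from the DB profile of the example.
size = {"R1": 1500.0, "R2": 3000.0, "R3": 2000.0}
site_of = {"R1": "Site 1", "R2": "Site 2", "R3": "Site 3"}
sf = {("R1", "A"): 0.3, ("R2", "A"): 0.8, ("R2", "B"): 1.0, ("R3", "B"): 0.4}
proj = {("R1", "A"): 36.0, ("R2", "A"): 320.0, ("R2", "B"): 400.0, ("R3", "B"): 80.0}


def apply_semijoin(target: str, source: str, attr: str) -> None:
    """Reduce `target` by `source` on `attr` and update its statistics.

    Simplified rule taken from the iterations of the example: every
    statistic of the join attribute is multiplied by SF_SJ(source.attr).
    """
    s = sf[(source, attr)]
    size[target] *= s
    proj[(target, attr)] *= s
    sf[(target, attr)] *= s


# ES as listed in the example: (R2 ⋉_A R1, R1 ⋉_A R2', R2' ⋉_B R3).
ES = [("R2", "R1", "A"), ("R1", "R2", "A"), ("R2", "R3", "B")]
for step in ES:
    apply_semijoin(*step)


def shipping_cost(site: str) -> float:
    # Data held at every other site must be sent to the assembly site.
    return sum(sz for rel, sz in size.items() if site_of[rel] != site) * C_TR


# The cheapest assembly site is the one already holding the most data.
assembly_site = min(sorted(site_of.values()), key=shipping_cost)

for rel in sorted(size):
    print(f"{site_of[rel]}: size({rel}) = {size[rel]:.0f}")
print("assembly site:", assembly_site)
```

Choosing the site with the lowest shipping cost is the same as choosing the site where the largest amount of data resides, since the total amount of data is fixed once the semijoins have been applied.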
