
Nested loop (loop over loop)

In this algorithm, an outer loop is formed over a small number of rows, and then for each of those rows the inner loop is processed.

Ex:

    Select tab1.*, tab2.* from tab1, tab2 where tab1.col1=tab2.col2;

It is processed like:

    For i in (select * from tab1) loop
      For j in (select * from tab2 where col2=i.col1) loop
        Display results;
      End loop;
    End loop;

The steps involved in doing a nested loop are:

a) Identify the outer (driving) table.
b) Assign the inner (driven) table to the outer table.
c) For every row of the outer table, access the rows of the inner table.

In the execution plan it is seen like this:

    NESTED LOOPS
      outer_loop
      inner_loop

When does the optimizer use nested loops?

The optimizer uses a nested loop when we are joining tables containing a small number of rows with an efficient driving condition. It is important to have an index on the join column of the inner table, as this table is probed every time for a new value from the outer table.

The optimizer may not use a nested loop in case:

1. The number of rows in both tables is quite high.
2. The inner query always results in the same set of records.
3. The access path of the inner table is independent of the data coming from the outer table.

Note: You will see more use of nested loops with the FIRST_ROWS optimizer mode, as it works on a model of showing results to the user as soon as they are fetched. There is no need to select or cache any data before it is returned to the user. A hash join does need this, as explained below.
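The nested loop described above can be sketched in a few lines of Python. This is only an illustration of the loop structure, not how the database implements it: the function name nested_loop_join and the sample rows are hypothetical, and the inner table is scanned in full here, whereas a real plan would normally probe an index on the inner join column.

```python
def nested_loop_join(outer_rows, inner_rows, outer_key, inner_key):
    """Yield (outer, inner) pairs whose join keys are equal."""
    for outer in outer_rows:          # outer (driving) table
        for inner in inner_rows:      # inner (driven) table, visited once per outer row
            if inner[inner_key] == outer[outer_key]:
                # A pair is handed back as soon as it is found.
                yield outer, inner


# Hypothetical sample data standing in for tab1 and tab2.
tab1 = [{"col1": 1, "name": "a"}, {"col1": 2, "name": "b"}]
tab2 = [{"col2": 2, "val": "x"}, {"col2": 1, "val": "y"}, {"col2": 2, "val": "z"}]

pairs = list(nested_loop_join(tab1, tab2, "col1", "col2"))
for outer, inner in pairs:
    print(outer["name"], inner["val"])
```

Because the function is a generator, the first matching pair is available before the rest of the outer table has been read, which mirrors the behaviour the FIRST_ROWS note relies on.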

Hash join

Hash joins are used when joining large tables. The optimizer uses the smaller of the two tables to build a hash table in memory, then scans the larger table and compares the hash value of each of its rows with this hash table to find the joined rows.

The algorithm of a hash join is divided into two parts:

1. Build an in-memory hash table on the smaller of the two tables.
2. Probe this hash table with the hash value for each row of the second table.

In simpler terms it works like:

Build phase

    For each row RW1 in small (left/build) table loop
      Calculate hash value on RW1 join key
      Insert RW1 in appropriate hash bucket
    End loop;

Probe phase

    For each row RW2 in big (right/probe) table loop
      Calculate the hash value on RW2 join key
      For each row RW1 in the matching hash bucket loop
        If RW1 joins with RW2
          Return RW1, RW2
      End loop;
    End loop;
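The build and probe phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name hash_join and the sample rows are hypothetical, and a Python dictionary stands in for the in-memory hash table, so the hashing of the join key happens inside the dictionary rather than in explicit code.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Yield (build_row, probe_row) pairs with equal join keys."""
    # Build phase: place every row of the smaller table in a bucket
    # chosen by its join key (the dict hashes the key internally).
    buckets = defaultdict(list)
    for rw1 in build_rows:
        buckets[rw1[build_key]].append(rw1)

    # Probe phase: look up each row of the larger table and return
    # the rows found in the matching bucket.
    for rw2 in probe_rows:
        for rw1 in buckets.get(rw2[probe_key], []):
            yield rw1, rw2


# Hypothetical sample data: a small build table and a larger probe table.
small = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
big = [
    {"fk": 2, "val": "w"},
    {"fk": 3, "val": "x"},
    {"fk": 1, "val": "y"},
    {"fk": 2, "val": "z"},
]

joined = list(hash_join(small, big, "id", "fk"))
for rw1, rw2 in joined:
    print(rw1["name"], rw2["val"])
```

No pair can be produced until the build loop has finished, and each table is read only once. The sketch handles equality joins only, since bucket lookup cannot answer an inequality condition.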

When does the optimizer use a hash join?

The optimizer uses a hash join while joining big tables or a big fraction of small tables. Unlike a nested loop, the output of a hash join is not instantaneous, as the join is blocked until the hash table has been built.

Note: You may see more hash joins used with the ALL_ROWS optimizer mode, because it works on a model of showing results after all the rows of at least one of the tables are hashed in the hash table.

Sort merge join

Sort merge join is used to join two independent data sources. They perform better than nested loop when the volume of data is big in tables, but not as good as hash joins in general.

They perform better than hash join when the join condition columns are already sorted or there is no sorting required.

The full operation is done in two parts:

Sort join operation

    sort input 1 and input 2 on the join key
    get first row RW1 from input 1
    get first row RW2 from input 2

Merge join operation

    while not at the end of either input loop
      if RW1 joins with RW2
        return (RW1, RW2)
        get next row RW2 from input 2
      else if RW1 < RW2
        get next row RW1 from input 1
      else
        get next row RW2 from input 2
    end loop

Note: If the data is already sorted, the first step is avoided.

The important point to understand is that, unlike a nested loop where the driven (inner) table is read as many times as there are input rows from the outer table, in a sort merge join each of the tables involved is accessed at most once. So it proves to be better than a nested loop when the data set is large.

When does the optimizer use a sort merge join?

a) When the join condition is an inequality condition (like <, <=, >, >=). This is because a hash join cannot be used for inequality conditions, and if the data set is large, a nested loop is definitely not an option.
b) If sorting is required anyway due to some other attribute (other than the join) like order by, the optimizer prefers a sort merge join over a hash join as it is cheaper.

Note: Sort merge join can be seen with both the ALL_ROWS and FIRST_ROWS optimizer hints, because it works on a model of first sorting both data sources and then starting to return the results. So if the data set is large and you have FIRST_ROWS as the optimizer goal, the optimizer may prefer a sort merge join over a nested loop because of the large data. And if you have ALL_ROWS as the optimizer goal and an inequality condition is used in the SQL, the optimizer may use a sort merge join over a hash join.
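The sort and merge steps of a sort merge join can be sketched in Python as follows. This is an illustrative sketch for an equality join only: the function name sort_merge_join and the sample rows are hypothetical, and the handling of duplicate join keys (pairing whole runs of equal keys) is one common approach, assumed here rather than taken from the description above.

```python
def sort_merge_join(left_rows, right_rows, left_key, right_key):
    """Return (left, right) pairs with equal join keys."""
    # Sort step: skipped in practice when the inputs arrive already sorted.
    left = sorted(left_rows, key=lambda r: r[left_key])
    right = sorted(right_rows, key=lambda r: r[right_key])

    result = []
    i = j = 0
    # Merge step: walk both sorted inputs once, advancing the smaller side.
    while i < len(left) and j < len(right):
        lk, rk = left[i][left_key], right[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][right_key] == lk:
                j_end += 1
            # Pair every left row carrying the key with that run.
            while i < len(left) and left[i][left_key] == lk:
                for k in range(j, j_end):
                    result.append((left[i], right[k]))
                i += 1
            j = j_end
    return result


# Hypothetical unsorted sample data with duplicate keys on both sides.
input1 = [{"k": 3}, {"k": 1}, {"k": 2}, {"k": 2}]
input2 = [{"k": 2, "v": "a"}, {"k": 4, "v": "b"}, {"k": 2, "v": "c"}, {"k": 1, "v": "d"}]

merged = sort_merge_join(input1, input2, "k", "k")
for l, r in merged:
    print(l["k"], r["v"])
```

Neither input is rescanned from the start, which is the property that makes this approach cheaper than a nested loop on large data sets.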
