
MERGE JOIN vs HASH JOIN

• nested loop join: The right relation is scanned once for every row found in the left relation. This
strategy is easy to implement but can be very time consuming. (However, if the right relation can be
scanned with an index scan, this can be a good strategy. It is possible to use values from the current
row of the left relation as keys for the index scan of the right.) A minimal code sketch of this strategy
follows the list.
• merge join: Each relation is sorted on the join attributes before the join starts. Then the two relations
are scanned in parallel, and matching rows are combined to form join rows. This kind of join is more
attractive because each relation has to be scanned only once. The required sorting might be achieved
either by an explicit sort step, or by scanning the relation in the proper order using an index on the
join key.
• hash join: The right relation is first scanned and loaded into a hash table, using its join attributes as
hash keys. Next the left relation is scanned and the appropriate values of every row found are used as
hash keys to locate the matching rows in the table.
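
To make the first strategy concrete, here is a minimal Python sketch of a nested loop join (the dictionaries used as rows and the ad-hoc index are assumptions made for the illustration, not how a real engine stores anything):

    from collections import defaultdict

    def nested_loop_join(left_rows, right_rows, left_key, right_key):
        # The right relation is rescanned once for every row of the left relation.
        for l in left_rows:
            for r in right_rows:
                if l[left_key] == r[right_key]:
                    yield l, r

    def indexed_nested_loop_join(left_rows, right_index, left_key):
        # Variant that probes an index on the right relation's join key
        # (right_index maps join key -> list of rows) instead of rescanning it.
        for l in left_rows:
            for r in right_index.get(l[left_key], ()):
                yield l, r

    # Example usage with made-up rows:
    left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    right = [{"left_id": 1, "value": 10}, {"left_id": 1, "value": 20}]

    right_index = defaultdict(list)
    for r in right:
        right_index[r["left_id"]].append(r)

    print(list(nested_loop_join(left, right, "id", "left_id")))
    print(list(indexed_nested_loop_join(left, right_index, "id")))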

Merge Join has two major drawbacks:

1. It needs the recordsets to be sorted

2. Even with the recordsets sorted, it is hardly any better than a HASH JOIN

The sorting part is essential for this algorithm and there is nothing that can be done about it: the recordsets
have to be sorted, period. Databases, however, often provide the records in sorted order already: from clustered
tables, indexes, previously sorted and ordered subqueries, spool tables, etc.

But even when the recordsets are already sorted, on equijoins the MERGE JOIN is hardly faster than a HASH
JOIN.

Why?

MERGE JOIN vs. HASH JOIN

Let's remember how the HASH JOIN works:

• It takes the smaller table and builds a hash table out of it, with the join key as the hash key.
• Then it takes each record from the larger table and looks it up in the hash table. If found, the records
are returned.

We see that there are four major steps involved:

1. Scan the smaller table
2. Build a hash table (i.e. copy each record from the smaller table into its hash slot)
3. Scan the larger table
4. Look up each of its records in the hash table

Since building and looking up the hash table are performed in memory (or, depending on the SQL engine
implementation, in a memory-mapped temporary database, which is almost the same), these steps take
negligible time compared to the time required to scan the tables.
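
For illustration, a minimal Python sketch of these four steps follows (the row format and key names are assumptions made for the example); it shows why the per-row work is cheap once the hash table is in memory:

    from collections import defaultdict

    def hash_join(smaller, larger, small_key, large_key):
        # Steps 1 and 2: scan the smaller table and build a hash table
        # keyed on its join attribute.
        hash_table = defaultdict(list)
        for row in smaller:
            hash_table[row[small_key]].append(row)

        # Steps 3 and 4: scan the larger table and probe the hash table
        # with each row's join key.
        for row in larger:
            for match in hash_table.get(row[large_key], ()):
                yield match, row

Building and probing cost roughly constant time per row, so the total work is dominated by the two scans, which is exactly the point made above.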

But we see that MERGE JOIN, as it is implemented now, also requires scanning both recordsets. Every record
a pointer passes over has to be evaluated to figure out whether its join key is less than, greater than or equal
to that of the other pointer.
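
A sketch of this two-pointer pass in Python (assuming both inputs arrive already sorted on the join key; both sides may contain duplicate keys):

    def merge_join(left_sorted, right_sorted, left_key, right_key):
        # Equijoin of two recordsets that are already sorted on the join key.
        i, j = 0, 0
        while i < len(left_sorted) and j < len(right_sorted):
            lkey = left_sorted[i][left_key]
            rkey = right_sorted[j][right_key]
            if lkey < rkey:
                i += 1                    # advance the left pointer
            elif lkey > rkey:
                j += 1                    # advance the right pointer
            else:
                # Collect the group of equal keys on the right and pair it
                # with every left row carrying the same key.
                j_end = j
                while j_end < len(right_sorted) and right_sorted[j_end][right_key] == lkey:
                    j_end += 1
                while i < len(left_sorted) and left_sorted[i][left_key] == lkey:
                    for r in right_sorted[j:j_end]:
                        yield left_sorted[i], r
                    i += 1
                j = j_end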

This means that both MERGE JOIN and HASH JOIN require scanning both recordsets. However, HASH JOIN
does not require any special order, which means it can use a table scan, an index fast full scan or any other
method to get the records all at once, while MERGE JOIN needs either to sort the records (which is obviously
slow) or to traverse the index with the subsequent key lookups (which is not fast either).

In some corner cases MERGE JOIN can indeed be more efficient: say, when the hash table does not fit
completely into memory and would require either extensive disk writes or several scans over the source
tables, while the MERGE JOIN could be performed on a pair of indexes.

It is also efficient for FULL OUTER JOIN: each record is evaluated, returned and forgotten only once, while
HASH JOIN requires a second pass over the records that have never been matched.
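
To see where that second pass comes from, here is a rough Python sketch of a FULL OUTER hash join (an illustration under the same assumptions as the earlier sketch, not any engine's actual code): the unmatched build-side rows can only be emitted after the probe side has been exhausted:

    from collections import defaultdict

    def full_outer_hash_join(smaller, larger, small_key, large_key):
        hash_table = defaultdict(list)
        matched = set()                   # positions of build rows that found a match
        for idx, row in enumerate(smaller):
            hash_table[row[small_key]].append((idx, row))

        # First pass: probe with the larger table, emitting matches and
        # unmatched probe-side rows right away.
        for row in larger:
            bucket = hash_table.get(row[large_key], ())
            if bucket:
                for idx, small_row in bucket:
                    matched.add(idx)
                    yield small_row, row
            else:
                yield None, row

        # Second pass: revisit the build side for the rows that were never matched.
        for idx, row in enumerate(smaller):
            if idx not in matched:
                yield row, None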

Seeks instead of scans

But does the MERGE JOIN really always need to traverse all records?

Suppose the right recordset has run 100,000 records ahead of the left recordset. With MERGE JOIN, 100,000
records have to be scanned from the left recordset and 100,000 comparisons made.

This is unavoidable if the recordset is a result of a sort operation.

However, MERGE JOIN is usually chosen when there is a more efficient sorted row source available: an index
or a spool table (a temporary index built at runtime). Both of these sources allow efficient random seeks.

If an index served as the left recordset, we could see that the right pointer is too far ahead and just seek for
its value in the left recordset instead of scanning 100,000 records.

Since the left recordset is a B-Tree, 100,000 records is too far to step over one by one, and the left pointer
could be advanced to the position of the right pointer in only several reads by traversing the tree.

Since indexes usually collect statistics, all we would need to do to decide whether to seek or scan is to check
the histograms to estimate how many records there are between the current and the opposite pointers. If
there are too many, the scan cost would outweigh the seek cost and a seek should be performed. The
statistics table itself would not need to be queried too often: since the records are always selected in order,
the statistics table could also be read sequentially.
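
A rough sketch of that decision in Python, using a sorted list of keys as a stand-in for the index and a binary search as a stand-in for the B-Tree seek (the cost constants and the ready-made histogram estimate are assumptions for the illustration):

    import bisect

    SEEK_COST = 4            # assumed cost of one B-Tree descent
    SCAN_COST_PER_KEY = 1.0  # assumed cost of stepping over one index entry

    def advance_pointer(keys, pos, target_key, histogram_estimate):
        # Catch a merge-join pointer up to the opposite pointer's key.
        # `keys` stands in for a sorted index, `pos` is the current position,
        # `histogram_estimate` is how many rows the statistics claim lie
        # between the two pointers.
        if histogram_estimate * SCAN_COST_PER_KEY > SEEK_COST:
            # Too many rows in between: one seek (here, a binary search)
            # is cheaper than comparing every intermediate row.
            return bisect.bisect_left(keys, target_key, lo=pos)
        # Few rows in between: the plain sequential scan is cheap enough.
        while pos < len(keys) and keys[pos] < target_key:
            pos += 1
        return pos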

Note that PostgreSQL used a MERGE JOIN without any tricks from our side. This is because the table
records are too large to fit into a hash table all at once.
Of course, PostgreSQL could store only the record pointers in the hash table and do the record lookups after
the join; however, for some reason it does not select this plan.

MERGE JOIN, in our case, is quite efficient, since the indexes are read first and the actual records are only
looked up for the matched rows (which are not too numerous). However, it still requires traversing
2,000,000 records, which takes more than a second.

Summary

With its current implementation, MERGE JOIN is not the most efficient algorithm; however, for several types
of queries it outperforms HASH JOIN.

The main drawback of MERGE JOIN is its inability to use seeks to advance the record pointers. Even when the
opposite pointer is far away, a sequential scan is used instead of a B-Tree seek, even if the recordset is an
index or a spool table.

To improve this, the accumulated index statistics should be taken into account when deciding whether to
perform a seek or a sequential scan to catch up with the opposite pointer. If the statistics show a high number
of records in between, an index seek should be used instead of an index scan.

With this improvement, MERGE JOIN would perform much better, especially when joining two large indexed
tables. It would require far fewer resources than a HASH JOIN and, unlike NESTED LOOPS, would perform
seeks only when really needed, thus preserving the benefits of sequential access to the tables.
