
Tuning at the Block Level (Advanced): this topic is still to be covered from Richard
Niemiec's book, chap. 9.

========================= NESTED LOOPS Joins =========================

In a NESTED LOOPS join, Oracle reads the first row from the first row source and then
checks the second row source for matches. All matches are then placed in the result set,
and Oracle goes on to the next row from the first row source. This continues until all
rows in the first row source have been processed. The first row source is often called
the outer or driving table, whereas the second row source is called the inner table.

NESTED LOOPS joins are ideal when the driving row source (the records you are looking
for) is small and the joined columns of the inner row source are uniquely indexed or
have a highly selective non-unique index. NESTED LOOPS joins have an advantage over
other join methods in that they can quickly retrieve the first few rows of the result
set without having to wait for the entire result set to be determined. This situation
is ideal for query screens where an end user can read the first few records retrieved
while the rest are being fetched.

However, NESTED LOOPS joins can be very inefficient if the inner row source (second
table accessed) does not have an index on the joined columns or if the index is not
highly selective. If the driving row source (the records retrieved from the driving
table) is quite large, other join methods may be more efficient.
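
To see whether the optimizer really used a NESTED LOOPS join, and which row source
drives it, you can look at the execution plan. The following is a minimal sketch,
assuming the SCOTT demo tables EMP and DEPT are available; the aliases and the filter
value are illustrative and not part of the original notes. Hints are requests only, so
the optimizer may still choose a different plan.

```sql
-- Request DEPT as the driving table and a NESTED LOOPS join into EMP
-- (assumed demo tables; the filter keeps the driving row source small).
explain plan for
select /*+ ORDERED USE_NL(e) */ d.dname, e.ename
from scott.dept d, scott.emp e
where e.deptno = d.deptno
and d.deptno = 10;

-- Show the plan that was just explained.
select * from table(dbms_xplan.display);
```

In the displayed plan, the row source listed first under the NESTED LOOPS step is the
driving (outer) one, and the one listed second is the inner row source.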

============================= SORT-MERGE Joins =============================

In a SORT-MERGE join, Oracle sorts the first row source by its join columns, sorts the
second row source by its join columns, and then merges the sorted row sources together.
As matches are found, they are put into the result set.

SORT-MERGE joins can be effective when lack of data selectivity or useful indexes
render a NESTED LOOPS join inefficient, or when both of the row sources are quite large
(greater than 5 percent of the blocks accessed). Unlike HASH joins, SORT-MERGE joins
are not restricted to equijoins: they can handle range conditions
(WHERE D.deptno >= E.deptno) as well as equality conditions (WHERE D.deptno = E.deptno).

SORT-MERGE joins require temporary segments for sorting when the sorts do not fit in
memory (that is, when PGA_AGGREGATE_TARGET or SGA_TARGET, if used, is set too small).
This can lead to extra memory utilization and/or extra disk I/O in the temporary
tablespace.
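
A rough way to check whether sorts are spilling to the temporary tablespace is to
compare the instance-wide sort counters and to look at current temporary segment usage.
This is only a sketch; it assumes your account can query the V$ views, and the counters
are cumulative since instance startup, so they cover all sorts, not only those done for
SORT-MERGE joins.

```sql
-- Sorts completed in memory versus sorts that had to use disk.
select name, value
from v$sysstat
where name in ('sorts (memory)', 'sorts (disk)');

-- Sessions currently holding sort segments in a temporary tablespace.
select username, tablespace, segtype, blocks
from v$tempseg_usage
where segtype = 'SORT';
```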

============================ CLUSTER Joins ============================

A CLUSTER join is really just a special case of the NESTED LOOPS join that is not used
very often. If the two row sources being joined are actually tables that are part of a
cluster, and if the join is an equijoin between the cluster keys of the two tables,
then Oracle can use a CLUSTER join.

CLUSTER joins are extremely efficient because the joining rows in the two row sources
will actually be located in the same physical data block. However, clusters carry
certain caveats of their own, and you cannot have a CLUSTER join without a cluster.
Therefore, CLUSTER joins are not very commonly used.
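
For reference, the kind of setup a CLUSTER join needs looks roughly like the sketch
below. All object names (emp_dept_cluster, emp_dept_cluster_idx, dept_c, emp_c) and the
SIZE value are hypothetical and chosen only for illustration.

```sql
-- An indexed cluster keyed on DEPTNO; SIZE is a guess at the bytes needed per key.
create cluster emp_dept_cluster (deptno number(2)) size 1024;

-- An indexed cluster needs its cluster index before rows can be inserted.
create index emp_dept_cluster_idx on cluster emp_dept_cluster;

-- Both tables are stored in the cluster, so rows with the same DEPTNO share blocks.
create table dept_c
(deptno number(2) primary key, dname varchar2(14))
cluster emp_dept_cluster (deptno);

create table emp_c
(empno number(4) primary key, ename varchar2(10), deptno number(2))
cluster emp_dept_cluster (deptno);

-- An equijoin on the cluster key is a candidate for a CLUSTER join.
select d.dname, e.ename
from dept_c d, emp_c e
where e.deptno = d.deptno;
```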

============================= HASH Joins =============================

HASH joins are the usual choice of the Oracle optimizer when the memory is set up to
accommodate them. In a HASH join, Oracle accesses one table (usually the smaller of the
joined results) and builds a hash table on the join key in memory. It then scans the
other table in the join (usually the larger one) and probes the hash table for matches.

Oracle uses a HASH join efficiently only if the parameter PGA_AGGREGATE_TARGET is set
to a large enough value. If you set SGA_TARGET, you must also set PGA_AGGREGATE_TARGET,
because SGA_TARGET does not include the PGA.
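
To judge whether the PGA target is large enough, you can compare the target with what
is actually allocated before changing it. The sketch below assumes access to V$PGASTAT
and the ALTER SYSTEM privilege; the 2G value is an arbitrary example, and SCOPE = BOTH
assumes the instance was started with an spfile.

```sql
-- Current target, actual allocation, and how often work areas fit in memory.
select name, value
from v$pgastat
where name in ('aggregate PGA target parameter',
               'total PGA allocated',
               'cache hit percentage');

-- Raise the target (example value only; size it for your own system).
alter system set pga_aggregate_target = 2G scope = both;
```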

As with CLUSTER joins, HASH joins work only on equijoins. SORT-MERGE joins, by
contrast, can also be used with inequality conditions.

=========================== COMPARISON ==============================================

Category            NESTED LOOPS Join           SORT-MERGE Join             HASH Join
------------------  --------------------------  --------------------------  --------------------------
Optimizer hint      USE_NL.                     USE_MERGE.                  USE_HASH.

When you can use it Any join.                   Any join.                   Equijoins only.

Resource concerns   CPU, disk I/O.              Memory, temporary           Memory, temporary
                                                segments.                   segments.

Features            Efficient with highly       Better than NESTED LOOPS    Better than NESTED LOOPS
                    selective indexes and       when an index is missing    when an index is missing
                    restrictive searches.       or the search criteria      or the search criteria
                    Used to return the first    are not very selective.     are not very selective.
                    row of a result quickly.    Can work with limited       It is usually faster than
                                                memory.                     a SORT-MERGE.

Drawbacks           Very inefficient when       Requires a sort on both     Can require a large
                    indexes are missing or if   tables. It is built for     amount of memory for the
                    the index criteria are      optimal throughput and      hash table to be built.
                    not limiting.               does not return the         Does not return the first
                                                first row until all rows    rows quickly. Can be
                                                are found.                  extremely slow if it must
                                                                            do the operation on disk.

======================== A Two-Table Join: Equal-Sized Tables (No Index, Cost-Based) ============================

If you have set up the initialization parameters for hashing, Oracle builds a hash
table from the join values of the first table, and then it probes that table for values
from the second table. If you have not set up the initialization parameters for
hashing, the first table in the FROM clause in cost-based optimization is the driving
table. However, in a SORT-MERGE join, this has no impact because each table must be
sorted and then all results must be merged together. Also note that the order of tables
cannot be guaranteed when all conditions are not equal (when you have tables of
different sizes or with different indexes) because the optimizer chooses the order
unless you specify the ORDERED hint.

Using cost-based optimization, the first table in the FROM clause is the driving table
when the ORDERED hint is used. The hint prevents the optimizer from choosing the
driving table itself. If a SORT-MERGE join is used, then the order of the tables has no
impact because neither will drive the query. Knowing which table is generally the
driving table when using an ORDERED hint in small joins can help you solve larger table
join issues and also help you find indexing problems.

When hash initialization parameters are set up, the optimizer uses
HASH joins in lieu of SORT-MERGE joins. With HASH joins, the first
table is used to build a hash table (in memory if available), and the
second table in the FROM clause then probes for corresponding
hash table matches. The first table in the FROM clause (using the
ORDERED hint) is the first table accessed in a HASH join.
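
As a small illustration of the point above, the two statements below differ only in
the order of the tables in the FROM clause. The SCOTT demo tables and aliases are
assumptions made for this sketch. Following the notes, ORDERED asks Oracle to access
the first listed table first, but it is worth confirming the actual build table in the
execution plan, since the optimizer makes the final decision.

```sql
-- DEPT listed first: with ORDERED it is accessed first and is the intended build table.
select /*+ ORDERED USE_HASH(e) */ d.dname, e.ename
from scott.dept d, scott.emp e
where e.deptno = d.deptno;

-- EMP listed first: the same join, now requesting EMP as the first table accessed.
select /*+ ORDERED USE_HASH(d) */ d.dname, e.ename
from scott.emp e, scott.dept d
where e.deptno = d.deptno;
```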

=================== A Two-Table INDEXED Join: Equal-Sized Tables (Cost-Based) ==========================

All conditions being equal, the first table in the FROM clause in cost-based
optimization is the driving table. The index is used on the join condition for the
second table. In Example 1, Oracle used a NESTED LOOPS join to join the tables, but a
HASH join or SORT-MERGE join was also possible, depending on the number of records in
the table and index.

All conditions being equal, the first table in the FROM clause in cost-based
optimization using a NESTED LOOPS join is the driving table with or without the ORDERED
hint. Only the ORDERED hint guarantees the order in which the tables will be accessed.
The index is used on the join condition for the second table.

Using cost-based optimization and a NESTED LOOPS join as the means of joining, the
first table in the FROM clause is the driving table (all other conditions being equal),
but only the ORDERED hint guarantees this. In NESTED LOOPS joins, choosing a driving
table that is the smaller result set (not always the smaller table) makes fewer loops
through the other result set (from the nondriving table) and usually results in the
best performance.

====================== Forcing a Specific Join Method ===================================

In these situations, you can use the USE_NL, USE_MERGE, and USE_HASH hints to request
a specific join method, and you can use the ORDERED hint to request a specific join
order.
Forcing a NESTED LOOPS join

select /*+ USE_NL (a b) */
b.business_unit,b.po_number,b.vendor_type,a.line_number,
a.line_amount,a.line_status,a.description
from purchase_order_lines a, purchase_orders b
where b.business_unit = a.business_unit
and b.po_number = a.po_number
order by b.business_unit,b.po_number,a.line_number;

Forcing a SORT-MERGE join

select /*+ USE_MERGE (a b) */
a.business_unit,a.po_number,a.vendor_type,b.line_number,
b.line_amount,b.line_status,b.description
from purchase_orders a,purchase_order_lines b
where b.business_unit = a.business_unit
and b.po_number = a.po_number
order by a.business_unit,a.po_number,b.line_number;

Forcing a HASH join

select /*+ USE_HASH (a b) */
a.business_unit,a.po_number,a.vendor_type,b.line_number,
b.line_amount,b.line_status,b.description
from purchase_orders a,purchase_order_lines b
where b.business_unit = a.business_unit
and b.po_number = a.po_number
order by a.business_unit,a.po_number,b.line_number;

===================== A Two-Table Join Between a Large and Small Table (No Index, Cost-Based) =====================

Using cost-based optimization, when a large table and a small table are joined, the
smaller table is used to build a hash table in memory on the join key. The larger table
is scanned and then probes the hash table for matches to the join key. Also note that
if there is not enough memory for the hash, the operation can become extremely slow
because the hash table may be split into multiple partitions that could be paged to
disk. If the ORDERED hint is specified, then the first table in the FROM clause will be
the driving table and it will be the one used to build the hash table.

===================== A Two-Table Join Between a Large and Small Table (Indexed, Cost-Based) =====================

Using cost-based optimization, when a large and small table are joined, the larger
table is the driving table if an index can be used on the large table. If the ORDERED
hint is specified, then the first table in the FROM clause will be the driving table.

====================== Bitmap Join Indexes ===============================

You should use b-tree indexes when columns are unique or near-unique; you should at
least consider bitmap indexes in all other cases. Although you generally would not use
a b-tree index when retrieving 40 percent of the rows in a table, using a bitmap index
usually makes this task faster than doing a full table scan.

Bitmap indexes are smaller than b-tree indexes and work differently. You can use bitmap
indexes even when retrieving large percentages (20-80 percent) of a table. You can also
use bitmaps to evaluate conditions based on NULLs (because NULLs are also indexed) and,
for the same reason, not-equal conditions.

Bitmap indexes do not perform well in a heavy DML (UPDATE, INSERT, DELETE) environment
and generally are not used in certain areas of an OLTP environment. There is a heavy
cost if you are doing a lot of DML, so be very careful with this.

The bitmap join index in Oracle is a lot like building a single index across two
tables. You must first build a primary key or unique constraint on the join column of
one of the tables (DEPT1.DEPTNO in the example that follows).

create bitmap index empdept_idx
on emp1(dept1.deptno)
from emp1, dept1
where emp1.deptno = dept1.deptno;
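
The constraint that the index creation relies on is not shown in these notes. A minimal
sketch of it, assuming DEPT1 does not yet have a primary key or unique constraint on
DEPTNO, could look like this; the constraint name dept1_deptno_uk is made up for
illustration. It has to be in place before the bitmap join index is created.

```sql
-- Make DEPT1.DEPTNO unique so that it can serve as the join column of the index.
alter table dept1
add constraint dept1_deptno_uk unique (deptno);
```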

============== Bitmap Join Indexes on Columns Other Than the Join


Remember, the join condition must be on the primary key or unique column.

create bitmap index emp_dept_location
on emp1 (dept1.loc)
from emp1, dept1
where emp1.deptno = dept1.deptno;

The query shown next can now use the bitmap join index appropriately.
select emp1.empno, emp1.ename, dept1.loc
from emp1, dept1
where emp1.deptno = dept1.deptno;
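
A query that also filters on the indexed DEPT1 column is the kind that tends to benefit
most from such an index. The sketch below assumes DEPT1 holds the usual demo data, so
that 'DALLAS' is a valid location; whether the optimizer actually picks the bitmap join
index depends on statistics and costs, so the plan is displayed to check.

```sql
-- Count employees in one location; LOC is the column stored in the bitmap join index.
explain plan for
select count(*)
from emp1, dept1
where emp1.deptno = dept1.deptno
and dept1.loc = 'DALLAS';

-- Look for the bitmap join index name in the displayed plan.
select * from table(dbms_xplan.display);
```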

========= Bitmap Join Indexes on Multiple Columns

create bitmap index emp_dept_location_deptname
on emp1 (dept1.loc, dept1.dname)
from emp1, dept1
where emp1.deptno = dept1.deptno;

The query in the following listing would now be able to use the bitmap join index
appropriately:

select emp1.empno, emp1.ename, dept1.loc, dept1.dname
from emp1, dept1
where emp1.deptno = dept1.deptno;

=============== Bitmap Join Indexes on Multiple Tables

The example shown next assumes that the unique constraint on dept1.deptno from the
earlier example (where we added a unique constraint to the DEPT1 table) exists, and
that a unique constraint also exists on sales1.empno (creation not shown).

create bitmap index emp_dept_location_ms
on emp1 (dept1.loc, sales1.marital_status)
from emp1, dept1, sales1
where emp1.deptno = dept1.deptno
and emp1.empno = sales1.empno;

The query in this next listing would now be able to use the bitmap join index
appropriately:

select emp1.empno, emp1.ename, dept1.loc, sales1.marital_status
from emp1, dept1, sales1
where emp1.deptno = dept1.deptno
and emp1.empno = sales1.empno;

================= Bitmap Join Index Caveats

Because the result of the join is stored, only one table can be updated concurrently
by different transactions, and parallel DML is supported only on the fact table.
Parallel DML on the dimension table marks the index as unusable. No table can appear
twice in the join, and you can't create a bitmap join index on an index-organized table
(IOT) or a temporary table.
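
If you suspect that a bitmap join index has been marked unusable, the data dictionary
shows its state. This is a sketch assuming the EMP1 table and the empdept_idx index
from the earlier listing are in your own schema.

```sql
-- JOIN_INDEX is YES for bitmap join indexes; STATUS shows VALID or UNUSABLE.
select index_name, index_type, join_index, status
from user_indexes
where table_name = 'EMP1';

-- Rebuild the index if it has become unusable.
alter index empdept_idx rebuild;
```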

================================= Miscellaneous Tuning Snippets =================================

The issues covered in this section will help the advanced DBA. We'll look at external
tables, consider the Snapshot Too Old issue along with how to set the event to dump
every wait, and explore what's really going on by performing block dumps.
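
As an illustration of the wait tracing mentioned above, a commonly used form is event
10046 at level 8, which writes wait events for the session to its trace file. This is a
sketch for a test session only; it assumes you have the ALTER SESSION privilege and can
read the trace files on the database server.

```sql
-- Start tracing SQL statements together with their wait events.
alter session set events '10046 trace name context forever, level 8';

-- Run the statements you want to examine here.

-- Stop tracing.
alter session set events '10046 trace name context off';
```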

============================= External Tables


This simple example shows you exactly how to use external tables. First, you need a
flat file of data to access for the examples. You do this by simply spooling some data
from our familiar friend, the EMP table.

set head off
set verify off
set feedback off
set pages 0
spool emp4.dat
select empno||','||ename ||','|| job||','||deptno||','
from scott.emp;
spool off

Output of the emp4.dat file

7369,SMITH,CLERK,20,
7499,ALLEN,SALESMAN,30,
7521,WARD,SALESMAN,30,

Then you need to create a directory from within SQL*Plus so that Oracle knows where to
find your external tables.

SQL> create directory rich_new as '/u01/home/oracle/rich';
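
If the external table will be created or queried by a user other than the one who
created the directory, that user needs privileges on the directory object. The grantee
SCOTT is an assumption made for this sketch; WRITE is included because the ORACLE_LOADER
driver writes its log and bad files to the default directory unless told otherwise.

```sql
-- Let the (assumed) SCOTT user read the data file and write log and bad files.
grant read, write on directory rich_new to scott;
```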

create table emp_external4
(empno char(4), ename char(10), job char(9), deptno char(2))
organization external
(type oracle_loader
default directory rich_new
access parameters
(records delimited by newline
fields terminated by ','
(empno, ename, job, deptno))
location ('emp4.dat'))
reject limit unlimited;

SQL> desc emp_external4

There is currently no support for DML (INSERT, UPDATE, DELETE) commands on external
tables, but you can always make such changes outside the database because the data is
in a flat file; with shell scripting, you can certainly replicate those commands.
Although you can't currently create an index on an external table, external tables are
pleasantly and surprisingly fast. You can also count the records in the flat file using
SQL, since you've now built an external table. The command shown next takes less than
one second to return its result.

select count(*)
from emp_external4;
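
If you do need indexes or DML on this data, one option is to copy it from the external
table into a regular table. The sketch below is illustrative only: the names emp_loaded
and emp_loaded_deptno_idx are made up, and the conversions assume every record in
emp4.dat has numeric EMPNO and DEPTNO values.

```sql
-- Copy the flat-file data into a regular heap table, converting the CHAR columns.
create table emp_loaded as
select to_number(empno) as empno,
       rtrim(ename) as ename,
       rtrim(job) as job,
       to_number(deptno) as deptno
from emp_external4;

-- Unlike the external table, the copy can be indexed.
create index emp_loaded_deptno_idx on emp_loaded (deptno);
```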
