
How the CBO works

Jonathan Lewis
www.jlcomp.demon.co.uk
Who am I

Independent Consultant.

18+ years' experience.

Design, Strategy, Reviews,
Briefings, Seminars,
Tutorials, Trouble-shooting

www.jlcomp.demon.co.uk

Highlights

A Puzzle
Basic Costs
Correcting Oracle's assumptions
Oracle 9 learns
(Join Mechanics - time permitting)
Q and A

A puzzle (v8.1 - 4K)
create table t1 as
select
trunc((rownum-1)/15) n1,
trunc((rownum-1)/15) n2,
rpad('x',100) v1
from all_objects where rownum <= 3000;

create table t2 as
select
mod(rownum,200) n1,
mod(rownum,200) n2,
rpad('x',100) v1
from all_objects where rownum <= 3000;
We construct two sets of data with identical content, although we do use two different mathematical methods to get 15 rows each for 200 different values.
A puzzle - indexed

create index t_i1 on t1(n1);

create index t_i2 on t2(n1);

analyze table t1 compute statistics;

analyze table t2 compute statistics;

We create indexes and generate statistics. In newer versions, we should use the dbms_stats package, not analyze. (Note: compute is often overkill.)
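
A minimal sketch of the dbms_stats alternative (standard gather_table_stats parameters; cascade => true collects the index statistics as well):

begin
  dbms_stats.gather_table_stats(user, 'T1', cascade => true);
  dbms_stats.gather_table_stats(user, 'T2', cascade => true);
end;
/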
A puzzle - checking data

USER_TABLES
TABLE_NAME BLOCKS NUM_ROWS AVG_ROW_LEN
----------- ------ ---------- -----------
T1 96 3000 111
T2 96 3000 111

USER_TAB_COLUMNS
TAB COL LOW_VALUE HIGH_VALUE NUM_DISTINCT
--- ---- --------- ---------- ------------
T1 N1 80 C20264 200
T2 N1 80 C20264 200

We can check that statistics like the number of rows, column values and column counts are identical. The data content is identical across the two tables.
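
The figures above come straight from the dictionary; a sketch of the sort of queries involved (the column lists match the headings shown):

select table_name, blocks, num_rows, avg_row_len
from user_tables
where table_name in ('T1','T2');

select table_name, column_name, low_value, high_value, num_distinct
from user_tab_columns
where table_name in ('T1','T2')
and column_name = 'N1';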
A puzzle - the problem

select * from t1 where n1 = 45;


SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2 Card=15)
INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE) (Cost=1 Card=15)

select * from t2 where n1 = 45;


SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=15)
TABLE ACCESS (FULL) OF 'T2' (Cost=15 Card=15)

We now run exactly the same query against the two sets of data - with autotrace switched on - and find that the execution plans are different.
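
For reference, plans of this sort can be produced in SQL*Plus with autotrace (assuming the plustrace role and plan table are in place), e.g.:

set autotrace traceonly explain
select * from t1 where n1 = 45;
set autotrace off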
A puzzle - force it

select /*+ index(t2) */ * from t2 where n1 = 45;


SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=16 Card=15)
INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

select * from t2 where n1 = 45;


SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=15)
TABLE ACCESS (FULL) OF 'T2' (Cost=15 Card=15)

Why has Oracle ignored the index on T2? Put in the hint(s) to make it happen, and see if we get any clues. The tablescan is cheaper!
A puzzle - the detail
select
table_name tab,
num_rows num_rows,
avg_leaf_blocks_per_key l_blocks,
avg_data_blocks_per_key d_blocks,
clustering_factor cl_fac
from user_indexes;

TAB NUM_ROWS L_BLOCKS D_BLOCKS CL_FAC
---- -------- -------- -------- ------
T1       3000        1        1     96
T2       3000        1       15   3000

Why is the tablescan cheaper? We look at the data scattering, rather than the data content, and find the answer. The clustering is different.
A puzzle - the difference
T1 (trunc)                          T2 (mod)
0, 0, 0, 0, ... 1, 1, 1, ...        0, 1, 2, 3, 4, 5, 6, 7, ...
5, 5, 5, 5, ... 6, 6, 6, 6, ...     45, 46, 47, 48, 49, ...
10, 10, 10, ... 11, 11, ...         103, 104, 105, ...
45, 45, 45, 45, 45, ...             ... 198, 199, 0, 1, 2, ...
46, 46, ...                         40, 41, 42, 43, 44, 45, ...
131, 131, 131, ...                  44, 45, 46, ...
... 199, 199, 199, 199              ... 196, 197, 198, 199

The data on the left shows the effect of the trunc() function, the data on the right shows the mod() effect. The statistics describe the data perfectly.
The arithmetic
T2 by index:
one index block + 15 data blocks = 16

T2 by scan:
96 blocks / 8 (multiblock read) = 12
(this is a first approximation)

T1 by index:
one index block + one data block = 2
Silly assumption 1: Every logical request turns into a physical read.
Silly assumption 2: A multiblock read is just as fast as a single block read.
Multiblock Read
Actual   Adjusted
     4      4.175
     8      6.589
    16     10.398
    32     16.409
    64     25.895
   128     40.865

96 / 6.589 = 14.57 (and add 1 in 9.2)

The cost of a tablescan uses an 'adjusted' db_file_multiblock_read_count. Under 'traditional' costing the v9 cost is always one more than the v8 cost.
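
As an aside, the adjusted values above happen to fit a simple power curve - an observed fit, not a published Oracle formula:

select round(1.6765 * power(8, 0.6581), 3) adjusted_8
from dual;                     -- 6.589, matching the table above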
Multiblock Read Count
select * from t2 where n1 = 45; -- tablescan cost was 15

alter session set DB_FILE_MULTIBLOCK_READ_COUNT = 4;


SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=16 Card=15)
INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

(tablescan cost would be 23)

alter session set DB_FILE_MULTIBLOCK_READ_COUNT = 16;

SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=15)
TABLE ACCESS (FULL) OF 'T2' (Cost=10 Card=15)

We can affect access paths by changing the db_file_multiblock_read_count. But this is a bit of a drastic change on a production system.
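
The two costs drop straight out of the adjusted values (a back-of-envelope check, assuming the optimizer rounds up):

select ceil(96 / 4.175) cost_mbrc_4,     -- 23, as noted above
       ceil(96 / 10.398) cost_mbrc_16    -- 10, matching the plan
from dual;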
Single-block adjustment
select * from t2 where n1 = 45; -- index access cost was 16

alter session set OPTIMIZER_INDEX_COST_ADJ = 50;

SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=8 Card=15)
INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

alter session set OPTIMIZER_INDEX_COST_ADJ = 25;

SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=4 Card=15)
INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)
Under Oracle 9 the numbers are slightly different.
We can fix one of Oracle's silly assumptions - let it know that single block reads are cheaper (faster) than multiblock reads - by setting a percentage cost adjustment.
Single-block adjustment
select event, average_wait
from v$system_event -- v$session_event
where event like 'db file s%read';

EVENT AVERAGE_WAIT
db file sequential read 1.05
db file scattered read 3.72

sequential read time / scattered read time = 1.05/3.72 = 0.28226


alter session set optimizer_index_cost_adj = 28;
(set in the init.ora, or via a logon trigger)

Tim Gorman (www.evdbt.com) - The search for intelligent life in the CBO.
*** but see Garry Robinson: http://www.oracleadvice.com/Tips/optind.htm

The really nice thing about this is that we can set a genuine and realistic cost factor by checking recent, or localised, history. (snapshot v$session_event)
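
A sketch of deriving the setting straight from the wait statistics (same view and columns as above; the 28 is just the rounded ratio):

select round(100 * seq.average_wait / scat.average_wait) suggested_oica
from v$system_event seq, v$system_event scat
where seq.event = 'db file sequential read'
and scat.event = 'db file scattered read';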
Join cost-adjustment

select t2.n1, t1.n2
from   t2, t1
where  t2.n2 = 45
and    t2.n1 = t1.n1;

SELECT STATEMENT (Cost=31 Card=225)
HASH JOIN (Cost=31 Card=225)              31 = 15 + 15 + a bit
TABLE ACCESS (FULL) OF T2 (Cost=15 Card=15)
TABLE ACCESS (FULL) OF T1 (Cost=15 Card=3000)

We can even see the effect of this price fixing in joins. Unhinted, or unfixed, the optimizer chooses a hash join as the cheapest way to join our two tables.
Hash Join (1)

[Diagram: the first (smaller) data set is hashed in memory; the second (larger) data set probes it for matches.]

The first table is hashed in memory, the second table is used to probe the hash (build) table for matches. In simple cases the cost is easy to calculate.
Force a nested loop

select
/*+ ordered use_nl(t1) index(t1) */
t2.n1, t1.n2
from t2,t1
where t2.n2 = 45
and t2.n1 = t1.n1;

NESTED LOOPS (Cost=45 Card=225)
TABLE ACCESS (FULL) OF T2 (Cost=15, Card=15)
TABLE ACCESS (BY ROWID) OF T1 (Cost=2, Card=3000)
INDEX (RANGE SCAN) OF T_I1 (NON-UNIQUE) (Cost=1)

As usual, to investigate why a plan is going wrong, we hint it to make it do what we want - and then look for clues in the resulting cost lines.
Forced NL cost
alter session set OPTIMIZER_INDEX_COST_ADJ = 100; -- def
NESTED LOOPS (Cost=45 Card=225)
TABLE ACCESS(FULL) OF T2 (Cost=15, Card=15)
TABLE ACCESS(BY ROWID) OF T1(Cost=2, Card=3000)
INDEX(RANGE SCAN) OF T_I1(NON-UNIQUE)(Cost=1)

T2 - cost = 15                Estimated rows = 15

For each row from T2 we access T1 by complete key value

T1 - cost per access = 2
     cost for 15 accesses = 15 x 2 = 30

Total cost of query = cost of T2 + total cost of T1
                    = 15 + 30 = 45

The nested loop algorithm is: for each row in the outer table, use the value in that row to access the inner table - hence the simple formula.
Nested Loop
[Diagram: three driving rows from T1, each used for an indexed probe into T2.]

The basic arithmetic of the nested loop join is visible in the picture. We do three indexed accesses into T2, but need the three driving rows from T1 first.
Nested Loops - recosted
alter session set OPTIMIZER_INDEX_COST_ADJ = 50;
NESTED LOOPS (Cost=30 Card=225)
TABLE ACCESS(FULL) OF T2 (Cost=15, Card=15)
TABLE ACCESS(BY ROWID) OF T1(Cost=1, Card=3000)
INDEX(RANGE SCAN) OF T_I1(NON-UNIQUE)(Cost=1)

T2 - cost = 15                Estimated rows = 15

For each row from T2 we access T1 by complete key value

T1 - cost per access = 2 x 50% = 1
     cost for 15 accesses = 15 x 1 = 15

Total cost of query = cost of T2 + total cost of T1
                    = 15 + 15 = 30

What happens to the cost when we tell Oracle that single block reads cost half as much as it would otherwise charge?
Index Caching (NL only)
Basic nested loop cost (hinted)

NESTED LOOPS (Cost=45 Card=225)
TABLE ACCESS (FULL) OF 'T2' (Cost=15, Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2, Card=3000)
INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE) (Cost=1)

alter session set OPTIMIZER_INDEX_CACHING = 100;

NESTED LOOPS (Cost=30 Card=225)
TABLE ACCESS (FULL) OF 'T2' (Cost=15, Card=15)
TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=1, Card=3000)
INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE)

We can improve on silly assumption 1 (every logical I/O is also a physical I/O). Index blocks are often cached. So tell the optimizer how good our cache is.
Simplifications

• What about the blevel?
• What about multi-column indexes?
• What about unbounded ranges?
• What about unique indexes?
• What about bitmap indexes?
This was a visually helpful introduction. The walk-through is intended to give you a gut-feeling of how the optimizer works. But there are plenty of special cases, and bits of funny arithmetic.
An improved approximation:

cost = blevel +
       selectivity * leaf_blocks +
       selectivity * clustering_factor

(see Wolfgang Breitling's paper to IOUG-A 2002)

For equality on all index columns:

avg_leaf_blocks_per_key ≈ selectivity * leaf_blocks
avg_data_blocks_per_key ≈ selectivity * clustering_factor
For multi-column indexes, or when using a range scan, we need more precise arithmetic - but even our example was a special case of the general formula.
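
Plugging the puzzle statistics in (selectivity = 1/200, clustering_factor values from the earlier slide) reproduces the avg_data_blocks_per_key figures - a sanity check, not the optimizer's own code:

select ceil((1/200) * 96) t1_blocks_per_key,      -- 1, as reported for T1
       ceil((1/200) * 3000) t2_blocks_per_key     -- 15, as reported for T2
from dual;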
Adjusted cost:
(
  (blevel + selectivity * leaf_blocks)            index bit
  * (1 - optimizer_index_caching/100)
  +
  selectivity * clustering_factor                 table bit
) *
optimizer_index_cost_adj / 100

The formula that Wolfgang Breitling proposed has to be adjusted to handle the two 'fudge factor' parameters. This formula seems to be about right.
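
Checking the adjusted formula against the earlier T2 plans - in this example the blevel appears to drop out (it is 1), leaving one leaf block plus 15 table blocks; an observation, not documented behaviour:

select ceil((1 + 15) * 50/100) cost_oica_50,      -- 8, matching the earlier plan
       ceil((1 + 15) * 25/100) cost_oica_25       -- 4, matching the earlier plan
from dual;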
System Statistics (v9)

dbms_stats.gather_system_stats('start')
dbms_stats.gather_system_stats('stop')

SNAME          PNAME     PVAL1
SYSSTATS_MAIN  CPUSPEED  357 MHz
SYSSTATS_MAIN  SREADTIM  7.179 ms      single block is
SYSSTATS_MAIN  MREADTIM  18.559 ms     cheaper than multi
SYSSTATS_MAIN  MBRC      5             used in t/s cost

But in version 9 you need the 'fudge factors' less (you could still use them as indicators of caching) - instead, you let Oracle learn about your hardware.
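
The gathered figures can be inspected directly - in 9i dbms_stats stores them in sys.aux_stats$ (assuming you have select privilege on it):

select sname, pname, pval1
from sys.aux_stats$
where sname = 'SYSSTATS_MAIN';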
Conclusions

• Understand your data
• Data distribution is important
• Think about your parameters
• Help Oracle with the truth
• Use system statistics in v9

Sort / Merge
select
count(t1.v1) ct_v1,
count(t2.v2) ct_v2
from big1 t1, big2 t2
where t2.n2 = t1.n1;

SELECT STATEMENT (choose) Cost (963)
SORT (aggregate)
MERGE JOIN Cost (963 = 174 + 789)
SORT (join) Cost (174)
TABLE ACCESS (analyzed) T1 (full) Cost (23)
SORT (join) Cost (789)
TABLE ACCESS (analyzed) T2 (full) Cost (115)

The cost of a sort-merge equi-join is typically the cost of acquiring each of the two data sets, plus the cost of making sure the two data sets are sorted.
Sort Merge
[Diagram: Table 1 and Table 2 are each sorted, then merged; the merged result passes to the next step - e.g. an order by.]

Once the two sets are in order, they can be shuffled together. The shuffling can be quick - the sorting may be the most expensive bit.
In-memory sort
[Diagram: the sort workarea sits in the PGA / UGA (the UGA is in the SGA for Shared Servers (MTS)); up to sort_area_retained_size stays in memory, and the excess, sort_area_size - sort_area_retained_size, goes to disc.]

In a merge join, even if the first sort completes in memory, it will still dump the excess over sort_area_retained_size to disc (and so will the second sort).
Big Sorts

[Diagram: three sorted chunks are dumped to disc, then merged together in a single pass.]

A one-pass sort. The data has been read, sorted, and dumped to disc in chunks, then re-read once to be merged into order, and dumped again.
Huge Sorts

[Diagram: the sorted chunks cannot all be merged at once; merge 1 and merge 2 produce intermediate results, which merge 3 combines.]

Multipass sort. After sorting the data in chunks, Oracle was unable to re-read the top of every chunk simultaneously, so we have multiple merge passes.
Hash Join (1)

[Diagram: the first (smaller) data set is hashed in memory; the second (larger) data set probes it for matches.]

The first table is hashed in memory, the second table is used to probe the hash (build) table for matches. In simple cases the cost is easy to calculate.
Hash Join (2)
[Diagram: the small data set is hashed, partitioned and bit-mapped, with some partitions dumped to disc; the big data set is partitioned the same way and also dumped to disc.]

If the smaller data set cannot be hashed in memory, it is partitioned, mapped, and partly dumped to disc. The larger data set is partitioned in the same way.
Hash Join (3)
[Diagram: as Hash Join (2), but with oversized partitions - the dumped probe partitions have to be revisited.]

And if things go really wrong (bad statistics), Oracle uses partitions which are too large - and the probe (secondary) partitions are re-read many times.
Version 9 approach
pga_aggregate_target = 500M
workarea_size_policy = auto

v$sysstat - sorts, hashes, bitmap creates (v9):

workarea executions - optimal
The job completed in memory - perfect.

workarea executions - onepass
The job required a dump to disc and a single re-read.

workarea executions - multipass
Data was dumped to disc and re-read more than once.

Under Oracle 9, you should be setting workarea_size_policy to auto, and set pga_aggregate_target to something big - the limit per user is 5%.
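
A quick way to see how often each case occurs, using the statistic names listed above:

select name, value
from v$sysstat
where name like 'workarea executions%';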
Conclusions 2

• Sort joins have catastrophe points
• Hash joins have catastrophe points
• Work to avoid multi-pass
• pga_aggregate_target helps (v9)

