Disclaimer
The goal of this session is to provide you with a guide for reading SQL execution plans and to help you determine whether a plan is the one you should be expecting.
This session will not provide you with sudden enlightenment, make you an Optimizer expert, or give you the power to tune SQL statements with the flick of your wrist!
Agenda
What is an execution plan and how to generate one
What is a good plan for the optimizer
Understanding execution plans
  Cardinality
  Access paths
  Join order
  Join type
  Partition pruning
  Parallelism
2. V$SQL_PLAN
A dictionary view introduced in Oracle 9i that shows the execution plan for a SQL statement that has been compiled into a cursor in the cursor cache
FROM sales s, products p
WHERE p.prod_id = s.prod_id
GROUP BY prod_category;

no rows selected

SQL> SELECT plan_table_output
     FROM table(dbms_xplan.display_cursor(null,null,'basic'));

------------------------------------------
Id  Operation                Name
------------------------------------------
 0  SELECT STATEMENT
 1   HASH GROUP BY
 2    HASH JOIN
 3     TABLE ACCESS FULL     PRODUCTS
 4     PARTITION RANGE ALL
 5      TABLE ACCESS FULL    SALES
------------------------------------------
2. Indirectly:
SQL> SELECT plan_table_output
     FROM v$sql s,
          TABLE(dbms_xplan.display_cursor(s.sql_id, s.child_number, 'basic')) t
     WHERE s.sql_text like 'select PROD_CATEGORY%';
What is Cost?
A magical number the optimizer makes up?
Resources required to execute a SQL statement?
Result of complex calculations?
Estimate of how long it will take to execute a statement?
Actual Definition
What is performance?
Getting as many queries completed as possible?
Getting the fastest possible elapsed time using the fewest resources?
Getting the best concurrency rate?
Actual Definition
Performance is the fastest possible response time for a query
Goal is to complete the query as quickly as possible
Access paths
Is the data being accessed in the best way? Scan? Index lookup?
Join order
Are tables being joined in the correct order to eliminate as much data as early as possible?
Join type
Are the right join types being used?
Partition pruning
Did I get partition pruning? Is it eliminating enough data?
Parallelism
Cardinality
What is it?
Estimate of the number of rows that will be returned
Cardinality for a single-value predicate = total num_rows / total num_distinct
E.g. 100 rows total, 10 distinct values => cardinality=10 rows
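The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula on this slide, assuming a uniform distribution of values; the function names are hypothetical and are not part of any Oracle API.

```python
def estimated_cardinality(num_rows: int, num_distinct: int) -> float:
    """Estimated rows for a single-value equality predicate, assuming
    every distinct value occurs equally often."""
    if num_distinct <= 0:
        raise ValueError("num_distinct must be positive")
    return num_rows / num_distinct


def selectivity(num_distinct: int) -> float:
    """Fraction of the table expected to match one value (uniform assumption)."""
    if num_distinct <= 0:
        raise ValueError("num_distinct must be positive")
    return 1 / num_distinct


# The example from the slide: 100 rows, 10 distinct values.
rows_expected = estimated_cardinality(100, 10)
print(rows_expected)  # 10.0
```

Cardinality and selectivity are two views of the same estimate: cardinality is selectivity multiplied by the number of rows.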
Cardinality or Selectivity
To determine the correct cardinality, run a simple SELECT COUNT(*) against each table, applying any WHERE clause predicates that belong to that table
Data Skew
Cardinality = num_rows / num_distinct
If there is a data skew, the selectivity could be way off
Create a histogram to correct the selectivity calculation
Oracle automatically creates a histogram if it suspects a data skew
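To see why skew breaks the num_rows / num_distinct estimate, here is a small Python sketch. The column values are made up for illustration, and the per-value counter only mimics the idea of a frequency histogram; it is not how Oracle stores or uses histograms internally.

```python
from collections import Counter

# Hypothetical column: 100 rows, 10 distinct values, one value dominates.
values = ["US"] * 91 + ["AU", "BR", "CA", "DE", "FR", "IE", "IN", "JP", "UK"]

num_rows = len(values)            # 100
num_distinct = len(set(values))   # 10
uniform_estimate = num_rows / num_distinct  # same estimate for every value

# One bucket per distinct value holding its actual row count.
histogram = Counter(values)


def estimate(value: str, use_histogram: bool) -> float:
    """Estimated rows for `column = value`, with or without the histogram."""
    if use_histogram:
        return histogram.get(value, 0)
    return uniform_estimate


for v in ("US", "IE"):
    print(v, estimate(v, use_histogram=False), estimate(v, use_histogram=True))
```

Without the histogram both values get the same estimate of 10 rows, which is far too low for the popular value and too high for the rare one.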
Be careful
Histograms have interesting side effects on statements with binds
Less relevant for data warehousing
Prior to 11g, a statement with binds had only one plan, based on the first literal value
But the presence of a histogram indicates skew, so one plan is unlikely to be good for all bind values
In 11g, multiple execution plans are allowed for a single statement
Applying a function to a column means the optimizer does not know how it will affect the cardinality
Most likely the optimizer will under-estimate the cardinality
Creating extended statistics for this function allows the optimizer to get the correct cardinality
exec dbms_stats.gather_table_stats('sh','customers', method_opt => 'for all columns size skewonly for columns (lower(country_id))');
Access Paths
How to get data out of the table
The access path can be:
Full table scan
Table access by Rowid
Index unique scan
Index range scan (descending)
Index skip scan
Full index scan
Fast full index scan
Index joins
Bitmap indexes
Access Path
If you know the wrong access method is being used, check the cardinality and the join order
Select country_id, name from countries where country_id between 'AU' and 'IE';
Join Type
A join retrieves data from more than one table
Possible join types are:
Nested Loops joins
Hash Joins
Partition Wise Joins
Sort Merge joins
Cartesian Joins
Outer Joins
Employees has 107 rows
Departments has 27 rows
Foreign key relationship between Employees and Departments on dept_id
Join Type
Look in the Operation section to check the right join type is used
If the wrong join type is used, go back and check that the statement is written correctly and that the cardinality estimates are accurate
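As a rough illustration of how two of the join types differ, the Python sketch below joins a tiny, made-up employees list to a made-up departments list on dept_id, first with nested loops and then with a hash join. The rows and function names are hypothetical, and the code is a conceptual model, not Oracle's implementation.

```python
from collections import defaultdict

# Hypothetical rows: (dept_id, dept_name) and (emp_name, dept_id).
departments = [(10, "Administration"), (20, "Marketing"), (30, "Purchasing")]
employees = [("King", 10), ("Hartstein", 20), ("Fay", 20), ("Raphaely", 30)]


def nested_loops_join(outer, inner):
    """For every outer row, scan the inner input for matching dept_id."""
    result = []
    for emp_name, dept_id in outer:
        for d_id, d_name in inner:
            if d_id == dept_id:
                result.append((emp_name, d_name))
    return result


def hash_join(build, probe):
    """Build a hash table on the smaller input, then probe it once per row."""
    table = defaultdict(list)
    for d_id, d_name in build:
        table[d_id].append(d_name)
    result = []
    for emp_name, dept_id in probe:
        for d_name in table.get(dept_id, []):
            result.append((emp_name, d_name))
    return result


nl_rows = nested_loops_join(employees, departments)
hj_rows = hash_join(departments, employees)
print(sorted(nl_rows) == sorted(hj_rows))  # True: same rows, different work
```

Both functions return the same rows. What changes is the amount of work: nested loops repeats the inner lookup for each outer row, while the hash join reads each input once, which is why accurate cardinality estimates matter when the optimizer chooses between them.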
Join Orders
The order in which the tables are joined in a multi-table statement
Ideally, start with the table that will eliminate the most rows
Strongly affected by the access paths available
Join order
Want to start with the table that reduces the result set the most
If the join order is not correct, check the statistics, cardinality & access methods
Partition Pruning
Q: What were the total sales for the weekend of May 20 - 22 2008?
Sales Table
May 18th 2008
Partition Pruning
If you see the word KEY listed, it means the partitions touched will be decided at run time
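The idea behind pruning can be sketched in Python. The partition layout below (one range partition per day, with made-up names) is purely an assumption for illustration; the slides do not state how the sales table is actually partitioned.

```python
from datetime import date, timedelta

# Hypothetical layout: one range partition per day, May 17 to May 24 2008.
# Each partition covers the half-open range [lower, upper).
first_day = date(2008, 5, 17)
partitions = {}
for i in range(8):
    day = first_day + timedelta(days=i)
    partitions[f"SALES_{day:%Y%m%d}"] = (day, day + timedelta(days=1))


def prune(parts, start, end):
    """Return only the partitions whose range overlaps [start, end]."""
    return [name for name, (lower, upper) in parts.items()
            if lower <= end and upper > start]


# The question from the slide: sales for May 20 - 22 2008.
touched = prune(partitions, date(2008, 5, 20), date(2008, 5, 22))
print(touched)  # ['SALES_20080520', 'SALES_20080521', 'SALES_20080522']
```

When the date range is a literal, the list of partitions can be worked out while the plan is built. When it comes from a bind variable or a subquery, the same check has to wait until execution, which is the case the plan reports as KEY.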
Bloom Filter
[Figure: parallel hash join of the time and sales tables using a shared Bloom filter. Diagram labels: DFO, Send, Receive, Set, Filter Create, Filter Use, Scan Sales, Hash Join]
1. Table scan: The time table is scanned and sent
2. Bloom filter create: The consumer set creates a hash table and a bit vector. The bit vector sets a bit for each row that matches the search conditions
3. Table scan: The sales table is scanned and rows are filtered based on query predicates
4. Bloom filter send: The bit vector is sent as an additional filter criterion to the scan of the sales table
6. Reduced rows sent: Only rows that have a match in the bit vector get sent to the consumers
7. Hash join: Consumers complete the hash join by probing into the hash table built from the time table to find the actual matching rows
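The flow of the numbered steps can be sketched with a toy Bloom filter in Python. Everything here is an assumption made for illustration: the class, the bit-vector size, the hash choice, and the sample rows are hypothetical and do not describe Oracle's internal implementation.

```python
import hashlib


class BloomFilter:
    """Toy bit vector: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive several bit positions per key from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


# Hypothetical rows: time rows that passed the search conditions, and sales rows.
time_rows = {101: "2008-05-20", 102: "2008-05-21", 103: "2008-05-22"}
sales_rows = [(101, 50.0), (250, 10.0), (102, 20.0), (999, 5.0), (103, 30.0)]

# Steps 1-2: consumers build the hash table and the bit vector from the time rows.
bloom = BloomFilter()
hash_table = {}
for time_id, day in time_rows.items():
    hash_table[time_id] = day
    bloom.add(time_id)

# Steps 3-6: the sales scan sends only rows whose key might be in the filter.
sent = [row for row in sales_rows if bloom.might_contain(row[0])]

# Step 7: the hash join probes the hash table and discards any false positives.
joined = [(hash_table[t], amount) for t, amount in sent if t in hash_table]
print(joined)
```

The filter only reduces how many rows travel between the scan and the consumers. The final probe of the hash table is still needed, because a Bloom filter can let a non-matching row through.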
Parallelism
Goal is to execute all aspects of the plan in parallel
Identify if one or more sets of parallel server processes are used
Producers and Consumers
Parallelism
IN-OUT column shows which step is run in parallel and if it is a single parallel server set or not
If you see any lines beginning with the letter S, you are running serially; check the DOP for each table and index used
Each parallel server is allocated one or more granules
The granule method is specified on the line above the scan in the Operation section
Partition
Block Iterator
Parallel Distribution
Necessary when producer and consumer sets are used
Producers must pass or distribute their data to consumers
The operator into which the rows flow decides the distribution
Distribution can be local or across other nodes in RAC
Five common types of redistribution
Parallel Distribution
HASH
Assumes one of the tables is hash partitioned
Hash function applied to value of the join column
Distribute to the consumer working on the corresponding hash partition
Broadcast
The size of one of the result sets is small
Sends a copy of the data to all consumers
Range
Typically used for parallel sort operations
Individual parallel servers work on data ranges
The QC doesn't have to sort, just present the parallel server results in the correct order
Round Robin
Randomly but evenly distributes the data among the consumers
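The four methods listed above can be modelled in Python as functions that spread producer rows over a list of consumers. This is a conceptual sketch only: the function names, the CRC32 hash, and the range boundaries are assumptions, and round robin is modelled as a simple cycle over the consumers.

```python
import bisect
import zlib


def hash_distribute(rows, key, num_consumers):
    """Send each row to the consumer chosen by hashing its join key."""
    consumers = [[] for _ in range(num_consumers)]
    for row in rows:
        slot = zlib.crc32(str(key(row)).encode()) % num_consumers
        consumers[slot].append(row)
    return consumers


def broadcast(rows, num_consumers):
    """Send a full copy of the (small) row set to every consumer."""
    return [list(rows) for _ in range(num_consumers)]


def range_distribute(rows, key, boundaries):
    """Consumer i receives keys in (boundaries[i-1], boundaries[i]];
    the last consumer receives everything above the final boundary."""
    consumers = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        consumers[bisect.bisect_left(boundaries, key(row))].append(row)
    return consumers


def round_robin(rows, num_consumers):
    """Deal rows out to the consumers in turn, giving an even spread."""
    consumers = [[] for _ in range(num_consumers)]
    for i, row in enumerate(rows):
        consumers[i % num_consumers].append(row)
    return consumers


rows = list(range(10))
print(round_robin(rows, 3))                          # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
print(range_distribute(rows, lambda r: r, [3, 6]))   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Broadcast multiplies the data by the number of consumers, which is why it only pays off when the row set is small. Hash and range keep each row on exactly one consumer.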
Parallel Distribution
7. Check all aspects of the plan are executing in parallel
8. Check the distribution method and make sure we are not broadcasting a large table
2. Cardinalities are correct, and with each join the number of rows is reduced
3. Access methods remain the same
4. Partition pruning: one range partition, 4 hash partitions
5. Join types are still hash joins, but now a PWJ
6. The join order has changed: PWJ, then hash join to the look-up table
Explanation
The join to customers is redundant because no columns are selected from it
The presence of a primary key to foreign key relationship means the table can be removed
Q&A