Explaining the Explain Plan

Disclaimer
• The goal of this session is to provide you with a guide for reading SQL execution plans and to help you determine if that plan is what you should be expecting
• This session will not provide you with sudden enlightenment making you an Optimizer expert or give you the power to tune SQL statements with the flick of your wrist!

Agenda
• What is an execution plan and how to generate one
• What is a good plan for the optimizer
• Understanding execution plans
  • Cardinality
  • Access paths
  • Join order
  • Join type
  • Partitioning pruning
  • Parallelism
• Execution plan examples

What is an execution plan and how to generate one

What is an Execution plan?
• Execution plans show the detailed steps necessary to execute a SQL statement
• These steps are expressed as a set of database operators that consume and produce rows
• The order of the operators and their implementation is decided by the optimizer using a combination of query transformations and physical optimization techniques
• The display is commonly shown in a tabular format, but a plan is in fact tree-shaped

What is an Execution plan?

Query
SELECT prod_category, avg(amount_sold)
FROM sales s, products p
WHERE p.prod_id = s.prod_id
GROUP BY prod_category;

Tabular representation of plan
------------------------------------------
 Id  Operation             Name
------------------------------------------
  0  SELECT STATEMENT
  1  HASH GROUP BY
  2  HASH JOIN
  3  TABLE ACCESS FULL     PRODUCTS
  4  PARTITION RANGE ALL
  5  TABLE ACCESS FULL     SALES
------------------------------------------

Tree-shaped representation of plan
           GROUP BY
              |
             JOIN
       _______|_______
      |               |
 TABLE ACCESS    TABLE ACCESS
   PRODUCTS         SALES
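The link between the tabular and the tree-shaped view can be sketched in a few lines of Python. This is only an illustration: the `depth` values are an assumption standing in for the indentation that the Operation column normally carries, and `parent_ids` and `children_of` are hypothetical helper names, not part of any Oracle API.

```python
# Plan rows as (id, depth, operation, object name).
# ASSUMPTION: the depth values mimic the indentation of the Operation
# column; they are not taken from the slide itself.
PLAN = [
    (0, 0, "SELECT STATEMENT", None),
    (1, 1, "HASH GROUP BY", None),
    (2, 2, "HASH JOIN", None),
    (3, 3, "TABLE ACCESS FULL", "PRODUCTS"),
    (4, 3, "PARTITION RANGE ALL", None),
    (5, 4, "TABLE ACCESS FULL", "SALES"),
]


def parent_ids(plan):
    """Map each row id to its parent: the nearest earlier row one level up."""
    parents = {}
    last_at_depth = {}
    for row_id, depth, _operation, _name in plan:
        parents[row_id] = last_at_depth.get(depth - 1)
        last_at_depth[depth] = row_id
    return parents


def children_of(plan):
    """Invert the parent map so each row id lists its child row ids."""
    kids = {row_id: [] for row_id, *_rest in plan}
    for child, parent in parent_ids(plan).items():
        if parent is not None:
            kids[parent].append(child)
    return kids


if __name__ == "__main__":
    for row_id, kids in children_of(PLAN).items():
        print(row_id, "->", kids)
```

In this sketch the HASH JOIN (Id 2) ends up with two children, the scan of PRODUCTS (Id 3) and the partition iterator over SALES (Id 4), which is the fork drawn in the tree.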

How to get an Execution Plan
Two methods for looking at the execution plan
1. EXPLAIN PLAN command
   • Displays an execution plan for a SQL statement without actually executing the statement
2. V$SQL_PLAN
   • A dictionary view introduced in Oracle 9i that shows the execution plan for a SQL statement that has been compiled into a cursor in the cursor cache
Use DBMS_XPLAN package to display plans
Under certain conditions the plan shown with EXPLAIN PLAN can be different from the plan shown using V$SQL_PLAN

How to get an Execution Plan
Example 1: EXPLAIN PLAN command & dbms_xplan.display function

SQL> EXPLAIN PLAN FOR
     SELECT prod_category, avg(amount_sold)
     FROM sales s, products p
     WHERE p.prod_id = s.prod_id
     GROUP BY prod_category;

Explained

SQL> SELECT plan_table_output
     FROM table(dbms_xplan.display('plan_table',null,'basic'));

------------------------------------------
 Id  Operation             Name
------------------------------------------
  0  SELECT STATEMENT
  1  HASH GROUP BY
  2  HASH JOIN
  3  TABLE ACCESS FULL     PRODUCTS
  4  PARTITION RANGE ALL
  5  TABLE ACCESS FULL     SALES
------------------------------------------

How to get an Execution Plan
Example 2: Generate & display execution plan for the last SQL stmt executed in a session

SQL> SELECT prod_category, avg(amount_sold)
     FROM sales s, products p
     WHERE p.prod_id = s.prod_id
     GROUP BY prod_category;

no rows selected

SQL> SELECT plan_table_output
     FROM table(dbms_xplan.display_cursor(null,null,'basic'));

------------------------------------------
 Id  Operation             Name
------------------------------------------
  0  SELECT STATEMENT
  1  HASH GROUP BY
  2  HASH JOIN
  3  TABLE ACCESS FULL     PRODUCTS
  4  PARTITION RANGE ALL
  5  TABLE ACCESS FULL     SALES
------------------------------------------

How to get an Execution Plan
Example 3: Displaying execution plan for any other statement from V$SQL_PLAN

1. Directly:
SQL> SELECT plan_table_output
     FROM table(dbms_xplan.display_cursor('fnrtqw9c233tt',null,'basic'));

2. Indirectly:
SQL> SELECT plan_table_output
     FROM v$sql s,
          TABLE(dbms_xplan.display_cursor(s.sql_id, s.child_number, 'basic')) t
     WHERE s.sql_text like 'select PROD_CATEGORY%';

Note: More information on www.optimizermagic.blogspot.com

What is a good plan for the optimizer

What's a Good Plan for the Optimizer?
The Optimizer has two different goals
• Serial execution: it's all about cost
  • The cheaper, the better
• Parallel execution: it's all about performance
  • The faster, the better
Two fundamental questions:
• What is cost?
• What is performance?

What is Cost?
• A magical number the optimizer makes up?
• Resources required to execute a SQL statement?
• Result of complex calculations?
• Estimate of how long it will take to execute a statement?
Actual Definition
• Cost represents units of work or resources used
• Optimizer uses CPU & memory usage plus IO as units of work
• Cost is an estimate of the amount of CPU and memory plus the number of disk I/Os used in performing an operation
Cost is an internal Oracle measurement

What is performance?
• Getting as many queries completed as possible?
• Getting fastest possible elapsed time using the fewest resources?
• Getting the best concurrency rate?
Actual Definition
• Performance is fastest possible response time for query
• Goal is to complete the query as quickly as possible
• Optimizer does not focus on resources needed to execute the plan

Understanding an Execution Plan

SQL Execution Plan
When looking at a plan can you determine if the following is correct?
• Cardinality
  • Is the correct number of rows coming out of each object?
• Access paths
  • Is the data being accessed in the best way? Scan? Index lookup?
• Join order
  • Are tables being joined in the correct order to eliminate as much data as early as possible?
• Join type
  • Are the right join types being used?
• Partition pruning
  • Did I get partition pruning? Is it eliminating enough data?
• Parallelism

Cardinality
What is it?
• Estimate of the number of rows that will be returned
• Cardinality for a single value predicate = num_rows total / num_distinct total
  • E.g. 100 rows total, 10 distinct values => cardinality = 10 rows
  • OR if histogram present: num_rows * density
Why should you care?
• Influences access method and join order
• If the estimate is off it can have a huge impact on a plan
What causes Cardinality to be wrong?
• Data skews
• Multiple single column predicates on a table
• A function wrapped where clause predicate
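The two formulas on this slide are plain arithmetic, so a small Python sketch can make them concrete. The function names are hypothetical, and the density value in the usage note is made up for illustration; nothing here reads real optimizer statistics.

```python
def cardinality_without_histogram(num_rows, num_distinct):
    """Single-value predicate estimate: num_rows / num_distinct."""
    return num_rows / num_distinct


def cardinality_with_histogram(num_rows, density):
    """Estimate when a histogram is present: num_rows * density."""
    return num_rows * density


if __name__ == "__main__":
    # The slide's example: 100 rows, 10 distinct values.
    print(cardinality_without_histogram(100, 10))
    # ASSUMPTION: 0.25 is an illustrative density, not a real statistic.
    print(cardinality_with_histogram(100, 0.25))
```

With the slide's numbers the first call gives 10 rows. The second call shows only the shape of the histogram formula.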

Cardinality or Selectivity
Cardinality: the estimated # of rows returned
To determine the correct cardinality, run a simple SELECT COUNT(*) against each table, applying any WHERE clause predicates belonging to that table

Data Skew
Cardinality = num_rows / num_distinct
• If there is a data skew the selectivity could be way off
• Create a histogram to correct the selectivity calculation
• Oracle automatically creates a histogram if it suspects a data skew
Be careful
• Histograms have interesting side effects on statements with binds
• Less relevant for data warehousing
• Prior to 11g a stmt with binds had only one plan, based on the first literal value
• But the presence of a histogram indicates skew, so one plan is unlikely to be good for all bind values
• In 11g multiple execution plans are allowed for a single statement
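A toy data set shows why the uniform formula goes wrong under skew. The sketch below uses a Python `Counter` as a stand-in for a frequency histogram. That is an assumption made for illustration and not a description of how Oracle stores or uses histograms; the column values are invented.

```python
from collections import Counter

# ASSUMPTION: an invented, heavily skewed column of 100 rows and
# 10 distinct values (91 rows hold 'US', the other 9 values hold 1 each).
COUNTRY_COLUMN = ["US"] * 91 + ["AU", "BR", "CA", "DE", "FR", "IE", "JP", "NZ", "UK"]


def uniform_estimate(values):
    """num_rows / num_distinct: the same estimate for every value."""
    return len(values) / len(set(values))


def frequency_estimate(values, value):
    """Per-value row count, playing the role of a frequency histogram."""
    return Counter(values)[value]


if __name__ == "__main__":
    print("uniform estimate:", uniform_estimate(COUNTRY_COLUMN))
    print("rows for 'US':", frequency_estimate(COUNTRY_COLUMN, "US"))
    print("rows for 'IE':", frequency_estimate(COUNTRY_COLUMN, "IE"))
```

The uniform estimate is 10 rows for every value, while the per-value counts are 91 for 'US' and 1 for 'IE'. This also hints at why a single plan is unlikely to suit every bind value.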

Multiple Single Column Predicates
• Optimizer always assumes each additional predicate increases the selectivity
  • Selectivity of predicate 1 * selectivity of predicate 2 … etc
• But real data often shows correlations
  • Job title influences salary, car model influences make
• How do you tell the Optimizer about the correlation?
• Extended Optimizer Statistics provides a mechanism to collect statistics on a group of columns
  • Full integration into existing statistics framework
  • Automatically maintained with column statistics
  • Instantaneous and transparent benefit for any migrated application
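The effect of multiplying selectivities on correlated columns can be shown with a toy table. In this sketch the data, the function names, and the column-group estimate (rows divided by the number of distinct value combinations) are simplifying assumptions for illustration; they are not Oracle's actual calculation.

```python
# ASSUMPTION: an invented 40-row table in which model fully determines make.
CARS = (
    [("Toyota", "Corolla")] * 10
    + [("Honda", "Civic")] * 10
    + [("Ford", "Focus")] * 10
    + [("BMW", "320i")] * 10
)


def selectivity(rows, column, value):
    """Fraction of rows whose given column equals value."""
    return sum(1 for row in rows if row[column] == value) / len(rows)


def independent_estimate(rows, make, model):
    """Multiply the single-column selectivities, as if uncorrelated."""
    return len(rows) * selectivity(rows, 0, make) * selectivity(rows, 1, model)


def column_group_estimate(rows):
    """Treat (make, model) as one column: rows / distinct combinations."""
    return len(rows) / len(set(rows))


def actual_rows(rows, make, model):
    return sum(1 for row in rows if row == (make, model))


if __name__ == "__main__":
    print("independent:", independent_estimate(CARS, "Toyota", "Corolla"))
    print("column group:", column_group_estimate(CARS))
    print("actual:", actual_rows(CARS, "Toyota", "Corolla"))
```

Here the independence assumption predicts 2.5 rows for `make = 'Toyota' AND model = 'Corolla'`, while 10 rows actually match. Statistics on the column pair recover the right number in this toy case.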

A Function Wrapped Where Clause Predicate

SELECT * FROM customers WHERE lower(country_id) = 'us';

• Applying a function to a column means the optimizer does not know how it will affect the cardinality
• Most likely the optimizer will under-estimate the cardinality
• Creating extended statistics for this function allows the optimizer to get the correct cardinality

exec dbms_stats.gather_table_stats('sh', 'customers', method_opt => 'for all columns size skewonly for columns(lower(country_id))');

Access Paths
How to get data out of the table
• The access path can be:
  • Full table scan
  • Table access by Rowid
  • Index unique scan
  • Index range scan (descending)
  • Index skip scan
  • Full index scan
  • Fast full index scan
  • Index joins
  • Bitmap indexes

Access Path
Look in the Operation section to see how the object is being accessed
If you know the wrong access method is being used, check cardinality, join order…

Access Path examples
A table countries contains 10K rows & has a primary key on country_id. What plan would you expect for these queries?

Select country_id, name from countries where country_id in ('AU','FR','IE');

Select country_id, name from countries where country_id between 'AU' and 'IE';

Select country_id, name from countries where name='USA';

Join Type
• A join retrieves data from more than one table
• Possible join types are
  • Nested Loops joins
  • Hash Joins
  • Partition Wise Joins
  • Sort Merge joins
  • Cartesian Joins
  • Outer Joins

Join Type Example 1
What join type should be used for this query?

SELECT e.name, e.salary, d.dept_name
FROM hr.employees e, hr.departments d
WHERE d.dept_name IN ('Marketing','Sales')
AND e.department_id=d.department_id;

Employees has 107 rows
Departments has 27 rows
Foreign key relationship between Employees and Departments on dept_id

Join Type Example 2
What join type should be used for this query?

SELECT o.customer_id, l.unit_price * l.quantity
FROM oe.orders o, oe.order_items l
WHERE l.order_id = o.order_id;

Orders has 105 rows
Order Items has 665 rows

Join Type Example 3
What join type should be used for this query?

SELECT o.order_id, o.order_date, e.name
FROM oe.orders o, hr.employees e;

Orders has 105 rows
Employees has 107 rows

Join Type Example 4
What join type should be used for this query?

SELECT d.department_id, e.emp_id
FROM hr.employees e FULL OUTER JOIN hr.departments d
ON e.department_id = d.department_id;

Employees has 107 rows
Departments has 27 rows
Foreign key relationship between Employees and Departments on dept_id

Join Type
Look in the Operation section to check the right join type is used
If the wrong join type is used, go back and check the stmt is written correctly and the cardinality estimates are accurate

Join Orders
The order in which the tables are joined in a multi table stmt
• Ideally start with the table that will eliminate the most rows
• Strongly affected by the access paths available
Some basic rules
• Joins that definitely result in at most one row always go first
• When outer joins are used the table with the outer join operator must come after the other table in the predicate
• If view merging is not possible all tables in the view will be joined before joining to the tables outside the view

Join order
Want to start with the table that reduces the result set the most
If the join order is not correct, check the statistics, cardinality & access methods

Partition Pruning
Q: What was the total sales for the weekend of May 20 - 22 2008?

Select sum(sales_amount)
From SALES
Where sales_date between to_date('05/20/2008','MM/DD/YYYY')
And to_date('05/23/2008','MM/DD/YYYY');

Sales Table partitions:
• May 18th 2008
• May 19th 2008
• May 20th 2008
• May 21st 2008
• May 22nd 2008
• May 23rd 2008
• May 24th 2008

Only the 3 relevant partitions are accessed
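Pruning amounts to an overlap test between the predicate's date range and each partition's bounds. The Python sketch below assumes one partition per day and takes the question's May 20 to May 22 range as inclusive bounds. The helper names are hypothetical, and this is a toy model rather than how the database implements pruning.

```python
from datetime import date, timedelta


def daily_partitions(first_day, count):
    """One partition per day, each covering [start, start + 1 day)."""
    return [
        (first_day + timedelta(days=i), first_day + timedelta(days=i + 1))
        for i in range(count)
    ]


def prune(partitions, low, high):
    """Keep only partitions that can hold a date in the inclusive range [low, high]."""
    return [(start, end) for start, end in partitions if start <= high and end > low]


if __name__ == "__main__":
    sales_partitions = daily_partitions(date(2008, 5, 18), 7)  # May 18 .. May 24
    touched = prune(sales_partitions, date(2008, 5, 20), date(2008, 5, 22))
    print([start.isoformat() for start, _end in touched])
```

Of the seven daily partitions, only the three starting on May 20, 21 and 22 survive the overlap test. The other four are never read.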

Partition Pruning
Pstart and Pstop list the partitions touched by the query
If you see the word 'KEY' listed it means the partitions touched will be decided at run time

Bloom Filter
1. Table scan: Time table is scanned and sent
2. Bloom Filter create: Consumer set creates a hash table and a BIT VECTOR. Bit vector sets a bit for each row that matches the search conditions
3. Bloom Filter send: BIT VECTOR is sent as an additional filter criteria to the scan of the sales table
4. Table scan: Sales table is scanned and rows are filtered based on query predicates
5. Bloom Filter apply: Join column is hashed and compared to BIT VECTOR
6. Reduced rows sent: Only rows that have a match in the bit vector get sent to the consumers
7. Hash Join: Consumers will complete the hash join by probing into the hash table from the time table to find actual matching rows
[Diagram labels: Scan Time, Filter Create, Scan Sales, Filter Use (Test), Send, Receive, Hash Join, DFO, Shared Bloom filter]
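The Bloom filter steps on this slide can be imitated in a single process. The sketch below is a toy model under stated assumptions: the bit-vector size, the use of SHA-256 for hashing, and the names `BloomFilter` and `bloom_hash_join` are all choices made for illustration and say nothing about Oracle's internal implementation.

```python
import hashlib


class BloomFilter:
    """Tiny bit vector with a few hash positions per key (illustrative only)."""

    def __init__(self, num_bits=64, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit vector

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def bloom_hash_join(time_rows, sales_rows):
    """time_rows: (time_id, label); sales_rows: (time_id, amount)."""
    # Create: build the hash table and the bit vector from the time rows.
    hash_table, bloom = {}, BloomFilter()
    for time_id, label in time_rows:
        hash_table[time_id] = label
        bloom.add(time_id)
    # Apply: only sales rows that pass the bit vector are "sent" onward.
    sent = [row for row in sales_rows if bloom.might_contain(row[0])]
    # Hash join: probe the hash table to keep the true matches only.
    joined = [(t, hash_table[t], amount) for t, amount in sent if t in hash_table]
    return sent, joined


if __name__ == "__main__":
    time_rows = [(1, "Sat"), (2, "Sun")]
    sales_rows = [(1, 10.0), (3, 5.0), (2, 7.5), (4, 1.0)]
    sent, joined = bloom_hash_join(time_rows, sales_rows)
    print(len(sent), "rows sent,", len(joined), "rows joined")
```

The filter never drops a matching row, but it may let a few non-matching rows through. That is why the final probe into the hash table is still needed.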

Parallelism
Goal is to execute all aspects of the plan in parallel
• Identify if one or more sets of parallel server processes are used
  • Producers and Consumers
• Identify if any part of the plan is running serially

Parallelism
The IN-OUT column shows which steps are run in parallel and whether a single parallel server set is used or not
If you see any lines beginning with the letter S you are running serially: check the DOP for each table & index used

Identifying Granules of Parallelism during scans in the plan
• Data is partitioned into granules, either
  • Block range
  • Partition
• Each parallel server is allocated one or more granules
• The granule method is specified on the line above the scan in the Operation section
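Block-range granules can be pictured as slices of a table's blocks handed out to parallel servers. The sketch below is a simplification under stated assumptions: the fixed granule size and the round-robin hand-out are illustrative choices, and the function names are hypothetical. It does not describe how the database sizes or schedules granules.

```python
def block_range_granules(num_blocks, granule_size):
    """Split blocks 0..num_blocks-1 into half-open ranges [start, end)."""
    return [
        (start, min(start + granule_size, num_blocks))
        for start in range(0, num_blocks, granule_size)
    ]


def assign_granules(granules, num_servers):
    """ASSUMPTION: a plain round-robin hand-out of granules to servers."""
    assignment = {server: [] for server in range(num_servers)}
    for index, granule in enumerate(granules):
        assignment[index % num_servers].append(granule)
    return assignment


if __name__ == "__main__":
    granules = block_range_granules(num_blocks=10, granule_size=4)
    print(granules)
    print(assign_granules(granules, num_servers=2))
```

With 10 blocks and a granule size of 4, the table splits into three granules, so one of the two servers ends up with two granules and the other with one.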

Access Paths and how they are parallelized

Access Paths                               Parallelization method
Full table scan                            Block Iterator
Table accessed by Rowid                    Partition
Index unique scan                          Partition
Index range scan (descending)              Partition
Index skip scan                            Partition
Full index scan                            Partition
Fast full index scan                       Block Iterator
Bitmap indexes (in Star Transformation)    Block Iterator

Parallel Distribution
• Necessary when producer & consumer sets are used
• Producers must pass or distribute their data to consumers
• Operator into which the rows flow decides the distribution
• Distribution can be local or across other nodes in RAC
• Five common types of redistribution

Parallel Distribution
• HASH
  • Assumes one of the tables is hash partitioned
  • Hash function applied to value of the join column
  • Distribute to the consumer working on the corresponding hash partition
• Broadcast
  • The size of one of the result sets is small
  • Sends a copy of the data to all consumers
• Range
  • Typically used for parallel sort operations
  • Individual parallel servers work on data ranges
  • QC doesn't have to sort, just present the parallel server results in the correct order
• Partitioning Key Distribution: PART (KEY)
  • Assumes that the target table is partitioned
  • Partitions of the target tables are mapped to the parallel servers
  • Producers will map each scanned row to a consumer based on the partitioning column
• Round Robin
  • Randomly but evenly distributes the data among the consumers
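Four of these distribution methods can be sketched as functions that route producer rows to consumer lists. Everything here is an illustrative assumption: rows are plain dictionaries, CRC32 stands in for the hash function, range boundaries are supplied by hand, and round robin is shown as a deterministic cycle rather than a random one.

```python
import bisect
import zlib


def hash_distribute(rows, key, num_consumers):
    """Rows with the same key value always reach the same consumer."""
    consumers = [[] for _ in range(num_consumers)]
    for row in rows:
        bucket = zlib.crc32(str(row[key]).encode()) % num_consumers
        consumers[bucket].append(row)
    return consumers


def broadcast(rows, num_consumers):
    """Every consumer receives a full copy of the (small) row set."""
    return [list(rows) for _ in range(num_consumers)]


def range_distribute(rows, key, upper_bounds):
    """Consumer i gets keys up to upper_bounds[i]; the last gets the rest."""
    consumers = [[] for _ in range(len(upper_bounds) + 1)]
    for row in rows:
        consumers[bisect.bisect_left(upper_bounds, row[key])].append(row)
    return consumers


def round_robin(rows, num_consumers):
    """Deal rows out in turn so consumer loads differ by at most one row."""
    consumers = [[] for _ in range(num_consumers)]
    for index, row in enumerate(rows):
        consumers[index % num_consumers].append(row)
    return consumers


if __name__ == "__main__":
    rows = [{"k": k} for k in (5, 10, 11, 20, 25)]
    print(range_distribute(rows, "k", [10, 20]))
    print([len(c) for c in round_robin(rows, 2)])
    print([len(c) for c in broadcast(rows, 3)])
```

Broadcast multiplies the data by the number of consumers, which is why the plan-reading checklist warns against broadcasting a large table.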

Parallel Distribution
Shows how the PQ servers distribute rows between each other

Example of reading a plan

Example SQL Statement and Block Diagram

SELECT '(' || pcode || ')' || pcode_desc AS PRODUCT, CNT
FROM (SELECT a.pcode, b.pcode_desc, count(a.pcode) CNT
      FROM BMG.t_acct_master_hd a,
           BMG.hogan_pcode_hd_ref b,
           BMG.t_tran_detail_hd c
      WHERE a.pcode = b.pcode
      AND a.acct_num=c.acct_num
      AND a.co_id=c.co_id
      AND c.asof_yyyymm=200102
      AND c.tran_amt <2000000000
      GROUP BY a.pcode, b.pcode_desc
      ORDER BY a.pcode, b.pcode_desc)

[Block diagram: tables HOGAN_PCODE_HD_REF, T_TRAN_DETAIL_HD and T_ACCT_MASTER_HD, with join columns PCODE, ACCT_NUM and CO_ID; size labels "Multiple Terabytes" and "1 Gigabyte in size"]

Example Cont'd
Execution plan
1. Check the rows returned is approx correct
2. Are the cardinality estimates correct?
3. Are the access methods correct?
Callout on the plan: means no stats gathered, a strong indicator this won't be the best possible plan

Example Cont'd
Execution plan
4. Has partition pruning happened?
5. Are the correct join methods used?
6. Is the join order correct? Is the table that eliminates the most rows accessed first?

Example Cont'd
Execution plan
7. Check all aspects of the plan are executing in parallel
8. Check the distribution method and make sure we are not broadcasting a large table

Example Cont'd
Execution plan: Solution
1. Partition pruning: one range partition, 4 hash partitions
2. Join types are still hash joins but now a PWJ
3. Only 1 row is actually returned and the cost is 4X lower now
4. The join order has changed: PWJ, then hash join to the look-up table
5. Cardinalities are correct and with each join the number of rows is reduced
6. Access methods remain the same
7. All aspects of the plan are executing in parallel
8. Row distribution is now all hash

Determining if you get the right plan

Query
SELECT quantity_sold
FROM sales s, customers c
WHERE s.cust_id = c.cust_id;

What do you expect the plan to look like for this statement?
[Plan output lost in extraction; surviving fragment: "... NOT NULL)"]

Explanation
• Join to customers is redundant as no columns are selected
• Presence of primary-foreign key relationship means we can remove the table

Q&A