Explain Plan:
How to Read and Analyze Execution plans

Maria Colgan
Senior Principal Product Manager
Agenda

• Introduction
– What is an execution plan
– How to generate a plan
• Hands-on lab instructions
– Step-by-step instructions to start VirtualBox
• Exercise 1
– Incorrect Access Path Chosen
• Exercise 2
– Incorrect Cardinality Estimate
• Exercise 3
– Incorrect Join Order

What is an Execution Plan?

• An execution plan shows the detailed steps necessary to
execute a SQL statement
• These steps are expressed as a set of database
operators that consume and produce rows
• The order of the operators and their implementation is
decided by the optimizer using a combination of query
transformations and physical optimization techniques
• The plan is commonly displayed in a tabular format,
but it is in fact tree-shaped

What is an Execution Plan?

Query:
SELECT prod_category, avg(amount_sold)
FROM sales s, products p
WHERE p.prod_id = s.prod_id
GROUP BY prod_category;

Tabular representation of plan

-----------------------------------------------------------
Id  Operation               Name
-----------------------------------------------------------
0   SELECT STATEMENT
1    HASH GROUP BY
2     HASH JOIN
3      TABLE ACCESS FULL    PRODUCTS
4      PARTITION RANGE ALL
5       TABLE ACCESS FULL   SALES
-----------------------------------------------------------

Tree-shaped representation of plan

GROUP BY
   |
HASH JOIN
   |-- TABLE ACCESS PRODUCTS
   |-- TABLE ACCESS SALES
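The tree shape can also be recovered from the plan table itself. The following is a minimal sketch, assuming a PLAN_TABLE is available in the session; the statement id 'demo' is a made-up label, not something from the lab scripts. It follows the ID and PARENT_ID links and indents each operator by its depth in the tree.

```sql
-- Explain the statement and tag its rows with an illustrative statement id
EXPLAIN PLAN SET STATEMENT_ID = 'demo' FOR
SELECT prod_category, avg(amount_sold)
FROM sales s, products p
WHERE p.prod_id = s.prod_id
GROUP BY prod_category;

-- Walk the plan from the root (id 0) down to the leaves;
-- LEVEL is the depth of each operator in the tree
SELECT id,
       parent_id,
       LPAD(' ', 2 * (LEVEL - 1)) || operation || ' ' || options AS plan_step,
       object_name
FROM plan_table
START WITH id = 0 AND statement_id = 'demo'
CONNECT BY PRIOR id = parent_id AND statement_id = 'demo'
ORDER SIBLINGS BY id;
```

Operators that share the same PARENT_ID are siblings in the tree, which is how the two table accesses end up as the children of the HASH JOIN.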

How to get an Execution Plan

Two methods for looking at the execution plan

1. EXPLAIN PLAN command
– Displays an execution plan for a SQL statement without actually
executing the statement
2. V$SQL_PLAN
– A dictionary view introduced in Oracle 9i that shows the execution
plan for a SQL statement that has been compiled into a cursor in
the cursor cache

Either way, use the DBMS_XPLAN package to display plans

Under certain conditions the plan shown with EXPLAIN PLAN
can be different from the plan shown using V$SQL_PLAN, for example
when bind variables are involved, because EXPLAIN PLAN does not peek
at bind values
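To look at the plan of a specific cursor in the cursor cache rather than the last statement run, DBMS_XPLAN.DISPLAY_CURSOR also accepts a SQL_ID. The sketch below assumes the session has SELECT access to V$SQL; the LIKE pattern and the &sql_id substitution variable are illustrative placeholders.

```sql
-- Find the cursor for a statement that has already been executed
SELECT sql_id, child_number, sql_text
FROM v$sql
WHERE sql_text LIKE 'SELECT prod_category, avg(amount_sold)%';

-- Display the plan stored for that cursor;
-- SQL*Plus prompts for the SQL_ID found by the query above
SELECT plan_table_output
FROM table(dbms_xplan.display_cursor('&sql_id', NULL, 'basic'));
```

Passing NULL as the child number displays the plans of all child cursors for that SQL_ID.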

How to get an Execution Plan: Example 1

EXPLAIN PLAN command & dbms_xplan.display function

SQL> EXPLAIN PLAN FOR SELECT prod_name, avg(amount_sold)
     FROM sales s, products p
     WHERE p.prod_id = s.prod_id
     GROUP BY prod_name;
SQL> SELECT plan_table_output
     FROM table(dbms_xplan.display('plan_table',null,'basic'));
------------------------------------------
Id  Operation               Name
------------------------------------------
0   SELECT STATEMENT
1    HASH GROUP BY
2     HASH JOIN
3      TABLE ACCESS FULL    PRODUCTS
4      PARTITION RANGE ALL
5       TABLE ACCESS FULL   SALES
------------------------------------------

How to get an Execution Plan: Example 2

Generate & display the execution plan for the last SQL statement
executed in a session

SQL> SELECT prod_category, avg(amount_sold)
     FROM sales s, products p
     WHERE p.prod_id = s.prod_id
     GROUP BY prod_category;

SQL> SELECT plan_table_output
     FROM table(dbms_xplan.display_cursor(null,null,'basic'));
------------------------------------------
Id  Operation               Name
------------------------------------------
0   SELECT STATEMENT
1    HASH GROUP BY
2     HASH JOIN
3      TABLE ACCESS FULL    PRODUCTS
4      PARTITION RANGE ALL
5       TABLE ACCESS FULL   SALES
------------------------------------------

Generating a more detailed plan

Add the gather_plan_statistics hint to gather run-time statistics to
compare against the Optimizer's estimates

SELECT /*+ gather_plan_statistics */ p.prod_name, SUM(s.quantity_sold)
FROM sales s, products p
WHERE s.prod_id = p.prod_id GROUP BY p.prod_name;

SELECT * FROM table(
DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST'));

Compare the estimated number of rows (E-Rows) returned for each operation in the
plan to the actual rows returned (A-Rows)

Instructions for running the lab

• All exercises run in a VirtualBox environment
– Oracle Enterprise Linux 5
– Oracle Database 11g Release 2 (11.2.0.2 pre-production)
• All OS passwords are oracle unless stated otherwise
– Main user oracle/oracle
– Environment is automatically set
• The exercises are run in SQL*Plus
– Oracle user is scott, password tiger

Environment – What you should see on VirtualBox

[Figure: VirtualBox desktop with callouts for the Hands-on directory, Terminal, emacs, and local Oracle documentation]

Starting the hands-on lab

[Figure]

Running the hands-on lab

• All exercises are run using SQL*Plus from a terminal
• Scripts are in /HOL/optimizer
• cd into the directory of the exercise you want to do
– exercise1, exercise2, or exercise3
• Intermediate users – intermediate.sql
– Self-running script that shows step by step how to approach the
problem and correct it
• Advanced users – advanced.sql
– Script executes the problem query & displays its execution plan
– You must solve the problem and produce the correct plan yourself

Exercise 1
Incorrect Access Path

• Scenario
– The Optimizer underestimates the cardinality of a simple query by 10X and
picks an index scan followed by a table access rather than a full
table scan
• Goal
– Determine why the Optimizer got the cardinality wrong and fix it
– No Optimizer hints are required
– You can’t delete or hide the index
• Getting Started (you have 15 minutes)
– Scripts are located in /hol/optimizer/exercise1
– To solve it on your own, run advanced.sql
– To follow the worked example, run intermediate.sql

Exercise 1
Incorrect Access Path

• The Query:
SELECT max(empno), count(*)
FROM bigemp
WHERE deptno = :deptno;
• Bind variable deptno has a value of 10

The Optimizer estimated that only 9,091 rows would be accessed,
but more than 10X as many (99,900) were accessed. Why did the Optimizer
get it so wrong?

Exercise 1 Solution
Do you have accurate statistics?

• Do we have statistics for the bigemp table?
Select table_name, column_name, num_distinct, histogram
From user_tab_col_statistics;

• Are the statistics stale?
Select table_name, stale_stats
From user_tab_statistics
Where table_name = 'BIGEMP';

There are statistics on the bigemp table and they are up to date

Exercise 1 Solution
Do you have a data skew?

• How many rows are in the bigemp table?
Select count(*) From bigemp;

• What does the data look like in the bigemp table?
Select deptno, count(*) from bigemp group by deptno;

There is a massive data skew, with 10 being the most
popular value in the table

Exercise 1 Solution
Data skew?

• Value 10 is more popular in the deptno column than any other
• The Optimizer assumes an even distribution of rows per distinct
value
• Default cardinality calculation is
$$\text{Cardinality} = \frac{\text{Total number of rows}}{\text{Number of distinct values}} = \frac{100{,}000}{11} \approx 9{,}090.9$$
• The Optimizer only deals in whole numbers, so the estimate is 9,091 rows
• The only way to tell the Optimizer about skew is a histogram
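The same arithmetic can be reproduced from the statistics the Optimizer uses. This is a minimal sketch, assuming the statistics for BIGEMP are in the current schema and that no histogram exists yet on DEPTNO; it divides the table's row count by the column's number of distinct values.

```sql
-- Default (no histogram) estimate for an equality predicate on BIGEMP.DEPTNO:
-- number of rows in the table divided by the number of distinct values
SELECT t.num_rows,
       c.num_distinct,
       ROUND(t.num_rows / c.num_distinct) AS estimated_rows
FROM user_tables t, user_tab_col_statistics c
WHERE c.table_name = t.table_name
AND t.table_name = 'BIGEMP'
AND c.column_name = 'DEPTNO';
```

With the figures on this slide (100,000 rows and 11 distinct values), ESTIMATED_ROWS comes out as 9,091, matching the estimate in the original plan.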

Exercise 1 Solution
What is a histogram?

• A histogram is a collection of information about the
distribution of values within a column
• A frequency histogram is chosen when the number of
distinct values in the column is 254 or fewer
• Oracle automatically creates histograms when
column usage information is available
• Statistics were gathered immediately after BIGEMP was
created
– No column usage information was available
– No histograms were created

Exercise 1 Solution
Create a histogram on deptno column

• Create a histogram on the deptno column of the BIGEMP table
Exec dbms_stats.gather_table_stats(null, 'BIGEMP', -
method_opt => 'FOR COLUMNS DEPTNO SIZE AUTO');

• Flush the Shared Pool
Alter system flush shared_pool;

• Corrected execution plan
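One way to confirm the fix is to check that the histogram now exists and then re-run the query with run-time statistics. This is a sketch rather than the lab's own script; it assumes SQL*Plus with SERVEROUTPUT off, so that DISPLAY_CURSOR picks up the query rather than another call.

```sql
-- Confirm that a histogram now exists on DEPTNO
SELECT column_name, num_distinct, num_buckets, histogram
FROM user_tab_col_statistics
WHERE table_name = 'BIGEMP'
AND column_name = 'DEPTNO';

-- Re-run the query with run-time statistics and compare E-Rows with A-Rows
VARIABLE deptno NUMBER
EXEC :deptno := 10
SELECT /*+ gather_plan_statistics */ max(empno), count(*)
FROM bigemp
WHERE deptno = :deptno;
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST'));
```

If the HISTOGRAM column still shows NONE, the estimate for deptno = 10 will remain the evenly distributed figure.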

Exercise 2
Incorrect cardinality estimate

• Scenario
– The Optimizer is underestimating the rows returned by a simple
count(*) from the emp2 table
• Goal
– Determine why the Optimizer got the cardinality wrong on emp2
and fix it
– No Optimizer hints are required
– No additional access structures are allowed
• Getting Started (you have 15 minutes)
– Scripts are located in /hol/optimizer/exercise2
– To solve it on your own, run advanced.sql
– To follow the worked example, run intermediate.sql

Exercise 2
Incorrect cardinality estimate

• The Query:
SELECT count(*)
FROM emp2
WHERE sal + NVL(comm, 0) > 1500;

• Original Plan

The Optimizer estimated that only 45 rows would be accessed, but
more than 10X as many (576) were accessed. Why did the Optimizer get it so
wrong?

Exercise 2 Solution
Do you have accurate statistics?

• Do we have statistics for the emp2 table?
Select table_name, num_distinct, histogram
From user_tab_col_statistics where table_name = 'EMP2';

• Are the statistics stale?
Select table_name, stale_stats
From user_tab_statistics
Where table_name = 'EMP2';

There are statistics on the EMP2 table and they are up to date, so
why is the Optimizer getting it wrong?

Exercise 2 Solution
Optimizer and function-wrapped columns

• Query:
SELECT count(*)
FROM emp2
WHERE sal + NVL(comm, 0) > 1500;
• The Optimizer doesn’t know how the function affects the values in the
column, so it guesses the cardinality to be 5% of the rows

SELECT count(*) FROM emp2;

COUNT(*)
896
• Cardinality estimate is 5% of 896 rows = 44.8, rounded to 45 rows
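The gap between the guess and reality can be seen in a single pass over the table. This is an illustrative sketch using only the emp2 columns named in the query; the column aliases are made up.

```sql
-- Compare the 5% guess with the number of rows that really satisfy the predicate
SELECT COUNT(*) AS total_rows,
       ROUND(COUNT(*) * 0.05) AS guessed_rows,
       COUNT(CASE WHEN sal + NVL(comm, 0) > 1500 THEN 1 END) AS actual_rows
FROM emp2;
```

With the 896 rows shown above, GUESSED_ROWS is 45, while the original plan reported 576 rows actually accessed.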

Exercise 2 Solution

• Create extended statistics on the expression ‘SAL + NVL(COMM, 0)’

• Corrected execution plan
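The slide does not show the command, so the following is a sketch of one way to create the expression statistics, not necessarily what the lab script does. It assumes EMP2 is in the current schema and uses DBMS_STATS.CREATE_EXTENDED_STATS followed by a fresh statistics gather.

```sql
-- Create an extension for the expression (note the enclosing parentheses);
-- the function returns the system-generated extension name
SELECT dbms_stats.create_extended_stats(NULL, 'EMP2', '(SAL + NVL(COMM, 0))') AS extension_name
FROM dual;

-- Re-gather statistics so the new extension gets statistics of its own
EXEC dbms_stats.gather_table_stats(NULL, 'EMP2', method_opt => 'FOR ALL COLUMNS SIZE AUTO');
```

The expression in the extension has to match the expression used in the WHERE clause for the Optimizer to use it.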

Exercise 3
Join Order

• Scenario
– One of your end users is complaining that a query is slower after you
upgraded. The problem seems to be the join order for the query
• Goal
– Determine why the Optimizer got the join order wrong and fix it
– No Optimizer hints are required
– No additional access structures are allowed
• Getting Started (you have 15 minutes)
– Scripts are located in /hol/optimizer/exercise3
– To solve it on your own, run advanced.sql
– To follow the worked example, run intermediate.sql

Exercise 3
Join Order

• The Query:
SELECT c.cust_id, c.cust_city, s.amount_sold
FROM sh.sales s,
scott.customers c,
sh.products p,
sh.promotions po
WHERE s.cust_id = c.cust_id
AND s.prod_id = p.prod_id
AND s.promo_id = po.promo_id
AND c.country_id = 'US'
AND c.cust_state_province = 'CA'
AND p.prod_name = 'O/S Documentation Set - English'
AND po.promo_name = 'internet promotion #29-350';
• Current join order: p -> c -> s -> po
• Desired join order: p -> po -> c -> s

Exercise 3

• Original execution plan

Exercise 3 Solution
Do you have accurate statistics?

• Are there statistics on all of the tables and are they
stale?
Select table_name, stale_stats
From user_tab_statistics
Where table_name in
('SALES','PRODUCTS','CUSTOMERS','PROMOTIONS')
Group by table_name, stale_stats;

All tables have stats and they are all current. So why is the
Optimizer getting it wrong?

Exercise 3 Solution
Take a closer look at the plan

• Examine the original plan and compare E-Rows to A-Rows

• The problem appears to be with the Customers table
• The estimate was 1 row when it was actually 29
• Let’s examine the aspects of the query related to the
customers table

Exercise 3 Solution
Correlated Columns

Query:
SELECT COUNT(*)
FROM customers
WHERE country_id = 'US'
AND cust_state_province = 'CA';

• The Optimizer assumes each WHERE clause predicate independently
eliminates rows from the result set, so it estimated 1 row
• Cardinality calculation is
$$\text{Cardinality} = \#\text{ROWS} \times \frac{1}{\text{NDV}_{c1}} \times \frac{1}{\text{NDV}_{c2}}$$
• But real data shows correlations between column values
• State influences country

Exercise 3 Solution
Correlated Columns

• Need some additional information to complete the formula
Select count(*) from customers;

Select column_name, num_distinct
From user_tab_col_statistics
Where table_name = 'CUSTOMERS'
And column_name in ('COUNTRY_ID','CUST_STATE_PROVINCE');

$$\text{Cardinality} = \#\text{ROWS} \times \frac{1}{\text{NDV}_{c1}} \times \frac{1}{\text{NDV}_{c2}} = 630 \times \frac{1}{19} \times \frac{1}{120} \approx 0.28$$

The Optimizer rounds up to 1
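The independence-based estimate can be computed straight from the dictionary. This is a minimal sketch, assuming the CUSTOMERS statistics are in the current schema; the column aliases are illustrative.

```sql
-- Estimate assuming the two predicates are independent:
-- rows * (1 / NDV of country_id) * (1 / NDV of cust_state_province)
SELECT t.num_rows,
       c1.num_distinct AS ndv_country,
       c2.num_distinct AS ndv_state,
       ROUND(t.num_rows / (c1.num_distinct * c2.num_distinct), 2) AS raw_estimate
FROM user_tables t, user_tab_col_statistics c1, user_tab_col_statistics c2
WHERE t.table_name = 'CUSTOMERS'
AND c1.table_name = t.table_name AND c1.column_name = 'COUNTRY_ID'
AND c2.table_name = t.table_name AND c2.column_name = 'CUST_STATE_PROVINCE';
```

With 630 rows, 19 countries and 120 states, RAW_ESTIMATE is about 0.28, which the Optimizer reports as 1 row.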

Exercise 3 Solution
Create extended stats on the column group

• How do we tell the Optimizer about the real-world correlation
between these two columns?
• Create extended statistics on the column group
• Extended statistics give the Optimizer explicit details on how the
data in these columns interacts
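The slide does not include the command, so this is a sketch of one way to create the column group, not necessarily the lab's own script. It assumes CUSTOMERS is in the current schema.

```sql
-- Create a column group on the two correlated columns;
-- the function returns the system-generated extension name
SELECT dbms_stats.create_extended_stats(NULL, 'CUSTOMERS', '(COUNTRY_ID, CUST_STATE_PROVINCE)') AS extension_name
FROM dual;

-- Re-gather statistics so the column group gets statistics of its own
EXEC dbms_stats.gather_table_stats(NULL, 'CUSTOMERS', method_opt => 'FOR ALL COLUMNS SIZE AUTO');
```

Once gathered, the number of distinct value combinations in the column group replaces the product of the two individual NDVs in the cardinality calculation.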

Exercise 3 Solution
Create extended stats on the column group

• User_stat_extensions shows what extended statistics
exist for a table

SELECT extension_name, extension
FROM user_stat_extensions
WHERE table_name = 'CUSTOMERS';

• Now that we have extended statistics, let’s rerun the query
and see if the execution plan has changed

Exercise 3 Solution

• Corrected Execution plan
