
Presentation Outline

• SQL Writing Process
• SQL Standards
• Using Indexes
• The Optimizer
• FROM, WHERE Clauses
• EXPLAIN
• SQL Trace
• Sub-Selects and Joins
• Tips and Tricks

Caveat

Although many of these principles apply to all databases, Oracle will be used in the examples.

SQL Writing Process
Step 1: What information do I need? -> Columns

Step 2: Where is it? -> Tables

Step 3: Write SQL:

SELECT columns
FROM tables
WHERE ... (joins, filters, subqueries)

I'M FINISHED!

SQL Writing Process
• YOU'RE NOT FINISHED YET! You've got the results you want, but at what cost?

• There are many, many ways to get the right results, but only one is the fastest: 1000-to-1 improvements are attainable!

• Inefficient SQL can dramatically degrade the performance of the entire system.

• Developers and DBAs must work together to tune the database and the application.

Pre-Tuning Questions
• How long is too long?

• Is the statement running on near-production volumes?

• Is the optimal retrieval path being used?

• How often will it execute?

• When will it execute?

SQL Standards
Why are SQL standards important?

• Maintainability, readability

• Performance: if a statement is textually identical to a (recently) executed statement, the parsed version can be reused instead of being reparsed

SQL Standards
Question: which of these statements are the same?
A. SELECT LNAME FROM EMP WHERE EMPNO = 12;
B. SELECT lname FROM emp WHERE empno = 12;
C. SELECT lname FROM emp WHERE empno = :id;
D. SELECT lname FROM emp
WHERE empno = 12;

SQL Standards
• Answer: None

• Whitespace, case, and bind variables vs. constants all matter

• Using standards helps to ensure that equivalent SQL can be reused.

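The quiz above can be sketched in a few lines. This is only an illustration, assuming a statement cache keyed on the exact statement text; it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the sample rows are made up.

```python
import sqlite3

# The four statements from the quiz. A cache keyed on the exact statement
# text treats each one as a different statement.
statements = [
    "SELECT LNAME FROM EMP WHERE EMPNO = 12;",
    "SELECT lname FROM emp WHERE empno = 12;",
    "SELECT lname FROM emp WHERE empno = :id;",
    "SELECT lname FROM emp\nWHERE empno = 12;",
]
distinct_texts = len(set(statements))

# With a bind variable, one statement text serves every empno value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, lname TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(12, "smith"), (13, "jones")])  # hypothetical sample rows
query = "SELECT lname FROM emp WHERE empno = :id"
names = [conn.execute(query, {"id": n}).fetchone()[0] for n in (12, 13)]

print(distinct_texts, names)
```

Writing every statement to one standard and binding values instead of embedding constants keeps the text identical from call to call.
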
Tables Used in the Examples
DEPT      EMP         SALGRADE
------    --------    --------
deptno    empno       grade
dname     mgr         losal
loc       job         hisal
          deptno
          fname
          lname
          comm
          hiredate
          grade
          sal

SQL Standards: Example
SELECT E.empno,
       D.dname
FROM emp E,
     dept D
WHERE E.deptno = D.deptno
AND (D.deptno = :vardept
OR E.empno = :varemp);

• Keywords upper case and left-aligned
• Columns on new lines
• Use std. table aliases
• Separate w/ one space
• Use bind variables
• AND/OR on new lines
• No space before/after parentheses

Indexes: What are they?
• An index is a database object used to speed retrieval of rows in a table.

• The index contains only the indexed value (usually the key or keys) and a pointer to the row in the table.

• Multiple indexes may be created for a table.

• Not all indexes contain unique values.

• Indexes may have multiple columns (e.g., Oracle allows up to 32).

Indexes and SQL
• If a column appears in a WHERE clause, it is a candidate for being indexed.

• If a column is indexed, the database can use the index to find the rows instead of scanning the table.

• If the column is not referenced properly, however, the database may not be able to use the index and will have to scan the table anyway.

• Knowing which columns are and are not indexed can help you write more efficient SQL.

Example: Query without Index
No index exists for column EMPNO on table EMP, so a table scan must be performed:

SELECT *
FROM emp
WHERE empno = 8

Table: EMP
empno  fname    lname ...
4      lisa     baker
9      jackie   miller
1      john     larson
3      larry    jones
5      jim      clark
2      mary     smith
7      harold   simmons
8      mark     burns
6      gene     harris

Example: Query with Index
Column EMPNO is indexed, so it can be used to find the requested row:

SELECT *
FROM emp
WHERE empno = 8

Index: PK_EMP on EMP (EMPNO)
Root:     5
Branches: 1, 4 | 5, 9
Leaves:   1 2 3 4 5 6 7 8 9

Table: EMP
empno  fname    lname ...
4      lisa     baker
9      jackie   miller
1      john     larson
3      larry    jones
5      jim      clark
2      mary     smith
7      harold   simmons
8      mark     burns
6      gene     harris

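The difference between the two preceding slides can be observed with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle, so the plan wording differs from Oracle's EXPLAIN PLAN; the helper name plan is hypothetical.

```python
import sqlite3

def plan(conn, sql):
    """Return the plan detail strings SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, fname TEXT, lname TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(4, "lisa", "baker"), (9, "jackie", "miller"),
                  (8, "mark", "burns")])
query = "SELECT * FROM emp WHERE empno = 8"

before = plan(conn, query)   # no index yet: the table is scanned
conn.execute("CREATE UNIQUE INDEX pk_emp ON emp (empno)")
after = plan(conn, query)    # index exists: the row is found through it

print(before)
print(after)
```
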
Indexes: Caveats
• Sometimes a table scan cannot be avoided.

• Not every column should be indexed: there is performance overhead on Inserts, Updates, Deletes.

• Small tables may be faster with a table scan.

• Queries returning a large share of the rows in the table (more than roughly 5-20%) may be faster with a table scan.

Indexes: Column Order
Example: Index on (EMPNO, DEPTNO)

SELECT *
FROM emp
WHERE deptno = 10;    -- Will NOT use index

SELECT *
FROM emp
WHERE empno > 0       -- WILL use index
AND deptno = 10;

• Must use the leading column(s) of the index for the index to be used

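The leading-column rule can be checked with a sketch like the following. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Oracle; the index name idx_emp and the helper plan are hypothetical, and other optimizers may behave differently.

```python
import sqlite3

def plan(conn, sql):
    """Join SQLite's plan detail strings for a statement into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, lname TEXT)")
conn.execute("CREATE INDEX idx_emp ON emp (empno, deptno)")

# Only the second column of the index is referenced: table scan.
plan_no_lead = plan(conn, "SELECT * FROM emp WHERE deptno = 10")

# The leading column is referenced as well: the index can be used.
plan_lead = plan(conn,
                 "SELECT * FROM emp WHERE empno > 0 AND deptno = 10")

print(plan_no_lead)
print(plan_lead)
```
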
Indexes: Functions
Using a function, calculation, or other operation on an indexed column disables the use of the index.

Will NOT use index:

SELECT *
FROM emp
WHERE TRUNC(hiredate) = TRUNC(SYSDATE);
...
WHERE fname || lname = 'MARYSMITH';

WILL use index:

SELECT *
FROM emp
WHERE hiredate >= TRUNC(SYSDATE)
AND hiredate < TRUNC(SYSDATE)+1;
...
WHERE fname = 'MARY'
AND lname = 'SMITH';

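A sketch of the same idea, assuming SQLite (via Python's sqlite3 module) as a stand-in for Oracle: SQLite's date() plays the role of TRUNC, dates are stored as text, and the index name idx_hiredate and the sample rows are made up.

```python
import sqlite3

def plan(conn, sql):
    """Join SQLite's plan detail strings for a statement into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, lname TEXT, hiredate TEXT)")
conn.execute("CREATE INDEX idx_hiredate ON emp (hiredate)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    (1, "king",  "1999-12-31 23:59:59"),
    (2, "jones", "2000-01-01 09:30:00"),
    (3, "blake", "2000-01-02 00:00:00"),
])

# Function applied to the indexed column: the index cannot be used.
q_fn = "SELECT empno FROM emp WHERE date(hiredate) = '2000-01-01'"
# Same filter as a range on the bare column: the index can be used.
q_range = ("SELECT empno FROM emp "
           "WHERE hiredate >= '2000-01-01' AND hiredate < '2000-01-02'")

plan_fn, plan_range = plan(conn, q_fn), plan(conn, q_range)
rows_fn = conn.execute(q_fn).fetchall()
rows_range = conn.execute(q_range).fetchall()
print(plan_fn, plan_range, rows_fn, rows_range, sep="\n")
```

The half-open range (>= start, < next day) returns the same rows as the function form, which an inclusive upper bound would not guarantee for a row stamped exactly at midnight.
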
Indexes: NOT
Using NOT excludes indexed columns:

SELECT *
FROM dept
WHERE deptno != 0;    -- Will NOT use index
... NOT deptno = 0;
... deptno IS NOT NULL;

SELECT *
FROM dept
WHERE deptno > 0;     -- WILL use index

The Optimizer
• The WHERE/FROM rules on the following pages apply to the Rule-based optimizer (Oracle).

• If the Cost-based Optimizer is used, Oracle will attempt to reorder the statements as efficiently as possible (assuming statistics are available).

• DB2 and Sybase use only a Cost-based optimizer.

• The Optimizer's access paths can be overridden in Oracle and Sybase (not DB2).

The Optimizer: Hints
Return the first rows in the result set as fast as possible:

SELECT /*+ FIRST_ROWS */ empno
FROM emp E,
     dept D
WHERE E.deptno = D.deptno;

Force Optimizer to use index IDX_HIREDATE:

SELECT /*+ INDEX (E idx_hiredate) */ empno
FROM emp E
WHERE E.hiredate > TO_DATE('01-JAN-2000', 'DD-MON-YYYY');

FROM Clause: Driving Table
Specify the driving table last in the FROM Clause:

SELECT *
FROM dept D,    -- 10 rows
     emp E      -- 1,000 rows; driving table is EMP
WHERE E.deptno = D.deptno;

SELECT *
FROM emp E,     -- 1,000 rows
     dept D     -- 10 rows; driving table is DEPT
WHERE E.deptno = D.deptno;

FROM Clause: Intersection Table
When joining 3 or more tables, use the Intersection table (with the most shared columns) as the driving table:

SELECT *
FROM dept D,
     salgrade S,
     emp E      -- EMP shares columns with DEPT and SALGRADE, so use as the driving table
WHERE E.deptno = D.deptno
AND E.grade = S.grade;

WHERE: Discard Early
Place first the WHERE conditions that discard the maximum number of rows:

SELECT *
FROM emp E
WHERE E.empno IN (101, 102, 103)    -- 3 rows
AND E.deptno > 10;                  -- 90,000 rows

WHERE: AND Subquery First
When using an "AND" subquery, place it first:

SELECT *                        -- CPU = 156 sec
FROM emp E
WHERE E.sal > 50000
AND 25 > (SELECT COUNT(*)
          FROM emp M
          WHERE M.mgr = E.empno)

SELECT *                        -- CPU = 10 sec
FROM emp E
WHERE 25 > (SELECT COUNT(*)
            FROM emp M
            WHERE M.mgr = E.empno)
AND E.sal > 50000

WHERE: OR Subquery Last
When using an "OR" subquery, place it last:

SELECT *                        -- CPU = 100 sec
FROM emp E
WHERE 25 > (SELECT COUNT(*)
            FROM emp M
            WHERE M.mgr = E.empno)
OR E.sal > 50000

SELECT *                        -- CPU = 30 sec
FROM emp E
WHERE E.sal > 50000
OR 25 > (SELECT COUNT(*)
         FROM emp M
         WHERE M.mgr = E.empno)

WHERE: Filter First, Join Last
When Joining and Filtering, specify the Filter condition first, Joins last.

SELECT *
FROM emp E,
     dept D
WHERE (E.empno = 123            -- Filter criteria
OR D.deptno > 10)
AND E.deptno = D.deptno;        -- Join criteria

Subqueries: IN vs. EXISTS
Use EXISTS instead of IN in subqueries:

SELECT E.*                      -- IN: Both tables are scanned
FROM emp E
WHERE E.deptno IN (
    SELECT D.deptno
    FROM dept D
    WHERE D.dname = 'SALES');

SELECT *                        -- EXISTS: Only outer table is scanned; subquery uses index
FROM emp E
WHERE EXISTS (
    SELECT 'X'
    FROM dept D
    WHERE D.deptno = E.deptno
    AND D.dname = 'SALES');

Subquery vs. Join
Use Join instead of Subquery:

SELECT *                        -- IN: Both tables are scanned
FROM emp E
WHERE E.deptno IN (
    SELECT D.deptno
    FROM dept D
    WHERE D.dname = 'SALES');

SELECT E.*                      -- JOIN: Only one table is scanned, other uses index
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Join vs. EXISTS
Best performance depends on subquery/driving table:

SELECT *                        -- EXISTS: better than Join if the number of matching rows in DEPT is small
FROM emp E
WHERE EXISTS (
    SELECT 'X'
    FROM dept D
    WHERE D.deptno = E.deptno
    AND D.dname = 'SALES');

SELECT E.*                      -- JOIN: better than Exists if the number of matching rows in DEPT is large
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

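Before comparing the speed of the three forms, it helps to confirm that they return the same rows. The sketch below assumes SQLite (via Python's sqlite3 module) as a stand-in for Oracle, with made-up sample rows. The join form matches the other two only because deptno is unique in DEPT; with duplicate department rows the join would repeat employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE emp (empno INTEGER, lname TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?, ?)",
                 [(10, "ACCOUNTING"), (20, "SALES")])
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(1, "king", 10), (2, "allen", 20), (3, "ward", 20)])

q_in = """SELECT e.empno FROM emp e
          WHERE e.deptno IN (SELECT d.deptno FROM dept d
                             WHERE d.dname = 'SALES')
          ORDER BY e.empno"""
q_exists = """SELECT e.empno FROM emp e
              WHERE EXISTS (SELECT 'X' FROM dept d
                            WHERE d.deptno = e.deptno
                            AND d.dname = 'SALES')
              ORDER BY e.empno"""
q_join = """SELECT e.empno FROM emp e, dept d
            WHERE d.dname = 'SALES' AND d.deptno = e.deptno
            ORDER BY e.empno"""

# Run each form and collect the employee numbers it returns.
results = {name: [r[0] for r in conn.execute(sql)]
           for name, sql in (("in", q_in), ("exists", q_exists),
                             ("join", q_join))}
print(results)
```
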
Explain
Display the access path the database will use (e.g., use of indexes, sorts, joins, table scans)

• Oracle: EXPLAIN
• Sybase: SHOWPLAN
• DB2: EXPLAIN

Oracle Syntax:

EXPLAIN PLAN
SET STATEMENT_ID = 'statement id'
INTO PLAN_TABLE FOR
statement

Requires Select/Insert privileges on PLAN_TABLE

Explain
Example 1: "IN" subquery

SELECT *
FROM emp E
WHERE E.deptno IN (
    SELECT D.deptno
    FROM dept D
    WHERE D.dname = 'SALES');

Result:
MERGE JOIN
  SORT (JOIN)
    TABLE ACCESS (FULL) OF EMP
  SORT (JOIN)
    VIEW
      SORT (UNIQUE)
        TABLE ACCESS (FULL) OF DEPT

• 3 joins
• 1 dynamic view
• 2 table scans
• 3 sorts

Explain
Example 2: "EXISTS" subquery

SELECT *
FROM emp e
WHERE EXISTS (
    SELECT 'x'
    FROM dept d
    WHERE d.deptno = e.deptno
    AND d.dname = 'SALES');

Result:
FILTER
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

• 1 table scan
• 1 index scan
• 1 index access

Explain
Example 3: Join (no subquery)

SELECT E.*
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Result:
NESTED LOOPS
  TABLE ACCESS (FULL) OF EMP
  TABLE ACCESS (BY INDEX ROWID) OF DEPT
    INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

• 1 table scan
• 1 index scan
• 1 index access

SQL Trace
Use SQL Trace to determine the actual time and resource costs for a statement to execute.

Step 1: ALTER SESSION SET SQL_TRACE = TRUE;

Step 2: Execute SQL to be traced:

SELECT E.*
FROM emp E,
     dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Step 3: ALTER SESSION SET SQL_TRACE = FALSE;

SQL Trace
Step 4: Trace file is created in <USER_DUMP_DEST> directory on the server (specified by the DBA).

Step 5: Run TKPROF (UNIX) to create a formatted output file:

tkprof echd_ora_15319.trc $HOME/prof.out table=plan_table explain=dbuser/passwd

echd_ora_15319.trc        Trace file
$HOME/prof.out            Formatted output file
table=plan_table          Destination for Explain
explain=dbuser/passwd     User/passwd for Explain

SQL Trace
Step 6: View the output file:

...
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES' AND D.deptno = E.deptno;

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.00       0.00          4         19          3          6
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.00       0.00          4         19          3          6

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 62 (PMARKS)

Rows     Row Source Operation
-------  ---------------------------------------------------
      6  NESTED LOOPS
     14   TABLE ACCESS FULL EMP
     14   TABLE ACCESS BY INDEX ROWID DEPT
     14    INDEX UNIQUE SCAN (object id 4628)

• TIMED_STATISTICS must be turned on to get the timing values (cpu, elapsed)
• The Row Source Operation section is the EXPLAIN output

Tips and Tricks: UNION ALL
Use UNION ALL instead of UNION if there are no duplicate rows (or if you don't mind duplicates):

SELECT * FROM emp
UNION                           -- UNION: requires sort
SELECT * FROM emp_arch;

SELECT * FROM emp
UNION ALL                       -- UNION ALL: no sort
SELECT * FROM emp_arch;

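The two operators are interchangeable only when duplicates are absent or acceptable. A small sketch, assuming SQLite (via Python's sqlite3 module) as a stand-in for Oracle and made-up rows in which one employee appears in both tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, lname TEXT)")
conn.execute("CREATE TABLE emp_arch (empno INTEGER, lname TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(1, "smith"), (2, "jones")])
conn.executemany("INSERT INTO emp_arch VALUES (?, ?)",
                 [(2, "jones"), (3, "clark")])   # (2, 'jones') is in both

# UNION removes the duplicate row; UNION ALL keeps it.
union_rows = conn.execute(
    "SELECT * FROM emp UNION SELECT * FROM emp_arch").fetchall()
union_all_rows = conn.execute(
    "SELECT * FROM emp UNION ALL SELECT * FROM emp_arch").fetchall()

print(len(union_rows), len(union_all_rows))
```
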
Tips and Tricks: HAVING vs. WHERE
With GROUP BY, use WHERE instead of HAVING (if the filter criterion does not apply to a group function):

SELECT deptno,
       AVG(sal)
FROM emp
GROUP BY deptno
HAVING deptno IN (10, 20);      -- HAVING: rows are filtered after result set is returned

SELECT deptno,
       AVG(sal)
FROM emp
WHERE deptno IN (10, 20)        -- WHERE: rows are filtered first, possibly far fewer to process
GROUP BY deptno;

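Both forms give the same answer when the filter is on the grouping column, which is what makes the rewrite safe. A sketch under the assumption that SQLite (via Python's sqlite3 module) stands in for Oracle; the salary figures are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, sal REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    (1, 10, 1000.0), (2, 10, 3000.0), (3, 20, 2000.0), (4, 30, 5000.0),
])

# Filter applied to the groups after aggregation.
having_rows = conn.execute("""
    SELECT deptno, AVG(sal) FROM emp
    GROUP BY deptno
    HAVING deptno IN (10, 20)
    ORDER BY deptno""").fetchall()

# Filter applied to the rows before aggregation.
where_rows = conn.execute("""
    SELECT deptno, AVG(sal) FROM emp
    WHERE deptno IN (10, 20)
    GROUP BY deptno
    ORDER BY deptno""").fetchall()

print(having_rows, where_rows)
```
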
Tips and Tricks: EXISTS vs DISTINCT
Use EXISTS instead of DISTINCT to avoid implicit sort (if the column is indexed):

SELECT DISTINCT                 -- DISTINCT: implicit sort is performed to filter duplicate rows
       e.deptno,
       e.lname
FROM dept d,
     emp e
WHERE d.deptno = e.deptno;

SELECT e.deptno, e.lname        -- EXISTS: no sort
FROM emp e
WHERE EXISTS (
    SELECT 'X'
    FROM dept d
    WHERE d.deptno = e.deptno);

Tips and Tricks: Consolidate SQL
Select from Sequences and use SYSDATE in the statement in which they are used:

BEFORE: 3 statements are used to perform 1 Insert

SELECT SYSDATE INTO :vardate FROM dual;
SELECT arch_seq.NEXTVAL INTO :varid FROM dual;
INSERT INTO archive
VALUES (:vardate, :varid, ...)

AFTER: only 1 statement is needed

INSERT INTO emp_archive
VALUES (SYSDATE, emp_seq.NEXTVAL, ...)

Tips and Tricks: Consolidate SQL
Consolidate unrelated statements using outer-joins to the DUAL (dummy) table:

BEFORE: 2 round-trips

SELECT dname FROM dept WHERE deptno = 10;
SELECT lname FROM emp WHERE empno = 7369;

AFTER: only 1 round-trip

SELECT d.dname,
       e.lname
FROM dept d,
     emp e,
     dual x
WHERE d.deptno (+) = 10
AND e.empno (+) = 7369
AND NVL('X', x.dummy) = NVL('X', e.ROWID (+))
AND NVL('X', x.dummy) = NVL('X', d.ROWID (+));

Tips and Tricks: COUNT
Use COUNT(*) instead of COUNT(column):

SELECT COUNT(empno)
FROM emp;

SELECT COUNT(*)                 -- ~ 50% faster
FROM emp;

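One point to check before swapping the two: COUNT(column) skips NULLs, so the results match only when the column has no NULL values (as with a key such as empno). A sketch assuming SQLite (via Python's sqlite3 module) as a stand-in for Oracle, with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER NOT NULL, comm REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(1, None), (2, 300.0), (3, None)])

count_star = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
count_empno = conn.execute("SELECT COUNT(empno) FROM emp").fetchone()[0]
# comm is NULL for two rows, so only one row is counted here.
count_comm = conn.execute("SELECT COUNT(comm) FROM emp").fetchone()[0]

print(count_star, count_empno, count_comm)
```
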
Tips and Tricks: Self-Join
Use a self-join (joining a table to itself) instead of two queries on the same table:

BEFORE: 2 round-trips

SELECT mgr INTO :varmgr FROM emp WHERE deptno = 10;
LOOP...
SELECT mgr, lname FROM emp WHERE mgr = :varmgr;

AFTER: only 1

SELECT E.mgr,
       E.lname
FROM emp E,
     emp M
WHERE M.deptno = 10
AND E.empno = M.mgr;

Tips and Tricks: ROWNUM
Use the ROWNUM pseudo-column to return only the first N rows of a result set (for example, if you just want a sampling of data):

SELECT *                        -- Returns only the first 10 employees in the table, in no particular order
FROM emp
WHERE ROWNUM <= 10;

Tips and Tricks: ROWID
The ROWID pseudo-column uniquely identifies a row, and is the fastest way to access a row.

Instead of selecting the key column(s), ROWID is used to identify the row for later use:

CURSOR retired_emp_cur IS
    SELECT ROWID
    FROM emp
    WHERE retired = 'Y';
...
FOR retired_emp_rec IN retired_emp_cur LOOP
    SELECT fname || ' ' || lname
    INTO :printable_name
    FROM emp
    WHERE ROWID = retired_emp_rec.ROWID;
    ...

Tips and Tricks: Sequences
Use a Sequence to generate unique values for a table:

MAX(empno) requires a sort and an index scan, and the INSERT could fail with a Duplicate error if someone else gets there first:

SELECT MAX(empno)
INTO :new_empno
FROM emp;
...
INSERT INTO emp
VALUES (:new_empno + 1, ...);

Using a Sequence ensures that you always have a unique number, and does not require any table reads:

INSERT INTO emp
VALUES (emp_seq.NEXTVAL, ...);
or
SELECT emp_seq.NEXTVAL
INTO :new_empno FROM dual;

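The Duplicate error described above can be reproduced with a short sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Oracle. SQLite has no sequence objects, so its automatically assigned INTEGER PRIMARY KEY is used here as a rough analogue of emp_seq.NEXTVAL; the helper name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'king')")

def next_empno_from_max(conn):
    """The MAX(empno) + 1 approach: reads the table to pick a new key."""
    return conn.execute("SELECT MAX(empno) + 1 FROM emp").fetchone()[0]

# Two sessions both read MAX(empno) before either one inserts.
id_a = next_empno_from_max(conn)
id_b = next_empno_from_max(conn)

conn.execute("INSERT INTO emp VALUES (?, 'jones')", (id_a,))
try:
    conn.execute("INSERT INTO emp VALUES (?, 'blake')", (id_b,))
    duplicate_error = False
except sqlite3.IntegrityError:
    duplicate_error = True       # the second session loses the race

# Let the database assign the key instead (analogue of a sequence).
generated = conn.execute(
    "INSERT INTO emp (lname) VALUES ('blake')").lastrowid

print(id_a, id_b, duplicate_error, generated)
```
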
Tips and Tricks: Connect By
Use CONNECT BY to construct hierarchical queries:

SELECT LPAD(' ',4*(LEVEL-1)) || lname Name,
       Job
FROM emp
WHERE job != 'CLERK'
START WITH job = 'PRESIDENT'
CONNECT BY PRIOR empno = mgr;

Name            Job
King            PRESIDENT
    Jones       MANAGER
        Scott   ANALYST
        Ford    ANALYST
    Blake       MANAGER
        Allen   SALESMAN
        Ward    SALESMAN
        Martin  SALESMAN
        Turner  SALESMAN
    Clark       MANAGER

Tips and Tricks: Cartesian Products
Avoid Cartesian products by ensuring that the tables are joined on all shared keys:

SELECT *
FROM dept,        -- 10 rows
     salgrade,    -- 20 rows
     emp;         -- 1,000 rows
                  -- Result: 10 * 1000 * 20 = 200,000 rows

SELECT *
FROM dept D,      -- 10 rows
     salgrade S,  -- 20 rows
     emp E        -- 1,000 rows
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
                  -- Result: 1,000 rows

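The row counts on this slide can be reproduced with a sketch like the one below. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Oracle, with generated rows in which every employee has a valid deptno and grade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER, dname TEXT)")
conn.execute("CREATE TABLE salgrade (grade INTEGER)")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, grade INTEGER)")

# 10 departments, 20 salary grades, 1,000 employees (generated data).
conn.executemany("INSERT INTO dept VALUES (?, ?)",
                 [(10 * n, "D%d" % n) for n in range(1, 11)])
conn.executemany("INSERT INTO salgrade VALUES (?)",
                 [(g,) for g in range(1, 21)])
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(i, 10 * (i % 10 + 1), i % 20 + 1) for i in range(1000)])

# No join conditions: every combination of rows is returned.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM dept, salgrade, emp").fetchone()[0]

# Joined on the shared keys: one row per employee.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM dept D, salgrade S, emp E
    WHERE E.deptno = D.deptno
    AND E.grade = S.grade""").fetchone()[0]

print(cartesian_rows, joined_rows)
```
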
Tips and Tricks: TOAD
• Tool for Oracle Application Developers
• Oracle only! Requires Oracle SQL*Net client software
• Freeware tool for viewing/updating Oracle objects
• http://www.toadsoft.com or s:\tempfile\toad\toadfree.zip

Tips and Tricks: TOAD
• CTRL+E displays EXPLAIN PLAN
• SQL result set displayed in grid

Tips and Tricks: TOAD

• All tables/views for a selected schema
• Indexes, constraints, grants, etc. for the current table
• Table/view data in an editable grid

