SQL Pivot and Prune Queries - Keeping an Eye on Performance

Published by Brendan Furey
A very common requirement in SQL is to join two record sets that have a one-to-many relationship while keeping the cardinality of the result set the same as that of the set on the 'one' side. The obvious case is standard grouping and aggregation, such as counting the records in the 'many' set for each record in the 'one' set. There are also some less obvious cases, for which various SQL techniques are available with differing performance and complexity characteristics. This article looks at two such cases. In the first, one wishes to join multiple subtypes of a given entity; this is generally referred to as ‘pivoting’ from rows to columns. In the second, one wishes to join just one record from the 'many' set, but has no pure join condition to identify that record and so must use an ordering condition instead; I will call this ‘pruning’.
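
As a minimal sketch of the two query shapes described above, the following uses Python's built-in sqlite3 module as a stand-in for Oracle 11g, so the syntax is SQLite's rather than Oracle's (for example, `LIMIT` is not available in Oracle 11g). The `person` and `contact` tables and their data are invented for illustration and are not the article's data model. The pivot is done here with conditional aggregation and the prune with a correlated subquery; these are only two of several possible techniques, and not necessarily the ones the article compares.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical 'one' side.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical 'many' side: one row per contact subtype.
    CREATE TABLE contact (
        id           INTEGER PRIMARY KEY,
        person_id    INTEGER REFERENCES person (id),
        contact_type TEXT,   -- subtype: 'EMAIL' or 'PHONE'
        value        TEXT,
        created      TEXT    -- ISO date, used as the ordering condition
    );
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO contact VALUES
        (1, 1, 'EMAIL', 'ann@example.com', '2011-01-01'),
        (2, 1, 'PHONE', '555-0100',        '2011-02-01'),
        (3, 2, 'EMAIL', 'bob@example.com', '2011-03-01');
""")

# Pivot: one output row per person, one column per contact subtype.
pivot_rows = conn.execute("""
    SELECT p.name,
           MAX(CASE WHEN c.contact_type = 'EMAIL' THEN c.value END) AS email,
           MAX(CASE WHEN c.contact_type = 'PHONE' THEN c.value END) AS phone
      FROM person p
      LEFT JOIN contact c ON c.person_id = p.id
     GROUP BY p.id, p.name
     ORDER BY p.id
""").fetchall()

# Prune: one output row per person, keeping only the most recent contact.
# The record is identified by an ordering condition, not a join condition.
prune_rows = conn.execute("""
    SELECT p.name, c.contact_type, c.value
      FROM person p
      JOIN contact c ON c.person_id = p.id
     WHERE c.id = (SELECT c2.id
                     FROM contact c2
                    WHERE c2.person_id = p.id
                    ORDER BY c2.created DESC, c2.id DESC
                    LIMIT 1)
     ORDER BY p.id
""").fetchall()

print(pivot_rows)
print(prune_rows)
```

In both queries the result has exactly one row per `person` record, which is the cardinality property described above.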

This work attempts to find the best SQL techniques for such queries in Oracle 11g, mainly in terms of performance. It does this by running a variety of queries within the context of an outbound interface against a deliberately simple data model across a two-dimensional range of data sizes. A simple generic PL/SQL package has been written to perform the testing efficiently, and it uses a previously described (REF-4) object type for timing. Visio diagrams are provided for query structures, based on a similar approach previously described (REF-3), and Microsoft Excel graphs are used to display comparative performances. The results reveal some interesting features of the behaviour of the Cost Based Optimiser in Oracle 11g.
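
The idea of timing each query across a two-dimensional range of data sizes can be sketched as follows. This is a rough Python illustration, not the article's PL/SQL package or its timing object type: the function names, the `DUMMY` query code and the stand-in workload are all hypothetical, and only the names `point_wide`, `point_deep`, `cpu_time`, `elapsed_time` and `run_type` are borrowed from the logging tables.

```python
import time

def time_query(run_query, point_wide, point_deep):
    """Run one query at one data point; return its CPU and elapsed times."""
    cpu_start = time.process_time()
    elapsed_start = time.perf_counter()
    run_query(point_wide, point_deep)
    return {
        "point_wide": point_wide,
        "point_deep": point_deep,
        "cpu_time": time.process_time() - cpu_start,
        "elapsed_time": time.perf_counter() - elapsed_start,
    }

def run_grid(queries, wide_points, deep_points):
    """Time every query at every (wide, deep) combination."""
    results = []
    for run_type, run_query in queries.items():
        for point_wide in wide_points:
            for point_deep in deep_points:
                stats = time_query(run_query, point_wide, point_deep)
                stats["run_type"] = run_type
                results.append(stats)
    return results

def dummy_query(point_wide, point_deep):
    # Stand-in workload whose cost grows with both dimensions.
    return sum(i * j for i in range(point_wide) for j in range(point_deep))

results = run_grid({"DUMMY": dummy_query}, [10, 20], [5, 50])
for row in results:
    print(row["run_type"], row["point_wide"], row["point_deep"],
          row["cpu_time"], row["elapsed_time"])
```

Keeping the grid loop generic means a new query variant only needs to be added to the `queries` mapping to be timed at every data point.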

I have applied the same domain-based approach to performance analysis in a subsequent article, ‘Forming Range-Based Break Groups with Advanced SQL’.


Published by: Brendan Furey on May 02, 2011
Copyright: Attribution Non-commercial


09/26/2012


RUN_CONTROL

Column Name      Type        Notes
---------------  ----------  ------------------------------
id*              Number      Sequence generated primary key
description      Char(500)   Description of run
status           Char(1)     S = success, F = failure
message          Char(4000)  Error message if any
point_wide_max   Number      Maximum width point
point_deep_max   Number      Maximum depth point
cpu_time         Number      Total CPU time
elapsed_time     Number      Total elapsed time
creation_date    Date        Creation date

RUN_STATISTICS

Column Name      Type        Notes
---------------  ----------  --------------------------------
run_control_id*  Number      Foreign key to RUN_CONTROL table
run_type*        Char(60)    Query code
point_wide*      Number      Width point
point_deep*      Number      Depth point
cpu_time         Number      CPU time
elapsed_time     Number      Elapsed time
creation_date    Date        Creation date
status           Char(1)     S = success, F = failure
message          Char(4000)  Error message if any


OUTPUT_LOG

Column Name      Type        Notes
---------------  ----------  ---------------
line_ind         Integer     Line number
line_text        Char(4000)  Line text
id               Char(30)    Identifier code
creation_date    Date        Creation date
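
To make the three table definitions above concrete, here is a hedged sketch that creates them in an in-memory SQLite database from Python and records one run. It is not the original Oracle DDL. The column names and types follow the tables, but treating the starred columns as key columns is an assumption, as are the sample run types `PIVOT_A` and `PRUNE_A` and all the timing figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE run_control (
        id             INTEGER PRIMARY KEY,  -- sequence-generated in the original
        description    CHAR(500),
        status         CHAR(1),              -- S = success, F = failure
        message        CHAR(4000),
        point_wide_max NUMBER,
        point_deep_max NUMBER,
        cpu_time       NUMBER,
        elapsed_time   NUMBER,
        creation_date  DATE
    );
    CREATE TABLE run_statistics (
        run_control_id NUMBER REFERENCES run_control (id),
        run_type       CHAR(60),             -- query code
        point_wide     NUMBER,
        point_deep     NUMBER,
        cpu_time       NUMBER,
        elapsed_time   NUMBER,
        creation_date  DATE,
        status         CHAR(1),
        message        CHAR(4000),
        -- Assumption: the starred columns form the primary key.
        PRIMARY KEY (run_control_id, run_type, point_wide, point_deep)
    );
    CREATE TABLE output_log (
        line_ind       INTEGER,
        line_text      CHAR(4000),
        id             CHAR(30),
        creation_date  DATE
    );
""")

# Record one run, then one statistics row per (query, width, depth) point.
cur = conn.execute(
    "INSERT INTO run_control (description, status, point_wide_max,"
    " point_deep_max, creation_date) VALUES (?, 'S', ?, ?, DATE('now'))",
    ("demo run", 2, 1),
)
run_id = cur.lastrowid

conn.executemany(
    "INSERT INTO run_statistics (run_control_id, run_type, point_wide,"
    " point_deep, cpu_time, elapsed_time, creation_date, status)"
    " VALUES (?, ?, ?, ?, ?, ?, DATE('now'), 'S')",
    [
        (run_id, "PIVOT_A", 1, 1, 0.125, 0.25),  # made-up timings
        (run_id, "PIVOT_A", 2, 1, 0.25, 0.5),
        (run_id, "PRUNE_A", 1, 1, 1.0, 1.5),
    ],
)

# Summarise elapsed time per query code for this run.
summary = conn.execute(
    "SELECT run_type, COUNT(*), SUM(elapsed_time) FROM run_statistics"
    " WHERE run_control_id = ? GROUP BY run_type ORDER BY run_type",
    (run_id,),
).fetchall()
print(summary)
```

With a composite key of this form, each query code can be stored at most once per data point within a run, which suits side-by-side comparison of techniques.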
