No I/O Cost To Sort 51K Rows

Access Plan:
-----------
Total Cost: 51489.3
Query Degree: 1

     Rows
    RETURN
    (   1)
     Cost
      I/O
       |
    51013.8
    TBSCAN
    (   2)
    51489.3
     28588
       |
    51013.8
     SORT
    (   3)
    51489.3
     28588
       |
    51013.8
    TBSCAN
    (   4)
    51449.8
     28588
       |
    1e+006
 TABLE: ADMIN
     ACCT

Large I/O Cost To Sort 700K Rows

Access Plan:
-----------
Total Cost: 345735
Query Degree: 1

     Rows
    RETURN
    (   1)
     Cost
      I/O
       |
    700982
    TBSCAN
    (   2)
    345735
     68646
       |
    700982
     SORT
    (   3)
    309824
     48617
       |
    700982
    TBSCAN
    (   4)
    51556.8
     28588
       |
    1e+006
 TABLE: ADMIN
     ACCT

Also Monitor:
• Buffer pool temporary data logical reads
• Buffer pool temporary data physical reads
3) SORT : (Sort)
Cumulative Total Cost: 309824
Cumulative CPU Cost: 5.46755e+009
Cumulative I/O Cost: 48617
Cumulative Re-Total Cost: 0
Cumulative Re-CPU Cost: 0
Cumulative Re-I/O Cost: 20029
Cumulative First Row Cost: 309824
Estimated Bufferpool Buffers: 48617
Arguments:
---------
DUPLWARN: (Duplicates Warning flag)
FALSE
NUMROWS : (Estimated number of rows)
700982
ROWWIDTH: (Estimated width of rows)
108
SORTKEY : (Sort Key column)
1: Q1.BALANCE(D)
SPILLED : (Pages spilled to bufferpool or disk)
20029
TEMPSIZE: (Temporary Table Page Size)
4096
UNIQUE : (Uniqueness required flag)
FALSE
Callout: the SPILLED value is the estimated sort overflow.
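As a rough cross-check of the SORT arguments above, the raw volume of data to be sorted can be estimated from NUMROWS, ROWWIDTH, and TEMPSIZE. This is only a back-of-envelope sketch: it assumes ROWWIDTH is in bytes and ignores all per-row and per-page overhead, and it is not the formula the optimizer uses to derive SPILLED. The function name is hypothetical.

```python
import math

# Values copied from the SORT operator details above.
NUMROWS = 700982   # estimated number of rows
ROWWIDTH = 108     # estimated row width (assumed to be bytes)
TEMPSIZE = 4096    # temporary table page size in bytes
SPILLED = 20029    # pages the optimizer expects to spill


def raw_sort_pages(num_rows: int, row_width: int, page_size: int) -> int:
    """Pages needed to hold the raw sort data, ignoring all overhead."""
    return math.ceil(num_rows * row_width / page_size)


pages = raw_sort_pages(NUMROWS, ROWWIDTH, TEMPSIZE)
print(f"raw sort data: {pages} pages; optimizer SPILLED estimate: {SPILLED} pages")
```

The raw estimate is somewhat below the SPILLED value reported in the plan, which suggests that most of the sort is expected to overflow rather than fit in sort memory.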
SELECT *
FROM SIBLINGS S,
OCCUPATIONS O
WHERE
S.JOB_ID = O.JOB_ID
• JOIN SATISFIED THROUGH TABLES ORDERED ON JOIN COLUMN(S), INDEXES COULD REDUCE SORT COSTS.
• What is Star Schema?
– Simplest form of a dimensional model
• How is the data organized?
– Facts
– Dimensions
[Figure: star schema diagram. Labels: Fact table; Dimension table; Date, Id, Day, Month, Quarter, Year; Snowflakes; Account; Sales; Store]
[Figure: join plan for a star schema query. Labels: Nested Loop Join; Cartesian Join (twice); Multi-Column Fact Table Index *; Product Dimension; Period Dimension; Store Dimension]
* Additional Prerequisite
© Copyright IBM Corporation 2013
Star Join and dynamic bitmaps
[Figure: three Nested Loop Joins, each joining the Fact Table to one dimension through a fact table index. Labels: Store Dimension, Index * store_id; Product Dimension, Index * product_id; Period Dimension, Index * period_id; steps numbered 1. and 2.]
* Additional Prerequisite
Characteristics of a Zigzag join
• Joins a fact table and two or more dimension tables in a star schema,
using an index scan of the fact table
• It requires equality predicates between each dimension table and the
fact table.
• The join method calculates the Cartesian product of rows from the
dimension tables without actually materializing the Cartesian product
• Probes the fact table using a multicolumn index, so that the fact table is
filtered along two or more dimension tables simultaneously
• The probe into the fact table finds matching rows
• The zigzag join then returns the next combination of values that is
available from the fact table index
• This next combination of values, known as feedback, is used to skip
over probe values provided by the Cartesian product of dimension
tables that will not find a match in the fact table
• Filtering the fact table on two or more dimension tables simultaneously,
and skipping probes that are known to be unproductive, together make
the zigzag join an efficient method for querying large fact tables.
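The probe-and-feedback behaviour described in the list above can be sketched in a few lines of Python. This is a conceptual illustration only, not DB2's implementation: the fact table's multicolumn index is modelled as a sorted list of (f1, f2) tuples, the dimension keys as plain lists, and all function names and sample values are hypothetical.

```python
from bisect import bisect_left


def next_pair_at_or_after(d1, d2, target):
    """Smallest (d1, d2) key pair >= target, without building the Cartesian product."""
    a, b = target
    i = bisect_left(d1, a)
    if i == len(d1):
        return None
    if d1[i] > a:
        return (d1[i], d2[0])
    j = bisect_left(d2, b)
    if j < len(d2):
        return (a, d2[j])
    if i + 1 < len(d1):
        return (d1[i + 1], d2[0])
    return None


def zigzag_join(dim1_keys, dim2_keys, fact_index):
    """Return (matching fact keys, number of probes) for d1=f1 and d2=f2."""
    d1, d2 = sorted(set(dim1_keys)), sorted(set(dim2_keys))
    matches, probes = [], 0
    pair = (d1[0], d2[0]) if d1 and d2 else None
    while pair is not None:
        probes += 1
        pos = bisect_left(fact_index, pair)       # probe the fact table index
        while pos < len(fact_index) and fact_index[pos] == pair:
            matches.append(pair)                  # matching fact rows
            pos += 1
        if pos == len(fact_index):
            break                                 # no further fact keys exist
        feedback = fact_index[pos]                # next key available in the index
        # Skip every dimension pair below the feedback: none of them can match.
        pair = next_pair_at_or_after(d1, d2, feedback)
    return matches, probes


# Made-up sample data: 3 x 5 = 15 key pairs in the conceptual Cartesian product.
fact_index = [(1, 3), (1, 3), (2, 5), (3, 4)]
matches, probes = zigzag_join([1, 2, 3], [1, 2, 3, 4, 5], fact_index)
print(matches, probes)
```

With this sample data the join needs far fewer probes than the 15 pairs of the full Cartesian product, because each probe's feedback lets it jump directly to the next pair that can still match.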
The Zigzag Join Method for Star Schema Based Queries
• How does it work?
– First forms the conceptual Cartesian product of dimensions but avoids
most non-productive probes from the Cartesian product into the fact
table
– Fact table index provides feedback to dimensions
– Zigzags through the dimensions and the fact table
• Pre-requisite: A multi-column index on the fact table on columns that join with
the dimensions
[Figure: zigzag join example with the join predicate d1=f1 and d2=f2. Labels: Dimension keys (d1, d2); Cartesian product of dimension keys; Fact table multi-column index (f1, f2); probe; match; Unproductive probes are skipped]
EXP0256I Analysis of the query shows that the query might execute faster
if an additional index was created to enable zigzag join.
Schema name: table-schema. Table name: table-name.
Column list: column-list.
3. Use SET INTEGRITY to refresh the MQT and resolve Set Integrity Pending
for the staging table.
SET INTEGRITY FOR mqt2,stage2 IMMEDIATE CHECKED
5. Use REFRESH TABLE to incrementally update the MQT using the staging
table. Run RUNSTATS first to get current statistics for the staging table.
RUNSTATS ON TABLE inst411.stage2
REFRESH TABLE inst411.mqt2
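The idea behind an incremental refresh through a staging table can be sketched in Python. This is a conceptual model only, not DB2 syntax or internals. The MQT definition is not shown here, so the sketch assumes a hypothetical summary that keeps a row count and a balance total per BRANCH_ID; the function names are also hypothetical.

```python
from collections import defaultdict


def insert_history(history, staging, branch_id, balance):
    """Insert into the base table and record the same change in the staging table."""
    history.append((branch_id, balance))
    staging.append((branch_id, balance))


def refresh_mqt(mqt, staging):
    """Apply the staged changes to the summary, then empty the staging table."""
    deltas = defaultdict(lambda: [0, 0])
    for branch_id, balance in staging:            # group the staged rows
        deltas[branch_id][0] += 1
        deltas[branch_id][1] += balance
    for branch_id, (row_count, balance_sum) in deltas.items():
        summary = mqt.setdefault(branch_id, [0, 0])
        summary[0] += row_count                   # update the summary in place
        summary[1] += balance_sum
    staging.clear()                               # staged rows are no longer needed


history, staging, mqt = [], [], {}
insert_history(history, staging, 1, 100)
insert_history(history, staging, 1, 50)
insert_history(history, staging, 2, 70)
refresh_mqt(mqt, staging)
print(mqt, staging)
```

The refresh reads only the staged changes rather than rescanning the whole base table, which is the point of defining a staging table for a deferred-refresh MQT.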
Example of the access plan for an INSERT into a
parent table of a MQT with a staging table defined
Original Statement:
------------------
INSERT INTO HISTORY(ACCT_ID, TELLER_ID,
BRANCH_ID, BALANCE, DELTA, PID,
ACCTNAME, TEMP)
VALUES(:H00001
, :H00004 , :H00003 , :H00005 ,
:H00006, :H00008 , :H00007 , 'TP1ST ')

Insert into HISTORY
Also Requires:
• Insert into Staging table
• Access to Teller Index

                                    0.333333
                                     TBSCAN
                                     (   2)
                                     30.2853
                                        4
             +--------------------------+-----------+-------------------+
             1                                      1                   1
          TBSCAN                                 INSERT          TABFNC: SYSIBM
          (   3)                                 (   7)              GENROW
          7.57432                                22.7103
             1                                      3
             |                             /--------+--------\
             1                             1                38
           TEMP                          GRPBY        TABLE: INST411
          (   4)                        (   8)            STAGE2
          7.5663                        15.1473
             1                             2
             |                             |
             1                             1
          INSERT                        NLJOIN
          (   5)                        (   9)
          7.56331                       15.1472
             1                             2
      /------+------\               /------+------\
      1          513576             1             1
   TBSCAN    TABLE: INST411      TBSCAN        IXSCAN
   (   6)        HISTORY         (  10)        (  11)
 2.83407e-05                     7.57432       7.57285
      0                             1             1
      |                             |             |
      1                             1           1000
TABFNC: SYSIBM                    TEMP     INDEX: INST411
   GENROW                        (   4)       TELLINDX
                                 7.5663
                                    1

Complete access plan in notes
Sample Explain report for a
REFRESH TABLE statement using a staging table
REFRESH TABLE and SET INTEGRITY statements can be explained to better
understand the processing required.
• Staging table is accessed to update the MQT
• Data in the Staging table is deleted automatically

Original Statement:
------------------
refresh table mqt2

                                   DELETE
                                   (   8)
                                   28510.9
                                    3783
                              /-------+-------\
                             848            1000
                           UPDATE      TABLE: INST411
                           (   9)           MQT2
                            22098
                            2935
                      /-------+-------\
                     848            1000
                   NLJOIN      TABLE: INST411
                   (  10)           MQT2
                    15685
                    2087
              /-------+-------\
             848           1.01626
            GRPBY          TBSCAN
           (  11)          (  16)
           15431.4         106.582
            2073             14
              |               |
             848            1000
           TBSCAN      TABLE: INST411
           (  12)           MQT2
           15431.3
            2073
              |
             848
            SORT
           (  13)
           15431.3
            2073
              |
            2035
           DELETE
           (  14)
           15430.1
            2073
      /-------+-------\
    2035            2035
   TBSCAN      TABLE: INST411
   (  15)          STAGE2
   40.6609
     38
      |
    2035
TABLE: INST411
   STAGE2

Complete access plan in notes
Diagnostic Identifier: 1
Diagnostic Details: EXP0020W Table has no statistics. The
table "CMCCAIN "."MQT1" has not had runstats run on it. This
may result in a sub-optimal access plan and poor performance.
Diagnostic Identifier: 3
Diagnostic Details: EXP0148W The following MQT or
statistical view was considered in query matching:
"CMCCAIN "."MQT1".
Diagnostic Identifier: 4
Diagnostic Details: EXP0149W The following MQT was used
(from those considered) in query matching: "CMCCAIN "."MQT1".
--
--
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- index[1], 0.036MB
CREATE INDEX "INST411 "."IDX909251812470000" ON "INST411 "."MQT909251812330000"
("C0" ASC, "C1" DESC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED
STATISTICS;
COMMIT WORK ;
Table partitioning: What is it and why use it?
• Allows a single logical table to be broken up into multiple separate
physical storage objects:
– Each corresponds to a partition of the table
– Partition boundaries correspond to specified value ranges in a specified
partition key
• Main Benefits:
– Allows for partition elimination during SQL processing
– Allows for optimized roll-in / roll-out processing (for example, minimized
logging)
– Allows for divide and conquer management of huge tables
– Allows for improved HSM integration
[Figure: Without Partitioning, a single SALESDATA table. With Partitioning, applications see a single table that is stored as SALESDATA JanPart, SALESDATA FebPart, and SALESDATA MarPart]
[Figure: Range Partitions. Labels: TBSPD4, TBSPD5; sales.q3, sales.q4; Index 1 (Sales Date); Index 2 (Product ID)]
– In this example:
• Sales data with dates prior to 1/1/2007 will be placed in tbspd1
• 1st quarter sales in tbspd2
• 2nd quarter sales in tbspd3
• 3rd quarter sales in tbspd4
• 4th quarter 2007 sales will be in tbspd5
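The range layout in this example, and the way partition elimination narrows a date-range query to a few partitions, can be sketched in Python. This is a conceptual illustration rather than DB2 syntax. The table space names come from the example; the quarter boundaries within 2007, the exclusive upper bound of 1/1/2008 for the last range, and the function names are assumptions made for the sketch.

```python
from bisect import bisect_right
from datetime import date

# Exclusive upper bound of each range, in partition order (assumed boundaries).
UPPER_BOUNDS = [
    date(2007, 1, 1),   # dates prior to 1/1/2007 -> tbspd1
    date(2007, 4, 1),   # 1st quarter 2007        -> tbspd2
    date(2007, 7, 1),   # 2nd quarter 2007        -> tbspd3
    date(2007, 10, 1),  # 3rd quarter 2007        -> tbspd4
    date(2008, 1, 1),   # 4th quarter 2007        -> tbspd5
]
TABLESPACES = ["tbspd1", "tbspd2", "tbspd3", "tbspd4", "tbspd5"]


def partition_for(sale_date: date) -> str:
    """Table space that would hold a row with this sales date."""
    idx = bisect_right(UPPER_BOUNDS, sale_date)
    if idx >= len(TABLESPACES):
        raise ValueError(f"{sale_date} is outside every defined range")
    return TABLESPACES[idx]


def partitions_to_scan(start: date, end: date) -> list:
    """Partitions that can hold rows with start <= sales date <= end."""
    first = bisect_right(UPPER_BOUNDS, start)
    last = min(bisect_right(UPPER_BOUNDS, end), len(TABLESPACES) - 1)
    return TABLESPACES[first:last + 1]


print(partition_for(date(2006, 12, 31)))
print(partitions_to_scan(date(2007, 2, 1), date(2007, 5, 15)))
```

A query restricted to February through mid-May 2007 only needs the two partitions that cover those dates; the other three are eliminated without being read.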