You are on page 1of 45

DB2 10 Tips and Other Performance Topics

par Philippe Dubost, CA technologies

Runion du Guide DB2 pour z/OS France Jeudi 10 octobre 2013 Tour Opus, Paris-La Dfense

DB2 v10 tips and other performance topics


Agenda:
Save CPU by including extra columns to an unique index Is the length of your inline LOBs optimal ? Impact of SELECT * versus SELECTing specific columns Improve the response time of Native SQL procedures in DB2 v10

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

Why do we have indexes ?


Indexes can be used to enforce uniqueness of keys Indexes are mainly created to improve SQL performance

Whats the cost of indexes ?


Indexes consume a lot of disk space (DASD) An index adds ~30% of CPU on INSERTs / DELETEs (0)

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

DB2 v10 new feature :


ALTER INDEX IX ADD INCLUDE (COL)

What is it good for ?


The new INCLUDE feature in DB2 v10 allows you to reduce the number of indexes, it means:
Cost savings in DASD Cost savings in CPU.

Potential negative impact discussed in appendix (1)

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

ALTER INDEX IX1 ADD INCLUDE (COL2)

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

A note when adding several columns to an index


You have to add them one by one
ALTER INDEX IX1 ADD INCLUDE (COL3) ; ALTER INDEX IX1 ADD INCLUDE (COL4) ; ALTER INDEX IX1 ADD INCLUDE (COL5) ;

Thats honestly not a big deal, but maybe an nice DB2 enhancement to add in future versions could be:
ALTER INDEX IX1 ADD INCLUDE (COL3, COL4, COL5) ;

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

From Theory to Practice


How to locate this scenario in your DB2 environment ? Query the system catalog to locate indexes whose keys overlap with the unique index ! An example next slide, when the unique index is composed of only 1 column.

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index
SELECT IX.CREATOR,IX.NAME, IX.TBCREATOR,IX.TBNAME, IX.UNIQUERULE,IX.COLCOUNT, KEYS.IXCREATOR,KEYS.IXNAME, KEYS.COLNAME,KEYS.COLNO, IX2.CREATOR,IX2.NAME, IX2.TBCREATOR,IX2.TBNAME, IX2.UNIQUERULE,IX2.COLCOUNT, KEYS2.IXCREATOR,KEYS2.IXNAME, KEYS2.COLNAME,KEYS2.COLNO FROM SYSIBM.SYSINDEXES IX, SYSIBM.SYSKEYS KEYS,

SYSIBM.SYSINDEXES contains information about all indexes in the subsystem, CREATOR, NAME, TBCREATOR, TBNAME, uniqueness, #columns, SYSIBM.SYSKEYS contains information about all keys of all indexes in the subsystem, COLNAME, COLNO, associated with a particular INDEX

SYSIBM.SYSINDEXES IX2, SYSIBM.SYSKEYS KEYS2

!ERE IX.UNIQUERULE " #U# AND IX.CREATOR " KEYS.IXCREATOR AND IX.NAME " KEYS.IXNAME AND IX.COLCOUNT " 1 AND IX.TBCREATOR " IX2.TBCREATOR AND IX.TBNAME " IX2.TBNAME AND IX2.CREATOR " KEYS2.IXCREATOR AND IX2.NAME " KEYS2.IXNAME AND IX2.UNIQUERULE " #D# AND IX.TBCREATOR " IX2.TBCREATOR AND IX.TBNAME " IX2.TBNAME AND KEYS.COLNAME " KEYS2.COLNAME ;

DB2 v10 tips and other performance topics


Save CPU by including extra columns to an unique index

With the result of this SQL statement, you can prepare your ALTER INDEX statements to INCLUDE non-key columns to the unique indexes listed, and get rid of the corresponding now-superfluous indexes.
Save DASD ! Save CPU !

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

DB2 v10 new feature :


INLINE LOBs allow a portion of a LOB column to be stored in the base TableSpace.

What is it good for ?


improves the performance of applications accessing LOB data, (reduces the need to access the auxiliary LOB TableSpace) enables the creation of expression-based indexes on the inline portion of a LOB column (improves performance of searches through a LOB column) the Inline portion of the LOB can be compressed

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

COL1 0001 0002 0003 0004 0005 000$ 000% 000&

COL2 AAAA AAAB AABB BBBB CCCC CCDD EEEE FFFF

COL3 2002 2002 2005 200& 2012 2013 2013 2013

LOB '()*(LOB+ ))000)))+ ,,000,,,+ --111 ..111...+ //111// 00222000+ 11333

R/'( 20 (3/ LOB +/4.20(3/LOB +)))))))) +,,, +.......... +00000

Base TableSpace

auxiliary LOB TableSpace

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ? Index on expression example COL1 0001 0002 0003 0004 0005 000$ 000% 000& COL2 AAAA AAAB AABB BBBB CCCC CCDD EEEE FFFF COL3 2002 2002 2005 200& 2012 2013 2013 2013 LOB '()*(LOB+ ))000)))+ ,,000,,,+ --111 ..111...+ //111// 00222000+ 11333 R/'( 20 (3/ LOB +/4.20(3/LOB +)))))))) +,,, +.......... +00000

Base TableSpace

auxiliary LOB TableSpace

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ? Some LOBs are not entirely self-contained inline COL1 0001 0002 0003 0004 0005 000$ 000% 000& COL2 AAAA AAAB AABB BBBB CCCC CCDD EEEE FFFF COL3 2002 2002 2005 200& 2012 2013 2013 2013 LOB '()*(LOB+ ))000)))+ ,,000,,,+ --111 ..111...+ //111// 00222000+ 11333 R/'( 20 (3/ LOB +/4.20(3/LOB +)))))))) +,,, +.......... +00000

Base TableSpace

auxiliary LOB TableSpace

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ? and some other are self-contained. COL1 0001 0002 0003 0004 0005 000$ 000% 000& COL2 AAAA AAAB AABB BBBB CCCC CCDD EEEE FFFF COL3 2002 2002 2005 200& 2012 2013 2013 2013 LOB '()*(LOB+ ))000)))+ ,,000,,,+ --111 ..111...+ //111// 00222000+ 11333 R/'( 20 (3/ LOB +/4.20(3/LOB +)))))))) +,,, +.......... +00000

Base TableSpace

auxiliary LOB TableSpace

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

So what should be the length for my INLINE Lobs ?


IBM DB2 limitations:
Minimum: 0 (i.e. LOB is not INLINE) Maximum: 32680

Within this range, this is up to the DBA to define the proper length.
The length is defined via DDL statements:
Table creation : CREATE TABLE TABLE1 (COL5LOB CLOB(1M) INLINE LEN6T! 20000) Table alter : ALTER TABLE TABLE1 ALTER COLUMN COL5LOB INLINE LEN6T! 30000 ;
C)7(8249 R/2*1 4//./. : (2)
;

Tip : Larger page size can be beneficial when turning a LOB into Inline LOB (3)

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

The optimal length for INLINE Lobs depends on the actual DATA, more precisely, the length of the LOBs
The question is almost philosophical, since the answer greatly depends on the LOB Lengths Distribution in the column, in other words, it depends on the type of data that are stored in the LOB. And the DBA creating the table does not necessarily know what it contains, nor what it will contain

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

Various examples of LOB length distributions:


1st case: almost all LOBs data are small, limited in size, with a few exceptions.

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

Various examples of LOB length distributions:


2nd case: the LOBs lengths are equally distributed, from the minimum length to the maximum length.

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

Various examples of LOB length distributions:


3rd case: the LOBS lengths is are linearly distributed (ascending or descending)

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

Various examples of LOB length distributions:


4th case: the LOBs length follow a normal distribution (Gauss).

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

Various examples of LOB length distributions:


5th case: the LOBs length distribution is sinusoidal

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

From Theory to Practice


Identify inline LOBs in your subsystem
SELECT TBCREATOR,TBNAME,;NAME;,LEN6T!,LEN6T!2,COLTY<E FROM ;SYSIBM;.SYSCOLUMNS !ERE ( COLTY<E " #CLOB# OR COLTY<E " #BLOB# OR COLTY<E " #DBCLOB# ) AND LEN6T! = 4 ;

LENGTH > 4 indicates that the LOB column uses the INLINE LOB technique, the actual Inline LOB Length is LENGTH-4 LENGTH2 reflects the maximum length of the LOB column (inline + stored in Auxiliary)

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

From Theory to Practice


Visualize the length distribution of a LOB column The following SQL categorizes the various lengths of the LOBs into LOB length ranges
SELECT RAN6E, COUNT(LEN6T!) AS RO 5COUNT FROM ( SELECT LEN6T!(<ARSETREE) AS LEN6T!, (CEIL(2%$%0>100)?(1@CEIL( 100 ? LEN6T!(<ARSETREE) > 2%$%0))) AS RAN6E FROM SYSIBM.SYSAIE S) AS TABLE5 IT!5RAN6E 6ROU< BY RAN6E ORDER BY RAN6E ;

This example uses SYSIBM.SYSAIE S that contains a LOB column <ARSETREE, a 1GB BLOB (1073741824) with Inline LENGTH 27674-4 = 27670 When exported to Excel, one can make a simple graph representation of the LOB length distribution and visualize if the INLINE LENGTH value is properly set (next slide).

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

DB2 v10 tips and other performance topics


Is the length of your inline LOBs optimal ?

Advantages & Benefits


The nice thing about this technique is that it is re-usable over time, meaning that the same query can be ran later, when the LOB data values have evolved. That can help to make sure the INLINE LENGTH value is (still) optimal !

DB2 v10 tips and other performance topics


Impact of SELECT * versus SELECTing specific columns

Performance impact benchmark


a simple and dirty SELECT * versus selecting only the specific columns you need

First note
the performance degradation is not visible if we are dealing with small tables (small number of rows). My tests with a 1000 rows table did not show any difference. But when I performed tests on a more realist size of table (1,000,000 rows), I did notice meaningful differences.

DB2 v10 tips and other performance topics


Impact of SELECT * versus SELECTing specific columns

Test environment
Presenting here the results of a performance test (CPU time, and Elapsed time) for a 1,000,000 rows table. DB2 v10 New Function Mode (NFM) subsystem. Using a fairly simple table, with 7 columns only (integer, dates, and timestamps) Compared a SELECT * with SELECT COL1, COL2.

DB2 v10 tips and other performance topics


Impact of SELECT * versus SELECTing specific columns Benchmark results

70% overhead of CPU time 17% overhead of Elapsed time Other tests performed show similar results, overhead of SELECT * varies depending on:
the number of columns specified in the SELECT COL1, COL2, the size of the table (number of rows).

DB2 v10 tips and other performance topics


Impact of SELECT * versus SELECTing specific columns

From Theory to Practice


How to identify embedded SELECT * in existing applications running against DB2 for z/OS ? A few years ago, I used a product that examines embedded SQL statement in Cobol programs: CA Plan Analyzer (it also works for other programming languages). Not to enter into too much details about this tool, there is a rule (sort of trigger) called Expert Rule 0064 that will be triggered if an embedded SELECT * SQL statement is discovered in the application / plan / package analyzed.

DB2 v10 tips and other performance topics


Impact of SELECT * versus SELECTing specific columns

In addition to the performance impact mentioned detailed above, CA Plan Analyzer also notifies the user with the following recommendation (which makes a lot of sense, and IMHO another good reason why application developers should not use SELECT * type of SQL statements in their application) :
T38' '327B. ,/ )C28./. ,/-)7'/ 20 D*2,B/E' (3)( -)4 ,/ /4-274(/*/. F3/4 )..841 )4. */E2C841 -2B7E4' 0*2E (3/ 74./*BG841 (),B/('). Y27* )DDB8-)(824 D*21*)E 32'( C)*8),B/' F8BB 42( -2**/'D24. (2 (3/ )../.>*/E2C/. -2B7E4'.

DB2 v10 tips and other performance topics


Impact of SELECT * versus SELECTing specific columns

Recommendations
If you are a DB2 application developer and you care about your applications performance : do not use SELECT * statements ! If you are a DB2 administrator, you may want to share this presentation with your application developers, and / or look for products that can detect the use of embedded SELECT * statements running in your DB2 environment.

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

DB2 v9 introduced a new type of stored procedures called Native SQL Procedures.
These procedures execute in DBM1 address space, and provide SQL performance improvements due to less crossmemory calls compared to external stored procedures. Several improvements were done in this area in DB2 v10. IBM however mentions that the response time improvement (up to 20% improvement) can be achieved only if the existing Native SQL Procedures are dropped and re-created. That is, if you had created Native SQL Procedures under DB2 version 9 and upgraded to DB2 version 10, you might want to DROP/CREATE those.

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

From Theory to Practice


You moved to DB2 v10, and you need to locate the Native Stored Procedures that were created prior the migration. As I did not know when my subsystem moved to DB2 v10, I used a trick to discover this information: As in every new DB2 version, the DB2 catalog contains additional tables that are created during the migration process. I took one of them, SYSIBM.SYSAUTOALERTS, and queried SYSIBM.SYSTABLES to get the CREATEDTS value.

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

SQL statement used to discover the date of migration to DB2 v10 :


HH 3/4 F)' DB2 7D1*)./. (2 C10 NFM I

SELECT CREATEDTS FROM SYSIBM.SYSTABLES !ERE NAME " #SYSAUTOALERTS# AND CREATOR " #SYSIBM# ;

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

From Theory to Practice


Listing all Native Stored Procedures created prior the upgrade to version 10 NFM
HH L8'( 20 N)(8C/ SQL <*2-/.7*/' (2 */H-*/)(/ SELECT CREATEDBY,O NER,NAME,ORI6IN,CREATEDTS FROM SYSIBM.SYSROUTINES !ERE ORI6IN " #N# AND CREATEDTS J ( SELECT CREATEDTS FROM SYSIBM.SYSTABLES !ERE NAME " #SYSAUTOALERTS# AND CREATOR " #SYSIBM# );

Stored procedures are listed in SYSIBM.SYSROUTINES, and the column ORIGIN = 'N' indicates that we deals with a Native SQL Procedure Another method was suggested to me, see appendix
(4)

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

From Theory to Practice


With that, you have the list of Native SQL Procedures that you want to DROP / RECREATE. Hopefully, you have the DDL for these objects stored in dataset, but nothing is less obvious. If not, you can use tools to generate the DDL statements from the information in the catalog, in this example I used CA RC/Query for DB2 for z/OS to locate a particular Native SQL Procedure and generate its DDL (next slide)

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

Locate the Native Stored Procedure, and execute the DDL command.

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

Result of the DDL command

DB2 v10 tips and other performance topics


Improve the response time of Native SQL procedures in DB2 v10

From Theory to Practice


Ones you have the DDL, all whats needed is to update the SQL, add the DROP syntax and a couple of COMMITs, and the job is done!

Benefits
Performance improvements

Appendix
Philippe Dubost

Appendix
(0) Additional indexes --> Higher insert/delete CPU time (approximately 30% for each index)
Ref. presentation: DB2 10 Performance Update (Susan Lawson)

(1) Potential negative impact when using INCLUDE columns


If the majority of SQL is using the columns of the unique index (not the one to include) after adding an additional column, you now will have less index entries per page, i.e. more GETP requests, i.e. impact on performance.

(2) Reorg needed when implementing Inline LOBs


Add an Inline length or increase length : Advisory Status (AREO) Decrease length : Restrictive state (REORP)

(3) Larger page sizes can be beneficial when turning a LOB into Inline LOB
Some customers default everything to 4K pages. If youve got a LOB and most of the data fits into a smaller length than the maximum inline length (32680), which isnt uncommon, it might be worth considering making the LOB columns Inline and the Page size larger so you can get more rows on a page. Inlining LOB columns in 4K pages might reduce the number of rows on the page, forcing you to read more pages.

(4) (not tested) - Another way to find Native Stored Procedures create under DB2 v9
SYSROUTINES created under DB2 9 should have M in column RELCREATED

DB2 v10 tips and other performance topics


Presenters biography:
A 9-year IT professional, Philippe Dubost is Product Manager at CA technologies. In this role, he is responsible for products planning and strategy, presenting and representing the products portfolio to customers and industry analysts, collecting customer requirements and transforming them into actionable Agile/Scrum stories in the engineering backlog.

Contact information:
www.linkedin.com/in/dubost/ www.db2forz.blogspot.com E-mail: Philippe.Dubost@ca.com

DB2 10 Tips and Other Performance Topics


par Philippe Dubost, CA technologies Philippe.Dubost@ca.com

You might also like