Lab 03 Storage and Data Load: Important Things About LOAD For Columnar Tables

Lab 03 – Storage and Data Load IBM Software
Lab 03 Storage and Data Load

Important things about LOAD for Columnar Tables
__1. When the LOAD utility runs with REPLACE, or with INSERT into an empty table, it builds the
column compression dictionary. This happens during the ANALYZE phase.
__2. During the LOAD phase, the utility compresses values and data pages, updates the
synopsis table, and builds keys for the page map index and any unique indexes.
__3. Set UTIL_HEAP_SZ as large as practical for the LOAD process. If the database server
has more than 128 GB of RAM, UTIL_HEAP_SZ should be at least 4 million pages.
__4. Good compression requires loading a large amount of representative data.
__5. Avoid loading a small initial subset of data in the first load, because it can produce a poor
compression dictionary.
__6. Page-level compression handles new values that the column-level dictionary does not cover. This
reduces the need to rebuild the compression dictionary.
__7. Page-level compression limits the deterioration of compression ratios over time.
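To put the UTIL_HEAP_SZ guidance in concrete terms, the sketch below converts 4 million pages into memory. It relies on UTIL_HEAP_SZ being expressed in 4 KB pages. The variable names are illustrative only, and MYDB in the commented command is a placeholder database name, not one taken from this lab.

```shell
# UTIL_HEAP_SZ is specified in 4 KB pages.
pages=4000000
page_bytes=4096

heap_bytes=$((pages * page_bytes))
heap_mib=$((heap_bytes / 1024 / 1024))
echo "UTIL_HEAP_SZ=${pages} pages is about ${heap_mib} MiB"

# One way to apply the value (MYDB is a placeholder database name):
# db2 update db cfg for MYDB using UTIL_HEAP_SZ 4000000
```

That works out to roughly 15 GiB, or about 12% of the memory on a 128 GB server.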
Use of Load Utility

__8. In the GNOME command window, type cd3 to change to the Lab 03 directory.
$ cd3
__9. Run data01 to create the column-organized table DB2PSC.FACT_DX_COL.

$ ./data01
IBM DB2 10.5 BLU Acceleration Page 31

__10. Run data02 to load 100,000 rows into it.

$ ./data02
__11. Notice the new ANALYZE phase for the column-organized table.
__12. In the ANALYZE phase, data is converted from row-organized format to column-organized format.
Histograms are built to track value frequency, and the compression dictionary is built from those
histograms.
__13. In the LOAD phase, raw data is converted from row-organized format to column-organized
format. The data pages are built using the compression dictionary, and the synopsis table is built.
__14. The keys for the page map index and any unique indexes are also built.
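The idea of building a dictionary from a value-frequency histogram can be illustrated with a toy example. The sketch below is not DB2's actual encoding. It only counts how often each value occurs in a made-up column and assigns the smallest code to the most frequent value. The sample values and variable names are hypothetical.

```shell
# Toy column of values (hypothetical sample data).
column_values="CA NY CA TX CA NY CA"

# Histogram: count occurrences of each value, most frequent first.
# $column_values is left unquoted on purpose so each value becomes one line.
histogram=$(printf '%s\n' $column_values | sort | uniq -c | sort -k1,1nr -k2,2)

# Assign dictionary codes in frequency order (code 0 = most frequent value).
# Output columns: code, value, frequency.
dictionary=$(printf '%s\n' "$histogram" | awk '{ print NR - 1, $2, $1 }')
printf '%s\n' "$dictionary"
```

A small or unrepresentative first load would produce a misleading histogram in the same way, which is why the first load should contain a large amount of representative data.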
Page 32 An IBM Proof of Technology

Check Percent Pages Saved

__15. Run data03 to check percent pages saved.
$ ./data03
__16. Now load 1 million rows and check the compression ratio. Run data04, which is the same as
data02 but loads 1 million rows.
$ ./data04
__17. Run data03 to check percent pages saved.

$ ./data03
__18. Notice that the compression ratio increases from 1.61 to 3.03 when the number of rows
increases from 100,000 to 1,000,000.
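The two compression ratios can be related to a percent-pages-saved figure. The sketch below assumes the relationship ratio = 1 / (1 - saved/100). That is an assumption about how the two numbers relate, not something this lab states, and the function name is illustrative.

```shell
# Convert a compression ratio into the percentage of pages saved.
# Assumed relationship: ratio = 1 / (1 - saved/100).
pages_saved() {
    awk -v ratio="$1" 'BEGIN { printf "%.1f\n", (1 - 1 / ratio) * 100 }'
}

saved_100k=$(pages_saved 1.61)   # ratio after loading 100,000 rows
saved_1m=$(pages_saved 3.03)     # ratio after loading 1,000,000 rows
echo "100,000 rows:   ${saved_100k}% of pages saved"
echo "1,000,000 rows: ${saved_1m}% of pages saved"
```

Under that assumption, the larger load moves the table from roughly 38% to roughly 67% of pages saved.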

Check Synopsis Table

__19. Run data05.
$ ./data05
__20. Run gedit data05.log to see the contents of the synopsis table for SAMPLE_TAB.
$ gedit data05.log
__21. There is one entry for every 1024 rows, and each entry contains the minimum and maximum values
for each column. TSNMIN and TSNMAX (TSN stands for Tuple Sequence Number) are internal
references to the actual pages holding the data.
__22. Lab 05 explores how DB2 BLU Acceleration uses the synopsis table for data skipping.
__23. Press CTRL-Q to quit from the gedit window.
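As a rough check on the size of a synopsis table, the sketch below estimates the number of entries for the row counts loaded in this lab, using the figure of one entry per 1024 rows. The function name is illustrative, and the counts seen in data05.log may differ.

```shell
# Each synopsis entry is stated to cover 1024 rows.
rows_per_entry=1024

synopsis_entries() {
    # Integer ceiling division: a partly filled last range still needs an entry.
    echo $(( ($1 + rows_per_entry - 1) / rows_per_entry ))
}

entries_100k=$(synopsis_entries 100000)
entries_1m=$(synopsis_entries 1000000)
echo "100,000 rows   -> about ${entries_100k} synopsis entries"
echo "1,000,000 rows -> about ${entries_1m} synopsis entries"
```

The synopsis table is therefore about three orders of magnitude smaller than the table it describes.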

Use of db2convert Utility

__24. The db2convert utility converts one or all row-organized user tables in a specified database into
column-organized tables. The row-organized tables remain online during command
processing. Internally, db2convert invokes the ADMIN_MOVE_TABLE stored procedure to
convert and move the table.
__25. Run data06 to create FACT_DX_ROW as a row-organized table.

$ ./data06
__26. Run data07 to load 1 million rows into it.

$ ./data07

__27. Run data11 to check the amount due for providers with a balance of more than $2000.
$ ./data11
__28. Run data08. This script runs two commands. The first opens a new GNOME Terminal
window that runs the workload defined in data09 against the table being converted. The
second, db2convert, converts the row-organized table DB2PSC.FACT_DX_ROW
into a column-organized table while the workload runs against the same table.
$ ./data08
__29. After the balances of more than $2000 are cleared from the table, the workload finishes and
prompts you to press Y to close the command window.

__30. Watch the progress of the db2convert utility.
Note: Because the workload updated the table, the REPLAY phase
applies those changes from the log records. This may take longer because the workload
committed every single UPDATE. After the SWAP phase, db2convert tries to obtain a
Z table lock to drop and rename the table.
The whole operation is online, so the table can be converted without
any downtime.
__31. When db2convert finishes, it displays a completion message.
__32. Run data10 to check the DB2PSC.FACT_DX_ROW table.

$ ./data10
__33. Note that TABLEORG is C (column-organized) and the compression ratio is 3.

__34. The workload ran against the row-organized table while it was being converted and cleared the
balances of more than $2000. Check the same result on the converted table by running data11 again.
$ ./data11
__35. Type clear in the command window.

$ clear
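For reference, a db2convert invocation for the table converted in this section might look like the sketch below. The contents of data08 are not shown in this lab, so the options it actually uses are an assumption, and MYDB is a placeholder database name. The sketch prints the command instead of running it.

```shell
# Placeholder database name; replace it with the database used in the lab.
DB_NAME="MYDB"

# -d selects the database, -z the schema, -t the table to convert.
convert_cmd="db2convert -d ${DB_NAME} -z DB2PSC -t FACT_DX_ROW"

# Print the command rather than executing it, so the sketch is safe to try anywhere.
echo "$convert_cmd"
```

Without the -z and -t options, db2convert considers all row-organized user tables in the database, which matches the "one or all" behavior described at the start of this section.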
** End of Lab 03: Storage and Data Load
