__2. During its LOAD phase, the LOAD utility compresses values and builds data pages. It updates
the synopsis table and builds keys for the page map index and any unique indexes.
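For reference, a column-organized table such as SAMPLE_TAB is typically populated with a LOAD
command of the following general form (the input file name here is only an illustration, not one of
the lab's files):
$ db2 "LOAD FROM mydata.del OF DEL REPLACE INTO SAMPLE_TAB"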
__3. The UTIL_HEAP_SZ should be as large as possible for the LOAD process. If the database server
has more than 128 GB of RAM, the value of UTIL_HEAP_SZ should be at least 4 million pages.
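If you need to raise the utility heap yourself, a command of this form can be used (the database
name is a placeholder; substitute the name of your lab database):
$ db2 UPDATE DB CFG FOR <dbname> USING UTIL_HEAP_SZ 4000000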
__4. In order to get good compression, it is necessary to load a large amount of representative data.
__5. It is not advisable to load a small initial subset of data for the first load, as it might lead to a poor
compression dictionary.
__6. Page-level compression is used for new values not covered by the column-level dictionary. This
reduces the need to rebuild the compression dictionary.
__7. Page-level compression helps keep the compression ratio from deteriorating over time.
__11. Please notice the new ANALYZE phase for the column-organized table.
__12. In the ANALYZE phase, the input data is scanned and histograms are built to track value
frequencies. The compression dictionary is built based upon these histograms.
__13. In the LOAD phase, the raw data is converted from row-organized format to column-organized
format. Data pages are built using the compression dictionary, and the synopsis table is built.
__14. The keys for the page map index and any unique indexes are built.
__16. We will now load 1 million rows and check the compression ratio. Run the following command.
$ ./data04 [Same as data02, but loads 1 million rows]
__18. Please notice that the compression ratio increases from 1.61 to 3.03 when the number of rows
increases from 100,000 to 1,000,000.
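If you want to verify the ratio directly (assuming current statistics have been collected with
RUNSTATS), one way is to check PCTPAGESSAVED in the catalog; the compression ratio is roughly
1 / (1 - PCTPAGESSAVED/100):
$ db2 "SELECT TABNAME, PCTPAGESSAVED FROM SYSCAT.TABLES WHERE TABNAME = 'SAMPLE_TAB'"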
__20. Run gedit data05.log to see the contents of the synopsis table for SAMPLE_TAB.
$ gedit data05.log
__21. There is one entry for every 1024 rows, and it contains the minimum and maximum values for each
column. The TSNMIN and TSNMAX (Tuple Sequence Number) values are internal references to the
actual pages holding the data.
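If you prefer to query the synopsis table directly instead of reading the log, one way to locate it
relies on the naming convention that synopsis tables are created in the SYSIBM schema with a
generated name ending in the base table name:
$ db2 "SELECT TABSCHEMA, TABNAME FROM SYSCAT.TABLES WHERE TABNAME LIKE 'SYN%SAMPLE_TAB'"
You can then SELECT from the table reported by this query.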
__22. We will explore how DB2 BLU Acceleration does data skipping using the synopsis table in
Lab 05.
__27. Run data11 to check the amount due for providers having a balance of more than $2000.
$ ./data11
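The actual SQL is inside the data11 script; a hypothetical query of this general shape (the column
names PROVIDER_ID, AMT_DUE, and BALANCE are illustrative assumptions, not the lab's real column
names) would look like:
$ db2 "SELECT PROVIDER_ID, SUM(AMT_DUE) AS TOTAL_DUE FROM DB2PSC.FACT_DX_ROW WHERE BALANCE > 2000 GROUP BY PROVIDER_ID"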
__28. The data08 script runs two commands. The first command opens a new GNOME Terminal
window that runs the workload defined in data09, in parallel, against the table we are
converting. The second command, db2convert, converts the row-organized table
DB2PSC.FACT_DX_ROW into a column-organized table while the workload is running against the
same table.
$ ./data08
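For reference, a db2convert invocation that converts a single table follows this general form (the
database name is a placeholder; the schema and table match the lab table named above):
$ db2convert -d <dbname> -z DB2PSC -t FACT_DX_ROW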
__29. After balances (of more than $2000) are cleared from the table, the workload will finish and
prompt you to press Y to close the command window.
Note: Since we updated the table through the workload, the REPLAY phase
applies the changes using log records. This may take longer because we
committed every single UPDATE. After the SWAP phase, db2convert will try to
obtain a Z table lock to drop and rename the table.
The whole operation is online, which allows the table to be converted
without having to take any downtime.
__31. When db2convert finishes its work, you will see a message similar to the one shown below.
__34. Since we ran the workload against the row-organized table (while it was being converted) to clear
balances of more than $2000, we will now check the same on the converted table. Run data11.
$ ./data11