
Oracle Compression
AISHWARYA KALA

Compression Modes

Basic (from 9i)


compress BASIC

For OLTP (from 11gR1)


compress for OLTP

Hybrid columnar (from 11gR2, on Exadata)


compress for QUERY [LOW|HIGH]
compress for ARCHIVE [LOW|HIGH]

Block Structure

Row Structure
Within a row, each column is preceded by a byte specifying the length of that column.
Here we have 4 columns in a row: the first is 3 bytes in length, the second is 5 bytes, the third is NULL, and the fourth is 2 bytes.
If the value of the length byte is 0xFF (255), the column is NULL; in other words, there is no data for this column of the row.
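
The layout described above can be sketched in a few lines of Python. This is a simplified model, not Oracle's actual on-disk format: it only covers the length byte and the 0xFF NULL marker mentioned in the text, ignores row headers and long columns, and the names `encode_row` and `decode_row` are hypothetical.

```python
NULL_MARKER = 0xFF  # length-byte value the text gives for a NULL column


def encode_row(columns):
    """Serialize columns as <length byte><data>; None becomes a lone 0xFF byte."""
    out = bytearray()
    for value in columns:
        if value is None:
            out.append(NULL_MARKER)
        else:
            if len(value) >= NULL_MARKER:
                raise ValueError("this sketch only handles short columns")
            out.append(len(value))
            out.extend(value)
    return bytes(out)


def decode_row(raw):
    """Walk the bytes, reading one length byte and then that many data bytes."""
    columns, pos = [], 0
    while pos < len(raw):
        length = raw[pos]
        pos += 1
        if length == NULL_MARKER:
            columns.append(None)
        else:
            columns.append(raw[pos:pos + length])
            pos += length
    return columns


# Four columns: 3 bytes, 5 bytes, NULL, 2 bytes (the example from the slide).
row = [b"abc", b"hello", None, b"hi"]
encoded = encode_row(row)
print(encoded.hex(" "))
print(decode_row(encoded))
```

Note that the NULL column costs a single byte in this model, since only the marker is stored and no data follows it.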

Block/Row Level Compression

The format of a data block that uses basic or OLTP table compression is essentially the same as that of an uncompressed block.
The difference is that a symbol table at the beginning of the block stores duplicate values for the rows and columns.

Symbol Table: Oracle's Local (Row-Level) Compression

Compressed blocks contain a structure called a symbol table that maintains compression metadata. The symbol table is stored as another table in the block. Each column in a row in a block references back to an entry in the symbol table in the same block.
Duplicate values are eliminated by first adding a single copy of the duplicate value to the symbol table when a block is compressed. Each duplicate value is then replaced by a short reference to the appropriate entry in the symbol table.
This makes compressed data self-contained within the database block, because the metadata used to translate compressed data back into its original state is contained within the block.

For an uncompressed block, there is no symbol table; the block simply holds the rows as they are.

For a compressed block, there is a symbol table. Here we have rows with the names John, Doe, Jane, and Smith, which are repeated values.

A symbol table is created, with a count of the rows that use each value.

The original data in the rows has been replaced with symbols, and the original values (Jane, Doe, Smith, etc.) are stored in the symbol table.

Note the free space that has accumulated in the block.
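
To make the symbol table idea concrete, here is a minimal Python sketch using the names from the example. It is a toy model under stated assumptions, not Oracle's block format: every value costs its length plus one byte, every symbol reference costs one byte, and the function names are hypothetical.

```python
from collections import Counter


def compress_block(rows):
    """Return (symbol_table, compressed_rows) for the rows of one block."""
    counts = Counter(value for row in rows for value in row)
    # Only values that occur more than once get a symbol table entry.
    symbols = [value for value, n in counts.items() if n > 1]
    index = {value: i for i, value in enumerate(symbols)}
    compressed = [
        [("ref", index[v]) if v in index else ("lit", v) for v in row]
        for row in rows
    ]
    return symbols, compressed


def decompress_block(symbols, compressed):
    """Rebuild the rows using only what is stored in the block itself."""
    return [
        [symbols[x] if kind == "ref" else x for kind, x in row]
        for row in compressed
    ]


def raw_size(rows):
    # Assumption: each value costs one length byte plus its characters.
    return sum(len(v) + 1 for row in rows for v in row)


def compressed_size(symbols, compressed):
    # Assumption: a symbol reference costs a single byte.
    table = sum(len(v) + 1 for v in symbols)
    body = sum(1 if kind == "ref" else len(x) + 1
               for row in compressed for kind, x in row)
    return table + body


rows = [["John", "Doe"], ["Jane", "Doe"], ["John", "Smith"], ["Jane", "Smith"]]
symbols, compressed = compress_block(rows)
print("symbol table:", symbols)
print("bytes before:", raw_size(rows))
print("bytes after: ", compressed_size(symbols, compressed))
```

Because the symbol table travels with the rows, `decompress_block` needs nothing from outside the block, which is the self-contained property described above.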

Compressed Block vs. Non-compressed Block

Basic Compression

This feature is available starting with Oracle 9i, and it allows us to store 2x, 3x, 4x or more data per block.

It only works during direct path operations such as insert /*+ APPEND */, alter table t move, create table as select, and sqlldr direct=y.

It does not PREVENT you from using normal insert/update/delete statements - it just means that those statements will produce some non-compressed data.

A single table may have some blocks compressed and some blocks not compressed.

The original basic compression still exists in 11g for Enterprise Edition users, and there is the new Advanced Compression option (OLTP).

Oracle will do some decompression of a row before updating it.

Oracle makes no attempt to re-compress the row after the update.

A Few Examples
1. Baseline CTAS

create table t1 as select * from all_objects where rownum <= 50000;

Blocks (714) , No Compression

2. CTAS with basic compression enabled

create table t1 compress basic as select * from all_objects where rownum <= 50000;

Blocks (189) , Compression Enabled

3. Normal insert into empty table defined as compressed


create table t1 compress basic as select * from all_objects where rownum = 0;
insert into t1 select * from all_objects where rownum <= 50000;

Blocks (714) , Compression Enabled

4. Direct path insert into empty table defined as compressed


create table t1 compress basic as select * from all_objects where rownum = 0;
insert /*+ append */ into t1 select * from all_objects where rownum <= 50000;

Blocks (189) , Compression Enabled

5. CTAS without compression, then change to compressed

create table t1 as select * from all_objects where rownum <= 50000;
alter table t1 compress basic;

Blocks (714) , Compression Enabled

alter table t1 move;

Blocks (189) , Compression Enabled
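
As a quick worked calculation, the block counts reported in these examples translate into a compression ratio and a space saving as follows. The snippet only restates the two figures from the slides (714 and 189 blocks); nothing else is measured.

```python
uncompressed_blocks = 714  # block count reported when rows are not compressed
compressed_blocks = 189    # block count reported after direct-path load or move

ratio = uncompressed_blocks / compressed_blocks
saved = 1 - compressed_blocks / uncompressed_blocks

print(f"compression ratio: {ratio:.2f}x")
print(f"space saved: {saved:.1%}")
```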

OLTP Compression

Basic and OLTP compression use the same underlying algorithm.

OLTP compression allows data to be compressed during all types of data manipulation operations, including conventional DML such as INSERT and UPDATE.

One significant advantage is Oracle's ability to read compressed blocks directly without having to first uncompress the block. Therefore, there is no measurable performance degradation for accessing compressed data.

To gain performance during DML execution, a newly initialized block remains uncompressed until data in the block reaches an internally controlled threshold. When a transaction causes the data in the block to reach this threshold, all contents of the block are compressed.
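
The threshold behaviour can be illustrated with a toy Python model. Everything numeric here is an assumption made for the sketch: the 64-byte capacity, the 80% threshold, and the byte costs are hypothetical, since the real threshold is internally controlled by Oracle. Rows are single values to keep the example short.

```python
class Block:
    """Toy block: rows stay uncompressed until usage reaches a threshold."""

    def __init__(self, capacity=64, threshold=0.8):
        self.capacity = capacity    # hypothetical block size in bytes
        self.threshold = threshold  # hypothetical trigger; Oracle's is internal
        self.symbols = []           # symbol table: one copy of each repeated value
        self.rows = []              # entries are ("lit", value) or ("ref", index)
        self.compressions = 0       # how many times the block was compressed

    def used(self):
        # Assumption: a literal costs len + 1 bytes, a reference costs 1 byte.
        table = sum(len(v) + 1 for v in self.symbols)
        body = sum(1 if kind == "ref" else len(x) + 1 for kind, x in self.rows)
        return table + body

    def values(self):
        return [self.symbols[x] if kind == "ref" else x for kind, x in self.rows]

    def insert(self, value):
        # New rows always go in uncompressed, so the insert itself stays cheap.
        self.rows.append(("lit", value))
        if self.used() >= self.capacity * self.threshold:
            self._compress()

    def _compress(self):
        # Compress the whole block at once: rebuild the symbol table from all rows.
        values = self.values()
        repeated = [v for v in dict.fromkeys(values) if values.count(v) > 1]
        index = {v: i for i, v in enumerate(repeated)}
        self.symbols = repeated
        self.rows = [("ref", index[v]) if v in index else ("lit", v) for v in values]
        self.compressions += 1


block = Block()
for i in range(12):
    block.insert("Smith" if i % 2 == 0 else "Doe")
    print(i + 1, "rows, bytes used:", block.used(), "compressions:", block.compressions)
```

In this model most inserts pay no compression cost at all; only the insert that crosses the threshold triggers a compression of the whole block, which frees space for further uncompressed inserts.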

Compression During DML

Columnar Compression

An alternative approach is to store data in a columnar format, where data is organized and stored by column.

Storing column data together, with the same data type and similar characteristics, dramatically increases the storage savings achieved from compression. However, storing data in this manner can negatively impact database performance when application queries access more than one or two columns, as well as during DML.

Columnar storage refers to how data is grouped together on disk.

Columnar compression refers to whether the actual data is on disk, or whether you save space by storing some smaller substitute for the actual data.
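
A small Python sketch shows why grouping values by column tends to compress better. Run-length encoding is used here purely as an illustrative "smaller substitute"; it is an assumption of this example, not a statement about which algorithm Oracle uses, and the sample rows are made up.

```python
from itertools import groupby

# Hypothetical two-column table: (region, status).
rows = [
    ("EU", "OK"),
    ("EU", "OK"),
    ("EU", "ERR"),
    ("US", "OK"),
]


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Row storage: values of one row sit next to each other.
row_major = [value for row in rows for value in row]
# Columnar storage: values of one column sit next to each other.
column_major = [value for column in zip(*rows) for value in column]

print("row-major runs:   ", len(run_length_encode(row_major)))
print("column-major runs:", len(run_length_encode(column_major)))
print(run_length_encode(column_major))
```

With row storage, neighbouring values come from different columns and rarely repeat, so nothing collapses. With columnar storage, similar values end up adjacent and the encoded form gets shorter.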

Hybrid Columnar Compression (HCC)

As the name implies, this technology uses a combination of both row and columnar methods for storing data.

A logical construct called the compression unit (CU), which is simply a collection of blocks, is used to store a set of hybrid columnar-compressed rows.

When data is loaded, column values for a set of rows are grouped together and compressed. After the column data for a set of rows has been compressed, it is stored in a compression unit. A CU is therefore a logical structure spanning multiple database blocks.

Rows are self-contained within the CU.
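
The way a compression unit is built can be sketched as follows. This is a simplified model: `zlib` and JSON stand in for whatever compression and serialization Oracle actually uses, block boundaries are ignored, and the function names are hypothetical.

```python
import json
import zlib


def build_compression_unit(rows):
    """Pivot a set of rows into columns and compress each column separately."""
    columns = list(zip(*rows))
    return [
        zlib.compress(json.dumps(list(column)).encode("utf-8"))
        for column in columns
    ]


def read_rows(cu):
    """Rebuild every row from the CU alone (rows are self-contained in it)."""
    columns = [json.loads(zlib.decompress(chunk).decode("utf-8")) for chunk in cu]
    return [tuple(values) for values in zip(*columns)]


# A small set of rows loaded together into one CU.
rows = [(1, "John", "Doe"), (2, "Jane", "Doe"), (3, "John", "Smith")]
cu = build_compression_unit(rows)

print("compressed column chunks:", len(cu))
print(read_rows(cu))
```

The hybrid aspect is visible in the structure: data is columnar inside the unit, but a given row never spills into another unit, so a single CU is enough to reconstruct it.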

Query / DML with HCC

DML with HCC

Direct load operations result in Hybrid Columnar Compression.

Data is transformed into columnar format and compressed during the load.

A conventional INSERT results in OLTP compression.

Updated rows are moved, as in a delete + insert, and the updated row automatically migrates to OLTP compression.

Query with HCC

Queries against Hybrid Columnar Compression data decompress only the columns necessary to satisfy the query. Data can remain compressed in the buffer cache.

An optimized algorithm avoids or greatly reduces the overhead of decompression during queries.

Performing DML operations could result in a reduction of the HCC compression ratio. It is recommended that HCC be enabled on tables or partitions with no or infrequent DML operations.
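
The point about decompressing only the necessary columns can be illustrated with a toy Python class. As before, `zlib` and JSON are stand-ins chosen for the sketch, and the class name, column names, and `select` method are hypothetical; this is not how Oracle implements query processing.

```python
import json
import zlib


class CompressionUnit:
    """Toy CU that records which columns a query had to decompress."""

    def __init__(self, column_names, rows):
        self.names = list(column_names)
        self.chunks = {
            name: zlib.compress(json.dumps(list(column)).encode("utf-8"))
            for name, column in zip(self.names, zip(*rows))
        }
        self.decompressed = []  # log of columns actually decompressed

    def column(self, name):
        self.decompressed.append(name)
        return json.loads(zlib.decompress(self.chunks[name]).decode("utf-8"))

    def select(self, wanted, where_column, where_value):
        """Roughly: SELECT wanted FROM cu WHERE where_column = where_value."""
        needed = list(dict.fromkeys([where_column, *wanted]))
        data = {name: self.column(name) for name in needed}
        matches = [i for i, v in enumerate(data[where_column]) if v == where_value]
        return [tuple(data[name][i] for name in wanted) for i in matches]


cu = CompressionUnit(
    ["id", "first_name", "last_name", "notes"],
    [
        (1, "John", "Doe", "x" * 50),
        (2, "Jane", "Doe", "y" * 50),
        (3, "John", "Smith", "z" * 50),
    ],
)
result = cu.select(["id"], "last_name", "Doe")
print(result)
print("decompressed columns:", cu.decompressed)
```

The query touches two of the four columns, so the wide `notes` column and `first_name` stay compressed throughout.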

HCC Options Available

Warehouse Compression

This option is for optimizing query performance.

Suitable for data warehouse applications.

Two options: Query High and Query Low

COMPRESS FOR QUERY LOW - Low compression ratio without affecting load times.

COMPRESS FOR QUERY HIGH - High compression ratio without affecting query times, but with some data load performance hit.

Online Archival Compression

This option is optimized for maximum compression ratios.

Suitable for data that changes very rarely.

Two options: Archive High and Archive Low

COMPRESS FOR ARCHIVE HIGH - Very high compression ratio with slower query times.
COMPRESS FOR ARCHIVE LOW - High compression ratio with slower query response times.