Professional Documents
Culture Documents
Table compression
We often hear that disk space is cheap and that we should not be too concerned if databases are
growing at an excessive rate. Just buy more disks. Well, what most managers do not realize is that
we are not talking about disks that you buy at some large electronic store. Microsoft SQL Server and
other RDBMSs require fast redundant sets of disks. That coupling takes disk price from cheap to often
very expensive. In addition, if you must include high availability and disaster recovery as part of your
topology, then your costs are doubled.
To offset some of the cost, SQL Server includes a feature that allows you to compress your data at
GLIIHUHQWOHYHOV³VSHFLÀFDOO\WDEOHVDQGLQGH[HV<RXFDQDFWXDOO\FRPSUHVVDQ\RIWKHIROORZLQJ
■ $WDEOHWKDWLVVWRUHGDVDKHDSLWGRHVQRWKDYHDFOXVWHUHGLQGH[
■ $WDEOHWKDWKDVDFOXVWHUHGLQGH[
■ $QRQFOXVWHUHGLQGH[
■ $QLQGH[HGYLHZ
■ $SDUWLWLRQVHH&KDSWHU´64/6HUYHU3DUWLWLRQLQJµ
1RWRQO\FDQ\RXUHGXFHWKHGLVNUHTXLUHPHQWVRI64/6HUYHUZLWKFRPSUHVVLRQEXWLQPRVWFDVHV
you can also improve the overall performance of your disk subsystems and query request times. The
actual compression rate is dependent upon two primary factors: data characteristics and the corre-
sponding data type. While not all data types are affected by compression, most are. Table 7-1 lists the
affected data types.
95
www.it-ebooks.info
TABLE 7-1 'DWD7\SHV$IIHFWHGE\&RPSUHVVLRQ
smallint int bigint decimal numeric
real ÁRDW money smallmoney bit
datetime datetime2 datetimeoffset char nchar
binary rowversion
SQL Server has two types of compression: data and backup. Throughout this chapter, you will focus
on data compression.
Note Data compression is supported only in the Enterprise version of SQL Server 2012.
5RZFRPSUHVVLRQE\QDWXUHLVQRWDYHU\FRPSOLFDWHGSURFHVV%DVLFDOO\LWLGHQWLÀHVWKHGDWDW\SH
RIHDFKFROXPQFRQYHUWVLWWRYDULDEOHOHQJWKDQGÀQDOO\UHGXFHVWKHDPRXQWRIUHTXLUHGVWRUDJHWR
RQO\ZKDWLVQHHGHG$VDUHVXOWFRPSUHVVLRQLQFUHDVHVWKHDPRXQWRIGDWDWKDWFDQEHVWRUHGRQD
page. In addition, it may reduce the amount of metadata associated with a record.
For example, if you have a column that has a data type of smallint, by default it will allocate 2 bytes
of storage. However, the value inserted into the column may require only 1 byte of storage. If that
is the case, enabling compression on that table will reduce the amount of allocated storage to only
what is needed: 1 byte. This process is repeated for every column in the table or index.
$VVWDWHG\RXFDQFRPSUHVVDWDEOHRULQGH['XULQJFUHDWLRQ\RXKDYHWKHRSWLRQRIVSHFLI\LQJD
table option that will compress the data within a table. Continuing the trend, you can compress data
with T-SQL and Microsoft SQL Server Management Studio (SSMS).
2. ([SDQGWKHVHUYHUQRGHLQ2EMHFW([SORUHU
www.it-ebooks.info
3. ([SDQGWKH$GYHQWXUH:RUNVGDWDEDVH
5. )URPWKHFRQWH[WPHQXWKDWDSSHDUVVHOHFW6WRUDJH_0DQDJH&RPSUHVVLRQ
The next page, Select Compression Type, is where all the magic happens.
7. &KHFNWKHER[ODEHOHG8VH6DPH&RPSUHVVLRQ7\SHIRU$OO3DUWLWLRQV
8. Select Row from the drop-down list, and click Calculated in the bottom-right corner.
$V\RXFDQVHH\RXZLOOVDYHDSSUR[LPDWHO\0%RIGLVNVSDFHE\LPSOHPHQWLQJURZFRP-
pression on the table.
9. &OLFN1H[W
The Select an Output Option page is where you specify how and when to compress the data.
www.it-ebooks.info
10. Click Run Immediately.
11. &OLFN1H[WDQGWKH6XPPDU\SDJHDSSHDUV
www.it-ebooks.info
12. Review the summary information and click Finish.
Note Compressing the heap or the clustered index does not compress the nonclus-
tered indexes. If you want to compress the nonclustered indexes, you must do so
individually.
1. In 2EMHFW([SORUHUH[SDQG6DOHV6DOHV2UGHU+HDGHU
3. 5LJKWFOLFNWKH,;B6DOHV2UGHU+HDGHUB6DOHV3HUVRQ,'QRQFOXVWHUHGLQGH[DQGVHOHFW6WRUDJH_
Manage Compression.
www.it-ebooks.info
4. 5HSHDWVWHSV²IURPWKHSUHYLRXVH[HUFLVH
--Use this code to row compress a nonclustered index on the Sales.SalesOrderHeader table
ALTER INDEX IX_SalesOrderHeader_SalesPersonID
ON Sales.SalesOrderHeader
REBUILD WITH(DATA_COMPRESSION = ROW);
You can also compress a table during initial creation with T-SQL. The following code creates a table
that has a primary key that will be row compressed:
1RWHWKDWQRWKLQJDERXWFUHDWLQJDWDEOHFKDQJHV³KRZHYHU\RXPXVWDGGWKHRSWLRQDWWKHHQG
ZKLFKDSSHDUVLQEROGLQWKHSUHYLRXVVFULSW1RZZKHQHYHUGDWDLVLQVHUWHGLQWRWKHWDEOHLWZLOOEH
www.it-ebooks.info
row compressed. Using this method will compress the clustered index on the table or, if the clustered
index does not exist, it will compress the heap.
■ Row compression
■ 3UHÀ[FRPSUHVVLRQ
■ Dictionary compression
$V\RXFDQVHHSDJHFRPSUHVVLRQLQFOXGHVURZFRPSUHVVLRQDVSDUWRILWVSURFHVV1RWKLQJ
FKDQJHVZLWKWKHURZFRPSUHVVLRQSURFHVV³LW·VMXVWWKHÀUVWVWHSLQSDJHFRPSUHVVLRQ$IWHUWKHURZ
FRPSUHVVLRQLVFRPSOHWHWKHQH[WVWHSLVSUHÀ[FRPSUHVVLRQ
During this step, each column is scanned for a value that will reduce the storage space for each
FROXPQ2QFHWKHYDOXHLVLGHQWLÀHGDURZIRUHDFKFROXPQLVVWRUHGLQWKHSDJHKHDGHU$OOWKH
information is called the compression information (CI), which is stored below the page header. The
LGHQWLÀHGYDOXHVSUHÀ[HGYDOXHVDUHORFDWHGLQHDFKFROXPQDQGUHSODFHGZLWKDSRLQWHUWRWKHYDOXH
in the CI section. Figure 7-1 illustrates the process.
4j 4j [empty]
[empty] 0jjjj [empty]
3kkk [empty] 0jjjj
www.it-ebooks.info
7KHSUHÀ[HVWKDWZLOOUHGXFHWKHVL]HRIWKHGDWDDUHPRYHGWRWKHSDJHKHDGHUDQGWKHDFWXDO
FROXPQYDOXHVDUHPRGLÀHGWRLQFOXGHSRLQWHUVWRWKH&,DVVKRZQLQ)LJXUH7KHYDOXHNNNUHS-
UHVHQWVWKHÀUVWWKUHHFKDUDFWHUVLQWKHSUHÀ[DQGNNN
The next step is dictionary compression, which scans the entire page instead of a single column.
7KHYDOXHVWKDWDUHUHSHDWHG³IRUH[DPSOHM³DUHPRYHGWRWKH&,VHFWLRQRIWKHSDJHKHDGHUDQG
replaced with references to the values, as shown in Figure 7-2.
0 0 [empty]
[empty] 1 [empty]
3kkk [empty] 1
The process of page compression with SSMS or T-SQL is exactly the same as with row compression.
7KHGLIIHUHQFHLVWKDW\RXVSHFLI\3$*(LQVWHDGRI52:
1. Repeat steps 1–7 from the “Compress a table or row using SSMS” exercise.
2. On the Select Compression Type page, select Page from the drop-down list.
4. &OLFN1H[W
6. &OLFN1H[W
7. Click Finish.
www.it-ebooks.info
Page compression with T-SQL
Just as with row compression, you have the ability to page compress data with T-SQL. The follow-
ing script will page compress the clustered index on the Sales.SalesOrderDetail table and one of the
nonclustered indexes:
--Use this code to page compress a nonclustered index on the Sales.SalesOrderDetail table
ALTER INDEX IX_SalesOrderDetail_ProductID
ON Sales.SalesOrderDetail
REBUILD WITH(DATA_COMPRESSION = PAGE);
The biggest advantages of using T-SQL are the portability and reusability of the code. While SSMS
RIIHUVDYHU\LQWXLWLYHDQGÁH[LEOHXVHULQWHUIDFHLI\RXZDQWWRUHSURGXFHWKHVWHSVLQDGLIIHUHQW
environment, you have to replicate the steps for each table or index. However, when you use T-SQL, it
is as simple as executing a query.
Moreover, with a little time and skill, you can build a process that loops over every table or index in
a database and yields the results of the stored procedure call. The following script estimates the space
savings for a single index on the Production.TransactionHistory table:
exec sp_estimate_data_compression_savings
@schema_name = 'Production',
@object_name = 'TransactionHistory',
@index_id = 1,
@partition_number = NULL,
@data_compression = 'row'
This script could take several minutes to execute depending on how much data there is and the
type of compression that you select. The size_with_requested_compession_setting(KB) column esti-
mates the size of the table or index if compressed. You should consider these results along with other
factors mentioned in the next section when determining whether to compress a table or index.
www.it-ebooks.info
Compression considerations
You will want to carefully consider whether to implement row or page compression prior to doing so.
$VZLWKPRVWWKLQJVFRPSUHVVLRQGRHVQRWFRPHZLWKRXWDFRVW:KLOHWKHGDWDUHPDLQVFRPSUHVVHG
in memory, when it is selected, it is decompressed. In addition, when new rows are inserted, the data
LVURZDQGRUSDJHFRPSUHVVHG:KHQURZVDUHXSGDWHGRUGHOHWHGURZFRPSUHVVHGREMHFWVVKRXOG
persist their current level of compression. However, page compression may be recalculated depend-
ing upon the number of changes that occur to the data.
$VDUHVXOWGHWHUPLQLQJZKLFKREMHFWVWRFRPSUHVVLVKLJKO\GHSHQGHQWXSRQZKDWDFWLYLWLHV
DUHSHUIRUPHGRQWKHFRUUHVSRQGLQJREMHFWV$VDJHQHUDOVWDUWLQJSRLQWWKRVHREMHFWVWKDWDUH
XSGDWHGIUHTXHQWO\VKRXOGEHURZFRPSUHVVHG7KRVHREMHFWVWKDWDUHPRVWO\UHDGVKRXOGEHSDJH
compressed. There are certainly other issues to think about, but these are a good starting point.
$OVRWDEOHVWKDWDUHRQO\DSSHQGHGWRLQRWKHUZRUGVGDWDLVDGGHGWRWKHHQGVKRXOGEHSDJH
compressed.
Summary
'HFLGLQJZKDWWRFRPSUHVVDQGZKHQWDNHVVRPHGHÀQLWHWHVWLQJDQGDQDO\VLVRI\RXUFXUUHQWGDWD-
base. Compression in most cases should save space, and it may even provide some performance
improvements to your overall database environment. The process of compression can be applied to
FHUWDLQREMHFWVLQ\RXUGDWDEDVHXVLQJ6606RU764/7KLVFKDSWHUH[DPLQHGWKHWZRW\SHVRIFRP-
pression, row and page, and you walked through the steps to implement both types of compression.
In addition, you learned how to estimate the amount of space savings that you can expect to gain by
either implementation.
www.it-ebooks.info
CHAPTER 8
Table partitioning
■ Partition a table.
The FRQFHSWRISDUWLWLRQLQJLVQRWYHU\GLIÀFXOWWRH[SODLQRUFRPSUHKHQG)RUH[DPSOHDVVXPH\RX
have a table that contains sales data, and you would like to divide the data into segments based on
the year of sale. Logically, the table would resemble Figure 8-1.
$VVKRZQLQ)LJXUHGLYLGLQJWKHGDWDE\\HDULVVLPSOHWRLOOXVWUDWH:KHQGDWDLVDGGHGWR
the table, it will be placed in the appropriate location based on the year of the sale. Partitioning
WDEOHVRIIHUVVHYHUDOEHQHÀWVSULPDULO\LQWKHIRUPRIVLPSOLÀHGPDLQWHQDQFHSRWHQWLDOSHUIRUPDQFH
improvements, and the ability to physically store data in a single database across several disks.
So how can partitioning be handled physically as data is inserted into a Microsoft SQL Server
table? The process consists of the following three steps:
3. $SSO\WKHSDUWLWLRQVFKHPHWRDWDEOH
105
www.it-ebooks.info
Creating a partition function
While the logical—or you could even say manual—partitioning process is straightforward, the pro-
FHVVRISDUWLWLRQLQJDWDEOHLQ64/6HUYHULVQRWWKDWPXFKPRUHGLIÀFXOW7KHÀUVWVWHSLVWRFUHDWHD
partition function. This function is what will be used to align or map the data to the corresponding
partition based on a column in the table.
Note Use the “Readeraid” style for all normal text paragraphs inside of a readeraid box.
Note In Microsoft SQL Server 2012, the number of partitions that can be created on a table
or index has been increased to 15,000.
Note For ease of maintenance and, often, improved performance, it is recommended that
\RXSODFHHDFKSDUWLWLRQLQDVHSDUDWHÀOHJURXS:KHQFUHDWLQJ\RXUÀOHJURXSV\RXPXVW
DOZD\VLQFOXGHRQHPRUHWKDQWKHQXPEHURIERXQGDULHVVSHFLÀHGLQ\RXUSDUWLWLRQIXQF-
WLRQ7KHÀOHJURXSVVKRXOGEHSODFHGRQGLIIHUHQWGLVNVWRSK\VLFDOO\VHSDUDWH,27KHIRO-
ORZLQJVFULSWDGGVVHYHUDOÀOHJURXSVWRWKH$GYHQWXUH:RUNVGDWDEDVH
ALTER DATABASE AdventureWorks2012
ADD FILEGROUP Sales2005;
--Use this code to add multiple filegroups to the AdventureWorks2012 database
USE master;
ALTER DATABASE AdventureWorks2012
ADD FILE
www.it-ebooks.info
(
NAME = 'Sales2005',
FILENAME = 'C:\SQLData\Sales2005File.ndf',
SIZE = 5MB,
MAXSIZE = 200MB,
FILEGROWTH = 5MB
)
TO FILEGROUP Sales2005;
www.it-ebooks.info
TO FILEGROUP Sales2008;
ALTER DATABASE AdventureWorks2012
ADD FILEGROUP Sales2009;
While SQL Server 2012 offers a very robust user interface that encompasses all the steps required
to partition a table or index, you will use T-SQL initially to work through the partitioning process.
1. Open the query editor in Microsoft SQL Server Management Studio (SSMS).
2. In the query editor, enter and execute the following T-SQL code:
The preceding script results in the division or partitioning illustrated in Figure 8-2.
Partition 1 2 3 4 5
www.it-ebooks.info
Creating a partition scheme
$VPHQWLRQHGSUHYLRXVO\XVLQJÀOHJURXSVDVSDUWRI\RXUSDUWLWLRQLQJVWUDWHJ\RIIHUVVHYHUDODGYDQ-
WDJHV7RHQVXUHWKDW\RXSODFHWKHFRUUHFWGDWDLQWKHFRUUHFWÀOHJURXS\RXZLOOXVHDpartition
scheme7KHSDUWLWLRQVFKHPHDVVLJQVRUPDSVWKHSDUWLWLRQVFUHDWHGE\WKHIXQFWLRQWRÀOHJURXSV
■ 7KHQDPHZKRVHGHÀQLWLRQDQGSXUSRVHDUHREYLRXV
■ The name of the partition function, which must be created prior to creating the scheme. The
SDUWLWLRQVFUHDWHGE\WKHVSHFLÀHGIXQFWLRQZLOOEHPDSSHGWRWKHÀOHJURXSVSURYLGHGZKHQ
creating the scheme.
■ $//ZKLFKZKHQXVHGOLPLWVWKHQXPEHURIÀOHJURXSVWRRQH
■ 7KHODVWDUJXPHQWZKLFKDFFHSWVDFRPPDGHOLPLWHGOLVWRIÀOHJURXSV
7KHSDUWLWLRQVDUHDVVLJQHGWRWKHÀOHJURXSVLQWKHRUGHULQZKLFKWKH\DUHOLVWHGVWDUWLQJZLWKWKH
ÀUVWSDUWLWLRQVSHFLÀHGLQWKHIXQFWLRQ
2. In the query editor, enter and execute the following T-SQL code:
■ $WDEOHZLWKRXWDFOXVWHUHGLQGH[KHDS
■ $FOXVWHUHGLQGH[
■ $XQLTXHLQGH[
■ $QRQFOXVWHUHGLQGH[
www.it-ebooks.info
:KHQSDUWLWLRQLQJDFOXVWHUHGLQGH[WKHFROXPQWKDWKDVEHHQVSHFLÀHGDVWKHSDUWLWLRQLQJFRO-
umn must be included in the clustering key. If you are partitioning a clustered or nonclustered index
that is not unique, the partitioning column is not required as part of the key. However, if you do not
include it, SQL Server will add it to the index by default. With regard to unique indexes, clustered or
nonclustered, you must include the partitioning column as part of the unique index key.
2. In the query editor, enter and execute the following T-SQL code:
USE [AdventureWorks2012];
IF(OBJECT_ID('dbo.PurchaseOrderHeader')) IS NOT NULL
DROP TABLE dbo.PurchaseOrderHeader
GO
CREATE TABLE dbo.[PurchaseOrderHeader](
[PurchaseOrderID] [int] NOT NULL,
[RevisionNumber] [tinyint] NOT NULL,
[Status] [tinyint] NOT NULL,
[EmployeeID] [int] NOT NULL,
[VendorID] [int] NOT NULL,
[ShipMethodID] [int] NOT NULL,
[OrderDate] [datetime] NOT NULL,
[ShipDate] [datetime] NULL,
[SubTotal] [money] NOT NULL,
[TaxAmt] [money] NOT NULL,
[Freight] [money] NOT NULL,
[TotalDue] money,
[ModifiedDate] [datetime] NOT NULL
);
3. ,Q2EMHFW([SORUHUH[SDQGWKH'DWDEDVHVIROGHU
4. ([SDQGWKH$GYHQWXUH:RUNVGDWDEDVH
7. 6HOHFW6WRUDJH_&UHDWH3DUWLWLRQIURPWKHFRQWH[WPHQX
www.it-ebooks.info
8. On the &UHDWH3DUWLWLRQ:L]DUGSDJHFOLFN1H[W
9. On the Select a Partitioning Column page, click the radio button next to the OrderDate
column.
www.it-ebooks.info
10. &OLFN1H[W
11. On the Select a Partition Function page, click Existing Partition Function and select fnPOOr-
derDate from the drop-down list.
12. On the Select a Partition Scheme page, click Existing Partition Scheme and select schPOOrder-
Date from the drop-down list.
13. Because you chose an existing function and scheme, the Map Partitions page should be pre-
ÀOOHGZLWKWKHFRUUHFWYDOXHVDVVKRZQLQWKHIROORZLQJLPDJH
14. &OLFN1H[W
16. &OLFN1H[W
www.it-ebooks.info
Partition a table using T-SQL
2. In the query editor, enter and execute the following T-SQL code:
USE [AdventureWorks2012];
IF(OBJECT_ID('dbo.PurchaseOrderHeader')) IS NOT NULL
DROP TABLE dbo.PurchaseOrderHeader
GO
CREATE TABLE dbo.[PurchaseOrderHeader](
[PurchaseOrderID] [int] NOT NULL,
[RevisionNumber] [tinyint] NOT NULL,
[Status] [tinyint] NOT NULL,
[EmployeeID] [int] NOT NULL,
[VendorID] [int] NOT NULL,
[ShipMethodID] [int] NOT NULL,
[OrderDate] [datetime] NOT NULL,
[ShipDate] [datetime] NULL,
[SubTotal] [money] NOT NULL,
[TaxAmt] [money] NOT NULL,
[Freight] [money] NOT NULL,
[TotalDue] money,
[ModifiedDate] [datetime] NOT NULL
) ON schPOOrderDate(OrderDate);
Note The preceding script creates a standard table. However, note that the last line
LQEROGWHOOV64/6HUYHUWRFUHDWHWKHWDEOHXVLQJWKHVSHFLÀHGSDUWLWLRQVFKHPHDQG
OrderDate as the partition column.
2. In the query editor, enter and execute the following T-SQL code:
USE [AdventureWorks2012];
CREATE CLUSTERED INDEX CIX_PurchaseOrderHeader_OrderDate
ON dbo.PurchaseOrderHeader(OrderDate)
3. ,Q2EMHFW([SORUHUH[SDQGWKH'DWDEDVHVIROGHU
4. ([SDQGWKH$GYHQWXUH:RUNVGDWDEDVH
6. Expand Indexes.
www.it-ebooks.info
7. Right-click the CIX_PurchaseOrderHeader_OrderDate index and select Properties from the
context menu.
9. Click Partition Scheme and select schPOOrderDate from the drop-down list.
10. Click under the Table Column column (toward the bottom of the screen).
2. In the query editor, enter and execute the following T-SQL code:
USE [AdventureWorks2012];
CREATE CLUSTERED INDEX CIX_PurchaseOrderHeader_OrderDate
ON dbo.PurchaseOrderHeader(OrderDate)
WITH(DROP_EXISTING = ON)
ON schPOOrderDate(OrderDate);
www.it-ebooks.info
Summary
This chapter presented a brief overview of SQL Server partitioning, including an introduction to the
key concepts and terms needed to gain a general understanding of the partitioning process. You fol-
lowed a step-by-step demonstration on how to create a partition function and scheme that you can
use to partition a table or index. You will need to investigate several advanced topics once you have
DFTXLUHGDEDVLFXQGHUVWDQGLQJRISDUWLWLRQLQJ$V\RXUGDWDEDVHDUFKLWHFWXUHVEHFRPHPRUHFRP-
plex and your data space requirements grow, consider looking into further partition topics, including
modifying, splitting, and aligning partitions.
www.it-ebooks.info