02b SSIS Best Practices PDF
Level 300
Bob Duffy
DTS 2000
SSIS 2005
1.75 Developers
Optimize and Stabilize the basics
Measure
Tune
Parallelize
Lookup patterns
Script vs custom transform
Sharpen
Share
Parallelize, partition, pipeline
Spend
Row Based (synchronous)
Partially Blocking (asynchronous)
Blocking (asynchronous)
http://msdn.microsoft.com/en-us/library/ms345346.aspx
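The three transform categories above differ in how much input they must read before they can emit their first output row. The following is a minimal Python sketch of that idea, not SSIS code: the function names and the `Counted` helper are hypothetical, an uppercase conversion stands in for a row-based transform, a merge of two sorted inputs stands in for a partially blocking one, and a sort stands in for a fully blocking one.

```python
import heapq
from typing import Iterable, Iterator


class Counted:
    """Hypothetical helper: wraps an input and counts rows pulled from it."""

    def __init__(self, rows: Iterable[str]) -> None:
        self._it = iter(rows)
        self.consumed = 0

    def __iter__(self) -> "Counted":
        return self

    def __next__(self) -> str:
        value = next(self._it)
        self.consumed += 1
        return value


def row_based(rows: Iterable[str]) -> Iterator[str]:
    """Synchronous: each input row yields one output row immediately."""
    for row in rows:
        yield row.upper()


def partially_blocking(left: Iterable[str], right: Iterable[str]) -> Iterator[str]:
    """Merge of two sorted inputs: needs a row from each side, not all rows."""
    yield from heapq.merge(left, right)


def blocking(rows: Iterable[str]) -> Iterator[str]:
    """Sort: must read the whole input before the first output row."""
    yield from sorted(rows)


if __name__ == "__main__":
    src = Counted(["d", "b", "a", "c"])
    first = next(row_based(src))
    print("row based:", first, "after reading", src.consumed, "row(s)")

    left, right = Counted(["a", "c"]), Counted(["b", "d"])
    first = next(partially_blocking(left, right))
    print("partially blocking:", first, "after reading",
          left.consumed + right.consumed, "row(s)")

    src = Counted(["d", "b", "a", "c"])
    first = next(blocking(src))
    print("blocking:", first, "after reading", src.consumed, "row(s)")
```

The fewer rows a transform has to hold back, the sooner downstream components can start working, which is why blocking transforms tend to dominate memory use and latency in a data flow.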
Destination server runs SQL Server
Database
EMC CX3-80
1 Gb Ethernet connections
4 Gb Fiber Channel
Source servers:
Unisys ES3220L
2 sockets each with 4 core
Intel 2 GHz CPUs
4 GB RAM
Windows Server 2008
SQL Server 2008
Destination server:
Unisys ES7000/One
32 sockets each with dual core
Intel 3.4 GHz CPUs
256 GB RAM
Windows Server 2008
SQL Server 2008
Make: Unisys
Model: ES7000/one Enterprise Server
OS: Microsoft Windows Server 2008 x64 Datacenter Edition
CPU: 32 socket dual core Intel Xeon 3.4 GHz (7140M)
RAM: 256 GB
HBA: 8 dual port 4Gbit FC
NIC: Intel PRO/1000 MT Server Adapter
Database: Pre-release build of SQL Server 2008 Enterprise Edition (V10.0.1300.4)
Storage: EMC Clariion CX3-80 (Qty 1); 11 trays of 15 disks; 165 spindles x 146 GB 15Krpm; 4Gbit FC
Quantity: 4
Make: Unisys
Model: ES3220L
OS: Windows2008 x64 Enterprise Edition
CPU: 2 socket quad core Intel Xeon processors @ 2.0GHz
RAM: 4 GB
HBA: 1 dual port 4Gbit Emulex FC
NIC: Intel PRO1000/PT dual port
Database: Pre-release build of SQL Server 2008 Integration Services (V10.0.1300.4)
Storage: 2x EMC CLARiiON CX600 (ea: 45 spindles, 4 2Gbit FC)
[Diagram: the Orders table is split into 56 partitions (Partition 1 to Partition 56). Each source file orders.tbl.N (orders.tbl.1 to orders.tbl.56) is loaded by its own SSIS process into a table Orders_N (Orders_1 to Orders_56) that corresponds to Partition N.]
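The layout in this diagram, one source file per partition with each file loaded by its own process, can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the package used in the benchmark: `load_partition` and `parallel_load` are hypothetical names, lists of strings stand in for the orders.tbl.N files, and copying rows stands in for a bulk insert into Orders_N.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def load_partition(n: int, source_rows: List[str]) -> Tuple[str, List[str]]:
    """Load the rows of source file n into its own target table Orders_<n>."""
    target = f"Orders_{n}"
    # A real loader would bulk insert here; this sketch only copies the rows.
    return target, list(source_rows)


def parallel_load(sources: Dict[int, List[str]], workers: int = 4) -> Dict[str, List[str]]:
    """Run one independent load per partition, several at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(load_partition, n, rows)
                   for n, rows in sources.items()]
        # Each load writes to a different target, so no two loads contend.
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Stand-ins for orders.tbl.1, orders.tbl.2 and orders.tbl.3.
    sources = {1: ["row-a", "row-b"], 2: ["row-c"], 3: []}
    for table, rows in sorted(parallel_load(sources).items()):
        print(table, len(rows))
```

Because every load has its own source and its own target, the loads share no state and the degree of parallelism can be tuned simply by changing the number of workers.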
(Package details removed to protect the innocent)
Follow Microsoft Development Guidelines
Platform considerations
Process Modularity
Package Modularity
Component Modularity
Annotations
Error Logging
Configurations
BEFORE:
select
dbo.Tbl_Dim_Store.SK_Store_ID
, Tbl_Dim_Store.Store_Num
, isnull(dbo.Tbl_Dim_Merchant_Division.SK_Merch_Div_ID, 0) as SK_Merch_Div_ID
from dbo.Tbl_Dim_Store
left outer join dbo.Tbl_Dim_Merchant_Division
on dbo.Tbl_Dim_Store.Merch_Div_Num = dbo.Tbl_Dim_Merchant_Division.Merch_Div_Num
where Current_Row = 1

AFTER:
select * from etl.uf_FactStoreSales(@Date)
Use the power of TSQL to clean the data 'on the fly'
Avoid overdesign
Maximize Parallelism
Minimize blocking
Minimize ancillary data
Three Modes of Operation
Tradeoff memory vs. performance
Can use Merge Join component instead
Full Cache is optimal, but uses the most memory and takes time to load
Partial Cache can be expensive since it populates on the fly using singleton SELECTs
No Cache uses no memory, but takes longer because every row queries the reference table
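The memory versus round-trip tradeoff between the three cache modes can be made concrete by counting queries. The sketch below is a simplified Python model, not the SSIS Lookup component itself: `ReferenceTable` is a hypothetical stand-in for the database reference table, and the partial cache here has no size limit and also remembers misses, which is an assumption of the sketch.

```python
from typing import Callable, Dict, List, Optional


class ReferenceTable:
    """Hypothetical stand-in for a reference table; counts round trips."""

    def __init__(self, rows: Dict[str, int]) -> None:
        self._rows = dict(rows)
        self.queries = 0

    def select_all(self) -> Dict[str, int]:
        self.queries += 1  # one bulk SELECT that loads everything
        return dict(self._rows)

    def select_one(self, key: str) -> Optional[int]:
        self.queries += 1  # one singleton SELECT
        return self._rows.get(key)


def lookup_full_cache(table: ReferenceTable, keys: List[str]) -> List[Optional[int]]:
    """Full cache: load the whole table up front, then match in memory."""
    cache = table.select_all()
    return [cache.get(key) for key in keys]


def lookup_partial_cache(table: ReferenceTable, keys: List[str]) -> List[Optional[int]]:
    """Partial cache: query on first sight of a key, reuse the answer after."""
    cache: Dict[str, Optional[int]] = {}
    results = []
    for key in keys:
        if key not in cache:
            cache[key] = table.select_one(key)
        results.append(cache[key])
    return results


def lookup_no_cache(table: ReferenceTable, keys: List[str]) -> List[Optional[int]]:
    """No cache: every incoming row costs one singleton SELECT."""
    return [table.select_one(key) for key in keys]


def count_queries(strategy: Callable[[ReferenceTable, List[str]], list],
                  rows: Dict[str, int], keys: List[str]) -> int:
    """Run a strategy against a fresh table and report its round trips."""
    table = ReferenceTable(rows)
    strategy(table, keys)
    return table.queries


if __name__ == "__main__":
    rows = {"A": 1, "B": 2, "C": 3}
    keys = ["A", "B", "A", "A", "C", "B"]
    for strategy in (lookup_full_cache, lookup_partial_cache, lookup_no_cache):
        print(strategy.__name__, count_queries(strategy, rows, keys))
```

With six incoming rows over three distinct keys, the full cache issues one bulk query, the partial cache one query per distinct key, and the no-cache mode one query per row. The full cache pays for that with the memory and load time needed to hold the whole reference table.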
Custom components
Scripts
http://sqlcat.com
http://technet.microsoft.com/en-us/library/bb961995.aspx
http://blogs.msdn.com/sqlperf/archive/2008/02/27/etl-world-record.aspx