Level 300
Bob Duffy
DTS 2000 SSIS 2005
1.75 Developers
Lookup patterns
Parallelize Script vs custom transform
Increase the efficiency of
Sharpen every aspect
Parallelize, partition,
Share pipeline
Partially blocking (asynchronous)
Blocking (asynchronous)
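The difference between these transformation types can be sketched outside SSIS. The Python below is only an illustration, not SSIS code: the function names and sample rows are hypothetical, and generators stand in for data flow buffers. It counts how many input rows each kind of transform must read before it can emit its first output row.

```python
import heapq

# Hypothetical sample rows; in SSIS these would arrive in pipeline buffers.
DATA = [{"id": i, "name": f"n{i}"} for i in range(1, 7)]


def source(rows, log):
    """Yield rows one at a time and record each read in `log`."""
    for row in rows:
        log.append(row["id"])
        yield row


def streaming_upper(rows):
    """Row-by-row transform: each row is emitted as soon as it is read."""
    for row in rows:
        yield {**row, "name": row["name"].upper()}


def partially_blocking_merge(left, right):
    """Merge two inputs already sorted by id; output starts before all input is read."""
    return heapq.merge(left, right, key=lambda r: r["id"])


def blocking_sort(rows):
    """Sort must buffer every input row before emitting the first one."""
    buffered = list(rows)
    buffered.sort(key=lambda r: r["id"])
    yield from buffered


def demo():
    """Return rows read before the first output for each transform type."""
    log = []
    next(streaming_upper(source(DATA, log)))
    streaming_reads = len(log)

    log = []
    odds = [r for r in DATA if r["id"] % 2 == 1]
    evens = [r for r in DATA if r["id"] % 2 == 0]
    next(partially_blocking_merge(source(odds, log), source(evens, log)))
    merge_reads = len(log)

    log = []
    next(blocking_sort(source(DATA, log)))
    sort_reads = len(log)
    return streaming_reads, merge_reads, sort_reads
```

Calling `demo()` shows the row-by-row transform reading a single row, the merge reading only part of its input, and the sort reading everything before producing output, which is why blocking transforms cost the most memory.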
http://msdn.microsoft.com/en-us/library/ms345346.aspx
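[Figure: test topology. Source data on EMC CX600 storage feeds the source servers, which run SSIS, over 2 Gb Fiber Channel. The source servers connect over 1 Gb Ethernet connections to the destination server, which runs SQL Server. The destination database sits on an EMC CX3-80 over 4 Gb Fiber Channel.]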
Quantity: 4
Make: Unisys
Model: ES3220L
OS: Windows 2008 x64 Enterprise Edition
CPU: 2 socket quad core Intel Xeon processors @ 2.0GHz
RAM: 4 GB
HBA: 1 dual port 4Gbit Emulex FC
NIC: Intel PRO1000/PT dual port
Database: Pre-release build of SQL Server 2008 Integration Services (V10.0.1300.4)
Storage: 2x EMC CLARiiON CX600 (ea: 45 spindles, 4 2Gbit FC)
[Figure: the Orders table is split into 56 partitions (Partition 1 ... Partition 56). Each partition is loaded by its own SSIS instance from the matching source file, orders.tbl.1 ... orders.tbl.56.]
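The one-file-per-partition layout can be sketched as a parallel load in which every worker handles exactly one file. This is a minimal Python illustration, not the SSIS packages used in the test: the helper names, the pipe-delimited sample rows, and the worker count of 8 are assumptions, and "loading" is reduced to counting rows.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List, Tuple


def write_sample_files(directory: Path, partitions: int, rows_per_file: int) -> List[Path]:
    """Create hypothetical orders.tbl.N files, one per target partition."""
    paths = []
    for n in range(1, partitions + 1):
        path = directory / f"orders.tbl.{n}"
        path.write_text("".join(f"{n}|{i}|\n" for i in range(rows_per_file)))
        paths.append(path)
    return paths


def load_partition(path: Path) -> Tuple[int, int]:
    """Stand-in for one loader: read one file and report (partition, row count)."""
    partition = int(path.suffix.lstrip("."))  # "orders.tbl.7" -> 7
    with path.open() as handle:
        row_count = sum(1 for _ in handle)
    return partition, row_count


def load_all(paths: List[Path], workers: int = 8) -> Dict[int, int]:
    """Load all files concurrently; no two workers touch the same partition."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(load_partition, paths))


def demo() -> Dict[int, int]:
    """Build 56 small sample files in a temp directory and load them in parallel."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_sample_files(Path(tmp), partitions=56, rows_per_file=10)
        return load_all(paths)
```

Because each file maps to a single partition, the workers never contend for the same target, which is the property the diagram relies on.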
(Package details removed to protect the innocent)
Follow Microsoft Development Guidelines
Iterative design, development & testing
Modularity
Build custom components for maximum re-use
Concise naming conventions
Presentable layout
Annotations
Error Logging
Configurations
Get as close to the data as possible
Limit number of columns
Filter number of rows
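A small sketch of the last two points, using an in-memory SQLite table as a stand-in source. The table, column names, and row counts are hypothetical; the point is only that the source query names the columns it needs and filters rows before anything enters the pipeline.

```python
import sqlite3


def build_source() -> sqlite3.Connection:
    """Create a hypothetical orders table with a wide, unneeded comment column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER, customer_id INTEGER, status TEXT, comment TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [(i, i % 10, "OPEN" if i % 4 == 0 else "CLOSED", "x" * 200) for i in range(1000)],
    )
    return conn


def extract_everything(conn: sqlite3.Connection):
    """Pulls every column and every row, then leaves filtering to later steps."""
    return conn.execute("SELECT * FROM orders").fetchall()


def extract_narrow(conn: sqlite3.Connection):
    """Pulls only the needed columns and lets the source filter the rows."""
    return conn.execute(
        "SELECT order_id, customer_id FROM orders WHERE status = ?", ("OPEN",)
    ).fetchall()
```

With this sample data, `extract_narrow` returns a quarter of the rows and half of the columns, and the 200-character comment column never leaves the source.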
Tradeoff: memory vs. performance
Full Cache is optimal, but uses the most memory and also takes time to load
Partial Cache can be expensive, since it populates on the fly using singleton SELECTs
No Cache uses no memory, but takes longer
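The three cache modes can be compared with a small Python sketch. It is not the SSIS Lookup component: the `CountingLookup` class, the customers table, and the key list are hypothetical, and an in-memory SQLite database stands in for the reference table. The query counter makes the tradeoff visible.

```python
import sqlite3


class CountingLookup:
    """Hypothetical reference table wrapper that counts queries sent to the database."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.queries = 0

    def fetch_one(self, key):
        """Singleton SELECT for one key."""
        self.queries += 1
        row = self.conn.execute(
            "SELECT name FROM customers WHERE id = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def fetch_all(self):
        """One query that loads the whole reference table into a dict."""
        self.queries += 1
        return dict(self.conn.execute("SELECT id, name FROM customers"))


def full_cache(ref, keys):
    cache = ref.fetch_all()  # all reference rows held in memory up front
    return [cache.get(k) for k in keys]


def partial_cache(ref, keys):
    cache = {}  # grows on the fly, one singleton SELECT per distinct key
    out = []
    for k in keys:
        if k not in cache:
            cache[k] = ref.fetch_one(k)
        out.append(cache[k])
    return out


def no_cache(ref, keys):
    return [ref.fetch_one(k) for k in keys]  # one query per input row


def demo():
    """Return the number of queries each mode issues for the same input keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(i, f"c{i}") for i in range(1, 101)]
    )
    keys = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4]
    counts = {}
    for mode in (full_cache, partial_cache, no_cache):
        ref = CountingLookup(conn)
        mode(ref, keys)
        counts[mode.__name__] = ref.queries
    return counts
```

For these ten input rows with four distinct keys, the full cache issues one large query, the partial cache issues one query per distinct key, and no cache issues one query per row.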
instead
Custom
Can be written in any .NET language
Must be signed, registered and installed but
http://technet.microsoft.com/en-us/library/bb961995.aspx
http://blogs.msdn.com/sqlperf/archive/2008/02/27/etl-world-record.aspx