SQL Server Integration Services Design Patterns

Second Edition

Andy Leonard
Tim Mitchell

Matt Masson

Jessica Moss

Michelle Ufford
Apress
Contents
First-Edition Foreword xv

About the Authors xvii

About the Technical Reviewer xix

Chapter 1: Metadata Collection 1

About SQL Server Data Tools 1

A Peek at the Final Product 1

SQL Server Metadata Catalog 3

sys.dm_os_performance_counters 3

sys.dm_db_index_usage_stats 3

sys.dm_os_sys_info 3

sys.tables 3

sys.indexes 3

sys.partitions 4

sys.allocation_units 4

Setting Up the Central Repository 4

The Iterative Framework 6

Metadata Collection 14

Summary 26

Chapter 2: Execution Patterns 27

Building the Demonstration SSIS Package 27

Debug Execution 28

Command-Line Execution 29

Execute Package Utility 30

The SQL Server 2014 Integration Services Service 30

Integration Services Catalogs 30

Integration Server Catalog Stored Procedures 31

Scheduling SSIS Package Execution 53

Scheduling an SSIS Package 53

Scheduling a File System Package 54

Running SQL Server Agent Jobs with the Custom Execution Framework 55

Running the Custom Execution Framework with SQL Server Agent 56

Execute Package Task 57

Execution from Managed Code 58

The Demo Application 58

The frmMain Form 59

Conclusion 70

Chapter 3: Scripting Patterns 71

The Toolset 71

Should I Use Script? 72

The Script Editor 72

The Script Task 75

The Script Component 77

Script Maintenance Patterns 78

Code Reuse 78

Source Control 79

Scripting Design Patterns 79

Connection Managers and Scripting 80

Variables 82

Naming Patterns 85

Conclusion 85

Chapter 4: SQL Server Source Patterns 87

Setting Up a Source 87

Selecting a SQL Server Connection Manager and Provider 88

ADO.NET 89

ODBC 89

OLE DB 91

Creating a SQL Server Source Component 92

Writing a SQL Server Source Component Query 95

ADO.NET Data Access 95

OLE DB Data Access 96

Waste Not, Want Not 97

Data Translations 97

Source Assistant 97

Summary 99

Chapter 5: Data Correction with Data Quality Services 101

Overview of Data Quality Services 101

Using the Data Quality Client 102

Using DQS with SSIS 108

DQS Cleansing Transform 108

DQS Extensions on CodePlex 113

Cleansing Data in the Data Flow 114

Handling the Output of the DQS Cleansing Transform 114

Performance Considerations 117

Approving and Importing Cleansing Rules 121

Conclusion 123

Chapter 6: DB2 Source Patterns 125

DB2 Database Family 125

Selecting a DB2 Provider 126

Find the Database Version 126

Pick Provider Vendor 127

Connecting to a DB2 Database 127

Querying the DB2 Database 130

DB2 Source Component Parameters 131

DB2 Source Component Dynamic Queries 132

Summary 133

Chapter 7: Flat File Source Patterns 135

Flat File Sources 135

Moving to SSIS! 136

Strong-Typing the Data 138

Introducing a Data-Staging Pattern 140

Variable-Length Rows 143

Reading into a Data Flow 144

Splitting Record Types 145

Terminating the Streams 146

Header and Footer Rows 147

Consuming a Footer Row 148

Consuming a Header Row 150

Producing a Footer Row 152

Producing a Header Row 159

The Archive File Pattern 163

Summary 169

Chapter 8: Loading a PDW Region in APS 171

Massively Parallel Processing 171

APS Appliance Overview 172

Hardware Architecture 172

Software Architecture 173

Shared-Nothing Architecture 175

Clustered Columnstore Indexes 175

Loading Data 176

DWLoader vs. Integration Services 176

ETL vs. ELT 177

Data Import Pattern for PDW 178

Prerequisites 178

Preparing the Data 179

Package Overview 181

The Data Source 181

The Data Transformation 183

The Data Destination 184

Multithreading 189

Limitations 190

Summary 191

Chapter 9: XML Patterns 193

Using the XML Source 193

Dealing with Multiple Outputs 194

Making Things Easier with XSLT 200

Using a Script Component 203

Configuring the Script Component 203

Processing XML with XmlSerializer 209

Processing XML with XmlReader and LINQ to XML 210

Conclusion 212

Chapter 10: Expression Language Patterns 213

Getting to Know the Expression Language 213

What Is the Expression Language? 213

Why Use Expressions? 214

Language Essentials 215

Limitations 215

Putting the Expression Language to Work 216

Package Expressions 216

Variable Expressions 217

Connection Managers 217

Project-Level Connection Managers 219

Control Flow 219

Data Flow Expressions 222

Conclusion 226

Chapter 11: Data Warehouse Patterns 227

Incremental Loads 227

What Is an Incremental Load? 227

Why Incremental Loads? 228

The Slowly Changing Dimension 228

Incremental Loads of Fact Data 228

Incremental Loads in SSIS 228

Native SSIS Components 229

The Slowly Changing Dimension Wizard 232

The MERGE Statement 234

Change Data Capture (CDC) 237

Data Errors 242

Simple Errors 242

Missing Data 243

Coding to Allow Errors 246

Data Warehouse ETL Workflow 248

Dividing Up the Work 248

One Package = One Unit of Work 249

Conclusion 250

Chapter 12: OData Source 251

Understanding the OData Protocol 251

Data Type Mappings 252

Query Options 253

Configuring the OData Connection Manager 254

Enabling Microsoft Online Services Authentication 254

Configuring the Source Component 256

Overriding Data Types 259

Conclusion 260

Chapter 13: Slowly Changing Dimensions 261

The Slowly Changing Dimension Transform 261

Running the Wizard 262

Using the Transformations 267

Optimizing Performance 268

Third-Party SCD Components 269

Merge Pattern 270

Handling Type 1 Changes 271

Handling Type 2 Changes 272

Conclusion 272

Chapter 14: Loading the Cloud 275

Interacting with the Cloud 275

Incremental Loads to Azure SQL Database 276

Change Detection 276

New Rows (Only) 276

Building the Cloud Loader 277

Conclusion 280

Chapter 15: Logging and Reporting Patterns 281

Package Logging and Reporting 281

Setting Up Package Logging 281

Reporting on Package Logging 282

Design Pattern: Package Executions 283

Catalog Logging and Reporting 283

Setting Up Catalog Logging 283

Catalog Tables 285

Changing Logging Levels After the Fact 286

Design Patterns 287

Changing the Logging Level 287

Using the Existing Reports 289

Creating New Reports 290

Summary 291

Chapter 16: Parent-Child Patterns 293

Master Package Pattern 293

Assign the Child Package 294

Configure Parameter Binding 295

Dynamic Child Package Pattern 296

Child-to-Parent Variable Pattern 302

Conclusion 303

Chapter 17: Configuration 305

Parameters 305

Configuring Your Package Using Parameters 307

Using the Parameterize Dialog 309

Creating Visual Studio Configurations 310

Specifying Entry-Point Packages 312

Connection Managers 313

Parameter Configuration on the Server 313

Default Configuration 314

Server Environments 315

Default Parameter Values Using T-SQL 317

Package Execution Through the SSIS Catalog 317

Parameters with DTEXEC 320

Projects on the File System 320

Projects in the SSIS Catalog 321

Dynamic Configurations 322

Configuring from a Database Table 323

Setting Values Using a Script Task 326

Dynamic Package Executions 327

Conclusion 329

Chapter 18: Deployment 331

Project Deployment Model 331

SSIS Catalog 332

Deployment Methods 334

Deployment from the Command Line 335

Deployment Using Custom Code 336

Deployment Using PowerShell 337

Deployment Using SQL 338

Package Deployment Model 339

Conclusion 341

Chapter 19: Business Intelligence Markup Language 343

A Brief History of Business Intelligence Markup Language 343

Building Your First Biml File 344

Building a Basic Incremental Load SSIS Package 347

Creating Databases and Tables 347

Adding Metadata 349

Specifying a Data Flow Task 350

Adding Transforms 350

Testing the Biml 356

Using Biml as an SSIS Design Patterns Engine 360

Time for a Test 367

Conclusion 368

Chapter 20: Biml and SSIS Frameworks 369

Using Biml with an SSIS Framework 369

Adding SSIS Package Metadata to the Framework 369

Executing the Biml File 374

Generating the SSIS Command-Line 375

Summarizing 376

Appendix A: Evolution of an SSIS Framework 377

Starting in the Middle 377

Introducing SSIS Applications 387

A Note About Relationships 389

Retrieving SSIS Applications in T-SQL 392

Retrieving SSIS Applications in SSIS 396

Monitoring Execution 399

Building Application Instance Logging 399

Building Package Instance Logging 406

Building Error Logging 410

Reporting Execution Metrics 420

Conclusion 434

Index 435
