
Let’s Get Meta:

ETL Frameworks Using Biml

@KATEGRASS
Kate Grass:
The Who, What and Where
➢ Digital nomad, dog dork, lover of data, hiker, biker, runner(?)

➢ Over 15 years of experience with SQL Server, currently consulting full-time for a firm specializing in Health & Human Service analytics

➢ Various forests and campgrounds around the US

@KATEGRASS KATEGRASS.COM
Let’s Get Meta:

ETL Frameworks Using Biml
➢ETL Framework
➢Metadata
➢Biml

[Diagram: The Trifecta (Framework, Biml, Metadata)]

Samantha’s Solar Systems

ETL Systems Investigation:

The Good, the Bad, and the Ugly

The Good
• Solid ETL
The Bad
• Minimal logging
• No auditing
• No error handling
The Ugly
• Inconsistent design
• Lack of standards


Lack of Standardization:

The Problem

➢Different developers => Different designs
➢Different projects => Different designs
➢Different day => Different designs

Lack of Standardization:

The Effects
➢Learning curve for new developers
➢Development time required for changes
➢Increased likelihood of mistakes
➢Intelligent developers spending time
doing copy/paste

Minimal Logging:

The Problem
➢Output file generated
by SQL Agent job step
➢Very minimal data
➢Not persisted,
overwritten every time!

Minimal Logging:

The Effects
➢Troubleshooting anything becomes a
search for the needle in the haystack
➢Difficult to pinpoint where or why
performance has degraded

No Auditing:

The Problem
No record of ETL outcomes
➢Row counts
➢Table sizes
➢Deletions

No Auditing:

The Effects
➢No proactive alerting when things get
wonky
Ex: Row count deltas from source <> target
➢Difficult to answer questions about history
➢Hard to explain changes in processing time
that are found in the logs
* Oh wait, there is no logging, good thing we don’t have to worry about that last one

No Error Handling:

The Problem
➢Error file generated by SQL Agent job step,
limited information and it’s not persisted
➢Programmatic handling of errors limited to
restart attempts in SQL Agent job
➢No restartability within an application

No Error Handling:

The Effects
➢Relying on SQL Agent error logs is cumbersome and not
very helpful
➢No error handling means SOMEONE has to handle the
errors!
➢A failure within one package results in the entire job
step being re-run
➢Can’t learn from your mistakes if your mistakes aren’t
documented

The Solution

Develop an ETL Framework
Driven by Metadata
Powered by Biml

[Diagram: The Trifecta (Framework, Biml, Metadata)]

The Solution: Build or Buy?
The Goldilocks Dilemma: Options available but none were *just* right
➢Commercial products
➢Open source solutions

Starting with SSIS 2012, the Integration Services Catalog provides a great foundation for many “frameworkey” things

Final decision was to make use of available features and build our own framework

Let’s Get Meta:

ETL Frameworks Using Biml
➢ETL Framework
➢Metadata
➢Biml

Let’s Get Meta:

ETL Frameworks Using Biml
➢What is a Framework?
➢Characteristics
➢Benefits
➢Functional Components
➢Design Process

What is a Framework?
➢A group of design patterns, common
and reusable tools and processes that
facilitate the larger (ETL) process
➢A set of foundational building blocks
upon which (ETL) projects are built

What are the Desired Characteristics of a Framework?
➢As SIMPLE as possible
➢Standardized
➢Flexible
➢Metadata Driven

What are the Benefits of a Framework?
➢Abstract Complexity
➢Provide Transparency
➢Broader Support
➢Easier Troubleshooting
➢Standard Design
➢Consistent Code
➢Reduced Dev & Maintenance Time
➢Fewer Mistakes

What are the Functional Components
of a Framework?
… It Depends!
Are you building an Oreo or an iPad?
Business needs will dictate what is required and what is useful.

What are the Functional Components
of a Framework?
What is (or should be) repeated across projects?
Possibilities include:
➢Orchestration & Parallelism
➢Reporting & Monitoring
➢Deployment
➢Templates
➢Logging & Auditing
➢Notifications
➢Fault Tolerance / Recoverability

Framework Design Process: 

Component Selection
➢Identify your pain points
➢List all your wants
➢Highlight your needs
➢Review for feasibility
➢Rank for Value vs Complexity
➢Iterative process

Framework Design Process:

System Mapping
➢Components have been selected, now how do they relate?
➢What’s the order of operations?

Get your hands off that keyboard! Pencil and paper first.
And lots of eraser!

FrameworkInit.dtsx

OrchestrationRoot.dtsx

ThreadRoot.dtsx

Let’s Get Meta:

ETL Frameworks Using Biml
➢ETL Framework
➢Metadata
➢Biml

Let’s Get Meta:

ETL Frameworks Using Biml
➢What is Metadata?
➢Abstracting your Metadata
➢Utilizing your Metadata

What is Metadata?
In its simplest form
“Meta-X equals X about X”

Meta-cognition or the more fun, Meta-joke:


A horse, a duck and a bear walk into a bar. The
bartender says, “what is this, some kind of joke?”

Metadata equals data about data!


* Marko Ticak, Grammarly.com, 12/30/2016, https://www.grammarly.com/blog/meta-meaning/

What is Metadata?
“all the information that defines and
describes the structures, operations,
and contents of the . . . system”
Technical metadata, business
metadata and process metadata
* The Kimball Group, http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/technical-dw-bi-system-architecture/

Abstracting your Metadata
What are the definable properties shared across your
ETL systems?

➢Connection Strings
➢Max Threads
➢ETL pattern (Incremental vs Full Refresh)
➢Database metadata – Table names, column names, etc.
➢Frequency of ETL process – daily, weekly, etc.
➢… the list goes on

Abstracting your Metadata
What are the definable properties within your
systems?

➢Directory or filename patterns
➢Lag days or cut-off dates
➢Various arguments for executing external tasks (calling a zip program, for example)

➢… the list goes on

Technical Metadata and Business Metadata

Utilizing your Metadata
Inform ANY of the functional
components of your framework:
➢Orchestration & Parallelism
➢Reporting & Monitoring
➢Deployment
➢Templates
➢Logging & Auditing
➢Notifications
➢Fault Tolerance / Recoverability

Utilizing your Metadata
At the highest level in this solution,
metadata directs the orchestration.

“Maestro” package determines its actions based on the metadata and sends info and commands on to the child packages.

OrchestrationRoot.dtsx

Child dtsx

ThreadRoot.dtsx

Utilizing your Metadata
Metadata can be used like a blueprint
for your ETL projects, and you can
build your packages manually.

Or…

… You can Biml it!!!

Let’s Get Meta:

ETL Frameworks Using Biml
➢ETL Framework
➢Metadata
➢Biml

Let’s Get Meta:

ETL Frameworks Using Biml
➢What is Biml?
➢Options for using Biml
➢BimlScript
➢Biml on Steroids (or Metadata)
➢Where to learn more

What is Biml?
Business Intelligence Markup Language:

An XML dialect used to define BI assets, including relational models, SSIS and SSAS objects.

Created by Varigence

What is Biml?
➢Human readable and human writable
➢Compiles into the EXACT same .dtsx packages
that are created via drag and drop in SSDT
➢Biml is to SSIS Packages as HTML is to
webpages

Options for using Biml
➢BimlExpress - FREE plugin for Visual Studio
➢BimlOnline.com – beta version currently free
➢BimlStudio (the software formerly known as Mist)

All available through Varigence.com


Demo

This is a Biml file

*** Insert an example of a simple Biml file here
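
A minimal sketch of what a simple Biml file might look like. It is not the file from the original demo: the connection string, database, table and package names are hypothetical placeholders. It defines one OLE DB connection and one package containing a single Execute SQL task.

```xml
<!-- Hypothetical example: connection, table and package names are placeholders -->
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <!-- A connection that packages can reference by name -->
    <OleDbConnection Name="Staging"
                     ConnectionString="Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=Staging;Integrated Security=SSPI;" />
  </Connections>
  <Packages>
    <!-- ConstraintMode="Linear" runs the tasks in the order they are listed -->
    <Package Name="Truncate_Planet" ConstraintMode="Linear">
      <Tasks>
        <ExecuteSQL Name="SQL Truncate Planet" ConnectionName="Staging">
          <DirectInput>TRUNCATE TABLE dbo.Planet;</DirectInput>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>
```

Compiling a file like this should produce one .dtsx package, Truncate_Planet.dtsx, equivalent to one built by hand in SSDT.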

What’s better than Biml?
…BimlScript!
➢Embed small “nuggets” of VB or C# code
into your Biml
➢Automatically generate more Biml!

Code Nuggets
➢Control nuggets: Create variables, use conditional logic and control flow, call other code <# #>
➢Text nuggets: Take a C# or VB expression, evaluate it, and replace with the string value! <#= #>
➢Class nuggets: Write .NET methods to reuse throughout a script file <#+ #>
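
The three nugget types above can be sketched together in one small BimlScript file. This is an illustration only: the table names and the GetPackageName helper are hypothetical and not taken from the original demo.

```xml
<#@ template language="C#" #>
<#
    // Control nugget: plain C# that runs when the script is expanded.
    // The table names are hypothetical placeholders.
    var tableNames = new[] { "Planet", "Moon", "Star" };
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <# foreach (var tableName in tableNames) { #>
    <!-- Text nugget below: the expression is evaluated and its string value is written in place -->
    <Package Name="<#=GetPackageName(tableName)#>" ConstraintMode="Linear" />
    <# } #>
  </Packages>
</Biml>
<#+
    // Class nugget: a helper method that can be reused anywhere in this file.
    string GetPackageName(string tableName)
    {
        return "Load_" + tableName;
    }
#>
```

Expanding this script should yield plain Biml with three empty packages: Load_Planet, Load_Moon and Load_Star.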

Demo

Directives <#@ #>
➢import: Reference a namespace (equivalent to C# using or VB Imports statement)
<#@ import namespace="System.Data" #>
➢template: Specify language (C# is default) or tier
<#@ template tier="3" #>
➢include: Dump an entire file’s worth of code anywhere you want!
<#@ include file="Inc_Variables.txt" #>

<#=CallBimlScript #>
➢The cooler, older sibling of the include directive
➢References another file, which it calls like a method, allowing you to specify parameters to be passed into the callee script
➢Evaluates the code and replaces the directive with the string results
➢Callee script uses the property directive to specify name and type of parameters
<#@ property name="TableName" type="String" #>
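
A minimal sketch of the caller and callee pair described above. The file name Callee_LoadPackage.biml, the package naming and the "Planet" argument are hypothetical. The callee declares its parameter with the property directive and returns a Biml fragment.

```xml
<#@ property name="TableName" type="String" #>
<!-- Callee_LoadPackage.biml (hypothetical file name): emits one package for the table passed in -->
<Package Name="Load_<#=TableName#>" ConstraintMode="Linear" />
```

The caller passes the values in the same order as the callee declares its properties, and the returned text replaces the nugget:

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <!-- Calls the callee like a method; "Planet" is bound to the TableName property -->
    <#=CallBimlScript("Callee_LoadPackage.biml", "Planet")#>
  </Packages>
</Biml>
```

Unlike include, which pastes a file in as-is, this lets one callee file serve as a parameterized template for many packages.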

Metadata Driven Biml
What’s better than BimlScript?

Metadata
Use the stored values of shared properties to
create configurable packages

Metadata Driven Biml
Values known at design-time can be used by
Biml to create the packages.
Values known at run-time can be fed into
package variables.
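
A rough sketch of the design-time versus run-time split, under several assumptions: the EtlMetadata database, the dbo.TableMetadata table, its TableName column and the LagDays variable are all hypothetical. ExternalDataAccess.GetDataTable is the helper commonly used in BimlScript samples to run a query over an OLE DB connection string; check its availability in your Biml tooling.

```xml
<#@ template language="C#" #>
<#@ import namespace="System.Data" #>
<#
    // Hypothetical metadata store: connection string, table and column names are assumptions.
    var metadataConnection = "Provider=SQLNCLI11;Data Source=localhost;Initial Catalog=EtlMetadata;Integrated Security=SSPI;";

    // Design-time values: read once, when the Biml is expanded into packages.
    DataTable tables = ExternalDataAccess.GetDataTable(
        metadataConnection,
        "SELECT TableName FROM dbo.TableMetadata");
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <# foreach (DataRow row in tables.Rows) { #>
    <Package Name="Load_<#=row["TableName"]#>" ConstraintMode="Linear">
      <Variables>
        <!-- Run-time value: only a default here, meant to be overwritten when the package executes -->
        <Variable Name="LagDays" DataType="Int32">0</Variable>
      </Variables>
    </Package>
    <# } #>
  </Packages>
</Biml>
```

Adding a row to the metadata table and regenerating would then produce a new package without any hand editing.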

Demo
Adding Metadata to your BimlScript!

Learn More about Biml
➢BimlScript.com
➢SQL Server Central – Stairway to Biml
➢Biml blogs – just ask Uncle Google
➢SQL Saturdays
