
Named Sweden's best Microsoft training center

Power and inspiration that shows

Data Quality and Master Data Management


Jörgen Hasselgren, HNO3 Business Intelligence, jorgen@hno3.se, @HASSELBRANCH

2012-02-03

HNO3 Business Intelligence AB

Why Data Quality is Important

Top 3 impediments

Source: Information Week Reports, 2011

Top Barrier for BI

Source: Information Week Reports, 2011



DQ is the top MDM driver

Source: Information Week Reports, 2011



Master Data: How Can We Manage It?

Use the ERP system
Manage it in the data warehouse
Create a Master Data Integration Hub
Enterprise MDM

Data Warehouse MDM


Transactional systems

Data Warehouse

Master Data Integration Hub


Transactional systems, Integration Hub, CRM

Data Warehouse

Data Steward

Enterprise MDM
Transactional systems, CRM, MDM, Data Warehouse

Data Steward

Data Steward

In metadata, a data steward is a person who is responsible for maintaining a data element in a metadata registry.

What Is Data Quality? What Will I Cover?

People
Process
Technology


Data Quality: Different Aspects

Data specification
Duplicates
Accuracy
Consistent and synchronized
Available in time
Covers everything


Plan
Roles: Department manager, Salesperson, Controller, Account manager, IT operations
Terms: Customer ID (A A A F/U/A F), Product name (A A A A F/U/A A), Account

F = Receives data, U = Maintains data, A = Uses data



Enterprise Information Management in SQL Server 2012


Data Quality Services: knowledge-based data cleansing and matching
Master Data Services: master and reference data management
Integration Services: ETL and data integration tool

Audience poll: how many of you use any of these 3 features today?

Common Data Quality Issues


Data Quality Issue | Sample Data Problem

Standard: Are data elements consistently defined and understood? | Gender code = M, F, U in one system and Gender code = 0, 1, 2 in another system
Complete: Is all necessary data present? | 20% of customers' last names are blank, 50% of zip codes are 99999
Accurate: Does the data accurately represent reality or a verifiable source? | A supplier is listed as Active but went out of business six years ago
Valid: Do data values fall within acceptable ranges? | Salary values should be between 60,000 and 120,000
Unique: Does data appear several times? | Both John Ryan and Jack Ryan appear in the system. Are they the same person?
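The issue categories above can be illustrated with a few simple rule checks. This is a minimal sketch in plain Python, not how DQS implements them. The sample records, field names, and the `check` function are hypothetical. The thresholds come from the sample problems on the slide. Accuracy is left out because it requires an external, verifiable source to compare against.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
customers = [
    {"id": 1, "first": "John", "last": "Ryan", "gender": "M", "zip": "98134", "salary": 75000},
    {"id": 2, "first": "Jack", "last": "Ryan", "gender": "1", "zip": "99999", "salary": 45000},
    {"id": 3, "first": "Ann", "last": "", "gender": "F", "zip": "02114", "salary": 110000},
    {"id": 4, "first": "John", "last": "Ryan", "gender": "M", "zip": "98134", "salary": 75000},
]

ALLOWED_GENDER = {"M", "F", "U"}   # Standard: one agreed code set
PLACEHOLDER_ZIPS = {"99999"}       # Complete: a placeholder counts as missing
SALARY_RANGE = (60_000, 120_000)   # Valid: acceptable range from the slide


def check(records):
    """Return (record id, issue category) pairs for every rule violation."""
    issues = []
    # Unique: count exact repeats of the same name and zip code.
    seen = Counter((r["first"], r["last"], r["zip"]) for r in records)
    for r in records:
        if r["gender"] not in ALLOWED_GENDER:
            issues.append((r["id"], "standard"))
        if not r["last"] or r["zip"] in PLACEHOLDER_ZIPS:
            issues.append((r["id"], "complete"))
        if not SALARY_RANGE[0] <= r["salary"] <= SALARY_RANGE[1]:
            issues.append((r["id"], "valid"))
        if seen[(r["first"], r["last"], r["zip"])] > 1:
            issues.append((r["id"], "unique"))
    return issues


for record_id, issue in check(customers):
    print(record_id, issue)
```

Exact-duplicate counting only catches identical records. The John Ryan / Jack Ryan case needs fuzzy matching instead.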

What does Data Quality Look Like?

Who is involved with Data Quality


Audience Poll: who is responsible for Data Quality in your Organization?

DBA

Data Steward / Business Analyst

BI Developer

Requirements for Data Quality Solutions

Monitoring

Cleansing

Profiling

Matching
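As a rough illustration of the matching requirement, the sketch below scores name pairs with the standard library's `difflib.SequenceMatcher`. This is an assumption chosen for illustration and is not the algorithm DQS uses. The function names and the 0.6 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_candidates(names, threshold=0.6):
    """Return pairs of names whose similarity reaches the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = similarity(names[i], names[j])
            if score >= threshold:
                pairs.append((names[i], names[j], round(score, 2)))
    return pairs


# Candidate pairs still need review by a data steward before merging.
for pair in match_candidates(["John Ryan", "Jack Ryan", "Maria Lopez"]):
    print(pair)
```

The threshold is a trade-off: a lower value surfaces more candidate duplicates for review, a higher value produces fewer false matches.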


DQS State lives in catalogs

Data Quality Knowledge Base (DQKB)


Knowledge Base
Domains (represent the data type)
Values
Rules & Relations
3rd party Reference Data
Composite Domains
Matching Policy

DQS Build Workflow


Create a KB / Domain Management

Create a new KB or open an existing one
Define Domains and their data types, set up reference data, domain rules, and term-based relationships
Define Composite Domains to combine multiple simple domains into a single complex domain entity

Define Matching Policy

Point to example source data
Define Matching Rules

Run Data Discovery

Prime the KB with knowledge: values and terms go into the various KB Domains
Import clean knowledge data from a table or type in manual entries
Correct data manually and define the standard for what is correct

Publish the KB

Data Projects can reference and use the KB once it is published
You can go back and edit a KB as needed, but data projects cannot see edits until it is published again
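The build-then-publish behavior described above can be pictured with a toy model. This is a hedged sketch only: it is not the DQS API, and the `KnowledgeBase` class, its methods, and the status strings are all hypothetical. It shows the one point that matters here, which is that cleansing reads the published snapshot and not the draft.

```python
import copy


class KnowledgeBase:
    """Toy model of a knowledge base with a draft and a published snapshot.

    Not the DQS API; every name here is hypothetical.
    """

    def __init__(self, name):
        self.name = name
        self.draft = {}        # domain name -> {"values": set, "synonyms": dict}
        self.published = None  # snapshot visible to data projects

    def add_domain(self, domain, values=(), synonyms=None):
        """Create or overwrite a domain in the draft."""
        self.draft[domain] = {
            "values": set(values),
            "synonyms": dict(synonyms or {}),
        }

    def publish(self):
        """Copy the draft so later edits stay invisible until republished."""
        self.published = copy.deepcopy(self.draft)

    def cleanse(self, domain, value):
        """Cleanse one value against the published snapshot only."""
        if self.published is None or domain not in self.published:
            raise LookupError(f"domain {domain!r} is not published")
        entry = self.published[domain]
        if value in entry["values"]:
            return value, "correct"
        if value in entry["synonyms"]:
            return entry["synonyms"][value], "corrected"
        return value, "unknown"


kb = KnowledgeBase("Sports")
kb.add_domain("Team Type", values={"Basketball", "Baseball"})
kb.publish()

# Edit the draft: add a term-based correction, but do not publish yet.
kb.add_domain(
    "Team Type",
    values={"Basketball", "Baseball"},
    synonyms={"Baseball MLB": "Baseball"},
)
before = kb.cleanse("Team Type", "Baseball MLB")  # still uses the old snapshot
kb.publish()
after = kb.cleanse("Team Type", "Baseball MLB")   # now sees the correction
print(before, after)
```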

Build

Use

Monitor/Configure

DQS Client 3 panes

Building Your Knowledge


Account ID
A124324 7676862 4934235

Demo 1: Introducing YOUR DATA


Home Team | Team Type | Revenue Type | Sales | Home Arena | Address Line | City | State
Boston Celtics | Basketball | Food & Beverages | 655 | TD Garden | 100 Legends Way | Boston | MA
New York Yankees | Baseball | Music | 389 | Yankee Stadium | East 161st Street & River Avenue | NY | NY
Seattle Mariners | Baseball MLB | Music | 443 | Safeco Field | 1516 First Avenue S | Seattle | WA

Zip: 2114, 2114, 98134, 98134

Account ID

Team Type

Address Line

City State

Zip

Composite Domain - Full Address

BIA-319-M | Data Quality Services A Closer Look

Reference Data Service: Composite Domain containing Address Line, City, State & Zip Domains
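A composite domain lets several fields be validated together instead of one at a time. The sketch below is a minimal illustration under stated assumptions, not the DQS reference data service: `REFERENCE_ZIPS` is a hypothetical in-memory lookup standing in for a third-party provider, its entries are illustrative, and `normalize_zip` assumes that a 4-digit zip such as 2114 lost its leading zero when stored as a number.

```python
# Hypothetical stand-in for a reference data service: address -> zip code.
# Entries are illustrative and not taken from a real provider.
REFERENCE_ZIPS = {
    ("100 Legends Way", "Boston", "MA"): "02114",
    ("1516 First Avenue S", "Seattle", "WA"): "98134",
}


def normalize_zip(value):
    """Assumption: a short zip lost its leading zeros in numeric storage."""
    return str(value).strip().zfill(5)


def validate_full_address(record):
    """Validate Address Line, City, State and Zip as one composite value."""
    zip_code = normalize_zip(record["Zip"])
    key = (record["Address Line"], record["City"], record["State"])
    expected = REFERENCE_ZIPS.get(key)
    if expected is None:
        status = "unknown"    # address not found in the reference data
    elif expected == zip_code:
        status = "valid"
    else:
        status = "mismatch"   # fields are individually plausible but disagree
    return zip_code, status


boston = {"Address Line": "100 Legends Way", "City": "Boston", "State": "MA", "Zip": 2114}
print(validate_full_address(boston))
```

Checking the fields as a group is what catches records where each value looks fine on its own but the combination does not exist.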

DQS Demo


Batch Cleansing - Using SSIS

SSIS Data Flow

Values/Rules Reference Data Definition

SSIS Package
Source + Mapping
DQS Cleansing Component
Destination


MDS 2012 Adds


Improved Web UI
Excel Add-in
Easier data updates and management
Simplified data model creation
Integration with Data Quality (DQS)
New staging interface (Entity Based Staging)
Improved quality (usability, robustness, security, scale, performance)

MDS Demo Background


Errors in the reports
Time for the Data Steward to get to work

Send questions to

@HASSELBRANCH


Training: SQL Server 2012

SQL Server 2012 New Features for Database Administrators, A521
SQL Server 2012 New Features for Business Intelligence Developers, A523
Querying Microsoft SQL Server 2012, M10774
Administering SQL Server 2012 Databases, M10775
Developing SQL Server 2012 Databases, M10776
Implementing a Data Warehouse with SQL Server 2012, M10777
Implementing Data Models and Reports with SQL Server 2012, M10778
This spring: Master Data Management: What Needs to Be Done?
Note! Expert seminar: An In-Depth Look at Developer Features and Performance in SQL Server 2012 with Bob Beauchemin, April 3-4
