
Informatica Data Virtualization

The “Foundation” for AGILITY & PRODUCTIVITY

Kerry Holton
Informatica Senior Sales Engineer

Let’s Win Something!!!
A copy of “Lean Integration.”
Tell me which box is the ONLY thing
that data virtualization built on data
federation does – and why???

Answer questions along the way…

Take some good notes!


Informatica Corporation Confidential – Do Not Distribute
To Learn More…

Informatica.com > Products > PowerCenter > Data Virtualization Edition

Informatica.com > Products > Data Virtualization

Sign-Up
JOIN & DISCUSS
2000+ Strong

Expert Roundtables “Data Virtualization & Data Services Architecture” Group

Data Virtualization Corner
http://vip.informatica.com/?elqPURLPage=8668

Agenda

• “2012” – The Year of “BI” Agility


• Data Virtualization – Overview, Problem & Need
• Key Use Cases
• Customer Examples
• Data Virtualization in Action
• Why Informatica?
• Next Steps & Q&A

“I’m writing you a million dollar check, but you’re not solving my big problem. My big problem isn’t getting the data into the data warehouse. My big problem is … getting the data out!”
ICC Director (VP of IM) to Dave Lyle (VP Product Strategy), end of Q3, 2009

“2012”
Have any of you had this discussion?

Business / IT
• Need for a new BI infrastructure
• Replacing spreadsheets
• Faster data access & reporting

BI will be the top priority for the CIO, in 2012!

“Demands by users of business intelligence (BI) applications to "just get it done" are turning typical BI relationships, such as business/IT alignment and the roles that traditional and next-generation BI technologies play, upside down. As business users demand more control over BI applications, IT is losing its once-exclusive control over BI platforms, tools, and applications.”
– Boris Evelson, Forrester Research, Blog - “Top 10 BI Predictions for 2012”

• Business-focused BI
• $100M Qtr. in 2011
• 10k+ customers
How Long Does it Take to Deliver New
Critical Data or Reports to the Business?

The Business Can’t Wait 3-6 Months
For a Single View of All Enterprise Data

[Diagram: seven separate Business Intelligence silos, each fed through its own integration approach (ETL, hand coding, SOA, ESB/EAI, EII). Underlying sources: cloud computing, applications, databases, unstructured data, warehouses, NoSQL, social, and partner data (SWIFT, NACHA, HIPAA, …).]


Overview

HealthNow’s Data Integration Challenges

Business / IT consumers: Portal (WebSphere), BI (Cognos)

NO REUSE
• 16 Types of Data Sources
• Different Price Info in Each LOB
• To Add 1 Product Attribute to Existing Report – IT Estimated 1700 Hours

So What Did the Business Do?
30,000 Data Marts Were Created by Shadow IT Teams

Sources: 30,000 Data Marts (MS Access), Data Warehouse (DB2), Facets [Benefits, Products] (Sybase ASE), Product Config Mgmt (MS SQL Server)

The Fundamental Problem(s)…

Typical Data Integration Process
1. Design
2. Change
3. Integrate
4. Unit Test
5. Validate
6. Deploy
Business is Involved Too Late

• It takes too long to explain requirements
• It takes months to change a DW / add new critical data
• It takes many iterations to get the right data / reports
• Changes can break existing integrations & impact apps.

As-Is Value Stream Map (LOT OF WAIT & WASTE)

Trying to Solve it in BI Layer Just Won’t Scale… Why?

No Reuse

No Common Data Access Layer


No Easy Way to Handle Change
No Data Quality & No Data Consistency

Unstructured Data Spread Marts EDW Applications

DATA
MART

What is Needed to Solve these Problems?

• COMMON ACCESS LAYER ACROSS MANY DATA SOURCES
• FAST, DIRECT ACCESS TO DATA THE BUSINESS TRUSTS
• DATA ABSTRACTION & REUSE OF SKILLS/LOGIC
• SUPPORT ALL USE CASES: BI / DW, MDM, SOA

[Diagram, top to bottom]
Data Consumers: BI, Composite Apps, Portal
Data Abstraction: Logical Data Objects (CUSTOMER, ORDER, PRODUCT, …)
Enterprise Data Sources: Logical View of All Underlying Data

Think Virtual Machines for DATA!

How is the Market Trying to Address the Problems?

Data Virtualization (Built-On Data Federation)
Time GAINED by federation is nullified by Time SPENT on more processing

[Diagram: Access (DW) → Merge (Virtual View) → Deliver (BI), with three limitations marked X]
X Limited or Data Source Profiling Only
X SQL/XQuery Only Transformations & No Data Quality
X Cannot Easily Move to Persistent Store or Reuse

• Addresses specific use cases
• No data movement / no copies / only federation
• Code heavy / not model-based / no reuse
• Not tools for business self-service
• SQL/XQuery-only transformations
• No data profiling / no data quality

It’s like ONE step forward & TWO steps backward
What Are the Top 3 Key Capabilities for a
Project that Needs Data Virtualization?
If Performance is a given…

Dataset - 600

Source – Informatica Data Virtualization Expert’s Forum, 2011

Are We Talking About TWO Separate
Tools?

What Does the Ideal Solution Look
Like?
[Diagram: seven numbered capabilities shared by Business (Business Analyst, Manager, Steward) and IT (Developer, Architect) through Common Metadata]
1. MODEL: business entity such as Customer (Name, Address, Category, Orders)
2. ACCESS & MERGE: virtual tables over sources such as CRM and Accounts
3. PROFILE IN RT
4. TRANSFORM IN RT: Advanced Transformations, Data Quality, Data Masking
5. REUSE INSTANTLY: DW, Batch, Web Services
6. MOVE OR FEDERATE
7. SCALE & PERFORM: Optimizations & Caching
Other diagram labels: Query Engine, WS Server, Call Center and Accounts virtual tables

How Does Informatica Deliver the Ideal
Solution?
Data Virtualization = (Data Integration + Data Federation) in ONE Tool
[Diagram: Access (DW) → Merge (Virtual View) → Deliver (BI, DW), with callouts]
• Analyze & Profile Data & Logic Anytime
• Advanced Transformations & Data Quality
• Early Business Involvement
• Prototype First
• Move to DW or Instantly Reuse as SQL / WS

• Single environment for both data integration and data federation
• No data movement / no copies – but easily reuse virtual views for batch
• Early & iterative business (analyst) involvement – self-service
• Pre-built library of rich ETL-like advanced data transformations
• Integrated real-time, on-the-fly data profiling & data quality

How Does It Work?

NEW REQUEST
• Change / add an attribute
• Join new data not in DW
• Create a new report

NEW DATA & REPORTS THAT BUSINESS TRUSTS, DELIVERED IN DAYS vs. MONTHS

EXISTING QUERY
SELECT *
FROM customer_table

EXISTING QUERY
SELECT *
FROM SUPPORT

NEW QUERY (INSTANT REUSE)
SELECT *
FROM customer_table
INNER JOIN support_table ON
customer_table.customer_num = support_table.customer_id
WHERE customer_name = 'ACME'
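
The new query on this slide can be tried out with a small, hedged sketch. It is not Informatica's engine: two in-memory SQLite databases stand in for two physical sources, and a temporary view plays the role of the virtual table. The schema name `support_src`, the view name `customer_support`, the `ticket` column, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# Two separate in-memory databases stand in for two physical sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS support_src")  # hypothetical second source

conn.execute(
    "CREATE TABLE main.customer_table (customer_num INTEGER, customer_name TEXT)"
)
conn.execute(
    "CREATE TABLE support_src.support_table (customer_id INTEGER, ticket TEXT)"
)
conn.executemany(
    "INSERT INTO main.customer_table VALUES (?, ?)",
    [(1, "ACME"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO support_src.support_table VALUES (?, ?)",
    [(1, "T-100"), (1, "T-101"), (2, "T-200")],
)

# A TEMP view may reference tables in any attached database, so it can
# act as a "virtual table" that joins both sources without copying rows.
conn.execute(
    """
    CREATE TEMP VIEW customer_support AS
    SELECT c.customer_num, c.customer_name, s.ticket
    FROM main.customer_table AS c
    INNER JOIN support_src.support_table AS s
        ON c.customer_num = s.customer_id
    """
)

# The new report queries the view; the underlying tables are untouched.
rows = conn.execute(
    "SELECT customer_name, ticket FROM customer_support "
    "WHERE customer_name = ? ORDER BY ticket",
    ("ACME",),
).fetchall()
print(rows)
```

The existing single-table queries keep working unchanged, which is the "instant reuse" point of the slide: the join lives in the view, not in a new physical copy.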
[Animated diagram: a new query runs against the data virtualization layer, which sits over the existing sources (DW, data marts, ODS, Web) and the CUSTOMER, SUPPORT, PRODUCT, and INVOICE data. The animation captions overlap in the extracted text and cannot be recovered in order. They appear to cover: on-boarding new data without breaking existing integrations, complementing historical data in the DW with operational data, applying data quality rules on-the-fly, processing the query in real time without data movement, delivering trusted results, and physically materializing the virtual view into the DW later.]
Informatica Data Virtualization at HealthNow

Business / IT consumers: Portal (WebSphere), BI (Cognos)

NO REUSE
Instant Reuse: DW, BI, SOA & MDM (SQL, Web Services, Batch)
Fast, Direct Data Delivery: 1 week (vs. 3 months)

Shared Repository: “Virtual Table” Common Data Model (MEMBER, CLAIM, PRODUCT, ORDER)

Sources: 30,000 Data Marts (MS Access), Data Warehouse (DB2), Facets [Benefits, Products] (Sybase ASE), Product Config Mgmt (MS SQL Server)

What Does Informatica’s Data Virtualization Solution Look Like?

NEW: PowerCenter Data Virtualization Edition
AGILITY & PRODUCTIVITY

• New PowerCenter Edition for Data Virtualization
• Combines:
  • Data integration (PowerCenter SE)
  • Data Virtualization (IDS Full Use)
  • Data Profiling (IDE Full Use)
  • Business-IT Collaboration (Analyst)
• Packaged for simplicity and attractively priced
• Reuses existing skills and resources

[Diagram, edition components: Partitioning, Data Profiling, Data Federation (Data Services), Developer Tool, Analyst Tool, 2 Adapters (PWX for Relational), ETL (PC Standard Edition)]

What Use Cases Are Supported?
1. DW/Business Intelligence (BI)
Prototype DW & accelerate new data & reports from months to days
[Diagram: Business Change Request → IT Deploy to Production, Months reduced to Weeks/Days]

2. MDM
Deliver a complete view of master & transactional data in real-time
[Diagram: MDM HUB alone gives an INCOMPLETE VIEW OF CUSTOMER; a Virtual View across MDM HUB, TRANSACTIONAL SYSTEMS, and DATA WAREHOUSE gives a COMPLETE VIEW]

3. SOA
Deliver the missing data services layer to SOA & applications
[Diagram: Applications, Registry, BPM, ESB, Biz. Services, Data Abstraction, Data Sources]
What are the Benefits of Informatica’s
Solution?
• Provide fast, direct access to critical new data & reports in days vs. months
• Enable rapid iterations to results with instant Biz-IT collaboration
• Deliver flexibility, ensure reuse & insulate applications from changes

COMPLETE, CURRENT & TRUSTED View of All Data, On-Demand
Customer Examples

BI, MDM, SOA – HealthNow NY Improves
Risk & Pricing Analysis With Data Services
BI (Cognos) Portal
(WebSphere)
SQL, Web Service

Virtual Table
IDS

Data Marts Data Warehouse Facets [Benefits, Products] Product Config Mgmt
(MS Access) (DB2) (Sybase ASE) (MS SQL Server)

The Challenge
• 16 enterprise databases and over 30,000 Access databases
• Took 1700 man hours to add a new product to portfolio
• Business had to go to 5 different sources for all information related to paid claims
• Continued data growth with over 30,000 claims processed per day
• Data proliferation leading to HIPAA compliance concerns

The Solution
• Logical data models and data services to represent their core data entities – MEMBER, CLAIMS, PROVIDER, ENCOUNTER, LAB RESULTS
• ‘Rate Letter’ project for determination of policy rates and discounts went live in May 2010
• Over 400 Logical data objects and 2 web services being used by around 125 end users

The Benefits
• Speed of data delivery – Implemented first project in around 40 man hours. This would have taken an order of magnitude more in the past
• Complete view of the truth – Business users now access plan rate information from single service
• Better governance – Centrally managed virtual views as opposed to one-off data marts is improving governance of data

BI, SOA - Large Latin American Bank
Improves Governance
Microsoft Reporting Services Customized Applications
SQL, Web Service

Virtual Table
Data Virtualization

Transactions Tables Data Warehouse Credit Analysis, Applications, AML Financial Institutions
(Mainframe – Adabas, DB2) (DB2 LUW) (SQL Server) (Flat Files and Messages)

The Challenge
• Lack of visibility for proper supervision and regulation of the national financial system
• Real-time analysis and joining of data (Adabas, DB2, SQL Server, Files)
• Persistent data replication even for one-time use
• Huge data volumes (Online 6 TB, DW 14 TB)
• Different reporting tools requesting different data combinations across heterogeneous data sources

The Solution
• Logical data models to represent core business entities (e.g. CUSTOMER)
• Mainframe virtualization (join data from Adabas, DW DB2, Apps., 3rd Party)
• Logical data models and Web services to deliver flexibility and agility to respond to changing business needs
• Creation of logical data objects and physical materialization of virtual views to familiar PowerCenter environment
• Centralized management and administration of logical data objects

The Benefits
• Speed of data delivery – implemented first project in around 60 man hours and delivered a new virtual view in < 1 hour
• Better risk/fraud governance (across more than 6000 financial institutions) and compliance with BASEL I, BASEL II and SOX
• Complete single view of the truth – business users can now access consistent customer and plan rate data

BI, MDM – VW Delivers a Complete View of Critical Data On-Demand
BI Portal

SQL, Web Service Reuse


IDQ
Virtual Table
IDS

MDM Hub (Customer, Purchase, Case) DW (Service History) PRD [Campaign History] Transactional Systems (Warranty, Service)
(IBM) (Teradata) (SAGA/Win) (Varied)

The Challenge
• CUSTOMER data in > 30 systems, MDM hub, transaction systems, DW
• Have 80% data but missing critical 20% transactions - WARRANTY, SERVICE
• No authoritative source of CUSTOMER, PRODUCT data, conflicting relationships
• No complete view of CUSTOMER data on-demand is affecting service
• Without complete view of data, can’t meet goal to sell 3x more cars by 2018

The Solution
• Create a common data model for VW owners, prospects, & partners
• Federate data in real-time from > 30 systems & transactional systems
• Provide easy-to-use, browser-based tools for business & IT to collaborate
• Apply reusable DQ rules on-the-fly to CUSTOMER, PRODUCT data
• Instantly reuse data services for SQL or Web services

The Benefits
• Completed DI, DQ, & data services production pilot in < 1 month
• Can leverage operational efficiency & real-time decisions to differentiate
• Delivered accurate, complete view of CUSTOMER data, on-demand
• Lowered costs by increasing productivity & reuse of data services
• Supported strategy to triple sales to 1M vehicles annually, by 2018

Data Virtualization in Action

The “Keystone” – Business Owns
the Data While IT Retains Control
• Role-based tools for Analysts (Web) & IT developers (Eclipse)
• Common metadata lets Analysts & IT collaborate in RT
• Empower business analysts to:
  • Define entities & directly access & merge data to create virtual views
  • Rapidly profile data sources & logic without more processing
  • Quickly find data & rules via business glossary
  • Collaborate, test, validate & share results
• Cuts the wait & the waste in the process

[Diagram: Analyst Tool (Web Browser) and Developer Tool (Eclipse) share Common Metadata around a VIRTUAL TABLE, which is delivered as SQL or Web Service to a BI Report and a Portal, and as Batch ETL to the Data Warehouse]

The 7 Steps to AGILITY & PRODUCTIVITY
[Diagram: seven numbered capabilities shared by Business (Business Analyst, Manager, Steward) and IT (Developer, Architect) through Common Metadata]
1. MODEL: business entity such as Customer (Name, Address, Category, Orders)
2. ACCESS & MERGE: virtual tables over sources such as CRM and Accounts
3. PROFILE IN RT
4. TRANSFORM IN RT: Advanced Transformations, Data Quality, Data Masking
5. REUSE INSTANTLY: DW, Batch, Web Services
6. MOVE OR FEDERATE
7. SCALE & PERFORM: Optimizations & Caching
Other diagram labels: Query Engine, WS Server, Call Center and Accounts virtual tables

1. Model

• Represent underlying data as business entities (CUSTOMER)
• Provide a common logical view or abstraction of all data
• Import logical model from 200+ modeling tools (ERWIN)
• Use visual and metadata based mapping language
• Instantly reuse logical data object for all applications

[Diagram: Common Data Access Layer – Logical Data Object (CUSTOMER, ORDER, PRODUCT, INVOICE) over Unstructured Data, Spread Marts, Data marts, EDW, Applications]

2. Access and Merge
Turn many data sources into
ONE with Data Virtualization
CUSTOMER SUPPORT PRODUCT INVOICE

Data types: Analytical Data, Transactional Data, Master Data, Archived Data, Interactional Data

Sources: Cloud Computing, Application, Database, Unstructured, Warehouses, NoSQL, Social, Partner Data (SWIFT, NACHA, HIPAA, …)

3. Profile in RT
Rich set of integrated profiling capabilities to find data anomalies and to discover keys and hidden relationships:

• Column & Rule Profiling
• Midstream or Comparative Profiling
• Join & Overlap Analysis
• Primary Key / Foreign Key Profiling
• Dependency Profiling
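
As a rough illustration of what column profiling, candidate-key discovery, and join overlap analysis compute, here is a minimal sketch in plain Python. It does not use or imitate any Informatica API; the function names `profile_column` and `key_overlap` and the sample rows are hypothetical.

```python
def profile_column(rows, column):
    """Basic column profile: row count, nulls, distinct values, key candidacy."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(distinct),
        # A candidate key has no nulls and no repeated values.
        "candidate_key": len(non_null) == len(values)
        and len(distinct) == len(values),
    }


def key_overlap(parent_rows, parent_col, child_rows, child_col):
    """Fraction of non-null child values that find a match in the parent column."""
    parent_keys = {r[parent_col] for r in parent_rows if r[parent_col] is not None}
    child_keys = [r[child_col] for r in child_rows if r[child_col] is not None]
    if not child_keys:
        return 0.0
    matched = sum(1 for key in child_keys if key in parent_keys)
    return matched / len(child_keys)


# Hypothetical sample data standing in for two sources.
customers = [
    {"customer_num": 1, "name": "ACME"},
    {"customer_num": 2, "name": None},
    {"customer_num": 3, "name": "ACME"},
]
orders = [
    {"customer_id": 1},
    {"customer_id": 1},
    {"customer_id": 3},
    {"customer_id": 9},  # orphan: no such customer
]

num_profile = profile_column(customers, "customer_num")
name_profile = profile_column(customers, "name")
overlap = key_overlap(customers, "customer_num", orders, "customer_id")
print(num_profile, name_profile, overlap)
```

A low overlap ratio flags a foreign-key relationship that would silently drop rows in an inner join, which is the kind of anomaly worth catching before a virtual view is published.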

4. Transform in RT

• Metadata-driven, codeless, graphical environment
• Rich, pre-built library of advanced transformations
• Integrated Data Quality transformations
• Define policies to mask sensitive data in real time
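
The masking bullet above can be pictured as a policy that maps column names to masking functions and is applied to each row as it is returned. The sketch below is only an assumption about how such a policy might look in code; the column names, the `MASKING_POLICY` table, and both masking rules are made up for illustration and are not Informatica's masking transformations.

```python
def mask_keep_last4(value):
    """Replace every character except the last four with '*'."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


def mask_email(value):
    """Keep the first character of the local part and the full domain."""
    local, _, domain = str(value).partition("@")
    return local[:1] + "***@" + domain


# Hypothetical policy: which columns are sensitive and how to mask them.
MASKING_POLICY = {
    "ssn": mask_keep_last4,
    "email": mask_email,
}


def apply_policy(row, policy):
    """Return a masked copy of the row; unlisted columns and nulls pass through."""
    return {
        column: policy[column](value)
        if column in policy and value is not None
        else value
        for column, value in row.items()
    }


row = {"member_id": 42, "ssn": "123-45-6789", "email": "jane@example.com"}
masked = apply_policy(row, MASKING_POLICY)
print(masked)
```

Because the policy is applied at read time, the source row stays unmasked and different consumers could in principle be given different policies.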

5. Reuse Instantly
• Instantly reuse LDOs for any mode/protocol (SQL, WS)
• Single click deployment to batch
• Execution & optimization separate from design-time
• No re-development & re-building of LDOs

[Diagram: Batch, SQL, and Web services all served from one METADATA REPOSITORY]

6. Move or Federate
Data Federation
[Diagram: Access (DW) → Merge (Virtual View) → Deliver (BI)]
• Specific use cases
• No data movement / no copies
• Real-time federation
• SQL/XQuery-only transformations
• No data quality / business validation

Data Integration
[Diagram: Extract & Load, Advanced Transform, Quality → DW → BI, via PowerCenter (batch)]
• Majority of use cases
• Physical data movement
• Bulk/batch, near real-time, real-time
• Advanced transformations
• Built-in data quality

Single-click deployment from the virtual view to DW

7. Scale & Perform
• Leverage the proven, high-performance Informatica engine
• Optimized SQL Query engine & graphical Query Plan
• High-performance Web services server
• Rich set of optimizations & caching mechanisms
  • Rule Based, Cost Based, Push Down, Early Projection, Early Selection, Semi-Join, Virtual Table & Result Set Caching
• Fine grained access control, WS-Security & pass-through security
  • Database, Schema, Table, Column, Row-Level (v9.5) security
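
Of the optimizations listed above, result set caching is the easiest to show in a few lines. The following is a generic time-to-live cache keyed by query text and parameters, offered as a sketch of the idea only. The class name `ResultSetCache`, the `slow_source` stand-in, and the 30-second TTL are assumptions; nothing here reflects how the Informatica engine actually implements caching.

```python
import time


class ResultSetCache:
    """Cache query results by (sql, params) for a fixed time-to-live."""

    def __init__(self, run_query, ttl_seconds=60.0, clock=time.monotonic):
        self._run_query = run_query  # callable that hits the real source
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (stored_at, rows)

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # fresh hit: the source is not touched
        rows = self._run_query(sql, params)
        self._entries[key] = (now, rows)
        return rows


# Hypothetical slow source that records how often it is called.
calls = []


def slow_source(sql, params):
    calls.append((sql, tuple(params)))
    return [("ACME", 350.5)]


fake_now = [0.0]  # manual clock for the demonstration
cache = ResultSetCache(slow_source, ttl_seconds=30.0, clock=lambda: fake_now[0])

sql = "SELECT name, total FROM customer_totals WHERE name = ?"
first = cache.query(sql, ("ACME",))
second = cache.query(sql, ("ACME",))   # served from cache
fake_now[0] = 31.0                     # advance past the TTL
third = cache.query(sql, ("ACME",))    # expired, so the source is queried again
print(len(calls))
```

The trade-off is the usual one: a longer TTL spares the operational source more load but lets the virtual view drift further from current data.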

Data Virtualization Built On Data
Federation Does 1 Box – Which 1?
[Diagram: seven numbered capabilities shared by Business (Business Analyst, Manager, Steward) and IT (Developer, Architect) through Common Metadata]
1. MODEL: business entity such as Customer (Name, Address, Category, Orders)
2. ACCESS & MERGE: virtual tables over sources such as CRM and Accounts
3. PROFILE IN RT
4. TRANSFORM IN RT: Advanced Transformations, Data Quality, Data Masking
5. REUSE INSTANTLY: DW, Batch, Web Services
6. MOVE OR FEDERATE
7. SCALE & PERFORM: Optimizations & Caching
Other diagram labels: Query Engine, WS Server, Call Center and Accounts virtual tables

Do it Right – Avoid Costly Mistakes!
Enabling Rapid Development: Sustain & Maintain (TIME, COST)
• Do: Model & metadata-driven environment
• v/s: Hand-coding 1000s of lines of code
• Mistake: Maintenance Nightmare (TIME, COST)

Analyzing & Profiling: Get it Right 1st Time (TIME, COST, RISK)
• Do: Profile data AND logic anywhere
• v/s: Only source profiling, need extra processing
• Mistake: Many Iterations & Mistakes (TIME, COST, RISK)

Integrating with Quality: Bake-in Quality (TIME, COST, RISK)
• Do: Leverage pre-built logic including quality
• v/s: Can’t do advanced transforms
• Mistake: Limited Rules, No Data Quality (TIME, COST, RISK)

Scaling with Flexibility: Prototype First & Then Scale (TIME, COST)
• Do: Virtualize or physically materialize in 1 tool
• v/s: Non-integrated technologies
• Mistake: Overburden Data Virtualization (TIME, COST, RISK)

Leveraging Investments: Re-purpose Logic & Skills (TIME, COST)
• Do: Naturally extend your infrastructure
• v/s: Re-work, re-deploy & re-train every time
• Mistake: Re-invent the Wheel (TIME, COST)

Other diagram labels: Virtual Table, EII, Optimizations, SQL, XQuery, Web Service, Simple Cleansing

Data Virtualization in Action

Scenario – Big Company

ISSUES
• Call center talk times increasing = scattered data + many screens
• Time wasted in correcting inconsistent & inaccurate customer data
• Agents can’t easily & quickly identify what products are owned

IMPACT
• Can’t easily identify top customers to improve up-sell/cross-sell
• Low customer satisfaction & growing customer attrition
• High marketing costs without targeted campaigns

Demo – Big Company
• Business needs a new report – NOW vs. months!
• Quickly merge data from multiple systems & cleanse
• Analysts know the data – want some self-service
• Join CUSTOMER (Oracle CRM) & ORDER (file)
• Get ORDER TOTAL for ACTIVE customers

Analyst: defines business entity, profiles, defines rules & hands over to IT
IT Architect / Developer: enriches the business entity & publishes for BI tool, portal or batch
Together: integrate missing data, do data cleansing “on-the-fly,” validate
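
The demo scenario (join CUSTOMER from a CRM database with ORDER from a file, then total the orders of ACTIVE customers) can be mimicked in a short, hedged sketch. SQLite stands in for the Oracle CRM and an in-memory CSV stands in for the order file. All column names, the status values, and the sample rows are assumptions for illustration, not the demo's actual data.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Source 1: a relational CUSTOMER table (SQLite stands in for the CRM).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customer (customer_id INTEGER, name TEXT, status TEXT)")
crm.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "ACME", "ACTIVE"), (2, "Globex", "INACTIVE"), (3, "Initech", "ACTIVE")],
)

# Source 2: an ORDER flat file (an in-memory CSV stands in for the file).
order_file = io.StringIO(
    "order_id,customer_id,amount\n"
    "100,1,250.00\n"
    "101,1,100.50\n"
    "102,2,75.00\n"
    "103,3,20.00\n"
)

# Aggregate order amounts per customer while reading the file once.
totals = defaultdict(float)
for record in csv.DictReader(order_file):
    totals[int(record["customer_id"])] += float(record["amount"])

# Early selection: only ACTIVE customers are fetched from the CRM.
active = crm.execute(
    "SELECT customer_id, name FROM customer WHERE status = 'ACTIVE' ORDER BY name"
).fetchall()

# Merge the two sources on customer_id to get ORDER TOTAL per active customer.
order_totals = [(name, totals.get(cid, 0.0)) for cid, name in active]
print(order_totals)
```

Nothing is copied into a warehouse here: each source is read where it lives and merged on the fly, which is the behaviour the demo is meant to show.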

Why Informatica?

Why Informatica?
ONLY INFORMATICA COMBINES… THE BEST OF “DATA INTEGRATION” (SOPHISTICATION) and THE BEST OF “DATA VIRTUALIZATION” (AGILITY) …INTO ONE SOLUTION THAT REUSES SKILLS

Power of The Platform

Gartner Magic Quadrant for Data Integration Tools, 2011
“The ability to switch seamlessly and transparently between delivery modes (bulk / batch vs. granular real-time vs. federation) with minimal rework will be key for IT organizations seeking to develop a successful data integration strategy.”
Ted Friedman, VP Distinguished Analyst, Gartner

Forrester Wave: Data Virtualization, Q1 ‘12
“With v9, Informatica advanced its capabilities with on-the-fly data quality and profiling, a model-driven approach to provisioning data services, performance enhancements, cloud integration, common metadata, and role-specific tools.”
The Forrester Wave: Data Virtualization, Q1 2012

Only Informatica Provides ONE Solution
for Data Integration and Federation
[Diagram: Access (DW) → Transform (Virtual View) → Deliver (BI, DW), with callouts]
• Analyze & Profile Data & Logic Anytime
• Advanced Transformations & Data Quality
• Early Business Involvement
• Prototype First
• Move to DW or Instantly Reuse as SQL/WS

• Single environment for both data integration and data federation


• No data movement / no copies – but can easily reuse virtual views for batch
• Early & iterative business (analyst) involvement, efficient collaboration
• Pre-built library of rich ETL-like advanced data transformations
• Integrated real-time, on-the-fly data profiling & data quality

Next Steps & Q&A

Have the Conversation with the Business!
Business: “New data & reports take too long…”
IT: “YOU” can now do it in DAYS!

1. Identify a Critical Project in Your Company
2. Involve the Business Early & Often
3. Bake-In Quality & Support Advanced Logic
4. Demonstrate Business Value Early
5. Self-Service + Data Virtualization = ROI

Next Steps & Q&A

Informatica.com > Products > PowerCenter > Data Virtualization Edition

Informatica.com > Products > Data Virtualization

Sign-Up
JOIN & DISCUSS
2000+ Strong

Expert Roundtables “Data Virtualization & Data Services Architecture” Group

Data Virtualization Corner
http://vip.informatica.com/?elqPURLPage=8668
