
WHITE PAPER

Change Data Capture:


Driving Results with Event Driven Data
This document contains Confidential, Proprietary and Trade Secret Information (Confidential Information) of
Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner
without the prior written consent of Informatica.
While every attempt has been made to ensure that the information in this document is accurate and complete, some
typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of
loss resulting from the use of information contained in this document. The information contained in this document is
subject to change without notice.
The incorporation of the product attributes discussed in these materials into any release or upgrade of any
Informatica software product, as well as the timing of any such release or upgrade, is at the sole discretion of
Informatica.
Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374;
6,208,990; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280;
10/966,046; 10/727,700.
This edition published November 2005
Table of Contents
Executive Summary
Introduction
What Can Change Data Capture Do?
What Can the CDC Option Do for Me?
The Challenge: Continuous Legacy Data Integration
The Solution: PowerExchange CDC Option
The Results: Improved Customer Satisfaction
Conclusion
Executive Summary
IT executives everywhere are facing the same pressures. Data volumes are growing, regulatory
requirements must be met, and deadlines are shrinking. Mergers and acquisitions multiply
complexity. Data integration remains as essential as ever. And, for many organizations, day-old
data has little, if any, value.
The real-time enterprise has arrived.
More hardware? Even when budget is available, constraints on power, cooling, and floor space can
make it impossible to add hardware. More headcount? It can be challenging to attract and retain
skilled technical staff. More time? Not when time is of the essence.
There's another answer: focus on the most relevant, most urgently needed data and integrate it
continuously. By responding to business events at the moment they occur, rather than hours or
days later, IT can extend data integration beyond the realm of batch processing and satisfy today's
need for timely, accurate data. When combined with a mature, real-time data integration platform,
event-driven data can be accessed, transformed, and cleansed continuously and used to drive
business processes and decision making across any enterprise.
In this white paper, we'll explore how the foundation of continuous, event-based data
integration, a technology called change data capture (or CDC), allows IT to create business
value for organizations both large and small.
Our business managers will
continue to reap real benefits with
the timely delivery of accurate
reports and analytics to senior
management and regulatory bodies
alike. These activities are being
performed within hours rather than
days and the IT department can get
on with its planned projects instead
of being distracted by generating
one-off reports.
Terence Denton
IT Manager, Edmond de Rothschild
Informatica PowerExchange for IMS DB
Change Data Capture Option
Introduction
Each organization places unique demands on its IT executives, but three common themes have
emerged to impact the majority of organizations. Far from being short-term factors, they have
become part of the IT climate.
Budgets and schedules are tight. For some organizations, this is a way to mitigate project
risk in the wake of highly publicized IT project failures. For others, this is a response to past
adoption of technology that promised more than it delivered. Whether this downward trend
stems from the adoption of agile IT project methodologies, or simply from a sound, conservative
management approach, the impact is the same: there's not enough time.
Change is constant and continuous. IT executives recognize this as more than a cliché. The
enterprise itself is constantly changing, in response to external forces such as regulatory
changes or competition. Internal forces such as reorganizations, mergers and acquisitions, and
new business initiatives provide additional pressure for change. The IT environment undergoes
constant tactical-level change in the form of routine technology refreshes, as well as more
strategic change, such as the adoption of service-oriented architecture.
High availability is a must. Between always-on Web sites that aim for continuous availability,
and increasing data volumes that strain existing windows, batch processing is becoming
less able to satisfy business requirements. When systems can no longer be taken offline for
processing, more responsive IT techniques must be adopted to support the business.
Fortunately, there is a capability called change data capture (or CDC) that addresses precisely
these issues. Described further in the sidebars, CDC provides access to database updates almost
as soon as they occur.
The importance of CDC, however, is not the technology itself but the way it provides visibility to
critical business events as they happen. By supporting continuous data integration, CDC allows
IT executives to sidestep the challenges of constant change and the pressures of budget and
schedule constraints.
CHANGE DATA CAPTURE: WHAT IS IT?
Consider the stock ticker, a continuous
display of stock prices.
ZNFE DPBR JXPY . . .
25.10 +4.54 8.37 -1.22 4.67 -26.18
If you observed the ticker for a while and
made note of the price whenever it changed,
you would be performing change data capture.
It's that simple.
Informatica's CDC Option captures changes
made to a database, not a stock ticker, but
the principle is the same. It provides access to
database updates as soon as they take place.
Not all database updates are especially
noteworthy, but some, such as the creation of
a new customer or the shipment of a customer
order, reflect important real-world events. By
capturing the database changes related to
these significant business events, the CDC
Option serves as the focal point for driving
continuous, event-based data integration.
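The ticker analogy above can be sketched in a few lines of Python. This is only an illustration of the principle, not how the CDC Option is implemented: the function name `capture_changes` is hypothetical, the symbols and two of the prices are borrowed from the ticker shown above, and the 8.40 price is invented for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple


def capture_changes(ticks: Iterable[Tuple[str, float]]) -> Iterator[Tuple[str, float]]:
    """Yield (symbol, price) only when a symbol's price differs from the last one seen."""
    last_seen: Dict[str, float] = {}
    for symbol, price in ticks:
        # The first observation of a symbol also counts as a change.
        if last_seen.get(symbol) != price:
            last_seen[symbol] = price
            yield symbol, price


# Hypothetical stream of ticker observations; repeats carry no new information.
ticks = [
    ("ZNFE", 25.10),
    ("DPBR", 8.37),
    ("ZNFE", 25.10),  # unchanged, ignored
    ("DPBR", 8.40),   # changed, captured
    ("ZNFE", 25.10),  # unchanged, ignored
]
changes = list(capture_changes(ticks))
```

Five observations produce only three captured changes, which is the point of the analogy: downstream consumers see the changes, not every repeated value.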
What Can Change Data Capture Do?
It depends what you need it to do.
One customer, an international travel company, needed to continue enhancing its award-winning
customer service; they did it with CDC.
The company had originally built its IT infrastructure on the proprietary AS/400 midrange
environment, but as it began to augment its application portfolio, the packaged applications it
needed were available only on commodity platforms. As a result, the company had to integrate its
critical home-grown legacy application with its newer downstream systems and data warehouse.
What's more, the initiative, critical for the future of this 50-year-old firm, had to be carried out by
the company's relatively small IT team. The challenges were significant.
To achieve the desired level of customer service, downstream systems needed to learn of changes
at a very low latency. One proposed solution, comparing successive database extracts to identify
changes, could not be done frequently enough to meet requirements, and would not scale as the
company grew. An alternative proposal to enhance the legacy application was judged both too
risky and too costly.
Instead, the company selected the Informatica PowerExchange CDC Option. It easily met
all latency requirements, without the need for hardware upgrades. Because it is noninvasive,
implementation was rapid and the cost was much lower than the alternatives, both key
requirements for performing this type of operational data integration.
The result? With CDC in place, new customer orders are propagated downstream in a minute or
less, giving the call center and data warehouse access to the latest information. The resulting
improvement in customer intelligence has provided significant business benefits:
Revenue and profitability per customer have grown
New products can be brought to market more quickly
More targeted marketing campaigns have reduced cost
Additional upsell and cross-sell opportunities have been created
Customer satisfaction has increased
Even though the company's IT team is small, Informatica brought CDC within its reach.
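For contrast, the extract-comparison approach that the travel company rejected can be sketched as follows. This is a minimal illustration under assumed data: the function name `diff_extracts`, the keys, and the order statuses are all hypothetical.

```python
def diff_extracts(previous, current):
    """Compare two full extracts keyed by primary key.

    Returns (inserts, updates, deletes). Both extracts must be read in full,
    no matter how few rows actually changed.
    """
    inserts = {k: v for k, v in current.items() if k not in previous}
    updates = {k: v for k, v in current.items() if k in previous and previous[k] != v}
    deletes = [k for k in previous if k not in current]
    return inserts, updates, deletes


# Hypothetical order-status extracts taken at two points in time.
previous = {1: "booked", 2: "booked", 3: "paid"}
current = {1: "booked", 2: "paid", 4: "booked"}
inserts, updates, deletes = diff_extracts(previous, current)
```

The sketch shows why this approach struggles with low latency and growth: the work is proportional to the size of the extracts rather than to the number of changes, and any change made and then overwritten between two extracts is never seen.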
HOW DOES CHANGE DATA CAPTURE
WORK?
Every commercial DBMS maintains a
complete record of all changes made to the
database, which is used to recover from failed
transactions and other errors. Recognizing
the importance of CDC, most DBMS products
provide support for accessing this change
data.
However, each DBMS provides CDC support
via its own distinctive methods. For example,
some allow change data to be captured
from memory instantly, while others require
change data to be written to disk before it can
be captured. The effort to understand each
DBMS's interfaces and build a connection to
them can be considerable.
Even more daunting are the architectural
issues that arise when designing a robust
CDC environment. Each DBMS leaves highly
complex issues such as recovery/restart
and proper transaction sequencing (critical
for success but challenging and high-risk to
build) solely to the developer. Because of
this, in-house development of CDC is rarely
undertaken, and the overwhelming majority of
CDC use is via commercial products.
No matter which DBMS is in use, the
PowerExchange CDC Option presents
a uniform interface to developers and
administrators. It automatically discovers the
metadata needed to understand the source
and target, and ensures that only committed
updates are presented for downstream use.
By hiding unnecessary complexity, and saving
time and effort, PowerExchange's resilient CDC
Option is a prime example of the advantages
of Informatica's vendor-neutral philosophy.
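To make the issues named above concrete (committed-only delivery, transaction sequencing, and restart), here is a minimal Python sketch of a change-log reader. It is not the PowerExchange implementation or any DBMS's actual log format: the record layout `(position, txn_id, kind, payload)`, the function name `committed_changes`, and the `start_after` restart parameter are assumptions made for illustration.

```python
from collections import defaultdict


def committed_changes(log, start_after=0):
    """Yield (position, txn_id, changes) for committed transactions, in commit order.

    `log` is an iterable of (position, txn_id, kind, payload) tuples, where kind
    is "change", "commit", or "rollback" (a hypothetical record layout).
    `start_after` is the commit position of the last transaction already
    delivered; earlier commits are skipped when processing restarts.
    """
    pending = defaultdict(list)  # uncommitted changes, grouped by transaction
    for position, txn_id, kind, payload in log:
        if kind == "change":
            pending[txn_id].append(payload)
        elif kind == "rollback":
            pending.pop(txn_id, None)  # discard work that never committed
        elif kind == "commit":
            changes = pending.pop(txn_id, [])
            if position > start_after:
                yield position, txn_id, changes


# Hypothetical log with interleaved transactions; T2 is rolled back.
log = [
    (1, "T1", "change", "insert customer 42"),
    (2, "T2", "change", "update order 7"),
    (3, "T1", "change", "insert order 9"),
    (4, "T2", "rollback", None),
    (5, "T1", "commit", None),
    (6, "T3", "change", "ship order 9"),
    (7, "T3", "commit", None),
]
delivered = list(committed_changes(log))
resumed = list(committed_changes(log, start_after=5))  # restart after T1 was delivered
```

Even this toy version has to buffer in-flight work, discard rollbacks, and remember where it left off, which hints at why a production-grade equivalent is considered high-risk to build in house.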
[Figure: Change Log, CDC Option, PowerCenter. CDC captures database changes, forwarding them to PowerCenter for further processing.]
What Can the CDC Option Do for Me?
When combined with other components of the Informatica platform, CDC can support a wide
variety of business requirements. For example:
Enterprise Data Warehousing (EDW) initiatives can eliminate the downtime barriers imposed
by large batch extracts and provide up-to-the-minute trickle feeds to any data warehouse that
needs it. CDC is typically the only feasible way to maintain a truly massive warehouse, but as
highlighted above, even small organizations can benefit from the low-latency data integration
that CDC supports, and all organizations can benefit from the reduction in hardware use that
CDC enables.
Operational Data Integration (ODI) projects are ideal uses for CDC. The case study below
discusses ODI in more detail, but the noninvasive, event-based access provided by CDC is
essential for ODI initiatives.
Event-driven architectures (EDA), which are based on the creation and processing of business
events, can use CDC as a key component. Any database update detected by CDC can be
transformed into an event, enriched as needed in Informatica PowerCenter and pushed
into JMS or any other supported messaging system, or can be delivered to a Web service for
further processing. No staging databases are needed; transformed data is passed directly from
PowerCenter to the target.
Virtually any application can be CDC-enabled without being modified. If a captured database
update does not contain enough information, PowerCenter can access the source database, or
use whatever interface has been mandated for accessing application objects. The resulting data
can be sent to messaging systems, stored in other databases, delivered to a Web service, or
exposed as a Web service for others to use.
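The event-driven pattern described above (capture a change, enrich it, push it to a messaging system without staging) might look like the following sketch. It uses only the Python standard library: `queue.Queue` stands in for JMS or another messaging system, and the function `to_event`, the change-record fields, and the customer lookup are all hypothetical. In the Informatica platform this work would be done in PowerCenter rather than in hand-written code.

```python
import json
import queue


def to_event(change, customers):
    """Turn a captured change record into a business event, enriched with customer data."""
    data = dict(change["after"])
    # Enrichment: add attributes the captured row does not carry itself.
    data.update(customers.get(data["customer_id"], {}))
    return {"type": f"{change['table']}.{change['op']}", "data": data}


outbound = queue.Queue()  # stand-in for a messaging system such as JMS
customers = {42: {"segment": "gold"}}  # hypothetical lookup source

# Hypothetical captured change: a new row in an orders table.
change = {
    "table": "orders",
    "op": "insert",
    "after": {"order_id": 9, "customer_id": 42, "total": 120.0},
}

# The event goes straight from transformation to the target, with no staging step.
outbound.put(json.dumps(to_event(change, customers), sort_keys=True))
message = json.loads(outbound.get_nowait())
```

The source application is untouched in this flow: the event is derived entirely from the captured change plus a lookup, which mirrors the noninvasive approach described above.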
The Informatica platform supports flexible data integration architectures, and the PowerExchange
CDC Option brings highly responsive data access to the Informatica platform. It's truly possible to
do more with less.
Without CDC, all data must be accessed in a batch
and moved whenever it is needed. That's impossible
with today's low latency requirements, or whenever
data volume is large.
With CDC, only a small number of changes are
processed, and only as they take place. This can
lead to significant savings in hardware capacity
and better responsiveness, and it is essential
for implementing continuous, event-based data
integration.
CDC Case Study: EVERTEC
EVERTEC is the IT outsourcing and transaction processing subsidiary of Banco Popular Inc., a
114-year-old financial services provider based in Puerto Rico, with more than $44.4 billion in
assets and over 280 branches and offices in the United States, the Caribbean, and Latin America.
EVERTEC employs 1,700 professionals and manages more than 850 million transactions each
year through some 4,300 ATMs and 85,000 point-of-sale (POS) terminals.
The Challenge: Continuous Legacy Data Integration
Formed as the result of a merger, EVERTEC's first assignment was to re-architect a legacy
infrastructure originally built with a messaging system linking 14 VSAM- and DB2-based
mainframe applications. As the centerpiece of this ambitious initiative, EVERTEC blueprinted an
operational data store (ODS) and enterprise data integration platform, called the TIP Transaction
Vault, to provide a unified view of all customer activity and serve as the foundation urgently
needed to support continued growth.
The Solution: PowerExchange CDC Option
The largest legacy application, called the Integrated Deposit Application (IDA), carries
approximately 90 percent of the customer transaction load (some 3 million records daily), and
the PowerExchange CDC Option is used to continuously integrate each IDA update with the ODS.
Even though this 20-year-old homegrown application was built on a combination of VSAM and
sequential files, with an idiosyncratic data model, the Informatica platform easily handles this
event-based data integration requirement.
Although low-latency access to IDA updates is critical, PowerExchange is also used in a
batch mode to integrate another 4.5 million records daily where higher latency is acceptable.
If conditions change, and CDC becomes a necessity for some of these integrations, only
configuration changes will be needed; no coding changes will be required, as they would be with hand-coded
solutions.
Rapid implementation was a key factor in the success of the project, and EVERTEC estimates
the savings to be between $700,000 and $800,000 over one year compared to custom-coding
alternatives. Ease of use enabled a small team of three to five full-time employees to deploy
the integration in just six months (less than half the time otherwise expected), and no scarce
mainframe COBOL resources were needed. These savings will continue as EVERTEC's internal
Competency Center builds on its experience with the Informatica platform and reuses prior data
integration work in future projects.
The Results: Improved Customer Satisfaction
The TIP Transaction Vault is helping transform EVERTEC's batch-oriented environment into an
on-demand infrastructure that meets customer expectations for timely, accurate information
on account activity. The Informatica platform is the foundation for a data hub that helps ensure
customer satisfaction, reduce churn, and lower the bank's administrative costs in fielding customer
inquiries and complaints. Thanks to CDC, the Informatica platform has enabled EVERTEC to carry
out the sophisticated data integration projects it needs to continue growing.
For more details about EVERTEC and the PowerExchange CDC Option, please visit
http://www.informatica.com/customers/services/evertec.htm.
As an outsourcing service provider,
EVERTEC has unique demands
for data integration technology
that meets both tactical and
strategic objectives. Informatica
PowerExchange and PowerCenter
have proven hands-down to be the
right choice in enabling us to rapidly
implement a customer-centric
data hub for our client, Banco
Popular. They can handle millions
of batch and real-time updates
each day, while giving us a trusted
platform that we can leverage
across other customers and business
opportunities.
Miguel Mercado Morales
Senior Vice President, EVERTEC
Conclusion
The Informatica platform has long been focused on addressing the major issues affecting IT
executives everywhere, and CDC has been an important part of addressing these issues:
Tight budgets and schedules. From the codeless, GUI-based development environment, to
the scalability of the platform, to the noninvasive, low-impact nature of CDC, the Informatica
platform is designed to conserve time, hardware, and staff resources, while delivering highest
quality results.
Continuous change. With the PowerExchange CDC Option, users can easily change the delivery
latency from batch to real time as requirements change, with no more than some configuration
changes. The Informatica platform's reuse capabilities further enhance the ability to respond to
changing conditions. And Informatica's vendor neutrality even allows databases, applications,
and hardware platforms to change with minimal customer impact.
High availability. High availability means little unless the latest updates are made available
when and where they're needed. CDC captures those changes as they occur, so they can
be delivered anywhere, in any form. Beyond CDC, the Informatica platform itself offers high-
availability options, and real-time integration capabilities currently used to support businesses
around the world.
Let Informatica help you determine where the PowerExchange CDC Option is right for your
organization.
Visit us at www.informatica.com or call (800) 653-3871.
As business needs change, we
must integrate new internal and
external sources. We must ensure
that developed interfaces and
overall consistency are maintained.
Today, thanks to Informatica tools,
we have a physical and logical data
integration architecture that enables
us to respond to these demands
in a modern way, and all the more
quickly since we can share objects,
manipulate them, and reutilize
them at the subtlest level. It provides
assurance in the long term, not only
of substantial cost saving for the
organization but also of increased
responsiveness, which in the end
means better service.
Fabrice Ladou
Technical Expertise Division Head
Mutuelle Générale
Informatica PowerExchange for Oracle,
DB2, and VSAM
Worldwide Headquarters, 100 Cardinal Way, Redwood City, CA 94063, USA
phone: 650.385.5000 fax: 650.385.5500 toll-free in the US: 1.800.653.3871 www.informatica.com
Informatica Offices Around The Globe: Australia, Belgium, Canada, China, France, Germany, Japan, Korea, the Netherlands, Singapore, Switzerland, United Kingdom, USA
© 2008 Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, and The Data Integration Company are trademarks or registered trademarks of Informatica Corporation in the United States and in
jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.
6816 (10/02/2008)
