
Change Data Capture:

Driving Results with Event Driven Data

WHITE PAPER

This document contains Confidential, Proprietary and Trade Secret Information ("Confidential Information") of Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of Informatica. While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice. The incorporation of the product attributes discussed in these materials into any release or upgrade of any Informatica software product, as well as the timing of any such release or upgrade, is at the sole discretion of Informatica. Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374; 6,208,990; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280; 10/966,046; 10/727,700. This edition published November 2005.


Table of Contents
Executive Summary
Introduction
What Can Change Data Capture Do?
What Can the CDC Option Do for Me?
The Challenge: Continuous Legacy Data Integration
The Solution: PowerExchange CDC Option
The Results: Improved Customer Satisfaction
Conclusion


Our business managers will continue to reap real benefits with the timely delivery of accurate reports and analytics to senior management and regulatory bodies alike. These activities are being performed within hours rather than days and the IT department can get on with its planned projects instead of being distracted by generating one-off reports.
Terence Denton, IT Manager, Edmond de Rothschild
Informatica PowerExchange for IMS DB Change Data Capture Option

Executive Summary
IT executives everywhere are facing the same pressures. Data volumes are growing, regulatory requirements must be met, and deadlines are shrinking. Mergers and acquisitions multiply complexity. Data integration remains as essential as ever. And, for many organizations, day-old data has little, if any, value. The real-time enterprise has arrived.

More hardware? Even when budget is available, constraints on power, cooling, and floor space can make it impossible to add hardware. More headcount? It can be challenging to attract and retain skilled technical staff. More time? Not when time is of the essence.

There's another answer: focus on the most relevant, most urgently needed data and integrate it continuously. By responding to business events at the moment they occur, rather than hours or days later, IT can extend data integration beyond the realm of batch processing and satisfy today's need for timely, accurate data. When combined with a mature, real-time data integration platform, event-driven data can be accessed, transformed, and cleansed continuously and used to drive business processes and decision making across any enterprise.

In this white paper, we'll explore how the foundation of continuous, event-based data integration, a technology called change data capture (or CDC), allows IT to create business value for organizations both large and small.


Introduction
Each organization places unique demands on its IT executives, but three common themes have emerged to impact the majority of organizations. Far from being short-term factors, they have become part of the IT climate.
Budgets and schedules are tight. For some organizations, this is a way to mitigate project risk in the wake of highly publicized IT project failures. For others, this is a response to past adoption of technology that promised more than it delivered. Whether this downward trend stems from the adoption of agile IT project methodologies, or simply from a sound, conservative management approach, the impact is the same: there's not enough time.

Change is constant and continuous. IT executives recognize this as more than a cliché. The enterprise itself is constantly changing, in response to external forces such as regulatory changes or competition. Internal forces such as reorganizations, mergers and acquisitions, and new business initiatives provide additional pressure for change. The IT environment undergoes constant tactical-level change in the form of routine technology refreshes, as well as more strategic change, such as the adoption of service-oriented architecture.

High availability is a must. Between always-on Web sites that aim for continuous availability, and increasing data volumes that strain existing windows, batch processing is becoming less able to satisfy business requirements. When systems can no longer be taken offline for processing, more responsive IT techniques must be adopted to support the business.

Fortunately, there is a capability called change data capture (or CDC) that addresses precisely these issues. Described further in the sidebars, CDC provides access to database updates almost as soon as they occur. The importance of CDC, however, is not the technology itself but the way it provides visibility to critical business events as they happen. By supporting continuous data integration, CDC allows IT executives to sidestep the challenges of constant change and the pressures of budget and schedule constraints.

CHANGE DATA CAPTURE: WHAT IS IT?

Consider the stock ticker, a continuous display of stock prices.

ZNFE 25.10 +4.54   DPBR 8.37 -1.22   JXPY 4.67 -26.18   ...

If you observed the ticker for a while and made note of the price whenever it changed, you would be performing change data capture. It's that simple.

Informatica's CDC Option captures changes made to a database, not a stock ticker, but the principle is the same. It provides access to database updates as soon as they take place. Not all database updates are especially noteworthy, but some, such as the creation of a new customer or the shipment of a customer order, reflect important real-world events. By capturing the database changes related to these significant business events, the CDC Option serves as the focal point for driving continuous, event-based data integration.
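The stock ticker analogy from the sidebar can be sketched in a few lines of Python. This is only an illustration of the principle, not of how any product works: the function name `capture_changes`, the tuple layout, and the sample prices after the first observation are assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Quote = Tuple[str, float]                    # (symbol, price) as shown on the ticker
Change = Tuple[str, Optional[float], float]  # (symbol, previous price, new price)


def capture_changes(quotes: Iterable[Quote]) -> Iterator[Change]:
    """Yield a record only when a symbol's price differs from the last one seen."""
    last_seen: Dict[str, float] = {}
    for symbol, price in quotes:
        previous = last_seen.get(symbol)
        if previous is None or previous != price:
            # First sighting or a new price: this is the "change data".
            yield (symbol, previous, price)
            last_seen[symbol] = price


# Hypothetical stream of ticker observations; repeats carry no new information.
ticker = [
    ("ZNFE", 25.10),
    ("DPBR", 8.37),
    ("ZNFE", 25.10),  # unchanged, ignored
    ("ZNFE", 25.60),  # changed, captured
    ("DPBR", 8.37),   # unchanged, ignored
]
changes = list(capture_changes(ticker))
```

Of five observations, only three produce a change record. Downstream work scales with the number of changes rather than the number of observations, which is the property the rest of this paper relies on.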


HOW DOES CHANGE DATA CAPTURE WORK?


Every commercial DBMS maintains a complete record of all changes made to the database, which is used to recover from failed transactions and other errors. Recognizing the importance of CDC, most DBMS products provide support for accessing this change data. However, each DBMS provides CDC support via its own distinctive methods. For example, some allow change data to be captured from memory instantly, while others require change data to be written to disk before it can be captured.

The effort to understand each DBMS's interfaces and build a connection to them can be considerable. Even more daunting are the architectural issues that arise when designing a robust CDC environment. Each DBMS leaves highly complex issues such as recovery/restart and proper transaction sequencing (critical for success but challenging and high-risk to build) solely to the developer. Because of this, in-house development of CDC is rarely undertaken, and the overwhelming majority of CDC use is via commercial products.

No matter which DBMS is in use, the PowerExchange CDC Option presents a uniform interface to developers and administrators. It automatically discovers the metadata needed to understand the source and target, and ensures that only committed updates are presented for downstream use. By hiding unnecessary complexity, and saving time and effort, PowerExchange's resilient CDC Option is a prime example of the advantages of Informatica's vendor-neutral philosophy.
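To make "only committed updates are presented" and "proper transaction sequencing" concrete, the following Python sketch reads a simplified change log and releases a transaction's updates only when its commit record arrives. The record layout and the operation names (`UPDATE`, `COMMIT`, `ROLLBACK`) are assumptions for illustration. They do not reflect any particular DBMS log format or the PowerExchange implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical log record: (transaction id, operation, payload)
LogRecord = Tuple[int, str, str]


def committed_changes(log: Iterable[LogRecord]) -> Iterator[Tuple[int, str]]:
    """Buffer updates per transaction and release them in commit order."""
    pending: Dict[int, List[str]] = defaultdict(list)
    for txn_id, op, payload in log:
        if op == "UPDATE":
            pending[txn_id].append(payload)   # not yet visible downstream
        elif op == "COMMIT":
            for change in pending.pop(txn_id, []):
                yield (txn_id, change)        # released as a unit, in order
        elif op == "ROLLBACK":
            pending.pop(txn_id, None)         # discarded, never presented


log = [
    (1, "UPDATE", "customer 17 created"),
    (2, "UPDATE", "order 99 shipped"),
    (1, "UPDATE", "customer 17 address set"),
    (2, "COMMIT", ""),
    (3, "UPDATE", "order 100 created"),
    (3, "ROLLBACK", ""),
    (1, "COMMIT", ""),
]
released = list(committed_changes(log))
```

Transaction 2 commits first, so its update is released first even though transaction 1 started earlier, and the rolled-back transaction 3 never appears. A production capture process would also need to persist its position in the log so it can restart after a failure, which this sketch omits.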

What Can Change Data Capture Do?


It depends what you need it to do. One customer, an international travel company, needed to continue enhancing its award-winning customer service; they did it with CDC.

The company had originally built its IT infrastructure on the proprietary AS/400 midrange environment, but as it began to augment its application portfolio, the packaged applications it needed were available only on commodity platforms. As a result, the company had to integrate its critical home-grown legacy application with its newer downstream systems and data warehouse. What's more, the initiative, critical for the future of this 50-year-old firm, had to be carried out by the company's relatively small IT team.

The challenges were significant. To achieve the desired level of customer service, downstream systems needed to learn of changes at a very low latency. One proposed solution, comparing successive database extracts to identify changes, could not be done frequently enough to meet requirements, and would not scale as the company grew. An alternative proposal to enhance the legacy application was judged both too risky and too costly.

Instead, the company selected the Informatica PowerExchange CDC Option. It easily met all latency requirements, without the need for hardware upgrades. Because it is noninvasive, implementation was rapid and the cost was much lower than the alternatives, both key requirements for performing this type of operational data integration.

The result? With CDC in place, new customer orders are propagated downstream in a minute or less, giving the call center and data warehouse access to the latest information. The resulting improvement in customer intelligence has provided significant business benefits:

Revenue and profitability per customer have grown
New products can be brought to market more quickly
More targeted marketing campaigns have reduced cost
Additional upsell and cross-sell opportunities have been created
Customer satisfaction has increased

Even though the company's IT team is small, Informatica brought CDC within its reach.
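The rejected approach of comparing successive database extracts can be sketched as follows. This is a minimal Python illustration with hypothetical names (`diff_snapshots`, and a snapshot modeled as a dictionary from primary key to row contents); it is not the travel company's actual proposal.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[int, str]  # primary key -> row contents (simplified)


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Tuple[str, int]]:
    """Identify changes by comparing two full extracts of the same table."""
    changes: List[Tuple[str, int]] = []
    for key in new:                      # every row is read, changed or not
        if key not in old:
            changes.append(("INSERT", key))
        elif new[key] != old[key]:
            changes.append(("UPDATE", key))
    for key in old:
        if key not in new:
            changes.append(("DELETE", key))
    return sorted(changes, key=lambda change: change[1])


yesterday = {1: "a", 2: "b", 3: "c"}
today = {1: "a", 2: "B", 4: "d"}
detected = diff_snapshots(yesterday, today)
```

The comparison reads every row of both extracts no matter how few rows changed, so its cost grows with table size rather than with change volume. It also cannot see a row that was updated twice, or inserted and then deleted, between extracts. Both limitations are consistent with the scaling and latency concerns described above.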

[Figure labels: Change Log, CDC Option, PowerCenter] CDC captures database changes, forwarding them to PowerCenter for further processing.


What Can the CDC Option Do for Me?


When combined with other components of the Informatica platform, CDC can support a wide variety of business requirements. For example:
Enterprise Data Warehousing (EDW) initiatives can eliminate the downtime barriers imposed by large batch extracts and provide up-to-the-minute trickle feeds to any data warehouse that needs it. CDC is typically the only feasible way to maintain a truly massive warehouse, but as highlighted above, even small organizations can benefit from the low-latency data integration that CDC supports, and all organizations can benefit from the reduction in hardware use that CDC enables.

Operational Data Integration (ODI) projects are ideal uses for CDC. The case study below discusses ODI in more detail, but the noninvasive, event-based access provided by CDC is essential for ODI initiatives.

Event-driven architectures (EDA), which are based on the creation and processing of business events, can use CDC as a key component. Any database update detected by CDC can be transformed into an event, enriched as needed in Informatica PowerCenter and pushed into JMS or any other supported messaging system, or can be delivered to a Web service for further processing. No staging databases are needed; transformed data is passed directly from PowerCenter to the target.

Virtually any application can be CDC-enabled without being modified. If a captured database update does not contain enough information, PowerCenter can access the source database, or use whatever interface has been mandated for accessing application objects. The resulting data can be sent to messaging systems, stored in other databases, delivered to a Web service, or exposed as a Web service for others to use.

The Informatica platform supports flexible data integration architectures, and the PowerExchange CDC Option brings highly responsive data access to the Informatica platform. It's truly possible to do more, with less.

[Figure: Informatica PowerCenter] Without CDC, all data must be accessed in a batch and moved whenever it is needed. That's impossible with today's low-latency requirements, or whenever data volume is large.

[Figure: Informatica PowerCenter] With CDC, only a small number of changes are processed, and only as they take place. This can lead to significant savings in hardware capacity, provide better responsiveness, and is essential for implementing continuous, event-based data integration.
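The event-driven flow described above (captured update, transformed into an event, enriched, then pushed to a messaging system) can be outlined in Python. Everything here is a stand-in chosen for illustration: an in-memory `queue.Queue` plays the role of the messaging system, the `CUSTOMER_NAMES` lookup plays the role of enrichment, and the field names are hypothetical. None of it represents the PowerCenter or JMS APIs.

```python
import json
import queue
from typing import Dict

# Hypothetical reference data standing in for enrichment from another source.
CUSTOMER_NAMES: Dict[int, str] = {17: "Acme Travel"}


def to_event(change: dict) -> dict:
    """Turn a captured row change into a business event with a customer name added."""
    return {
        "type": f"{change['table']}.{change['op'].lower()}",
        "key": change["key"],
        "customer_name": CUSTOMER_NAMES.get(change["customer_id"], "unknown"),
    }


def publish(change: dict, topic: "queue.Queue[str]") -> None:
    """Serialize the enriched event and hand it straight to the messaging layer."""
    topic.put(json.dumps(to_event(change), sort_keys=True))


# In-memory queue used here in place of a real messaging system.
topic: "queue.Queue[str]" = queue.Queue()
publish({"table": "orders", "op": "INSERT", "key": 99, "customer_id": 17}, topic)
message = json.loads(topic.get_nowait())
```

The event goes from capture to the queue without being written to an intermediate table, which mirrors the "no staging databases" point in the text.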


As an outsourcing service provider, EVERTEC has unique demands for data integration technology that meets both tactical and strategic objectives. Informatica PowerExchange and PowerCenter have proven hands-down to be the right choice in enabling us to rapidly implement a customer-centric data hub for our client, Banco Popular. They can handle millions of batch and real-time updates each day, while giving us a trusted platform that we can leverage across other customers and business opportunities.
Miguel Mercado Morales, Senior Vice President, EVERTEC

CDC Case Study: EVERTEC


EVERTEC is the IT outsourcing and transaction processing subsidiary of Banco Popular Inc., a 114-year-old financial services provider based in Puerto Rico, with more than $44.4 billion in assets and over 280 branches and offices in the United States, the Caribbean, and Latin America. EVERTEC employs 1,700 professionals and manages more than 850 million transactions each year through some 4,300 ATMs and 85,000 point-of-sale (POS) terminals.

The Challenge: Continuous Legacy Data Integration


Formed as the result of a merger, EVERTEC's first assignment was to re-architect a legacy infrastructure originally built with a messaging system linking 14 VSAM- and DB2-based mainframe applications. As the centerpiece of this ambitious initiative, EVERTEC blueprinted an operational data store (ODS) and enterprise data integration platform, called the TIP Transaction Vault, to provide a unified view of all customer activity and serve as the foundation urgently needed to support continued growth.

The Solution: PowerExchange CDC Option


The largest legacy application, called the Integrated Deposit Application (IDA), carries approximately 90 percent of the customer transaction load (some 3 million records daily), and the PowerExchange CDC Option is used to continuously integrate each IDA update with the ODS. Even though this 20-year-old homegrown application was built on a combination of VSAM and sequential files, with an idiosyncratic data model, the Informatica platform easily handles this event-based data integration requirement.

Although low-latency access to IDA updates is critical, PowerExchange is also used in a batch mode to integrate another 4.5 million records daily where higher latency is acceptable. If conditions change, and CDC becomes a necessity for some of these integrations, only configuration changes will be needed; no coding changes will be required, as with hand-coded solutions.

Rapid implementation was a key factor in the success of the project, and EVERTEC estimates the savings to be between $700,000 and $800,000 over one year compared to custom-coding alternatives. Ease of use enabled a small team of three to five full-time employees to deploy the integration in just six months (less than half the time otherwise expected), and no scarce mainframe COBOL resources were needed. These savings will continue as EVERTEC's internal Competency Center builds on its experience with the Informatica platform and reuses prior data integration work in future projects.

The Results: Improved Customer Satisfaction


The TIP Transaction Vault is helping transform EVERTEC's batch-oriented environment into an on-demand infrastructure that meets customer expectations for timely, accurate information on account activity. The Informatica platform is the foundation for a data hub that helps ensure customer satisfaction, reduce churn, and lower the bank's administrative costs in fielding customer inquiries and complaints. Thanks to CDC, the Informatica platform has enabled EVERTEC to carry out the sophisticated data integration projects it needs to continue growing.

For more details about EVERTEC and the PowerExchange CDC Option, please visit http://www.informatica.com/customers/services/evertec.htm.


Conclusion
The Informatica platform has long been focused on addressing the major issues affecting IT executives everywhere, and CDC has been an important part of addressing these issues:
Tight budgets and schedules. From the codeless, GUI-based development environment, to the scalability of the platform, to the noninvasive, low-impact nature of CDC, the Informatica platform is designed to conserve time, hardware, and staff resources, while delivering highest quality results.

Continuous change. With the PowerExchange CDC Option, users can easily change the delivery latency from batch to real time as requirements change, with no more than some configuration changes. The Informatica platform's reuse capabilities further enhance the ability to respond to changing conditions. And Informatica's vendor neutrality even allows databases, applications, and hardware platforms to change with minimal customer impact.

High availability. High availability means little unless the latest updates are made available when and where they're needed. CDC captures those changes as they occur, so they can be delivered anywhere, in any form. Beyond CDC, the Informatica platform itself offers high-availability options, and real-time integration capabilities currently used to support businesses around the world.

Let Informatica help you determine where the PowerExchange CDC Option is right for your organization. Visit us at www.informatica.com or call (800) 653-3871.

As business needs change, we must integrate new internal and external sources. We must ensure that developed interfaces and overall consistency are maintained. Today, thanks to Informatica tools, we have a physical and logical data integration architecture that enables us to respond to these demands in a modern way, and all the more quickly since we can share objects, manipulate them, and reutilize them at the subtlest level. It provides assurance in the long term, not only of substantial cost saving for the organization but also of increased responsiveness, which in the end means better service.
Fabrice Ladou, Technical Expertise Division Head, Mutuelle Générale
Informatica PowerExchange for Oracle, DB2, and VSAM


Worldwide Headquarters, 100 Cardinal Way, Redwood City, CA 94063, USA phone: 650.385.5000 fax: 650.385.5500 toll-free in the US: 1.800.653.3871 www.informatica.com

Informatica Offices Around The Globe: Australia, Belgium, Canada, China, France, Germany, Japan, Korea, the Netherlands, Singapore, Switzerland, United Kingdom, USA
© 2008 Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, and The Data Integration Company are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

6816 (10/02/2008)
