Informatica PowerExchange (Version 9.0)

CDC Guide for Linux, UNIX, and Windows

Informatica PowerExchange CDC Guide for Linux, UNIX, and Windows
Version 9.0
December 2009
Copyright (c) 1998-2009 Informatica. All rights reserved.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending. Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013 (c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable. The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing. Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange and Informatica On Demand are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners. Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights reserved. Copyright © Aandacht c.v.
All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright 2007 Isomorphic Software. All rights reserved. Copyright © Meta Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems Incorporated. All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All rights reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights reserved. Copyright © Glyph & Cog, LLC. All rights reserved. This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C.
Schmidt and his research group at Washington University, University of California, Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved. This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of this software is subject to terms available at http://www.openssl.org. This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <daniel@haxx.se>. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies. The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.dom4j.org/license.html. The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://svn.dojotoolkit.org/dojo/trunk/LICENSE. This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html. This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at http://www.gnu.org/software/kawa/Software-License.html. This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project, Copyright © 2002 Cable & Wireless Deutschland.
Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.

This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject to terms available at http://www.boost.org/LICENSE_1_0.txt. This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at http://www.pcre.org/license.txt. This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.eclipse.org/org/documents/epl-v10.php. This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/license.html, http://www.asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org, http://slf4j.org/license.html, and http://www.sente.ch/software/OpenSourceLicense.htm. This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution License (http://www.opensource.org/licenses/cddl1.php), the Common Public License (http://www.opensource.org/licenses/cpl1.0.php) and the BSD License (http://www.opensource.org/licenses/bsd-license.php). This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab.
For further information please visit http://www.extreme.indiana.edu/. This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775; 6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,254,590; 7,281,001; 7,421,458; and 7,584,422, international Patents and other Patents Pending. DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.
NOTICES This Informatica product (the “Software”) includes certain drivers (the “DataDirect Drivers”) from DataDirect Technologies, an operating company of Progress Software Corporation (“DataDirect”) which are subject to the following terms and conditions: 1. THE DATADIRECT DRIVERS ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS. Part Number: PWX-CCl-900-0001

Table of Contents

Preface
    Informatica Resources
    Informatica Customer Portal
    Informatica Documentation
    Informatica Web Site
    Informatica How-To Library
    Informatica Knowledge Base
    Informatica Multimedia Knowledge Base
    Informatica Global Customer Support

Part I: PowerExchange CDC Introduction

Chapter 1: Change Data Capture Introduction
    PowerExchange CDC Overview
    Change Data Capture
    Change Data Extraction and Apply
    PowerExchange CDC Data Sources
    DB2 for Linux, UNIX, and Windows Data Sources
    Microsoft SQL Server Data Sources
    Oracle Data Sources
    i5/OS and z/OS Data Sources with Offload Processing
    PowerExchange CDC Components
    PowerExchange Listener
    PowerExchange Logger for Linux, UNIX, and Windows
    PowerExchange Navigator
    PowerExchange Integration with PowerCenter
    PowerExchange CDC Architecture
    Summary of CDC Implementation Tasks

Part II: PowerExchange CDC Components

Chapter 2: PowerExchange Listener
    PowerExchange Listener Overview
    Customizing the dbmover.cfg File for CDC
    CAPI_CONNECTION Statements
    Starting the PowerExchange Listener
    Stopping the PowerExchange Listener
    Displaying Active PowerExchange Listener Tasks

Chapter 3: PowerExchange Logger for Linux, UNIX, and Windows
    PowerExchange Logger Overview
    PowerExchange Logger Tasks
    PowerExchange Logger Files
    Cache Files
    CDCT File
    Checkpoint Files
    Lock Files
    Message Log Files
    PowerExchange Logger Log Files
    File Switches
    PowerExchange Logger Operational Modes
    Continuous Mode
    Batch Mode
    PowerExchange Logger Considerations on Linux and UNIX
    PowerExchange Logger Memory Requirement on Linux or UNIX
    Running the PowerExchange Logger in Background Mode
    Configuring the PowerExchange Logger
    Enabling a Capture Registration for PowerExchange Logger Use
    Customizing the PowerExchange Logger Configuration File
    Customizing dbmover.cfg for the PowerExchange Logger
    Using PowerExchange Logger Group Definitions
    Starting the PowerExchange Logger
    PWXCCL Syntax and Parameters
    How the PowerExchange Logger Determines the Start Point for a Cold Start
    Cold Starting the PowerExchange Logger
    Managing the PowerExchange Logger
    Commands for Controlling and Stopping PowerExchange Logger Processing
    Assessing PowerExchange Logger Performance
    Maintaining the PowerExchange Logger CDCT File and Log Files
    Backing Up PowerExchange Logger Files
    Re-creating the CDCT File After a Failure

Part III: PowerExchange CDC Data Sources

Chapter 4: DB2 for Linux, UNIX, and Windows Change Data Capture
    DB2 for Linux, UNIX, and Windows CDC Overview
    Planning for DB2 CDC
    Prerequisites
    Required User Authority
    CDC Restrictions
    Configuring DB2 for CDC
    Configuring PowerExchange for DB2 CDC
    Configuring PowerExchange CDC without the PowerExchange Logger
    Configuring PowerExchange CDC with the PowerExchange Logger
    Creating the Capture Catalog Table
    Initializing the Capture Catalog Table
    Customizing dbmover.cfg for DB2 CDC
    Using a DB2 Data Map
    Task Flow for DB2 Data Map Use
    Managing DB2 CDC
    Stopping DB2 CDC
    Changing a DB2 Source Table Definition
    Reconfiguring a Partitioned Database or Database Partition Group
    DB2 for Linux, UNIX, and Windows CDC Troubleshooting
    IBM APARs for Specific Issues
    Workaround for SQL1224 Error on AIX

Chapter 5: Microsoft SQL Server Change Data Capture
    Microsoft SQL Server CDC Overview
    Planning for SQL Server CDC
    SQL Server CDC Prerequisites
    Required User Authority for SQL Server CDC
    Datatypes Supported for SQL Server CDC
    SQL Server CDC Restrictions
    Configuring SQL Server for CDC
    Configuring PowerExchange for SQL Server CDC
    Configuring PowerExchange CDC without the PowerExchange Logger
    Configuring PowerExchange CDC with the PowerExchange Logger
    Customizing dbmover.cfg for SQL Server CDC
    Managing SQL Server CDC
    Disabling Publication of Change Data for a SQL Server Source
    Changing a SQL Server Source Table Definition

Chapter 6: Oracle Change Data Capture with Oracle LogMiner
    Overview of Oracle LogMiner CDC
    Planning for Oracle LogMiner CDC
    Requirements and Restrictions for Oracle LogMiner CDC
    Datatypes Supported for Oracle LogMiner CDC
    SQL*Loader Restrictions
    Performance Considerations for Oracle LogMiner CDC
    Configuration in an Oracle RAC Environment
    Oracle Configuration for LogMiner CDC
    Configuration Script Files
    Configuring Oracle for LogMiner CDC
    PowerExchange Configuration for Oracle LogMiner CDC
    Configuring Oracle LogMiner CDC without the PowerExchange Logger
    Configuring Oracle LogMiner CDC with the PowerExchange Logger
    Customizing dbmover.cfg for Oracle LogMiner CDC
    Management of Oracle LogMiner CDC
    Stopping Oracle LogMiner CDC
    Changing a Source Table Definition Used in Oracle LogMiner CDC

Part IV: Change Data Extraction

Chapter 7: Introduction to Change Data Extraction
    Change Data Extraction Overview
    Extraction Modes
    PowerExchange-Generated Columns in Extraction Maps
    Restart Tokens and the Restart Token File
    Generating Restart Tokens
    Restart Token File
    Recovery and Restart Processing for CDC Sessions
    PowerCenter Recovery Tables for Relational Targets
    PowerCenter Recovery Files for Nonrelational Targets
    Application Names
    Restart Processing for CDC Sessions
    Group Source Processing in PowerExchange
    Using Group Source with Nonrelational Sources
    Using Group Source with CDC Sources
    Commit Processing with PWXPC
    Controlling Commit Processing
    Maximum and Minimum Rows per Commit
    Target Latency
    Examples of Commit Processing
    Offload Processing
    CDC Offload Processing
    Multithreaded Processing

Chapter 8: Extracting Change Data
    Overview of Extracting Change Data
    Task Flow for Extracting Change Data
    Testing a Change Data Extraction
    Configuring PowerCenter CDC Sessions
    Changing Default Values for Session and Connection Attributes
    Configuring Application Connection Attributes
    Creating Restart Tokens for Extractions
    Displaying Restart Tokens
    Configuring the Restart Token File
    Restart Token File Statements
    Restart Token File - Example

Chapter 9: Managing Change Data Extractions
    Starting PowerCenter CDC Sessions
    Cold Start Processing
    Warm Start Processing
    Recovery Processing
    Stopping PowerCenter CDC Sessions
    Stop Command Processing
    Terminating Conditions
    Changing PowerCenter CDC Sessions
    Examples of Creating a Restart Point
    Recovering PowerCenter CDC Sessions
    Example of Session Recovery

Chapter 10: Monitoring and Tuning Options
    Monitoring Change Data Extractions
    Monitoring CDC Sessions in PowerExchange
    Monitoring CDC Sessions in PowerCenter
    Tuning Change Data Extractions
    Using PowerExchange Parameters to Tune CDC Sessions
    Using Connection Options to Tune CDC Sessions
    CDC Offload and Multithreaded Processing
    Planning for CDC Offload and Multithreaded Processing
    Enabling Offload and Multithreaded Processing for CDC Sessions
    Configuring PowerExchange to Capture Change Data on a Remote System
    Extracting Change Data Captured on a Remote System
    Configuration File Examples for CDC Offload Processing

Index

Preface

This guide describes how to configure, implement, and manage PowerExchange Change Data Capture (CDC) on Linux, UNIX, and Windows systems.

This guide applies to the CDC option of the following PowerExchange products:
- PowerExchange for DB2® for Linux®, UNIX®, and Windows®
- PowerExchange for Oracle®
- PowerExchange for SQL Server®

Note: If you use the offloading feature, some PowerExchange CDC processing for DB2 for i5/OS data sources and z/OS data sources can also run on Linux, UNIX, or Windows.

Before implementing change data capture, verify that you have installed the required PowerExchange components.

Informatica Resources

Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica How-To Library, the Informatica Knowledge Base, the Informatica Multimedia Knowledge Base, Informatica Documentation Center, and access to the Informatica user community.

Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have questions, comments, or ideas about this documentation, contact the Informatica Documentation team through email at infa_documentation@informatica.com. We will use your feedback to improve our documentation. Let us know if we can contact you regarding your comments.
The Documentation team updates documentation as needed. To get the latest documentation for your product, navigate to the Informatica Documentation Center from http://my.informatica.com.

Informatica Web Site
You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

Informatica How-To Library
As an Informatica customer, you can access the Informatica How-To Library at http://my.informatica.com. The How-To Library is a collection of resources to help you learn more about Informatica products and features. It includes articles and interactive demonstrations that provide solutions to common problems, compare features and behaviors, and guide you through performing specific real-world tasks.

Informatica Knowledge Base
As an Informatica customer, you can access the Informatica Knowledge Base at http://my.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base team through email at KB_Feedback@informatica.com.

Informatica Multimedia Knowledge Base
As an Informatica customer, you can access the Informatica Multimedia Knowledge Base at http://my.informatica.com. The Multimedia Knowledge Base is a collection of instructional multimedia files that help you learn about common concepts and guide you through performing specific tasks. If you have questions, comments, or ideas about the Multimedia Knowledge Base, contact the Informatica Knowledge Base team through email at KB_Feedback@informatica.com.

Informatica Global Customer Support
You can contact a Customer Support Center by telephone or through the WebSupport Service. WebSupport requires a user name and password. You can request a user name and password at http://my.informatica.com.
Use the following telephone numbers to contact Informatica Global Customer Support:

North America / South America
   Toll Free: +1 877 463 2435
   Standard Rate:
      Brazil: +55 11 3523 7761
      Mexico: +52 55 1168 9763
      United States: +1 650 385 5800

Europe / Middle East / Africa
   Toll Free: 00 800 4632 4357
   Standard Rate:
      Belgium: +32 15 281 702
      France: +33 1 41 38 92 26
      Germany: +49 1805 702 702
      Netherlands: +31 306 022 797
      Spain and Portugal: +34 93 480 3760
      United Kingdom: +44 1628 511 445

Asia / Australia
   Toll Free:
      Australia: 1 800 151 830
      Singapore: 001 800 4632 4357
   Standard Rate:
      India: +91 80 4112 5738

Part I: PowerExchange CDC Introduction

This part contains the following chapters:
- Change Data Capture Introduction

CHAPTER 1

Change Data Capture Introduction

This chapter includes the following topics:
- PowerExchange CDC Overview
- PowerExchange CDC Data Sources
- PowerExchange CDC Components
- PowerExchange Integration with PowerCenter
- PowerExchange CDC Architecture
- Summary of CDC Implementation Tasks

PowerExchange CDC Overview

PowerExchange Change Data Capture (CDC) works in conjunction with PowerCenter to capture changes to data in source tables and replicate those changes to target tables and files.

This guide describes PowerExchange CDC for relational database sources on Linux, UNIX, or Windows operating systems. These sources are:
- DB2 for Linux, UNIX, and Windows
- Microsoft SQL Server on Windows
- Oracle on Linux, UNIX, or Windows

After materializing target tables or files with PowerExchange bulk data movement, you can use PowerExchange CDC to synchronize the targets with their corresponding source tables. Synchronization is faster when you replicate only the change data rather than all of the data.

The change data replication process consists of the following high-level steps:
1. Change data capture. PowerExchange captures change data for the source tables. PowerExchange can read change data directly from the RDBMS log files or database. Optionally, you can use the PowerExchange Logger for Linux, UNIX, and Windows to capture change data to its log files.
2. Change data extraction. PowerExchange, in conjunction with PowerCenter, extracts captured change data for movement to the target.
3. Change data apply. PowerExchange, in conjunction with PowerCenter, transforms and applies the extracted change data to target tables or files.

Change Data Capture

PowerExchange can capture change data directly from DB2 recovery logs, Microsoft SQL Server distribution databases, or Oracle redo logs.

If you do not retain database log files long enough for CDC processing to complete, use the PowerExchange Logger for Linux, UNIX, and Windows. The PowerExchange Logger writes change data to its log files. PowerExchange can then extract change data from the PowerExchange Logger log files rather than from the database log files. If you use the offloading feature in combination with the PowerExchange Logger for Linux, UNIX, and Windows, a PowerExchange Logger process can log change data from data sources on an i5/OS or z/OS system.

For each source table, you must define a capture registration in the PowerExchange Navigator. The capture registration provides metadata for the columns that are selected for change capture.

PowerExchange captures changes that result from successful SQL INSERT, DELETE, and UPDATE operations. Depending on the statement type, PowerExchange captures the following data images:
- For INSERTs, PowerExchange captures after images only. An after image reflects a row just after an INSERT operation. PowerExchange passes these changes as INSERTs to PowerCenter.
- For DELETEs, PowerExchange captures before images only. A before image reflects a row just prior to the last DELETE operation. PowerExchange passes these changes as DELETEs to PowerCenter.
- For UPDATEs, PowerExchange captures the following image types:
   - After images if you select an image type of “AI” in the CDC application connection attributes. PowerExchange passes only the after-image data for an updated row, unless you also request before-image data. PowerExchange passes an UPDATE to PowerCenter as an UPDATE or INSERT.
   - Both before and after images if you select an image type of “BA” in the CDC application connection attributes for PowerCenter. PowerExchange passes an UPDATE to PowerCenter as a DELETE of the before-image data followed by an INSERT of the after-image data.

When you create a capture registration for a source table, the PowerExchange Navigator generates a corresponding extraction map and application name for the extraction. The extraction map describes the columns for which to extract change data. You can edit the extraction map to remove columns from extraction processing. Also, you can create alternative extraction maps, each for a subset of the columns that are registered for capture.

For DB2 for Linux, UNIX, and Windows data sources only, you can create a data map if you have user-defined or multi-field columns for which you want to manipulate data before loading it to the target.

Change Data Extraction and Apply

PowerExchange works with PowerCenter to extract change data and write it to one or more target tables or files. The targets can be on the same system as the source or on a different system. From PowerCenter, you run a CDC workflow and session that extracts and applies change data.

To define a data source in PowerCenter, you can import the extraction map or import the table definition from the source database through PowerExchange. In most situations, Informatica recommends that you import the extraction map. For DB2 only, you can import a DB2 data map instead of the extraction map.

Also, you must define a mapping, session, and workflow in PowerCenter. Optionally, you can include transformations in the mapping to manipulate the change data.

When you define a CDC session, you must specify a connection type. The connection type determines the extraction mode and access method that PowerExchange uses to extract data. To extract change data directly from source DB2 or Oracle log files or SQL Server distribution database, you must use the real-time extraction mode. To extract change data from PowerExchange Logger log files, you can use either the batch extraction mode or continuous extraction mode.
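Returning to the image types described under Change Data Capture, the rules for what PowerCenter receives can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not PowerExchange code: the function name, the row dictionaries, and the tuple layout are assumptions made for the example. For the "AI" case the guide says an UPDATE is passed as an UPDATE or INSERT; the sketch models it as an UPDATE.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


def rows_passed_to_powercenter(
    operation: str,
    before: Optional[Row],
    after: Optional[Row],
    image_type: str = "AI",
) -> List[Tuple[str, Optional[Row]]]:
    """Return the (operation, row image) pairs for one captured change.

    Hypothetical helper: it mirrors the image-type rules in the text only.
    """
    if operation == "INSERT":
        # INSERTs carry the after image only.
        return [("INSERT", after)]
    if operation == "DELETE":
        # DELETEs carry the before image only.
        return [("DELETE", before)]
    if operation == "UPDATE":
        if image_type == "BA":
            # Before and after images: a DELETE of the before image
            # followed by an INSERT of the after image.
            return [("DELETE", before), ("INSERT", after)]
        if image_type == "AI":
            # After image only (modeled here as an UPDATE).
            return [("UPDATE", after)]
        raise ValueError(f"unknown image type: {image_type}")
    raise ValueError(f"unknown operation: {operation}")


old_row = {"id": 1, "city": "Austin"}
new_row = {"id": 1, "city": "Boston"}
print(rows_passed_to_powercenter("UPDATE", old_row, new_row, "BA"))
```

With an image type of "BA", a mapping sees two rows for each source UPDATE, which matters if the target logic counts rows or relies on update strategy.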

The following table describes the real-time, batch, and continuous extraction modes:

Extraction Mode: Real-time extraction mode
   Description: Reads change data directly from the database log files in near real time, on an ongoing basis. This mode provides the lowest latency for change data extraction but potentially the highest impact on system resources.

Extraction Mode: Batch extraction mode
   Description: Reads change data from PowerExchange Logger log files that are in a closed state when an extraction request is made. After processing the log files, the extraction request ends. This mode provides the highest latency for change data extraction but minimizes the impact on system resources.

Extraction Mode: Continuous extraction mode
   Description: Reads change data continuously from open and closed PowerExchange Logger log files in near real time. This mode also minimizes database log accesses and the log retention period that is required for CDC.

To initiate change data extraction and apply processing, run a CDC workflow and session from PowerCenter. When the PowerExchange Listener receives an extraction request, it pulls the change data from the log files and transmits the data to PowerCenter for extraction and apply processing.

During extraction processing, PowerExchange extracts changes from the change stream in chronological order based on the unit of work (UOW) end time. PowerExchange passes only the successfully committed changes to PowerCenter for processing. PowerExchange does not pass ABORTed or UNDO changes. If you are capturing changes from DB2 recovery logs or Oracle redo logs, changes that were contiguous in the change stream might not be contiguous in the reconstructed UOW that PowerExchange passes to PowerCenter.

To properly resume extraction processing, PowerExchange maintains restart tokens for each source table. Restart tokens are used for all extraction modes. To generate current restart tokens, you can use the PowerExchange Navigator, the special override statement in the restart token file, or the DTLUAPPL utility.

RELATED TOPICS:
- “Introduction to Change Data Extraction” on page 105

PowerExchange CDC Data Sources

PowerExchange can capture change data from DB2 and Oracle data sources on Linux, UNIX, or Windows systems. PowerExchange can also capture change data from Microsoft SQL Server data sources on Windows. If you use the PowerExchange Logger for Linux, UNIX, and Windows in combination with the offloading feature, you can also process change data from data sources on i5/OS or z/OS.

DB2 for Linux, UNIX, and Windows Data Sources

PowerExchange captures change data from DB2 recovery log files for the database that contains your source tables. For CDC to work, archive logging must be active for the database. Also, you must create a PowerExchange capture catalog table in the source database. The capture catalog table stores information about the source tables and columns, including DB2 log positioning information.

In the PowerExchange Navigator, you must create a capture registration for each source table. The PowerExchange Navigator generates a corresponding extraction map and application name. You can import the extraction map into PowerCenter to define the source for extraction and apply processing.

If you have a source table with user-defined fields or multi-field columns, you can create a data map to manipulate these fields with expressions. For example, you might want to create a data map to manipulate packed data in a CHAR column. If you create a data map, you must still create a capture registration and merge the data map with the extraction map that is generated for the capture registration.

RELATED TOPICS:
- “DB2 for Linux, UNIX, and Windows Change Data Capture” on page 56

Microsoft SQL Server Data Sources

PowerExchange CDC uses Microsoft SQL Server transactional replication technology to access data in SQL Server distribution databases. For CDC to work, you must enable SQL Server Replication on the system from which change data is captured. Also, the Microsoft SQL Server Agent must be running.

If your database has a high volume of change activity, use a distributed server as the host of the distribution database. Also, verify that each source table in the distribution database has a primary key.

RELATED TOPICS:
- “Microsoft SQL Server Change Data Capture” on page 70

Oracle Data Sources

PowerExchange uses Oracle LogMiner to read change data from Oracle archive logs. Because PowerExchange reads data from Oracle archive logs, you must run Oracle in ARCHIVELOG mode. When the extraction process runs, PowerExchange requires a copy of the Oracle online catalog in the archive logs to determine restart points for change data extraction processing.

If you have Oracle Version 10g Release 2 or later, PowerExchange supports CDC in Oracle Real Application Cluster (RAC) environments. In a RAC, the Oracle archive logs for all Oracle instances in the RAC must reside on shared disk storage for PowerExchange to access them.

RELATED TOPICS:
- “Oracle Change Data Capture with Oracle LogMiner” on page 80

i5/OS and z/OS Data Sources with Offload Processing

You can use CDC offload processing in combination with the PowerExchange Logger for Linux, UNIX, and Windows to log change data from data sources on systems other than the system where the PowerExchange Logger runs. With offload processing, a PowerExchange Logger process on Linux, UNIX, and Windows can log change data from i5/OS and z/OS systems as well as from other Linux, UNIX, or Windows systems. For example, a PowerExchange Logger process can log change data from a DB2 instance on z/OS.

RELATED TOPICS:
- “CDC Offload and Multithreaded Processing” on page 159

PowerExchange CDC Components

The following PowerExchange components are used for change data capture (CDC):
- PowerExchange Listener. Required, unless PowerExchange and the PowerCenter Integration Service are installed on the same physical machine.
- PowerExchange Logger for Linux, UNIX, and Windows. Optional.
- PowerExchange Navigator. Required.

PowerExchange Listener

The PowerExchange Listener manages capture registrations and extraction maps for all CDC data sources. It also manages data maps if you create any for DB2 for Linux, UNIX, and Windows tables. The PowerExchange Listener maintains this information in the following files:
- CCT file for capture registrations
- CAMAPS directory for extraction maps
- DATAMAPS directory for DB2 data maps

The PowerExchange Listener also handles PowerCenter extraction requests for both change data replication and bulk data movement. A PowerExchange Listener is not required if PowerExchange and the PowerCenter Integration Service run on the same physical machine.

When you create, edit, or delete capture registrations or extraction maps in the PowerExchange Navigator, the PowerExchange Navigator uses the location value in the registration group and extraction group to contact the PowerExchange Listener. This location corresponds to a NODE statement in the dbmover.cfg file. For example, when you open a registration group for an RDBMS instance, the PowerExchange Navigator communicates with the PowerExchange Listener to get all capture registrations defined for that instance.

RELATED TOPICS:
- “PowerExchange Listener” on page 12

PowerExchange Logger for Linux, UNIX, and Windows

The PowerExchange Logger for Linux, UNIX, or Windows captures change data from DB2 recovery logs, Oracle redo logs, or a SQL Server distribution database and writes that data to PowerExchange Logger log files. The PowerExchange Logger writes all successful UOWs in chronological order based on end time to its log files. This practice maintains transactional integrity. You can extract the change data from the PowerExchange Logger log files in either batch or continuous mode.

Use of the PowerExchange Logger is optional. To use the PowerExchange Logger, run one PowerExchange Logger process for each database type and instance.

Note: The PowerExchange Condense component has been deprecated in PowerExchange Version 8.6.1. The PowerExchange Logger replaces PowerExchange Condense. Although PowerExchange 8.6.1 tolerates continued use of PowerExchange Condense for partial condense processing, Informatica recommends that you migrate to the PowerExchange Logger. Future PowerExchange versions will require migration to the PowerExchange Logger.

Benefits of the PowerExchange Logger include:
- Source database overhead is reduced because PowerExchange makes fewer accesses to the source log files or database to read change data. For Oracle, this overhead reduction can be significant. The PowerExchange Logger can use only one Oracle LogMiner session to read change data for all extractions that process an Oracle instance.
- You do not need to retain the source RDBMS log files longer than normal for CDC.
- PowerExchange does not need to reposition its point in the DB2 or Oracle logs from which to resume reading data. This feature can significantly reduce restart times.

Tip: For Oracle data sources, Informatica recommends that you run the PowerExchange Logger rather than use real-time extraction mode. This configuration enables PowerExchange to use one Oracle LogMiner session for all extractions that process an Oracle instance. Multiple concurrent LogMiner sessions can significantly degrade performance on the machine where CDC sessions run, including the performance of real-time extractions. Use continuous extraction mode for near-real-time access to change data.

RELATED TOPICS:
- “PowerExchange Logger for Linux, UNIX, and Windows” on page 19

PowerExchange Navigator

The PowerExchange Navigator is the graphical user interface from which you define and manage capture registrations, extraction maps, and data maps.

You must define a capture registration for each source table. The corresponding extraction map is automatically generated. For DB2 sources, you can also define data maps if you need to perform column-level processing, such as adding user-defined columns and building expressions to populate them. You can import the extraction maps into PowerCenter so that they can be used for moving change data to the target.

Note: If the PowerExchange Navigator is not installed on the same machine as a Microsoft SQL Server data source, you must install the SQL Server client software on the PowerExchange Navigator machine. The client software is required because PowerExchange uses SQL Server services when creating capture registrations. For the same situation with DB2 and Oracle data sources, you do not need the RDBMS client software. Instead, from the PowerExchange Navigator, you can point to the PowerExchange Listener on the machine that contains the source DB2 database or Oracle instance.

PowerExchange Integration with PowerCenter

PowerCenter provides transformation and data cleansing functions that you can use in CDC sessions. After capturing change data, use PowerCenter in conjunction with PowerExchange to extract and transform the change data and then apply it to one or more targets.

To integrate PowerExchange with PowerCenter, use either the PowerExchange Client for PowerCenter (PWXPC) or the PowerExchange ODBC drivers in PowerCenter. Informatica recommends that you use PWXPC. PWXPC provides more functionality, better performance, and better recovery and restart capabilities.

Note: This guide assumes that you use PWXPC. For more information about PWXPC and the PowerExchange ODBC drivers, see PowerExchange Interfaces for PowerCenter.

PowerExchange CDC Architecture

The PowerExchange CDC architecture is sufficiently flexible to handle many change data replication scenarios.

You can use PowerExchange in conjunction with PowerCenter to replicate change data from multiple sources of the same RDBMS type to multiple targets of different types in a single session. In this manner, you can replicate change data from multiple sources in the same database or instance to multiple target tables in a single extraction process.

The targets can be tables or files on the same system as the source or on other systems. The PowerCenter Integration Service can write data to tables in some RDBMSs as well as to flat files and XML files. If you installed PowerExchange or PowerExchange (PowerCenter Connect) products that provide connectivity to additional nonrelational or relational targets, you can also load data to those targets, for example, DB2 for z/OS tables, IMS segments, VSAM data sets, or WebSphere MQ.

You can run multiple instances of PowerExchange CDC components on a single system. For example, you might want to run a separate PowerExchange Logger for each source RDBMS to create separate sets of log files for each RDBMS type.

The following figure shows a simple CDC configuration that uses real-time extraction mode to access change data directly from the change stream without the PowerExchange Logger:

[Figure: real-time CDC configuration without the PowerExchange Logger]

In this real-time configuration, PowerExchange CDC uses the CAPXRT access method to capture change data from a SQL Server distribution database, DB2 recovery logs, and Oracle redo logs.

When an extraction request runs, PowerCenter connects to the PowerExchange Call Level Interface (SCLI) to contact the PowerExchange Listener. With this configuration, the PowerCenter extraction session pulls the change data that PowerExchange captured. The change data is passed to the SCLI and then to the PWXPC CDC Real Time reader. After the PWXPC reader reads the change data, PowerCenter uses the mapping and workflow that you created to transform the data and load it to the target.

Note: The Oracle UOW Cleanser reconstructs UOWs from redo logs into complete and consecutive UOWs that are in chronological order by end time. For DB2 and SQL Server, PowerExchange incorporates the UOW Cleanser function into the consumer API (CAPI) for extracting changes from the data source.
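The idea behind reconstructing units of work can be illustrated with a short Python sketch: changes from concurrent transactions are interleaved in the log, and the consumer regroups them into complete UOWs ordered by end time, discarding work that was not committed. The record layout (`LogRecord`, `seq`, `uow_id`, `kind`) and the function name are assumptions invented for this example; they do not describe the internal format or logic of the UOW Cleanser.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class LogRecord(NamedTuple):
    """Hypothetical change-stream record used only for this illustration."""
    seq: int            # position in the change stream
    uow_id: str         # unit of work that the record belongs to
    kind: str           # "CHANGE", "COMMIT", or "ABORT"
    payload: str = ""   # change detail, empty for COMMIT and ABORT


def reconstruct_uows(stream: List[LogRecord]) -> List[Tuple[str, List[str]]]:
    """Regroup interleaved records into complete UOWs ordered by end time."""
    open_uows: Dict[str, List[str]] = defaultdict(list)
    completed: List[Tuple[str, List[str]]] = []
    for rec in sorted(stream, key=lambda r: r.seq):
        if rec.kind == "CHANGE":
            open_uows[rec.uow_id].append(rec.payload)
        elif rec.kind == "COMMIT":
            # A UOW is emitted when it ends, so output order is end-time order.
            completed.append((rec.uow_id, open_uows.pop(rec.uow_id, [])))
        elif rec.kind == "ABORT":
            # Aborted work is never passed on.
            open_uows.pop(rec.uow_id, None)
    return completed


stream = [
    LogRecord(1, "A", "CHANGE", "a1"),
    LogRecord(2, "B", "CHANGE", "b1"),
    LogRecord(3, "C", "CHANGE", "c1"),
    LogRecord(4, "A", "CHANGE", "a2"),
    LogRecord(5, "B", "CHANGE", "b2"),
    LogRecord(6, "B", "COMMIT"),
    LogRecord(7, "C", "ABORT"),
    LogRecord(8, "A", "COMMIT"),
]
print(reconstruct_uows(stream))
```

In the sample stream, UOW B started after A but committed first, so it is emitted first. The changes a1 and b1 were adjacent in the stream but end up in different UOWs, which is the effect the guide describes when it says contiguous changes might not stay contiguous after reconstruction.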

The following figure shows a CDC configuration that uses the PowerExchange Logger in both batch extraction mode and continuous extraction mode:

[Figure: CDC configuration with the PowerExchange Logger in batch and continuous extraction modes]

In this configuration, the PowerExchange Logger captures change data from the change stream for SQL Server, Oracle, and DB2 tables and writes that data to its log files. After the data is in the PowerExchange log files, the source RDBMS log files can be deleted, if necessary.

When an extraction session runs, PWXPC contacts the PowerExchange Listener. The PowerExchange Listener reads the PowerExchange Logger log files and calls the SCLI on the PowerCenter Integration Service machine to transmit the change data to PowerCenter.

For some source tables, PWXPC extracts change data from the PowerExchange Logger log files in batch extraction mode with the CAPX access method. In this mode, the extraction session stops after it completes processing the log files. For other source tables, PWXPC extracts change data in continuous mode with the CAPXRT access method. In this mode, the extraction session extracts change data on an ongoing basis.

Batch and continuous extraction sessions can run concurrently. For example, you can run batch extractions to replicate change data to targets that need to be synchronized periodically, and run continuous extractions to replicate change data to targets that need to be synchronized in near real time.

In PowerCenter, you can create one source definition and one mapping that covers both extraction modes. However, batch and continuous extractions must run as separate sessions. For a batch extraction session, use a PWX CDC Change application connection. For a continuous extraction session, use a PWX CDC Real Time application connection.
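As a planning aid, the pairing of connection type, extraction mode, and access method described above can be written down as a small lookup, together with a helper that splits targets into separate sessions. This is a hedged sketch: the dictionary keys reuse the connection names from this guide, but the helper name and the "periodic" and "near_real_time" labels are invented for the example and are not PowerCenter settings.

```python
from typing import Dict, List

# Connection type -> extraction mode and access method, as stated in the text
# for extractions from PowerExchange Logger log files.
CONNECTION_PROFILES: Dict[str, Dict[str, str]] = {
    "PWX CDC Change": {"mode": "batch", "access_method": "CAPX"},
    "PWX CDC Real Time": {"mode": "continuous", "access_method": "CAPXRT"},
}

# Hypothetical labels for how fresh each target must be.
CONNECTION_FOR_NEED: Dict[str, str] = {
    "periodic": "PWX CDC Change",
    "near_real_time": "PWX CDC Real Time",
}


def plan_sessions(target_needs: Dict[str, str]) -> Dict[str, List[str]]:
    """Group targets into one session per connection type.

    Batch and continuous extractions must run as separate sessions, so
    targets with different needs never share a session in this plan.
    """
    sessions: Dict[str, List[str]] = {}
    for target, need in target_needs.items():
        connection = CONNECTION_FOR_NEED[need]
        sessions.setdefault(connection, []).append(target)
    return sessions


plan = plan_sessions(
    {"SALES_HIST": "periodic", "ORDERS_LIVE": "near_real_time", "CUST_HIST": "periodic"}
)
for connection, targets in plan.items():
    profile = CONNECTION_PROFILES[connection]
    print(connection, profile["mode"], profile["access_method"], targets)
```

Both sessions in such a plan can share one source definition and one mapping, as the text notes; only the application connection differs.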

Summary of CDC Implementation Tasks

After you install PowerExchange, you can configure change data capture and extraction, materialize targets, and start extraction processing.

The following table identifies the tasks for implementing change data capture and extraction processing for a data source:

Configure and start PowerExchange CDC components

Step 1. Configure parameters in the dbmover.cfg file for the PowerExchange Listener.
   References: “Customizing the dbmover.cfg File for CDC” on page 12
Step 2. Start the PowerExchange Listener on the machine with the source database.
   References: “Starting the PowerExchange Listener” on page 16
Step 3. Perform RDBMS-specific configuration tasks for CDC.
   References:
   - Chapter 4, “DB2 for Linux, UNIX, and Windows Change Data Capture” on page 56
   - Chapter 5, “Microsoft SQL Server Change Data Capture” on page 70
   - Chapter 6, “Oracle Change Data Capture with Oracle LogMiner” on page 80
Step 4. (Optional) Configure the PowerExchange Logger.
   References: “Configuring the PowerExchange Logger” on page 27
Step 5. (Optional) Start the PowerExchange Logger.
   References: “Starting the PowerExchange Logger” on page 47

Define data sources for CDC

Step 6. From the PowerExchange Navigator, define and activate capture registrations and extraction maps for the data sources.
   References: PowerExchange Navigator Guide
Step 7. For DB2 sources that have user-defined or multi-field columns that you want to manipulate, create DB2 data maps.
   References: PowerExchange Navigator Guide

Materialize targets and start capturing changes

Step 8. Materialize the target from the source.
   References: PowerExchange Bulk Data Movement Guide
Step 9. Establish a start point for the extraction.
   References: “Restart Tokens and the Restart Token File” on page 109

Extract and apply change data

Step 10. From PowerCenter, configure mappings, connections, workflows, and sessions. Then run the workflow.
   References:
   - PowerExchange Interfaces for PowerCenter
   - PowerCenter Designer Guide
   - PowerCenter Workflow Basics Guide

Part II: PowerExchange CDC Components

This part contains the following chapters:
- PowerExchange Listener
- PowerExchange Logger for Linux, UNIX, and Windows

CHAPTER 2

PowerExchange Listener

This chapter includes the following topics:
- PowerExchange Listener Overview
- Customizing the dbmover.cfg File for CDC
- Starting the PowerExchange Listener
- Stopping the PowerExchange Listener
- Displaying Active PowerExchange Listener Tasks

PowerExchange Listener Overview

In a change data capture (CDC) environment, a PowerExchange Listener can provide some or all of the following services:
- Store and manage capture registrations, extraction maps, and data maps for CDC data sources.
- Provide captured change data or source table data to the PowerExchange Navigator when you perform a database row test of an extraction map or a data map.
- Provide captured change data to PowerCenter when you run a PowerCenter CDC session.
- Interact with other PowerExchange Listeners on other nodes to facilitate communication among the PowerExchange Navigator, PowerCenter Integration Service, data sources, and any system to which PowerExchange processing is offloaded.

Customizing the dbmover.cfg File for CDC

You must configure the parameters in the dbmover.cfg file that pertain to CDC processing. The PowerExchange Listener uses these dbmover.cfg parameters to perform the following functions:
- Connect to source RDBMS databases and objects to capture change data.
- Connect to the system with the PowerExchange Logger log files to extract change data.
- Determine the directory in which to store capture registrations, extraction maps, and PowerExchange Logger log files.

This topic describes the key CDC parameters that are common to the PowerExchange source RDBMSs on Linux, UNIX, or Windows.

The following table describes the key dbmover.cfg statements that are required for CDC:

Statement: CAPI_CONNECTION
   Description: A named set of parameters that the PowerExchange Consumer API (CAPI) uses to connect to the change stream and control extraction processing. A CAPI connection is specific to a data source type.
   PowerExchange requires a connection statement for real-time extraction mode and continuous extraction mode. For real-time extraction, PowerExchange uses a source-specific type of CAPI_CONNECTION statement, such as MSQL, ORCL, and UDB. For continuous extraction from PowerExchange Logger log files, PowerExchange CDC uses the CAPX CAPI_CONNECTION statement.
   You can define up to eight CAPI_CONNECTION statements in a DBMOVER configuration file for the same data source type or different data source types. If you define multiple CAPI_CONNECTION statements for a data source, you can identify one of them as the default. Use the CAPI_SRC_DFLT parameter to indicate a default CAPI_CONNECTION for a data source type.
   For more information, see the section for your source type.

Statement: CAPI_SRC_DFLT
   Description: The CAPI_CONNECTION statement that PowerExchange uses by default for a specific data source type when no CAPI connection override is supplied.
   Syntax is:
      CAPI_SRC_DFLT=(source_type,capi_connection_name)
   Where:
   - source_type is one of the following source database types: MSS for Microsoft SQL Server, ORA for Oracle, or UDB for DB2 for Linux, UNIX, and Windows.
   - capi_connection_name is the unique name of the CAPI_CONNECTION statement that you want to use as the default statement.
   You can specify a CAPI_SRC_DFLT statement for each source database type. You can override the default CAPI_CONNECTION with another defined CAPI_CONNECTION in multiple ways.

Statement: CAPT_PATH
   Description: Path to the local directory that stores the following files for CDC:
   - CCT file, which contains capture registrations
   - CDEP file, which contains application names for PowerCenter extractions that use ODBC connections, if any
   - CDCT file, which contains information about PowerExchange Logger log files if you use the PowerExchange Logger
   This directory can be a directory that you created specifically for these files or another existing directory. Informatica recommends that you use a unique directory name to separate these CDC objects from the PowerExchange code. This practice makes migrating to a new PowerExchange version easier.
   Default is the PowerExchange installation directory.

Statement: CAPT_XTRA
   Description: Path to the local directory that stores extraction maps.
   This directory can be a directory that you created specifically for these files or another existing directory. Informatica recommends that you use a unique directory name to separate these CDC objects from the PowerExchange code. This practice makes migrating to a new PowerExchange version easier.
   Default is the PowerExchange installation directory.

RELATED TOPICS:
- “DB2 for Linux, UNIX, and Windows Change Data Capture” on page 56
- “Microsoft SQL Server Change Data Capture” on page 70
- “Oracle Change Data Capture with Oracle LogMiner” on page 80
- “CAPX CAPI_CONNECTION Parameters” on page 14
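A few of the rules stated for these statements lend themselves to a quick consistency check before you start the PowerExchange Listener. The following Python sketch assumes that the statements have already been read into plain Python values; it does not parse a dbmover.cfg file, and the function name and argument shapes are invented for the example. It checks only what the text states: at most eight CAPI_CONNECTION statements, unique connection names, CAPI_SRC_DFLT source types of MSS, ORA, or UDB, and defaults that point at a defined connection.

```python
from typing import Dict, List

VALID_SOURCE_TYPES = {"MSS", "ORA", "UDB"}   # source types listed for CAPI_SRC_DFLT
MAX_CAPI_CONNECTIONS = 8                     # limit stated for one configuration file


def check_capi_settings(
    connection_names: List[str],
    src_defaults: Dict[str, str],
) -> List[str]:
    """Return a list of problems found; an empty list means no problem was found.

    connection_names: NAME values of the CAPI_CONNECTION statements.
    src_defaults: CAPI_SRC_DFLT entries as {source_type: capi_connection_name}.
    """
    problems: List[str] = []
    if len(connection_names) > MAX_CAPI_CONNECTIONS:
        problems.append(
            f"{len(connection_names)} CAPI_CONNECTION statements exceed the limit of {MAX_CAPI_CONNECTIONS}"
        )
    if len(set(connection_names)) != len(connection_names):
        problems.append("CAPI_CONNECTION names are not unique")
    for source_type, name in src_defaults.items():
        if source_type not in VALID_SOURCE_TYPES:
            problems.append(f"CAPI_SRC_DFLT source type {source_type!r} is not MSS, ORA, or UDB")
        if name not in connection_names:
            problems.append(f"CAPI_SRC_DFLT refers to undefined connection {name!r}")
    return problems


# Hypothetical connection names for illustration.
print(check_capi_settings(["UDBCC", "CAPXCC"], {"UDB": "UDBCC"}))
print(check_capi_settings(["UDBCC"], {"DB2": "MISSING"}))
```

The first call reports no problems. The second reports both an unsupported source type and a default that names a connection that is not defined.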

CAPI_CONNECTION Statements

PowerExchange requires that you define CAPI_CONNECTION statements in the dbmover.cfg file on any Linux, UNIX, or Windows system where PowerExchange captures or extracts change data. PowerExchange uses the parameters that you specify in the CAPI_CONNECTION statements to connect to the change stream and to customize capture and extraction processing.

For each data source, you must define one of the following source-specific types of CAPI_CONNECTION statements:
- For Microsoft SQL Server, an MSQL CAPI_CONNECTION
- For Oracle, an ORCL CAPI_CONNECTION and a UOWC CAPI_CONNECTION for the UOW Cleanser
- For DB2 for Linux, UNIX, and Windows, a UDB CAPI_CONNECTION

If you use continuous extraction mode to extract change data from PowerExchange Logger log files, you must also define a CAPX CAPI_CONNECTION statement.

You can specify up to eight CAPI_CONNECTION statements in a dbmover.cfg file. You can identify one of the statements as the overall default. If you define multiple CAPI_CONNECTION statements for the same source type, you can identify one of these statements as the source-specific default. In addition to or in lieu of defaults, you can define specific CAPI_CONNECTION overrides in multiple ways. The order of precedence that PowerExchange uses to determine which CAPI_CONNECTION statement to use is described in the PowerExchange Reference Manual.

Note: When you extract change data, PowerExchange uses CAPI_CONNECTION statements to connect to the change stream for the data source. To perform database row tests for data sources that are defined by capture registrations local to the PowerExchange Navigator, you must specify the appropriate CAPI_CONNECTION statements on the PowerExchange Navigator machine. Otherwise, you do not need to specify CAPI_CONNECTION statements to perform database row tests.

RELATED TOPICS:
- “CAPX CAPI_CONNECTION Parameters” on page 14
- “DB2 for Linux, UNIX, and Windows CAPI_CONNECTION Parameters” on page 62
- “Microsoft SQL Server CAPI_CONNECTION Parameters” on page 76
- “ORCL CAPI_CONNECTION Statement” on page 92
- “UOWC CAPI_CONNECTION Statement” on page 99

CAPX CAPI_CONNECTION Parameters

The CAPX CAPI_CONNECTION statement specifies the Consumer API (CAPI) parameters needed for continuous extraction of change data from PowerExchange Logger for Linux, UNIX, and Windows log files.

Operating Systems: Linux, UNIX, and Windows
Required: Yes for continuous extraction mode

Syntax:
   CAPI_CONNECTION=(
      [DLLTRACE=trace_id,]
      NAME=name,
      [TRACE=trace,]
      TYPE=(CAPX,
         DFLTINST=collection_id,
         [FILEWAIT=seconds,]
         [RSTRADV=seconds]
      )
   )

PowerExchange returns the next committed "empty UOW. Valid values are from 1 through 86400. Unique user-defined name for this CAPI_CONNECTION statement.. Specify this parameter only at the direction of Informatica Global Customer Support. sometimes called the instance name or collection identifier." which includes only updated restart information. Then PowerExchange returns the next committed empty UOW that includes the updated restart information and resets the wait interval to 0. The wait interval is reset to 0 when PowerExchange completes processing a UOW that includes changes of interest or returns an empty UOW because the wait interval expired without any changes of interest having been received. that is defined in capture registrations. that PowerExchange waits before advancing restart and sequence tokens for a registered data source during periods when UOWs do not include any changes of interest for the data source. ) Required. This value must match the instance or database name that is displayed in the Resource Inspector of the PowerExchange Navigator for the registration group that contains the capture registrations. For continuous extraction mode. Maximum length is eight alphanumeric characters. FILEWAIT=seconds Optional. Default is 1. Specify this parameter only at the direction of Informatica Global Customer Support. PowerExchange waits 5 seconds after it completes processing the last UOW or after the previous wait interval expires. Maximum length is eight alphanumeric characters. User-defined name of the TRACE statement that activates internal DLL tracing for this CAPI. DFLTINST=collection_id Required. A source identifier. RSTRADV=nnnnn Time interval.. When the wait interval expires. Type of CAPI_CONNECTION statement. . if you specify 5. If RSTRADV is not specified. PowerExchange does not advance restart and sequence tokens for a registered source during periods when no changes of interest are received. Time interval. 
including those not of interest for CDC. this value must be CAPX. In this case. NAME=name Required. in seconds. when PowerExchange warm starts. that PowerExchange waits before checking for new PowerExchange Logger log files.cfg File for CDC 15 . in seconds. For example. User-defined name of the TRACE statement that activates the common CAPI tracing. Customizing the dbmover. from the restart point. TRACE=trace Optional. Valid values are 0 through 86400. it reads all changes.) ) Parameters: Enter the following parameters: DLLTRACE=trace_id Optional. No default is provided. TYPE=(CAPX.
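As an illustration of the CAPX syntax and value ranges described above, the following Python sketch assembles a CAPX CAPI_CONNECTION statement and rejects out-of-range values. The function name, the single-line output layout, and the sample names CAPXORA and ORACOLL1 are hypothetical choices for this example; they are not part of PowerExchange.

```python
import re


def build_capx_connection(name, dfltinst, filewait=None, rstradv=None):
    """Return a CAPX CAPI_CONNECTION statement as one line of text.

    Hypothetical helper for illustration only. It applies the limits that
    the parameter descriptions state: NAME and DFLTINST are at most eight
    alphanumeric characters, FILEWAIT is 1 through 86400, and RSTRADV is
    0 through 86400.
    """
    for label, value in (("NAME", name), ("DFLTINST", dfltinst)):
        if not re.fullmatch(r"[A-Za-z0-9]{1,8}", value):
            raise ValueError(f"{label} must be 1-8 alphanumeric characters: {value!r}")
    if filewait is not None and not 1 <= filewait <= 86400:
        raise ValueError("FILEWAIT must be from 1 through 86400")
    if rstradv is not None and not 0 <= rstradv <= 86400:
        raise ValueError("RSTRADV must be from 0 through 86400")

    # Optional parameters are emitted only when the caller supplies them.
    type_parts = ["CAPX", f"DFLTINST={dfltinst}"]
    if filewait is not None:
        type_parts.append(f"FILEWAIT={filewait}")
    if rstradv is not None:
        type_parts.append(f"RSTRADV={rstradv}")
    return f"CAPI_CONNECTION=(NAME={name},TYPE=({','.join(type_parts)}))"


print(build_capx_connection("CAPXORA", "ORACOLL1", filewait=5, rstradv=60))
```

The sketch omits DLLTRACE and TRACE because those parameters are intended for use only at the direction of Informatica Global Customer Support.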

Warning: A value of 0 can degrade performance because PowerExchange returns an empty UOW after each UOW processed.

Starting the PowerExchange Listener

To start the PowerExchange Listener, you can run the dtllst program or use other system-specific methods.

On a Linux or UNIX system, use one of the following methods:
¨ Enter dtllst at the command line to run the PowerExchange Listener in foreground mode:
dtllst node1 [config=directory/myconfig_file] [license=directory/mylicense_key_file]
Include the optional config and license parameters if you want to specify configuration and license key files that override the original dbmover.cfg and license.key files.
You can add an ampersand (&) at the end to run the PowerExchange Listener in background mode and add the prefix "nohup" at the beginning to run the PowerExchange Listener persistently:
nohup dtllst node1 [config=directory/myconfig_file] [license=directory/mylicense_key_file] &
¨ Run the startlst script, which was installed with PowerExchange. This script deletes the detail.log file and then starts the PowerExchange Listener.

On a Windows system, use one of the following methods:
¨ Run the PowerExchange Listener as a Windows service, which is the usual practice. To start a PowerExchange Listener service from the Windows Start menu, click Start > Programs > Informatica PowerExchange > Start PowerExchange Listener. Alternatively, use the dtllstsi program to enter the start command from a Windows command prompt:
dtllstsi start “service_name”
¨ Enter dtllst. The syntax is the same as for Linux and UNIX except that the & and nohup operands are not supported. Your product license must allow this manual mode of PowerExchange Listener operation.

Note: You cannot start the PowerExchange Listener by using the pwxcmd program.

Stopping the PowerExchange Listener

To stop the PowerExchange Listener, use the CLOSE or CLOSE FORCE command. To stop active PowerExchange Listener tasks, use the STOPTASK command.

CDC subtasks. and status. or Windows system. UNIX. if the PowerExchange Listener does not respond to a CLOSE FORCE command. This information includes the TCP/ IP address. enter the following command at the command line on the screen where the PowerExchange Listener task is running in foreground mode: D Displaying Active PowerExchange Listener Tasks 17 .The following table describes these commands and the syntax for issuing each command from the command line against a PowerExchange Listener task that is running in foreground mode: Command CLOSE Description Stops the PowerExchange Listener after all of the following subtasks complete: . PowerExchange waits 30 seconds for current user subtasks on the PowerExchange Listener to complete. UNIX. UNIX. which stop at the next commit of a unit of work (UOW) . A “kill” operation is similar to a CLOSE operation. or stoptask command to a PowerExchange Listener running in foreground or background mode. Alternatively. or Windows: C CLOSE FORCE On Linux or UNIX: C F On Windows: CF STOPTASK On Linux or UNIX: STOPTASK app_name On Windows: STOPTASK APPLID=app_name The app_name is the name of an active change data extraction process. application name. or Windows system. press Ctrl + C once to issue CLOSE or press Ctrl + C twice to issue CLOSE FORCE. on the local system or a remote system.Bulk data movement subtasks . use the standard operating system commands to find the PowerExchange Listener process ID and then “kill” that process. Displaying Active PowerExchange Listener Tasks You can use the DISPLAY ACTIVE command to display information about each active PowerExchange Listener task that is running in foreground mode on a Linux. You can get this name from the PWX-00712 messages in the PowerExchange Listener D (DISPLAY ACTIVE) command output. This command is useful if you have long-running subtasks on the PowerExchange Listener. 
You can issue these pwxcmd commands from the command line or include them in scripts or batch files.PowerExchange Listener subtasks Forces the cancellation of all user subtasks and stops the PowerExchange Listener. access type. Then PowerExchange cancels any remaining user subtasks and stops the PowerExchange Listener. if the PowerExchange Listener is running in background mode. UNIX. you can use any of the following methods: ¨ On a Linux. or Windows system. Command Line Syntax On Linux. ¨ On a Windows system. Stops a PowerExchange Listener task for a specific extraction application process. port number. closeforce. use the pwxcmd program to issue the close. PowerExchange waits to stop the PowerExchange Listener until either the end UOW or commit threshold is reached. ¨ On a Linux or UNIX system. On a Linux.

Alternatively, on a Linux, UNIX, or Windows system, you can issue the pwxcmd listtask command from a command line, script, or batch file to a PowerExchange Listener running on the local system or a remote system. The pwxcmd listtask command produces the same output as the DISPLAY ACTIVE command.

CHAPTER 3

PowerExchange Logger for Linux, UNIX, and Windows

This chapter includes the following topics:
¨ PowerExchange Logger Overview, 19
¨ PowerExchange Logger Tasks, 20
¨ PowerExchange Logger Files, 21
¨ File Switches, 25
¨ PowerExchange Logger Operational Modes, 25
¨ PowerExchange Logger Considerations on Linux and UNIX, 27
¨ Configuring the PowerExchange Logger, 27
¨ Starting the PowerExchange Logger, 47
¨ Managing the PowerExchange Logger, 49

PowerExchange Logger Overview

The PowerExchange Logger for Linux, UNIX, and Windows captures change data from PowerExchange data sources and writes that data to PowerExchange Logger log files. When a PowerCenter CDC session runs, it extracts change data from the log files instead of from the change stream.

The PowerExchange Logger writes only the successful units of work (UOWs) to its log files, in chronological order based on end time.

Use the PowerExchange Logger to reduce database overhead due to CDC processing. With the PowerExchange Logger, PowerExchange accesses the source database fewer times to read change data, which reduces database I/O. Also, because change data is extracted from the PowerExchange Logger log files, you often do not need to extend the retention period for source database log files to accommodate CDC processing.

The PowerExchange Logger can capture change data from DB2 recovery logs or Oracle redo logs on Linux, UNIX, or Windows, or from a Microsoft SQL Server distribution database on a Windows system. If you use the offloading feature, a PowerExchange Logger process on Linux, UNIX, or Windows can also process data from data sources on i5/OS or z/OS systems.

You must run one PowerExchange Logger process for each source type and instance, as defined in a registration group.

The PowerExchange Logger runs in continuous mode or batch mode.

Note: The PowerExchange Logger for Linux, UNIX, and Windows is similar in function to PowerExchange Condense on i5/OS or z/OS systems.

For each PowerExchange Logger process, you must define a configuration file. The configuration file contains parameters for controlling the PowerExchange Logger and for identifying the source instance. PowerExchange provides a sample configuration file named pwxccl.cfg. Use the COLL_END_LOG parameter to control whether the PowerExchange Logger runs in continuous mode or batch mode.

When you create capture registrations for data sources, including i5/OS and z/OS data sources for which processing is offloaded, set the Condense option to Part. The PowerExchange Logger supports only partial condense processing. For i5/OS or z/OS data sources, if you set the Condense option to Full in capture registrations, the PowerExchange Logger ignores the registrations and does not process change data from those sources.

When PowerCenter workflow sessions run, you can extract change data from PowerExchange Logger log files in batch extraction mode or continuous extraction mode. Do not use real-time extraction mode with the PowerExchange Logger.

Tip: For Oracle near-real-time CDC, Informatica recommends that you use the PowerExchange Logger and continuous extraction mode. PowerExchange then uses one Oracle LogMiner session for all extractions that process an Oracle instance. If you use real-time extraction mode, without the PowerExchange Logger, PowerExchange starts a separate LogMiner session for each extraction. The use of multiple, concurrent LogMiner sessions can significantly degrade the performance on the system where LogMiner runs.

RELATED TOPICS:
¨ “PowerExchange Logger Operational Modes” on page 25
¨ “Customizing the PowerExchange Logger Configuration File” on page 28

PowerExchange Logger Tasks

The PowerExchange Logger uses a Controller task with Command Handler and Writer subtasks. These tasks perform the following functions:

Controller task
Loads parameter settings from the PowerExchange Logger pwxccl.cfg configuration file. Reads the cache file from the last run to determine if capture registrations have been added or removed, and loads the capture registrations from the CCT file. After loading this information, the Controller starts the Command Handler subtask and then the Writer subtask.

Command Handler subtask
Processes PowerExchange Logger commands from various sources, including user stdin and the pwxcmd program. If the PROMPT parameter is set to Y in the pwxccl.cfg file, the Command Handler waits for the Writer subtask to initialize before accepting a user command.

Writer subtask
Performs most of the PowerExchange Logger work that uses CPU time. The Writer initializes the CAPI for the source database, determines the start or restart point in the change stream, reads change data from the change stream, and writes change data to PowerExchange Logger log files. The Writer also performs checkpoint processing, writes records to the CDCT file during a file switch, deletes expired CDCT records, and rolls back CDCT records when you warm start the PowerExchange Logger from an earlier point in time. If the PROMPT parameter is set to Y in the pwxccl.cfg file, the Writer waits for you to respond to confirmation prompts before proceeding with a cold start or a rollback of CDCT records.

PowerExchange Logger Files

A PowerExchange Logger process writes information to the CDCT file, PowerExchange Logger log files, checkpoint files, and PowerExchange message logs. It also uses cache files and lock files during processing.

CDCT File

The PowerExchange Logger stores information about its log files in the CDCT file. When a PowerCenter CDC session runs in continuous extraction mode or batch extraction mode, the PowerExchange Listener reads the CDCT file to determine the PowerExchange Logger log files from which to extract change data.

The PowerExchange Logger creates the CDCT file in the directory that is specified by the CAPT_PATH statement in the dbmover.cfg file on the system where the PowerExchange Logger runs. If the CAPT_PATH statement is not specified, the CDCT file is in the directory from which the PowerExchange Logger is invoked.

When a file switch occurs, the PowerExchange Logger Writer subtask writes keyed records to the CDCT file. These records contain information about each closed PowerExchange Logger log file, including the log file name, whether before images are included, UOW start and end times, number of records read, and other control information.

For example, if a log file contains change records for two registration tags, or tables, and you are not using a group definition file, the following processing occurs:
1. When source data for each table is first received and written to the log file, or the first time the PowerExchange Logger receives change data based on an active capture registration, the Writer subtask writes a temporary record for the log file, which does not include a registration tag name, to the CDCT file. This temporary record enables the PowerExchange Logger to retrieve source data for extractions that run in continuous extraction mode.
2. After a file switch, the Writer subtask writes two keyed records to the CDCT file, one for each of the registration tags. Each record includes the registration tag name, log file name, and change record count.
3. The PowerExchange Logger then deletes the temporary CDCT records that do not include the registration tags.

If you use a group definitions file, processing is similar to that in the previous example except that the Writer subtask writes one temporary record without a registration tag for each log file that received source data. You can have as many temporary records as groups in the group definition file.

Tip: You can use the PWXUCDCT utility to print information about CDCT records, back up and restore the CDCT file, re-create the CDCT file based on PowerExchange Logger log files if necessary, and delete expired CDCT records.

RELATED TOPICS:
¨ “Maintaining the PowerExchange Logger CDCT File and Log Files” on page 53

PowerExchange Logger Log Files

The PowerExchange Logger creates log files for storing change data records when it first encounters changes for source tables and columns of interest. These source tables and columns must be defined in active capture registrations.

The PowerExchange Logger creates log files based on the EXT_CAPT_MASK parameter in the pwxccl.cfg file. This parameter specifies a path to the directory where log files are stored and a prefix for the log file names. Log file names have the following format:
path/prefix.CND.CPyymmdd.Thhmmssnnn

Where:
¨ path/prefix is the EXT_CAPT_MASK value.
¨ yymmdd is the date when the file is created.
¨ hhmmss is a 24-hour time when the file is created.
¨ nnn is a generated sequence number, starting at 001, that makes each file name unique.
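
The log file name format above can be taken apart with a short Python sketch such as the following. The function name and the sample path /pwx/logs/ora1 are hypothetical and used only for illustration. The sketch assumes that the two-digit year follows the usual strptime convention for %y.

```python
import re
from datetime import datetime

# Matches path/prefix.CND.CPyymmdd.Thhmmssnnn as described in this section.
LOG_NAME_RE = re.compile(
    r"^(?P<mask>.+)\.CND\.CP(?P<date>\d{6})\.T(?P<time>\d{6})(?P<seq>\d{3})$"
)


def parse_log_file_name(file_name):
    """Split a PowerExchange Logger log file name into its documented parts."""
    match = LOG_NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"not a PowerExchange Logger log file name: {file_name!r}")
    created = datetime.strptime(match["date"] + match["time"], "%y%m%d%H%M%S")
    return {
        "mask": match["mask"],           # the EXT_CAPT_MASK value (path/prefix)
        "created": created,              # creation date and 24-hour time
        "sequence": int(match["seq"]),   # generated sequence number, 001 and up
    }


print(parse_log_file_name("/pwx/logs/ora1.CND.CP091215.T134502001"))
```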

The log files remain open until a file switch occurs or the PowerExchange Logger shuts down. When you run a PowerCenter CDC session in continuous extraction mode or batch extraction mode, PowerExchange extracts change data from the PowerExchange Logger log files.

RELATED TOPICS:
¨ “Introduction to Change Data Extraction” on page 105

Checkpoint Files
The PowerExchange Logger creates checkpoint files to store restart tokens and sequence tokens for correctly resuming CDC processing after a PowerExchange Logger warm start. The PowerExchange Logger writes information to the checkpoint files each time a file switch occurs or a SHUTDOWN or SHUTCOND command is issued.

Note: Checkpoint files are not used for PowerExchange Logger cold starts.

The PowerExchange Logger creates checkpoint files based on the CHKPT_BASENAME and CHKPT_NUM parameters in the pwxccl.cfg file, as follows:
¨ The CHKPT_BASENAME parameter specifies the path to the directory where checkpoint files are stored and a base file name. Checkpoint file names have the following format:
path/base_name.Vn.ckp

Where:
- path/base_name is the CHKPT_BASENAME value.
- n is a number that the PowerExchange Logger appends to the file name. This number can be a value from 0 to (CHKPT_NUM value - 1).
¨ The CHKPT_NUM parameter specifies the number of checkpoint files. At least two checkpoint files are required.
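
To make the naming rule concrete, the following Python sketch lists the checkpoint file names that would result from a given pair of CHKPT_BASENAME and CHKPT_NUM values. The function name and the sample base name /pwx/chkpt/ora1 are hypothetical.

```python
def checkpoint_file_names(chkpt_basename, chkpt_num):
    """List checkpoint file names in the documented path/base_name.Vn.ckp form.

    n runs from 0 to (CHKPT_NUM value - 1). At least two checkpoint files
    are required, so smaller values are rejected.
    """
    if chkpt_num < 2:
        raise ValueError("At least two checkpoint files are required")
    return [f"{chkpt_basename}.V{n}.ckp" for n in range(chkpt_num)]


for name in checkpoint_file_names("/pwx/chkpt/ora1", 3):
    print(name)
```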


Checkpoint files are sequential files that have a binary variable-length format. Checkpoint files contain the following types of records:

Checkpoint Record    Description
1                    Main record that contains the checkpoint timestamp and restart and sequence tokens. This information is used to determine the restart point in the change stream for a PowerExchange Logger warm start.
2                    Optional. Uncommitted registrations at the end of a PowerExchange Logger log file that did not end on a Commit record.
3                    Optional. Names of the PowerExchange Logger log files that were closed.

If you need to relocate a PowerExchange Logger configuration, you can copy the checkpoint files to another machine that has the same integer endian format. However, you cannot copy checkpoint files to a machine that uses a different integer endian format because the integer fields in checkpoint files that define record length are platform-dependent.

To display information about checkpoint files, you can use the following methods:
¨ Issue the DISPLAY CHECKPOINTS command to display a message that reports the sequence number and timestamp of the last checkpoint file written.
¨ Use the PWXUCDCT utility REPORT_CHECKPOINTS command to print a report that provides information about each checkpoint file, including its timestamp, restart and sequence tokens, reason for the checkpoint, number of expired CDCT records that were deleted, and number of log files to which data was written.
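
The restriction on copying checkpoint files between machines with different integer endian formats can be illustrated with a small Python sketch. The 4-byte unsigned length field used here is an assumption made for the example; this guide does not state the actual width or layout of the record-length fields.

```python
import struct
import sys


def record_length_bytes(length, byteorder):
    """Encode a record length as a hypothetical 4-byte unsigned integer."""
    fmt = "<I" if byteorder == "little" else ">I"
    return struct.pack(fmt, length)


# A record length of 300 bytes written on a little-endian machine...
written = record_length_bytes(300, "little")

# ...is misread when a big-endian machine interprets the same four bytes.
misread = struct.unpack(">I", written)[0]

print("this machine is", sys.byteorder, "endian")
print("written bytes:", written.hex())
print("length as read on a big-endian machine:", misread)
```

Before copying checkpoint files, comparing the value of sys.byteorder (or an equivalent platform check) on the source and target machines is one way to confirm that both use the same format.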

RELATED TOPICS:
¨ “Customizing the PowerExchange Logger Configuration File” on page 28

Cache Files
The PowerExchange Logger creates two identical cache files, one of which is a backup, in the CHKPT_BASENAME directory. The cache files store registration tag names for warm start processing. When the PowerExchange Logger warm starts, it reads the cache file from its last run to determine if any capture registrations have been added or removed. If so, the PowerExchange Logger issues message PWX-06119.

Lock Files
During initialization, a PowerExchange Logger process creates lock files to prevent other PowerExchange Logger processes from accessing the same CDCT file, checkpoint files, and log files concurrently. As long as the PowerExchange Logger process holds a lock on the lock files, locking is in effect for the resources for which the lock files were created. PowerExchange Logger locking works on local disks on Linux, UNIX, or Windows systems. It also works on the following shared file systems on Linux or UNIX systems:
¨ Veritas Storage Foundation™ Cluster File System by Symantec
¨ IBM General Parallel File System
¨ EMC Celerra network-attached storage (NAS) with Network File System (NFS) protocol version 3
¨ NetApp NAS with NFS version 3


The PowerExchange Logger creates lock files in the following order:
1. A lock file for the CDCT file for a source instance. The PowerExchange Logger generates the lock file name and location based on the directory that is specified in the CAPT_PATH parameter of the dbmover.cfg file.
2. A lock file for checkpoint files. The PowerExchange Logger generates the lock file name and location based on the directory and base file name that are specified in the CHKPT_BASENAME parameter of the pwxccl.cfg file.
3. One of the following lock files:
¨ If you do not use a group definition file, a lock file for PowerExchange Logger log files. The PowerExchange Logger generates the lock file name and location based on the directory and file-name prefix that are specified in the EXT_CAPT_MASK parameter of the pwxccl.cfg file.
¨ If you use a group definition file, a lock file for each set of the PowerExchange Logger log files that is defined by the GROUP statements in the group definition file. The PowerExchange Logger generates the lock file names and locations based on the external_capture_mask parameter in each GROUP statement. In this case, the PowerExchange Logger ignores the EXT_CAPT_MASK parameter in the pwxccl.cfg file when creating lock files and processing log files.

Lock file names end with _lockfile.lck. For example, a lock file for the CDCT file could have the name CDCT_oracoll1_lockfile.lck.

When the PowerExchange Logger process ends, it unlocks the lock files to enable other PowerExchange Logger processes to access the previously locked resources.

To identify a PowerExchange Logger process that holds a lock, look up the process ID (PID) in the Task Manager on a Windows system or issue the ps command on a UNIX or Linux system. Also, the PowerExchange Logger writes messages to the PowerExchange message log that indicate the locking status. Look for the following key messages:
¨ To verify that lock files are created, look for PWX-25802 messages, such as:
PWX-25802 Process pwxccl.exe pid 5428 locked file C:\capture\captpath\CDCT_instance_lockfile.lck
¨ To verify that lock files are unlocked, look for PWX-25803 messages, such as:
PWX-25803 Process pwxccl.exe pid 5428 unlocked file C:\capture\extcapt\loggerfiles_lockfile.lck
¨ If the PowerExchange Logger process cannot find the lock file that it needs to access some resources, it writes message PWX-25800:
PWX-25800 Could not find lock file file_name
¨ If a lock file is locked by another process, the PowerExchange Logger process writes some or all of the following messages, depending on whether it can acquire a lock before the maximum retry interval that is specified in PWX-25814 elapses:
PWX-25804 Error trying to lock PowerExchange Logger files
PWX-25811 File file_name is locked by process process_name pid process_id on host host_name date time date time
PWX-25812 File file_name is locked by pid process_id start offset length bytes
PWX-25813 No information is available on process which locked file file_name
PWX-25814 Trying to lock file file_name until number seconds elapses
PWX-25815 File file_name is locked by another process and no more waiting is allowed.

If a PowerExchange Logger process ends abnormally with message PWX-25815 and return code 25815, try to determine the status of the other PowerExchange Logger process that is holding the lock. This other process is identified in message PWX-25811. For example, the other process might not have completely shut down, or both processes might be trying to use the same files because of an error in their pwxccl.cfg configuration files.
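
When investigating a lock conflict, it can help to list the lock files that a message log shows as locked but not yet unlocked. The following Python sketch does this for PWX-25802 and PWX-25803 messages. It assumes that the message text matches the layout of the two examples in this section; the function name is hypothetical, and real log lines might carry additional prefixes that the pattern simply skips.

```python
import re

# Pattern modeled on the PWX-25802 and PWX-25803 examples in this section.
LOCK_MSG_RE = re.compile(
    r"PWX-(?P<code>25802|25803) Process (?P<process>\S+) pid (?P<pid>\d+) "
    r"(?P<action>locked|unlocked) file (?P<file>.+)$"
)


def held_locks(log_lines):
    """Return {lock_file: pid} for locks that were taken and not yet released."""
    held = {}
    for line in log_lines:
        match = LOCK_MSG_RE.search(line.rstrip())
        if match is None:
            continue  # not a lock or unlock message
        if match["action"] == "locked":
            held[match["file"]] = int(match["pid"])
        else:
            held.pop(match["file"], None)
    return held


sample = [
    r"PWX-25802 Process pwxccl.exe pid 5428 locked file C:\capture\captpath\CDCT_instance_lockfile.lck",
    r"PWX-25802 Process pwxccl.exe pid 5428 locked file C:\capture\extcapt\loggerfiles_lockfile.lck",
    r"PWX-25803 Process pwxccl.exe pid 5428 unlocked file C:\capture\extcapt\loggerfiles_lockfile.lck",
]
print(held_locks(sample))
```

The PID reported for a lock that is still held can then be looked up in the Task Manager or with the ps command, as described above.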

Message Log Files
The PowerExchange Logger writes messages to the PowerExchange message log file.


By default, this file is named detail.log and is located in the working directory where the PowerExchange Logger process runs. However, you can optionally specify another directory for PowerExchange message log files. To specify a unique directory for PowerExchange message log files, include the LOGPATH parameter in the dbmover.cfg file. Use of this parameter can help you find the PowerExchange message log files more easily.

You can also enable the use of alternative log files. On Linux, UNIX, and Windows, you can implement alternative logging by specifying the TRACING statement in the dbmover.cfg file. When alternative logging is enabled, PowerExchange creates a set of alternative log files for each PowerExchange process, including each PowerExchange Logger process, in a separate directory. When an alternative log file becomes full, PowerExchange switches to another alternative log file. This automatic rotation of message log files prevents out-of-space conditions. Also, PowerExchange buffers messages before writing them to the alternative log files on disk at a specific flush interval. This mode of writing messages can reduce I/O activity on the alternative log files.

File Switches

When running in continuous mode, the PowerExchange Logger periodically closes its open log files if they contain data and then opens a new set of log files. This process is called a file switch.

The PowerExchange Logger automatically performs a file switch when the criteria in the following parameters of the pwxccl.cfg file are met:
¨ FILE_SWITCH_CRIT
¨ FILE_SWITCH_MIN
¨ FILE_SWITCH_VAL

If the open log files do not contain data when the file-switch criteria in these parameters are met, the file switch does not occur. The PowerExchange Logger waits until the file-switch criteria are met again. If the files still do not contain data, the PowerExchange Logger continues to check the log files at set intervals. Only when the log files contain data does the file switch occur.

Alternatively, you can force a file switch by entering the fileswitch command from the command line. Also, on Linux, UNIX, or Windows, you can send a pwxcmd fileswitch command to a PowerExchange Logger process running on the local system or a remote system.

RELATED TOPICS:
¨ “Configuring the PowerExchange Logger” on page 27

PowerExchange Logger Operational Modes

A PowerExchange Logger process can operate in continuous mode or batch mode. To set the operational mode, use the COLL_END_LOG parameter in the pwxccl.cfg file.

RELATED TOPICS:
¨ “File Switches” on page 25
¨ “Customizing the PowerExchange Logger Configuration File” on page 28
¨ “Extraction Modes” on page 106

Continuous Mode

In continuous mode, the PowerExchange Logger process runs continuously until you manually stop it. Run the PowerExchange Logger in continuous mode unless you have a specific reason to use batch mode.

Consider using continuous mode in the following situations:
¨ You have a database with a high level of change activity that occurs continuously.
¨ You have a database with intermittent activity that occurs at unpredictable intervals.
¨ You cannot restart the PowerExchange Logger process often enough to keep up with the change volume.
¨ You want to avoid the overhead of scheduling PowerExchange Logger runs.

To enable continuous mode, set the COLL_END_LOG parameter to 0.

When you run the PowerExchange Logger in continuous mode, you can use either continuous or batch extraction mode for workflows that extract change data from the PowerExchange Logger log files.

In continuous mode, each time the Writer subtask completes a logging cycle, the PowerExchange Logger process is temporarily suspended. The next cycle is triggered by any of the following events:
¨ The wait interval that is defined in the NO_DATA_WAIT parameter of the pwxccl.cfg file elapses.
¨ The CONDENSE command is manually entered at the command line or with the pwxcmd program.
¨ The FILESWITCH command is manually entered at the command line or with the pwxcmd program.

The PowerExchange Logger process continues to run until you enter the SHUTDOWN or SHUTCOND command.

To prevent log files from becoming too large, the PowerExchange Logger process periodically performs a file switch. Files that are too large can extend restart times for CDC sessions that run in continuous extraction mode or batch extraction mode.

You can use the NO_DATA_WAIT2 parameter in the pwxccl.cfg file to prevent the PowerExchange Logger from consuming too much CPU time when PowerExchange is not receiving changes. For example, if you set the NO_DATA_WAIT2 parameter to 30 seconds, the PowerExchange Logger sleeps for 30 seconds, provided that no updates are received, and then performs another processing cycle. However, a large NO_DATA_WAIT2 value can delay processing of a SHUTDOWN command. If you need to reduce the amount of time that the PowerExchange Logger sleeps on a quiet system, you can adjust the FILE_FLUSH_VAL, FILE_SWITCH_VAL, and FILE_SWITCH_MIN parameters.

Tip: On a Linux or UNIX system, you can run a continuous PowerExchange Logger process in background mode. Then use the pwxcmd program to send commands to the PowerExchange Logger process that is running in background mode.

Batch Mode

The PowerExchange Logger process shuts down after the number of seconds in the NO_DATA_WAIT2 parameter of the pwxccl.cfg file elapse and no data has been received.

Use batch mode in the following situations:
¨ You want to run the PowerExchange Logger on a scheduled basis after batch applications that update the database complete.
¨ You want to run the PowerExchange Logger manually or for testing.

To enable batch mode, set the COLL_END_LOG parameter to 1 in the pwxccl.cfg file. Also, set NO_DATA_WAIT2 parameter to the number of seconds that PowerExchange waits at the end-of-log for more change data before shutting down the PowerExchange Logger.

When you run the PowerExchange Logger in batch mode, use batch extraction mode for any workflows that extract change data from the PowerExchange Logger log files.
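
The relationship between the COLL_END_LOG setting, the operational mode, and the extraction modes that workflows can use can be summarized in a short Python sketch. The simple NAME=value line parsing and the function names are assumptions made for this illustration, not a description of how PowerExchange reads the pwxccl.cfg file. Because this section does not state a default for COLL_END_LOG, the sketch treats a missing value as an error.

```python
# Extraction modes that workflows can use for each operational mode,
# as described in the Continuous Mode and Batch Mode topics.
EXTRACTION_MODES = {
    "continuous": ("continuous", "batch"),
    "batch": ("batch",),
}


def parse_cfg(text):
    """Collect NAME=value pairs from configuration text (simplified parsing)."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip().upper()] = value.strip()
    return params


def operational_mode(params):
    """Map COLL_END_LOG to the operational mode: 0 is continuous, 1 is batch."""
    value = params.get("COLL_END_LOG")
    if value == "0":
        return "continuous"
    if value == "1":
        return "batch"
    raise ValueError(f"COLL_END_LOG must be 0 or 1, got {value!r}")


cfg = parse_cfg("COLL_END_LOG=1\nNO_DATA_WAIT2=30\n")
mode = operational_mode(cfg)
print(mode, "mode; usable extraction modes:", ", ".join(EXTRACTION_MODES[mode]))
```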

PowerExchange Logger Considerations on Linux and UNIX

If you run the PowerExchange Logger on a Linux or UNIX system, review the requirements for the amount of memory needed and for running the PowerExchange Logger in background mode.

PowerExchange Logger Memory Requirement on Linux or UNIX

The PowerExchange Logger requires sufficient amounts of main memory and virtual memory to process change data. If the memory is not sufficient, PowerExchange writes the error messages PWX-00271 and PWX-00904 to the PowerExchange message log file when you attempt to start the PowerExchange Logger on Linux or UNIX.

To prevent this problem, use the Linux or UNIX ulimit command to set the size limits for maximum memory and virtual memory to unlimited. The specific ulimit syntax varies by platform and shell. For more information about this command, see the documentation for your Linux or UNIX operating system.

Running the PowerExchange Logger in Background Mode

You can run a PowerExchange Logger process in background mode on Linux or UNIX systems. Informatica recommends that you set the COLL_END_LOG parameter to 0 in the pwxccl.cfg file to run the PowerExchange Logger continuously.

For background PowerExchange Logger processes, accept the default value of N for the PROMPT parameter. If you specify PROMPT=Y, the PowerExchange Logger ignores this setting and issues an error message.

To send commands to a PowerExchange Logger process that is running in the background, use the pwxcmd program. To enable pwxcmd use, define the CONDENSENAME parameter in the pwxccl.cfg file and define the SVCNODE statement in the dbmover.cfg file.

Configuring the PowerExchange Logger

To configure the PowerExchange Logger, you must define a PowerExchange Logger configuration file for each source type and instance, as defined in a registration group. Also, verify that the Condense option is set to Part in the capture registrations for all sources that PowerExchange Logger processes.

If you want the PowerExchange Logger to create separate log files for one or more groups of tables, create a PowerExchange group definition file that defines groups of capture registrations for the tables.

Enabling a Capture Registration for PowerExchange Logger Use

For the PowerExchange Logger to use a capture registration, the registration must have a status of active and a Condense setting of Part. If the PowerExchange Logger does not find any active capture registration, the PowerExchange Logger issues error message PWX-06427 and ends.

To enable a capture registration for PowerExchange Logger use:
1. In the PowerExchange Navigator, open the capture registration.
2. In the Resource Inspector, select Active in the Status list.
3. In the Condense list, select Part.

Customizing the PowerExchange Logger Configuration File

Before you start the PowerExchange Logger, configure its parameters in the PowerExchange Logger configuration file. PowerExchange provides an example configuration file, named pwxccl.cfg, in the PowerExchange installation directory that is specified in the PWX_HOME environment variable on Linux or UNIX or PATH environment variable on Windows. Use this example file as a starting point for your customized file. You can rename the example file and copy it to another directory. If you do so, you must specify the CS parameter when you start the PowerExchange Logger to identify the alternative path or file name or both.

The PowerExchange Logger replaces PowerExchange Condense on Linux, UNIX, and Windows. If you used the similar PowerExchange Condense feature in an earlier PowerExchange release, you can copy its dtlca.cfg configuration file and then customize the copy. Rename the file to pwxccl.cfg or use the CS execution parameter. You might want to add PowerExchange Logger parameters that PowerExchange Condense did not support.

Parameter Descriptions

This topic describes the PowerExchange Logger parameters that you can specify in pwxccl.cfg. If you specify a parameter value that contains one or more spaces, such as a Windows path, you must enclose the value in double quotation marks. Make sure that you use straight quotation marks (").

The parameters are:

CAPT_IMAGE
Description: Data image type that the PowerExchange Logger captures to its log files. The PowerExchange Logger can capture after images only or both before and after images of the data. This image type must be consistent with the image type delivered to the target during extraction processing. If you enter AI for this parameter, the following limitations apply:
- You cannot extract before images to the target.
- You cannot use DTL_BI columns in extraction maps.
- If you add DTL_CI columns to extraction maps, any Insert or Delete operations result in Null values in these columns.
Informatica recommends that you specify BA so that you have the flexibility to use either AI or BA for the PowerCenter Image Type connection attribute for extraction processing.
Valid Values:
- AI for after images.
- BA for before and after images.
Default is AI.

CAPTURE_NODE
Description: The node name that the PowerExchange Logger uses to retrieve capture registrations and change data. This parameter is optional. Specify this parameter only if you are using CDC offload processing with the PowerExchange Logger. Enter the node name of the remote node, as specified in a NODE statement in the dbmover.cfg file on the local machine where the PowerExchange Logger runs. The PowerExchange Logger uses the specified node name to connect to the PowerExchange Listener on the remote node to read capture registrations and change data. The PowerExchange Logger writes the change data to its local log files. Do not specify this parameter if the capture registrations and change data are on the local machine where the PowerExchange Logger runs. You can also specify an optional user ID and password to control connection to the specified node. For more information, see the CAPTURE_NODE_UID parameter and the CAPTURE_NODE_EPWD or CAPTURE_NODE_PWD parameter.
Valid Values: A node name that is specified in a NODE statement in the dbmover.cfg file on the local machine where the PowerExchange Logger runs. Default is local.

CAPTURE_NODE_EPWD
Description: An encrypted password that is associated with the user ID specified in the CAPTURE_NODE_UID parameter. This password, in conjunction with the CAPTURE_NODE_UID value, is used to control PowerExchange access to capture registrations and change data. This parameter is optional. However, if you specify CAPTURE_NODE_UID, you must enter a password or encrypted password with either the CAPTURE_NODE_PWD or CAPTURE_NODE_EPWD parameter. If you specify this parameter, do not also specify CAPTURE_NODE_PWD.
Tip: You can create an encrypted password in the PowerExchange Navigator by selecting File > Encrypt Password.

CAPTURE_NODE_PWD
Description: A clear text password that is associated with the user ID specified in the CAPTURE_NODE_UID parameter. This password, in conjunction with the CAPTURE_NODE_UID value, is used to control PowerExchange access to capture registrations and change data. This parameter is optional. However, if you specify CAPTURE_NODE_UID, you must enter a password or encrypted password with either the CAPTURE_NODE_PWD or CAPTURE_NODE_EPWD parameter. If you specify this parameter, do not also specify CAPTURE_NODE_EPWD.

CAPTURE_NODE_UID
Description: User ID that is used to control access to capture registrations and change data on the local machine or on the remote node that is specified in the CAPTURE_NODE parameter. Whether this parameter is required depends on the operating system of the local or remote node and the SECURITY setting in its DBMOVER configuration file.

If CAPTURE_NODE specifies a z/OS or i5/OS node that has a SECURITY setting of 0, do not specify this parameter. PowerExchange uses the user ID under which the PowerExchange Listener job runs to control access to capture registrations and change data.

If CAPTURE_NODE specifies a z/OS or i5/OS node that has a SECURITY setting of 1, you must enter a valid operating system user ID for this parameter. Otherwise, error message PWX-00231 is issued, indicating a signon failure. However, PowerExchange uses the user ID under which the PowerExchange Listener job runs to control access to capture registrations and change data.

If CAPTURE_NODE specifies a z/OS or i5/OS node that has a SECURITY setting of 2, you must enter a valid operating system user ID for this parameter. Otherwise, error message PWX-00231 is issued, indicating a signon failure. PowerExchange uses this user ID to control access to capture registrations and change data. If the specified user ID does not have the authority that is required to read capture registrations or change data, access fails.

For a Linux, UNIX, or Windows local or remote node, enter a user ID that is valid for your data source type:
- For DB2 for Linux, UNIX, or Windows sources, enter a valid operating system user ID that has DB2 DBADM or SYSADM authority.
- For Microsoft SQL Server instances that use SQL Server Authentication, enter a database user ID that permits access to the SQL Server distribution database. For SQL Server instances that use Windows Authentication, PowerExchange uses the user ID under which the PowerExchange Listener was started. In this case, do not specify this parameter unless you want to specify another user.
- For Oracle sources, enter a database user ID that permits access to Oracle redo logs and Oracle LogMiner.
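The pairing rules stated for CAPTURE_NODE_UID, CAPTURE_NODE_PWD, and CAPTURE_NODE_EPWD can be sketched as a small pre-start check. This is an illustrative sketch only, not part of PowerExchange: the function name and the dictionary of parameters are hypothetical, and the check covers only the two rules stated above (a user ID needs a password of one kind, and the two password parameters are mutually exclusive).

```python
def check_capture_node_credentials(params):
    """Return a list of problems with the CAPTURE_NODE_* credential settings.

    `params` is a hypothetical dict of pwxccl.cfg parameter names to values.
    """
    problems = []
    has_uid = bool(params.get("CAPTURE_NODE_UID"))
    has_pwd = bool(params.get("CAPTURE_NODE_PWD"))
    has_epwd = bool(params.get("CAPTURE_NODE_EPWD"))

    # Rule: specify a clear text or an encrypted password, never both.
    if has_pwd and has_epwd:
        problems.append("Specify CAPTURE_NODE_PWD or CAPTURE_NODE_EPWD, not both.")

    # Rule: a user ID must be accompanied by one of the password parameters.
    if has_uid and not (has_pwd or has_epwd):
        problems.append("CAPTURE_NODE_UID requires CAPTURE_NODE_PWD or CAPTURE_NODE_EPWD.")

    return problems


if __name__ == "__main__":
    settings = {"CAPTURE_NODE_UID": "user_id", "CAPTURE_NODE_EPWD": "encrypted_password"}
    print(check_capture_node_credentials(settings))
```

The check does not model the SECURITY setting or the source type, which determine whether a user ID is needed at all.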

CHKPT_BASENAME
Description: Required. An existing directory path and base file name that PowerExchange uses to create checkpoint files. Checkpoint files store information for properly resuming PowerExchange Logger processing after a warm start. For example: /capture/logger.chkpt
When creating the full checkpoint file name, PowerExchange appends Vn.ckp, where n is a number from 0 to (CHKPT_NUM value - 1). For example: /capture/logger.chkptV1.ckp
Valid Values: Maximum length is 256.

CHKPT_NUM
Description: Recommended. Number of checkpoint files to use. The PowerExchange Logger requires at least two checkpoint files. If you decrease the number of checkpoint files after running the PowerExchange Logger, you must cold start the PowerExchange Logger. If you perform a warm start, the PowerExchange Logger might restart from an incorrect location in its log files.
Valid Values: A number from 2 through 999999. Default is 3.

COLL_END_LOG
Description: PowerExchange Logger operational mode. Options are:
- 0. Runs the PowerExchange Logger continuously until you manually stop it. After the Writer subtask completes a processing cycle, it waits for the number of minutes specified in the NO_DATA_WAIT parameter before starting another processing cycle.
- 1. Runs the PowerExchange Logger in batch mode. The PowerExchange Logger shuts down after the seconds specified in the NO_DATA_WAIT2 parameter elapse and no data has been received.
Valid Values: 0 for continuous mode, 1 for batch mode. Default is 0.

COND_CDCT_RET_P
Description: Retention period, in days, for CDCT records and PowerExchange Logger log files. Log files that are older than this period and their corresponding CDCT records are deleted automatically during PowerExchange Logger cleanup processing. Cleanup processing occurs during startup, file switch, or shutdown processing.
Tip: Set this parameter to minimize the size of the CDCT file while preserving the log files that contain the earliest change data you might need to access. If you use continuous extraction mode, PowerExchange reads the CDCT file each time the interval specified in the FILEWAIT parameter of the CAPX CAPI_CONNECTION statement elapses. If a CDCT file becomes large, this read activity can increase I/O, system resource use, and latency of change data extraction. If you use batch extraction mode, this high read activity is not a consideration.
Valid Values: Any number greater than 0. Default is 60.
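The checkpoint file naming rule given for CHKPT_BASENAME and CHKPT_NUM (the base name followed by Vn.ckp, with n running from 0 to CHKPT_NUM - 1) can be illustrated with a short sketch. The function name is hypothetical and the code only reproduces the documented naming pattern; it does not create or inspect any files.

```python
def checkpoint_file_names(chkpt_basename, chkpt_num=3):
    """List the checkpoint file names implied by CHKPT_BASENAME and CHKPT_NUM.

    The default of 3 and the range 2 through 999999 follow the CHKPT_NUM
    description; the function itself is illustrative only.
    """
    if not 2 <= chkpt_num <= 999999:
        raise ValueError("CHKPT_NUM must be a number from 2 through 999999")
    # PowerExchange appends Vn.ckp, where n runs from 0 to CHKPT_NUM - 1.
    return [f"{chkpt_basename}V{n}.ckp" for n in range(chkpt_num)]


if __name__ == "__main__":
    for name in checkpoint_file_names("/capture/logger.chkpt", 3):
        print(name)
```

With the base name /capture/logger.chkpt and three checkpoint files, the second name in the list is /capture/logger.chkptV1.ckp, which matches the example in the CHKPT_BASENAME description.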

CONDENSENAME
Description: Optional. A name for the command-handling service for a PowerExchange Logger for Linux, UNIX, and Windows process to which pwxcmd commands will be issued. Syntax is: CONDENSENAME=service_name
This service name must match the service name that is specified in the associated SVCNODE statement in the dbmover.cfg file. The SVCNODE statement specifies the TCP/IP port on which this service listens for pwxcmd commands.
Tip: If you run the PowerExchange Logger as a background process in continuous mode, specify this parameter so that you can use the pwxcmd program to issue commands to the PowerExchange Logger. Without the use of pwxcmd, you cannot shut down a PowerExchange Logger process that is running in the background or send status information to a computer that is remote from where the PowerExchange Logger runs.
Valid Values: Maximum length is 64 characters. No default.

CONDENSE_SHUTDOWN_TIMEOUT
Description: Maximum amount of time, in seconds, that the PowerExchange Logger waits after receiving the SHUTDOWN or pwxcmd shutdown command before stopping. During a shutdown, the PowerExchange Logger updates the CDCT file for each capture registration that is used to capture change data. If you have a large number of capture registrations, you might need to increase this timeout period.
Valid Values: A number from 0 through 2147483647. Default is 600.

CONN_OVR
Description: Recommended. Name of the override CAPI_CONNECTION statement to use for the PowerExchange Logger. If you do not specify CONN_OVR, the PowerExchange Logger uses the default CAPI_CONNECTION if one is specified in dbmover.cfg. Informatica recommends that you specify CONN_OVR. It is the only type of override that the PowerExchange Logger can use.
Valid Values: Valid CAPI_CONNECTION name for the source type.

A source identifier. this value is the Database name that is displayed for the registration group in the Resource Inspector. and Windows .For DB2 for Linux. sometimes called the instance name. When used with DB_TYPE. . EPWD A deprecated parameter.MSS for Microsoft SQL Server . that is defined in capture registrations. If you use CDC offload processing with the PowerExchange Logger to capture change data from z/OS or i5/OS data sources.For Oracle.Parameter DBID Description Required.For Microsoft SQL Server. Source RDBMS type. This value must match the instance or database name that is displayed in the Resource Inspector of the PowerExchange Navigator for the registration group that contains the capture registrations. see “Configuring PowerExchange to Capture Change Data on a Remote System” on page 162 for information about what to enter for this parameter. Use CAPTURE_NODE_EPWD instead. and Windows .cfg.ORA for Oracle If you use CDC offload processing with the PowerExchange Logger to capture change data from z/OS or i5/OS data sources. Open the registration group in the PowerExchange Navigator to view this Instance value. and Windows.UDB for DB2 for Linux. CAPTURE_NODE_EPWD takes precedence. it defines selection criteria for capture registrations in the CCT file. For Microsoft SQL Server. Valid Values . this value is the Instance name that is displayed for the registration group and is also the first positional parameter in the ORACLEID statement in dbmover. UNIX. 34 Chapter 3: PowerExchange Logger for Linux. DB_TYPE Required. UNIX. UNIX. an instance name is generated when you create a registration group. . . If both CAPTURE_NODE_EPWD and EPWD are specified. see “Configuring PowerExchange to Capture Change Data on a Remote System” on page 162 for information about what to enter for this parameter. this value is the Instance name that is displayed for the registration group in the Resource Inspector.

EXT_CAPT_MASK
Description: Required. An existing directory path and a unique prefix to be used for generating the PowerExchange Logger log files. For example: /capture/pwxlog
Verify that no existing files match this path and prefix. PowerExchange considers any file that matches this path and prefix to be a PowerExchange Logger log file, even if it is unrelated to PowerExchange Logger processing.
To create the log files, the PowerExchange Logger appends the following information:
.CND.CPyymmdd.Thhmmssnnn
Where:
- yymmdd is a date composed of a two-digit year, a month, and a day.
- hhmmss is a 24-hour time value, including hours, minutes, and seconds.
- nnn is a generated sequence number, which starts from 001.
For example: /capture/pwxlog.CND.CP080718.T1545001
Warning: Do not use the same EXT_CAPT_MASK value for multiple PowerExchange Logger processes. Otherwise, a PowerExchange Logger process might corrupt log files that are used by another PowerExchange Logger process. Also, do not reuse an EXT_CAPT_MASK value until the PowerExchange Logger process has completed processing all of the log files that match the mask.
Valid Values: Maximum length is 256 characters. No default.

FILE_FLUSH_VAL
Description: Recommended. File flush interval in seconds. The PowerExchange Logger waits for this interval to elapse before flushing, or writing, data to the current log file on disk. Flushing data to disk enables the data to be read by extractions running in continuous extraction mode. This parameter affects the latency of change data extractions that run in continuous extraction mode. Valid values are:
- A -1 causes the PowerExchange Logger process to not flush data to the current log file. Specify this value only if you use batch extraction mode. Do not specify this value if you use continuous extraction mode. Otherwise, the latency of your continuous-mode extractions increases.
- A 0 results in a flush after every record. Warning: A value of 0 can degrade PowerExchange Logger and file system performance.
- Any value from 1 through 86400 sets the flush interval to that specific value.
Set this value as appropriate for your CDC environment. Values that are too high can increase change extraction latency, and values that are too low can degrade PowerExchange Logger and system performance. Informatica recommends that you set this parameter to a value that is equal to or greater than the NO_DATA_WAIT2 value because file flushes cannot occur until the NO_DATA_WAIT2 period expires.
Valid Values: -1 or any number from 0 through 86400. Default is -1.

FILE_SWITCH_CRIT
Description: Type of units to use for the FILE_SWITCH_MIN and FILE_SWITCH_VAL parameters, which determine when to do an automatic file switch.
Valid Values:
- M for minutes.
- R for records.
Default is M.
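The three cases for FILE_FLUSH_VAL (no flush, flush after every record, flush on an interval) can be summarized in a short sketch. The function name and the returned wording are hypothetical; only the value ranges and the default of -1 come from the parameter description.

```python
def describe_file_flush_val(value=-1):
    """Describe the flush behavior that a FILE_FLUSH_VAL setting selects.

    Illustrative only: mirrors the documented cases, with -1 as the default.
    """
    if value == -1:
        # Suitable for batch extraction mode only.
        return "no flush to the current log file"
    if value == 0:
        # Can degrade PowerExchange Logger and file system performance.
        return "flush after every record"
    if 1 <= value <= 86400:
        return f"flush every {value} seconds"
    raise ValueError("FILE_FLUSH_VAL must be -1 or a number from 0 through 86400")


if __name__ == "__main__":
    for setting in (-1, 0, 60):
        print(setting, "->", describe_file_flush_val(setting))
```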

FILE_SWITCH_MIN
Description: File-switch criteria that the PowerExchange Logger uses when it encounters change data for a new source. You can use this parameter to reduce change data latency when running extractions in continuous extraction mode. Syntax is: FILE_SWITCH_MIN=(min_val,min_val_ign)
Where:
min_val is the minimum number of FILE_SWITCH_CRIT units that must elapse after the PowerExchange Logger encounters a change record for a source that has no entry in the CDCT file, before a file switch can be performed. Valid values are:
- A -1 causes this parameter to be ignored. File switch processing is controlled by FILE_SWITCH_VAL only.
- A 0 causes the PowerExchange Logger to perform a file switch each time a new source is encountered.
- Any value from 1 through 2147483647 causes the PowerExchange Logger to perform a file switch when this specified number of FILE_SWITCH_CRIT units is reached.
min_val_ign is the minimum number of FILE_SWITCH_CRIT units that must pass during a PowerExchange Logger cold start before the PowerExchange Logger uses the min_val value. Valid values are:
- A 0 causes the PowerExchange Logger to use the minimum file switch value specified in min_val immediately after it is cold started.
- Any value from 1 through 2147483647 causes the PowerExchange Logger to ignore the min_val keyword for the specified number of units. Before the min_val_ign threshold is met, only FILE_SWITCH_VAL controls file switch activity.
The min_val_ign value is ignored if the PowerExchange Logger is warm started.
Warning: The value (0,0) can result in a large number of file switches when the PowerExchange Logger is cold started. This situation occurs because the PowerExchange Logger does a file switch each time it encounters a data source without an entry in the CDCT file. During a cold start, the CDCT file is emptied. Thereafter, a file switch occurs each time the PowerExchange Logger encounters a change record for a registered data source for the first time.
Valid Values:
- min_val. A value from -1 through 2147483647.
- min_val_ign. A value from 0 through 2147483647.
Default is (-1,0).
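The FILE_SWITCH_MIN syntax and the interaction between min_val and min_val_ign can be sketched as follows. This is a sketch under stated assumptions, not PowerExchange code: both function names are hypothetical, and units_elapsed is an invented argument that stands for the number of FILE_SWITCH_CRIT units that have passed since the cold start.

```python
INT_MAX = 2147483647


def parse_file_switch_min(text="(-1,0)"):
    """Parse a FILE_SWITCH_MIN value of the form (min_val,min_val_ign)."""
    stripped = text.strip()
    if not (stripped.startswith("(") and stripped.endswith(")")):
        raise ValueError("expected the form (min_val,min_val_ign)")
    parts = stripped[1:-1].split(",")
    if len(parts) != 2:
        raise ValueError("expected exactly two values")
    min_val, min_val_ign = (int(part.strip()) for part in parts)
    if not -1 <= min_val <= INT_MAX:
        raise ValueError("min_val must be from -1 through 2147483647")
    if not 0 <= min_val_ign <= INT_MAX:
        raise ValueError("min_val_ign must be from 0 through 2147483647")
    return min_val, min_val_ign


def min_val_in_effect(min_val, min_val_ign, cold_start, units_elapsed):
    """Return True if min_val currently governs file switches for new sources."""
    if min_val == -1:
        return False  # parameter ignored; FILE_SWITCH_VAL alone applies
    if not cold_start:
        return True   # min_val_ign is ignored on a warm start
    # After a cold start, min_val applies once min_val_ign units have passed.
    return units_elapsed >= min_val_ign


if __name__ == "__main__":
    values = parse_file_switch_min("(0,0)")
    print(values, min_val_in_effect(*values, cold_start=True, units_elapsed=0))
```

For the (0,0) setting that the warning describes, min_val is in effect immediately after a cold start, so every newly encountered source triggers a file switch.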

FILE_SWITCH_VAL
Description: Number of minutes or change records, as determined by FILE_SWITCH_CRIT, that must elapse before PowerExchange performs a file switch. For example, if this value is 30 and FILE_SWITCH_CRIT=R, the PowerExchange Logger performs a file switch every 30 records. Or if FILE_SWITCH_CRIT=M, the PowerExchange Logger performs a file switch every 30 minutes. If the PowerExchange Logger log files contain no data when the FILE_SWITCH_VAL threshold is reached, the file switch does not occur. This value affects the size of the PowerExchange Logger log files. Specify a value that results in log files of the appropriate size for your environment.
Tip: When using continuous extraction mode, set this parameter to a value that causes file switches to occur within the timeframe that meets your change extraction latency requirements. When using batch extraction mode, set this parameter such that you have larger log files and a smaller CDCT file.
Valid Values: Any number greater than 0. Default is 30.

GROUPDEFS
Description: Path and file name of the optional PowerExchange Logger group definition file. This file defines groups of capture registrations that the PowerExchange Logger uses to capture change data to separate sets of log files. It also defines the path that the PowerExchange Logger uses to create the log files that contain the change data for each group. This parameter is optional.
Valid Values: Maximum length is 255 characters. No default.

LOGGER_DELETES_EXPIRED_CDCT_RECORDS
Description: Controls how expired CDCT records, for which the retention period has elapsed, are deleted. Options are:
- Y. The PowerExchange Logger maintains the CDCT retention array and deletes expired CDCT records during file switches. If you enter Y, you cannot issue the DELETE_EXPIRED_CDCT command from the PWXUCDCT utility to delete expired CDCT records.
- N. The PowerExchange Logger does not maintain the CDCT retention array and does not delete expired CDCT records. However, you can issue the DELETE_EXPIRED_CDCT command from the PWXUCDCT utility to delete expired CDCT records. To use the DELETE_EXPIRED_CDCT command, you must specify N.
Note: This parameter does not affect PowerExchange Logger deletions of CDCT records rolled back because of a cold start or a warm start to a prior point in time.
Valid Values: Y or N. Default is Y.

MAX_RETENTION_EXPIRY_DAYS
Description: Maximum number of days to hold retention array items in memory. Retention array items define when CDCT records expire and indicate the log file names and registration tags referenced by the CDCT records. Use the MAX_RETENTION_EXPIRY_DAYS parameter only in situations with extreme memory limitations.
Tip: If you have a large volume of CDCT records, you can usually avoid memory shortages by setting the LOGGER_DELETES_EXPIRED_CDCT_RECORDS parameter to N and running the PWXUCDCT utility DELETE_EXPIRED_CDCT command on a regular, scheduled basis.
Valid Values: A number from 1 through 999. Default is 999.

NO_DATA_WAIT
Description: If you run the PowerExchange Logger in continuous mode, specify the number of minutes that the PowerExchange Logger must wait before starting the next logging cycle. For continuous extraction mode, this value should be low so that the next logging cycle starts shortly after the current one completes. Recommended value is 2. A value of 0 causes no waiting to occur between PowerExchange Logger processing cycles. If source data is not available, the CAPI sleeps. If the value of FILE_SWITCH_CRIT is M and the value of FILE_SWITCH_VAL is less than the value of NO_DATA_WAIT, the PowerExchange Logger uses the FILE_SWITCH_VAL value instead.
Valid Values: 0 or greater. Default is 60.

NO_DATA_WAIT2
Description: Number of seconds that PowerExchange waits at the end-of-log for more change data before returning control to the PowerExchange Logger. If this wait period elapses and new change data has not been received, PowerExchange returns control to the PowerExchange Logger, and the PowerExchange Logger then stops the current logging cycle. The recommended value is 2. If you enter a higher value, execution of commands for the PowerExchange Logger might be delayed.
Valid Values: Any number greater than 0. Default is 600.
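The rule that FILE_SWITCH_VAL can override NO_DATA_WAIT when FILE_SWITCH_CRIT is M can be expressed as a small calculation. The function name is hypothetical and the sketch models only that one stated rule; it makes no claim about other timing behavior of the PowerExchange Logger.

```python
def effective_cycle_wait_minutes(no_data_wait, file_switch_crit, file_switch_val):
    """Return the wait, in minutes, between logging cycles in continuous mode.

    Illustrative only. When FILE_SWITCH_CRIT is M (minutes) and
    FILE_SWITCH_VAL is less than NO_DATA_WAIT, the FILE_SWITCH_VAL value
    is used instead of NO_DATA_WAIT.
    """
    if file_switch_crit == "M" and file_switch_val < no_data_wait:
        return file_switch_val
    return no_data_wait


if __name__ == "__main__":
    # Defaults from the parameter descriptions: NO_DATA_WAIT=60, FILE_SWITCH_VAL=30.
    print(effective_cycle_wait_minutes(60, "M", 30))
    print(effective_cycle_wait_minutes(60, "R", 30))
```

With the default values, the first call yields 30 because the file switch interval in minutes is shorter than the wait interval. The second call yields 60 because a record-based FILE_SWITCH_VAL is not comparable to a wait in minutes.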

PROMPT
Description: When you run the PowerExchange Logger in foreground mode, controls whether PowerExchange displays a user confirmation prompt and waits for a response when you perform one of the following actions:
- Cold start the PowerExchange Logger.
- Warm start the PowerExchange Logger from a previous position in the change stream. This situation occurs only if checkpoint files that were more recent than the current ones were deleted, and the CDCT file still contains records related to the deleted files.
Options are:
- Y. Displays the confirmation message PWX-33236 for a cold start or PWX-33242 for a warm start. You must respond to the message for startup processing to continue.
- N. Does not display the confirmation messages. PowerExchange attempts to start without first prompting for user confirmation.
If you run the PowerExchange Logger in foreground mode, the default is Y. If you run the PowerExchange Logger in background mode or as a PowerExchange Logger Service in the Informatica domain, the default is N. In this case, if you enter PROMPT=Y in the pwxccl.cfg file, the PowerExchange Logger ignores this setting, issues error message PWX-33253, and continues processing.
Valid Values: Y or N. Default is Y for a PowerExchange Logger that runs in foreground mode. Default is N for a PowerExchange Logger process that runs in background mode or as a PowerExchange Logger Service in the Informatica domain.

PWD
Description: A deprecated parameter. Use CAPTURE_NODE_PWD instead. If both CAPTURE_NODE_PWD and PWD are specified, CAPTURE_NODE_PWD takes precedence.

RESTART_TOKEN and SEQUENCE_TOKEN
Description: Parameters that define a restart point for starting change data processing when a PowerExchange Logger is cold started. A restart point is defined by both a restart token and a sequence token. Depending on how you set these parameters, PowerExchange Logger processing starts from one of the following restart points during a cold start:
- If you do not specify these parameters, processing starts from the default start location:
  - For DB2, the default location is the current log position at the time the PowerExchange capture catalog was created.
  - For Microsoft SQL Server, the default location is the oldest data available in the publication database.
  - For Oracle, the default location is the most current Oracle catalog dump.
- If you enter 0 for both parameters, processing starts from the current end-of-log position.
- If you enter restart token and sequence token values other than 0, processing resumes from the specific restart point defined by these token values.
If you use CDC offload processing with the PowerExchange Logger to capture change data from z/OS or i5/OS data sources, see the PowerExchange Condense chapter in the PowerExchange CDC Guide for i5/OS and PowerExchange CDC Guide for z/OS for information about what to enter for these parameters.
Valid Values:
- Not specified.
- 0
- Specific restart and sequence token values.

SIGNALLING
Description: Indicates whether the PowerExchange Logger attempts to take automatic action in the event of certain errors. Options are:
- Y. The PowerExchange Logger automatically handles certain errors such as memory corruption. After the PowerExchange Logger handles the error, it attempts to shut down in a controlled manner.
- N. The PowerExchange Logger does not automatically trap and handle system errors. Instead, the operating system uses default error handling. Usually, the default handling is to report the program line in error and dump memory.
Valid Values: Y or N. Default is N.

UID
Description: A deprecated parameter. Use CAPTURE_NODE_UID instead. If both CAPTURE_NODE_UID and UID are specified, CAPTURE_NODE_UID takes precedence.

VERBOSE
Description: Indicates whether the PowerExchange Logger writes verbose or terse messages to the PowerExchange message log file for activities that it performs frequently, such as cleanup, condense, checkpoint, and file-switch processing. Options are:
- Y. The PowerExchange Logger logs multiple messages at various processing points, such as when starting or ending a cycle of reading source data or doing a file switch. Verbose messaging often includes processing statistics such as records processed and elapsed time.
- N. The PowerExchange Logger logs a single terse message for each file switch and checkpoint.
Valid Values:
- Y for verbose messaging
- N for terse messaging
Default is Y.

RELATED TOPICS:
- “PowerExchange Logger Operational Modes” on page 25
- “Configuring PowerExchange to Capture Change Data on a Remote System” on page 162

Example pwxccl.cfg

PowerExchange provides an example pwxccl.cfg file in the PowerExchange installation directory, which you can customize. The example file contains the following statements:

/* Name for PWXCMD control
/*CONDENSENAME=PWXCCL1
DBID=ORACOLL1
DB_TYPE=ORA
CAPTURE_NODE_UID=user_id
CAPTURE_NODE_EPWD=encrypted_password
/* CAPTURE_NODE_PWD=plain_text_password
PROMPT=Y
EXT_CAPT_MASK=/capture/condenseO
CHKPT_NUM=3
CHKPT_BASENAME=/capture/condenseO.chkpt
COND_CDCT_RET_P=50
LOGGER_DELETES_EXPIRED_CDCT_RECORDS=Y
/* 0 = continuous, 1 = Stop at end-of-log (batch)
COLL_END_LOG=0
/* Number of minutes to wait between CAPI read cycles
NO_DATA_WAIT=0
/* Number of seconds to wait at the end-of-log for more change data
NO_DATA_WAIT2=60
/* Number of seconds before flushing, or writing, data to the current log file on disk
/* -1 = No flush, 0 = flush every record, 1 to N flush every N seconds
/*FILE_FLUSH_VAL=60
/* Minimum number of FILE_SWITCH_CRIT units after new CDCT source entry (normal,coldstart)
/*FILE_SWITCH_MIN=(0,0)
FILE_SWITCH_CRIT=M
FILE_SWITCH_VAL=20
CAPT_IMAGE=BA
SEQUENCE_TOKEN=00
RESTART_TOKEN=00

Note: If you enter values in the EXT_CAPT_MASK and CHKPT_BASENAME parameters that include spaces, you must enclose the values in double quotation marks.

Customizing dbmover.cfg for the PowerExchange Logger

To use the PowerExchange Logger, you must define the CAPT_PATH statement and certain source-specific statements in the dbmover.cfg file. Also, you can include some optional parameters to help make finding messages for the PowerExchange Logger easier or to send commands to a PowerExchange Logger process that is running in background mode. Use the following key parameters:

CAPT_PATH
Required. Specifies the path to the directory where the CCT and CDCT files reside. The CCT file contains information about capture registrations. The CDCT file contains information about the PowerExchange Logger log files, such as file names and number of records.

LOGPATH
Optional. Specifies a unique path to the PowerExchange message log files. Use this parameter to create message log files in a directory that is separate from your current working directory so that you can find the message log files more easily.

SVCNODE
Optional. Specifies the TCP/IP port on which a command-handling service for a PowerExchange Logger process listens for commands that you issue with the pwxcmd program. You must define this parameter if you run the PowerExchange Logger process in background mode on a Linux or UNIX system. For more information about pwxcmd commands, see the PowerExchange Command Reference.

TRACING
Optional. Enables alternative logging. PowerExchange creates a set of alternative log files for each PowerExchange process in a separate directory. You can specify the directory location, the number of log files, and the log file size in MB. When a log file reaches the specified size, PowerExchange switches to the next log file and begins overwriting any data in that file. Alternative logging is faster and enables you to customize the amount of data logged for long-running jobs, such as a PowerExchange Logger process that runs in continuous mode. If you specify this statement, also specify the LOGPATH statement.

In addition to these parameters, the PowerExchange Logger requires source-specific statements, such as the ORCL CAPI_CONNECTION, UOWC CAPI_CONNECTION, and ORACLEID statements for Oracle. For more information about all DBMOVER configuration parameters, see the PowerExchange Reference Manual.

RELATED TOPICS:
- “DB2 for Linux, UNIX, and Windows Change Data Capture” on page 56
- “Microsoft SQL Server Change Data Capture” on page 70
- “Oracle Change Data Capture with Oracle LogMiner” on page 80
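To review a customized pwxccl.cfg file before starting the PowerExchange Logger, a small reader such as the following sketch can list the active statements. It is not a PowerExchange utility. It assumes, based only on the example pwxccl.cfg statements, that each statement is a single NAME=value line and that lines beginning with /* are comments. It also strips straight double quotation marks around values that contain spaces. The function name is hypothetical.

```python
def parse_pwxccl(text):
    """Parse NAME=value statements from the text of a pwxccl.cfg file.

    Assumptions (not confirmed by the product documentation): one statement
    per line, and any line that starts with /* is a comment.
    """
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("/*"):
            continue  # skip blank lines and commented-out statements
        name, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a NAME=value statement: {line!r}")
        value = value.strip()
        # Values with spaces are enclosed in straight double quotation marks.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        params[name.strip()] = value
    return params


if __name__ == "__main__":
    sample = "\n".join([
        "/* Name for PWXCMD control",
        "/*CONDENSENAME=PWXCCL1",
        "DBID=ORACOLL1",
        "DB_TYPE=ORA",
        'CHKPT_BASENAME="/capture dir/condenseO.chkpt"',
        "COLL_END_LOG=0",
    ])
    for key, val in parse_pwxccl(sample).items():
        print(key, "=", val)
```

Commented-out statements such as /*CONDENSENAME=PWXCCL1 are skipped, so the result shows only the parameters that would actually take effect under these assumptions.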

Using PowerExchange Logger Group Definitions

To create separate sets of PowerExchange Logger log files for groups of tables, create a PowerExchange Logger group definition file. Then, specify its path and file name in the GROUPDEFS parameter of the pwxccl.cfg file. When the PowerExchange Logger process starts, it reads the group definition file and creates a separate set of log files for each defined group.

Group definitions can help improve the efficiency of extraction sessions because the extractions target a more specific set of PowerExchange Logger log files.

By default, the PowerExchange Logger processes change data for all tables that reside on the instance specified by the DBID parameter and that have active capture registrations with the Condense option set to Part. Changes for all of these tables are written to a single set of log files (not taking into account file switching). For a table with a low level of change activity, the PowerExchange Logger might need to read many change records in the log files before finding the changes of interest.

With group definitions, you can define a group that includes a subset of capture registrations. The PowerExchange Logger then writes change data to a separate set of log files for the tables that are associated with these registrations. When an extraction process runs, it is more likely to find the change data for a table in the group faster because it reads only the log files for that group.

For example, if you have five source tables with a low level of change activity and one table with a high level of change activity, you can define a group that includes the low-activity tables and another group that includes only the high-activity table. Then, in PowerCenter, define a CDC session that extracts change data from the PowerExchange Logger log files for the low-activity group, and define another CDC session that extracts change data from the log files for the high-activity group. This configuration enables the CDC session for the low-activity tables to find and extract the few change records for these tables much more quickly.

Tip: When using group definitions, you can optimize extraction efficiency by defining a CDC session in PowerCenter for each group of tables defined in the group definition file.

If you have multiple tables with the same table name but different schemas, you can define a single capture registration for the table and specify it once, in the group definition file, under a single group. For any other group that includes the same table with a different schema, you can override the schema name in the group definition by using a SCHEMA statement. By using the SCHEMA statement, you can avoid creating multiple capture registrations and specifying each one in the group definition file.

For example, if you have an EMPLOYEE table with different schemas for the north, south, east, and west regions, you can register the north EMPLOYEE table only and specify the capture registration name in the NORTH group. Then specify only the override schemas in the EAST, WEST, and SOUTH groups.

Note: SCHEMA statements are optional for DB2 for i5/OS sources and for DB2 and Oracle sources on Linux, UNIX, and Windows. SCHEMA statements are not supported for SQL Server sources on Windows or any data source on z/OS.

PowerExchange requirements for unregistered versions of tables, for which a REG statement is not specified, vary by source type:
¨ For DB2 for Linux, UNIX, and Windows, you must define any unregistered version of a table with the DATA CAPTURE CHANGES clause.
¨ For Microsoft SQL Server, you must register all versions of a table in PowerExchange and specify a REG statement in the group definition file.
¨ For Oracle, you must create an Oracle supplemental log group for the unregistered table, which is similar to the supplemental log group that was created for the registered copy of the table at registration completion.

RELATED TOPICS:
¨ “Customizing the PowerExchange Logger Configuration File” on page 28
¨ “PowerExchange Logger Group Definition File” on page 45

44 Chapter 3: PowerExchange Logger for Linux, UNIX, and Windows
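The low-activity and high-activity example above can be sketched as a small script that writes out the GROUP and REG statements for a group definition file. This is only an illustration, not part of PowerExchange: the group names, registration names, and paths are hypothetical, and the sketch assumes that a comma separates the group name from the quoted path and file-name prefix in a GROUP statement.

```python
from typing import Iterable, List, Tuple

# (group_name, external_capture_mask, registration_names)
GroupSpec = Tuple[str, str, List[str]]


def render_group_definitions(groups: Iterable[GroupSpec]) -> str:
    """Render group specifications as GROUP and REG statements, one per line."""
    seen_names = set()
    seen_masks = set()
    lines: List[str] = []
    for name, mask, registrations in groups:
        # Group names must be unique within the file, and each path and
        # file-name prefix must be unique on the system.
        if name in seen_names:
            raise ValueError(f"duplicate group name: {name}")
        if mask in seen_masks:
            raise ValueError(f"duplicate external capture mask: {mask}")
        seen_names.add(name)
        seen_masks.add(mask)
        # Assumption: comma between the two positional parameters.
        lines.append(f'GROUP=({name},"{mask}")')
        lines.extend(f"REG={registration}" for registration in registrations)
    return "\n".join(lines) + "\n"


# Hypothetical groups: one for the low-activity tables, one for the
# single high-activity table.
EXAMPLE_GROUPS: List[GroupSpec] = [
    ("LowActivity", "/user/logger_files/low/condense", ["lowact*"]),
    ("HighActivity", "/user/logger_files/high/condense", ["orders"]),
]

group_file_text = render_group_definitions(EXAMPLE_GROUPS)
```

The resulting text could be saved to a file and referenced in the GROUPDEFS parameter of pwxccl.cfg, with one PowerCenter CDC session defined per group.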

PowerExchange Logger Group Definition File

A PowerExchange Logger group definition file contains one or more GROUP statements. Each GROUP statement contains REG or SCHEMA parameters that directly or indirectly identify a group of capture registrations and tables for which you want to create separate sets of PowerExchange Logger log files.

For the PowerExchange Logger to use the group definition file, you must specify the path and file name of the file in the GROUPDEFS parameter of the pwxccl.cfg file.

Note: If you specify the GROUPDEFS parameter, the PowerExchange Logger ignores the EXT_CAPT_MASK parameter in the pwxccl.cfg file when creating log files.

The following table describes the statements and parameters in the group definition file:

Statement | Positional Parameter | Description | Data Type and Length
GROUP | group_name | A unique user-defined name for the group. This parameter is required. | VARCHAR(255)
GROUP | external_capture_mask | A unique path and file-name prefix for the PowerExchange Logger log files that are created for tables in the group. This parameter is required. Note: This path and prefix are used for the group instead of any path and prefix that are specified in the EXT_CAPT_MASK parameter of the pwxccl.cfg file. | VARCHAR(255)
REG | registration_name | Registration name that is specified in the Name field of a capture registration. This lowercase name can be the full registration name or the first part of the name followed by an asterisk (*) wildcard. Optional. If omitted, the PowerExchange Logger assumes REG=*. | VARCHAR(8)
SCHEMA | schema_name | Name of the override schema. Optional. You can specify multiple SCHEMA statements under a GROUP if you want the tables with those schemas to be included in the group. You can optionally use this parameter for DB2 for i5/OS sources and for DB2 and Oracle sources on Linux, UNIX, and Windows. Note: This parameter is not supported for SQL Server sources on Windows. If you use the offloading feature to have the PowerExchange Logger process data from z/OS sources, this parameter is also not supported for the z/OS sources. | VARCHAR(255)

Use the following rules and guidelines when you create a PowerExchange Logger group definition file:
¨ Each group_name must be unique within the group definition file.
¨ Each external_capture_mask must be unique on the system.
¨ SCHEMA statements are optional for DB2 for i5/OS sources and for DB2 and Oracle sources on Linux, UNIX, and Windows. SCHEMA statements are not supported for SQL Server sources on Windows or any data source on z/OS.
¨ If you use a SCHEMA statement, you must define a capture registration in the group.
¨ REG statements apply to the preceding SCHEMA statement. If a SCHEMA statement is not present, the REG statements apply to the preceding GROUP statement.

¨ If the file contains a SCHEMA or REG statement without a preceding GROUP statement, the PowerExchange Logger issues a syntax error.
¨ If you do not define at least one REG statement for a GROUP, the PowerExchange Logger includes all of the active capture registrations that are defined for the specified DBID instance and for which the Condense option is set to Part.
¨ If a registration belongs to multiple groups, the PowerExchange Logger logs changes for that registration only under the first group in the group definition file that includes the registration.
¨ Do not include the same schema.table value in more than one group. If a table is included in multiple groups, only the first group that includes the table logs changes for it.

Example Group Definition File

PowerExchange provides an example group definition file, pwxcclgrp.cfg, in the PowerExchange installation directory. Use this example as a starting point when creating your group definition file.

The example file contains the following statements:

    GROUP=(Company1People,"/user/logger_files/people/company1/condense")
    REG=Emp*
    REG=Manager
    GROUP=(UK_People,"/user/logger_files/people/UK/condense")
    SCHEMA=Company2
    REG=Manager
    REG=Emp*
    REG=Em*
    SCHEMA=Company3
    REG=Manager
    REG=Emp*
    GROUP=(All_Managers,"/user/logger_files/people/managers/condense")
    SCHEMA=Company1
    REG=Manager
    SCHEMA=Company2
    REG=Manager
    SCHEMA=Company3
    REG=Manager
    GROUP=(AllCompany3_Locations,"/user/logger_files/locations/company3/condense")
    REG=loc*
    GROUP=(Company2Jobs,"/user/logger_files/jobs/company2/condense")
    REG=Job*

Note: Because this example is for a group definition file on a Linux or UNIX system, the paths include forward slashes. A group definition file on a Windows system would be similar but have back slashes.

This example file defines the following groups:
¨ Company1People group. Groups all tables associated with capture registrations that have names beginning with “Emp” or the name “Manager.” Changes for these tables are logged to log files that have file names beginning with “condense” and that are located at “/user/logger_files/people/company1/.”
¨ UK_People group. Groups tables that have the schema Company2 and that are associated with capture registrations that have names beginning with “Emp” or “Em” or the name “Manager.” Changes for these tables are logged to log files that have names beginning with “condense” and that are located at “/user/logger_files/people/UK/.”
¨ All_Managers group. Groups tables that have the schema Company1, Company2, or Company3 and that are associated with the capture registration with the name “Manager.” Changes for these tables are logged to log files that have names beginning with “condense” and that are located at “/user/logger_files/people/managers/.”

and license parameters. and All_Managers groups. in the pwxccl. For more information about pwxccl syntax. PWXCCL Syntax The pwxccl statement has the following syntax: pwxccl [coldstart={Y|N}] [config=path/pwx_config_file] [cs=path/pwxlogger_config_file] [license=path/license_file] Use the following rules and guidelines when you enter the pwxccl statement: ¨ To cold start the PowerExchange Logger. the table COMPANY2. For example.” Changes for these tables are logged to log files that have names beginning with “condense” and that are located at “/user/logger_files/jobs/company2/. ¨ A warm start uses the restart and sequence tokens in the last checkpoint file to resume CDC processing. You can perform a warm start only if you have run the PowerExchange Logger previously and have recent checkpoint files. you must perform a cold start. the full path is required only if the file is not in the default location. cs.cfg configuration file to determine the point in the change stream from which the PowerExchange Logger starts reading changes. The default is N.” ¨ Company2Jobs group. PWXCCL Syntax and Parameters To start the PowerExchange Logger process. Groups all tables that are associated with capture registrations that have names beginning with “Job. ¨ All parameters are optional. Starting the PowerExchange Logger You can cold start or warm start the PowerExchange Logger process. Starting the PowerExchange Logger 47 . append an ampersand (&) at the end of the statement to run the PowerExchange Logger in background mode. ¨ On Linux and UNIX. if present. run the pwxccl program. However. If you are starting the PowerExchange Logger for the first time. the cs parameter is required. UK_People. which is located in the PowerExchange installation directory by default. You cannot use the pwxcmd program to start the PowerExchange Logger. However. ¨ A cold start uses the restart and sequence tokens. 
Groups all tables that are associated with capture registrations that have names beginning with “loc. changes for this table are logged only under the Company1People group because it is the first group in the file that includes this table. PWXCCL Parameters You can specify several optional parameters in the pwxccl statement. if you specify the config or license parameter. ¨ In the config. you must set the coldstart parameter to Y.¨ AllCompany3_Locations group.” Some tables might be included in more than one group.MANAGERS is in the Company1People. see the PowerExchange Command Reference.” Changes for these tables are logged to log files that have names beginning with “condense” and that are located at “/user/logger_files/locations/company3/.

The following table describes each parameter:

Parameter | Description
coldstart | Indicates whether to cold start or warm start the PowerExchange Logger. Enter one of the following values: Y. Cold starts the PowerExchange Logger. If you specify Y and checkpoint files exist, the PowerExchange Logger ignores the files. If the CDCT file contains records, the PowerExchange Logger deletes these records. N. Warm starts the PowerExchange Logger from the restart point that is indicated in the last checkpoint file. If no checkpoint file exists in the CHKPT_BASENAME directory, the PowerExchange Logger ends with error message PWX-33227. Default is N.
config | Full path and file name for a DBMOVER configuration file that overrides the default dbmover.cfg file in the installation directory. The override file must have a path or file name that is different from that of the default file. This override file takes precedence over any other override configuration file that you optionally specify with the PWX_CONFIG environment variable.
cs | Full path and file name of the PowerExchange Logger configuration file. Use this parameter to specify a PowerExchange Logger configuration file that overrides the default pwxccl.cfg in the installation directory. The override file must have a path or file name that is different from that of the default file.
license | Full path and file name for a license key file that overrides the default license.key file in the installation directory. The override file must have a file name or path that is different from that of the default file. This override file takes precedence over any other override license key file that you optionally specify with the PWX_LICENSE environment variable.

Note: In these parameters, the full path is required only if the file is not in the default location.

How the PowerExchange Logger Determines the Start Point for a Cold Start

When you cold start a PowerExchange Logger process, it uses the RESTART_TOKEN and SEQUENCE_TOKEN parameters, if present, in the pwxccl.cfg configuration file to determine the point in the change stream at which to start reading changes. You must specify COLDSTART=Y to perform a cold start. The absence of checkpoint files does not trigger a cold start.

Based on how you set these parameters, the PowerExchange Logger starts from one of the following points in the change stream:
¨ If you do not define the RESTART_TOKEN and SEQUENCE_TOKEN parameters, the PowerExchange Logger starts from the current end-of-log (EOL), or current point in time in the change stream.
¨ If you enter only zeroes (a single 0, or an even number of 0s) in the RESTART_TOKEN and SEQUENCE_TOKEN parameters, the PowerExchange Logger starts from the oldest available change record in the change stream.
¨ If you enter valid restart information in the RESTART_TOKEN and SEQUENCE_TOKEN parameters, the PowerExchange Logger starts from the point in the change stream that the token values identify. Use this method to start the PowerExchange Logger from a specific point.

Tip: You can generate restart and sequence tokens for the current EOL by running the DTLUAPPL utility with the RSTTKN GENERATE parameter or by performing a database row test with the SELECT CURRENT_RESTART SQL statement in PowerExchange Navigator.
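The pwxccl syntax rules and the all-zeroes token convention can be sketched as a small helper that a start-up script might use. This is a minimal illustration, not a PowerExchange API: the function names are hypothetical, and the sketch only assembles the statement text and checks the documented rule that cs is required when config or license is specified.

```python
from typing import List, Optional


def build_pwxccl_command(coldstart: bool = False,
                         config: Optional[str] = None,
                         cs: Optional[str] = None,
                         license_file: Optional[str] = None,
                         background: bool = False) -> str:
    """Assemble a pwxccl statement from the optional parameters."""
    # Documented rule: if config or license is specified, cs is required.
    if (config or license_file) and not cs:
        raise ValueError("cs is required when config or license is specified")
    parts: List[str] = ["pwxccl"]
    if coldstart:
        # coldstart defaults to N (warm start), so emit it only for a cold start.
        parts.append("coldstart=Y")
    if config:
        parts.append(f"config={config}")
    if cs:
        parts.append(f"cs={cs}")
    if license_file:
        parts.append(f"license={license_file}")
    if background:
        # Linux and UNIX only: run the PowerExchange Logger in background mode.
        parts.append("&")
    return " ".join(parts)


def is_all_zero_token(value: str) -> bool:
    """Return True for a single 0 or an even number of 0s."""
    return bool(value) and set(value) == {"0"} and (len(value) == 1 or len(value) % 2 == 0)
```

For example, build_pwxccl_command(coldstart=True, background=True) produces the statement for a cold start in background mode, and is_all_zero_token can flag RESTART_TOKEN and SEQUENCE_TOKEN values that fall under the "only zeroes" case before the Logger is started.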

Cold Starting the PowerExchange Logger

Use this procedure to cold start the PowerExchange Logger. During a cold start, the PowerExchange Logger ignores any checkpoint files that exist in the directory that is specified by the CHKPT_BASENAME parameter in the pwxccl.cfg file. If the CDCT file contains records, the PowerExchange Logger deletes these records.

To cold start the PowerExchange Logger:
1. In the pwxccl.cfg configuration file, set the RESTART_TOKEN and SEQUENCE_TOKEN parameters in a manner that causes the PowerExchange Logger to start from the appropriate point in the change stream.
2. If you previously ran the PowerExchange Logger and have existing checkpoint, CDCT, and log files, retain these files for historical purposes. You can move or rename the files, as long as another PowerExchange Logger process is not using them. Do not delete them if you want to retain change processing history.
   Warning: If you delete, move, or rename the CCT file, the capture registrations will not be available.
3. To cold start the PowerExchange Logger, enter the following statement at the command line:
       pwxccl coldstart=y
   In the start statement, you must include COLDSTART=Y. Include the optional config, cs, and license parameters if you want to override the default dbmover.cfg, pwxccl.cfg, and license.key files. On Linux and UNIX systems, you can add an ampersand (&) at the end of the statement to run the PowerExchange Logger in background mode.

For more information about PowerExchange Logger syntax, see the PowerExchange Command Reference.

RELATED TOPICS:
¨ “How the PowerExchange Logger Determines the Start Point for a Cold Start” on page 48
¨ “PWXCCL Parameters” on page 47

Managing the PowerExchange Logger

To assess the status of the PowerExchange Logger for Linux, UNIX, and Windows, you can display messages about PowerExchange Logger processing, memory use, and CPU use. Occasionally, you might need to stop the PowerExchange Logger.

Commands for Controlling and Stopping PowerExchange Logger Processing

Use PowerExchange Logger for Linux, UNIX, and Windows commands to manually initiate a file switch or another logging cycle, stop the PowerExchange Logger, or display messages about PowerExchange Logger processing and system resource use. The output is displayed on screen and written to the PowerExchange message log.

You can enter these commands from the command line or by using the pwxcmd program.

Note: To use pwxcmd, you must specify the CONDENSENAME parameter in the pwxccl.cfg file and the SVCNODE statement in the dbmover.cfg file.

The following table describes each command:

Command-line Command | pwxcmd Command | Description
CONDENSE | condense | When the PowerExchange Logger is running in continuous mode, manually starts a new PowerExchange Logger logging cycle before the wait period for starting another cycle has elapsed. The wait period is defined by the NO_DATA_WAIT parameter in pwxccl.cfg.
DISPLAY ALL | displayall | Displays all messages that can be produced by the other PowerExchange Logger DISPLAY commands, arranged by command.
DISPLAY CHECKPOINTS | displaycheckpoints | Displays message PWX-26041, which reports information about the latest checkpoint file. The information includes the file sequence number, timestamp, number of data records and commit records, and commit time.
DISPLAY CPU | displaycpu | Displays the CPU time spent, in microseconds, for PowerExchange Logger processing during the current logging cycle, by processing phase. Processing phases include reading source data, writing data to log files, performing file switches, and performing “other” processing such as initialization. Also includes the total CPU time for all PowerExchange Logger processing.
DISPLAY EVENTS | displayevents | Displays events that the PowerExchange Logger Controller, Command Handler, and Writer tasks are waiting on. Also indicates if the Writer is processing data or is in a sleep state waiting for an event or timeout to occur.
DISPLAY MEMORY | displaymemory | Displays PowerExchange Logger memory use, in bytes, for each PowerExchange Logger task and subtask, with totals for the entire PowerExchange Logger process. Memory use is reported for the following categories: Application, Total, and Maximum.
DISPLAY RECORDS | displayrecords | Displays counts of change records that the PowerExchange Logger processed during the current processing cycle. If the PowerExchange Logger did not receive changes during the current cycle, displays counts of change records for the current set of PowerExchange Logger log files. Record counts are shown by record type. Record types are Delete, Insert, Update, Commit, and Total.
DISPLAY STATUS | displaystatus | Displays the status of the PowerExchange Logger Writer subtask, for example, initializing, writing source data to a PowerExchange Logger log file, or starting a checkpoint.
FILESWITCH | fileswitch | Closes open PowerExchange Logger log files if they contain data and then switches to a new set of log files. If the log files do not contain data, the file switch does not occur. If you use batch extraction mode, you can use this command to make change data in the current log files available for extraction processing before the next file switch is due to occur. Usually, you do not need to perform manual file switches if you use continuous extraction mode. To issue the fileswitch command from a script or batch file, you must use the pwxcmd program.
SHUTCOND | shutcond | Stops the PowerExchange Logger in a controlled manner after initiating and completing a final logging cycle. The final logging cycle enables the PowerExchange Logger to capture all of the changes up to the point when the command is issued. After the logging cycle completes, the PowerExchange Logger closes open log files, updates the CDCT file, takes a final checkpoint to record the latest restart and sequence tokens, closes the CAPI, stops the Writer and Command Handler subtasks, and then ends the pwxccl program. Use this command if a logging cycle has not run recently.
SHUTDOWN | shutdown | Stops the PowerExchange Logger in a controlled manner after closing any open PowerExchange Logger log files and writing the latest restart position to the checkpoint files. During shutdown processing, the PowerExchange Logger closes open log files, updates the CDCT file, takes a final checkpoint, closes the CAPI, stops the Writer and Command Handler subtasks, and then ends the pwxccl program. Use this command to stop a PowerExchange Logger process that is running in continuous mode.

For more information about command syntax, example output, and pwxcmd use, see the PowerExchange Command Reference.

Assessing PowerExchange Logger Performance

To assess PowerExchange Logger performance, you can view key PowerExchange Logger messages that report CPU use and elapsed times for processing.

Enter VERBOSE=Y in the pwxccl.cfg configuration file to have the PowerExchange Logger produce more detailed messages during initialization, condense, fileswitch, record expiration, and shutdown processing. For example, the following verbose messages indicate CPU use by the Writer subtask:
¨ Message PWX-33274 is issued before the Writer subtask starts reading source data after initialization and before the PowerExchange Logger shuts down:
      PWX-33274 CPU Total number. CAPI Read number. Writing number. File switching number. Other number
¨ Message PWX-33279 is issued after each file switch and checkpoint:
      PWX-33279 CPU total number. This file total number. CAPI Reads number. Writing file number. Other number
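If the verbose CPU messages are collected from the message log, the labelled values can be pulled out for trending. The following is a rough sketch under stated assumptions: it assumes each message consists of the message ID followed by label and number pairs, as in the templates above, and it tolerates either periods or commas between pairs. The helper name is hypothetical, and real log lines may carry prefixes such as timestamps that this sketch does not handle.

```python
import re
from typing import Dict

# Message IDs that report CPU use by the Writer subtask.
CPU_MESSAGE_IDS = ("PWX-33274", "PWX-33279")

# A label made of letters and spaces, followed by an integer value.
_PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?)\s+(\d+)")


def parse_cpu_message(line: str) -> Dict[str, int]:
    """Split a PWX-33274 or PWX-33279 message into {label: value} pairs."""
    message_id, _, rest = line.strip().partition(" ")
    if message_id not in CPU_MESSAGE_IDS:
        raise ValueError(f"not a CPU use message: {message_id}")
    return {label.strip(): int(value) for label, value in _PAIR.findall(rest)}
```

The labels are taken verbatim from the message text, so "CPU total" in PWX-33279 and "CPU Total" in PWX-33274 remain distinct keys.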

If you do not use verbose messaging, you can use the DISPLAY CPU and DISPLAY RECORDS commands to gather statistics that are useful for assessing PowerExchange Logger performance and status.
¨ The DISPLAY CPU command displays the CPU time spent, in microseconds, for PowerExchange Logger processing during the current logging cycle, by processing phase and with the total for all processing. Processing phases include:
   - Reading source data
   - Writing data to PowerExchange Logger log files
   - Performing file switches
   - Performing "other processing," such as initialization and Command Handler processing of commands
¨ The DISPLAY RECORDS command displays counts of change records that the PowerExchange Logger processed during the current processing cycle. If the PowerExchange Logger did not receive changes during the current cycle, the command displays counts of change records for the current PowerExchange Logger log files. Record counts are shown for each type of change record processed and for total records processed. Change record types include Delete, Insert, Update, and Commit.

For more information about these commands, including example output, see the PowerExchange Command Reference.

Maintaining the PowerExchange Logger CDCT File and Log Files

You can use the PWXUCDCT utility to maintain the PowerExchange Logger CDCT file and log files.

Use the following utility commands to perform maintenance tasks:

Command | Description
CREATE_CDCT_BACKUP | Back up all CDCT records for the source instance that is specified in the DBID parameter of the pwxccl.cfg configuration file.
DELETE_EXPIRED_CDCT | Delete CDCT records for which the retention period has expired and any PowerExchange Logger log files that are referenced by those records. Use this command only if you set the LOGGER_DELETES_EXPIRED_CDCT_RECORDS parameter to N in the pwxccl.cfg file.
DELETE_ORPHAN_FILES | Delete PowerExchange Logger log files that are not referenced by any record in the CDCT file.
DERIVE_CDCT_BACKUP | Create a backup of the CDCT file based on PowerExchange Logger log files, if the original backup file is damaged or deleted.
REPORT_CDCT | List information about the CDCT file and its records. For each CDCT record, the command reports the record number, registration tag name, log file name, number of change records received for the registered table, start and end times, and start and end restart tokens.
REPORT_CDCT_BY_TIME | List CDCT records in the order in which they expire.
REPORT_CONFIG | List the parameter settings in the PowerExchange Logger pwxccl.cfg configuration file.
REPORT_CHECKPOINTS | List checkpoint files in chronological order, based on when they were written, from earliest to latest. For each file, the list provides information such as number of capture registrations processed, reason for the checkpoint, sequence and restart tokens, number of expired CDCT records that were deleted, and number of log files to which change data was written.
REPORT_EXPIRED_CDCT |
 | List PowerExchange Logger log files based on their file names.
REPORT_FILES_BY_TIME | List PowerExchange Logger log files in the order in which they were created, from earliest to latest.
REPORT_ORPHAN_FILES | List PowerExchange Logger log files that are not referenced by any record in the CDCT file.
RESTORE_CDCT | Restore the CDCT file from a backup if the CDCT file is damaged or deleted.

For more information about the PWXUCDCT utility, see the PowerExchange Utilities Guide.

Backing Up PowerExchange Logger Files

Periodically, back up the PowerExchange Logger for Linux, UNIX, and Windows checkpoint files, CDCT file, and log files. If existing files become damaged or deleted, you can then use the backups to restore the files.

If possible, back up the PowerExchange Logger files during a period when source data is not being written to the PowerExchange Logger log files.

To back up the CDCT file, you can use the PWXUCDCT utility CREATE_CDCT_BACKUP command.

Back up the files in the following sequence to ensure that you have a checkpoint file that matches the backup:
1. Checkpoint files
2. CDCT file
3. PowerExchange Logger log files

Note: During a file switch, the Writer subtask processes the files in the reverse sequence.

Re-creating the CDCT File After a Failure

If the CDCT file and its recent backups are damaged or deleted, you can re-create the CDCT file based on the PowerExchange Logger log files. You must derive a CDCT backup based on the current PowerExchange Logger log files and then restore that backup.
1. Issue the PWXUCDCT utility DERIVE_CDCT_BACKUP command.
2. Restore the derived backup by issuing the PWXUCDCT utility RESTORE_CDCT command.
3. Verify that the restore operation was successful as follows:
   ¨ Verify that the return code from the PWXUCDCT utility is zero.
   ¨ Verify that messages PWX-25140 through PWX-25145 provide reasonable record counts for the records read from the backup file and for the records that were changed in the CDCT file.

For more information about using PWXUCDCT utility commands, see the PowerExchange Utilities Guide.
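The backup sequence described under Backing Up PowerExchange Logger Files (checkpoint files first, then the CDCT file, then the log files) can be sketched as a simple copy routine. This is an illustration under assumptions, not an Informatica-supplied procedure: the function name is hypothetical, the caller supplies the actual file locations, and the sketch uses a plain file copy for the CDCT file where the PWXUCDCT CREATE_CDCT_BACKUP command could be used instead.

```python
import shutil
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def backup_logger_files(checkpoint_files: Iterable[PathLike],
                        cdct_file: PathLike,
                        log_files: Iterable[PathLike],
                        backup_dir: PathLike) -> List[Path]:
    """Copy files to backup_dir in the documented sequence and return the copies."""
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Documented order: 1. checkpoint files, 2. CDCT file, 3. log files.
    ordered = [*map(Path, checkpoint_files), Path(cdct_file), *map(Path, log_files)]
    copied: List[Path] = []
    for source in ordered:
        destination = target / source.name
        shutil.copy2(source, destination)  # copy2 also preserves timestamps
        copied.append(destination)
    return copied
```

Running the routine during a period when no source data is being written to the log files, as recommended above, reduces the chance that the copied checkpoint file and log files disagree.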

Part III: PowerExchange CDC Data Sources

This part contains the following chapters:
¨ DB2 for Linux, UNIX, and Windows Change Data Capture, 56
¨ Microsoft SQL Server Change Data Capture, 70
¨ Oracle Change Data Capture with Oracle LogMiner, 80

CHAPTER 4

DB2 for Linux, UNIX, and Windows Change Data Capture

This chapter includes the following topics:
¨ DB2 for Linux, UNIX, and Windows CDC Overview, 56
¨ Planning for DB2 CDC, 57
¨ Configuring DB2 for CDC, 58
¨ Configuring PowerExchange for DB2 CDC, 59
¨ Using a DB2 Data Map, 65
¨ Managing DB2 CDC, 66
¨ DB2 for Linux, UNIX, and Windows CDC Troubleshooting, 69

DB2 for Linux, UNIX, and Windows CDC Overview

PowerExchange captures change data from the DB2 for Linux, UNIX, and Windows recovery logs for the database that contains your source tables. PowerExchange uses the PowerExchange Client for PowerCenter (PWXPC) to coordinate with PowerCenter to move the captured change data to one or more targets.

For PowerExchange to capture DB2 change data, you must perform the following configuration tasks in DB2:
¨ Ensure that archive logging is active for the database.

Also, perform the following configuration tasks in PowerExchange:
¨ Create a PowerExchange capture catalog table in the database. The capture catalog table stores information about all tables in the source database, including column definitions and DB2 log positions.
¨ Define a capture registration for each source table. In the capture registration, you can select a subset of columns for which to capture data. PowerExchange generates a corresponding extraction map. Optionally, you can define an additional extraction map.
¨ If a source table contains columns in which you store data in a format that is inconsistent with the column datatype, you can optionally create a data map to manipulate that data with expressions. For example, if you store packed data in a CHAR column, you can create a data map to manipulate and prepare that data for loading to a target. You must merge the data map with the extraction map for the source table during capture registration creation.
¨ If you want to use the PowerExchange Logger for Linux, UNIX, and Windows to capture change data and write it to PowerExchange Logger log files, configure the PowerExchange Logger. The change data is then extracted from the PowerExchange Logger log files. Benefits of the PowerExchange Logger include fewer database accesses, faster CDC restart, and no need to prolong retention of DB2 log files for change capture.

PowerExchange works in conjunction with PowerCenter to extract change data from DB2 recovery logs or PowerExchange Logger log files and load that data to one or more targets.

RELATED TOPICS:
¨ “PowerExchange Logger for Linux, UNIX, and Windows” on page 19
¨ “Introduction to Change Data Extraction” on page 105
¨ “Extracting Change Data” on page 125

Planning for DB2 CDC

Before you configure DB2 for Linux, UNIX, and Windows CDC, verify that the following prerequisites and user authority requirements are met. Also, review the restrictions so that you can properly configure CDC.

Prerequisites

PowerExchange CDC has the following prerequisites:
¨ Archive logging must be active for the database that contains the source tables from which change data is to be captured.
¨ DB2 source tables must have been defined with the DATA CAPTURE CHANGES clause for capture processing to occur.

Required User Authority

For PowerExchange to read change data from DB2 logs, the user ID that you specify for database access must have SYSADM or DBADM authority. Usually, you specify this user ID in the UDB CAPI_CONNECTION statement in the dbmover.cfg file.

CDC Restrictions
The following restrictions apply to DB2 CDC processing:
¨ To extract change data on a DB2 client machine that is remote from the DB2 server where the change data is captured, both machines must have the same architecture. Otherwise, change data capture processing might fail with the error message PWX-20628.
¨ The maximum length of a row from which PowerExchange can capture change data is 32 KB.
¨ PowerExchange cannot capture change data for the following DB2 datatypes:
  - DECFLOAT, LOB, and XML datatypes. You can create a capture registration for a table that includes columns with DECFLOAT, LOB, and XML datatypes. However, the registration does not include these columns, and PowerExchange does not capture change data for them. PowerExchange does capture change data for the other columns in the registered table that have supported datatypes.
  - User-defined datatypes. Tables that include columns with user-defined datatypes cannot be registered for change data capture. PowerExchange cannot capture change data for these tables.
¨ If you alter a column datatype to or from FOR BIT DATA, PowerExchange does not detect the datatype change. PowerExchange continues to use the datatype that is specified in the existing capture registration.
¨ To add or drop partitions in a partitioned database and then redistribute table data across the updated partition group, or to reconfigure a database partition group, you must use a special procedure. Otherwise, PowerExchange might not be able to resume change data capture properly.
¨ In a partitioned database, if an UPDATE to a table row changes the partition key and that change causes the row to move to another partition, PowerExchange processes the UPDATE as two operations: a DELETE and an INSERT. However, based on the DB2 log information, PowerExchange cannot predictably determine the order in which to perform the DELETE and INSERT operations. If the INSERT is processed first, both the original row and the updated row appear on the target until the DELETE is processed.

RELATED TOPICS:
¨ “Reconfiguring a Partitioned Database or Database Partition Group” on page 67

Configuring DB2 for CDC
To configure DB2 for Linux, UNIX, or Windows for PowerExchange CDC, perform the following tasks:
1. In the DB2 Control Center Configure Database Logging Wizard, enable archive logging for the DB2 database. For more information, see the IBM DB2 documentation. If archive logging is not enabled, PowerExchange issues the error messages PWX-20204 and PWX-20229 during CDC.
2. Verify that the DB2 source tables are defined with the DATA CAPTURE CHANGES clause.
3. If a table that is selected for change data capture includes columns with a LONG datatype, use the INCLUDE LONGVAR COLUMNS clause to alter the table so that PowerExchange can capture data for the LONG columns. Otherwise, PowerExchange might issue the error message PWX-20094 during CDC processing.
4. Set the following user environment variables in any process that runs PowerExchange CDC or the DTLUCUDB program:
   ¨ Set DB2NOEXITLIST to ON.
   ¨ Set DB2CODEPAGE to 1208.
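The two environment variables named in the configuration tasks must be present in the process that runs PowerExchange CDC or DTLUCUDB. The following is a minimal Python sketch of one way to prepare such an environment before launching a process. The function name pwx_db2_environment is hypothetical and is not part of PowerExchange; only the variable names and values come from this guide.

```python
import os

# Variable names and values as stated in the DB2 configuration tasks.
PWX_DB2_ENV = {"DB2NOEXITLIST": "ON", "DB2CODEPAGE": "1208"}


def pwx_db2_environment(base=None):
    """Return a copy of an environment with the DB2 variables set.

    base defaults to the current process environment. The original
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    env.update(PWX_DB2_ENV)
    return env


if __name__ == "__main__":
    env = pwx_db2_environment()
    # The resulting mapping can be passed as env= to subprocess.run()
    # when starting the process that performs CDC work.
    print(env["DB2NOEXITLIST"], env["DB2CODEPAGE"])
```

Copying the environment rather than editing os.environ in place keeps the settings scoped to the launched process.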

Configuring PowerExchange for DB2 CDC
The tasks that you perform to configure PowerExchange for DB2 for Linux, UNIX, and Windows CDC depend on whether you want to use the PowerExchange Logger for Linux, UNIX, and Windows and the extraction mode you plan to use.

RELATED TOPICS:
¨ “PowerExchange Logger for Linux, UNIX, and Windows” on page 19

Configuring PowerExchange CDC without the PowerExchange Logger
If you plan to run extractions in real-time extraction mode and not use the PowerExchange Logger for Linux, UNIX, and Windows, complete the following tasks to configure PowerExchange CDC:
1. Create the PowerExchange capture catalog table.
2. Run the DTLUCUDB SNAPSHOT command to initialize the capture catalog table.
3. When you configure the dbmover.cfg file, include the following statements:
   ¨ CAPT_PATH
   ¨ CAPT_XTRA
   ¨ UDB CAPI_CONNECTION
4. In the PowerExchange Navigator, create a capture registration for each source table. The PowerExchange Navigator generates a corresponding extraction map. Optionally, create a data map if you want to perform field-level processing. If capture registrations already exist for the source tables, delete the existing registrations and extraction maps and create new ones.
   Tip: Set the Condense option to Part even though you do not plan to use the PowerExchange Logger, unless you have a specific reason not to do so. This practice prevents having to edit the capture registrations later if you decide to use the PowerExchange Logger. You might want to set the Condense option to None if you plan to run both real-time and continuous extractions against tables defined by the same capture registrations and you do not want the PowerExchange Logger to capture change data for some registered tables.
5. Activate the capture registrations. Usually, you do this task after materializing the targets.

Next Step: Configure and start extractions. You must use real-time extraction mode.

RELATED TOPICS:
¨ “Creating the Capture Catalog Table” on page 60
¨ “Initializing the Capture Catalog Table” on page 61
¨ “Customizing dbmover.cfg for DB2 CDC” on page 61
¨ “Introduction to Change Data Extraction” on page 105
¨ “Extracting Change Data” on page 125
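The task list above includes running the DTLUCUDB SNAPSHOT command. As an illustration, the following Python sketch assembles the command arguments from the syntax that this guide documents under “Initializing the Capture Catalog Table”. The helper name build_snapshot_args is hypothetical, and the assumption that the utility is invoked under the name DTLUCUDB (exact name and case can differ by platform) is mine, not the guide's.

```python
def build_snapshot_args(db=None, ccatalog=None, uid=None, epwd=None, replace=None):
    """Build an argument list for DTLUCUDB SNAPSHOT.

    Every parameter is optional, mirroring the bracketed syntax:
    DB, CCATALOG, UID, EPWD, and REPLACE=Y|N.
    """
    args = ["DTLUCUDB", "SNAPSHOT"]  # executable name is an assumption
    optional = (("DB", db), ("CCATALOG", ccatalog), ("UID", uid), ("EPWD", epwd))
    for key, value in optional:
        if value:
            args.append(f"{key}={value}")
    if replace is not None:
        if replace not in ("Y", "N"):
            raise ValueError("REPLACE must be Y or N")
        args.append(f"REPLACE={replace}")
    return args


if __name__ == "__main__":
    # SAMPLE is a placeholder database name.
    print(" ".join(build_snapshot_args(db="SAMPLE", replace="N")))
```

Building the arguments as a list avoids shell quoting problems if the list is later passed to subprocess.run().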

Configuring PowerExchange CDC with the PowerExchange Logger
If you plan to use the PowerExchange Logger for Linux, UNIX, and Windows and run extractions in batch or continuous extraction mode, complete the following tasks to configure PowerExchange CDC:
1. Create the PowerExchange capture catalog table.
2. Run the DTLUCUDB SNAPSHOT command to initialize the capture catalog table.
3. When you configure the dbmover.cfg file, include the following statements:
   ¨ CAPT_PATH
   ¨ CAPT_XTRA
   ¨ UDB CAPI_CONNECTION
   ¨ CAPX CAPI_CONNECTION (for continuous extraction mode only)
4. In the PowerExchange Navigator, create a capture registration for each DB2 source table. You must select Part in the Condense drop-down list. The PowerExchange Navigator generates a corresponding extraction map. If capture registrations already exist for these tables, delete the existing registrations and extraction maps and create new ones.
5. Activate the capture registrations. Usually, you do this task after materializing the targets.
6. Configure the pwxccl.cfg file for the PowerExchange Logger.
7. Start the PowerExchange Logger.

Next Step: Configure and start extractions. You can use either batch extraction mode or continuous extraction mode.

RELATED TOPICS:
¨ “Configuring the PowerExchange Logger” on page 27
¨ “Starting the PowerExchange Logger” on page 47
¨ “Creating the Capture Catalog Table” on page 60
¨ “Initializing the Capture Catalog Table” on page 61
¨ “Customizing dbmover.cfg for DB2 CDC” on page 61
¨ “Introduction to Change Data Extraction” on page 105
¨ “Extracting Change Data” on page 125
¨ “CAPX CAPI_CONNECTION Parameters” on page 14

Creating the Capture Catalog Table
The PowerExchange capture catalog table stores information about the CDC source tables, column definitions, and valid DB2 log positions. You must create this table in the same database that contains the source tables from which change data is captured. If the database has only a single partition, the capture catalog table still contains positioning information for the partition. If the database has multiple partitions, the capture catalog table stores positioning information for each partition.

Use the following DDL to create the capture catalog table:

    CREATE TABLE DTLCCATALOG (
        VTSTIME   TIMESTAMP     NOT NULL,
        VTSACC    INTEGER       NOT NULL,
        NODENUM   SMALLINT      NOT NULL,
        SEQ       INTEGER       NOT NULL,
        TBSCHEMA  VARCHAR(128),
        TBNAME    VARCHAR(128),
        OP        VARCHAR(1024) NOT NULL,
        PRIMARY KEY(VTSTIME, VTSACC, NODENUM, SEQ)
    )

In this DDL, the table name is DTLCCATALOG. If necessary, you can specify another table name.

Tip: Informatica recommends that you place the PowerExchange capture catalog table in the DB2 catalog partition.

Initializing the Capture Catalog Table
To initialize the PowerExchange capture catalog table, run the DTLUCUDB utility with the SNAPSHOT command. You should need to do this task only once.

To specify the command, use the following syntax:

    DTLUCUDB SNAPSHOT [DB=database_name] [CCATALOG=capture_catalog_name]
        [UID=user_id] [EPWD=encrypted_password] [REPLACE=Y|N]

If the capture catalog table contains existing rows of data, you must set the REPLACE parameter to Y to enable PowerExchange to overwrite the data. For a new capture catalog table, accept the default of N.

Note: If you run the DTLUCUDB SNAPSHOT command while the DB2 catalog is being updated, the snapshot fails. If this failure occurs, run the SNAPSHOT command again after the DB2 catalog updates are complete.

After the snapshot successfully completes, back up the capture catalog table to create a point of consistency for recovery.

Customizing dbmover.cfg for DB2 CDC
In the dbmover.cfg configuration file, include the CAPI connection statement that is specific to DB2 for Linux, UNIX, and Windows sources. Also add the other statements that are required for CDC and any optional statements that you want to use.

The following statements are required for DB2 CDC:
¨ CAPT_PATH. Path to the local directory where the following files reside: CCT file for capture registrations, CDEP file for application names used in ODBC extractions, and CDCT file for information about PowerExchange Logger for Linux, UNIX, and Windows log files. Add this statement to the dbmover.cfg file on the system where DB2 capture registrations are stored. Usually, this location is where the source database resides. This location corresponds to the Location node that you specify when defining a registration group.
¨ CAPT_XTRA. Path to the local directory for extraction maps.
¨ UDB CAPI_CONNECTION. A named set of parameters that the CAPI uses to connect to the change stream and control extraction processing for DB2 for Linux, UNIX, and Windows sources.

If you plan to use continuous extraction mode, you must also define the CAPX CAPI_CONNECTION statement.

To find PowerExchange messages more easily, include the LOGPATH statement. This statement defines a specific directory for the PowerExchange message log files.

RELATED TOPICS:
¨ “CAPX CAPI_CONNECTION Parameters” on page 14
¨ “DB2 for Linux, UNIX, and Windows CAPI_CONNECTION Parameters” on page 62
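The list of required dbmover.cfg statements lends itself to a quick pre-flight check. The Python sketch below scans configuration text and reports which of the three required statements appear to be missing. It is an illustration only: the function name missing_cdc_statements is hypothetical, the parsing is deliberately naive, and treating lines that begin with /* as comments is an assumption about the file format.

```python
import re


def missing_cdc_statements(cfg_text):
    """Return the required DB2 CDC statements not found in cfg_text."""
    lines = [ln.strip() for ln in cfg_text.splitlines()]
    # Assumption: comment lines start with "/*".
    lines = [ln for ln in lines if ln and not ln.startswith("/*")]

    missing = []
    for name in ("CAPT_PATH", "CAPT_XTRA"):
        if not any(re.match(rf"{name}\s*=", ln, re.IGNORECASE) for ln in lines):
            missing.append(name)

    # A CAPI_CONNECTION statement can span several lines, so collapse
    # all whitespace before looking for the UDB connection type.
    compact = re.sub(r"\s+", "", "".join(lines)).upper()
    if "CAPI_CONNECTION=(" not in compact or "TYPE=(UDB" not in compact:
        missing.append("UDB CAPI_CONNECTION")
    return missing


if __name__ == "__main__":
    sample = "CAPT_PATH=c:/pwxcapt/Vnnn\n"
    print(missing_cdc_statements(sample))
```

The check confirms only that the statements are present. It does not validate paths or connection parameters.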

Example Statements
The following statements are typical of those included in a dbmover.cfg for DB2 for Linux, UNIX, and Windows CDC:

    CAPT_PATH=c:/pwxcapt/Vnnn
    CAPT_XTRA=c:/pwxcapt/Vnnn/extrmaps
    CAPI_CONN_NAME=UDBCC
    CAPI_CONNECTION=(NAME=UDBCC
      ,DLLTRACE=bbbb
      ,TYPE=(UDB
        ,CCATALOG=mylib.captcat_tbl
        ,USERID=db2admin
        ,PASSWORD=db2admin))

DB2 for Linux, UNIX, and Windows CAPI_CONNECTION Parameters
The UDB CAPI_CONNECTION statement specifies the Consumer API (CAPI) parameters needed for DB2 for Linux, UNIX, and Windows CDC sources.

Data Sources: DB2 for Linux, UNIX, and Windows
Required: Yes for DB2 for Linux, UNIX, and Windows CDC
Syntax:

    CAPI_CONNECTION=(
      [DLLTRACE=trace_id,]
      NAME=name,
      [TRACE=trace,]
      TYPE=(UDB,
        [CCATALOG=capture_catalog,]
        [DBCONN=database_name,]
        [EPWD=encrypted_password,]
        [MEMCACHE=cache_size,]
        [PASSWORD=password,]
        [RSTRADV=seconds,]
        [SPACEPRI=primary_space,]
        [UDBSCHEMA=schema,]
        [UPDINT=seconds,]
        [UPDREC=num_records,]
        [USERID=user_id]
      )
    )

Parameters: Enter the following parameters:

DLLTRACE=trace_id
Optional. User-defined name of the TRACE statement that activates internal DLL tracing for this CAPI. Specify this parameter only at the direction of Informatica Global Customer Support.

NAME=name
Required. Unique user-defined name for this CAPI_CONNECTION statement. Maximum length is eight alphanumeric characters.

TRACE=trace
Optional. User-defined name of the TRACE statement that activates the common CAPI tracing. Specify this parameter only at the direction of Informatica Global Customer Support.

TYPE=(UDB, ... )
Required. Type of CAPI_CONNECTION statement. For DB2 for Linux, UNIX, and Windows sources, this value must be UDB.

CCATALOG=capture_catalog
Optional. Name of the PowerExchange capture catalog table in the format creator.table_name. Default is creator.DTLCCATALOG, where creator is the user ID that is used to connect to the database.

DBCONN=database_name
Optional. A database name that specifies an override database to which to connect for data extraction. Use this parameter if you want to extract change data from another database that is identical to the one specified in the registration group. The override database must contain tables and columns that are identical to those in the original database. The original database name is included in the registration tag names and extraction map names.

EPWD=encrypted_password
Optional. Encrypted password that is used with the database user ID specified in the USERID parameter. If you specify the USERID parameter, you must specify either the PASSWORD or EPWD parameter. Do not specify both PASSWORD and EPWD. You can create encrypted passwords by using the PowerExchange Navigator.

MEMCACHE=cache_size
Optional. Memory cache size, in kilobytes, that PowerExchange allocates to reconstruct complete UOWs. For each extraction session, PowerExchange keeps all changes for each UOW in the memory cache until it processes the end-UOW record. If the memory cache is too small to hold all of the changes in a UOW, PowerExchange spills the changes to sequential files on disk, called UOW spill files. Each UOW spill file contains one UOW. A UOW might require multiple UOW spill files to hold all of the changes for that UOW. If the change stream contains multiple large UOWs and the memory cache is insufficient, PowerExchange might create numerous UOW spill files.
PowerExchange processes the change stream more efficiently if it does not need to use UOW spill files. In addition to degrading extraction performance, large numbers of UOW spill files can cause a disk space shortage.
Important: If the change stream contains only small UOWs, the default value might be sufficient. However, the default value is often too small to eliminate UOW spill files. Informatica recommends that you specify a larger value.

The location in which PowerExchange allocates the UOW spill files varies by operating system, as follows:
¨ For Linux and UNIX, PowerExchange uses the current directory by default for UOW spill files. To use a different directory, specify the TMPDIR environment variable. PowerExchange creates the UOW spill file names by using the operating system tempnam function with a prefix of dtlq.
  Note: The UOW spill files are temporary files that are deleted when PowerExchange closes them. They are not visible in the directory while open.
¨ For Windows, PowerExchange uses the current directory by default for UOW spill files. To use a different directory, specify the TMP environment variable. PowerExchange creates the UOW spill file names by using the Windows _tempnam function with a prefix of dtlq.
Valid values are from 1 through 519720. Default is 1024, or 1 MB.
Warning: Because PowerExchange allocates the cache size for each extraction operation, use caution when coding large values for MEMCACHE. Otherwise, many concurrent extraction sessions might cause memory constraints.

PASSWORD=password
Optional. Clear text password that is used with the database user ID specified in the USERID parameter. If you specify the USERID parameter, you must specify either the PASSWORD or EPWD parameter. Do not specify both PASSWORD and EPWD.

RSTRADV=nnnnn
Time interval, in seconds, that PowerExchange waits before advancing restart and sequence tokens for a registered data source during periods when UOWs do not include any changes of interest for the data source. When the wait interval expires, PowerExchange returns the next committed "empty UOW," which includes only updated restart information.
The wait interval is reset to 0 when PowerExchange completes processing a UOW that includes changes of interest or returns an empty UOW because the wait interval expired without any changes of interest having been received.
For example, if you specify 5, PowerExchange waits 5 seconds after it completes processing the last UOW or after the previous wait interval expires. Then PowerExchange returns the next committed empty UOW that includes the updated restart information and resets the wait interval to 0.
If RSTRADV is not specified, PowerExchange does not advance restart and sequence tokens for a registered source during periods when no changes of interest are received. In this case, when PowerExchange warm starts, it reads all changes, including those not of interest for CDC, from the restart point.
Valid values are 0 through 86400. No default is provided.
Warning: A value of 0 can degrade performance because PowerExchange returns an empty UOW after each UOW processed.

SPACEPRI=primary_space
Optional. PowerExchange allocates UOW spill files as temporary files. Valid values are from 1 through 2147483647.

Default is 2147483647, or 2 GB.

UDBSCHEMA=schema
Optional. Schema name that overrides the schema name in capture registrations.

UPDINT=seconds
Optional. Minimum number of seconds that PowerExchange must wait after encountering a virtual timestamp (VTS) in the DB2 log records for a partition before writing a positioning entry to the PowerExchange capture catalog table. The positioning entry, which is composed of a log sequence number (LSN) and VTS, indicates a location in the DB2 logs.
Valid values are from 1 through 2147483647. Default is 600.
Note: The UPDREC minimum number of records must also be met before positioning entries can be written to the capture catalog table.

UPDREC=number_records
Optional. Minimum number of DB2 log records that PowerExchange must read for a partition before it can write a positioning entry to the PowerExchange capture catalog table. The positioning entry, which is composed of a LSN and VTS, indicates a location in the DB2 logs.
Valid values are from 1 through 2147483647. Default is 10000.
Note: The UPDINT minimum wait period must also be met before positioning entries can be written to the capture catalog table.

USERID=user_id
Optional. Database user ID. The user ID must have SYSADM or DBADM authority. If you specify this parameter, you must also specify either the PASSWORD or EPWD parameter.

Using a DB2 Data Map
If you want PowerExchange to perform field-level processing on some records in a DB2 for Linux, UNIX, and Windows source table, you must use a data map. For example, in some DB2 environments, a table can contain a single column that stores an array of fields in a format that is not consistent with the column datatype, such as a CHAR or VARCHAR column that stores multiple packed data fields. You can use an expression to modify this data before PowerCenter replicates it to a target. Also, if you add a user-defined field to a table in record view, you can build an expression to populate it. In the PowerExchange Navigator, you can define expressions only for data maps.

You might have data maps available for your source tables if you used PowerExchange bulk data movement to materialize your data targets. Bulk data movement requires data maps. You can use the bulk data maps for CDC if you merge them with the extraction maps for your data sources.

The PowerExchange Navigator automatically generates an extraction map when you create a capture registration. Alternatively, you can manually add an extraction map.

Note: The field names in the data map must match the actual column names, as indicated in the DB2 capture registration.
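Looking back at the UDB CAPI_CONNECTION parameter descriptions, several rules are mechanical: numeric ranges for MEMCACHE, RSTRADV, SPACEPRI, UPDINT, and UPDREC, and the dependency between USERID, PASSWORD, and EPWD. The Python sketch below checks a dictionary of parameter values against those documented rules. The function name check_udb_parameters and the dictionary representation are hypothetical conveniences. PowerExchange performs its own validation when it reads dbmover.cfg.

```python
# Inclusive ranges as documented for each numeric parameter.
RANGES = {
    "MEMCACHE": (1, 519720),
    "RSTRADV": (0, 86400),
    "SPACEPRI": (1, 2147483647),
    "UPDINT": (1, 2147483647),
    "UPDREC": (1, 2147483647),
}


def check_udb_parameters(params):
    """Return a list of problems found in a dict of TYPE=(UDB,...) values."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= int(params[name]) <= high:
            problems.append(f"{name} must be from {low} through {high}")

    has_password = "PASSWORD" in params
    has_epwd = "EPWD" in params
    if has_password and has_epwd:
        problems.append("Do not specify both PASSWORD and EPWD")
    if "USERID" in params and not (has_password or has_epwd):
        problems.append("USERID requires either PASSWORD or EPWD")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(check_udb_parameters({"USERID": "db2admin", "RSTRADV": "90000"}))
```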

Task Flow for DB2 Data Map Use
Perform the following tasks to use a DB2 data map for change data capture:
1. In the PowerExchange Navigator, create a capture registration for the DB2 source table.
2. Create a DB2 data map for the same DB2 source table if one is not available from a previous bulk data movement operation.
3. Merge the DB2 data map with the extraction map for the table.
4. Perform a row test on the merged extraction map.

RELATED TOPICS:
¨ “Testing a Change Data Extraction” on page 126

Managing DB2 CDC
You might need to stop DB2 for Linux, UNIX, and Windows CDC for source tables occasionally, for example, to change the table definitions.

Stopping DB2 CDC
You might need to stop change data capture for a DB2 source table to perform troubleshooting or routine maintenance tasks, such as maintenance on the capture catalog table or redistribution of table data across reconfigured database partitions.

To stop change data capture, use one of the following methods:
¨ To temporarily stop change data capture, while preserving access to previously captured data, alter the DB2 table to specify the DATA CAPTURE NONE clause:

    ALTER TABLE owner.table_name DATA CAPTURE NONE

  When DATA CAPTURE NONE is specified, DB2 no longer writes changes to the DB2 log files in expanded format. Because CDC requires expanded format, PowerExchange can no longer capture change data for the table from the log files. If you set it back to DATA CAPTURE CHANGES, you might need to rematerialize the targets.
¨ Open the capture registration for a source table, and change the Status value from Active to History. This status change permanently stops change data capture based on the capture registration.
  Warning: After you set the status of a capture registration to History, you cannot activate the registration again.

RELATED TOPICS:
¨ “Stopping PowerCenter CDC Sessions” on page 142

Changing a DB2 Source Table Definition
Occasionally, you might need to change the definition of a DB2 for Linux, UNIX, and Windows source table that is registered for change data capture. If your metadata changes affect the columns from which change data is captured, use this procedure to enable PowerExchange to switch to the updated table definition.

Perform this procedure whenever you add, alter, or drop columns for which change data is captured. You do not need to perform this procedure if you are selectively capturing change data for a subset of columns and none of the selected columns are affected by the metadata changes.

To change a DB2 source table definition:
1. Stop DELETE, INSERT, and UPDATE activity against the table.
2. Verify that any change data that was captured under the previous table definition has completed extraction processing. Then stop all workflows that extract change data for the table.
3. Use DDL to make the table changes.
4. In the PowerExchange Navigator, open the original capture registration and set its status to History.
5. In the PowerExchange Navigator, create a new capture registration that reflects the metadata changes and set its status to Active.
   Alternatively, if you need to add or delete existing columns, right-click the capture registration and click Amend Columns. This action creates a new version of the capture registration that has a status of Inactive. You can then add or delete columns, as needed. Also, you can add any new columns that you defined. Then set the capture registration status to Active.
   PowerExchange uses the newly activated capture registration for change data capture.
   Note: PowerExchange does not capture change data based on capture registrations that have a status of History or Inactive.
   Alternatively, if you created a new version of the original capture registration by amending columns, edit the associated extraction map to point to the new capture registration version. Right-click the associated extraction map and click Amend Capture Registrations.
6. If necessary, change the target table definition to reflect the source table metadata changes.
7. In PowerCenter Designer, import the altered source and target tables. Edit the mapping if necessary.
8. If necessary, rematerialize the target tables. After materialization completes, create new restart tokens.
9. Re-enable DELETE, INSERT, and UPDATE activity against the table.
10. Restart extraction processing.

Tip: If you no longer need to capture change data from a column in a table, you can remove the column from the extraction map without changing the capture registration. Change data for that column is still captured but is not extracted.

RELATED TOPICS:
¨ “Creating Restart Tokens for Extractions” on page 135

Reconfiguring a Partitioned Database or Database Partition Group
In a DB2 for Linux, UNIX, and Windows partitioned database environment, you might need to perform the following reconfiguration tasks:
¨ Add a new partition to a partitioned database, or drop an existing partition. Then reconfigure the database partition group or groups to reflect the change.
¨ Reconfigure a database partition group by adding or removing existing partitions.

Typically, after making these types of changes, you run the DB2 REDISTRIBUTE DATABASE PARTITION GROUP command to redistribute table data among the partitions in the updated database partition group.

If PowerExchange change data capture is active in the partitioned database environment, you must use the following procedure to properly resume change data capture after making the reconfiguration changes.

Adding or Dropping Database Partitions
Use the following procedure to create a new partition in a partitioned database or to drop an existing partition, and then update the appropriate database partition group for the change:
1. In PowerCenter, stop all CDC sessions that extract change data for the tables in the partitioned database instance.
2. For each table for which the DATA CAPTURE CHANGES clause is specified, specify DATA CAPTURE NONE.
   Note: This step temporarily disables DB2 capture of changes to its log files. If you do not perform this step, DB2 records the data redistribution changes that result from the REDISTRIBUTE command as regular change data activity.
3. Execute the SQL for adding the new database partition or for dropping an existing partition.
4. Execute the ALTER DATABASE PARTITION GROUP SQL to add the new partition to or remove the dropped partition from the appropriate database partition group.
5. Run the DB2 REDISTRIBUTE DATABASE PARTITION GROUP command to redistribute table data among the partitions in the altered database partition group.
6. For each table for which you specified DATA CAPTURE NONE in step 2, reinstate the DATA CAPTURE CHANGES clause.
7. Back up the PowerExchange capture catalog table.
8. Run the PowerExchange DTLUCUDB SNAPUPDT command. Set the REPLACE option to Y. This step updates the PowerExchange capture catalog table to reflect the reconfigured partitioned database.
   Tip: Informatica recommends that you first perform a test run with the REPLACE option set to N.
9. Restart the PowerCenter CDC sessions to resume extraction processing.

RELATED TOPICS:
¨ “Initializing the Capture Catalog Table” on page 61

Reconfiguring a Database Partition Group
Use the following procedure to add a partition to or remove a partition from a database partition group without changing the partitioning of the partitioned database instance:
1. In PowerCenter, stop all CDC sessions that extract change data for the tables in the partitioned database instance.
2. For each table for which the DATA CAPTURE CHANGES clause is specified, specify DATA CAPTURE NONE.
   Note: This step temporarily disables DB2 capture of changes to its log files. If you do not perform this step, DB2 records the data redistribution changes that result from the REDISTRIBUTE command as regular change data activity.
3. Execute the ALTER DATABASE PARTITION GROUP SQL to add the new partition to or remove the dropped partition from the appropriate database partition group.
4. Run the DB2 REDISTRIBUTE DATABASE PARTITION GROUP command to redistribute table data among the partitions in the altered database partition group.
5. For each table for which you specified DATA CAPTURE NONE in step 2, reinstate the DATA CAPTURE CHANGES clause.
6. Restart the PowerCenter CDC sessions to resume extraction processing.
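Both partition procedures switch every registered table to DATA CAPTURE NONE before redistribution and back to DATA CAPTURE CHANGES afterward. When many tables are involved, the statements can be generated rather than typed. The following Python sketch shows one way to do that. The function name data_capture_statements and the table names are hypothetical, and the generated SQL still has to be run through your usual DB2 client.

```python
def data_capture_statements(tables, mode):
    """Return one ALTER TABLE statement per table for the given mode.

    mode is "NONE" (before redistribution) or "CHANGES" (afterward).
    """
    if mode not in ("NONE", "CHANGES"):
        raise ValueError("mode must be NONE or CHANGES")
    return [f"ALTER TABLE {table} DATA CAPTURE {mode}" for table in tables]


if __name__ == "__main__":
    # Placeholder names; list the tables registered for capture.
    registered = ["SALES.ORDERS", "SALES.ORDER_LINES"]
    for stmt in data_capture_statements(registered, "NONE"):
        print(stmt + ";")
```

Keeping the same table list for both calls helps ensure that every table set to NONE is reinstated to CHANGES.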

DB2 for Linux, UNIX, and Windows CDC Troubleshooting
If you encounter the following issue when running DB2 for Linux, UNIX, and Windows CDC, attempt the solution that is described. If you cannot resolve the problem, contact Informatica Global Customer Support.

Workaround for SQL1224 Error on AIX
On AIX systems only, you might receive the following PowerExchange message for a DB2 SQL1224 error when you connect locally to a DB2 database that has multiple other local connections:

    PWX-20604 State=08001, Code=-1224, Msg=[IBM][CLI Driver] SQL1224N A database agent could not be started to service a request, or was terminated as a result of a database system shutdown or a force command. SQLSTATE=55032.

To circumvent this problem, implement a loopback TCP/IP connection for the local DB2 database. The database can then function as a remote client that uses TCP/IP instead of interprocess communications (IPC) over shared memory.

To implement a loopback connection without changing the database alias that users enter for database connection, issue the following DB2 commands:

    db2 catalog tcpip node node_name1 remote server_name1 server port_number1
    db2 uncatalog database database_name1
    db2 catalog database database_name1 at node node_name1
    db2 catalog database database_name1 as database_alias1
    db2 catalog database database_alias1 as database_name1 at node node_name1

For more information about these commands, see your IBM DB2 documentation.

IBM APARs for Specific Issues
If you encounter the issues documented in the following IBM APARs, go to the IBM Web site for more information or apply the appropriate FixPak for your DB2 version.

DB2 for Linux, UNIX, and Windows 9.1:
¨ The following issue can result in a SQL error message:
  IY87631: PESSIMISTIC LOCKING FOR CLI SQL_CONCUR_LOCK NO LONGER WORKING IN V8
  The SQL error message is:
  SQL0644N Invalid value specified for keyword "CONCURRENCY" in statement "ATTRIBUTE-STRING". SQLSTATE=42615
  To resolve this issue, apply DB2 9.1 FixPak 1 or later.
¨ The following issue can cause invalid PowerExchange capture registrations that include character columns with an incorrect code page:
  JR30420: "ALTER TABLE ALTER COLUMN" STATEMENT DOES NOT ALTER THE CODEPAGE COLUMN IN THE SYSCAT.COLUMNS VIEW.
  To resolve this issue, search the IBM Web site for the latest information about this APAR.

DB2 for Linux, UNIX, and Windows 9.5:
¨ The following issue can cause invalid PowerExchange capture registrations, which include character columns with an incorrect code page:
  JR30422: "ALTER TABLE ALTER COLUMN" STATEMENT DOES NOT ALTER THE CODEPAGE COLUMN IN THE SYSCAT.COLUMNS VIEW.
  To resolve this issue, search the IBM Web site for the latest information about this APAR.

CHAPTER 5
Microsoft SQL Server Change Data Capture
This chapter includes the following topics:
¨ Microsoft SQL Server CDC Overview, 70
¨ Planning for SQL Server CDC, 71
¨ Configuring SQL Server for CDC, 73
¨ Configuring PowerExchange for SQL Server CDC, 74
¨ Managing SQL Server CDC, 78

Microsoft SQL Server CDC Overview
PowerExchange uses SQL Server transactional replication to capture change data from SQL Server distribution databases. For CDC to work, you must enable SQL Server Replication on the system from which change data is to be captured. If your database has a high volume of change activity, you should use a distributed server as the host of the distribution database.

To configure CDC in PowerExchange, you must define a capture registration for each source table. In the capture registration, you can select a subset of columns for which to capture data. PowerExchange generates a corresponding extraction map.

If you want to use the PowerExchange Logger for Linux, UNIX, and Windows to capture change data and write it to PowerExchange Logger log files, configure the PowerExchange Logger. The change data is then extracted from the PowerExchange Logger log files. Benefits of the PowerExchange Logger include fewer database accesses and faster CDC restart.

PowerExchange works with PowerCenter to extract change data from the SQL Server distribution database or PowerExchange Logger log files and load that data to one or more targets. PowerExchange uses the PowerExchange Client for PowerCenter (PWXPC) to coordinate with PowerCenter to move the captured change data to one or more targets.

RELATED TOPICS:
¨ “PowerExchange Logger for Linux, UNIX, and Windows” on page 19
¨ “Introduction to Change Data Extraction” on page 105
¨ “Extracting Change Data” on page 125

Planning for SQL Server CDC
Before you configure SQL Server change data capture (CDC), verify that the following prerequisites and user authority requirements are met. Also, review the restrictions so that you can properly configure CDC.

SQL Server CDC Prerequisites
PowerExchange CDC has the following SQL Server prerequisites:
¨ PowerExchange CDC requires an edition of Microsoft SQL Server 2000 or later that supports transactional replication. You must configure and enable transactional replication on the source system to participate in CDC.
¨ If you use Microsoft SQL Server 2008, install the Microsoft SQL Server 2005 Backward Compatibility components if you have not done so. You can download these components from the Microsoft Web site.
¨ The Microsoft SQL Server Agent and Log Reader Agent must be running on the Windows machine from which change data is extracted. Usually, the SQL Server Agent remains running after it is initially started. For more information, see your SQL Server documentation.
¨ Each source table in the distribution database must have a primary key.
¨ If the PowerExchange Navigator does not reside on the same machine as the Microsoft SQL Server software, you must install the SQL Server client components on the PowerExchange Navigator machine.

Required User Authority for SQL Server CDC
PowerExchange CDC requires the following user authority levels:
¨ To create capture registrations in the PowerExchange Navigator, you must be a member of the SQL Server sysadmin server role.
¨ To run change data extractions against a SQL Server distribution database, you must have read access to that database. If you do not specify a user ID and password, the PowerExchange Navigator and your extraction processes attempt to use your Windows user ID and password to connect to the SQL Server distribution database.

Datatypes Supported for SQL Server CDC
This topic identifies the SQL Server datatypes that PowerExchange supports for CDC. The following table lists the datatypes and indicates whether they are supported for CDC:
Datatype                        Supported for CDC?  Comments
bigint                          Yes
binary                          Yes
bit                             Yes
char                            Yes
date                            No                  This datatype was introduced in SQL Server 2008.
datetime                        Yes
datetime2                       No                  This datatype was introduced in SQL Server 2008.
datetimeoffset                  No                  This datatype was introduced in SQL Server 2008.
decimal                         Yes
float                           Yes
geography                       No                  This datatype was introduced in SQL Server 2008.
geometry                        No                  This datatype was introduced in SQL Server 2008.
hierarchyid                     No                  This datatype was introduced in SQL Server 2008.
image (1)                       No                  Use varbinary(MAX) instead.
int                             Yes
money                           Yes
nchar                           Yes
ntext (1)                       No                  Use nvarchar(MAX) instead.
numeric                         Yes
nvarchar                        Yes
real                            Yes
smalldatetime                   Yes
smallint                        Yes
smallmoney                      Yes
sql_variant                     No                  PowerExchange does not capture change data for sql_variant columns but does capture change data for other columns in the same table.
text (1)                        No                  Use varchar(MAX) instead.
time                            No                  This datatype was introduced in SQL Server 2008.
timestamp                       Yes
tinyint                         Yes
uniqueidentifier                Yes                 PowerCenter imports the uniqueidentifier datatype as a varchar datatype of 38 characters.
user-defined datatypes (UDTs)   Yes                 PowerExchange treats a UDT in the same way as the datatype on which the UDT is based.
varbinary                       Yes
varchar                         Yes
xml                             Yes                 PowerExchange treats this datatype as varchar(MAX).

1. PowerExchange might not be able to capture change data for columns that have the datatypes of image, ntext, or text because of SQL Server transactional replication restrictions on these types of columns. Instead, use the alternative datatypes that Microsoft recommends, as shown in the Comments column.
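The support flags in the datatype table can be turned into a quick pre-registration check. The following Python sketch is illustrative only: the names SQLSERVER_CDC_SUPPORT, is_supported_for_cdc, and unsupported_columns are hypothetical helpers, not part of PowerExchange, and the sketch assumes that you supply column datatypes as plain strings. User-defined datatypes are not covered; resolve a UDT to its base datatype first.

```python
# Support flags transcribed from the datatype table above.
# Hypothetical helper; PowerExchange itself exposes no such API.
SQLSERVER_CDC_SUPPORT = {
    "bigint": True, "binary": True, "bit": True, "char": True, "date": False,
    "datetime": True, "datetime2": False, "datetimeoffset": False,
    "decimal": True, "float": True, "geography": False, "geometry": False,
    "hierarchyid": False, "image": False, "int": True, "money": True,
    "nchar": True, "ntext": False, "numeric": True, "nvarchar": True,
    "real": True, "smalldatetime": True, "smallint": True, "smallmoney": True,
    "sql_variant": False, "text": False, "time": False, "timestamp": True,
    "tinyint": True, "uniqueidentifier": True, "varbinary": True,
    "varchar": True, "xml": True,
}


def is_supported_for_cdc(datatype: str) -> bool:
    """Return the table's support flag for a datatype such as 'varchar(50)'."""
    # Drop any length or precision suffix and normalize case.
    base = datatype.strip().lower().split("(")[0].strip()
    if base not in SQLSERVER_CDC_SUPPORT:
        raise KeyError(f"datatype not listed in the table: {datatype!r}")
    return SQLSERVER_CDC_SUPPORT[base]


def unsupported_columns(columns: dict) -> list:
    """Return the names of columns whose datatypes the table marks as unsupported."""
    return [name for name, dtype in columns.items()
            if not is_supported_for_cdc(dtype)]
```

For example, passing a mapping of column names to datatypes for a candidate source table returns the columns to exclude from the capture registration or convert to a supported datatype.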

SQL Server CDC Restrictions
The following restrictions apply to SQL Server CDC:
¨ PowerExchange does not capture change data for SQL Server system tables.
¨ The maximum length of a row for which PowerExchange can capture and process change data is 32 KB.
¨ PowerExchange does not capture the user ID that is associated with the original transaction that updated the database.
¨ The timestamp that PowerExchange records for each captured change indicates when the change was captured, not when the original transaction occurred.
¨ PowerExchange does not capture change data for derived columns that are not persisted. SQL Server computes values for these columns at run-time based on an expression but does not store the values in a table.
¨ SQL Server publishes deferred updates to SQL Server tables as DELETEs followed by INSERTs rather than as UPDATEs. Consequently, PowerExchange propagates deferred updates as DELETEs followed by INSERTs, even if you select AI for the Image Type attribute in the CDC connection. PowerExchange does not include before image (BI) and change indicator (CI) information in DELETE and INSERT operations. For more information about deferred updates, see your Microsoft SQL Server documentation.

Configuring SQL Server for CDC
You must perform a few configuration tasks to prepare SQL Server for PowerExchange change data capture (CDC). If your SQL Server tables have a high level of update activity, use a distributed server as the host of the distribution database from which change data is captured. This practice prevents competition between PowerExchange CDC and your production database for CPU use and disk storage.

To configure SQL Server for PowerExchange CDC, perform the following tasks:
1. Start the SQL Server Agent and Log Reader Agent if they are not running. For more information, see your Microsoft SQL Server documentation.
2. Configure and enable SQL Server transactional replication. For more information, see your Microsoft SQL Server documentation.
Tip: The default transactional retention period at the Distributor is 72 hours. If you do not use the PowerExchange Logger, Informatica recommends that you increase the retention period to 14 days. However, you might need to use a lower value if you have a high volume of transactions or space constraints. If you use the PowerExchange Logger, accept this default retention period, unless you have a particular reason not to do so.
3. Verify that each source table in the distribution database has a primary key.

Configuring PowerExchange for SQL Server CDC
The tasks that you perform to configure PowerExchange for change data capture (CDC) depend on whether you want to use the PowerExchange Logger for Linux, UNIX, and Windows and the extraction mode you plan to use.
RELATED TOPICS:
¨ “PowerExchange Logger for Linux, UNIX, and Windows” on page 19

Configuring PowerExchange CDC without the PowerExchange Logger
If you plan to run extractions in real-time extraction mode and not use the PowerExchange Logger for Linux, UNIX, and Windows, complete the following tasks to configure PowerExchange CDC:
1. When you configure the dbmover.cfg file, define the following statements:
   ¨ CAPT_PATH
   ¨ CAPT_XTRA
   ¨ MSQL CAPI_CONNECTION
2. In the PowerExchange Navigator, create a capture registration for each SQL Server source table. The PowerExchange Navigator generates a corresponding extraction map for each capture registration.
Tip: Set the Condense option to Part even though you do not plan to use the PowerExchange Logger. This practice prevents having to change the capture registrations later if you decide to use the PowerExchange Logger.
If capture registrations already exist for these tables, delete the existing registrations and extraction maps and create new ones.
3. Activate the capture registrations. Usually, you do this task after materializing the targets.
Next Step: Configure and start extractions. You must use real-time extraction mode.
RELATED TOPICS:
¨ “Customizing dbmover.cfg for SQL Server CDC” on page 75
¨ “Introduction to Change Data Extraction” on page 105

¨ “Extracting Change Data” on page 125

Configuring PowerExchange CDC with the PowerExchange Logger
If you plan to run extractions in batch or continuous extraction mode and use the PowerExchange Logger for Linux, UNIX, and Windows, complete the following tasks to configure PowerExchange CDC:
1. When you configure the dbmover.cfg file, define the following statements:
   ¨ CAPT_PATH
   ¨ CAPT_XTRA
   ¨ MSQL CAPI_CONNECTION
   ¨ CAPX CAPI_CONNECTION (for continuous extraction mode only)
2. Configure the pwxccl.cfg file for the PowerExchange Logger.
3. In the PowerExchange Navigator, create a capture registration for each SQL Server source table. You must set the Condense option to Part. The PowerExchange Navigator generates a corresponding extraction map.
You might want to set the Condense option to None if you run both real-time and continuous extractions against tables defined by the same capture registrations and do not want the PowerExchange Logger to capture change data for certain registered tables.
If capture registrations already exist for these tables, delete the existing registrations and extraction maps and create new ones.
4. Activate the capture registrations. Usually, you do this task after materializing the targets.
5. Start the PowerExchange Logger.
Next Step: Configure and start extractions. You can use either batch extraction mode or continuous extraction mode.
RELATED TOPICS:
¨ “Customizing the PowerExchange Logger Configuration File” on page 28
¨ “Starting the PowerExchange Logger” on page 47
¨ “Customizing dbmover.cfg for SQL Server CDC” on page 75
¨ “Introduction to Change Data Extraction” on page 105
¨ “Extracting Change Data” on page 125
¨ “CAPX CAPI_CONNECTION Parameters” on page 14

Customizing dbmover.cfg for SQL Server CDC
In the dbmover.cfg configuration file, include the CAPI connection statement that is specific to SQL Server. Also add the other statements that are required for CDC and any optional statements that you want to use.
The following statements are required for SQL Server CDC:
¨ CAPT_PATH. Path to the local directory that stores the following files for CDC: CCT file for capture registrations, CDEP file for application names used in ODBC extractions, and CDCT file for information about PowerExchange Logger for Linux, UNIX, and Windows log files. Add this statement to the dbmover.cfg file on the system where SQL Server capture registrations are stored. Usually, this location is where the source database resides. This location corresponds to the Location node that you specify when defining a registration group.
¨ CAPT_XTRA. Path to the local directory that stores extraction maps.
¨ MSQL CAPI_CONNECTION. A named set of parameters that the CAPI uses to connect to the change stream and control extraction processing for SQL Server CDC.
If you plan to use the PowerExchange Logger and continuous extraction mode, you must also define the CAPX CAPI_CONNECTION statement.

To find PowerExchange messages more easily, include the LOGPATH statement. This statement defines a specific directory for the PowerExchange message log files.
RELATED TOPICS:
¨ “CAPX CAPI_CONNECTION Parameters” on page 14
¨ “Microsoft SQL Server CAPI_CONNECTION Parameters” on page 76

Example Statements
The following statements are typical of those included in a dbmover.cfg for SQL Server CDC:
LOGPATH="C:\Informatica\PowerExchangeVnnn\Logs"
CAPT_XTRA="C:\Informatica\PowerExchangeVnnn\Capture\camaps"
CAPT_PATH="C:\Informatica\PowerExchangeVnnn\Capture"
CAPI_CONN_NAME=CAPIMSSC
CAPI_CONNECTION=(NAME=CAPIMSSC
  ,TYPE=(MSQL
  ,DISTSRV=AUX159908\PWXPC
  ,DISTDB=distribution
  ,RSTRADV=30))
Note: You must use non-curly double quotation marks around values that include a space.

Microsoft SQL Server CAPI_CONNECTION Parameters
The MSQL CAPI_CONNECTION statement specifies the Consumer API (CAPI) parameters needed for Microsoft SQL Server CDC sources.
Data Sources: Microsoft SQL Server
Required: Yes for Microsoft SQL Server CDC
Syntax:
CAPI_CONNECTION=(
  [DLLTRACE=trace_id,]
  NAME=name,
  [TRACE=trace,]
  TYPE=(MSQL,
    DISTDB=distribution_database,
    DISTSRV=distribution_server,
    [DWFLAGS=flag1flag2flag3,]
    [EOF={N|Y},]
    [MEMCACHE=cache_size,]
    [POLWAIT=seconds,]
    [RSTRADV=seconds]
  )
)
Parameters: Enter the following parameters:
DLLTRACE=trace_id
Optional. User-defined name of the TRACE statement that activates internal DLL tracing for this CAPI. Specify this parameter only at the direction of Informatica Global Customer Support.
NAME=name
Required. Unique user-defined name for this CAPI_CONNECTION statement. Maximum length is eight alphanumeric characters.

TRACE=trace
Optional. User-defined name of the TRACE statement that activates the common CAPI tracing. Specify this parameter only at the direction of Informatica Global Customer Support.
TYPE=(MSQL, ... )
Required. Type of CAPI_CONNECTION statement. For Microsoft SQL Server sources, this value must be MSQL.
DISTDB=distribution_database
Required. Name of the distribution database.
DISTSRV=distribution_server
Required. Network name of the server that hosts the distribution database.
Important: This name is different from the network name of the instance if the distribution database resides on a different server.
DWFLAGS=flag1flag2flag3
Optional. Series of three positional parameters that control whether processing stops or continues when data loss, truncation, or schema changes occur. Specify this parameter only at the direction of Informatica Global Customer Support.
Enter the following positional parameters:
¨ flag1. Controls whether PowerExchange stops a change data extraction when the requested start sequence is not found in the transaction log. Enter Y to continue processing or N to stop processing.
¨ flag2. Controls whether PowerExchange stops a change data extraction when data of an unexpected length is retrieved from the distribution database. Enter Y to continue processing or N to stop processing.
¨ flag3. Controls whether PowerExchange stops a change data extraction when a schema change is detected. Enter Y to continue processing or N to stop processing.
Default is NNN.
EOF={N|Y}
Optional. Controls whether PowerExchange stops change data extractions when the end-of-log (EOL) is reached.
Enter one of the following options:
¨ N. PowerExchange does not stop change data extractions when EOL is reached.
¨ Y. PowerExchange stops change data extractions when EOL is reached.
Default is N.
Because this parameter affects all users of the MSQL CAPI_CONNECTION statement, Informatica recommends that you use one of the following alternative methods to stop change data extractions at EOL:
¨ For CDC sessions that use real-time extraction mode, enter 0 for the Idle Time attribute of the PWX MSSQL CDC Real Time application connection.
¨ For PowerExchange Logger for Linux, UNIX, and Windows, enter 1 for the COLL_END_LOG statement in the pwxccl.cfg configuration file.
¨ For CDC sessions that use ODBC connections, enter 0 for the WAITTIME parameter in the ODBC data source.
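The positional nature of DWFLAGS is easy to misread, so the following Python sketch shows one way to decode a value before you place it in dbmover.cfg. It is a minimal illustration, not PowerExchange code: the DwFlags class and parse_dwflags function are hypothetical names, and the sketch assumes the flag order of missing start sequence, unexpected data length, and schema change, with Y meaning continue and N meaning stop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DwFlags:
    """Decoded DWFLAGS value; True means processing continues (Y)."""
    continue_on_missing_start_sequence: bool  # flag1 (assumed position)
    continue_on_unexpected_length: bool       # flag2 (assumed position)
    continue_on_schema_change: bool           # flag3 (assumed position)


def parse_dwflags(value: str = "NNN") -> DwFlags:
    """Decode a three-character DWFLAGS string such as 'NYN'.

    The default argument mirrors the documented default of NNN.
    """
    text = value.strip().upper()
    if len(text) != 3 or any(ch not in "YN" for ch in text):
        raise ValueError(f"DWFLAGS must be three Y/N characters, got {value!r}")
    return DwFlags(*(ch == "Y" for ch in text))
```

For example, parse_dwflags("NYN") describes a setting that continues only when data of an unexpected length is retrieved. As the parameter description states, change DWFLAGS only at the direction of Informatica Global Customer Support.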

MEMCACHE=cache_size
Optional. Memory cache size, in kilobytes, that PowerExchange allocates to cache a single change. Valid values are from 1 through 519720. Default is 248.
POLWAIT=seconds
Optional. Time interval, in seconds, that PowerExchange waits after reaching the end of current data before polling for new data. Valid values are from 1 through 2147483647. Default is 1.
RSTRADV=nnnnn
Time interval, in seconds, that PowerExchange waits before advancing restart and sequence tokens for a registered data source during periods when UOWs do not include any changes of interest for the data source. When the wait interval expires, PowerExchange returns the next committed "empty UOW," which includes only updated restart information. The wait interval is reset to 0 when PowerExchange completes processing a UOW that includes changes of interest or returns an empty UOW because the wait interval expired without any changes of interest having been received.
For example, if you specify 5, PowerExchange waits 5 seconds after it completes processing the last UOW or after the previous wait interval expires. Then PowerExchange returns the next committed empty UOW that includes the updated restart information and resets the wait interval to 0.
If RSTRADV is not specified, PowerExchange does not advance restart and sequence tokens for a registered source during periods when no changes of interest are received. In this case, when PowerExchange warm starts, it reads all changes, including those not of interest for CDC, from the restart point.
Valid values are 0 through 86400. No default is provided.
Warning: A value of 0 can degrade performance because PowerExchange returns an empty UOW after each UOW processed.

Managing SQL Server CDC
You might need to stop CDC for source tables occasionally, for example, to change the table definitions.

Disabling Publication of Change Data for a SQL Server Source
You can disable publication of change data for a SQL Server source. For example, you might disable publication to perform some database maintenance, change the table definition, or avoid capturing unwanted changes.
¨ Open the capture registration for the table, and change the Status setting from Active to History.
This action disables publication of the SQL Server article for the table to the distribution database, which causes change capture to stop.
Warning: After the registration status is set to History, you cannot activate the registration for CDC use again.

Changing a SQL Server Source Table Definition
If you change the definition of a SQL Server source table that is registered for change data capture, use this procedure to enable PowerExchange to use the updated table definition and preserve access to previously captured data. Table definition changes include adding, altering, or dropping columns.
Tip: If you no longer need to capture change data from a column in a table, you can remove the column from the extraction map without changing the capture registration. Change data for that column is still captured but is not extracted.
To change a SQL Server source table definition:
1. Stop DELETE, INSERT, and UPDATE activity against the table.
2. Verify that any change data that was captured under the previous table definition has completed extraction processing. Then stop all workflows that extract change data for the table.
3. Delete the capture registration and extraction map.
4. Use DDL to change the table definition in SQL Server.
5. In the PowerExchange Navigator, create a new capture registration that reflects the metadata changes and set its status to Active. PowerExchange creates a corresponding extraction map. The newly activated capture registration becomes eligible for change data capture.
6. If necessary, change the target table definition to reflect the source table metadata changes.
7. In the PowerCenter Designer, import the altered source and target definitions. Edit the mapping if necessary.
8. Re-enable DELETE, INSERT, and UPDATE activity against the table.
9. Create new restart tokens for the altered table.
10. If necessary, rematerialize the target tables. After materialization completes, create new restart tokens.
11. Cold start the extraction workflows.

CHAPTER 6
Oracle Change Data Capture with Oracle LogMiner
This chapter includes the following topics:
¨ Overview of Oracle LogMiner CDC
¨ Planning for Oracle LogMiner CDC
¨ Oracle Configuration for LogMiner CDC
¨ PowerExchange Configuration for Oracle LogMiner CDC
¨ Management of Oracle LogMiner CDC

Overview of Oracle LogMiner CDC
PowerExchange can use Oracle LogMiner to read change data from Oracle redo logs. To implement Oracle LogMiner CDC, you need to perform configuration tasks in Oracle, PowerExchange, and PowerCenter.
In Oracle, ensure that ARCHIVELOG mode with global minimal supplemental logging is enabled so that change data can be retrieved from archived redo logs. Also, ensure that a copy of the Oracle online catalog exists in the archived redo logs. PowerExchange requires a copy of the catalog to determine restart points for change data extraction processing.
In PowerExchange, define a capture registration for each source table. In the capture registration, you can select a subset of columns for which to capture data. PowerExchange generates a corresponding extraction map.
If you want to use the PowerExchange Logger for Linux, UNIX, and Windows, also configure the PowerExchange Logger. The PowerExchange Logger can capture change data from Oracle redo logs and write only the successful units of work (UOWs), in chronological order based on commit time, to PowerExchange Logger log files. The change data is then extracted from the PowerExchange Logger log files in either continuous extraction mode or batch extraction mode.
Note: Informatica strongly recommends that you use the PowerExchange Logger for Oracle LogMiner CDC. If you use real-time extraction mode without the PowerExchange Logger, PowerExchange starts a separate Oracle LogMiner session for each extraction session. Running multiple, concurrent sessions can significantly degrade performance of the system where LogMiner runs. Benefits of using the PowerExchange Logger include fewer database accesses, faster CDC restart, and no need to prolong retention of the Oracle redo files for change capture.
To move the change data to one or more targets, PowerExchange uses the PowerExchange Client for PowerCenter (PWXPC) in conjunction with PowerCenter.

PowerExchange works with PowerCenter to extract change data from Oracle redo logs or PowerExchange Logger log files and load that data to one or more targets.
RELATED TOPICS:
¨ “PowerExchange Logger for Linux, UNIX, and Windows” on page 19
¨ “Introduction to Change Data Extraction” on page 105

Planning for Oracle LogMiner CDC
Before you configure Oracle change data capture, review the following restrictions, requirements, and performance considerations.

Requirements and Restrictions for Oracle LogMiner CDC
The following restrictions and requirements apply to Oracle LogMiner CDC:
¨ The Oracle instance must be running in ARCHIVELOG mode.
¨ Oracle global minimal supplemental logging must be enabled.
¨ A copy of the Oracle catalog must exist in the Oracle archived redo logs.
¨ Oracle LogMiner continuous mining reads archived redo logs only from the directory to which they were originally written.
¨ PowerExchange requires the Oracle Client binaries. When you install Oracle, the Client binaries are installed by default. To use SQL*Net connectivity on a machine that does not have an installed Oracle instance, you must install the Oracle Client.
¨ If PowerExchange CDC is not installed on the same machine as the Oracle instance, configure a TNS entry on the client machine with SERVER=DEDICATED in the CONNECT_DATA section of the connect descriptor. This specification is also required if the network is configured for Multi-Threaded Server (MTS) mode.
¨ The maximum length of a row for which PowerExchange can capture and process change data is 32 KB.
¨ If you truncate Oracle source tables from which change data is captured, or if you drop and re-create source tables, PowerExchange cannot continue to extract change data for these tables. In these situations, you must rematerialize the corresponding targets.

Datatypes Supported for Oracle LogMiner CDC
PowerExchange uses Oracle LogMiner to retrieve changes from the Oracle redo logs. Oracle does not log, or does not completely log, data with some datatypes in the Oracle redo logs. Consequently, PowerExchange cannot retrieve change data for columns that have these datatypes.
The following table identifies the Oracle datatypes that PowerExchange supports for Oracle LogMiner CDC:
Datatype                        Supported for CDC?  Comments
BFILE                           No                  Data for columns that have this datatype are not completely logged in the Oracle redo logs and cannot be captured.
BINARY_DOUBLE                   Yes                 For CDC support of this datatype, you must have PowerExchange 8.5 or later.
BINARY_FLOAT                    Yes                 For CDC support of this datatype, you must have PowerExchange 8.5 or later.
CHAR                            Yes
DATE                            Yes
FLOAT                           Yes
LOBs                            No
LONG                            No
LONG RAW                        No
NCHAR                           Yes
NUMBER                          Yes                 PowerExchange handles NUMBER columns as follows:
                                                    - Numbers with a scale of 0 and a precision value less than 10 are treated as INTEGER.
                                                    - Numbers with a defined precision and scale are treated as NUMCHAR.
                                                    - Numbers with an undefined precision and scale are treated as DOUBLE.
NVARCHAR2                       Yes
RAW                             Yes
TIMESTAMP                       Yes
TIMESTAMP WITH TIME ZONE        No
TIMESTAMP WITH LOCAL TIME ZONE  No
VARCHAR2                        Yes

SQL*Loader Restrictions
PowerExchange CDC can capture data that was loaded into Oracle tables by the SQL*Loader utility. However, the following restrictions apply:
¨ The load type must be conventional path. PowerExchange cannot capture data that was loaded by a direct path load because Oracle LogMiner does not support direct path loads.
¨ The load method should be Insert, Append, or Replace. Do not use Truncate. Truncate causes SQL*Loader to issue TRUNCATE TABLE DDL. Because PowerExchange does not capture DDL, it cannot capture any row deletions that result from TRUNCATE TABLE DDL.
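The NUMBER handling rules listed in the Oracle datatype table can be expressed as a small decision function. The following Python sketch is only an illustration of those three rules: number_treatment is a hypothetical helper name, None stands for an undefined precision or scale, and treating a NUMBER that has only one of the two attributes defined as NUMCHAR is an assumption that the source does not address.

```python
from typing import Optional


def number_treatment(precision: Optional[int], scale: Optional[int]) -> str:
    """Return the datatype that the table says a NUMBER column is treated as."""
    # Undefined precision and scale, for example a plain NUMBER column.
    if precision is None and scale is None:
        return "DOUBLE"
    # Scale of 0 with a precision value less than 10.
    if scale == 0 and precision is not None and precision < 10:
        return "INTEGER"
    # Any other defined precision and scale (partially defined is an assumption).
    return "NUMCHAR"
```

For example, under these rules a NUMBER(9,0) column is treated as INTEGER, whereas NUMBER(10,0) and NUMBER(8,2) columns are treated as NUMCHAR.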

Performance Considerations for Oracle LogMiner CDC
The following considerations pertain to PowerExchange CDC performance:
¨ Use real-time extraction mode only if you run very few concurrent change data extractions. PowerExchange CDC creates an Oracle LogMiner session for each real-time extraction. Because LogMiner sessions are resource intensive, they can impact overall system performance. Instead, use continuous extraction mode. For continuous extraction mode, PowerExchange extracts change data from PowerExchange Logger log files.
¨ If you use continuous extraction mode, minimize the size of the CDCT file. The CDCT file contains information about the PowerExchange Logger log files. PowerExchange reads the CDCT file each time the interval that is specified in the FILEWAIT parameter of the CAPX CAPI_CONNECTION statement elapses. If a CDCT file is large, PowerExchange read operations can result in a high level of I/O activity, increased use of system resources, and increased extraction latency. To manage the CDCT file size, use the COND_CDCT_RET_P statement in the pwxccl.cfg configuration file for the PowerExchange Logger for Linux, UNIX, and Windows.

Oracle Configuration for LogMiner CDC
PowerExchange provides sample script files to help you configure Oracle for PowerExchange CDC. Use the script file that is appropriate for your environment to perform the following configuration tasks:
¨ Enable ARCHIVELOG mode.
¨ Set the transaction_auditing parameter to “True,” if you run an Oracle version earlier than 10.2.01.
¨ Grant required Oracle privileges.
¨ Configure Oracle LogMiner.
¨ Enable global minimal supplemental logging.
¨ Copy the Oracle catalog to the archived redo logs.

Configuration Script Files
To configure Oracle for CDC, use the sample Oracle configuration script files in the PowerExchange installation directory. Each script file contains sample SQL statements for performing the necessary configuration tasks. Before running any of the SQL statements, read the comments in the script file. The comments provide important information.
PowerExchange provides the following script files for RAC and non-RAC environments:
oracapt.sql
Configures Oracle for CDC in a non-RAC environment.
oracapt_rac.sql
Configures Oracle for CDC in an RAC environment. PowerExchange supports CDC in RAC environments only for Oracle 10g Release 2 and later.

Configuring Oracle for LogMiner CDC
This section describes steps for configuring Oracle for LogMiner CDC. For sample SQL and DDL, refer to the oracapt.sql file.

1.2.2. For more information.1. or if the "compatible" parameter is set to an Oracle version earlier than 9.sql files. see the oracapt. ALTER SYSTEM SET log_archive_dest_1 = 'location=/oracle_path/arch' SCOPE=SPFILE.0. Set the Oracle Compatible Parameter (Oracle 9.0) If you use Oracle 9. verify that the transaction_auditing parameter is set to "True" in the init.sql and oracapt_rac.2. you must stop and restart the Oracle instance for your changes to take effect.0.sql or oracapt_rac. you must set this parameter to 9. Enable ARCHIVELOG Mode For CDC. 84 Chapter 6: Oracle Change Data Capture with Oracle LogMiner . For more information. Specify an Archive Log Destination Edit your init. Step 4. To set this parameter in the spfile file.0. issue the following SQL statements to indicate the archive log destination: CONNECT SYS/sys_pwd AS SYSDBA.ora initialization parameter file.ora file to specify the archive log destination and file-name format. Step 2.ora or spfile file. ARCHIVELOG mode is not enabled.0? SCOPE=SPFILE. see the Oracle database administrator’s guide for your Oracle version. Step 3. Set the Oracle transaction_auditing Parameter If you use an Oracle version earlier than 10.0. This setting is required for Oracle CDC to work properly. To enable ARCHIVELOG mode. Stop and Restart the Oracle Database If you set the ARCHIVELOG mode or the "compatible" or "transaction_auditing" parameter. By default.6 or 10.ora or spfile file. issue the following SQL statement: ALTER SYSTEM SET compatible=?9. Step 5. STARTUP MOUNT. You can find the patch by searching My Oracle Support (formerly MetaLink) Knowledge Base for bug report 3456259. The specific SQL and configuration steps vary for RAC and non-RAC environments and are described in the oracapt.4. you must edit the appropriate parameters in this file to identify the archive log destination and file name format. STARTUP. To set this parameter.2. If you run Oracle 9. you must execute some ALTER SYSTEM SET SQL. 
Tip: Back up your database after both SHUTDOWN commands. ALTER DATABASE ARCHIVELOG.sql file. execute the following SQL statement in an SQL*Plus session: Alter SYSTEM SET transaction_auditing=TRUE SCOPE=SPFILE. If you use a server parameter file (spfile).2.Step 1. SHUTDOWN IMMEDIATE. if you use a server parameter file (spfile). issue the following statements: SHUTDOWN IMMEDIATE.0 and the "compatible" parameter is not specified in the init. see your Oracle database administrator's guide.2. ALTER DATABASE OPEN. Alternatively. Oracle must be running in ARCHIVELOG mode. install the appropriate patch for your release instead.0. If you use the Oracle init.1. For more information.

For more information, see the oracapt.sql file.

Step 6. Grant User Privileges Required for Oracle LogMiner CDC
To extract change data from Oracle redo logs, a CDC user must have specific Oracle system and object privileges. You can either use an existing user who has the required authority as the CDC user, or create a user and grant the required privileges to that user.
The oracapt.sql and oracapt_rac.sql configuration script files contain the required SQL GRANT statements. Edit this SQL, as needed, for your environment.
The following table identifies the minimum system privileges that Oracle CDC users must have:
System Privilege         Oracle Release   Description
ALTER ANY TABLE          All              Required for users that create capture registrations and allow PowerExchange to automatically run the DDL that is generated for creating a supplemental log group at registration completion.
CONNECT                  All              Required for users who extract Oracle CDC data in real time and for PowerExchange Logger tasks.
LOCK ANY TABLE           All              If you specify GENRLOCK=Y in the ORCL CAPI_CONNECTION statement of the dbmover.cfg file, you must either grant the LOCK ANY TABLE system privilege or grant the SELECT object privilege on each table that is registered for change data capture.
SELECT ANY TRANSACTION   10g and later    Required for users that extract Oracle CDC data in real time and for PowerExchange Logger tasks.

The following table identifies the minimum object privileges that Oracle CDC users must have:
Object Name                Object Privilege
Source tables              If you specify GENRLOCK=Y in the ORCL CAPI_CONNECTION statement of the dbmover.cfg file, you must either grant the LOCK ANY TABLE system privilege or grant the SELECT object privilege on each table that is registered for change data capture.
PUBLIC.V$ARCHIVED_LOG      SELECT
PUBLIC.V$DATABASE          SELECT
PUBLIC.V$INSTANCE          SELECT
PUBLIC.V$LOGMNR_CONTENTS   SELECT
PUBLIC.V$NLS_PARAMETERS    SELECT
PUBLIC.V$PARAMETER         SELECT
PUBLIC.V$TRANSACTION       SELECT

Object Name                  Object Privilege
SYS.DBA_LOG_GROUPS           SELECT
SYS.DBA_LOG_GROUP_COLUMNS    SELECT
SYS.DBMS_FLASHBACK           EXECUTE
SYS.DBMS_LOGMNR              EXECUTE
SYS.DBMS_LOGMNR_D            EXECUTE

Step 7. Configuring Oracle Minimal Global Supplemental Logging
PowerExchange requires Oracle to use minimal global supplemental logging for Oracle LogMiner to properly handle chained rows.
To enable minimal global supplemental logging, log in to the Oracle database and execute the following SQL statement, which is included in the oracapt.sql and oracapt_rac.sql configuration files:
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
COMMIT;
If you do not know whether minimal global supplemental logging has been enabled for your database, you can still execute this ALTER statement. The statement has no effect if minimal supplemental logging is active.
Note: You must also define a supplemental log group for each Oracle source table. When you register an Oracle source table in the PowerExchange Navigator, PowerExchange generates DDL for adding a supplemental log group for the table. Oracle supplemental log groups cause Oracle to log full before- and after-images of the data that changed. PowerExchange requires these images to properly process changes.

Step 8. Create a Table Space for Oracle LogMiner Use (Optional)
Create a table space exclusively for Oracle LogMiner use. This step prevents the SYSTEM table space (in Oracle 9i) or SYSAUX table space (in Oracle 10g or later) from becoming full and causing service problems during PowerExchange CDC.
This step is necessary only if you have not previously configured LogMiner for use with other Oracle features such as logical standby databases, Oracle Streams, or native Oracle change capture processes.
To create the LogMiner table space, use the DDL in the PowerExchange oracapt.sql or oracapt_rac.sql file that is supplied for this purpose.
1. To create the table space, issue the following DDL:
CREATE TABLESPACE "LOGMNRTS" NOLOGGING
  DATAFILE '/oracle_path/datafilename.ora' SIZE 50M REUSE
  AUTOEXTEND ON NEXT 10M MAXSIZE 100M
  EXTENT MANAGEMENT LOCAL;
Specify NOLOGGING if you use Oracle LogMiner only for PowerExchange CDC and an occasional query. Change NOLOGGING to LOGGING if you use any of the following Oracle features: logical standby databases, Oracle Streams, or native Oracle change capture processes.

   For the DATAFILE value, specify a file name based on your local Oracle database file naming standards for the data files that comprise this table space.

2. Enter the following command:

    EXECUTE SYS.DBMS_LOGMNR_D.SET_TABLESPACE('LOGMNRTS');

3. To recompile the SYS.DBMS_LOGMNR_D package, enter the following command:

    ALTER PACKAGE SYS.DBMS_LOGMNR_D COMPILE BODY;

Tip: LogMiner opens a number of cursors internally to handle its processing. When you configure LogMiner for the first time, you might receive messages that state “number of open cursors exceeded.” You can increase the maximum number of open cursors to handle the extra LogMiner processing.

Step 9. Copy the Oracle Catalog to the Archived Logs

PowerExchange CDC requires a copy of the Oracle online catalog in the Oracle archived redo logs to determine the point from which to restart change data extractions. PowerExchange reads the last catalog copy in the archived logs, even if you specified ONLINECAT=Y in the ORCL CAPI_CONNECTION statement. You should copy the catalog on a routine basis to minimize CDC restart times.

To copy the catalog, issue the following command in an SQL*Plus session:

    begin
      SYS.DBMS_LOGMNR_D.BUILD( options => sys.dbms_logmnr_d.store_in_redo_logs);
    end;
    /

If this statement fails with the ORA-01353 message, see the comments in oracapt.sql for more information.

Tip: Periodically, PowerExchange requests Oracle to recopy the catalog to the Oracle archived redo logs. To control how often Oracle copies the catalog and the time period within which the copy operation can occur, set the CATBEGIN, CATEND, and CATINT parameters in the ORCL CAPI_CONNECTION statement of the dbmover.cfg file.

Configuration in an Oracle RAC Environment

If you use Oracle 10g Release 2 or later, PowerExchange can process change data for database instances in a real application cluster (RAC) environment. Certain Oracle patches might be required. For more information, see Knowledge Base (KB) item 102503.

The Oracle instance from which you run PowerExchange CDC must be able to access the Oracle archived redo logs for all Oracle instances in the RAC for which you want to capture change data. In the init.ora file for each of these Oracle instances, define the LOG_ARCHIVE_DEST_1 parameter to point to the directory in which you want Oracle to create the archived logs.

Note: PowerExchange uses Oracle LogMiner to read change data from the archived logs. If you use an archived log destination other than the LOG_ARCHIVE_DEST_1 path and LogMiner processing lags behind, problems might occur. In this situation, LogMiner starts reading change data from the archived logs in the LOG_ARCHIVE_DEST_1 directory. If these archived logs are inaccessible from the machine with the Oracle instance to which you are connected, the LogMiner session might fail.

Additional tasks for ensuring access to archived redo logs vary by operating system.

On Windows, you must set up an Oracle flash recovery area on the shared file system that contains all of the table data for the RAC. For each Oracle instance in the RAC, set the LOG_ARCHIVE_DEST_1 parameter to point to that recovery area.

On Linux and UNIX, you can use any of the following methods:
- Set up an Oracle flash recovery area in the same manner as for Windows.
- Set up Network File System (NFS) access to the archive logs.
- Store all archived redo logs on shared storage.

If you use shared storage or NFS access, the Oracle instance from which you run CDC must access the archived logs of the other RAC member instances. This access uses the mount points that match the archive log directories defined for those member instances. For example, assume that ORA2 is an Oracle instance in a RAC, which has a LOG_ARCHIVE_DEST_1 parameter that points to the following archive log directory:

    /ora/arch2/

ORA1 is the Oracle instance that runs CDC. The mount point that the ORA1 machine must use to access the ORA2 archive logs is also /ora/arch2/.

Also, all of the Oracle instances in the RAC that participate in CDC must have access to the Oracle online redo logs. Usually, these redo logs reside on shared storage.

PowerExchange Configuration for Oracle LogMiner CDC

The tasks that you perform to configure PowerExchange for CDC depend on whether you want to use the PowerExchange Logger for Linux, UNIX, and Windows and the extraction mode that you plan to use.

Configuring Oracle LogMiner CDC without the PowerExchange Logger

If you plan to run extractions in real-time extraction mode and not use the PowerExchange Logger for Linux, UNIX, and Windows, complete the following tasks to configure PowerExchange for Oracle LogMiner CDC:

1. When you configure the dbmover.cfg file on the Oracle source machine, include the following statements:
   - CAPT_PATH
   - CAPT_XTRA
   - ORACLEID
   - ORCL CAPI_CONNECTION
   - UOWC CAPI_CONNECTION
   For more information, see the PowerExchange Reference Manual.
2. In the PowerExchange Navigator, create a capture registration for each Oracle source table. If capture registrations already exist for these tables, delete the existing registrations and extraction maps and create new ones. You must enter a name in the Supplemental Log Group Name field.
   Tip: Set the Condense option to Part even though you do not plan to use the PowerExchange Logger, unless you have a specific reason not to do so. This practice prevents having to edit the capture registrations later if you decide to use the PowerExchange Logger. You might want to set the Condense option to None if you plan to run both real-time and continuous extractions against tables defined by the same capture registrations and do not want the PowerExchange Logger to capture change data for some registered tables.
   The PowerExchange Navigator generates a corresponding extraction map and the DDL for creating a supplemental log group. If you selected the Execute DDL now option, PowerExchange executes the DDL for creating a supplemental log group when you click Finish. If you did not select this option, you must execute the DDL prior to starting extraction processing.
3. Customize the dbmover.cfg files on the Windows machine where the PowerExchange Navigator runs and on the PowerCenter Integration Service machine, if these machines are separate from the Oracle source machine. In each of these dbmover.cfg files, you must specify a NODE statement that points to the machine that contains the Oracle source tables. On the Windows machine, you must also specify an ORACLEID statement.
4. Start the PowerExchange Listener on the source machine.
5. In the PowerExchange Navigator, perform a database row test on the extraction maps to verify that PowerExchange can access the source data.
6. After stopping updates to the source tables, materialize the target tables.
7. Activate the capture registrations. Usually, you do this task after materializing the targets.
8. Allow changes to be written to the source tables.

Next Step: Configure and start extractions. You must use real-time extraction mode.

RELATED TOPICS:
- “Customizing dbmover.cfg for Oracle LogMiner CDC”
- “Introduction to Change Data Extraction”

Configuring Oracle LogMiner CDC with the PowerExchange Logger

If you plan to use the PowerExchange Logger for Linux, UNIX, and Windows and run extractions in batch or continuous extraction mode, complete the following tasks to configure PowerExchange for Oracle LogMiner CDC:

1. When you configure the dbmover.cfg file used to access the source tables, include the following statements:
   - CAPT_PATH
   - CAPT_XTRA
   - ORACLEID
   - ORCL CAPI_CONNECTION
   - UOWC CAPI_CONNECTION
   - CAPX CAPI_CONNECTION (for continuous extraction mode only)
   For more information, see the PowerExchange Reference Manual.
2. In the PowerExchange Navigator, create a capture registration for each Oracle source table. If capture registrations already exist for these tables, delete the existing registrations and extraction maps and create new ones. You must select Part in the Condense list, and enter a name in the Supplemental Log Group Name field. You can also set the Status option to Active, or wait until after you materialize the target tables.
   The PowerExchange Navigator generates a corresponding extraction map and the DDL for creating a supplemental log group. If you selected the Execute DDL now option, PowerExchange executes the DDL for creating a supplemental log group when you click Finish. If you did not select this option, you must execute the DDL prior to starting extraction processing.
3. Configure the pwxccl.cfg file for the PowerExchange Logger.
4. Start the PowerExchange Logger.

Next Step: Configure and start extractions. You can use either batch extraction mode or continuous extraction mode.

RELATED TOPICS:
- “Customizing dbmover.cfg for Oracle LogMiner CDC”
- “Customizing the PowerExchange Logger Configuration File”
- “Starting the PowerExchange Logger”
- “Introduction to Change Data Extraction”

Customizing dbmover.cfg for Oracle LogMiner CDC

In the dbmover.cfg configuration file, include the statements that are required for Oracle LogMiner CDC and any optional statements that you want to use. For more information about all dbmover.cfg statements, see the PowerExchange Reference Manual.

The following statements are required for Oracle CDC with Oracle LogMiner:

CAPT_PATH
    Path to the local directory where the CCT file and CDCT file reside. The CCT file contains capture registrations. The CDCT file contains information about PowerExchange Logger log files.
CAPT_XTRA
    Path to the local directory where extraction maps reside.
ORACLEID
    Oracle source instance, database, and connection information.
ORCL CAPI_CONNECTION
    A named set of parameters that the CAPI uses to connect to the change stream and control extraction processing for Oracle sources.
UOWC CAPI_CONNECTION
    A named set of parameters for the UOW Cleanser. The CAPINAME parameter in the UOWC CAPI_CONNECTION points to an ORCL CAPI_CONNECTION.
CAPX CAPI_CONNECTION (required for continuous extraction only)
    If you plan to use the PowerExchange Logger and continuous extraction mode, you must also define a CAPX CAPI_CONNECTION statement.

Define the CAPI_CONNECTION statements in the dbmover.cfg file that is on the system where the Oracle capture registrations are stored. Usually, this location is where the source database resides. This location corresponds to the Location node that you specify when defining a registration group.

Additionally, Informatica recommends including the LOGPATH and TRACING statements to make finding messages easier. The LOGPATH statement defines a directory specifically for PowerExchange message log files, and the TRACING statement enables PowerExchange to create an alternative set of message log files for each PowerExchange process.

Example Oracle LogMiner CDC Statements

The following statements are typical of those included in a dbmover.cfg for Oracle LogMiner CDC:

    LOGPATH=/pwx/logs
    TRACING=(PFX=PWXLOG,APPEND=Y,FILENUM=3,RECLEN=255,FLUSH=99)
    CAPT_XTRA=/pwx/capture/vnnn/camaps
    CAPT_PATH=/aus/pwx/capture/vnnn
    ORACLEID=(FOX123,FO920DTL)
    CAPI_SRC_DFLT=(ORA,CAPIUOWC)
    CAPI_CONN_NAME=CAPIUOWC
    /*
    /* CAPI connection statements
    /*
    /* Both UOWC and ORCL CAPI_CONNECTION statements are required for Oracle CDC.
    CAPI_CONNECTION=(NAME=CAPIORA,DLLTRACE=ORA2,TYPE=(ORCL,ARRAYSIZE=1000,BYPASSUF=Y,CATBEGIN=00:01,CATEND=23:59,CATINT=1440,ORACOLL=FOX123,SELRETRY=0))
    CAPI_CONNECTION=(NAME=CAPIUOWC,TYPE=(UOWC,CAPINAME=CAPIORA,MEMCACHE=50000,RSTRADV=1800))
    /* Additional CAPX CAPI_CONNECTION statement is required for continuous extraction mode.
    CAPI_CONNECTION=(NAME=CAPXORA,TYPE=(CAPX,DFLTINST=FOX920))

ORACLEID Statement

The ORACLEID statement specifies the Oracle source database and connection information for PowerExchange CDC with Oracle LogMiner.

Data Sources: Oracle CDC sources
Required: Yes for Oracle LogMiner CDC

Syntax:

    ORACLEID=( collection_id, oracle_db, [source_connect_string,] [capture_connect_string,] )

Parameters: Enter the following positional parameters:

capture_connect_string
    Optional. Oracle connection string, defined in TNS, that the PowerExchange Logger uses to connect to the Oracle database with the source tables for Oracle LogMiner CDC. This connection string must be specified in the Oracle Client tnsnames.ora file that is used for connection to the Oracle source database.
    If this value is null, the value of the ORACLE_SID environment variable is used by default and the PowerExchange Logger does not use Oracle SQL*Net for connection. If the ORACLE_SID environment variable is not defined, the default Oracle database is used, if defined.
    For Oracle LogMiner CDC only, if you have multiple Oracle databases and capture changes from a database other than the default database, you must specify both the source_connect_string and capture_connect_string parameters, even if the PowerExchange Logger is running on the same machine as the Oracle source database.
    Tip: If possible, bypass the use of SQL*Net to improve PowerExchange Logger performance. Set the following environment variables, whenever possible, to enable connection to the appropriate Oracle database without using the capture_connect_string parameter and SQL*Net:
    - ORACLE_HOME
    - ORACLE_SID
    - PATH
    - On a Linux or UNIX operating system, one of the following variables: LD_LIBRARY_PATH, LIBPATH, or SHLIB_PATH
collection_id
    Required. User-defined identifier for this ORACLEID statement. This value must match the ORACOLL parameter value in the ORCL CAPI_CONNECTION statement, the collection ID in the registration group defined for the source tables, and the DBID value in the PowerExchange Logger pwxccl.cfg file. Maximum length is eight characters.
oracle_db
    Required. Name of the Oracle database that contains the source tables you registered for change data capture.
source_connect_string
    Optional. Oracle connection string, defined in TNS, that is used to connect to the Oracle database that contains the source tables. This connection string must be defined in the Oracle Client tnsnames.ora file on the machine with the source database.
    For Oracle LogMiner CDC, the source connection string is used only for PowerExchange Navigator access to the Oracle source database. Enter this parameter in the dbmover.cfg file on the machine from which the PowerExchange Listener retrieves data for PowerExchange Navigator requests.
    Note: The source connection string is not used to transfer change data.
    If this value is null, the value of the ORACLE_SID environment variable is used by default. If the ORACLE_SID environment variable is not defined, the default Oracle database is used, if defined.
    If you plan to run a database row test on extraction maps for the source tables, also specify the capture_connect_string parameter.

Usage Notes: PowerExchange requires an ORACLEID statement for each Oracle database for which you want to capture and extract change data. You can specify a maximum of 20 ORACLEID statements in a single dbmover.cfg file. Specify the ORACLEID statement in the dbmover.cfg file on the machine where the PowerExchange Logger runs, or if you plan to perform Oracle LogMiner CDC without the PowerExchange Logger, on the machine where your PowerExchange extractions run.

ORCL CAPI_CONNECTION Statement

The ORCL CAPI_CONNECTION statement specifies the Consumer API (CAPI) parameters needed for Oracle CDC sources that use Oracle LogMiner.

Data Sources: Oracle sources
Related Statements: UOWC CAPI_CONNECTION
Required: Yes for Oracle LogMiner CDC

Syntax:

    CAPI_CONNECTION=( [DLLTRACE=trace_id,] NAME=name, [TRACE=trace,]
      TYPE=(ORCL,
        [ARRAYSIZE=array_size,] [BYPASSUF={N|Y},] [CATBEGIN=hh:mm,] [CATEND=hh:mm,]
        [CATINT=minutes,] [COMMITINT=minutes,] [GENRLOCK={N|Y},] [IGNUFMSG={N|Y},]
        [LGTHREAD=instance_number,] [LOGDEST=logdest_id,] [ONLINECAT={N|Y},]
        ORACOLL=collection_id, [SELRETRY=retry_number,] [SNGLINST={N|Y}] ) )

Parameters: Enter the following parameters:

DLLTRACE=trace_id
    Optional. User-defined name of the TRACE statement that activates internal DLL tracing for this CAPI. Specify this parameter only at the direction of Informatica Global Customer Support.
NAME=name
    Required. Unique user-defined name for this CAPI_CONNECTION statement. Maximum length is eight alphanumeric characters.
TRACE=trace
    Optional. User-defined name of the TRACE statement that activates the common CAPI tracing. Specify this parameter only at the direction of Informatica Global Customer Support.
TYPE=(ORCL, ... )
    Required. Type of CAPI_CONNECTION statement. For Oracle CDC sources that use LogMiner, this value must be ORCL.
ARRAYSIZE=array_size
    Optional. Size, in number of rows, of the prefetch array that PowerExchange uses to read the Oracle redo logs. Valid values are from 0 through 2147483647. Default is 100.
    Note: A value of 0 disables prefetch. Specify 0 only at the direction of Informatica Global Customer Support. A value of less than 100 can degrade Oracle CDC performance.
BYPASSUF={N|Y}
    Optional. Controls whether PowerExchange ends abnormally or issues a warning message when an unformatted log record is returned from Oracle LogMiner. LogMiner returns unformatted log records when Global Temporary Tables are updated, or when ONLINECAT=Y is specified and the log data that is being read is inconsistent with the catalog.

    Enter one of the following options:
    - N. PowerExchange ends with an error whenever it receives an unformatted log record from Oracle LogMiner.
    - Y. PowerExchange writes a warning message to the PowerExchange message log that warns that unformatted log data has been found, and then continues processing. Depending on the amount of unformatted log data, many warning messages might be written. You can specify Y for the IGNUFMSG parameter to suppress these warning messages.
    Default is N.
    Tip: Specify Y if the Oracle instance contains Global Temporary tables. Otherwise, do not include the BYPASSUF parameter.
CATBEGIN=hh:mm
    Optional. Earliest time of day, in 24-hour clock format, at which PowerExchange requests Oracle to write a copy of the Oracle catalog to the redo logs. Default is 00:00.
    If you specify a value for the CATBEGIN parameter, you must also specify a value for the CATEND parameter.
CATEND=hh:mm
    Optional. Latest time of day, in 24-hour clock format, at which PowerExchange requests Oracle to write a copy of the Oracle catalog to the redo logs. Default is 24:00.
    If you specify a value for the CATEND parameter, you must also specify a value for the CATBEGIN parameter.
CATINT=minutes
    Optional. Time interval, in minutes, between requests to copy the Oracle catalog to the redo logs. If this interval elapses but the time is outside of the time period specified in the CATBEGIN and CATEND parameters, PowerExchange does not request Oracle to take a copy of the Oracle catalog. Instead, PowerExchange waits until the time specified for the CATBEGIN parameter to request a catalog copy.
    Valid values are from 1 through 1440. Default is 1440.
COMMITINT=minutes
    Optional. Time interval, in minutes, between the SQL COMMIT operations issued by PowerExchange to commit the transactions automatically generated by the Oracle LogMiner session.
    Although PowerExchange does not update data in user tables while reading change data from the redo logs, the Oracle LogMiner interface automatically generates transactions for the LogMiner sessions that PowerExchange initiates. Oracle leaves these transactions open, or in-flight, until the LogMiner session ends. To be able to restart change data extraction operations efficiently, PowerExchange must occasionally issue SQL COMMIT operations to end these in-flight transactions. Otherwise, the restart of all future real-time extraction operations might be impacted because PowerExchange always begins reading change data at the beginning of the oldest in-flight UOW.
    Valid values are from 1 through 60.

    Default is 5.
GENRLOCK={N|Y}
    Optional. Controls whether PowerExchange generates a safe restart point for requests for restart points that match the current end-of-log (EOL). A safe restart point for a source table is a point in the change stream that does not skip any in-flight UOWs for that table.
    PowerExchange generates restart tokens that match the current EOL in the following situations:
    - The PowerExchange Logger for Linux, UNIX, and Windows is cold started and the pwxccl.cfg configuration file does not specify the SEQUENCE_TOKEN and RESTART_TOKEN parameters. PowerExchange obtains locks for all tables represented by capture registrations selected for processing by the PowerExchange Logger.
    - The restart token file for a CDC session specifies the CURRENT_RESTART option on the RESTART1 and RESTART2 special override statements. PowerExchange obtains locks only for the tables in the CDC session to which the special override statements apply.
    - A database row test in the PowerExchange Navigator that uses the SELECT CURRENT_RESTART SQL statement. PowerExchange obtains a lock for the table represented by capture registration associated with the extraction map used in the database row test.
    - A DTLUAPPL utility operation that uses the RSTTKN GENERATE option. PowerExchange obtains a lock for the table represented by the capture registration specified in the utility control statements.
    Enter one of the following options:
    - N. PowerExchange generates restart points that match the current EOL, ignoring any in-flight transactions for the source tables.
    - Y. PowerExchange generates safe restart points for source tables. To generate a safe restart point for a source table, PowerExchange obtains an exclusive lock on the table to stop further changes. PowerExchange then searches the Oracle catalog for the point in the change stream that matches the earliest active transaction for the table and uses this point as the restart point. If no in-flight UOWs exist for a table, PowerExchange uses the current EOL. PowerExchange releases the lock on the source table after the restart point generation process completes, which allows new changes to the table to occur.
    Default is N.
IGNUFMSG={N|Y}
    Optional. Controls whether PowerExchange writes warning messages to the PowerExchange message log file for unformatted data records.
    Enter one of the following options:
    - N. PowerExchange writes warning messages.
    - Y. PowerExchange does not write any warning messages.
    Default is N.

LOGDEST=logdest_id
    Optional. For RAC environments, the numeric identifier for the archive log destination that you want to force PowerExchange to use. This archive log destination must be local to the Oracle instance that PowerExchange is using. For example, to use archived logs from the destination set by the LOG_ARCHIVE_DEST_3 parameter in the init.ora file, specify LOGDEST=3.
    Valid values are from 1 through 10.
    If you specify Y for the ONLINECAT parameter, PowerExchange validates and then ignores the LOGDEST and LGTHREAD parameters. The SNGLINST parameter affects how PowerExchange uses the archive log destination and the Oracle instance specified by LOGDEST and LGTHREAD.
LGTHREAD=instance_number
    Optional. For RAC environments, the numeric instance number for the Oracle instance that PowerExchange uses to identify the archived redo logs to process.
    Valid values are from 1 through 2147483647.
    If you specify Y for the ONLINECAT parameter, PowerExchange validates and then ignores the LOGDEST and LGTHREAD parameters. The SNGLINST parameter affects how PowerExchange uses the archive log destination and the Oracle instance specified by LOGDEST and LGTHREAD.
ONLINECAT={N|Y}
    Optional. Controls whether PowerExchange directs Oracle LogMiner to use the Oracle online catalog or the copy of the catalog in the redo logs to format log data for CDC.
    Change data extraction operations generally initialize faster when PowerExchange is configured to create LogMiner sessions with the online catalog instead of a catalog copy. However, when LogMiner uses the online catalog, it does not track DDL changes, and cannot format log records for tables that have schema changes.
    If LogMiner uses the online catalog and you make schema changes while LogMiner is reading log data, LogMiner passes unformatted log records for subsequent changes to PowerExchange. If you specify N for the BYPASSUF parameter, or allow it to default, PowerExchange fails the extraction request after Oracle passes the first unformatted record. Otherwise, PowerExchange skips the unformatted record and continues processing, which results in change data loss.
    Therefore, specify N for the ONLINECAT parameter, or allow it to default, if you have the following requirements:
    - You specify Y for the BYPASSUF parameter and need to change the schema of tables registered for capture while change data extraction operations are running.
    - You need to start an extraction from a point in the Oracle redo logs that contains table data that was captured under a previous schema.
    When PowerExchange is configured to use the online catalog for formatting log data, it still uses catalog copies to determine the restart point for change data extraction operations. Therefore, you must copy the online catalog to the Oracle redo logs on a regular basis.
    Enter one of the following options:
    - N. Oracle LogMiner uses the copy of the catalog from the archived redo logs and PowerExchange tracks schema changes to ensure that data loss does not occur.
    - Y. Oracle LogMiner uses the online catalog and PowerExchange cannot track schema changes.

    Default is N.
ORACOLL=collection_id
    Required. Oracle collection identifier, which must match the value specified in the ORACLEID statement.
SELRETRY=retry_number
    Optional. Number of times that PowerExchange immediately loops back to the Oracle LogMiner call before implementing a graduated-scale wait loop.
    If you specify a non-zero value, PowerExchange uses non-blocking SQL to ensure that a user request to shut down an extraction session is processed in a timely manner. After the call to LogMiner has been retried the specified number of times, PowerExchange implements a wait interval between each subsequent retry. The wait interval begins at one millisecond and gradually increases to one second. When LogMiner returns data, the wait interval is reset to 0, and the process begins again for the next call to LogMiner.
    If you specify 0, PowerExchange does not use non-blocking SQL. This setting improves CPU consumption but can prolong extraction session shutdown. On quiescent Oracle instances, PowerExchange does not honor a shutdown request until log data is returned from Oracle. On Oracle instances where update activity is occurring, shutdown behavior does not noticeably change.
    Valid values are from 0 through 2147483647. Default is 1000.
SNGLINST={N|Y}
    Optional. In RAC environments, controls whether PowerExchange uses only the archived redo logs from a specific Oracle instance and archive log destination.
    Enter one of the following options:
    - N. PowerExchange uses the specified Oracle instance to search for archived redo logs that contain copies of the Oracle catalog. After PowerExchange passes these logs to an Oracle LogMiner session, LogMiner determines the other archived redo logs to read.
    - Y. PowerExchange uses only the archive log destination and Oracle instance that you specify in LOGDEST and LGTHREAD parameters to read archived redo logs. LogMiner does not read any other archived redo logs. After PowerExchange processes the logs from the specified location, the change data extraction operation ends. For all remaining Oracle instances in the RAC, you must run separate change data extraction processes and then determine how to properly merge the change data so that you can apply it to targets.
    If you specify Y, you must also specify the LOGDEST and LGTHREAD parameters to identify the archive log destination and Oracle instance to use.
    Default is N.

Oracle Catalog Parameters in the ORCL CAPI_CONNECTION Statement

The CATINT, CATBEGIN, and CATEND parameters in the ORCL CAPI_CONNECTION statement can significantly affect PowerExchange performance. These parameters control the frequency with which the Oracle catalog is copied to the Oracle redo logs and the time period within which the copy operation can occur.

When you restart PowerExchange extraction processing, PowerExchange directs Oracle LogMiner to begin reading change data from the redo logs starting from the SCN of the last Oracle catalog copy that was written to the logs prior to the end of the previous extraction session.
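The interaction of CATINT with the CATBEGIN and CATEND window can be sketched in a few lines of Python. This is only an illustration of the rule as this guide describes it, not PowerExchange code: the function name next_catalog_request and its arguments are hypothetical, and the sketch assumes that CATBEGIN is not later than CATEND (a window that does not cross midnight).

```python
from datetime import datetime, timedelta


def _to_minutes(hhmm):
    """Convert an hh:mm string (00:00 through 24:00) to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def next_catalog_request(last_copy, catint=1440, catbegin="00:00", catend="24:00"):
    """Return when the next catalog copy would be requested (hypothetical helper).

    Assumption: CATBEGIN <= CATEND, so the window lies within one day.
    """
    due = last_copy + timedelta(minutes=catint)
    begin, end = _to_minutes(catbegin), _to_minutes(catend)
    due_minutes = due.hour * 60 + due.minute
    if begin <= due_minutes <= end:
        # The interval elapsed inside the allowed window: request immediately.
        return due
    # Outside the window: wait until the next occurrence of the CATBEGIN time.
    midnight = due.replace(hour=0, minute=0, second=0, microsecond=0)
    candidate = midnight + timedelta(minutes=begin)
    if candidate < due:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    last = datetime(2009, 12, 1, 22, 0)
    print(next_catalog_request(last, catint=180, catbegin="02:00", catend="04:00"))
```

With a narrow window, a short CATINT value does not produce more frequent copies than the window allows, which is one reason to try several combinations of these parameters.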

To configure the CATINT, CATBEGIN, and CATEND parameters, try various settings until you find a combination that provides for efficient restart processing. The default frequency of once a day might not be sufficient if you have a high volume of transaction activity.

The following examples demonstrate how copying the Oracle catalog multiple times can affect the amount of change data that is reread from the archived redo logs when PowerExchange extraction processing is restarted.

Example 1

Assume that the Oracle catalog was initially copied to the Oracle redo logs at SCN 10 and another copy has not yet been written to the logs. Change data was logged starting at SCN 40 and ending at SCN 60. A PowerExchange extraction session extracted these changes before ending at SCN 100. Since the extraction session ended, additional changes have been logged starting at SCN 160.

When you restart PowerExchange extraction processing, LogMiner must begin reading change data from the initial catalog copy at SCN 10 because it is the latest catalog copy prior to the session end at SCN 100. As a result, PowerExchange reprocesses the data between SCN 10 and SCN 100, before continuing to the new change data that begins at SCN 160. This reprocessing of data impacts PowerExchange performance.

Example 2

Assume that the Oracle catalog was copied to the Oracle redo logs twice: at SCN 10 and at SCN 80. Change data was logged starting at SCN 40 and ending at SCN 60. A PowerExchange extraction session extracted these changes before ending at SCN 100. Since the extraction session ended, additional changes have been logged starting at SCN 160.

When you restart PowerExchange extraction processing, LogMiner begins reading change data from the second catalog copy at SCN 80 because it is the latest catalog copy prior to the session end at SCN 100. As a result, PowerExchange reprocesses only the data between SCN 80 and SCN 100, before continuing to the new change data that begins at SCN 160. With multiple catalog copies, PowerExchange needs to reprocess less change data.
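The arithmetic in the two examples can be restated as a short Python sketch. It only models the rule described above (restart from the latest catalog copy written before the previous session ended). The function names are hypothetical and nothing here calls Oracle or PowerExchange.

```python
def restart_scn(catalog_copy_scns, session_end_scn):
    """Return the SCN of the latest catalog copy written before the session ended."""
    candidates = [scn for scn in catalog_copy_scns if scn < session_end_scn]
    if not candidates:
        raise ValueError("no catalog copy exists prior to the session end")
    return max(candidates)


def reprocessed_range(catalog_copy_scns, session_end_scn):
    """Return the (start, end) SCN range that is reread after a restart."""
    return restart_scn(catalog_copy_scns, session_end_scn), session_end_scn


if __name__ == "__main__":
    # Example 1: a single catalog copy at SCN 10, session ended at SCN 100.
    print(reprocessed_range([10], 100))
    # Example 2: catalog copies at SCN 10 and SCN 80, session ended at SCN 100.
    print(reprocessed_range([10, 80], 100))
```

The reread range shrinks from 90 SCNs to 20 SCNs when the second catalog copy exists, which is the effect that more frequent catalog copies are intended to achieve.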

UOWC CAPI_CONNECTION Statement
The UOWC CAPI_CONNECTION statement specifies the Consumer API (CAPI) parameters needed for the UOW Cleanser. In the change stream for some data sources, changes from multiple UOWs are intermingled. The UOW Cleanser reconstructs the intermingled changes read from the change stream into complete UOWs in chronological order based on end time.
Data Sources: DB2 for i5/OS, Oracle LogMiner CDC, and z/OS CDC

Related Statements: AS4J CAPI_CONNECTION for i5/OS, ORCL CAPI_CONNECTION for Oracle, and LRAP CAPI_CONNECTION for z/OS

Required: Yes for the noted data sources

Syntax:
CAPI_CONNECTION=(
    [DLLTRACE=trace_id,]
    NAME=name,
    [TRACE=trace,]
    TYPE=(UOWC,
        CAPINAME=name,
        [BLKSIZE=block_size,]
        [DATACLAS=data_class,]
        [MEMCACHE=cache_size,]
        [RSTRADV=seconds,]
        [SPACEPRI=primary_space,]
        [SPACETYPE={BLK|TRK|CYL},]
        [STORCLAS=storage_class,]
        [UNIT=unit]
    )
)
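As an illustration of how the syntax fits together, the following Python sketch assembles a UOWC statement from keyword values and checks the numeric ranges that the parameter descriptions in this section give. It is not a PowerExchange utility. The function name render_uowc and the statement names UOWCORA and ORACAPI are hypothetical, the BLKSIZE range used is the one stated for Oracle LogMiner CDC sources, and the single-line output layout is an assumption.

```python
# Value ranges quoted from the parameter descriptions in this section.
# The BLKSIZE range shown is the one for Oracle LogMiner CDC sources.
RANGES = {"BLKSIZE": (8, 65535), "MEMCACHE": (1, 519720), "RSTRADV": (0, 86400)}

# Optional TYPE sub-parameters, in the order used by the syntax diagram.
OPTIONAL_KEYS = ["BLKSIZE", "DATACLAS", "MEMCACHE", "RSTRADV",
                 "SPACEPRI", "SPACETYPE", "STORCLAS", "UNIT"]


def render_uowc(name, capiname, **options):
    """Return a UOWC CAPI_CONNECTION statement as a single-line string."""
    if not (name.isalnum() and len(name) <= 8):
        raise ValueError("NAME must be at most eight alphanumeric characters")
    given = {key.upper(): value for key, value in options.items()}
    unknown = set(given) - set(OPTIONAL_KEYS)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    parts = ["UOWC", f"CAPINAME={capiname}"]
    for key in OPTIONAL_KEYS:
        if key not in given:
            continue
        value = given[key]
        if key in RANGES:
            low, high = RANGES[key]
            if not low <= int(value) <= high:
                raise ValueError(f"{key} must be from {low} through {high}")
        parts.append(f"{key}={value}")
    return f"CAPI_CONNECTION=(NAME={name},TYPE=({','.join(parts)}))"


# Hypothetical names: UOWCORA for this statement, ORACAPI for the ORCL statement.
print(render_uowc("UOWCORA", "ORACAPI", memcache=20480, rstradv=60))
```

The CAPINAME value must match the NAME value of the related source-specific CAPI_CONNECTION statement, so a real configuration would pair this statement with an ORCL, AS4J, or LRAP statement of that name.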

Parameters: Enter the following parameters:

DLLTRACE=trace_id
Optional. User-defined name of the TRACE statement that activates internal DLL tracing for this CAPI. Specify this parameter only at the direction of Informatica Global Customer Support.

NAME=name
Required. Unique user-defined name for this CAPI_CONNECTION statement. Maximum length is eight alphanumeric characters.

TRACE=trace
Optional. User-defined name of the TRACE statement that activates the common CAPI tracing. Specify this parameter only at the direction of Informatica Global Customer Support.

TYPE=(UOWC, ... )
Required. Type of CAPI_CONNECTION statement. For the UOW Cleanser, this value must be UOWC.

PowerExchange Configuration for Oracle LogMiner CDC


BLKSIZE=block_size
Optional. Block size, in bytes, for the sequential UOW spill files that the UOW Cleanser creates when the memory cache cannot hold all changes for a UOW. Valid values and defaults vary by platform:
¨ For Oracle LogMiner CDC sources, enter a value from 8 through 65535. Default is 32768.
¨ For i5/OS CDC sources, enter a value from 8 through 32760. Default is 32760.
¨ For z/OS CDC sources, enter a value from 8 through 32760. Default is 18452.

CAPINAME=name
Required. Value from the NAME parameter in the related source-specific CAPI_CONNECTION statement. The source-specific CAPI_CONNECTION is one of the following statement types:
¨ AS4J CAPI_CONNECTION statement for i5/OS CDC sources
¨ LRAP CAPI_CONNECTION statement for z/OS CDC sources
¨ ORCL CAPI_CONNECTION statement for Oracle LogMiner CDC sources

DATACLAS=data_class
Optional. On z/OS, the SMS data class that the UOW Cleanser uses when allocating the sequential UOW spill files. If you do not specify this parameter, the SMS ACS routines can assign the data class.

MEMCACHE=cache_size
Optional. Memory cache size, in kilobytes, that PowerExchange allocates to reconstruct complete UOWs.
For each extraction session, PowerExchange keeps all changes for each UOW in the memory cache until it processes the end-UOW record. If the memory cache is too small to hold all of the changes in a UOW, PowerExchange spills the changes to sequential files on disk, called UOW spill files.
Each UOW spill file contains one UOW. A UOW might require multiple UOW spill files to hold all of the changes for that UOW. If the change stream contains multiple large UOWs and the memory cache is insufficient, PowerExchange might create numerous UOW spill files.
PowerExchange processes the change stream more efficiently if it does not need to use UOW spill files. In addition to degrading extraction performance, large numbers of UOW spill files can cause a disk space shortage.
Important: If the change stream contains only small UOWs, the default value might be sufficient. However, the default value is often too small to eliminate UOW spill files. Informatica recommends that you specify a larger value.
The location in which PowerExchange allocates the UOW spill files varies by operating system, as follows:
¨ For i5/OS, PowerExchange uses the CRTPF command to create a physical file for UOW spill files. PowerExchange creates the UOW spill file names by using the C/C++ tmpnam() function.
¨ For Linux and UNIX, PowerExchange uses the current directory by default for UOW spill files. To use a different directory, specify the TMPDIR environment variable. PowerExchange creates the UOW spill file names by using the operating system tempnam function with a prefix of dtlq.
Note: The UOW spill files are temporary files that are deleted when PowerExchange closes them. They are not visible in the directory while open.
¨ For Windows, PowerExchange uses the current directory by default for UOW spill files. To use a different directory, specify the TMP environment variable. PowerExchange creates the UOW spill file names by using the Windows _tempnam function with a prefix of dtlq.
¨ For z/OS, PowerExchange uses dynamic allocation to allocate temporary data sets for the UOW spill files. Generally, SMS controls the location of temporary data sets. If you do not use SMS to control temporary data sets, the UNIT parameter controls the location for the UOW spill files. Because PowerExchange allocates temporary data sets for the UOW spill files, z/OS assigns these files system-generated data set names, which begin with SYSyyddd.Thhmmss.RA000.jobname.
Valid values are from 1 through 519720. Default is 1024, or 1 MB.
Warning: Because PowerExchange allocates the cache size for each extraction operation, use caution when coding large values for MEMCACHE. Otherwise, many concurrent extraction sessions might cause memory constraints.

RSTRADV=nnnnn
Time interval, in seconds, that PowerExchange waits before advancing restart and sequence tokens for a registered data source during periods when UOWs do not include any changes of interest for the data source. When the wait interval expires, PowerExchange returns the next committed "empty UOW," which includes only updated restart information.
The wait interval is reset to 0 when PowerExchange completes processing a UOW that includes changes of interest or returns an empty UOW because the wait interval expired without any changes of interest having been received.
For example, if you specify 5, PowerExchange waits 5 seconds after it completes processing the last UOW or after the previous wait interval expires. Then PowerExchange returns the next committed empty UOW that includes the updated restart information and resets the wait interval to 0.
If RSTRADV is not specified, PowerExchange does not advance restart and sequence tokens for a registered source during periods when no changes of interest are received. In this case, when PowerExchange warm starts, it reads all changes, including those not of interest for CDC, from the restart point.
Valid values are 0 through 86400. No default is provided.
Warning: A value of 0 can degrade performance because PowerExchange returns an empty UOW after each UOW processed.
SPACEPRI=primary_space
Optional. On z/OS, the primary space value that the UOW Cleanser uses to allocate UOW spill files. The UOW Cleanser does not use secondary space. Instead, when a spill file becomes full, the UOW Cleanser allocates another spill file of the same size. The SPACETYP parameter specifies the space units for this value. SMS ACS routines can override the UOW spill file size.
Valid values are from 1 through 2147483647. Default is 50 cylinders.
Note: On i5/OS, the UOW Cleanser allocates UOW spill files as physical files with SIZE(*NOMAX), which means that the maximum spill file size is controlled by the system maximum file size. On Linux, UNIX, and Windows, PowerExchange allocates UOW spill files as temporary files that are 2 GB in size.


SPACETYPE={BLK|TRK|CYL}
Optional. On z/OS, the type of space units that the UOW Cleanser uses to allocate UOW spill files. Enter one of the following options:
¨ BLK. Use blocks.
¨ CYL. Use cylinders.
¨ TRK. Use tracks.
Default is BLK.

STORCLAS=storage_class
Optional. On z/OS, the SMS storage class name that the UOW Cleanser uses to allocate UOW spill files.

UNIT=unit
Optional. On z/OS, the generic or esoteric unit name that the UOW Cleanser uses to allocate UOW spill files.

Management of Oracle LogMiner CDC
You might need to stop CDC for source tables occasionally, for example, to change the table definitions.

Stopping Oracle LogMiner CDC
You might need to stop Oracle change data capture for a source table to perform troubleshooting or routine maintenance tasks.
To stop change data capture, use one of the following methods:
¨ Open the capture registration for the source table, and change the Status value from Active to History. This method permanently stops change data capture for a table based on a particular capture registration, while preserving access to previously captured data.
Warning: A capture registration that has a status of History cannot be activated again.
¨ Drop the supplemental log group by executing the following SQL:
ALTER TABLE schema.table_name DROP SUPPLEMENTAL LOG GROUP
After you drop the supplemental log group, Oracle stops recording full before- and after-images of data that changed. If you reinstate the supplemental log group later, you should rematerialize the target database.

RELATED TOPICS:
¨ “Stopping PowerCenter CDC Sessions” on page 142

Changing a Source Table Definition Used in Oracle LogMiner CDC
Occasionally, you might need to change the definition of an Oracle source table that is registered for change data capture. If your metadata changes affect the columns from which change data is captured, use this procedure to enable PowerExchange to switch to the updated table definition.
Perform this procedure whenever you add, alter, or drop columns for which change data is captured. You do not need to perform this procedure if you are selectively capturing change data for a subset of columns and none of the selected columns are affected by the table definition changes.
Tip: If you no longer need to capture change data from a column in a table, you can remove that column from the extraction map without changing the capture registration. Change data for the column is still captured but is not extracted.

To change a source table definition used in Oracle LogMiner CDC:
1. Stop DELETE, INSERT, and UPDATE activity against the table.
2. Verify that any change data that was captured under the previous table definition has completed extraction processing. Then stop all workflows that extract change data for the table.
3. In the PowerExchange Navigator, open the original capture registration and set its status to History.
4. Drop the supplemental log group for the table.
5. Use DDL to make the table changes.
6. In the PowerExchange Navigator, create a new capture registration that reflects the metadata changes and set its status to Active. Also select the Execute DDL now option so that when you finish the capture registration, the PowerExchange Navigator runs the DDL for creating a new supplemental log group. PowerExchange uses the newly activated capture registration for change data capture.
Note: PowerExchange does not capture change data based on capture registrations that have a status of History or Inactive.
7. If you use the PowerExchange Logger for Linux, UNIX, and Windows, restart the PowerExchange Logger process so that it will begin using the new capture registration.
8. If necessary, change the target table definition to reflect the source table metadata changes.
9. If necessary, rematerialize the target tables. After materialization completes, create new restart tokens.
10. In PowerCenter Designer, import the altered source and target tables. Edit the mapping if necessary.
11. Re-enable DELETE, INSERT, and UPDATE activity against the table.
12. Restart extraction processing.

RELATED TOPICS:
¨ “Creating Restart Tokens for Extractions” on page 135

Part IV: Change Data Extraction

This part contains the following chapters:
¨ Introduction to Change Data Extraction
¨ Extracting Change Data
¨ Managing Change Data Extractions
¨ Monitoring and Tuning Options

CHAPTER 7
Introduction to Change Data Extraction

This chapter includes the following topics:
¨ Change Data Extraction Overview
¨ Extraction Modes
¨ PowerExchange-Generated Columns in Extraction Maps
¨ Restart Tokens and the Restart Token File
¨ Recovery and Restart Processing for CDC Sessions
¨ Group Source Processing in PowerExchange
¨ Commit Processing with PWXPC
¨ Offload Processing

Change Data Extraction Overview
Use PowerExchange in conjunction with PWXPC and PowerCenter to extract captured change data and write it to one or more targets. Review the topics in this chapter to learn key concepts about extraction processing so that you can configure CDC sessions to extract change data efficiently and to enable proper restart and recovery.
To extract changes captured by PowerExchange, import the metadata for the capture source into PowerCenter Designer. Use one of the following methods:
¨ For nonrelational data sources, import the extraction map from PowerExchange.
¨ For relational data sources, you can import either the metadata from the database or the extraction map from PowerExchange. If you import metadata from the database, you might need to modify the source definition in Designer to add PowerExchange-defined CDC columns or to remove any columns that are not included in the extraction map. If you import extraction maps, you do not need to manually add or remove these columns from the PowerCenter source definition.
After you import the metadata, you can use the source definitions in PowerCenter to create mappings, sessions, and workflows for extracting the change data from PowerExchange.

RELATED TOPICS:
¨ “PowerExchange-Generated Columns in Extraction Maps” on page 106

Extraction Modes
You can use different modes to extract change data captured by PowerExchange. The extraction mode is determined by the PowerCenter connection type and certain PowerExchange CDC configuration parameters. Some extraction modes are available only if you use PowerExchange Condense or the PowerExchange Logger for Linux, UNIX, and Windows.
Depending on your extraction requirements, use one of the following extraction modes:

Real-time extraction mode
Continuously extracts change data directly from the PowerExchange Logger for MVS log files in near real time. Extraction processing continues until the CDC session is stopped or interrupted.
To implement this mode, configure a PWX CDC Real Time application connection in PowerCenter for your data source type.

Batch extraction mode
Extracts change data from PowerExchange Condense condense files on MVS that are closed at the time the session runs. After processing the condense files, the CDC session ends.
To implement this mode, configure the following items:
¨ In PowerCenter, configure a PWX CDC Change application connection for your data source type.
¨ In the PowerExchange Navigator, set the Condense option to Part or Full in your capture registrations.

Continuous extraction mode
Continuously extracts change data from open and closed PowerExchange Logger for Linux, UNIX, and Windows log files in near real time.
To implement this mode, configure the following items:
¨ On the remote Linux, UNIX, or Windows system, configure the PowerExchange Logger for Linux, UNIX, and Windows to log change data that was originally captured on MVS.
¨ In PowerCenter, configure a PWX CDC Real Time application connection for your data source type.
¨ In the PowerExchange Navigator, set the Condense option to Part in your capture registrations.

RELATED TOPICS:
¨ “Configuring PowerExchange to Capture Change Data on a Remote System” on page 162
¨ “Extracting Change Data Captured on a Remote System” on page 168

PowerExchange-Generated Columns in Extraction Maps
Besides the table columns defined in capture registrations, extraction maps include columns that PowerExchange generates. These PowerExchange-generated columns contain CDC-related information, such as the change type and timestamp.
When you import an extraction map in Designer, PWXPC includes the PowerExchange-generated columns in the source definition.
When you perform a database row test on an extraction map, the PowerExchange Navigator displays the PowerExchange-generated columns in the results. By default, the PowerExchange Navigator hides these columns from view when you open the extraction map. To display these columns, open the extraction map, right-click anywhere within the Extract Definition window, and select Show Auto Generated Columns.
Note: By default, all columns except the DTL__columnname_CNT and DTL__columnname_IND columns are selected in an extraction map. You must edit an extraction map to select these columns.

The following table describes the columns that PowerExchange generates for each change record:

Column: DTL__CAPXRESTART1
Datatype: VARBIN
Length: 255
Description: A binary value that represents the position of the end of the UOW for that change record followed by the position of the change record itself. The length of a sequence token varies by data source type, except on z/OS where sequence tokens for all data source types have the same length. The value of DTL__CAPXRESTART1 is also known as the sequence token, which when combined with the restart token comprises the restart token pair. A sequence token for a change record is a strictly ascending and repeatable value.

Column: DTL__CAPXRESTART2
Datatype: VARBIN
Length: 255
Description: A binary value that represents a position in the change stream that can be used to reconstruct the UOW state for the change record, with the following exceptions:
- Microsoft SQL Server CDC. A binary value that contains the DBID of the distribution database and the name of the distribution server.
- Change data extracted from full condense files on z/OS or i5/OS. A binary value that contains the instance name from the registration group of the capture registration.
The length of a restart token varies by data source type. On z/OS, restart tokens for all data source types have the same length, except for change data extracted from full condense files. The value of DTL__CAPXRESTART2 is also known as the restart token, which when combined with the sequence token comprises the restart token pair.

Column: DTL__CAPXRRN
Datatype: DECIMAL
Length: 10
Description: For DB2 on i5/OS only, the relative record number.

Column: DTL__CAPXUOW
Datatype: VARBIN
Length: 255
Description: A binary value that represents the position in the change stream of the start of the UOW for the change record.

Column: DTL__CAPXUSER
Datatype: VARCHAR
Length: 255
Description: The user ID of the user that made the change to the data source, with the following exceptions:
- DB2 for i5/OS. If you specify LIBASUSER=Y on the AS4J CAPI_CONNECTION statement, the value is the library and file name to which the change was made.
- DB2 for z/OS. If you do not specify UIDFMT on the LRAP CAPI_CONNECTION, the value is the user ID of the user that made the change. Otherwise, the UIDFMT parameter determines the value.
- Microsoft SQL Server. The value is null because Microsoft SQL Server does not record this information in the distribution database.
- Oracle. The value might be null. If known, Oracle provides the user ID.

Column: DTL__CAPXTIMESTAMP
Datatype: CHAR
Length: 20
Description: The timestamp for when the change was made to the data source, as recorded by the source DBMS in the following format:
YYYYMMDDhhmmssnnnnnn
Where:
- YYYYMMDD is the date in year (YYYY), month (MM), and day (DD) format.
- hhmmssnnnnnn is the time in hours (hh), minutes (mm), seconds (ss), and microseconds (nnnnnn) format.
Note: Oracle does not support microseconds in the timestamp.

Column: DTL__CAPXACTION
Datatype: CHAR
Length: 1
Description: A single character that indicates the type of change operation. Valid values are:
- I. INSERT operation.
- D. DELETE operation.
- U. UPDATE operation.

Column: DTL__CAPXCASDELIND
Datatype: CHAR
Length: 1
Description: For DB2 for z/OS sources only, a single character that indicates whether DB2 has deleted the row because the table specifies the ON DELETE CASCADE clause. Valid values are:
- Y. Indicates that DB2 deleted this row because of a cascade delete rule.
- N. Indicates that DB2 did not delete this row because of a cascade delete rule.

Column: DTL__BI_columnname
Datatype: Datatype of the source column
Length: Length of the source column
Description: For UPDATE operations, the value of the before image of the selected column in the change record.

Column: DTL__CI_columnname
Datatype: CHAR
Length: 1
Description: For UPDATE operations, a single character that indicates whether the selected column was changed. Valid values are:
- Y. Indicates that the column changed.
- N. Indicates that the column did not change.
- Null value. Indicates an INSERT or DELETE operation.

Column: DTL__columnname_CNT
Datatype: NUM32U
Length: 0
Description: Binary count column. PowerExchange generates this column for variable length columns of types VARCHAR and VARBIN to determine the length of the column during change data extraction processing.
Note: By default, binary count columns are not selected in an extraction map. You must edit an extraction map to select these columns.

Column: DTL__columnname_IND
Datatype: BIN
Length: 1
Description: Null indicator column. PowerExchange generates this column for nullable columns to indicate the nullable value for the column.
Note: By default, null indicator columns are not selected in an extraction map. You must edit an extraction map to select these columns.
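A downstream program that receives the generated columns typically needs to decode a few of them. The following Python sketch shows one way to interpret DTL__CAPXTIMESTAMP, DTL__CAPXACTION, and a DTL__CI_columnname value, using only the formats and valid values listed in the table. The function names are hypothetical, and the sketch assumes that the values arrive as ordinary strings, with None standing in for a null value.

```python
from datetime import datetime

# Valid DTL__CAPXACTION values, as listed in the table.
ACTIONS = {"I": "INSERT", "D": "DELETE", "U": "UPDATE"}


def parse_capx_timestamp(value):
    """Parse a 20-character YYYYMMDDhhmmssnnnnnn string into a datetime."""
    if len(value) != 20 or not value.isdigit():
        raise ValueError("expected 20 digits in YYYYMMDDhhmmssnnnnnn format")
    return datetime(
        int(value[0:4]), int(value[4:6]), int(value[6:8]),      # date
        int(value[8:10]), int(value[10:12]), int(value[12:14]),  # time
        int(value[14:20]),                                       # microseconds
    )


def describe_action(action):
    """Map a DTL__CAPXACTION character to the operation name."""
    return ACTIONS[action]


def column_changed(change_indicator):
    """Interpret DTL__CI_columnname: True, False, or None for INSERT/DELETE."""
    if change_indicator is None:
        return None
    return {"Y": True, "N": False}[change_indicator]


stamp = parse_capx_timestamp("20091231235959123456")
print(stamp.isoformat(), describe_action("U"), column_changed("Y"))
```

For Oracle sources the microsecond digits carry no information, because Oracle does not support microseconds in the timestamp.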

Restart Tokens and the Restart Token File
PowerExchange uses a pair of token values, called a restart token pair, to determine where to begin extracting change data in the change stream for a CDC session. For a new CDC session, you should generate restart token values that represent the point-in-time in the change stream where you materialized the targets. Each source in a CDC session can have unique values for its restart token pair in the restart token file.
A restart token pair matches the position in the change stream for a change record and has the following parts:

Sequence token
For each change record that PowerExchange reads from the change stream, a binary value that represents the change stream position of the end of the UOW for that change record followed by the position of the change record itself, with the following exceptions:
¨ For Microsoft SQL Server CDC, a binary value that represents the position of the change record in the distribution database.
¨ For change data extracted from full condense files on z/OS or i5/OS, a binary value that represents the full condense file and the position of the change record in that file.
A sequence token for a change record is a strictly ascending and repeatable value. The length of a sequence token varies by data source type, except on z/OS where sequence tokens for all data source types have the same length.

Restart token
For each change record that PowerExchange reads from the change stream, a binary value that represents a position in the change stream that can be used to reconstruct the UOW state for that record, with the following exceptions:
¨ For Microsoft SQL Server CDC, a binary value that contains the DBID of the distribution database and the name of the distribution server.
¨ For change data extracted from full condense files on z/OS or i5/OS, a binary value that contains the instance name from the registration group for the capture registration.
In some cases, the restart token might contain the position of the oldest open UOW. An open UOW is a UOW for which PowerExchange has read the beginning of the UOW from the change stream but has not yet read the commit record, or end-UOW.
The length of a restart token varies by data source type. On z/OS, restart tokens for all data source types have the same length, except for change data extracted from full condense files.

PowerExchange uses these restart token values to determine the point from which to start reading change data from the change stream, with the following exceptions:
¨ For Microsoft SQL Server CDC, PowerExchange uses the sequence token value to determine the point from which to start reading change data from that distribution database, and the restart token value to verify that the distribution database is the same as the distribution database specified on the CAPI connection.
¨ For change data extracted from full condense files on z/OS and i5/OS, PowerExchange uses the sequence token value to determine the point from which to start reading change data from the condense files, and the restart token value to verify that the instance is the same as the instance recorded for the change record.
After determining the start point in the change stream for a CDC session, PowerExchange begins to read and pass change data to PWXPC. PWXPC uses the sequence token value for each source in the CDC session to determine the point at which to start providing the change data passed from PowerExchange to a specific source.
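The per-source role of the sequence token can be pictured with a small Python sketch. This is a conceptual model, not PWXPC code. It assumes that sequence tokens can be modeled as byte strings that compare in ascending order, and that a source starts receiving records whose token is greater than its saved token. Neither detail is stated in this guide, and the function name and sample values are hypothetical.

```python
def records_to_deliver(change_stream, saved_sequence_tokens):
    """Yield the change records that lie past each source's saved sequence token.

    change_stream: iterable of (source, sequence_token, payload) tuples in
        change stream order, as passed along after the common start point.
    saved_sequence_tokens: dict that maps each source in the session to its
        saved sequence token (bytes).
    """
    for source, token, payload in change_stream:
        if source not in saved_sequence_tokens:
            continue  # the source is not part of this CDC session
        if token > saved_sequence_tokens[source]:
            yield source, token, payload


stream = [
    ("ORDERS", b"\x00\x01", "insert order 1"),
    ("ITEMS", b"\x00\x02", "insert item 7"),
    ("ORDERS", b"\x00\x03", "update order 1"),
]
# ORDERS was already processed through token 00 01, ITEMS through token 00 02.
saved = {"ORDERS": b"\x00\x01", "ITEMS": b"\x00\x02"}
for record in records_to_deliver(stream, saved):
    print(record)
```

Reading starts once for the whole session, but each source can resume at its own position, which is why each source keeps its own restart token pair.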

You should specify restart token values in the restart token file in the following situations:
¨ When creating a new CDC session, specify a restart token pair for each data source. Alternatively, you can use the special override statement to specify a restart token pair for some or all data sources.
¨ When adding a data source to an existing CDC session, specify a restart token pair for the new source.
¨ If you need to override token values for a data source that is defined in an existing CDC session, specify the override token values.

Generating Restart Tokens
Before you begin extracting change data, you must materialize the targets for the CDC session with data from the data sources. Usually, to perform this task, you run a bulk data movement session. After you materialize the targets and before you allow changes to be made to the data source again, you should generate restart tokens that represent the point-in-time in the change stream when the materialization occurred.
PWXPC can generate restart tokens when it starts to extract change data for a CDC session. Additionally, PowerExchange provides a number of methods to generate restart tokens. To generate restart tokens that match the current end of the change stream, use one of the following methods:
¨ In the PWXPC restart token file for the CDC session, specify CURRENT_RESTART on the RESTART1 and RESTART2 special override statements.
¨ In the PowerExchange Navigator, use the SELECT CURRENT_RESTART SQL statement when you perform a database row test.
¨ Run the DTLUAPPL utility with the GENERATE RSTTKN option.
If you use the DTLUAPPL utility or the PowerExchange Navigator to generate restart tokens, edit the restart token file that PWXPC uses to specify the token values before you start the CDC session.

Restart Token File
You can use the restart token file to provide restart tokens for a new CDC session, or for a source that you add to an existing CDC session. You can also use the restart token file to override restart tokens for sources in an existing CDC session.
Specify the name and location of the restart token file in the following attributes of the source PWX CDC application connection:
¨ RestartToken File Folder
¨ RestartToken File Name
When you run a CDC session, PWXPC reads the restart token file in the folder specified in the RestartToken File Folder attribute of the source CDC connection. If this folder does not exist and the RestartToken File Folder attribute contains the default value of $PMRootDir/Restart, PWXPC creates this folder. PWXPC does not create any other restart token folder name.
PWXPC then verifies that the restart token file exists. If the file does not exist, PWXPC uses the name specified in the RestartToken File Name attribute to create an empty restart token file.
PWXPC stores restart tokens for CDC sessions at the following locations:
¨ For relational targets, in a state table in the target database
¨ For nonrelational targets, in a state file on the PowerCenter Integration Service machine
When you restart a CDC session, PWXPC reads the restart tokens for each source in the CDC session from the state table or file. PWXPC also reads the restart token file for the CDC session and overrides the restart tokens for any sources that have token values included in the file.
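The override behavior of the restart token file can be summarized as a simple merge. The following Python sketch is a conceptual model only: it does not read a real state table or restart token file, the function name and source names are hypothetical, and each restart token pair is modeled as a (sequence token, restart token) tuple of placeholder strings.

```python
def effective_restart_tokens(state_tokens, file_tokens):
    """Return the token pair that each source starts from.

    state_tokens: pairs recovered from the state table or state file.
    file_tokens: pairs supplied in the restart token file. These win for
        any source that appears in both, and they also supply pairs for
        sources that were newly added to the session.
    """
    merged = dict(state_tokens)
    merged.update(file_tokens)
    return merged


# Placeholder token values, for illustration only.
state = {"ORDERS": ("SEQ-A", "RST-A"), "ITEMS": ("SEQ-B", "RST-B")}
overrides = {"ITEMS": ("SEQ-X", "RST-X"), "CUSTOMERS": ("SEQ-Y", "RST-Y")}

for source, pair in sorted(effective_restart_tokens(state, overrides).items()):
    print(source, pair)
```

In this model ORDERS keeps its recovered pair, ITEMS starts from the override, and the newly added CUSTOMERS source starts from the pair given in the file.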

Recovery and Restart Processing for CDC Sessions

If you select Resume from the last checkpoint for the Recovery Strategy attribute in a CDC session that extracts change data from PowerExchange, PWXPC and PowerCenter provide recovery and restart processing for that session. In the event of a session failure, the PowerCenter Integration Service recovers the session state of operation, and PWXPC recovers the restart information.

PWXPC saves restart information for all sources in a CDC session. The restart information for CDC sessions, which includes the restart tokens, originates from PowerExchange on the system from which the change data is extracted.

You can include both relational and nonrelational targets in a single CDC session. PWXPC uses one of the following locations to store and retrieve restart information, based on the target type:
¨ Relational targets. Recovery state tables in the target databases. PWXPC, in conjunction with the PowerCenter Integration Service, commits both the change data and the restart tokens for that data in the same commit, which ensures that the applied data and the restart tokens are in sync.
¨ Nonrelational targets. Recovery state file in the shared location on the PowerCenter Integration Service machine. PWXPC, in conjunction with the PowerCenter Integration Service, writes the change data to the target files and then writes the restart tokens to the recovery state file. As a result, duplicate data might be applied to the targets when you restart failed CDC sessions.

The PowerCenter Integration Service saves the session state of operation and maintains target recovery tables. The PowerCenter Integration Service stores the session state of operation in the shared location that is specified in $PMStorageDir. The PowerCenter Integration Service saves relational target recovery information in the target database.

When you run a CDC session that uses a resume recovery strategy, PWXPC writes the following message to the session log to indicate that recovery is in effect:

PWXPC_12094 [INFO] [CDCRestart] Advanced GMD recovery in effect. Recovery is automatic.

When you recover or restart a CDC session, PWXPC uses the saved restart information to resume reading the change data from the point of interruption. The PowerCenter Integration Service restores the session state of operation, including the state of each source, target, and transformation. PWXPC, in conjunction with the PowerCenter Integration Service, determines how much of the source data it needs to reprocess. PowerExchange and PWXPC use the restart information to determine the correct point in the change stream from which to restart extracting change data and then applying it to the targets.

If you run a session with resume recovery strategy and the session fails, do not change the mapping, the session, or the state information before you restart the session. PowerCenter and PWXPC cannot guarantee recovery if you make any of these changes.

Restriction: If any of the targets in the CDC session use the PowerCenter File Writer to write CDC data to flat files, do not use a resume recovery strategy. Restart tokens for all targets in the CDC session, including relational targets, will be compromised if a flat file target is in the same session. Data loss or duplication might occur.

PowerCenter Recovery Tables for Relational Targets

When the PowerCenter Integration Service runs a session that has a resume recovery strategy, it writes to recovery tables on the target database system. When the PowerCenter Integration Service recovers the session, it uses information in the recovery tables to determine where to begin loading data to target tables. PWXPC uses information in the recovery tables to determine where to begin reading the change stream.

If you want the PowerCenter Integration Service to create the recovery tables, grant table creation privilege to the database user name configured in the target database connection. Otherwise, you must create the recovery tables manually.
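The relational-target behavior described in this section, where change data and the restart tokens for that data share one commit, can be illustrated with a minimal sketch. This is not how PWXPC or the PowerCenter Integration Service is implemented. It uses Python's built-in sqlite3 module, and the table names orders_target and restart_state, the application ID demo_app, and the token strings are all hypothetical.

```python
import sqlite3

def apply_batch(conn, rows, restart_token):
    """Apply change rows and the restart token for them in a single transaction."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        conn.executemany(
            "INSERT INTO orders_target (id, amount) VALUES (?, ?)", rows
        )
        conn.execute(
            "INSERT OR REPLACE INTO restart_state (appl_id, token) VALUES (?, ?)",
            ("demo_app", restart_token),  # hypothetical application ID
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_target (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE restart_state (appl_id TEXT PRIMARY KEY, token TEXT)")
conn.commit()

# First batch succeeds: data and token become visible together.
apply_batch(conn, [(1, 10.0), (2, 20.0)], "token-0002")

try:
    # The duplicate primary key 2 makes this batch fail partway through.
    apply_batch(conn, [(3, 30.0), (2, 99.0)], "token-0004")
except sqlite3.IntegrityError:
    pass  # the whole batch, including row 3 and the new token, is rolled back

row_count = conn.execute("SELECT COUNT(*) FROM orders_target").fetchone()[0]
saved_token = conn.execute("SELECT token FROM restart_state").fetchone()[0]
print(row_count, saved_token)
```

Because the failed batch leaves neither its rows nor its token behind, a restart from the saved token reapplies nothing that is already in the target. A nonrelational target cannot offer this guarantee, because the data write and the token write are two separate steps.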

For relational targets, the PowerCenter Integration Service creates the following recovery tables in the target database:
¨ PM_RECOVERY. Contains target load information for the session run. The PowerCenter Integration Service removes the information from this table after each successful session and initializes the information at the beginning of subsequent sessions.
¨ PM_TGT_RUN_ID. Contains information the PowerCenter Integration Service uses to identify each target on the database. The information remains in the table between session runs. If you manually create this table, you must create a row and enter a value other than zero for LAST_TGT_RUN_ID to ensure that the session recovers successfully.
¨ PM_REC_STATE. Contains state and restart information for CDC sessions. PWXPC stores the application name and restart information for all sources in the CDC session. The PowerCenter Integration Service stores any state information for the session. Unlike the session state information, restart information persists in this table across successful sessions. The PowerCenter Integration Service updates it with each commit to the target tables.

If you edit or drop the recovery tables before you recover a session, the PowerCenter Integration Service cannot recover the session. Also, PWXPC cannot restart the CDC session from the point of interruption.

If you disable recovery, the PowerCenter Integration Service does not remove the recovery information from the target database. Also, PWXPC no longer updates the restart information in the target database.

Recovery State Table

The recovery state table, PM_REC_STATE, contains state and CDC restart information for a CDC session. This table resides in the same target database as the target tables.

The PowerCenter Integration Service creates an entry in the state table for each CDC session. These entries can comprise more than one row. CDC sessions with heterogeneous target tables have state table entries in each unique relational target database and an entry in a state file on the PowerCenter Integration Service machine for each nonrelational target. For example, a CDC session that targets Oracle and SQL Server tables and a MQ Series queue has an entry in the state table in the target Oracle database, in the state table in the target SQL Server database, and in the state file on the PowerCenter Integration Service machine.

Each session entry in a state table contains a number of repository identifiers and execution state data such as the checkpoint number and CDC restart information. The following columns can contain PWXPC-specific restart information:
¨ APPL_ID. Contains the value that PWXPC creates by appending the task instance ID of the CDC session to the value that you specify in the Application Name attribute in the source PWX CDC application connection. When this value matches an APPL_ID value for a row in the state table, the PowerCenter Integration Service, in conjunction with PWXPC, selects the row from the state table for the CDC session.
¨ STATE_DATA. Contains the restart information for the session in a variable-length, 1,024-byte binary column. When the PowerCenter Integration Service commits change data to the target tables, it also commits the restart information for that data in this column. PWXPC uses the restart information from this column to perform restart processing for the CDC session.

If the amount of restart information for a session exceeds 1,024 bytes, the PowerCenter Integration Service adds additional rows to accommodate the remainder of the restart information. For each row added, the PowerCenter Integration Service increases the value of the SEQ_NUM column by one, starting from zero.

PowerCenter Recovery Files for Nonrelational Targets

If you configure a resume recovery strategy for a CDC session, the PowerCenter Integration Service stores the session state of operation in the shared location, $PMStorageDir, on the PowerCenter Integration Service machine. For nonrelational targets, the PowerCenter Integration Service also stores the target recovery status in a recovery state file in the shared location on the PowerCenter Integration Service machine.

PWXPC stores the restart information for nonrelational target files in this state file.

Recovery State File

For all nonrelational targets in a session, the PowerCenter Integration Service uses a recovery state file on the PowerCenter Integration Service machine. Nonrelational target files include MQ Series message queues, PowerExchange nonrelational targets, and other PowerCenter nonrelational targets.

CDC sessions with heterogeneous target tables have state table entries in each unique relational target database and an entry in a state file on the PowerCenter Integration Service machine for each nonrelational target.

The PowerCenter Integration Service creates the recovery state file in the shared location, $PMStorageDir. The file name has the following prefix:

pm_rec_state_appl_id

PWXPC creates the value for the appl_id variable in the file name by appending the task instance ID of the CDC session to the value that you specify in the Application Name attribute in the source PWX CDC application connection. The PowerCenter Integration Service uses various task and workflow repository attributes to complete the file name. The message CMN_65003, which the PowerCenter Integration Service writes to the session log, contains the complete file name.

Application Names

PWXPC, in conjunction with the PowerCenter Integration Service, uses the application name you specify as part of the key when it stores and retrieves the restart information for the CDC session. When you configure the PWX CDC application connection for each CDC session, specify a unique value in the Application Name attribute.

PWXPC appends the repository task instance ID for the CDC session to the Application Name value to create the APPL_ID value in the recovery state table and the appl_id portion in the recovery state file name.

Because the value of the APPL_ID column and the recovery state file name contain the task instance ID for the session, changes to the CDC session such as adding and removing sources or targets affect restart processing. When you change the CDC session to add or remove sources and targets, you must use the restart token file to provide restart tokens and then cold start the CDC session.

Restart Processing for CDC Sessions

Each source in a CDC session has its own restart point. The method you use to start a CDC session controls how PWXPC determines the restart information for the sources in that session.

Use one of the following methods to start CDC sessions:
¨ Cold start. When you cold start a CDC session, PWXPC uses the restart token file to acquire restart tokens for all sources. PWXPC does not read the state table or file and makes no attempt to recover the session. The CDC session continues to run until stopped or interrupted.
¨ Warm start. When you warm start a CDC session, PWXPC reconciles the restart tokens for sources provided in the restart token file, if any, with any restart tokens that exist in the state tables or file. If necessary, PWXPC performs recovery processing. The session continues to run until stopped or interrupted.
¨ Recovery start. When you recover a CDC session, PWXPC reads the restart tokens from any applicable state tables and file. If necessary, PWXPC performs recovery processing. PWXPC then updates the restart token file with the restart tokens for each source in the CDC session, and the session ends.

Before you run a CDC session for the first time, you should create and populate the restart token file with restart tokens for each source in the session. Each restart token pair should match a point in the change stream where the source and target are in a consistent state. For example, you materialize a target table from a source and do not change the source data after materialization.

To establish a starting extraction, or restart, point in the change stream, code a special override statement with the CURRENT_RESTART option in the restart token file that has the file name that you specified in the PWX CDC application connection in the CDC session. When you cold start the CDC session, PWXPC requests that PowerExchange use the current end-point in the change stream as the extraction start point. After the CDC session starts, you can resume change activity to the sources.

If you cold start a CDC session and a restart token file does not exist, the PowerCenter Integration Service still runs the session. Because you did not provide any restart information, PWXPC passes null restart tokens for all sources to PowerExchange and indicates that the restart tokens for each source are NULL in message PWXPC_12060. PowerExchange then assigns the default restart point to each source.

Warning: If you use null restart tokens, the CDC session might not produce the correct results. When you cold start CDC sessions, provide valid restart tokens.

Default Restart Points for Null Restart Tokens

The default restart points that PowerExchange uses when it receives null restart tokens vary by data source type.

The following table describes the default restart points for null restart tokens, by data source type and extraction method:

Data Source Type: All MVS sources
  Batch and Continuous Extraction Mode: Oldest condense file, as recorded in the CDCT.
  Real-time Extraction Mode: Best available restart point as determined by the PowerExchange Logger for MVS, which is one of the following:
  - Oldest restart point for which an archive log is available
  - Current active log if there are no available archive logs

Data Source Type: DB2 for i5/OS
  Batch and Continuous Extraction Mode: Oldest condense file, as recorded in the CDCT.
  Real-time Extraction Mode: Oldest journal receiver still attached on the journal receiver chain.

Data Source Type: DB2 for Linux, UNIX, and Windows
  Batch and Continuous Extraction Mode: Oldest PowerExchange Logger for Linux, UNIX, and Windows log file, as recorded in the CDCT.
  Real-time Extraction Mode: Current log position at the time the PowerExchange capture catalog was created.

Data Source Type: Microsoft SQL Server
  Batch and Continuous Extraction Mode: Oldest PowerExchange Logger for Linux, UNIX, and Windows log file, as recorded in the CDCT.
  Real-time Extraction Mode: Oldest data available in the Publication database.

Data Source Type: Oracle
  Batch and Continuous Extraction Mode: Oldest PowerExchange Logger for Linux, UNIX, and Windows log file, as recorded in the CDCT.
  Real-time Extraction Mode: Current Oracle catalog dump.

PowerExchange uses the default restart point only if all sources in a CDC session have null restart tokens. If some sources have non-null restart tokens, PWXPC assigns the oldest restart point from those tokens to any sources for which no restart tokens are specified.

For example, a new CDC session contains the sources A, B, and C. The restart token file contains restart tokens for sources A and B. The restart point for source A is older than that for source B. Source C does not have existing or supplied restart tokens. Because some sources in the CDC session have explicit restart points, PWXPC does not assign null restart tokens to source C. Instead, PWXPC assigns the restart point for source A to source C because this restart point is the oldest one supplied.

Determining the Restart Tokens for Cold Start Processing

When you cold start a CDC session, PWXPC uses the restart token file to determine the restart tokens for all sources. PWXPC ignores any entries in the state tables or state file for the sources in the CDC session.
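The A, B, and C example above can be sketched in a few lines of Python. This is only an illustration of the assignment rule, not PWXPC code. It assumes that restart points can be represented as comparable integers where a smaller value is an older point. Real restart tokens are opaque values, and the function name resolve_cold_start is hypothetical.

```python
from typing import Dict, List, Optional

def resolve_cold_start(
    sources: List[str], supplied: Dict[str, int]
) -> Dict[str, Optional[int]]:
    """Assign a restart point to every source for a cold start.

    supplied maps a source name to the restart point given in the restart
    token file. Smaller integers stand for older points (an assumption).
    """
    if not supplied:
        # No tokens at all: every source gets a null token, and the
        # default restart point for the source type would apply.
        return {name: None for name in sources}
    oldest = min(supplied.values())
    # Sources without a supplied token inherit the oldest supplied point.
    return {name: supplied.get(name, oldest) for name in sources}

tokens = resolve_cold_start(["A", "B", "C"], {"A": 100, "B": 250})
all_null = resolve_cold_start(["A", "B", "C"], {})
print(tokens)
print(all_null)
```

In the first call, source C receives the restart point of source A because that point is the oldest one supplied. In the second call, no tokens are supplied, so all three sources receive null tokens.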

More specifically, when you cold start a CDC session, PWXPC uses one of the following methods to determine the restart tokens:
¨ If the restart token file is empty or does not exist, PWXPC assigns null restart tokens to all sources in the CDC session.
¨ If the restart token file contains only explicit override statements, PWXPC performs the following processing:
  - Assigns the restart tokens in the explicit override statements to the specified sources.
  - Assigns the oldest supplied restart point to any sources for which an explicit override statement was not specified.
¨ If the restart token file contains only the special override statement, PWXPC assigns the restart tokens in the special override statement to all sources.
¨ If the restart token file contains a special override statement and explicit override statements, PWXPC performs the following processing:
  - Assigns the restart tokens in the explicit override statements to the specified sources.
  - Assigns the restart tokens in the special override statement to all remaining sources.

Determining the Restart Tokens for Warm Start Processing

When you warm start a CDC session, PWXPC uses the state tables and state file, in conjunction with restart token file, to determine the restart tokens for all sources.

More specifically, PWXPC uses one of the following methods to determine the restart tokens:
¨ If the restart token file is empty or does not exist and there is no matching entry in a state table or state file, PWXPC assigns null restart tokens to all sources in the session.
¨ If the restart token file is empty or does not exist and if some but not all sources have a matching entry in a state table or a state file, PWXPC performs the following processing:
  - Assigns any restart tokens found in a state table and state file to the appropriate sources.
  - Assigns the oldest available restart point to all sources that do not have restart tokens.
¨ If the restart token file is empty or does not exist and if all sources have an entry in a state table or state file, PWXPC uses the restart tokens from the state tables or state file.
¨ If the restart token file contains explicit override statements and no sources have a matching entry in a state table or no state file, PWXPC performs the following processing:
  - Assigns the restart tokens in the explicit override statements to the specified sources.
  - Assigns the oldest supplied restart point to all sources that do not have restart tokens.
¨ If the restart token file contains explicit override statements and if some but not all sources have a matching entry in a state table or a state file, PWXPC performs the following processing:
  - Assigns the restart tokens in the explicit override statements to the specified sources.
  - Assigns restart tokens from a state table or state file to the appropriate sources, provided that the tokens have not been supplied in the restart token file.
  - Assigns the oldest available restart point to all sources that do not have restart tokens supplied in the restart token file or from a state table or state file.
¨ If the restart token file contains explicit override statements and if all sources have an entry in a state table or a state file, PWXPC performs the following processing:
  - Assigns the restart tokens in the explicit override statements to the specified sources.
  - Assigns the restart tokens from state tables or the state file to all remaining sources that do not have restart tokens supplied in the restart token file.

¨ If the restart token file contains only the special override statement, PWXPC assigns the restart tokens in the special override statement to all sources.
¨ If the restart token file contains a special override statement and explicit override statements, PWXPC performs the following processing:
  - Assigns the restart tokens in the explicit override statements to the specified sources.
  - Assigns the restart tokens in the special override statement to all remaining sources.

Group Source Processing in PowerExchange

When you extract change data using PWX CDC application connections, PowerExchange uses group source processing for all source definitions that you include in a single mapping. With group source processing, PowerExchange reads data from the same physical source in a single pass. This processing enhances throughput and reduces resource consumption by eliminating multiple reads of the source data.

When you run a CDC session, PWXPC passes a source interest list that contains all of the sources. PowerExchange uses the source interest list to determine the sources for which to read data from the change stream. When PowerExchange encounters changes for a source in the interest list, it passes the change data to PWXPC. PWXPC then provides the change data to the appropriate source in the mapping.

If you use PWXPC connections for bulk data movement operations, PowerExchange uses group source processing for the following multiple-record, nonrelational data sources:
¨ IMS unload data sets
¨ Sequential data sets and flat files
¨ VSAM data sets

PowerExchange uses group source processing to read all records for a single multi-group source qualifier in a mapping. When you run a bulk data movement session, PWXPC passes PowerExchange the source data map information from the source definition metadata, which includes the data set or file name if available. If PWXPC does not pass the data set or file name, PowerExchange determines it from the PowerExchange data map. PowerExchange reads the data set or file and passes all of the data records to PWXPC. PWXPC then provides the data records to the appropriate source record type in the multi-group source qualifier.

Using Group Source with Nonrelational Sources

PowerExchange can use group source processing for some nonrelational data sources that support multiple record types in a single file. A single mapping can contain one or more multi-record source definitions and single-record source definitions. If you use PWX NRDB Batch application connections, PWXPC creates a connection to PowerExchange for each source definition in the mapping and reads the source data.

For data sources with multiple record types, the PowerExchange data map defines a record and a table for each unique record type. The table represents the relational view of the related record.

For IMS, VSAM, and sequential or flat file data sources, you can use Designer to import data maps with multiple record types to create PowerCenter source definitions. If you want the source definition to represent only a single record type, import a single table from the data map. If you want the source definition to include all record types, import the data map as a multi-record data map.

To import the data map as a multi-record data map, select Multi-Record Datamaps in the Import from PowerExchange dialog box. If you import a multi-record data map, the source definition has a group for each table in the data map.

A group contains metadata for the fields in the table. If you import a single table from a multi-record data map, the source definition has only a single group.

When you run a session that contains a mapping with source definitions for each table in a multi-record data map, PowerExchange reads the data set or file once for each source definition. When you run a session that contains a mapping with a single source definition for all records in a multi-record data map, PowerExchange uses group source processing to read all of the records in the data set or file in a single pass.

For example, if you have a sequential file that contains three different record types, you can create a source definition for each record type. Then create a mapping that contains the three source definitions. When you run a session that contains the mapping, PowerExchange reads the sequential file three times.

Alternatively, if you import the data map as a multi-record data map and create a single multi-record source definition, you can use this multi-record source definition in a mapping. When you run a session that contains this mapping, PowerExchange reads the sequential file one time to extract the data.

When you import IMS data maps as multi-record data maps, you can use the source definitions only to process IMS unload data sets. You cannot use multi-record IMS source definitions to read all segments from an IMS database in a single pass. To perform bulk data movement operations on IMS databases, create mappings that have a source definition for each segment in the IMS database.

Using Group Source with CDC Sources

When you use PWX CDC application connections to extract change data, PowerExchange automatically uses group source processing and reads the change stream in a single pass for all source definitions in the mapping. All sources in the mapping must be the same data source type and must read the same change stream.

To create source definitions in Designer that can be used to extract change data, import source metadata by using one of the following methods:
¨ Import a PowerExchange extraction map by using the Import from PowerExchange dialog box.
¨ Import the table definitions from relational databases, by using either the Import from PowerExchange dialog box or the Import from Database dialog box.

Restriction: To read change data for nonrelational sources, you must import extraction maps from PowerExchange.

Informatica recommends that you use extraction maps to create source definitions for all CDC sources. When you create source definitions from extraction maps, the mapping and session creation process is simpler for the following reasons:
¨ The source definition contains the extraction map name, which eliminates the need to provide it when you configure the session.
¨ The source definition contains the PowerExchange-defined CDC columns, which eliminates the need to add these columns to the source definition. The PowerExchange-defined columns include the change indicator and before image columns as well as the DTL__CAPX columns.

When you extract change data, PowerExchange uses group source processing for all source definitions in the mapping. All source definitions must be for the same data source type, such as DB2, IMS, VSAM, or Oracle. Do not include multiple data source types in the mapping. Otherwise, the session fails with message PWXPC_10080.

For example, you cannot run a CDC session that contains a mapping with both VSAM and IMS source definitions, even though the change stream is the same. To extract change data for both IMS and VSAM data sources, create a unique mapping and session for the VSAM sources and a separate, unique mapping and session for the IMS sources. PowerExchange reads the change stream twice, once for the session with VSAM sources and once for the session with IMS sources.
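The single-pass idea behind group source processing can be sketched as follows. This is a simplified illustration, not PowerExchange code. The stream is modeled as a list of (source name, record) pairs, and the function name read_single_pass and the table names T1, T2, and T9 are hypothetical.

```python
from collections import defaultdict

def read_single_pass(change_stream, interest_list):
    """Read the stream once and route each record to its source.

    change_stream is an iterable of (source_name, record) pairs.
    interest_list is the set of source names that the session asked for.
    Returns the routed records and the number of stream records read.
    """
    routed = defaultdict(list)
    records_read = 0
    for source_name, record in change_stream:
        records_read += 1
        if source_name in interest_list:
            # Changes for sources outside the interest list are skipped.
            routed[source_name].append(record)
    return dict(routed), records_read

stream = [("T1", "ins 1"), ("T2", "upd 7"), ("T9", "del 3"), ("T1", "upd 1")]
routed, records_read = read_single_pass(stream, {"T1", "T2"})
print(routed, records_read)
```

Each stream record is read once regardless of how many sources are in the interest list. Reading the stream separately for each source would multiply the read count by the number of sources.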

If you create a workflow that contains multiple CDC sessions, PowerExchange uses a connection for each session, even if the sessions extract change data from the same change stream, such as the PowerExchange Logger for MVS.

The following example mapping shows three DB2 sources, for which the source definitions were created from extraction maps:

[Figure: example mapping with three DB2 sources]

If you include this mapping in a session that uses a PWX DB2zOS CDC application connection, PowerExchange uses group source processing to read the change stream and extract the changes for all three source tables. PowerExchange extracts change data in chronological order, based on when the UOWs were completed. PowerExchange passes the change data to PWXPC, and PWXPC provides the changes to the appropriate source qualifier.

Note: Because the example mapping uses source definitions created from extraction maps, it cannot be used for bulk data movement operations. However, mappings that use source definitions created from database relational metadata can be used for either change data extraction or bulk data movement.

Commit Processing with PWXPC

The PowerCenter Integration Service, in conjunction with PWXPC, commits data to the target based on commit properties and the commit type. Commit properties specify the commit interval and the number of UOWs or change records that you want to use as a basis for the commit. The commit type determines when the PowerCenter Integration Service commits data to the target.

By default, the Commit Type attribute on the session Properties tab specifies Target, which indicates target-based commit processing. For CDC sessions, the PowerCenter Integration Service always uses source-based commit processing, and PWXPC controls the timing of commit processing. When you run a CDC session that specifies target-based commit processing, the PowerCenter Integration Service automatically changes the commit type to source-based and writes message WRT_8226 in the session log. PWXPC ignores the Commit Interval attribute.

To control commit processing, configure attributes on the PWX CDC Change and Real Time application connections.

RELATED TOPICS:
¨ “Commitment Control Options”

Controlling Commit Processing

To control commit processing, you can specify certain PWX CDC Real Time or Change application connection attributes.

The following table describes the connection attributes that control commit processing:

Connection Attribute: Maximum Rows Per commit
  Real Time or Change Connections: Both
  Description: Maximum number of change records that PWXPC processes before it flushes the data buffer to commit the change data to the targets. If necessary, PWXPC continues to process change records across UOW boundaries until this maximum rows threshold is met. PWXPC does not wait for a UOW boundary to commit the change data. Default is 0, which means that PWXPC does not use maximum rows.

Connection Attribute: Minimum Rows Per commit
  Real Time or Change Connections: Real Time
  Description: Minimum number of change records that PowerExchange reads from the change stream before it passes any commit records in the change stream to PWXPC. Before reaching this minimum value, PowerExchange skips commit records and passes only the change records to PWXPC. Default is 0, which means that PowerExchange does not use minimum rows.

Connection Attribute: Real-time Flush Latency in milli-seconds
  Real Time or Change Connections: Real Time
  Description: Number of milliseconds that must pass before PWXPC flushes the data buffer to commit the change data to the targets. When this latency period expires, PWXPC continues to read the changes in the current UOW until the end of that UOW is reached. Then, PWXPC flushes the data buffer to commit the change data to the targets. Default is 0, which means that PWXPC uses 2,000 milliseconds.

Connection Attribute: UOW Count
  Real Time or Change Connections: Both
  Description: Number of UOWs that PWXPC processes before it flushes the data buffer to commit the change data to the targets. Default is 1.

You can specify values for all of these commitment control attributes. However, PWXPC commits change data only when one of the following values is met:
¨ Maximum Rows Per commit
¨ Real-time Flush Latency in milli-seconds
¨ UOW Count

If you specify a value for the Minimum Rows Per commit attribute, this threshold must be met before a commit can occur. However, PWXPC flushes the data buffer to commit the change data to the targets only when Maximum Rows Per commit, Real-time Flush Latency in milli-seconds, or UOW Count is met, whichever is first.

After PWXPC commits the change data, it resets the UOW count, the maximum and minimum rows, and the real-time flush latency timer. PWXPC continues to read change data. Whenever one of the commitment control values is met, PWXPC commits that data to the targets. Commit processing continues until the CDC session is stopped, ends, or terminates abnormally. When the PWXPC CDC reader ends normally, PWXPC issues a final commit to flush all complete, buffered UOWs and their final restart tokens to the targets.

which results in many. the final restart tokens might represent a point in the change stream that is earlier than final change data that the PowerCenter Integration Service commits to the targets. larger UOWs more efficiently than many small UOWs. you can use the Maximum Rows Per commit attribute to specify the maximum number of change records that PWXPC reads before it commits the change data to the targets. PowerExchange and PWXPC can process fewer.000 changes before it issues a commit. set the Commit Type attribute to Source and disable the Commit On End Of File attribute. PowerExchange passes the next commit record to PWXPC and then resets the minimum rows counter. This attribute causes PWXPC to commit change data without waiting for a UOW boundary. PowerExchange discards any commit records that it reads from the change stream and passes only change records to PWXPC. if you have an application that makes 100. to the targets. 120 Chapter 7: Introduction to Change Data Extraction . small UOWs in the change stream. By using a subpacket commit for large UOWs. the RDBMS can release the locks in the target databases for these change records and the PowerCenter Integration Service can reuse the buffer space for new change records. duplicate data can occur in the targets because the PowerCenter Integration Service commits any remaining change data in the buffer to the targets. When the maximum rows limit is met. the PWXPC CDC reader writes the following message to the session log: PWXPC_12075 [INFO] [CDCRestart] Session complete. along with their restart tokens. You can use these attributes to mitigate the effects of processing very small or very large UOWs. which is called a subpacket commit. Next session will restart at: [restart1_token] : Restart 2 [restart2_token] Restart 1 Restriction: If you enable the Commit On End Of File attribute on the session Properties tab. The Minimum Rows Per commit attribute controls the size of the UOWs read from the change stream. 
After the minimum rows limit is met. This final commit by the PowerCenter Integration Service occurs after the PWXPC CDC reader has committed all complete UOWs in the buffer. Prior to ending. you should use the maximum rows attribute only if you have large UOWs that cannot be processed without impacting either the PowerCenter Integration Service machine or the target databases. if you use the minimum rows limit to increase the size of UOWs. you can improve CDC processing efficiency. Online transactions that run in transaction control systems such as CICS and IMS often commit after making only a few changes. Warning: Because PWXPC can commit change data to the targets between UOW boundaries. Generally. PowerExchange simply skips some of the original commit records in the change stream. Until the minimum rows value is met. you can minimize storage use on the PowerCenter Integration Service machine and lock contention on the target databases. Use this attribute to specify the minimum number of change records that PowerExchange must pass to PWXPC before passing a commit record. As a result.flush all complete. buffered UOWs and their final restart tokens to the targets. you can use the Minimum Rows Per commit attribute to create larger UOWs of a more uniform size. Therefore. Maximum Rows per Commit If you have very large UOWs. Minimum Rows per Commit If your change data has many small UOWs. Do not use this connection attribute if you have targets in the CDC session with RI constraints. For example. After the commit processing.000 change records. A minimum rows limit does not impact the relational integrity of the change data because PowerExchange does not create new commits points in the change stream data. PWXPC flushes the change data from the buffer on the PowerCenter Integration Service machine and commits the data to the targets. relational integrity (RI) might be compromised. To prevent possible duplicate data when you restart CDC sessions. 
you can use the maximum rows attribute to commit the change data before PWXPC reads all 100. Maximum and Minimum Rows per Commit The Maximum Rows Per commit attribute controls the size of the UOWs written to the targets.
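The effect of the two row-based attributes can be sketched with a small simulation. This is a simplified model under stated assumptions, not the PWXPC or PowerExchange implementation. It ignores the real-time flush latency timer, models the change stream as a list of "R" (change record) and "C" (commit record) markers, and uses the hypothetical function names coalesce_commits and commit_points.

```python
def coalesce_commits(events, min_rows):
    """Model Minimum Rows Per commit on a stream of "R" and "C" markers.

    Commit records are skipped until at least min_rows change records
    have passed. No new commit points are ever created.
    """
    passed, rows_seen = [], 0
    for event in events:
        if event == "C":
            if rows_seen >= min_rows:
                passed.append(event)  # threshold met: pass this commit on
                rows_seen = 0         # and reset the minimum rows counter
        else:
            passed.append(event)
            rows_seen += 1
    return passed

def commit_points(uow_sizes, max_rows=0, uow_count=1):
    """Model Maximum Rows Per commit and UOW Count (timer not modeled).

    Returns the running record totals at which a commit would be issued.
    A value of 0 disables the corresponding attribute.
    """
    points, total, rows_pending, uows_pending = [], 0, 0, 0
    for size in uow_sizes:
        for _ in range(size):
            total += 1
            rows_pending += 1
            if max_rows and rows_pending >= max_rows:
                points.append(total)  # subpacket commit inside the UOW
                rows_pending, uows_pending = 0, 0
        uows_pending += 1
        if uow_count and uows_pending >= uow_count:
            if rows_pending:          # nothing to flush if just committed
                points.append(total)
            rows_pending, uows_pending = 0, 0
    return points

small_uows = ["R", "R", "C"] * 4          # four UOWs of two records each
merged = coalesce_commits(small_uows, min_rows=5)
points = commit_points([1000, 1000], max_rows=300, uow_count=1)
print(merged.count("C"), points)
```

With a minimum of 5 rows, only the third of the four original commit records is passed on, so the first three small UOWs reach the reader as one larger UOW. With a maximum of 300 rows, each 1,000-record UOW is committed in four pieces, three of which fall between UOW boundaries, which is why the warning about RI constraints applies.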

Target Latency

Target latency is the total time that PWXPC uses to extract change data from the change stream and that the PowerCenter Integration Service uses to apply that data to the targets. If this processing occurs quickly, target latency is low.

The values you select for the commitment control attributes affect target latency. You must balance target latency requirements with resource consumption on the PowerCenter Integration Service machine and the target databases.

Lower target latency results in higher resource consumption because the PowerCenter Integration Service must flush the change data more frequently and the target databases must process more commit requests.

You can affect target latency by setting the commit control attributes.

The following default values can result in the lowest latency:
¨ 0 for Maximum Rows Per commit, which disables this option
¨ 0 for Minimum Rows Per commit, which disables this option
¨ 0 for Real-time Flush Latency in milli-seconds, which is equivalent to 2000 milliseconds or 2 seconds
¨ 1 for UOW Count

These values can decrease target latency because PWXPC commits changes after each UOW, or on UOW boundaries. However, these values also cause the highest resource consumption on the source system, the PowerCenter Integration Service machine, and the target databases. Alternatively, these values might decrease throughput because change data flushes too frequently for the PowerCenter Integration Service or the target databases to handle.

To lower resource consumption and potentially increase throughput for CDC sessions, specify a value greater than the default value for only one of the following attributes:
¨ Maximum Rows Per commit
¨ UOW Count
¨ Real-time Flush Latency in milli-seconds

Disable the unused attributes.

Examples of Commit Processing

The following examples show how the commitment control attributes affect commit processing with PWXPC.

Subpacket Commit and UOW Count - Example

This example uses the Maximum Rows Per commit and UOW Count attributes to control commit processing. The change data is composed of UOWs of the same size. Each UOW contains 1,000 change records. The commitment control attributes have the following values:
¨ 300 for Maximum Rows Per commit
¨ 0 for Minimum Rows Per commit, which disables this attribute
¨ 0 for Real-time Flush Latency in milli-seconds, which is equivalent to 2 seconds
¨ 1 for UOW Count

Based on the maximum rows value, PWXPC flushes the data buffer after reading the first 300 records in a UOW. This action commits the change data to the targets. PWXPC continues to commit change data to the targets every 300 records.

which disables this attribute ¨ 0 for Minimum Rows Per commit. The commitment control attributes have the following values: ¨ 0 for Maximum Rows Per commit. PWXPC reads the next 1. In this example. PWXPC does not commit change data to the targets because the UOW counter was reset to 0 after the last commit. which is less than the real-time flush latency timer. When PWXPC reaches UOW 1. Because all of the UOWs have the same number of change records. PWXPC then resets both the UOW counter and real-time flush latency timer. which is equivalent to 5 seconds ¨ 1000 for UOW Count Initially. When the end of the UOW is read. PWXPC commits this change data to the target because the UOW counter has been met. which disables this attribute ¨ 100 for Minimum Rows Per commit ¨ -1 for Real-time Flush Latency in milli-seconds. PWXPC commits the change data because the UOW Count value is 1.Example This example uses the UOW Count and Real-time Flush Latency in milli-seconds attributes to control commit processing. The change data consists of UOWs of the same size. Because the real-time flush latency interval has expired. PWXPC then resets the real-time flush latency timer and the UOW counter. PWXPC reads 900 complete UOWs in 5 seconds. PWXPC commits change data at the following points: ¨ After UOW 900 because the real-time latency flush latency timer matched first ¨ After UOW 1. If the real-time flush latency interval expires before PWXPC reads 300 change records. The commitment control attributes have the following values: ¨ 0 for Maximum Rows Per commit. PWXPC flushes the data buffer to commit the change data to the targets.000.000 change records based on the UOW count value UOW Count and Time-Based Commits . whichever limit is met first. Each UOW contains ten change records. PWXPC resets the UOW and maximum row counters and the real-time flush latency timer each time it commits. 
PWXPC still commits based on the maximum rows value because that threshold is met before a UOW boundary occurs. PWXPC continues to read change data and commit the data to the targets. PWXPC continues to read change data and to commit the data to the targets at the same points in each UOW. After this commit. In this example.000 UOWs in 4 seconds. based on the UOW count or the realtime flush latency flush time.PWXPC commits on UOW boundaries only for the UOW count and real-time flush latency interval. PWXPC commits change data at the following points: ¨ 300 change records based on the maximum rows value ¨ 600 change records based on the maximum rows value ¨ 900 change records based on the maximum rows value ¨ 1.Example This example uses the Minimum Rows Per commit and UOW Count attributes to control commit processing. which disables this attribute ¨ 5000 for Real-time Flush Latency in milli-seconds. The change data consists of UOWs of varying sizes. which is disables this attribute ¨ 10 for UOW Count 122 Chapter 7: Introduction to Change Data Extraction .900 because the UOW count matched first during the second commit cycle Minimum Rows and UOW Count .

such as populating change-indicator and before-image columns or running expressions. PWXPC commits change data after 1. PWXPC increases the UOW counter to one. So.PWXPC passes the minimum rows value to PowerExchange and requests change data from the change stream. You can use CDC offload processing to distribute processing to the PowerCenter Integration Service machine running the extraction. When PowerExchange reads the last change record in the tenth UOW. At this point. you can use CDC offload processing. Offload Processing You can use CDC offload processing and multithreaded processing to improve performance and efficiency of realtime CDC sessions. which is also after every 10 UOWs because each UOW contains 10 change records and the UOW Count is 10. You can use multithreaded processing to increase parallelism on the PowerCenter Integration Service machines. CDC offload processing moves the column-level and UOW Cleanser processing to the PowerCenter Integration Service machine running the extraction. PowerExchange also performs any data manipulation operations that you defined in the extraction map. PowerExchange also runs the UOW Cleanser to reconstruct complete UOWs from the change data in the change stream on the system. and Windows to copy change data to PowerExchange Logger log files on a remote system. PowerExchange passes the commit record for the tenth UOW to PWXPC and resets the minimum rows counter. and Oracle sources. PowerExchange performs column-level processing on the system on which the changes are captured. You can also use CDC offload processing to copy change data to a remote system by using the PowerExchange Logger for LINUX. By default. For MVS. DB2 for i5/OS. To reduce the overhead of column-level and UOW Cleanser processing. UNIX. You can then extract the change data from the remote system rather than the original source system. CDC Offload Processing When you extract change data. 
This column-level processing of change data occurs in the PowerExchange Listener and can be CPU-intensive. ¨ You have insufficient resources on the machine where the change data resides to provide the necessary throughput you require. the minimum rows limit is met. PowerExchange skips the commit records of the first nine UOWs. PowerExchange maps the captured data to the columns in the extraction map. ¨ You have spare cycles on the PowerCenter Integration Service machine and those cycles are cheaper than the cycles on the machine on which the changes are captured. UNIX. PowerExchange and PWXPC continue to read the change data until the UOW counter is 10. CDC offload processing can also be used by the PowerExchange Logger for Linux. and Windows. Offload Processing 123 . In this example.000 change records. PWXPC flushes the data buffer to commit the change data to the targets and resets the UOW counter. Use CDC offload processing to help increase concurrency and throughput and decrease costs in the following situations: ¨ You have insufficient resources on the machine where the change data resides to run the number of concurrent extraction sessions you require. Because the minimum rows value is 100. which reduces processing on the source system.

By default. then multithreaded processing may provide increased throughput. If you use multithreaded processing. PowerExchange merges the threads and passes the UOW to the PWXPC CDC reader for processing. PowerExchange performs column-level processing on the change stream as a single thread. PowerExchange might be able to extract changes faster and more efficiently by processing more than one UOW simultaneously. you can also use multithreaded processing. PowerExchange multithreaded processing splits a UOW into multiple threads on the PowerCenter Integration Service machine. Multithreaded processing works most efficiently when PowerExchange on the source machine is supplying data fast enough to take full advantage of the multiple threads on the PowerCenter Integration Service machine. which might improve help improve throughput even more. After the column-level processing completes. If PowerExchange completely utilizes a single processor on the PowerCenter Integration Service machine. 124 Chapter 7: Introduction to Change Data Extraction .Multithreaded Processing If you use CDC offload processing for change data extractions.

CHAPTER 8

Extracting Change Data

This chapter includes the following topics:
• Overview of Extracting Change Data
• Task Flow for Extracting Change Data
• Testing a Change Data Extraction
• Configuring PowerCenter CDC Sessions
• Creating Restart Tokens for Extractions
• Displaying Restart Tokens
• Configuring the Restart Token File

Overview of Extracting Change Data

Use PowerExchange in conjunction with PWXPC and PowerCenter to extract captured change data and write the data to one or more targets.

To extract change data that PowerExchange captures, you must import metadata for the CDC sources and the targets of the change data in Designer. After creating the source and target definitions in Designer, you must create a mapping and then an application connection, session, and workflow in Workflow Manager. You can create multiple mappings, sessions, and workflows based on the same source and target definitions, if appropriate.

For relational data sources, you can import the metadata from either database definitions or PowerExchange extraction maps. For nonrelational sources, you must import PowerExchange extraction maps.

Tip: Informatica recommends that you import the metadata from PowerExchange extraction maps instead of from database definitions. When you import extraction maps, the source definition contains all of the PowerExchange-generated CDC columns, such as the before image (BI) and change indicator (CI) columns. Additionally, PWXPC derives the extraction map name from the source definition so you do not need to code the extraction map name for each source in the session properties.

Before starting a CDC session, you should create restart tokens to define an extraction start point in the change stream. Restart tokens might also be required for resuming extraction processing in a recovery scenario.

To stop a CDC session using real-time extraction mode based on certain user-defined events, you can configure event table processing.

Also, you can offload column-level extraction processing and any UOW Cleanser processing from the source system to the following remote locations:
• PowerCenter Integration Service machine
• A remote machine where the PowerExchange Logger for Linux, UNIX, and Windows runs

If you use offload processing with real-time extractions, you can also use multithreaded processing.

Task Flow for Extracting Change Data

Perform the following tasks in the PowerExchange Navigator, PowerCenter Designer, and PowerCenter Workflow Manager to configure and start extraction processing.

Before you begin, complete configuration of the data source and PowerExchange for CDC, and create capture registrations in the PowerExchange Navigator.

1. Edit the extraction map if necessary. You can make the following changes:
   • Deselect any column for which you do not want to extract the change data. PowerExchange still captures change data for these columns.
   • Add change indicator (CI) and before image (BI) columns.
2. To test the extraction map, perform a database row test on the extraction map in PowerExchange Navigator.
3. In Designer, import metadata for the sources and targets.
4. In Designer, configure a mapping to extract and process change data.
5. In Workflow Manager, configure a connection and session.
6. Create restart tokens for the CDC session.
7. Configure the restart token file.
8. If you want to stop extraction processing based on certain events, implement event table processing.
9. If you want to offload column-level extraction processing and UOW Cleanser processing from the source system to the PowerCenter Integration Service machine or PowerExchange Logger for Linux, UNIX, and Windows machine, configure offload processing. For real-time extractions, you can also configure multithreaded processing.
10. Start the CDC session.

RELATED TOPICS:
• "Creating Restart Tokens for Extractions"
• "Starting PowerCenter CDC Sessions"
• "Monitoring and Tuning Options"
• "Testing a Change Data Extraction"

Testing a Change Data Extraction

Perform a database row test in the PowerExchange Navigator to ensure that PowerExchange can retrieve data when the extraction map is used in a CDC session.

A database row test verifies that:
• PowerExchange has captured change data for a data source defined in a capture registration.
• PowerExchange Condense or the PowerExchange Logger for Linux, UNIX, and Windows has captured change data for a capture registration, if applicable.
• The extraction map properly maps the captured change data.

To test change data extraction:

1. In the Resource Explorer of the PowerExchange Navigator, open the extraction group that includes the extraction map that you want to test.
2. Open the extraction map.
3. Select the extraction map and click File > Database Row Test.
4. In the Database Row Test dialog box, enter or edit the following information:

   Field: DB Type
   Description: An extraction mode indicator:
   - CAPXRT. Real-time extraction mode or continuous extraction mode.
   - CAPX. Batch extraction mode.

   Field: Location
   Description: Node name for the location of the system on which the captured change data resides. This name must be defined in a NODE statement in the dbmover.cfg file on the Windows machine from which you run the database row test.

   Field: UserID and Password
   Description: Optionally, a user ID and password that provides access to the source change data.

   Field: Application Name
   Description: At least one character to represent the application name. For a row test, a unique application name is not required. PowerExchange does not retain the value that you specify.

   Field: SQL Statement
   Description: A SQL SELECT statement that PowerExchange generates for the fields in the extraction map. You can edit this statement, if necessary. In the statement, a table is identified in the following format:
   Schema.RegName_TableName
   Where:
   - Schema is the schema for the extraction map.
   - RegName is the name of the capture registration that corresponds to the extraction map.
   - TableName is the table name of the data source.

   Note: If you enter CAPX in the DB Type field, you can only extract change data after PowerExchange Condense or the PowerExchange Logger for Linux, UNIX, and Windows has closed at least one condense or log file. Otherwise, PowerExchange displays no data in PowerExchange Navigator and writes the PWX-04520 message in the PowerExchange message log on the extraction system. PowerExchange also writes this message if no change data for the data source has been captured, condensed, or logged.
5. Click Advanced.
6. In the CAPX Advanced Parameters or CAPXRT Advanced Parameters dialog box, enter information, including the following:
   • If you use continuous extraction mode, enter the CAPX CAPI_CONNECTION name in the CAPI Connection Name field.
   • If you use the PowerExchange Logger for Linux, UNIX, and Windows to offload change data to a system remote from the system on which it was captured, enter the location of the extraction maps in the Location field.
7. Click OK.
8. Click Go.

The database row test returns each change from the extraction start point by column. The results include the PowerExchange-defined CDC columns, the DTL__ columns, which provide information such as the change type, change timestamp, and user ID of the user who made the change.
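The table identifier format used in the row test SQL statement, Schema.RegName_TableName, can be illustrated with a minimal Python sketch. The function name and the sample values are hypothetical and exist only for illustration; PowerExchange generates the actual statement for you.

```python
def row_test_table_name(schema: str, reg_name: str, table_name: str) -> str:
    """Build a table identifier in the form Schema.RegName_TableName."""
    return f"{schema}.{reg_name}_{table_name}"


# Hypothetical schema, capture registration, and table names.
print(row_test_table_name("d1schema", "ordreg", "ORDERS"))
```

A helper like this is only useful when you edit the generated statement by hand and want to double-check the identifier you typed.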

Default is enabled. Before running CDC sessions. This commit occurs after PWXPC commits the restart tokens. Changing Default Values for Session and Connection Attributes Certain PowerCenter session and application connection attributes have default values that are only appropriate for bulk data movement. To properly restart CDC session. You must change the values of these attributes for CDC sessions. Default is the first 20 characters of the WorkFlow Name. the PowerCenter Integration Service does not consider errors when writing to targets as fatal. you might experience change data loss because PWXPC has advanced the restart tokens values. The following types of error are non-fatal: .Configuring PowerCenter CDC Sessions After you import metadata for CDC data sources and targets into PowerCenter. As a result. you must configure numerous session and connection attributes. you can create a mapping and a CDC session to extract change data.Key constraint violations . The PowerCenter Integration Service automatically overrides it to Source. Warning: The default might not result in a unique name.Database trigger responses If write errors occur. you must set this option to 1. However. which PWXPC creates if it does not exist. which can cause an out-of-sync condition between the restart tokens and the target data. Default value is Fail task and continue workflow. By default.Loading nulls into a not null field . The following table summarizes these attributes and their recommended values: Attribute Name Attribute Location Properties Tab Recommended Value Source Description Commit Type Default is Target. duplicate data can occur when CDC sessions restart. The PowerCenter Integration Service performs a commit when the session ends. Default value RestartToken File Folder Application Connection Use the default value of $PMRootDir/Restart. 
Commit On End Of File Properties Tab Disabled Recovery Strategy Properties Tab Resume from last checkpoint Stop on errors Config Object Tab 1 Application Name Application Connection Code a unique name for each CDC session. 128 Chapter 8: Extracting Change Data . To maintain target data and restart token integrity. Default value is 0. PowerExchange CDC and PWXPC require that this option is set to Resume from last checkpoint. you cannot disable Commit On End Of File unless you change Commit Type to Source.

you must configure PowerExchange Condense or the PowerExchange Logger for Linux. you cannot apply these changes Configuring PowerCenter CDC Sessions 129 . see PowerExchange Interfaces for PowerCenter. you can use BI columns to handle update operations that change the value of a key column of a row. ¨ BA. PowerExchange provides the before-image (BI) and after-image (AI) data for the updated row as separate SQL operations: ¨ A DELETE with the before-image data ¨ An INSERT with the after-image data Note: To select BA with batch or continuous extraction mode. use the Image Type attribute to configure the format of the change data that a CDC session extracts. which adds additional columns to the extraction map with the name of DTL__BI_columnname. allow update operations to key columns. Default is 0. PowerExchange provides these changes as an UPDATE operation. PowerExchange provides the after-image data for updated row as a SQL UPDATE operation. and Windows to log before and after images. the value for Application Name is used. in a single SQL UPDATE operation. select AI for the Image Type attribute. For a complete list of all PWX CDC application connection attributes. If you select AI for the Image Type attribute. Otherwise. For example. Because some relational databases do not allow updates to primary key columns. you can make decisions about UPDATE operations in a mapping because the before and after-image data is contained in a single record. Before and after images. Default is BA. Otherwise. Warning: The default may not result in a unique name. After images only. Use the PowerExchange Navigator to update the extraction map with before-image columns. If you select AI for the Image Type attribute. along with the afterimage data. such as DB2 for z/ OS. RestartToken File Name Number of Runs to Keep RestartToken File Application Connection 1 or higher Configuring Application Connection Attributes To extract change data. Image Type For update operations. 
the default is the workflow name. If you select BA for the Image Type attribute. UNIX. you must configure certain application connection attributes. Some relational databases.Attribute Name Attribute Location Application Connection Recommended Value Code a unique name for each CDC session. You can also configure one or more data columns in an extraction map with before-image (BI) columns. Specify a value greater than 0 so a history is available for recovery purposes. When you configure BI columns. PowerExchange then includes before-image data in any BI columns. If you use BI columns. PWXPC keeps only one backup copy of the restart token initialization and termination files. Select one of the following options for the Image Type attribute: ¨ AI. Description If no value is entered for Application Name. The RDBMS understands that this operation is equivalent to deleting the row and then re-adding it with a new primary key and logs the change as an update. you can only select after images.

as updates. If you configure BI columns for key columns, you can then use the Flexible Key Custom transformation to be change any UPDATE operations for key columns into a DELETE operation followed by an INSERT operation.
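The idea of turning a key-changing UPDATE into a DELETE followed by an INSERT can be sketched in Python as follows. This is only an illustration of the logic, not the Flexible Key Custom transformation itself. It assumes a change record is represented as a dictionary in which before-image values appear under the DTL__BI_columnname naming described above; the function name and the record layout are hypothetical.

```python
from typing import Dict, List, Tuple

BI_PREFIX = "DTL__BI_"  # before-image column prefix named in this guide


def split_key_update(
    row: Dict[str, object], key_cols: List[str]
) -> List[Tuple[str, Dict[str, object]]]:
    """Return the operations to apply for one UPDATE change record.

    If the before image of any key column differs from its after image,
    the update becomes a DELETE of the old key plus an INSERT of the new row.
    """
    # After-image columns are all columns that are not before-image columns.
    after = {c: v for c, v in row.items() if not c.startswith(BI_PREFIX)}
    key_changed = any(
        BI_PREFIX + k in row and row[BI_PREFIX + k] != row[k] for k in key_cols
    )
    if not key_changed:
        return [("UPDATE", after)]
    # Use the before image for the old key; fall back to the after image
    # for key columns that have no before-image column configured.
    old_key = {k: row.get(BI_PREFIX + k, row[k]) for k in key_cols}
    return [("DELETE", old_key), ("INSERT", after)]
```

For a record such as `{"ID": 2, "NAME": "x", "DTL__BI_ID": 1}` with `ID` as the key, the sketch yields a DELETE for key 1 and an INSERT of the new row, which a target that disallows primary key updates can apply.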

Event Table Processing
You can use event table processing to stop the extraction of changes based on user-defined events, such as an end-of-day event. For example, to stop an extraction process every night, after all of the changes for the day have been processed, write a change to the event table at midnight. This change triggers PowerExchange to stop reading change data and shut down the extraction process after the current UOW completes. Event table processing has the following rules and guidelines:
• You can only use event table processing with real-time or continuous extraction modes.
• You must create the event table, and define the applications that can update the table.
• You must register the event table for change data capture from the PowerExchange Navigator.
• A CDC session monitors a single event table. Each user-defined event requires its own event table and a separate extraction process.
• The event table and all of the source tables in the CDC session must be of the same source type.

To implement event table processing:

1. Create an event table. The event table must be of the same source type and on the same machine as the change data that is extracted. For example, if you extract DB2 change data on MVS, the event table must be a DB2 table in the same DB2 subsystem as the DB2 source tables for the extraction.
2. In the PowerExchange Navigator, create a capture registration and extraction map for the event table. When you create a capture registration, the PowerExchange Navigator generates an extraction map.
3. In PowerCenter, create a CDC session, and specify the extraction map name in the Event Table attribute on the PWX CDC Real Time application connection.
4. When the defined event occurs, update the event table.

When PowerExchange reads the update to the event table, PowerExchange places an end-of-file (EOF) into the change stream. PWXPC processes the EOF, passes it to the PowerCenter Integration Service, and then shuts down the PowerExchange reader. The PowerCenter Integration Service completes writing all of the data currently in the pipeline to the targets and then ends the CDC session.

CAPI Connection Name Override
PowerExchange allows a maximum of eight CAPI_CONNECTION statements in the DBMOVER configuration file. You can use multiple CAPI_CONNECTION statements to extract changes from more than one data source type with a single PowerExchange Listener on a single machine. For example, you can extract changes for Oracle and DB2 for Linux, UNIX, and Windows through a single PowerExchange Listener by specifying multiple CAPI_CONNECTION statements in the dbmover.cfg file. To specify the CAPI_CONNECTION statement that PowerExchange uses to extract change data in a CDC session, code the name in the CAPI Connection Name Override attribute. You must code CAPI_CONNECTION statements on the system where the change data resides so that PowerExchange can extract change data for a data source type. If you use CDC offload processing, you must also code the CAPI_CONNECTION statements in the dbmover.cfg file on the PowerCenter Integration Service machine.
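As a rough aid, the limit of eight CAPI_CONNECTION statements can be checked with a small Python sketch that scans the text of a DBMOVER configuration file. It assumes only that each statement begins on its own line with the keyword CAPI_CONNECTION followed by an equal sign; the function names are hypothetical, and the sketch does not validate the statement parameters.

```python
MAX_CAPI_CONNECTIONS = 8  # limit stated in this guide


def count_capi_connections(cfg_text: str) -> int:
    """Count the lines that start a CAPI_CONNECTION statement."""
    count = 0
    for line in cfg_text.splitlines():
        if line.strip().upper().startswith("CAPI_CONNECTION="):
            count += 1
    return count


def within_capi_limit(cfg_text: str) -> bool:
    """Return True if the file does not exceed the CAPI_CONNECTION limit."""
    return count_capi_connections(cfg_text) <= MAX_CAPI_CONNECTIONS
```

You might run such a check against the dbmover.cfg file on the source system and, if you use CDC offload processing, against the copy on the PowerCenter Integration Service machine.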


Idle Time
To indicate whether a real-time or continuous extraction mode CDC session should run continuously or shut down after reaching the end-of-log (EOL), use the Idle Time attribute. Enter one of the following values for the Idle Time attribute:
• -1. The CDC session runs continuously. PowerExchange returns end-of-file (EOF) only when the CDC session is manually stopped.
• 0. After reaching EOL, PowerExchange returns EOF and the CDC session ends.
• n. After reaching EOL, PowerExchange waits for n seconds and, if no new change data of interest arrives, the CDC session ends. Otherwise, the CDC session continues until PowerExchange waits for n seconds without reading new change data of interest.

Default is -1.

PowerExchange determines the EOL by using the current end of the change stream at the point that PowerExchange started to read the change stream. PowerExchange uses the concept of EOL because the change stream is generally not static, and so the actual end-of-log is continually moving forward. After PowerExchange reaches EOL, it writes the PWX-09967 message in the PowerExchange message log.

Typically, real-time and continuous extraction mode CDC sessions use the default value of -1 for the Idle Time attribute. If necessary, you can manually stop a never-ending CDC session by using the PowerCenter Workflow Monitor, pmcmd commands, or the PowerExchange STOPTASK command.

Alternatively, you can set the Idle Time attribute to 0. After PowerExchange reaches EOL, it returns an EOF to PWXPC. PWXPC and the PowerCenter Integration Service then perform the following processing:
1. PWXPC flushes all buffered UOWs and the ending restart tokens to the targets.
2. The CDC reader ends.
3. After the PowerCenter Integration Service finishes writing the flushed data to the targets, the writer ends.
4. After any post-session commands and tasks execute, the CDC session ends.

If you set the Idle Time attribute to a positive number, the following processing occurs:
1. PowerExchange reads the change stream until it reaches EOL, and then timing for the idle time begins.
2. If more data is in the change stream after EOL, PowerExchange continues to read the change stream, looking for change data of interest to the CDC session, as follows:
   • If the idle time expires before PowerExchange reads a change record of interest for the CDC session, PowerExchange stops reading the change stream.
   • If PowerExchange reads a change record of interest to the CDC session, PowerExchange restarts the timer, passes the change data to PWXPC, and continues to read the change stream. This processing continues until the idle time expires.
3. After the idle time expires, PowerExchange passes an EOF to PWXPC.
4. PWXPC and the PowerCenter Integration Service perform the same processing as when the Idle Time attribute is set to 0 and the CDC session ends.
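The idle-time behavior can be modeled with a short Python sketch. This is a simplified illustration under stated assumptions, not PowerExchange code: times are plain numbers of seconds, `arrivals` lists the times at which change records of interest are read, and a record that arrives exactly when the timer expires is treated as too late. The function and parameter names are hypothetical.

```python
from typing import Iterable, Optional


def eof_time(idle_time: int, eol_time: float, arrivals: Iterable[float]) -> Optional[float]:
    """Return the time at which EOF is passed, or None if the session runs continuously."""
    if idle_time == -1:
        return None          # runs until manually stopped
    if idle_time == 0:
        return eol_time      # EOF is returned as soon as EOL is reached
    deadline = eol_time + idle_time  # timing starts when EOL is reached
    for t in sorted(arrivals):
        if t < eol_time:
            continue         # read before EOL, so the timer is not yet running
        if t >= deadline:
            break            # idle time expired before this record was read
        deadline = t + idle_time  # record of interest restarts the timer
    return deadline
```

With an idle time of 10 seconds, EOL at time 100, and records of interest read at times 105 and 112, the timer restarts twice and the model reports EOF at time 122. This shows why a low idle time on an active system can end the session while unread change data is still arriving.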

If you set the Idle Time attribute to a low value, the CDC session might end before all available change data in the change stream has been read. If you want a CDC session to end periodically, Informatica recommends that you set the Idle Time attribute to 0 because active systems are rarely idle. When a CDC session ends because either the idle time value has been reached or a PowerExchange STOPTASK command has been issued, PWXPC writes the following message in the session log:
[PWXPC_10072] [INFO] [CDCDispatcher] session ended after waiting for [idle_time] seconds. Idle Time limit is reached

If you stop a never-ending CDC session with the PowerExchange STOPTASK command, PWXPC substitutes 86400 for the idle_time variable in the PWXPC_10072 message.


Note: If you specify values for the Reader Time Limit and Idle Time attributes, the PowerCenter Integration Service stops reading data from the source when the first one of these terminating conditions is reached. Because the reader time limit does not result in normal termination of a CDC session, Informatica recommends that you use only the idle time limit.

Restart Control Options
PWXPC uses the restart information to tell PowerExchange from which point to start reading the captured change data. To specify restart information, PWXPC provides options that you must configure for each CDC session. The following table describes the restart attributes you must configure for CDC sessions:
Connection Attribute: Application Name
Description: Application name for the CDC session. Specify a unique name for each CDC session. The application name is case sensitive and cannot exceed 20 characters. Default is the first 20 characters of the workflow name.

Connection Attribute: RestartToken File Folder
Description: Directory name on the PowerCenter Integration Service machine that contains the restart token override file. Default is $PMRootDir/Restart.

Connection Attribute: RestartToken File Name
Description: File name in the RestartToken File Folder that contains the restart token override file. PWXPC uses the contents of this file, if any, in conjunction with the state information to determine the restart point for the CDC session. Default is the Application Name, if specified, or the workflow name, if Application Name is not specified.

Informatica recommends that you specify a value for the Application Name attribute, because the default value might not result in a unique name. The values for Application Name and RestartToken File Name attributes must be unique for every CDC session. Non-unique values for either of these attributes can cause unpredictable results that include session failures and potential data loss.
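The following Python sketch illustrates why the default application name might not be unique. It applies the rule stated above (the first 20 characters of the workflow name) to a list of workflow names and reports any that collide. The function names and the workflow names are hypothetical.

```python
from typing import Dict, List


def default_application_name(workflow_name: str) -> str:
    """Default described in this guide: first 20 characters of the workflow name."""
    return workflow_name[:20]


def find_collisions(workflow_names: List[str]) -> Dict[str, List[str]]:
    """Group workflows whose default application names are identical."""
    seen: Dict[str, List[str]] = {}
    for wf in workflow_names:
        seen.setdefault(default_application_name(wf), []).append(wf)
    return {app: wfs for app, wfs in seen.items() if len(wfs) > 1}


# Hypothetical workflow names that differ only after the 20th character.
print(find_collisions(["wf_cdc_orders_customers_east", "wf_cdc_orders_customers_west"]))
```

Because the comparison is an exact string match, it also reflects that application names are case sensitive.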

PowerExchange Flush Latency
PowerExchange reads change data into a buffer on the source machine, or on the PowerCenter Integration Service machine if you use CDC offload processing. The PowerExchange Consumer API (CAPI) interface flushes the buffer that contains the data to PWXPC on the PowerCenter Integration Service machine for processing when one of the following conditions occurs:
• The buffer becomes full.
• The CAPI interface timeout, also called the PowerExchange flush latency, expires.
• A commit point occurs.

PowerExchange uses the flush latency value as the CAPI interface timeout value on the source machine, or on the PowerCenter Integration Service machine if you use CDC offload processing. For CDC sessions that use real-time or continuous extraction mode, set the flush latency in the PWX Latency in seconds attribute of the PWX CDC Real Time application connection. For CDC sessions that use batch extraction mode, PowerExchange always uses two seconds for the flush latency. Restriction: The value of PWX Latency in seconds impacts the speed with which a CDC session responds to a stop command from Workflow Monitor or pmcmd, because PWXPC must wait for PowerExchange to return control before it can handle the stop request. Informatica recommends that you use the default value of 2 seconds for the PWX Latency in seconds attribute.


PowerExchange writes the message PWX-09957 in the PowerExchange message log to reflect the CAPI interface timeout value set from the flush latency value. If you select Retrieve PWX Log Entries on the application connection, PWXPC also writes this message in the session log.

After PowerExchange flushes the change data to PWXPC, PWXPC provides the data to the appropriate sources in the CDC session for further processing and the PowerCenter Integration Service commits the data to the targets.

Commitment Control Options

PWXPC, in conjunction with PowerExchange and the PowerCenter Integration Service, controls the timing of commit processing for CDC sessions based on the values you code for the commitment control options.

To control commit processing, set one or more of the following connection attributes:

Maximum Rows Per commit

Maximum number of change records in a source UOW that PWXPC processes before it flushes the data buffer to commit the change data to the targets. If necessary, PWXPC continues to process change records across UOW boundaries until the maximum rows limit is met. PWXPC does not wait for a UOW boundary to commit the change data.

After the maximum rows limit is met, PWXPC issues a real-time flush to commit the change data and the restart tokens to the targets and writes the PWXPC_12128 message to the session log. PWXPC resets the maximum rows limit when a real-time flush occurs because either the maximum rows limit or UOW count is met or the real-time flush latency timer expires.

Default is 0, which means that PWXPC does not use maximum rows.

Note: The Maximum Rows Per commit attribute is a count of records within a UOW, unlike the UOW Count attribute that is a count of complete UOWs.

PWXPC uses the maximum rows limit to commit data before an end-UOW is received, a process also called sub-packet commit. If you specify either 0 or no value, commits occur only on UOW boundaries. Otherwise, PWXPC uses the value that you specify to commit change records between UOW boundaries.

Warning: Because PWXPC can commit the change data to the targets between UOW boundaries, relational integrity (RI) might be compromised. Do not use this connection attribute if you have targets in the CDC session with RI constraints.

The maximum rows limit is cumulative across all sources in the CDC session. PWXPC issues a real-time flush when the limit value is reached, regardless of the number of sources to which the changes were originally made.

Use a maximum rows limit when extremely large UOWs in the change stream might cause locking issues on the target database or resource issues on the node running the PowerCenter Integration Service. When you specify a low maximum rows limit, the session consumes more system resources on the PowerCenter Integration Service and target systems because PWXPC flushes data to the targets more frequently.

For example, a UOW contains 900 changes for one source followed by 100 changes for a second source and then 500 changes for the first source. If you set the maximum rows value to 1000, PWXPC issues the commit after reading 1,000 change records. In this example, the commit occurs after PWXPC processes the 100 changes for the second source.

Minimum Rows Per commit

For real-time or continuous extraction mode, minimum number of change records that PowerExchange reads from the change stream before it passes a commit record to PWXPC. Until the minimum rows limit is met, PowerExchange discards any commit records that it reads from the change stream and passes only change records to PWXPC. After the minimum rows limit is met, PowerExchange passes the next commit record to PWXPC and then resets the minimum rows counter.

Default is 0, which means that PowerExchange does not use minimum rows.
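The Minimum Rows Per commit behavior can be illustrated with a small simulation. The following Python sketch is not PowerExchange code. It only models the documented rule that commit records are discarded until the minimum rows limit is met, after which the next commit record is passed on and the counter is reset. The record shape and the function names apply_minimum_rows and small_uows are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# A record is ("change", payload) or ("commit", uow_id); this shape is hypothetical.
Record = Tuple[str, str]


def apply_minimum_rows(stream: Iterable[Record], minimum_rows: int) -> Iterator[Record]:
    """Yield the records that would be passed on after minimum-rows filtering."""
    rows_since_commit = 0  # change records read since the last commit record passed on
    for kind, payload in stream:
        if kind == "change":
            rows_since_commit += 1
            yield (kind, payload)
        elif kind == "commit":
            # A limit of 0 means minimum rows is not used, so every commit passes.
            if minimum_rows <= 0 or rows_since_commit >= minimum_rows:
                yield (kind, payload)
                rows_since_commit = 0  # reset the minimum rows counter
            # Otherwise the commit record is discarded and the UOW keeps growing.
        else:
            raise ValueError(f"unknown record kind: {kind!r}")


def small_uows(count: int, rows_each: int) -> Iterator[Record]:
    """Build a sample change stream of `count` UOWs with `rows_each` changes each."""
    for uow in range(count):
        for row in range(rows_each):
            yield ("change", f"uow{uow}-row{row}")
        yield ("commit", f"uow{uow}")


# Three 2-row UOWs with a limit of 5 collapse into one larger UOW of 6 rows.
merged = list(apply_minimum_rows(small_uows(3, 2), minimum_rows=5))
```

In this sketch no commit point is invented. The commit that survives is one of the original commit records, which is why a minimum rows limit leaves relational integrity intact.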

If you specify a minimum rows limit, PowerExchange changes the number of change records in a UOW to match or exceed the limit. PWXPC does not commit change data to the targets when the minimum rows limit occurs. PWXPC only commits change data to the targets based on the values of the Maximum Rows Per commit, Real-Time Flush Latency in milli-seconds, and UOW Count attributes.

A minimum rows limit does not impact the relational integrity of the change data because PowerExchange does not create new commit points in the change stream data. It merely skips some of the original commit records in the change stream.

If your change data has many small UOWs, you can set the Minimum Rows Per commit attribute to create larger UOWs of a more uniform size. Online transactions that run in transaction control systems such as CICS and IMS often commit after making only a few changes, which results in many small UOWs in the change stream. PowerExchange and PWXPC process fewer, larger UOWs more efficiently than many small UOWs. Therefore, by using the minimum rows limit to increase the size of UOWs, you can improve CDC processing efficiency.

Real-Time Flush Latency in milli-seconds

For real-time or continuous extraction mode, number of milliseconds that must pass before PWXPC flushes the data buffer to commit the change data to the targets. After the flush latency interval expires and PWXPC reaches a UOW boundary, PWXPC issues a real-time flush to commit the change data and the restart tokens to the targets and writes the PWXPC_10082 message in the session log. PWXPC resets the flush latency interval when a real-time flush occurs because either the interval expires, or the UOW count or maximum rows limit is met.

Enter one of the following values for the flush latency interval:
- -1. Disables data flushes based on time.
- 0 to 2000. Interval set to 2000 milliseconds, or 2 seconds.
- 2000 to 86400. Interval set to the specified value.

Default is 0, which means that PWXPC uses 2,000 milliseconds.

If you set the flush latency interval value to 0 or higher, PWXPC flushes the change data for all complete UOWs after the interval expires and the next UOW boundary occurs. The lower you set the flush latency interval value, the faster you commit change data to the targets. Therefore, if you require the lowest possible latency for the apply of changes to the targets, specify a low value for the flush latency interval.

When you specify low flush latency intervals, the CDC session might consume more system resources on the PowerCenter Integration Service and target systems because PWXPC commits to the targets more frequently. When you choose the flush latency interval value, you must balance performance and resource consumption with latency requirements.

UOW Count

Number of complete UOWs that PWXPC reads from the change stream before flushing the change data to the targets. As PWXPC reads change data from PowerExchange and provides that data to the appropriate source in the CDC session, it counts the number of UOWs. After the UOW count value is reached, PWXPC issues a real-time flush to commit the change data and the restart tokens to the targets, and writes the PWXPC_10081 message in the session log. PWXPC resets the UOW count when a real-time flush occurs because the UOW count or maximum rows limit is met, or the flush latency interval expires.

Enter one of the following for the UOW count value:
- -1 or 0. PWXPC does not use the UOW Count attribute to control commit processing.
- 1 to 999999999. PWXPC flushes change data after reading the number of UOWs specified by the UOW Count attribute.

Default is 1.
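The interaction between the UOW Count and Maximum Rows Per commit attributes can be sketched as a small simulation. The Python below is illustrative only and is not how PWXPC is implemented. It omits the time-based Real-Time Flush Latency trigger, and the function name simulate_flushes, the tuple layout, and the source names SRC1 and SRC2 are hypothetical. It assumes, as described in this section, that any real-time flush resets both counters.

```python
from typing import List, Tuple

# (reason, rows_flushed, source_of_last_record); the tuple layout is hypothetical.
Flush = Tuple[str, int, str]


def simulate_flushes(uows: List[List[str]], uow_count: int = 1, max_rows: int = 0) -> List[Flush]:
    """Return the simulated real-time flushes for a list of UOWs.

    Each UOW is a list of source names, one entry per change record.
    """
    flushes: List[Flush] = []
    rows_pending = 0  # change records read since the last flush
    uows_pending = 0  # complete UOWs read since the last flush
    for uow in uows:
        last_source = ""
        for source in uow:
            rows_pending += 1
            last_source = source
            # Maximum rows: flush inside the UOW, without waiting for its boundary.
            if max_rows > 0 and rows_pending >= max_rows:
                flushes.append(("max_rows", rows_pending, last_source))
                rows_pending = 0
                uows_pending = 0  # any flush resets both counters
        uows_pending += 1  # a UOW boundary was reached
        if uow_count > 0 and uows_pending >= uow_count:
            flushes.append(("uow_count", rows_pending, last_source))
            rows_pending = 0
            uows_pending = 0
    return flushes


# One UOW with 900 changes for SRC1, 100 for SRC2, then 500 for SRC1.
big_uow = ["SRC1"] * 900 + ["SRC2"] * 100 + ["SRC1"] * 500
result = simulate_flushes([big_uow], uow_count=1, max_rows=1000)
```

With a maximum rows value of 1000, the first simulated flush happens on the last of the 100 SRC2 changes, in the middle of the UOW. The remaining 500 changes are flushed at the UOW boundary because UOW Count is 1.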

The lower you set the value for the UOW Count attribute, the faster that PWXPC flushes change data to the targets. To achieve the lowest possible latency for applying change data to targets, set the UOW Count attribute to 1. However, the lowest possible latency for applying change data also results in the highest possible resource consumption on the PowerCenter Integration Service and the target systems.

Commit processing for CDC sessions is not controlled by a single commitment control attribute. The Maximum Rows Per commit, Real-Time Flush Latency in milli-seconds, and UOW Count values all result in a real-time flush of change data, which causes the data and restart tokens to be committed to the targets. When you choose values for the UOW Count, Real-Time Flush Latency in milli-seconds, and Maximum Rows Per commit attributes, balance performance and resource consumption with latency requirements.

Warning: You must ensure that the session properties Commit Type attribute specifies Source and that the Commit at End of File attribute is disabled. By default, the Commit at End of File attribute is enabled, which causes the PowerCenter Integration Service to write additional data to the targets after the CDC reader has committed the restart tokens and shut down. As a result, when you restart the CDC session, duplicate data might be written to the targets.

For more information, see “Commit Processing with PWXPC” on page 118.

Creating Restart Tokens for Extractions

Before you extract change data, you must establish an extraction start point. An optimal extraction start point matches a time in the change stream that occurs after the target has been synchronized with the source but before any new changes occur for the source. Usually, this point is the end of the change stream because changes to the source are inhibited until the target is materialized and restart tokens are generated.

You can generate current restart tokens for the end of the change stream by using one of the following methods:
- PWXPC restart token file. Generate current restart tokens for CDC sessions that use real-time or continuous extraction mode by coding the CURRENT_RESTART option on the RESTART1 and RESTART2 special override statements in the PWXPC restart token file. When the session executes, PWXPC requests that PowerExchange provide restart tokens for the current end of the change stream, which PWXPC then uses as the extraction start point.
- Database Row Test. Generate current restart tokens for sources by performing a database row test in PowerExchange Navigator and coding a SELECT CURRENT_RESTART SQL statement.
- DTLUAPPL utility. Generate current restart tokens for sources by using the RSTTKN GENERATE option in the DTLUAPPL utility.

If you use a PowerExchange utility or the PowerExchange Navigator to generate restart tokens, edit the restart token file that PWXPC uses to specify the token values before you start the CDC session.

Displaying Restart Tokens

In the PowerExchange Navigator, you can perform a database row test on an extraction map to display the restart token pair for each row of change data. The database row test output includes the following columns for the token values:
- DTL__CAPXRESTART1 column for the sequence token
- DTL__CAPXRESTART2 column for the restart token

If you include the DTL__CAPXRESTART1 and DTL__CAPXRESTART2 columns in your PowerCenter source definition, PowerExchange provides the restart tokens for each row when you extract change data in a CDC session.

When you use the DTLUAPPL utility to generate restart tokens, use the PRINT statement to display the generated values. In the PRINT output, DTLUAPPL displays the sequence token, without the usual trailing eight zeros, in the Sequence field and displays the restart token in the Restart field.

PowerExchange and PWXPC display restart token values in the following messages:
- In the messages PWX-04565 and PWX-09959, the sequence token is in the Sequence field and the restart token is in the PowerExchange Logger field.
- In the messages PWXPC_12060 and PWXPC_12068, the sequence token is in the Restart Token 1 field and the restart token is in the Restart Token 2 field.
- In the messages PWXPC_10081, PWXPC_10082, and PWXPC_12128, the sequence token is the first token value and is followed by the restart token.

Configuring the Restart Token File

When you configure the CDC session in PowerCenter, specify the name and location of the restart token file in the following attributes of the source PWX CDC application connection:
- RestartToken File Folder. Specify the directory that contains the restart token file. If the folder does not exist and the attribute contains the default value of $PMRootDir/Restart, PWXPC creates it when you run a CDC session. Otherwise, PWXPC does not create any other restart token folder name.
- RestartToken File Name. Specify the unique name of the restart token file. If you do not specify a value in this attribute, PWXPC uses the value of the Application Name, if specified. Otherwise, PWXPC uses the name of the workflow. Because this name must be unique, Informatica recommends that you always code a value for the RestartToken File Name attribute.

Restriction: The value of the RestartToken File Name attribute must be unique for every CDC session. Nonunique file names can cause unpredictable results, such as change data loss and session failures.

When a CDC session runs, PWXPC verifies that the restart token file exists. If one does not exist, PWXPC uses the name specified in the RestartToken File Name attribute to create an empty restart token file.

To locate the restart token file name for a CDC session, check the following places:
- For existing CDC sessions, message PWXPC_12057 in the session log contains the restart token file folder and the restart token file name.
- In Workflow Manager, the PWX CDC application connection associated with the source in the CDC session contains the restart token file name and folder location. If the restart token file name is not specified in the application connection, PWXPC uses the application name, if available. Otherwise, PWXPC uses the workflow name.

Before you run a CDC session for the first time, configure the restart token file to specify the point in the change stream from which PowerExchange begins to extract change data. You can also configure the restart token file to add new sources to a CDC session or to restart change data extraction from a specific point in the change stream.

Restart Token File Statements

You can use the following types of statements in the restart token file:
- Comment
- Explicit override. Specify a restart token pair for a specific source. You must provide the PowerExchange extraction map name.
- Special override. Specify a restart token pair for one or more sources. You can provide a specific restart token pair or request that PowerExchange use the current restart point.

Restart Token File Statement Syntax

For the comment statements, use the following syntax:

    <!-- comment_text

For explicit override statements, use the following syntax:

    extraction_map_name=sequence_token
    extraction_map_name=restart_token

For special override statements, use the following syntax:

    RESTART1={sequence_token|CURRENT_RESTART}
    RESTART2={restart_token|CURRENT_RESTART}

The following rules and guidelines apply:
- Statements can begin in any column.
- All statements are optional.
- Do not include blank lines between statements.
- Comment lines must begin with: <!--
- Per file, you can specify one or more explicit override statements and one special override statement.
- An explicit override statement for a source takes precedence over any special override statement.

Comment Statements

You can use the comment statement anywhere in the restart token file. Comment statements must begin with: <!--

Explicit Override Statements

Use the explicit override statement to specify the restart token pair for a specific source. Each source specification consists of a pair of restart tokens containing the source extraction map name with the restart token values. Define the source by specifying the extraction map name. A source can have multiple extraction maps and, therefore, multiple extraction map names.

You can code explicit override statements for one or more sources in a CDC session. Alternatively, you can use explicit override statements in conjunction with the special override statement to provide restart tokens for all sources in a CDC session.

When you warm start a CDC session, an explicit override statement for a source overrides the restart tokens stored in the state table or file for that source.

The explicit override statement has the following parameters:

extraction_map_name=restart1_token and extraction_map_name=restart2_token

The PowerExchange extraction map name and the sequence and restart tokens for the source.

extraction_map_name
The extraction map name for the data source. To determine the extraction map name, check one of the following:
- For CDC data map sources, the Schema Name Override and Map Name Override attributes in the session properties. These attributes override the schema and map names of the source extraction map.
- For CDC data map sources, the Schema Name and Map Name values in the source Metadata Extensions in Designer.
- For relational sources, the Extraction Map Name attribute in the session properties.

restart1_token
The sequence token part of the restart token pair, which varies based on data source type.

restart2_token
The restart token part of the restart token pair, which varies based on data source type.

Special Override Statement

Use the special override statement to specify or generate restart tokens for one or more sources. You can use the special override statement to provide restart tokens for all sources in a CDC session. Alternatively, you can use explicit override statements in conjunction with the special override statement to provide or override restart tokens for all sources in a CDC session.

When you warm start a CDC session, the special override statement overrides the restart tokens stored in the state table or file for all sources, except those sources specified in explicit override statements.

The special override statement has the following parameters:

RESTART1={restart1_token|CURRENT_RESTART} and RESTART2={restart2_token|CURRENT_RESTART}

The sequence token and restart token in the restart token pair or the current end of the change stream. You must specify both the RESTART1 and RESTART2 parameters.

restart1_token
The sequence token part of the restart token pair, which varies based on data source type.

restart2_token
The restart token part of the restart token pair, which varies based on data source type.

CURRENT_RESTART
PowerExchange generates current restart tokens. The PWXPC CDC reader opens a separate connection to PowerExchange to request generation of current restart tokens, and then provides the generated restart tokens to all applicable sources.

Restriction: You can only use CURRENT_RESTART for CDC sessions that use real-time and continuous extraction mode. You cannot use this option for CDC sessions that use batch extraction mode.

You can also generate current restart tokens in the Database Row Test dialog box in the PowerExchange Navigator.
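The statement syntax and rules for the restart token file can be sketched as a small parser. This Python example is an illustration and not the parser that PWXPC uses. The function name parse_restart_token_file is hypothetical. It assumes that the RESTART1 and RESTART2 keywords are matched without regard to case, and it silently skips blank lines even though the rules say not to include them. How PWXPC itself treats such input is not covered here.

```python
from typing import Dict, List, Optional, Tuple

TokenPair = Tuple[str, str]  # (sequence token, restart token)


def parse_restart_token_file(text: str) -> Tuple[Dict[str, TokenPair], Optional[TokenPair]]:
    """Return (explicit overrides by extraction map name, special override or None)."""
    explicit: Dict[str, List[str]] = {}
    special: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()  # statements can begin in any column
        if not line or line.startswith("<!--"):
            continue  # comment line; blank lines are tolerated here (an assumption)
        name, sep, value = line.partition("=")
        name, value = name.strip(), value.strip()
        if not sep or not name or not value:
            raise ValueError(f"not a valid statement: {raw!r}")
        key = name.upper()
        if key in ("RESTART1", "RESTART2"):
            if key in special:
                raise ValueError(f"only one special override statement is allowed: {key}")
            special[key] = value
        else:
            # First line for a map name is the sequence token, second is the restart token.
            explicit.setdefault(name, []).append(value)
    for name, tokens in explicit.items():
        if len(tokens) != 2:
            raise ValueError(f"{name}: expected a sequence token line and a restart token line")
    if special and len(special) != 2:
        raise ValueError("RESTART1 and RESTART2 must both be specified")
    pairs = {name: (tokens[0], tokens[1]) for name, tokens in explicit.items()}
    special_pair = (special["RESTART1"], special["RESTART2"]) if special else None
    return pairs, special_pair


# Hypothetical file content: one special override and one explicit override.
sample = (
    "<!-- hypothetical sample\n"
    "  RESTART1=CURRENT_RESTART\n"
    "  restart2=CURRENT_RESTART\n"
    "schema1.map_a=AAAA\n"
    "schema1.map_a=BBBB\n"
)
pairs, special = parse_restart_token_file(sample)
```

The parser keeps explicit and special overrides separate so that a caller can apply the precedence rule, in which an explicit override for a source wins over the special override.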

Example

In the example, a CDC session contains seven source tables. This restart token file specifies explicit override statements to provide the restart tokens for three sources and the special override statement to provide the restart tokens for the remaining sources. The restart token file contains the following statements:

    <!-- Restart Token File
    <!-- Restart Tokens for existing tables -->
    restart1=000000AD775600000000000000AD77560000000000000000
    Restart2=C1E4E2D34040000000AD5F2C00000000
    <!-- Restart Tokens for the Table: rrtb0001_RRTB_SRC_001 -->
    d1dsn9.rrtb0001_RRTB_SRC_001=0000060D1DB2000000000000060D1DB20000000000000000
    d1dsn9.rrtb0001_RRTB_SRC_001=C1E4E2D340400000013FF36200000000
    <!-- Restart Tokens for the Table: rrtb0001_RRTB_SRC_002 -->
    d1dsn9.rrtb0002_RRTB_SRC_002=000000A3719500000000000000A371950000000000000000
    d1dsn9.rrtb0002_RRTB_SRC_002=C1E4E2D34040000000968FC600000000
    <!-- Restart Tokens for the Table: rrtb0001_RRTB_SRC_004 -->
    d1dsn9.rrtb0004_RRTB_SRC_004=000006D84E7800000000000006D84E780000000000000000
    d1dsn9.rrtb0004_RRTB_SRC_004=C1E4E2D340400000060D1E6100000000

When you warm start the CDC session, PWXPC reads the restart token file to process any override statements for restart tokens. In this case, the restart token file overrides all restart tokens for all sources in the CDC session. After resolving the restart tokens for all sources, PWXPC writes message PWXPC_12060 to the session log with the following information:

    ===============================
    Session restart information:
    ===============================
    Extraction Map Name            Restart Token 1                                    Restart Token 2                    Source
    d1dsn9.rrtb0001_RRTB_SRC_001   0000060D1DB2000000000000060D1DB20000000000000000   C1E4E2D340400000013FF36200000000   Restart file
    d1dsn9.rrtb0002_RRTB_SRC_002   000000A3719500000000000000A371950000000000000000   C1E4E2D34040000000968FC600000000   Restart file
    d1dsn9.rrtb0003_RRTB_SRC_003   000000AD775600000000000000AD77560000000000000000   C1E4E2D34040000000AD5F2C00000000   Restart file (special override)
    d1dsn9.rrtb0004_RRTB_SRC_004   000006D84E7800000000000006D84E780000000000000000   C1E4E2D340400000060D1E6100000000   Restart file
    d1dsn9.rrtb0005_RRTB_SRC_005   000000AD775600000000000000AD77560000000000000000   C1E4E2D34040000000AD5F2C00000000   Restart file (special override)
    d1dsn9.rrtb0006_RRTB_SRC_006   000000AD775600000000000000AD77560000000000000000   C1E4E2D34040000000AD5F2C00000000   Restart file (special override)
    d1dsn9.rrtb0007_RRTB_SRC_007   000000AD775600000000000000AD77560000000000000000   C1E4E2D34040000000AD5F2C00000000   Restart file (special override)

PWXPC indicates the source of the restart token values for each source. For the sources that had explicit override statements in the restart token file, PWXPC writes “Restart file” in the Source column. For the sources to which PWXPC assigns the special override restart tokens, PWXPC writes “Restart file (special override)” in the Source column.
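The way restart tokens are resolved in the preceding example can be sketched in a few lines of Python. This is a simplified model of the documented precedence, in which an explicit override wins over the special override, which in turn wins over tokens stored in the state table or file on a warm start. It is not PWXPC code. The function name resolve_restart_tokens is hypothetical, and so are the labels "state table or file" and "unresolved". Only the two "Restart file" labels come from message PWXPC_12060.

```python
from typing import Dict, Iterable, Optional, Tuple

TokenPair = Tuple[str, str]  # (sequence token, restart token)
Resolved = Tuple[Optional[TokenPair], str]  # (tokens, where they came from)


def resolve_restart_tokens(
    sources: Iterable[str],
    explicit: Dict[str, TokenPair],
    special: Optional[TokenPair],
    stored: Optional[Dict[str, TokenPair]] = None,
) -> Dict[str, Resolved]:
    """Pick the restart token pair for each source by documented precedence."""
    stored = stored or {}
    resolved: Dict[str, Resolved] = {}
    for name in sources:
        if name in explicit:
            resolved[name] = (explicit[name], "Restart file")
        elif special is not None:
            resolved[name] = (special, "Restart file (special override)")
        elif name in stored:
            resolved[name] = (stored[name], "state table or file")  # hypothetical label
        else:
            # What happens to a source with no tokens at all is outside this sketch.
            resolved[name] = (None, "unresolved")
    return resolved


# Values taken from the example restart token file.
names = [f"d1dsn9.rrtb000{i}_RRTB_SRC_00{i}" for i in range(1, 8)]
explicit = {
    names[0]: ("0000060D1DB2000000000000060D1DB20000000000000000", "C1E4E2D340400000013FF36200000000"),
    names[1]: ("000000A3719500000000000000A371950000000000000000", "C1E4E2D34040000000968FC600000000"),
    names[3]: ("000006D84E7800000000000006D84E780000000000000000", "C1E4E2D340400000060D1E6100000000"),
}
special = ("000000AD775600000000000000AD77560000000000000000", "C1E4E2D34040000000AD5F2C00000000")
result = resolve_restart_tokens(names, explicit, special)
```

Under these assumptions, the three sources with explicit statements keep their own token pairs and the other four sources receive the special override pair, which matches the Source column in the example.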

CHAPTER 9

Managing Change Data Extractions

This chapter includes the following topics:
- Starting PowerCenter CDC Sessions
- Stopping PowerCenter CDC Sessions
- Changing PowerCenter CDC Sessions
- Recovering PowerCenter CDC Sessions

Starting PowerCenter CDC Sessions

Use Workflow Manager, Workflow Monitor, or pmcmd to start a workflow or task for a CDC session. You can start the entire workflow, part of a workflow, or a task in the workflow.

You can do a cold start, warm start, or recovery start. The method you use determines how PWXPC acquires the restart information.

Use one of the following methods to start a CDC session:

Cold start
To cold start a CDC session, use the Cold Start command in Workflow Manager or Workflow Monitor. You can also use the pmcmd starttask or startworkflow commands with the norecovery option. When you cold start a CDC session, PWXPC uses the restart token file to acquire restart tokens for all sources. PWXPC does not read the state tables or file or make any attempt to recover the session. A CDC session that uses real-time or continuous extraction mode runs continuously until it is stopped or interrupted. A CDC session that uses batch extraction mode runs until it reaches the end of log (EOL) or it is stopped or interrupted.

Warm start
To warm start a CDC session, use the Start or Restart commands in Workflow Manager or Workflow Monitor. You can also use the pmcmd starttask or startworkflow commands. When you warm start a CDC session, PWXPC reconciles any restart tokens provided in the restart token file with any restart tokens that exist in the state tables or file. If necessary, PWXPC performs recovery processing. A CDC session that uses real-time or continuous extraction mode runs continuously until it is stopped or interrupted. A CDC session that uses batch extraction mode runs until it reaches EOL or it is stopped or interrupted.

Recovery start
To start recovery for a CDC session, use the Recover command from Workflow Manager or Workflow Monitor. You can also use the pmcmd recoverworkflow command or the starttask or startworkflow commands with the recovery option. When recovery completes, the CDC session ends.

When you recover a CDC session, PWXPC reads the restart tokens from any applicable state tables or file. If necessary, PWXPC performs recovery processing. PWXPC updates the restart token file with the restart tokens for each source in the CDC session, and then the session ends. To begin extracting change data again, either cold start or warm start the session.

Cold Start Processing

Cold start workflows and tasks by using the Cold Start command in Workflow Manager or Workflow Monitor. You can also use the pmcmd starttask or startworkflow commands with the norecovery option.

After you request a cold start for a CDC session, the following processing occurs:
1. PWXPC writes the following message in the session log:
   PWXPC_12091 [INFO] [CDCRestart] Cold start requested
2. PWXPC reads the restart tokens from only the restart token file and associates a restart token with each source in the session.
3. PWXPC creates the initialization restart token file with the initial restart tokens.
4. PWXPC commits the restart tokens for each source to the appropriate state tables or file and then writes the message PWXPC_12104 to the session log.
5. PWXPC passes the restart tokens to PowerExchange. PowerExchange begins extracting change data and passing the data to PWXPC for processing.
6. PWXPC continues processing change data from PowerExchange and commits the data and restart tokens to the targets. This processing continues until the session ends or is stopped.

Warm Start Processing

Warm start workflows and tasks by using the Start or Restart command in Workflow Manager or Workflow Monitor. You can also use the pmcmd starttask or startworkflow commands.

When you warm start a workflow or task, PWXPC automatically performs recovery. You do not need to recover failed workflows and tasks before you restart them.

After you request a warm start for a CDC session, the following processing occurs:
1. PWXPC writes the following message in the session log:
   PWXPC_12092 [INFO] [CDCRestart] Warm start requested. Targets will be resynchronized automatically if required
2. PWXPC queries the PowerCenter Integration Service about the commit levels of all targets. If all targets in the session have the same commit level, PWXPC skips recovery processing.
3. PWXPC reconciles the restart tokens from the restart token file and from the state tables or file.
   Restriction: If a CDC session requires recovery processing, PWXPC does not use the restart token file. Consequently, you cannot override restart tokens for sources.
4. PWXPC creates the initialization restart token file with the reconciled restart tokens.
5. If recovery is required, PWXPC performs recovery processing. PWXPC re-reads the change data for the last unit-of-work (UOW) that was committed to the targets with the highest commit level and flushes the data to those targets with lower commit levels. The PowerCenter Integration Service commits flushed change data and restart tokens to any relational targets and updates any nonrelational files.
6. If recovery is not required and the reconciled restart tokens differ from those in the state tables or file, PWXPC commits the reconciled restart tokens and then writes message PWXPC_12104 to the session log.

7. PWXPC passes the restart tokens to PowerExchange. PowerExchange begins extracting change data and passing the data to PWXPC for processing.
8. PWXPC continues processing change data from PowerExchange and commits the data and restart tokens to the targets. This processing continues until the session ends or is stopped.

Recovery Processing

Recover workflows and tasks by selecting the Recover command in Workflow Manager or Workflow Monitor. You can also use the pmcmd recoverworkflow command, or the starttask or startworkflow command with the recovery option.

You can use recovery to populate the restart token file with the restart tokens for all sources in a CDC session so that you can then cold start the CDC session or to ensure that the targets and restart tokens are in a consistent state. However, you do not need to recover failed workflows and tasks before you restart them because PWXPC automatically performs recovery processing when you warm start a workflow or task.

After you request recovery for a CDC session, the following processing occurs:
1. PWXPC writes the following message in the session log:
   PWXPC_12093 [INFO] [CDCRestart] Recovery run requested. Targets will be resynchronized if required and processing will terminate
2. PWXPC reads the restart tokens from the recovery state tables or file.
   Restriction: If a CDC session requires recovery processing, PWXPC does not use the restart token file. Consequently, you cannot override restart tokens for sources.
3. PWXPC creates the initialization restart token file with the reconciled restart tokens.
4. PWXPC queries the PowerCenter Integration Service about the commit levels of all targets. If all targets in the session have the same commit level, PWXPC skips recovery processing.
5. If recovery is required, PWXPC re-reads the change data for the last UOW that was committed to the targets with the highest commit level and flushes the data to those targets with lower commit levels. The PowerCenter Integration Service commits any flushed change data and restart tokens to any relational targets, and updates any nonrelational files.
6. PWXPC updates the restart token file with the final restart tokens, creates the termination restart token file, and ends.

To process change data from the point of recovery, warm start or cold start the workflow or task.

Stopping PowerCenter CDC Sessions

You can stop CDC sessions from PowerCenter or PowerExchange. In PowerCenter, issue the Stop or Abort command in Workflow Monitor. You can also use pmcmd stoptask, stopworkflow, aborttask, or abortworkflow commands. In PowerExchange, issue the STOPTASK command or run the DTLUTSK utility.

Use one of the following methods to stop a running CDC session:

Stop
Use the Stop command in Workflow Monitor or the pmcmd stoptask or stopworkflow commands. After the PWXPC CDC reader and PowerCenter Integration Service process all of the data in the pipeline and shut down, the session ends.

STOPTASK
Use the PowerExchange STOPTASK command. You can run the STOPTASK command on the source system that is extracting the change data, from the PowerExchange Navigator, or by using pwxcmd or the DTLUTSK utility. When you issue the STOPTASK command, PowerExchange stops the extraction task in the PowerExchange Listener and passes an EOF to the PowerCenter Integration Service, which ends the session.

Abort
Use the Abort command in Workflow Monitor or the pmcmd aborttask or abortworkflow commands. When you abort a CDC session, the PowerCenter Integration Service waits 60 seconds to allow the readers and the writers time to process all of the data in the pipeline and shut down. If the PowerCenter Integration Service cannot finish processing and committing data within this timeout period, it kills the DTM process and ends the session.

Stop Command Processing

Stop CDC sessions and workflows by using the Stop command in Workflow Monitor or the pmcmd stoptask or stopworkflow command. You can also use the PowerExchange STOPTASK command.

After you issue a stop command in PowerCenter or PowerExchange, the following processing occurs:
1. If you use a PowerCenter stop command, the PowerCenter Integration Service requests PWXPC to stop. If you use a PowerExchange stop command, PowerExchange sends an EOF to PWXPC.
2. When PWXPC receives an EOF, it flushes any complete and uncommitted UOWs with the associated restart tokens to the targets. PWXPC then writes the messages PWXPC_12101 and PWXPC_12068 to the session log.
3. The PowerCenter Integration Service processes all of the data in the pipeline and writes it to the targets.
4. The PowerCenter Integration Service sends an acknowledgment to PWXPC indicating that the targets have been updated.
5. PWXPC writes the termination restart token file, and then writes the message PWXPC_12075 to the session log.
6. The PWXPC CDC reader shuts down.
7. The PowerCenter Integration Service performs any post-session tasks and ends the session.

Terminating Conditions

To stop a CDC session based on a user-defined event or at EOL, configure a termination condition in the session. A terminating condition determines when PWXPC stops reading change data from the sources and ends the CDC session. After PWXPC reaches a terminating condition, it flushes the change data to the targets and passes an EOF to the PowerCenter Integration Service. The PowerCenter Integration Service commits the data to the targets and ends the session.

You can configure the following termination conditions for CDC sessions:
- Event table processing. If you specify an extraction map table in the Event Table attribute of the PWX CDC Real Time application connection, PowerExchange, after it reads a change record for the event table, passes EOF to PWXPC to end the CDC session.
- Idle Time. If you specify 0 for the Idle Time attribute on a PWX CDC Real Time application connection, PowerExchange, after it reaches EOL, passes EOF to PWXPC to end the CDC session.
- Batch extraction mode. If you use batch extraction mode by configuring a PWX CDC Change application connection, PowerExchange, after it reads all closed PowerExchange Condense condense files or PowerExchange Logger for Linux, UNIX, and Windows log files, passes EOF to PWXPC to end the CDC session.
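The target resynchronization check that warm start and recovery processing perform, as described earlier in this chapter, can be sketched as follows. This Python example is a simplified model and not PWXPC code. It assumes that a commit level can be represented as a comparable integer, which this guide does not state, and the function name targets_needing_resync and the target names are hypothetical.

```python
from typing import Dict, List


def targets_needing_resync(commit_levels: Dict[str, int]) -> List[str]:
    """Return the targets whose commit level is below the highest commit level.

    An empty result models the case in which all targets have the same commit
    level and recovery processing is skipped.
    """
    if not commit_levels:
        return []
    highest = max(commit_levels.values())
    # Targets below the highest level would receive the re-read data for the last UOW.
    return sorted(name for name, level in commit_levels.items() if level < highest)


# Hypothetical commit levels reported for four targets.
behind = targets_needing_resync({"T1": 12, "T2": 11, "T3": 12, "T0": 10})
```

In this model, only the targets returned by the function would be flushed the re-read change data. Targets that are already at the highest commit level are left unchanged.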

Changing PowerCenter CDC Sessions

You can add new sources and targets to an existing CDC session. Afterward, you must cold start the session. Because a cold start is required, you must also get the latest restart tokens for the original sources prior to restarting the session. To do so, you can perform a recovery.

To change a PowerCenter CDC session:
1. Stop the workflow.
2. After the workflow ends, recover the CDC session. When you recover tasks, PWXPC writes the ending restart tokens for all sources in a CDC session to the restart token file that you specified on the PWX CDC application connection.
3. Make changes to the session or workflow, if necessary.
4. Verify that the restart token file in the source CDC connection points to the same restart token file updated in the recovery.
5. If you add sources to the CDC session, add statements to the restart token file that provide restart tokens for the new sources.
6. If you remove sources from the CDC session, update the restart token file to remove their restart tokens.
7. Cold start the CDC session.

Examples of Creating a Restart Point

The following examples show different methods of creating a restart point for a source table that is added to an existing CDC session. The first example uses the CURRENT_RESTART option of the special override statement in the restart token file to generate current restart tokens. The second example uses DTLUAPPL to generate current restart tokens.

Adding a New Source and Use CURRENT_RESTART to Create Restart Tokens - Example

In this example, a new source table, RRTB_SRC_004, is added to an existing CDC session that contains three sources. The restart points for the existing sources are maintained. For the new source, RRTB_SRC_004, the example uses the CURRENT_RESTART option in the restart token file to generate a restart token that represents the current end of the change stream.

To add a new source and use CURRENT_RESTART to create restart tokens:
1. To stop the workflow, select the Stop command in Workflow Monitor.
2. After the workflow stops, select the Recover Task command in Workflow Monitor to run a recovery session. PWXPC writes the following messages in the session log:

    PWXPC_12060 [INFO] [CDCRestart]
    ===============================
    Session restart information:
    ===============================
    Extraction Map Name            Restart Token 1                                    Restart Token 2                    Source
    d1dsn9.rrtb0001_RRTB_SRC_001   000000AD220F00000000000000AD220F0000000000000000   C1E4E2D34040000000AD0D9C00000000   GMD storage
    d1dsn9.rrtb0002_RRTB_SRC_002   000000AD220F00000000000000AD220F0000000000000000   C1E4E2D34040000000AD0D9C00000000   GMD storage
    d1dsn9.rrtb0003_RRTB_SRC_003   000000AD220F00000000000000AD220F0000000000000000   C1E4E2D34040000000AD0D9C00000000   GMD storage

   PWXPC also writes the restart tokens in the restart token file specified in the CDC application connection.
3. Edit the mapping, session, and workflow to add the new source, RRTB_SRC_004.
4. Edit the restart token file to specify the CURRENT_RESTART option for the new source.

The updated file appears as follows:

<!-- existing sources
d1dsn9.rrtb0001_RRTB_SRC_001=000000AD220F00000000000000AD220F0000000000000000
d1dsn9.rrtb0001_RRTB_SRC_001=C1E4E2D34040000000AD0D9C00000000
d1dsn9.rrtb0002_RRTB_SRC_002=000000AD220F00000000000000AD220F0000000000000000
d1dsn9.rrtb0002_RRTB_SRC_002=C1E4E2D34040000000AD0D9C00000000
d1dsn9.rrtb0003_RRTB_SRC_003=000000AD220F00000000000000AD220F0000000000000000
d1dsn9.rrtb0003_RRTB_SRC_003=C1E4E2D34040000000AD0D9C00000000
<!-- new source
RESTART1=CURRENT_RESTART
RESTART2=CURRENT_RESTART

5. Cold start the session. PWXPC connects to PowerExchange and generates restart tokens that match the current end of the change stream for the new source, RRTB_SRC_004. PWXPC then passes the restart tokens to PowerExchange to begin change data extraction. Because the restart points for the other sources are earlier than the one just generated for RRTB_SRC_004, PWXPC does not pass any change data to this new source until the first change following its generated restart point is read.

Adding a New Source and Using DTLUAPPL to Create Restart Tokens - Example

In this example, a new source table, RRTB_SRC_004, is added to an existing CDC session containing three sources. The restart points for the existing sources are maintained. The DTLUAPPL utility is used to generate a restart token that represents the current end of the change stream.

1. To stop the workflow, select the Stop command in Workflow Monitor.
2. After the workflow stops, select the Recover Task command from Workflow Monitor to run a recovery session. PWXPC writes the following messages in the session log:

PWXPC_12060 [INFO] [CDCRestart]
===============================
Session restart information:
===============================
Extraction Map Name            Restart Token 1                                    Restart Token 2                    Source
d1dsn9.rrtb0001_RRTB_SRC_001   000000AD220F00000000000000AD220F0000000000000000   C1E4E2D34040000000AD0D9C00000000   GMD storage
d1dsn9.rrtb0002_RRTB_SRC_002   000000AD220F00000000000000AD220F0000000000000000   C1E4E2D34040000000AD0D9C00000000   GMD storage
d1dsn9.rrtb0003_RRTB_SRC_003   000000AD220F00000000000000AD220F0000000000000000   C1E4E2D34040000000AD0D9C00000000   GMD storage

PWXPC also writes the restart tokens in the restart token file specified in the CDC application connection.
3. Edit the mapping, session, and workflow to add the new source, RRTB_SRC_004.
4. Run DTLUAPPL with RSTTKN GENERATE to generate restart tokens for the current end of the change stream. Use the following DTLUAPPL control cards:

mod APPL dummy DSN7 rsttkn generate
mod rsttkn rrtb004
end appl dummy
print appl dummy

The PRINT command produces the following output:

Registration name=<rrtb004.1> tag=<DB2DSN7rrtb0041>
Sequence=<00000DBF240A0000000000000DBF240A00000000>
Restart =<C1E4E2D3404000000DBF238200000000>

Add eight zeros to the end of the Sequence value to create the sequence value for the restart token file.
5. Edit the restart token file to add the new source and its tokens. The updated file contains the following lines:

<!-- existing sources
d1dsn9.rrtb0001_RRTB_SRC_001=000000AD220F00000000000000AD220F0000000000000000
d1dsn9.rrtb0001_RRTB_SRC_001=C1E4E2D34040000000AD0D9C00000000
d1dsn9.rrtb0002_RRTB_SRC_002=000000AD220F00000000000000AD220F0000000000000000
d1dsn9.rrtb0002_RRTB_SRC_002=C1E4E2D34040000000AD0D9C00000000
d1dsn9.rrtb0003_RRTB_SRC_003=000000AD220F00000000000000AD220F0000000000000000
d1dsn9.rrtb0003_RRTB_SRC_003=C1E4E2D34040000000AD0D9C00000000
<!-- new source

d1dsn9.rrtb0004_RRTB_SRC_004=00000DBF240A0000000000000DBF240A0000000000000000
d1dsn9.rrtb0004_RRTB_SRC_004=C1E4E2D3404000000DBF238200000000

6. Cold start the session. PWXPC passes these restart tokens to PowerExchange to begin change data extraction. Because the restart points for the other sources are earlier than the one just generated for RRTB_SRC_004, PWXPC does not pass any change data to this new source until the first change following the generated restart point is read.

Recovering PowerCenter CDC Sessions

Use Workflow Manager, Workflow Monitor, or pmcmd to recover a workflow or task for a CDC session that fails. You can recover the entire workflow or a task in the workflow.

A CDC session can fail for the following reasons:
¨ Permanent errors, such as source or target data errors
¨ Transitory or environmental errors, such as infrastructure problems, server failures, and network availability issues

If you run a session with a resume recovery strategy and the session fails, do not edit the state information or the mapping for the session before you restart the session.

If a session fails because of transitory or environmental errors, restart the session after you have corrected the errors. When you warm start a CDC session, PWXPC automatically performs recovery, if required. Alternatively, you can recover a CDC session, and then restart the session.

If a CDC session fails because of permanent errors, such as SQL or other database errors, you must correct the errors before restarting the CDC session. With some failures, you can correct the error and then restart the CDC session. In other cases, you might need to rematerialize the target table from the source table before you start extracting and applying change data again. If you rematerialize the target table, you should provide restart tokens that match the materialization point in the change stream, and then cold start the CDC session.

Restriction: If a CDC session requires recovery processing, you cannot override the restart tokens because PWXPC does not read the restart token file.

Example of Session Recovery

In this example, a CDC session with relational targets is aborted in the Workflow Monitor. Then, the Restart Task command is issued from the Workflow Monitor to restart the CDC session.

When you warm start the session, PWXPC automatically performs a recovery, and writes the following message in the session log:

PWXPC_12092 [INFO] [CDCRestart] Warm start requested. Targets will be resynchronized automatically if required

PWXPC then reads the restart tokens from the state tables or file and writes the message PWXPC_12060 in the session log. The PWXPC_12060 message records the restart tokens for the session and its sources, as shown in the following example:

PWXPC_12060 [INFO] [CDCRestart]
===============================
Session restart information:
===============================
Extraction Map Name            Restart Token 1                                    Restart Token 2                    Source
d1dsn8.rrtb0004_RRTB_SRC_004   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0009_RRTB_SRC_009   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0006_RRTB_SRC_006   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0008_RRTB_SRC_008   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0005_RRTB_SRC_005   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0003_RRTB_SRC_003   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0007_RRTB_SRC_007   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0002_RRTB_SRC_002   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage
d1dsn8.rrtb0001_RRTB_SRC_001   00000FCA65840000000000000D2E004A00000000FFFFFFFF   C1E4E2D3404000000D21B1A500000000   GMD storage

If PWXPC detects that recovery is required, PWXPC writes the message PWXPC_12069 in the session log. This message usually includes the restart tokens for both the begin-UOW and the end-UOW for the oldest uncommitted UOW that PWXPC re-reads during recovery. PWXPC usually stores end-UOW restart tokens in the state table or file. However, if you specify a maximum rows threshold, PWXPC can commit change data and restart tokens between UOW boundaries. As a result, the restart tokens might not represent an end-UOW.

The following example PWXPC_12069 message includes "from" restart tokens that are the same as those displayed in the example PWXPC_12060 message:

PWXPC_12069 [INFO] [CDCRestart] Running in recovery mode. Reader will resend the the oldest uncommitted UOW to resync targets:
from: Restart 1 [00000FCA65840000000000000D2E004A00000000FFFFFFFF] : Restart 2 [C1E4E2D3404000000D21B1A500000000]
to: Restart 1 [00000FCA65840000000000000D300D8000000000FFFFFFFF] : Restart 2 [C1E4E2D3404000000D21B1A500000000].

Because this session specifies a maximum rows threshold, the restart token values in the Restart 2 fields in both the "from" and "to" restart tokens are the begin-UOW value. The sequence token values in the Restart 1 fields represent the start and end change records in the UOW that is displayed in the Restart 2 field.

During recovery processing, PWXPC reads the change data records between the points defined by the two restart token values in the PWXPC_12069 message and then issues a commit for the data and the restart tokens. The PowerCenter Integration Service writes the flushed change data to the target tables and writes the restart tokens to the state table. Then the session ends.
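The DTLUAPPL example in this chapter turns the PRINT output into two restart token file lines for the new source, with eight zeros appended to the Sequence value. The following Python sketch shows that arithmetic only. The function name restart_token_lines is hypothetical and is not part of PowerExchange; it assumes you have already copied the Sequence and Restart values from between the angle brackets of the DTLUAPPL output.

```python
def restart_token_lines(extraction_map, sequence, restart):
    """Build the two restart token file lines for one source.

    extraction_map: extraction map name, for example d1dsn9.rrtb0004_RRTB_SRC_004
    sequence:       Sequence value printed by DTLUAPPL, without the < > brackets
    restart:        Restart value printed by DTLUAPPL, without the < > brackets
    """
    # The guide says to add eight zeros to the end of the Sequence value.
    padded_sequence = sequence + "0" * 8
    return [
        f"{extraction_map}={padded_sequence}",
        f"{extraction_map}={restart}",
    ]


# Values taken from the DTLUAPPL example in this chapter.
for line in restart_token_lines(
    "d1dsn9.rrtb0004_RRTB_SRC_004",
    "00000DBF240A0000000000000DBF240A00000000",
    "C1E4E2D3404000000DBF238200000000",
):
    print(line)
```

The sketch does not validate the token contents. Check the generated lines against the DTLUAPPL output before you cold start the session.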

CHAPTER 10

Monitoring and Tuning Options

This chapter includes the following topics:
¨ Monitoring Change Data Extractions
¨ Tuning Change Data Extractions
¨ CDC Offload and Multithreaded Processing

Monitoring Change Data Extractions

PowerExchange, PWXPC, and PowerCenter issue messages that you can use to monitor the progress of CDC sessions. PWXPC can also display progress and statistical information about CDC sessions in the PowerCenter Workflow Monitor.

Monitoring CDC Sessions in PowerExchange

In PowerExchange, you can use the following information to monitor the extraction of change data by CDC sessions:
¨ Read progress messages. You can request that PowerExchange write messages that indicate the number of change records read by a CDC session.
¨ Extraction statistics messages. When extraction sessions end, PowerExchange writes messages that include statistical information about the change records processed.
¨ Multithreaded processing statistics messages. You can request that PowerExchange write statistical information about CDC sessions that use multithreaded processing.
¨ LISTTASK command output. You can use the LISTTASK command to display active CDC sessions.

Read Progress Messages

You can request that PowerExchange write messages that indicate read progress to the PowerExchange log file. If you select the Retrieve PWX log entries option on a PWX CDC application connection, PWXPC writes the progress messages in the session log.

To direct PowerExchange to write read progress messages, include the following parameters in the DBMOVER configuration file:
¨ PRGIND. Specify Y to have PowerExchange write PWX-04587 messages that indicate the number of records read for a CDC session. Default is N.
¨ PRGINT. Specify the number of records that PowerExchange reads before writing the PWX-04587 messages to the PowerExchange log file. Default is 250 records.

The PWX-04587 messages have the following format:

PWX-04587 int_server/workflow_name/session_name: Records read=num_records

Where:
¨ int_server is the name of the PowerCenter Integration Service.
¨ workflow_name is the name of the workflow that contains the CDC session.
¨ session_name is the name of the CDC session.
¨ num_records is the cumulative number of records read since the CDC session started.

For example, to direct PowerExchange to write read progress messages after 100 records, the DBMOVER configuration file contains the following parameters:

PRGIND=Y
PRGINT=100

When a CDC session that has a session name of s_cdc_DB2_SQL_stats runs, PowerExchange writes the following messages to the PowerExchange log file:

PWX-04587 intserv/wf_cdc_mon_stats/s_cdc_DB2_SQL_stats: Records read=100
PWX-04587 intserv/wf_cdc_mon_stats/s_cdc_DB2_SQL_stats: Records read=200
PWX-04587 intserv/wf_cdc_mon_stats/s_cdc_DB2_SQL_stats: Records read=300

PowerExchange continues to write PWX-04587 messages for this CDC session until the session ends. In the PowerExchange log file, each of these messages has a date and timestamp. You can use this information to determine the speed with which PowerExchange processes change data from the change stream.

Extraction Statistics Messages

When a CDC session ends, PowerExchange writes the following messages that contain statistical information about the session:
¨ PWX-04578. PowerExchange writes this message for each source in the CDC session. This message includes the number of insert, update, delete, commit, and total records read for the source.
¨ PWX-04588. PowerExchange writes this message for the entire CDC session. This message includes the total number of records read for that CDC session.

Important: The statistical information in the PowerExchange messages represents the change data that PowerExchange read for a CDC session. This information might not reflect the data that was applied to the targets. For statistical information about the change data applied to the target, review the session log.

Multithreaded Processing Statistics

If you use CDC offload processing, you can also use multithreaded processing to attempt to increase throughput on the PowerCenter Integration Service machine where the offloaded processing runs.

To monitor the effectiveness of multithreaded processing, specify the following parameter in the DBMOVER configuration file on the PowerCenter Integration Service machine:

SHOW_THREAD_PERF=number_records
Number of change records that PowerExchange reads during a statistics reporting interval before writing the statistics messages PWX-31254 through PWX-31259 to the PowerExchange log file. Valid values are from 10000 through 50000000.

If you select the Retrieve PWX log entries option on the connection in the CDC session, PWXPC writes these messages in the session log. You can use the information in the messages to tune multithreaded processing.

For PowerExchange to write statistics messages for threads, you must specify 1 or above for Worker Threads on the connection. Otherwise, PowerExchange does not use multithreaded processing or produce statistics messages.
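As an illustration of the PWX-04587 read progress format described under "Read Progress Messages", the following Python sketch extracts the latest cumulative record count for each CDC session from a list of log lines. It is a minimal sketch, not an Informatica utility: the function name latest_read_counts is hypothetical, and the pattern assumes the message text appears exactly as documented, with any date and timestamp prefix ignored.

```python
import re

# Pattern follows the documented layout:
# PWX-04587 int_server/workflow_name/session_name: Records read=num_records
PROGRESS_RE = re.compile(
    r"PWX-04587 (?P<int_server>[^/\s]+)/(?P<workflow>[^/\s]+)/(?P<session>[^:\s]+): "
    r"Records read=(?P<count>\d+)"
)


def latest_read_counts(log_lines):
    """Return {(int_server, workflow, session): last cumulative count seen}."""
    latest = {}
    for line in log_lines:
        match = PROGRESS_RE.search(line)
        if match:
            key = (match["int_server"], match["workflow"], match["session"])
            # num_records is cumulative, so the last message wins.
            latest[key] = int(match["count"])
    return latest


sample = [
    "PWX-04587 intserv/wf_cdc_mon_stats/s_cdc_DB2_SQL_stats: Records read=100",
    "PWX-04587 intserv/wf_cdc_mon_stats/s_cdc_DB2_SQL_stats: Records read=200",
    "PWX-04587 intserv/wf_cdc_mon_stats/s_cdc_DB2_SQL_stats: Records read=300",
]
print(latest_read_counts(sample))
```

To estimate a read rate, you would also need the date and timestamp that precede each message in the PowerExchange log file. The guide does not show that prefix, so this sketch leaves it out.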

The messages that PowerExchange writes during each statistics interval contain the following information:
¨ PWX-31255. Cycle time, which is the total time that PowerExchange on the PowerCenter Integration Service machine spent processing the change data before passing it to PWXPC. This message includes the total percentage of time and average, minimum, and maximum times in microseconds.
¨ PWX-31256. I/O time, which is the time that PowerExchange on the PowerCenter Integration Service machine spent reading change data from the PowerExchange Listener on the source system. This message includes the I/O percentage of the total time and average, minimum, and maximum times in microseconds.
¨ PWX-31257. Parsing time, which is the time that PowerExchange on the PowerCenter Integration Service machine spent in column-level processing for the change records on all threads. This message includes the parsing percentage of the total time and average, minimum, and maximum times in microseconds.
¨ PWX-31258. External time, which is the time that PowerExchange on the PowerCenter Integration Service machine spent combining the change records from all threads back into a single UOW to pass to PWXPC and for PWXPC to flush the data to PowerCenter. This message includes the external percentage of the total time and average, minimum, and maximum times in microseconds.
¨ PWX-31259. Delay time, which is the time that PowerExchange on the PowerCenter Integration Service machine waited to receive new change records to process from the PowerExchange Listener on the source system. This message includes the delay percentage of the total time and average, minimum, and maximum times in microseconds.

If the parsing and external processing times are higher than the I/O time, you might improve throughput by increasing the number of threads for the CDC session.

For the following example, SHOW_THREAD_PERF=10000 is specified in the DBMOVER configuration file. PowerExchange writes the following sample messages after 10,000 change records have been read and the next UOW boundary is reached:

PWX-31254 PowerExchange threading stats for last 10000 rows. Cycle (array) size is 25 rows. 0 out of array occured.
PWX-31255 Cycle time: 100% (avg: 5709 min: 4741 max: 7996 usecs)
PWX-31256 IO time: 4% (avg: 235 min: 51 max: 1021 usecs)
PWX-31257 Parse time: 79% (avg: 4551 min: 4102 max: 5495 usecs)
PWX-31258 Extern time: 20% (avg: 1145 min: 618 max: 3287 usecs)
PWX-31259 Delay time: 0% (avg: 7 min: 4 max: 165 usecs)
PWX-31254 PowerExchange threading stats for last 100000 rows. Cycle (array) size is 25 rows. 0 out of array occured.
PWX-31255 Cycle time: 99% (avg: 5706 min: 4735 max: 7790 usecs)
PWX-31256 IO time: 4% (avg: 234 min: 51 max: 950 usecs)
PWX-31257 Parse time: 79% (avg: 4549 min: 4108 max: 5425 usecs)
PWX-31258 Extern time: 20% (avg: 1144 min: 616 max: 3242 usecs)
PWX-31259 Delay time: 0% (avg: 7 min: 4 max: 115 usecs)

DISPLAY ACTIVE or LISTTASK Command Output

Issue the PowerExchange Listener DISPLAY ACTIVE command to display CDC sessions that are active in the PowerExchange Listener. You can issue the command from the command line. On Windows, if you want to issue the command from the PowerExchange Navigator, enter the equivalent LISTTASK command in the Database Row Test dialog box. Alternatively, issue the pwxcmd listtask command from a Linux, UNIX, or Windows system to a PowerExchange Listener running on the local system or a remote system.

The command output includes the PwrCntrSess field. This field provides the PowerCenter session name in the following format:

integration_server_name/workflow_name/session_name

For example, if two CDC sessions are active, the command produces the following output:

PWX-00711 Active tasks:
PWX-00712 TaskId=1, Partner=10.10.10.01, Port=2480, PwrCntrSess=intserv1/workflow1/cdc_sess1, Application=appl_name1, Status=Active, AM=CAPXRT, Mode=Read, Process=, SessId=
PWX-00712 TaskId=2, Partner=10.10.10.02, Port=2480, PwrCntrSess=intserv2/workflow2/cdc_sess2, Application=appl_name2, Status=Active, AM=CAPXRT, Mode=Read, Process=, SessId=
PWX-00713 2 active tasks
PWX-00709 0 Dormant TCBs

Monitoring CDC Sessions in PowerCenter

In PowerCenter, you can use the following information to monitor the progress of CDC sessions:
¨ Session log messages. PWXPC and PowerCenter write messages to the session log. You can use these messages to monitor the progress of a CDC session.
¨ Performance details in Workflow Monitor. If you configure a CDC session to report performance details, you can monitor the progress of the session in the Workflow Monitor.

Session Log Messages

You can use messages that PWXPC and PowerCenter write to the session log to monitor the progress of CDC sessions. When PWXPC flushes change data to commit the data to the targets, it writes one of the following messages to the session log, displaying the reason for the flush:

PWXPC_10081 [INFO] [CDCDispatcher] raising real-time flush with restart tokens [restart1], [restart2] because the UOW Count [count] is reached
PWXPC_10082 [INFO] [CDCDispatcher] raising real-time flush with restart tokens [restart1], [restart2] because Real-time Flush Latency [latency] is reached
PWXPC_12128 [INFO] [CDCDispatcher] raising real-time flush with restart tokens [restart1], [restart2] because the Maximum Rows Per commit [count] is reached

You can use the restart tokens in the PWXPC flush messages to monitor the processing of the change data. For each PWXPC flush message, PowerCenter writes a WRT_8160 message after committing change data to the targets. This message displays the source-based commit statistics.

RELATED TOPICS:
¨ "Using Connection Options to Tune CDC Sessions"
¨ "Tuning Commit Processing"
¨ "Viewing Performance Details in the Workflow Monitor"

Viewing Performance Details in the Workflow Monitor

Performance details include counters that you can use to assess the efficiency of a CDC session and change data extraction processing. The details include a single source qualifier that reflects group source processing for the change data. If you notice degradation of CDC session performance, you can use the performance details to determine the bottleneck.

From Workflow Monitor, you can view the details for the current CDC session while it is executing. During the execution of the CDC session, PWXPC refreshes the statistical information every 10 seconds. If you have selected a resume recovery strategy in the CDC session, PWXPC displays data for all performance counter fields. PWXPC does not store performance details in the repository so you cannot view previous performance details for CDC sessions.

To enable the collection of performance details, select Collect performance data on the Properties tab of the CDC session.

Note: To view performance details for a CDC session that has ended, you must select performance details while the session is running. Otherwise, PWXPC does not display performance details.
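The PWXPC flush messages described under "Session Log Messages" state why each flush happened. As a rough illustration, the following Python sketch counts flushes by reason and keeps the restart tokens from the most recent flush. The function name summarize_flushes is hypothetical. The pattern assumes that the session log renders the messages exactly as shown in this guide, including the comma between the two restart tokens, and the token values in the sample lines are placeholders rather than real restart tokens.

```python
import re
from collections import Counter

# Matches the three documented flush messages:
# PWXPC_10081 (UOW Count), PWXPC_10082 (Real-time Flush Latency),
# PWXPC_12128 (Maximum Rows Per commit).
FLUSH_RE = re.compile(
    r"(?P<code>PWXPC_(?:10081|10082|12128)) \[INFO\] \[CDCDispatcher\] "
    r"raising real-time flush with restart tokens \[(?P<restart1>[^\]]*)\], "
    r"\[(?P<restart2>[^\]]*)\] because (?P<reason>.+?) \[(?P<value>[^\]]*)\] is reached"
)


def summarize_flushes(log_lines):
    """Return (Counter of flush reasons, restart tokens of the last flush or None)."""
    reasons = Counter()
    last_tokens = None
    for line in log_lines:
        match = FLUSH_RE.search(line)
        if match:
            reasons[match["reason"]] += 1
            last_tokens = (match["restart1"], match["restart2"])
    return reasons, last_tokens


# Placeholder tokens (AAAA, BBBB, ...) stand in for real restart token values.
sample_log = [
    "PWXPC_10081 [INFO] [CDCDispatcher] raising real-time flush with restart tokens "
    "[AAAA], [BBBB] because the UOW Count [1] is reached",
    "PWXPC_10082 [INFO] [CDCDispatcher] raising real-time flush with restart tokens "
    "[CCCC], [DDDD] because Real-time Flush Latency [2000] is reached",
]
reasons, last_tokens = summarize_flushes(sample_log)
print(dict(reasons), last_tokens)
```

A count that one reason dominates can suggest which commit-related connection option is driving flush activity, but treat that as a starting point for investigation rather than a diagnosis.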

To view performance details in the Workflow Monitor:
1. In Workflow Monitor, right-click a session and select Get Run Properties.
2. In the Properties window, click the Performance area. The Performance Counter column displays a data source qualifier from the CDC session. The Counter Value column displays the PowerCenter node name.
3. To view performance details, select the data source qualifier.

The following table describes the fields that PowerCenter displays in the Performance Counter column in the Performance area:

Performance Counter Field
Description

1 PowerExchange CDC Reader Status:
Current status of the PWXPC reader, as indicated by one of the following values:
- No Data To Process. In the last read, PowerExchange did not pass data to PWXPC.
- Restart Advance. PowerExchange passed restart tokens to PWXPC but did not pass change data.
- Processing Data. PowerExchange passed change data and restart tokens to PWXPC for processing.

1.1 Time Last Data Row Read
Time, in milliseconds, when PWXPC last received data from PowerExchange.

1.2 Data Rows In Current Interval
Number of change records received from PowerExchange during the current statistics interval.

1.3 End Packets In Current Interval
Number of UOWs received from PowerExchange during the current statistics interval.

1.4 Data Read Rate In Current Interval (rows/sec)
Number of change records read per second by PowerExchange during the current statistics interval. The value varies, depending on the quantity of change data being processed:
- If PowerExchange is reading large amounts of change data from the change stream, this value is usually large and reflects the maximum PowerExchange throughput.
- If PowerExchange is waiting for change data at the end of the change stream, this value is small.
The following factors can increase this value:
- Large network bandwidth
- CDC offload processing
- Multithreaded processing

1.5 Mean Data Read Rate (rows/sec)
Mean number of change records that PowerExchange read per second, from the start of the CDC session.

1.6 Max Data Read Rate (rows/sec)
Maximum number of change records that PowerExchange read per second during a statistics interval, from the start of the CDC session.

2 PowerCenter Processing Status:
Overall status of the CDC session, as indicated by one of the following values:
- Idle. Waiting for change data.
- Processing Data. Data is being processed.
- Recovery Disabled. If a resume recovery strategy is not selected, the PWXPC CDC reader cannot obtain PowerCenter status information.

2.1 Time Of Last Commit
Timestamp of the last commit to a target.

2.2 Rows Processed To Commit In Current Interval
Number of change records flushed by the PWXPC reader during the current statistics interval. This count includes the change records in all committed UOWs. Some of these UOWs might have started before the current statistics interval began.

2.3 Commit Rate In Current Interval (rows/sec)
Processing rate, in number of change records per second, for the change records for the UOW that was last committed during the current statistics interval. This rate includes reading the UOW from PowerExchange and committing the change data to the targets. The following factors can influence this rate:
- Number of available DTM buffers
- Responsiveness of the target
- Number of transformations in the pipeline

2.4 Mean Commit Rate (rows/sec)
Mean number of change records per second for the rate displayed in 2.3 Commit Rate In The Current Interval. This value differs from the 2.6 Mean Throughput Rate in that it takes into account only the time when the session is actively processing data and does not reflect processing overlap in PowerCenter.

2.5 Max Commit Rate (rows/sec)
Maximum number of change records per second for the commit rate displayed in 2.3 Commit Rate In The Current Interval, recorded from the start of the CDC session.

2.6 Mean Throughput (rows/sec)
Mean rate of processing for the CDC session.

2.7 Max Throughput (rows/sec)
Maximum throughput for the CDC session.

2.8 Commits In Current Interval
Number of commits processed to completion by the target during the current statistics interval.

2.9 Commits Pending
Number of commits that were issued by the PWXPC reader but that have not yet reached the targets. A large value might indicate problems with target responsiveness.

3 Capture Timestamps

3.1 Timestamp On Last End Packet Read
The capture timestamp, DTL__CAPXTIMESTAMP, from the last UOW read for a source in the CDC session.

3.2 Timestamp On Last Target Commit
The capture timestamp, DTL__CAPXTIMESTAMP, from the last UOW committed to the target.

4 Totals

4.1 Elapsed Time
Total elapsed time for the CDC session.

4.2 Rows Read
Total number of change records read from PowerExchange.

4.3 End Packets Read
Total number of UOWs read.

4.4 Time in PowerExchange Processing
Total time of PowerExchange processing for the CDC session.

4.5 Rows Processed
Total number of change records processed through PowerCenter and committed to the targets.

4.6 Commits to Target
Total number of flushes that the PWXPC reader issued and that were committed to the targets.

4.7 TS on Last Commit minus TS at Commit (2.1-3.2)
Value that results from subtracting 3.2 Timestamp On Last Target Commit value from the 2.1 Time Of Last Commit value. If this result is negative, the value is enclosed in parentheses.

Tuning Change Data Extractions

You can use PowerExchange configuration parameters and connection options in PowerCenter to tune CDC sessions. In addition, you can use CDC offload and multithreaded processing to improve throughput by moving processing for change data to a different machine.

Use the following methods to tune CDC sessions:
¨ Parameters and options. To tune sessions, you can specify parameters and options in the DBMOVER configuration file and on PWX CDC connections.
¨ CDC offload processing. You can use CDC offload processing to distribute PowerExchange column-level processing for change data to the PowerCenter Integration Service machine that runs the CDC session. By distributing processing, you can reduce PowerExchange processing overhead on the system on which the change data resides. You can also use CDC offload processing with the PowerExchange Logger for Linux, UNIX, and Windows to capture change data on a different machine. CDC sessions can then extract change data from the PowerExchange Logger log files on that machine, rather than from the change stream on the original source machine.
¨ Multithreaded processing. If you use CDC offload processing, you can optionally use multithreaded processing to attempt to increase throughput. Multithreaded processing uses multiple threads on the PowerCenter Integration Service machine to perform the offloaded PowerExchange processing.
¨ Asynchronous network communication. PowerExchange uses asynchronous communication for most send and receive operations, overlapping network processing with data processing. This feature is enabled automatically and usually requires no tuning, but you can tune the feature if you need to.
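When you weigh whether to add threads, the multithreaded processing statistics messages (PWX-31255 through PWX-31259) described earlier in this chapter are the main input. The following Python sketch parses one statistics interval and applies the guide's rule of thumb that more threads might help when the parsing and external times are higher than the I/O time. The function names are hypothetical, and comparing the average times is an assumption of this sketch, because the guide does not say which of the reported figures to compare.

```python
import re

# Layout taken from the sample messages in this chapter, for example:
# PWX-31256 IO time: 4% (avg: 235 min: 51 max: 1021 usecs)
STAT_RE = re.compile(
    r"PWX-(?P<num>3125[5-9]) \w+ time:\s+(?P<pct>\d+)% "
    r"\(avg: (?P<avg>\d+) min: (?P<min>\d+) max: (?P<max>\d+) usecs\)"
)
NAMES = {
    "31255": "cycle",
    "31256": "io",
    "31257": "parse",
    "31258": "extern",
    "31259": "delay",
}


def parse_interval(lines):
    """Return {stat name: {"pct", "avg", "min", "max"}} for one statistics interval."""
    stats = {}
    for line in lines:
        match = STAT_RE.search(line)
        if match:
            stats[NAMES[match["num"]]] = {
                key: int(match[key]) for key in ("pct", "avg", "min", "max")
            }
    return stats


def more_threads_may_help(stats):
    """Assumed reading of the rule of thumb: parse and extern averages both exceed I/O."""
    io_avg = stats["io"]["avg"]
    return stats["parse"]["avg"] > io_avg and stats["extern"]["avg"] > io_avg


interval = [
    "PWX-31255 Cycle time: 100% (avg: 5709 min: 4741 max: 7996 usecs)",
    "PWX-31256 IO time: 4% (avg: 235 min: 51 max: 1021 usecs)",
    "PWX-31257 Parse time: 79% (avg: 4551 min: 4102 max: 5495 usecs)",
    "PWX-31258 Extern time: 20% (avg: 1145 min: 618 max: 3287 usecs)",
    "PWX-31259 Delay time: 0% (avg: 7 min: 4 max: 165 usecs)",
]
print(more_threads_may_help(parse_interval(interval)))
```

The result is only a hint. Whether additional threads improve throughput also depends on the resources available on the PowerCenter Integration Service machine.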

Using PowerExchange Parameters to Tune CDC Sessions

To tune your PowerExchange installation, you can customize the following parameters in the DBMOVER configuration file:

APPBUFSIZE=size
Defines the maximum size, in bytes, of the buffer that PowerExchange uses to read or write data. This data buffer can exist on a source or target system.
If you are applying change data from the change stream on the source system to a remote target system, PowerExchange usually writes change data to its application data buffer on the source system until the buffer is full. PowerExchange then sends the data to a sending TCP/IP buffer on the source system. TCP/IP transports the change data to a receiving TCP/IP buffer on the target system. PowerExchange on the target system reads the change data from the TCP/IP buffer into its application data buffer. PWXPC then reads the change data and passes it to PowerCenter. PowerCenter processes the data and applies it to the targets.
Enter an APPBUFSIZE value that is greater than the maximum size of any single data row to be sent. Valid values are from 34816 through 1048576. Default is 128000.
If the target system is remote, enter the same APPBUFSIZE value in the DBMOVER configuration files on the source and target systems. Also, verify that the APPBUFSIZE value matches the TCPIPBUFSIZE value in the same DBMOVER configuration file. The TCPIPBUFSIZE parameter specifies the maximum size of the TCP/IP buffer.
If the APPBUFSIZE value is not optimal, PowerExchange writes the PWX-01295 message in the PowerExchange log file on the source system. This message includes a recommended minimum value.

COMPRESS={Y|N}
Defines whether PowerExchange uses its proprietary compression algorithm to compress data before it is sent to TCP/IP for transmission to the remote platform. Default is Y.
PowerExchange uses the COMPRESS setting in the DBMOVER configuration file on the remote system that contacts the PowerExchange Listener. On the PWX CDC application connection, you can override the compression setting in the DBMOVER configuration file.
If you enable compression, the CPU consumption of the PowerExchange Listener on the source system might increase. To avoid unnecessary CPU consumption, set COMPRESS to N in the PowerExchange DBMOVER configuration file on the PowerCenter Integration Service machine.

CAPI_CONNECTION=( ...,MEMCACHE=cache_value,...))
Amount of memory cache, in kilobytes, that is allocated to reconstruct complete UOWs. You can specify the MEMCACHE parameter on the following CAPI_CONNECTION statement types:
¨ MSQL
¨ UDB
¨ UOWC
PowerExchange keeps all changes in each UOW in cache until it processes the end-UOW record, which is the commit record. If the MEMCACHE value is too small to hold all of the changes in a UOW in cache, the changes spill to a disk file.
Valid values are from 1 through 519720. Default is 1024.
You might need to increase this value if you have large UOWs. PowerExchange processes a UOW more efficiently if all of the changes are cached in memory. If a UOW might be larger than 1024 KB in size, increase this parameter. For most environments, a value of 10240 (10 MBs) is a good starting value.

Tip: PowerExchange uses the MEMCACHE value to allocate cache memory to each connection for change data extractions. To prevent excessive memory use by a PowerExchange Listener, use a reasonable value for MEMCACHE based on your extraction processing needs and the number of CDC sessions that run concurrently.

CAPI_CONNECTION=( ...,RSTRADV=rstr_secs,...))

Number of seconds that PowerExchange waits before advancing the restart tokens for a data source by returning an empty unit of work (UOW). You can specify the RSTRADV parameter on the following CAPI_CONNECTION statement types:
- MSQL
- UDB
- UOWC

Empty UOWs contain restart tokens only, without any data. PowerExchange uses the restart tokens to determine the start point in the change stream for change data extractions. The wait period for the RSTRADV value starts after a UOW for a data source is processed. PowerExchange resets the wait period after it reads the next UOW for that source or when it returns an empty UOW because the wait period expires.

For sources with low change activity, you can use the RSTRADV parameter to periodically advance the restart tokens for those sources. Advancing the restart tokens speeds up restart processing for CDC sessions by minimizing the amount of change data that must be reprocessed.

For example, if you specify RSTRADV=5 and changes are not made to the data source for five seconds, PowerExchange returns an empty UOW to advance the restart point for the data source.

Valid values are from 0 through 86400. If you do not specify RSTRADV, PowerExchange does not return empty UOWs to advance the restart point.

Consider the following issues when you set RSTRADV on CAPI_CONNECTION statements in the PowerExchange DBMOVER configuration file:
- A value of 0 adversely affects performance. PowerExchange returns an empty UOW with restart tokens to PWXPC after each UOW is processed.
- A low value can cause the UOW Count option on the PWX CDC connection to match more quickly than expected. When the UOW counter matches, PWXPC flushes its data buffer and commits restart tokens to the targets. Excessive flush activity can adversely affect performance on the PowerCenter Integration Service machine and target databases.

LISTENER=(node_name,TCPIP,port,send_bufsize,receive_bufsize,send_msgsize,receive_msgsize,...)

Defines a port on which a PowerExchange Listener listens for local or remote connections.

The positional parameters send_bufsize, receive_bufsize, send_msgsize, and receive_msgsize define the send and receive buffer and message sizes. If you do not specify values for these parameters, PowerExchange uses the operating system defaults, which vary based on operating system.

To maximize throughput, consider increasing the send and receive buffer and message sizes on the LISTENER statement on the source system. Contact your network administrator to determine the best values to use on your system.

Note: Do not specify values for the send and receive buffer and message sizes that exceed the TCP maximum receive buffer size.

NODE=(node_name,TCPIP,hostname,port,send_bufsize,receive_bufsize,send_msgsize,receive_msgsize,...)

Defines the port and IP information that PowerExchange uses to communicate with a remote PowerExchange Listener.

The positional parameters send_bufsize, receive_bufsize, send_msgsize, and receive_msgsize define the send and receive buffer and message sizes. If you do not specify values for these parameters, PowerExchange uses the operating system defaults, which vary based on operating system.
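The RSTRADV, LISTENER, and NODE statements described above might be combined as in the following sketch. All node names, the host name, the port, and the buffer sizes are hypothetical values chosen for illustration. The TYPE and CAPINAME keywords on the CAPI_CONNECTION statement are assumptions about the usual UOWC statement layout and are not spelled out in this section.

```
/* Hypothetical names and values, for illustration only
/* Source system: Listener with enlarged send and receive
/* buffer and message sizes
LISTENER=(node1,TCPIP,2480,262144,262144,262144,262144)
/* Target system: entry that points at the source Listener
NODE=(src_example,TCPIP,srchost.example.com,2480,262144,262144,262144,262144)
/* Return an empty UOW after 60 seconds without changes
CAPI_CONNECTION=(NAME=UOWC_EXAMPLE,TYPE=(UOWC,CAPINAME=SRC_EXAMPLE,RSTRADV=60))
```

The buffer and message sizes assumed here must stay within the TCP maximum receive buffer size of the operating system, so treat them as placeholders until a network administrator confirms suitable values.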

To maximize throughput, consider increasing the send and receive buffer and message sizes on the NODE statement on the target system. Contact your network administrator to determine the best values to use on your system.

Note: Do not specify values for the send and receive buffer and message sizes that exceed the TCP maximum receive buffer size.

TRACE=(trace_id,trace_level,99)

Defines PowerExchange diagnostic traces that Informatica Global Customer Support uses to solve problems with PowerExchange code.

TRACE statements can severely impact PowerExchange performance. You should use them only at the direction of Informatica Global Customer Support.

To enhance performance, remove or comment out all TRACE statements in the DBMOVER configuration files on all systems.

RELATED TOPICS:
- “Using Connection Options to Tune CDC Sessions”

Using Connection Options to Tune CDC Sessions

In PowerCenter, you can customize options on the PWX CDC connections to tune CDC sessions.

The following table describes the connection options that you can use to tune CDC sessions:

Compression
  Description: Select this option to compress source data during the PowerCenter session. Default is disabled.
  Tuning suggestion: Do not use compression.

Encryption Type
  Description: The type of data encryption that PowerExchange uses. Default is None.
  Tuning suggestion: Do not use encryption.

Image Type
  Description: Indicates whether PWXPC extracts after images (AI) only or both before and after images (BA) for change data. Default is BA.
  Tuning suggestion: Set to AI. If you specify AI, before images of the data can still be embedded in Update rows if you add DTL_BI columns to the extraction map. With DTL_BI columns, you can manipulate before-image data in the mappings. If you use the PowerExchange Logger for Linux, UNIX, and Windows and specified CAPT_IMAGE=BA in the pwxccl.cfg configuration file, you can set this option to AI or BA.

UOW Count
  Description: The number of UOWs that PWXPC reads from the source before it flushes the data buffer to commit the change data to the targets. Default is 1.
  Tuning suggestion: To improve efficiency on the PowerCenter Integration Service machine and the target databases, reduce commit processing.

Real-time Flush Latency in milli-seconds
  Description: The frequency, in milliseconds, with which PWXPC flushes the data buffer to commit the change data to the targets. Default is 0, which is equivalent to two seconds.
  Tuning suggestion: To improve efficiency on the PowerCenter Integration Service machine and the target databases, reduce commit processing.

PWX Latency in seconds
  Description: Select the maximum time, in seconds, that PowerExchange on the source platform waits for more change data before flushing data to PWXPC on the PowerCenter Integration Service platform. Default is 2.
  Tuning suggestion: Use the default value.

Maximum Rows Per commit
  Description: Maximum number of change records that PWXPC reads from the source before it flushes the data buffer to commit the change data to the targets. Default is 0, which means that PWXPC does not use maximum rows.
  Tuning suggestion: To improve efficiency on the PowerCenter Integration Service machine and the target databases, reduce commit processing.

Minimum Rows Per commit
  Description: Minimum number of change records that PowerExchange reads from the change stream before it passes any commit records to PWXPC. Default is 0, which means that PWXPC does not use minimum rows.
  Tuning suggestion: If your UOWs contain only a few changes, select a larger value for this option to increase the size of the UOWs.

Offload Processing
  Description: Select this option to request CDC offload processing. Default is No.
  Tuning suggestion: For more information about offload processing, see “CDC Offload and Multithreaded Processing”.

Worker Threads
  Description: If you select Offload Processing, you can also set this option to have PowerExchange use multiple threads for the offloaded processing on the PowerCenter Integration Service machine. Enter the number of threads that you want PowerExchange to use. Valid values are from 1 through 64. Default is 0, which means that PowerExchange does not use multithreaded processing.
  Tuning suggestion: For more information about offload processing, see “CDC Offload and Multithreaded Processing”.

Array Size
  Description: If the Worker Threads value is greater than zero, the size of the storage array, in number of records, for the threads. Valid values are from 25 through 100000. Default is 25.
  Tuning suggestion: Use 25. Warning: If you specify a large value, have large records, or run many sessions that use multithreaded processing, you might experience memory shortages on the PowerCenter Integration Service machine.

TCPIP Activity Timeout
  Description: Activity timeout. If no data, other than heartbeat data, is sent or received during this time interval (in seconds), PowerExchange aborts the connection and indicates a timeout error. A value of -1 means that no activity timeout is set.
  Tuning suggestion: For most applications, use the default of -1, specifying no activity timeout. Instead, PowerExchange will use heartbeat processing to detect failed connections.

For more information about connection options, see PowerExchange Interfaces for PowerCenter.

RELATED TOPICS:
- “Tuning Commit Processing”
- “CDC Offload and Multithreaded Processing”

Tuning Commit Processing

If the PowerCenter session log for a CDC session contains groups of PWXPC flush messages followed by groups of source-based commit messages from PowerCenter, the CDC session might be reading change data faster than the data can be processed and written to the targets. To resolve this issue, you can adjust the values that you set for the following commitment control options on the PWX CDC connection:
- UOW Count. If the session log contains mostly PWXPC_10081 flush messages, you might need to increase the value for this option.
- Real-time Flush Latency in milli-seconds. If the session log contains mostly PWXPC_10082 flush messages, you might need to increase the value for this option.
- Maximum Rows Per commit. If the session log contains mostly PWXPC_12128 flush messages, you might need to increase the value for this option.

PWXPC might also flush change data too frequently because the PWX CDC connection in the CDC session uses too many of the commitment control options. In this case, use a single option to control commit processing and disable the unused options.

If your change data has many small UOWs, you can use the Minimum Rows Per commit option to create larger UOWs of more uniform size. PowerExchange and PWXPC can process a few UOWs of larger size more efficiently than many small UOWs. By using the Minimum Rows Per commit option to increase the size of UOWs, you can improve CDC processing efficiency.

The following additional factors can also affect the efficiency with which change data is applied to the targets:
- Buffer Memory. The DTM Buffer Size and Default Buffer Block Size values can impact the performance of the CDC session. If you have enabled the collection of performance details in the CDC session, review the difference between performance counters 4.5 Time in PowerExchange Processing and 4.6 Elapsed Time. If the elapsed time is much larger than the PowerExchange processing time, buffer memory constraints might exist.
- Target database. The performance of the target database can impact the performance of the CDC session. Contact your database administrator to ensure that access to the database is optimized.

CDC Offload and Multithreaded Processing

You can use CDC offload processing with the following types of change data extractions:
- CDC sessions that use real-time extraction mode
- PowerExchange Logger for Linux, UNIX, and Windows

When you use CDC offload processing with real-time extractions, the change data remains on the source system and PowerExchange moves the column-level processing to the PowerCenter Integration Service machine that runs the CDC session. For MVS, DB2 for i5/OS, and Oracle sources, PowerExchange also moves the UOW Cleanser processing to the PowerCenter Integration Service machine.

When you use CDC offload processing with the PowerExchange Logger for Linux, UNIX, and Windows, PowerExchange does the following processing:
- Reads the change data from the source system and stores it in PowerExchange Logger log files
- For MVS, DB2 for i5/OS, and Oracle sources, moves the UOW Cleanser processing to the machine on which the PowerExchange Logger is running

The PowerExchange Logger stores the change data in log files on the Linux, UNIX, or Windows machine. CDC sessions can then use continuous extraction mode to extract the change data from the PowerExchange Logger log files instead of from the source system.

You can use multithreaded processing for CDC sessions that select offload processing. By default, PowerExchange uses a single thread to process change data on the PowerCenter Integration Service machine. When you select multithreaded processing, PowerExchange uses multiple threads to process the change records in each UOW.

Planning for CDC Offload and Multithreaded Processing

Before you configure CDC offload and multithreaded processing, review the following considerations, requirements, and restrictions.

Restrictions and Requirements for CDC Offload Processing

If you use CDC offload processing, certain restrictions and requirements apply.

Consider the following restrictions and requirements before implementing offload processing:
- You must configure CAPI_CONNECTION statements for the data source in the DBMOVER configuration file on the remote system. For real-time extraction mode, configure the CAPI_CONNECTION statements in the dbmover.cfg configuration file on the PowerCenter Integration Service machine. For the PowerExchange Logger for Linux, UNIX, and Windows, configure the CAPI_CONNECTION statements in the dbmover.cfg configuration file that the PowerExchange Logger uses.
- PowerExchange does not support CDC offload processing for capture registrations that have been created from data maps that use any of the following options:
  - User access methods
  - User-defined fields that invoke programs by using the CALLPROG function
  - Record-level exits
- To capture change data to PowerExchange Logger for Linux, UNIX, and Windows log files, you must configure capture registrations for partial condense processing. When you define the capture registration in the PowerExchange Navigator, select Part in the Condense list. If any of the capture registrations for z/OS or i5/OS data sources specify Full for the Condense option, the PowerExchange Logger ignores them.
- Each PowerExchange Logger for Linux, UNIX, and Windows process must read all of the capture registrations that it uses from a single CCT file on the remote system. Also, each PowerExchange Logger process must store the names of its log files in a unique CDCT file on the local system.
- PowerExchange does not invoke MVS RACF security authorization for change data extraction. Specifically, PowerExchange does not validate any CAPX.CND profiles for extracting change data when a workflow runs. However, PowerExchange does validate CAPX.REG profiles during PowerExchange Logger processing.
- For z/OS data sources, if you use offload processing with the PowerExchange Logger for Linux, UNIX, and Windows and a group definition file, do not include the SCHEMA statement in the group definition file. PowerExchange does not support SCHEMA statements for z/OS data sources.
- If you set the optional Idle Time attribute on the PWXPC connection, you must specify -1 or 0 as the attribute value. If you enter a value greater than 0, PWXPC uses 0.
- If you use batch extraction mode, Informatica recommends that you set the Idle Time connection attribute to 0 so that the workflow session ends when the end-of-log (EOL) is reached. With this configuration, you can leave the PowerExchange Logger running continuously.

Considerations for Multithreaded Processing

In specific situations, multithreaded processing might improve performance for a CDC session.

Before you configure multithreaded processing options, review the following considerations:
- Use multithreaded processing when the PWX reader thread of a CDC session uses 100% of a single CPU on a multi-CPU server on the PowerCenter Integration Service platform while processing change data. When a single CPU is consumed, spreading the PowerExchange processing across multiple threads improves throughput. Otherwise, additional threads do not improve throughput.
- If the network processing between the source and PowerCenter Integration Service machines is slow, try specifying 1 for the Worker Threads option to help improve throughput. When you specify one or more worker threads, PowerExchange overlaps network processing with the processing of the change data on the PowerCenter Integration Service machine.
- For optimal performance, the value for the Worker Threads option should not exceed the number of installed or available processors on the PowerCenter Integration Service machine.

Enabling Offload and Multithreaded Processing for CDC Sessions

To use CDC offload processing and multithreaded processing, you must configure connection options in the CDC session and CAPI_CONNECTION statements in the PowerExchange DBMOVER configuration file.

To enable CDC offload and multithreaded processing for CDC sessions:

1. Configure the following options on the PWX CDC Real Time application connection for the CDC session:

   Location
     Specifies the node name of the system on which the change data resides. This node name must be the name of a NODE statement in the dbmover.cfg configuration file on the PowerCenter Integration Service machine.

   Offload Processing
     Specifies whether to use CDC offload processing to move PowerExchange processing for the change data from the source system to the PowerCenter Integration Service machine. Select one of the following values:
     - No
     - Yes
     - Auto. PowerExchange determines whether to use offload processing.
     Default is No.

   Worker Threads
     When you select CDC offload processing, specifies the number of threads that PowerExchange uses on the PowerCenter Integration Service machine to process change data. You must also enter a value for the Array Size. Default is 0.

   Array Size
     If the Worker Threads value is greater than zero, specifies the size of the storage array, in numbers of records, for each thread. Default is 25.

   CAPI Connection Name
     Specifies the name of the source CAPI_CONNECTION statement in the dbmover.cfg on the PowerCenter Integration Service machine.

2. Copy the CAPI_CONNECTION statements from the DBMOVER configuration file on the source system to the dbmover.cfg configuration file on the PowerCenter Integration Service machine. For MVS sources, remove all MVS-specific parameters from the UOWC CAPI_CONNECTION statement.

Use the following table to select the correct CAPI_CONNECTION statement types to configure, based on source type:

CDC Source Type                      CAPI_CONNECTION Statements
DB2 for i5/OS                        AS4J and UOWC
DB2 for Linux, UNIX, and Windows     UDB
Microsoft SQL Server                 MSQL
MVS sources                          LRAP and UOWC
Oracle                               ORCL and UOWC

Configuring PowerExchange to Capture Change Data on a Remote System

You can use CDC offload processing with the PowerExchange Logger for Linux, UNIX, and Windows to capture change data from source systems other than the system where the PowerExchange Logger runs. With CDC offload processing, a PowerExchange Logger for Linux, UNIX, and Windows can capture change data from i5/OS and MVS systems as well as from other Linux, UNIX, or Windows systems. CDC sessions use continuous extraction mode to extract the change data from the PowerExchange Logger log files instead of from the source system.

You must first install PowerExchange on the remote Linux, UNIX, or Windows system. Before you start a PowerExchange Logger for Linux, UNIX, and Windows process on a remote system, configure the pwxccl.cfg and the dbmover.cfg configuration files on that system. When you use CDC offload processing, each PowerExchange Logger must have unique pwxccl.cfg and dbmover.cfg configuration files.

To extract the change data from the PowerExchange Logger on the remote system, you must also configure and start a PowerExchange Listener on that system. The dbmover.cfg file that the PowerExchange Listener uses must specify the same CAPT_PATH value as the dbmover.cfg file that the PowerExchange Logger uses. Alternatively, you can use the same dbmover.cfg file for the PowerExchange Logger and the PowerExchange Listener.

The following steps describe how to configure a PowerExchange Logger and PowerExchange Listener to offload change data from source systems and capture that data to PowerExchange Logger log files on Linux, UNIX, or Windows.

RELATED TOPICS:
- “Extracting Change Data Captured on a Remote System”

Configuring pwxccl.cfg

Configure the pwxccl.cfg configuration file for the PowerExchange Logger on the remote system where the PowerExchange Logger will run. PowerExchange provides a sample pwxccl.cfg file in the PowerExchange installation directory, which you can copy and then edit.

For CDC offload processing, customize the following parameters:

CAPTURE_NODE

Specifies the node name of the system on which the change data was originally captured. This node name must match the node name in a NODE statement in the dbmover.cfg configuration file that the PowerExchange Logger uses.

CAPTURE_NODE_EPWD

Specifies an encrypted password for the CAPTURE_NODE_UID user ID. If you specify CAPTURE_NODE_UID, you must specify a password for that user ID by using either CAPTURE_NODE_EPWD or CAPTURE_NODE_PWD. If you specify CAPTURE_NODE_EPWD, do not also specify CAPTURE_NODE_PWD.

Tip: You can create an encrypted password in the PowerExchange Navigator by selecting File > Encrypt Password.

CAPTURE_NODE_PWD

Specifies a clear text password for the CAPTURE_NODE_UID user ID. If you specify CAPTURE_NODE_UID, you must specify a password for that user ID by using either CAPTURE_NODE_EPWD or CAPTURE_NODE_PWD. If you specify CAPTURE_NODE_PWD, do not also specify CAPTURE_NODE_EPWD.

CAPTURE_NODE_UID

Specifies a user ID that permits PowerExchange to read capture registrations and change data on the remote node that is specified in the CAPTURE_NODE parameter. Whether this parameter is required depends on the operating system of the remote node and the SECURITY setting in the DBMOVER configuration file for the PowerExchange Listener on that node.

If the CAPTURE_NODE is an MVS or i5/OS system with a SECURITY setting of 1 or 2, you must specify a valid operating system user ID. If the SECURITY setting is 2, PowerExchange uses the specified user ID to control access to capture registrations and change data. However, if the SECURITY setting is 1, PowerExchange uses the user ID under which the PowerExchange Listener job runs.

If the CAPTURE_NODE is an MVS or i5/OS system with a SECURITY setting of 0, do not specify this parameter. PowerExchange uses the user ID under which the PowerExchange Listener job runs to control access to capture registrations and change data.

If the CAPTURE_NODE is a Linux, UNIX, or Windows system, specify a user ID that is valid for the data source type:
- For a DB2 for Linux, UNIX, or Windows source, enter a valid operating system user ID that has DB2 DBADM or SYSADM authority.
- For an Oracle source, enter a database user ID that permits access to Oracle redo logs and Oracle LogMiner.
- For a SQL Server instance that uses SQL Server Authentication, enter a database user ID that permits access to the SQL Server distribution database. For a SQL Server instance that uses Windows Authentication, PowerExchange uses the user ID under which the PowerExchange Listener was started. In this case, do not specify this parameter unless you want to specify another user.

CHKPT_BASENAME

Specifies an existing path and base file name to use for generating the PowerExchange Logger checkpoint files.

CONDENSENAME

Optional. Specifies a name for the command-handling service for a PowerExchange Logger for Linux, UNIX, and Windows process to which you issue pwxcmd commands.

This service name must match the service name in the associated SVCNODE statement in the DBMOVER configuration file.

CONN_OVR

Specifies the name of the CAPI_CONNECTION statement in the dbmover.cfg file that the PowerExchange Logger uses. This CAPI_CONNECTION statement defines the connection to the change stream for the data source type.

For data sources that include UOW Cleanser (UOWC) CAPI_CONNECTION statements, specify the name of this statement. For all other data sources, specify the CAPI_CONNECTION name for the data source type.

DB_TYPE

Specifies the data source type.

Use the following table to select the correct DB_TYPE to configure, based on source type:

CDC Source Type                      DB_TYPE Value
Adabas                               ADA
Datacom                              DCM
DB2 for i5/OS                        AS4
DB2 for Linux, UNIX, and Windows     UDB
DB2 for z/OS                         DB2
IDMS log-based                       IDL
IMS                                  IMS
Microsoft SQL Server                 MSS
Oracle                               ORA
VSAM                                 VSM

DBID

Specifies the source collection identifier that is defined in the registration group. The PowerExchange Navigator displays this value in the Resource Inspector when you open the registration group. When used with DB_TYPE, it defines selection criteria for capture registrations in the CCT file.

Use the following table to select the correct DBID value, based on source type:

Adabas
  The Instance name that is displayed for the registration group in the PowerExchange Navigator.

Datacom
  One of the following values:
  - The MUF Name value that is displayed for the registration group in the PowerExchange Navigator.
  - For Datacom synchronous CDC, the MUF parameter value in the DTLINPUT data set specified in the MUF JCL.
  - For Datacom table-based CDC, the REG_MUF parameter value in the ECCRDCMP member of the RUNLIB library.

DB2 for i5/OS
  One of the following values:
  - The Instance name that is displayed for the registration group in the PowerExchange Navigator.
  - The INST parameter value in the AS4J CAPI_CONNECTION statement in the DBMOVER member of the CFG file.

DB2 for Linux, UNIX, and Windows
  The Database name that is displayed for the registration group in the PowerExchange Navigator.

DB2 for z/OS
  One of the following values:
  - The Instance name that is displayed for the registration group in the PowerExchange Navigator.
  - The RN parameter value from the DB2 statement in the REPDB2OP member of the RUNLIB library.

IDMS Log-based
  One of the following values:
  - The Logsid value that is displayed for the registration group in the PowerExchange Navigator.
  - The LOGSID parameter value in the ECCRIDLP member of the RUNLIB library.

IMS
  One of the following values:
  - The IMSID value that is displayed for the registration group in the PowerExchange Navigator.
  - For IMS log-based CDC, the first parameter of the IMSID statement in the CAPTIMS member of the RUNLIB library.

Microsoft SQL Server
  The Instance name that is displayed for the registration group in the PowerExchange Navigator.

Oracle
  The Instance name that is displayed for the registration group in the PowerExchange Navigator.

VSAM
  The Instance name that is displayed for the registration group in the PowerExchange Navigator.

EPWD

A deprecated parameter. Use CAPTURE_NODE_EPWD instead. If both CAPTURE_NODE_EPWD and EPWD are specified, CAPTURE_NODE_EPWD takes precedence.

EXT_CAPT_MASK

Specifies an existing path and unique prefix to be used for generating the PowerExchange Logger log files.
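To show how the pwxccl.cfg parameters described so far fit together, the following fragment is a minimal sketch for a remote Oracle source. Every value is hypothetical: the paths, node name, user ID, collection identifier, and service name are placeholders, and the encrypted password is left as a placeholder rather than invented. The use of /* comment lines in pwxccl.cfg is also an assumption carried over from the DBMOVER file format.

```
/* Hypothetical pwxccl.cfg values, for illustration only
/* Source collection identifier and source type (Oracle)
DBID=ORACOLL1
DB_TYPE=ORA
/* Oracle uses a UOWC statement, so CONN_OVR names that statement
CONN_OVR=UOWC_EXAMPLE
/* Node where the change data was originally captured;
/* must match a NODE statement in the Logger dbmover.cfg
CAPTURE_NODE=src_example
CAPTURE_NODE_UID=capture_user
/* Use either CAPTURE_NODE_EPWD or CAPTURE_NODE_PWD, not both
CAPTURE_NODE_EPWD=<encrypted password created in the PowerExchange Navigator>
/* Existing paths plus a base name or prefix for generated files
CHKPT_BASENAME=/pwx/logger/chkpt/oralog
EXT_CAPT_MASK=/pwx/logger/data/oralog
/* Optional: service name for pwxcmd, matched by an SVCNODE statement
CONDENSENAME=PWXCCL1
```

The sketch uses the current CAPTURE_NODE_* parameter names rather than the deprecated EPWD parameter, since the current names take precedence when both forms are present.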

PWD

A deprecated parameter. Use CAPTURE_NODE_PWD instead. If both CAPTURE_NODE_PWD and PWD are specified, CAPTURE_NODE_PWD takes precedence.

RESTART_TOKEN and SEQUENCE_TOKEN

Optional. Specifies a restart point for starting change data processing when the PowerExchange Logger is cold started.

The format of the restart tokens varies based on data source type and, if specified, must match the format required by the DB_TYPE specified. If you do not specify these parameters, the PowerExchange Logger uses the end of the change stream as the restart point when cold started.

UID

A deprecated parameter. Use CAPTURE_NODE_UID instead. If both CAPTURE_NODE_UID and UID are specified, CAPTURE_NODE_UID takes precedence.

RELATED TOPICS:
- “PowerExchange Logger for Linux, UNIX, and Windows”

Configuring dbmover.cfg on the PowerExchange Logger Machine

On the remote system where the PowerExchange Logger will run, configure the dbmover.cfg file that the PowerExchange Logger and PowerExchange Listener will use.

Note: Unless the change data is captured on the PowerCenter Integration Service machine, you must run a PowerExchange Listener so CDC sessions can extract the offloaded change data.

The dbmover.cfg file that the PowerExchange Listener uses must specify the same CAPT_PATH value as the dbmover.cfg that the PowerExchange Logger uses. Alternatively, you can use the same dbmover.cfg configuration file for the PowerExchange Logger and PowerExchange Listener. This step assumes that you use the same dbmover.cfg file.

PowerExchange provides a sample dbmover.cfg file in the PowerExchange installation directory, which you can copy and then edit.

For CDC offload processing, set the following parameters:

CAPT_PATH

Specifies the path to the directory where the CDCT file resides. The CDCT file contains information about the PowerExchange Logger log files, such as file names and number of records.

Each PowerExchange Logger that uses CDC offload processing to capture change data requires its own CDCT file.

CAPX CAPI_CONNECTION

Specifies parameters for continuous extraction of change data from PowerExchange Logger log files. In continuous extraction mode, extractions run in near real time and read the data in the PowerExchange Logger log files as the change stream.

In the DFLTINST parameter of the CAPX CAPI_CONNECTION, specify the DBID value from the PowerExchange Logger pwxccl.cfg configuration file.

LOGPATH

Specifies the path to the PowerExchange log files that contain PowerExchange Logger messages.

NODE

Specifies the TCP/IP connection information for a PowerExchange Listener.

With the AI setting. and Windows log files Configuring Capture Registrations for the PowerExchange Logger For the PowerExchange Logger on Linux.cfg on the PowerCenter Integration Service Machine In the dbmover. Source-specific CAPI_CONNECTION Specifies CAPI parameters that are specific to the data source type and that PowerExchange uses to connect to the change stream. UNIX. remove MVS-specific parameters from the UOWC CAPI_CONNECTION statement. PowerExchange generates corresponding extraction maps.Configure a NODE statement for the system on which the change data was originally captured. UNIX. capture registrations for the remote source must specify Part for the Condense option. TRACING Optional. the PowerExchange Logger captures after images only. By using alternative logging.cfg configuration file on the PowerCenter Integration Service machine. Enables alternative logging. UNIX. SVCNODE Optional. and Windows to capture change data from a remote system.cfg configuration file. If capture registrations do not specify Part for the Condense option. you can separate PowerExchange Logger messages from other PowerExchange messages. Tip: Do not add DTL_BI or DTL_CI columns to the extraction maps if you set the CAPT_IMAGE parameter to AI in the pwxccl. Use the following table to select the correct CAPI_CONNECTION statement types to configure. add a NODE statement for the PowerExchange Listeners that run on the following systems: ¨ The system where the change data was originally captured and where the capture registrations reside ¨ The system where the change data is stored in PowerExchange Logger for Linux. Copy the CAPI_CONNECTION statements from the DBMOVER configuration file on the source system where the change data resides. Then create the capture registrations again. and Windows Microsoft SQL Server MVS sources Oracle CAPI_CONNECTION Statements AS4J and UOWC UDB MSQL LRAP and UOWC ORCL and UOWC For MVS sources. and Windows process listens for pwxcmd commands. 
You can edit the PowerExchange-generated extraction maps or create additional ones. Configuring dbmover. CDC Offload and Multithreaded Processing 167 . delete the capture registrations and corresponding extraction maps. based on source type: CDC Source Type DB2 for i5/OS DB2 for Linux. Specify the node name for this statement in the CAPTURE_NODE parameter of the PowerExchange Logger pwxccl. UNIX.cfg configuration file. Specifies the TCP/IP port on which a command-handling service for a PowerExchange Listener or PowerExchange Logger for Linux.
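The source-type table in this section can be expressed as a small lookup that checks whether a dbmover.cfg defines the statement types a source needs. The following Python sketch is illustrative only and is not part of PowerExchange. The function names are hypothetical, and the parser assumes the single-line CAPI_CONNECTION form and the /* comment lines used in this guide's examples.

```python
import re

# Statement types required per CDC source type, copied from the table above.
REQUIRED_CAPI_TYPES = {
    "DB2 for i5/OS": {"AS4J", "UOWC"},
    "DB2 for Linux, UNIX, and Windows": {"UDB"},
    "Microsoft SQL Server": {"MSQL"},
    "MVS sources": {"LRAP", "UOWC"},
    "Oracle": {"ORCL", "UOWC"},
}

# Assumption: matches only the single-line form used in this guide's examples.
CAPI_RE = re.compile(r"^CAPI_CONNECTION=\(NAME=(\w+),TYPE=\((\w+)", re.IGNORECASE)


def capi_types(cfg_text):
    """Return {statement NAME: TYPE} for each CAPI_CONNECTION line."""
    found = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if line.startswith("/*"):  # comment line, as in the examples
            continue
        match = CAPI_RE.match(line)
        if match:
            found[match.group(1)] = match.group(2).upper()
    return found


def missing_capi_types(cfg_text, source_type):
    """Return the required statement types that the text does not define."""
    present = set(capi_types(cfg_text).values())
    return REQUIRED_CAPI_TYPES[source_type] - present


sample = """\
/* UOW Cleanser
CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
/* Oracle CDC
CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))
"""

print(missing_capi_types(sample, "Oracle"))       # nothing is missing
print(missing_capi_types(sample, "MVS sources"))  # LRAP is not defined
```

A check like this only confirms that the expected statement types are present. It does not validate the parameters inside each statement.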

Starting the PowerExchange Logger and PowerExchange Listener
Start the PowerExchange Logger and PowerExchange Listener on the remote system that will capture the change data.

Note: If the remote system also runs the PowerCenter Integration Service, you can use local mode to extract the data instead of a PowerExchange Listener.

Extracting Change Data Captured on a Remote System
After you have captured change data on a remote system in the PowerExchange Logger for Linux, UNIX, and Windows log files, you can use continuous extraction mode to extract the change data in a CDC session.

In the CDC session, select the appropriate PWX CDC Real Time connection for the source type. For example, if you captured change data for a DB2 for Linux, UNIX, and Windows source to PowerExchange Logger log files on a remote system, use a PWX DB2LUW CDC Real Time connection to extract the data.

Customize the following connection options to extract offloaded change data:
- Location. Specify the node name for the PowerExchange Listener that runs on the remote system where the change data was stored in PowerExchange Logger log files.
- Map Location. Specify the node name for the PowerExchange Listener that runs on the source system where the change data was originally captured. The PowerExchange Listener on the original source system stores the capture registrations.
- Map Location User and Map Location Password. Specify a user ID and password that can access the capture registrations for the change data. If the PowerExchange Listener on the source system is running on MVS or i5/OS and is configured with security, specify a valid operating system user ID. You do not need to specify this parameter if the PowerExchange Listener is running without security. If the PowerExchange Listener on the data source system is running on Linux, UNIX, or Windows, specify a valid database user ID.
- CAPI Connection Name Override. Specify the name of the CAPX CAPI_CONNECTION in the dbmover.cfg configuration file used by the PowerExchange Listener on the remote system where the change data is stored in PowerExchange Logger log files.

For more information about configuring PWX CDC Real Time application connections, see PowerExchange Interfaces for PowerCenter.

Configuration File Examples for CDC Offload Processing
The following examples show the configuration required for CDC offload processing.

Extracting Change Data from Oracle Using CDC Offload Processing - Example
In this example, a CDC session that uses real-time connections to extract change data from an Oracle source is changed to use CDC offload processing. The source change data remains on the Oracle system but all column-level and UOW Cleanser processing is moved to the PowerCenter Integration Service machine.

The Oracle system has the following CAPI_CONNECTION statements in the dbmover.cfg configuration file that the PowerExchange Listener uses to read change data:

    /* UOW Cleanser
    CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
    /* Oracle CDC
    CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))
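In the Oracle example statements, the UOWC statement's CAPINAME value (CAPIORA) matches the NAME of the ORCL statement. Assuming that this pairing is required whenever the two statements are copied together, the following Python sketch reports UOWC statements whose CAPINAME does not resolve. The function name is hypothetical, and the parser handles only the single-line statement form shown in this guide.

```python
import re

# Assumption: statements appear one per line, in the form shown in the examples.
STATEMENT_RE = re.compile(r"^CAPI_CONNECTION=\(NAME=(\w+),TYPE=\((\w+)(.*)$")
CAPINAME_RE = re.compile(r"CAPINAME=(\w+)")


def unresolved_capinames(cfg_text):
    """Return {UOWC statement name: CAPINAME} for references with no target."""
    names = set()
    links = {}
    for line in cfg_text.splitlines():
        match = STATEMENT_RE.match(line.strip())
        if not match:
            continue  # comment lines and other statements are ignored
        name, stype, rest = match.groups()
        names.add(name)
        if stype.upper() == "UOWC":
            ref = CAPINAME_RE.search(rest)
            if ref:
                links[name] = ref.group(1)
    return {name: target for name, target in links.items() if target not in names}


copied = """\
/* UOW Cleanser
CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
/* Oracle CDC
CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))
"""

# Simulate copying only the UOWC statement and forgetting the ORCL statement.
uowc_only = "\n".join(copied.splitlines()[:2])

print(unresolved_capinames(copied))
print(unresolved_capinames(uowc_only))
```

Running a check like this on the target dbmover.cfg after copying statements between machines can catch a statement that was left behind.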

To extract change data from Oracle using CDC offload processing:
1. Configure the dbmover.cfg configuration file on the PowerCenter Integration Service machine for CDC offload processing. Copy the UOWC and ORCL CAPI_CONNECTION statements from the dbmover.cfg file on the Oracle system to the dbmover.cfg configuration file on the PowerCenter Integration Service machine.
   In this example, the following CAPI_CONNECTION statements are copied into the dbmover.cfg:

       CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
       CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))

2. Stop the CDC session.
3. Update the following options on the PWX CDC Real Time application connection in the CDC session:
   - Select Yes for the Offload Processing option.
   - In the CAPI Connection Name option, specify the name of the UOWC CAPI_CONNECTION statement. In this example, the name is UOWCORA.
4. Restart the CDC session.

Capturing and Extracting Change Data from a Remote UNIX System - Example
In this example, change data for Oracle sources is captured by the PowerExchange Logger for Linux, UNIX, and Windows on a different UNIX system from where the Oracle instance runs. The Oracle sources are registered for capture on the UNIX system where the Oracle instance runs. A CDC session then extracts the change data for the Oracle sources from PowerExchange Logger log files on the remote UNIX system, rather than from the system where the change data was originally captured.

The original UNIX system has the following CAPI_CONNECTION statements in the dbmover.cfg file that the PowerExchange Listener uses to read change data:

    /* UOW Cleanser
    CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
    /* Oracle CDC
    CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))

The instance name used to register the Oracle tables for capture on the original UNIX system is called PRODORA.

The following procedure assumes that PowerExchange is installed and configured on the remote UNIX system where the PowerExchange Logger for Linux, UNIX, and Windows will run.

To capture and extract change data from a remote UNIX system:
1. Configure the PowerExchange Logger for Linux, UNIX, and Windows on the remote UNIX system by completing the following steps:
   - Configure pwxccl.cfg.
   - Configure dbmover.cfg on the PowerExchange Logger machine.
   In this example, the dbmover.cfg on the remote UNIX system has the following parameters:

       /*
       /* dbmover.cfg
       /*
       LISTENER=(unix1,TCPIP,2480)
       NODE=(ORA2,TCPIP,prodora2,2480)
       logpath=/pwx/logs/oracond
       CAPT_XTRA=/pwx/capture/oracond/camaps
       CAPT_PATH=/pwx/capture/oracond
       ORACLEID=(PRODORA,ORAINST2,ORAINST2,ORAINST2)
       /*
       /* Source-specific CAPI Connection
       CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
       CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))
       /*
       /* CAPX CAPI Connection for continuous extraction
       CAPI_CONNECTION=(NAME=CAPXORA,TYPE=(CAPX,DFLTINST=PRODORA,FILEWAIT=60,RSTRADV=600))

   In this example, the pwxccl.cfg file has the following parameters:

       /*
       /* pwxccl.cfg
       /*
       DBID=PRODORA
       DB_TYPE=ORA
       CONN_OVR=UOWCORA
       CAPTURE_NODE=ORA2
       CAPTURE_NODE_UID=orauser
       CAPTURE_NODE_PWD=orapwd
       EXT_CAPT_MASK=/pwx/capture/oracond/condense
       CHKPT_NUM=3
       CHKPT_BASENAME=/pwx/capture/oracond/condense.chkpt
       COND_CDCT_RET_P=50
       COLL_END_LOG=0
       NO_DATA_WAIT=1
       NO_DATA_WAIT2=2
       FILE_SWITCH_VAL=200000
       FILE_SWITCH_CRIT=R
       CAPT_IMAGE=BA
       SIGNALLING=N
       UID=orauser
       PWD=orapwd
       VERBOSE=Y

2. After you configure the dbmover.cfg and the pwxccl.cfg configuration files, start the PowerExchange Listener and PowerExchange Logger on the remote UNIX system.
3. On the PowerCenter Integration Service machine, customize the following statements:
   - NODE statement to point to the PowerExchange Listener on the remote UNIX system.
   - NODE statement to point to the PowerExchange Listener on the original UNIX system.
   In this example, the following statements are added to the dbmover.cfg on the PowerCenter Integration Service machine:

       NODE=(unix1,TCPIP,unix1,2480)
       NODE=(ORA2,TCPIP,prodora2,2480)

4. Create and configure the PowerCenter mapping, session, and workflow to extract the change data. To extract the change data from the remote UNIX system, configure a PWX Oracle CDC Real Time application connection in the CDC session. In this example, specify the following options to point to the remote UNIX system for the change data, the original UNIX system for the extraction maps, and the CAPX CAPI_CONNECTION name to use continuous extraction mode:
   - For the Location option, specify unix1, which is where the PowerExchange Logger runs.
   - For the Map Location option, specify ORA2, which is where the Oracle instance runs and the tables are registered for capture.
   - For the Map Location User option, specify a valid Oracle user ID.
   - For the Map Location Password option, specify the password for the Oracle user ID.
   - For the CAPI Connection Name option, specify CAPXORA.
5. Cold start the CDC session to extract the change data from the PowerExchange Logger log files on the remote UNIX system.
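The remote UNIX example depends on names that must agree across two files. The guide states that the CAPTURE_NODE value in pwxccl.cfg names a NODE statement in dbmover.cfg. The sketch below also assumes, based only on the example values, that CONN_OVR names a CAPI_CONNECTION statement. The following Python code is a minimal illustration with hypothetical function names and trimmed copies of the example files. It is not a PowerExchange utility.

```python
import re


def parse_simple(cfg_text):
    """Parse KEY=VALUE lines and skip blank lines and '/*' comment lines."""
    params = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("/*"):
            continue
        key, _, value = line.partition("=")
        params[key.upper()] = value
    return params


def node_names(dbmover_text):
    """Collect the node names defined by NODE statements."""
    return set(re.findall(r"^NODE=\((\w+),", dbmover_text, re.MULTILINE))


def capi_names(dbmover_text):
    """Collect the NAME values of CAPI_CONNECTION statements."""
    return set(re.findall(r"^CAPI_CONNECTION=\(NAME=(\w+),", dbmover_text, re.MULTILINE))


def check_logger_config(pwxccl_text, dbmover_text):
    """Return a list of cross-file naming problems (empty when consistent)."""
    params = parse_simple(pwxccl_text)
    problems = []
    if params.get("CAPTURE_NODE") not in node_names(dbmover_text):
        problems.append("CAPTURE_NODE has no matching NODE statement")
    # Assumption inferred from the example: CONN_OVR names a CAPI_CONNECTION.
    if params.get("CONN_OVR") not in capi_names(dbmover_text):
        problems.append("CONN_OVR has no matching CAPI_CONNECTION statement")
    return problems


DBMOVER = """\
LISTENER=(unix1,TCPIP,2480)
NODE=(ORA2,TCPIP,prodora2,2480)
CAPI_CONNECTION=(NAME=UOWCORA,TYPE=(UOWC,CAPINAME=CAPIORA,RSTRADV=600))
CAPI_CONNECTION=(NAME=CAPIORA,TYPE=(ORCL,catint=120,ORACOLL=PRODORA))
"""

PWXCCL = """\
/* pwxccl.cfg
DBID=PRODORA
DB_TYPE=ORA
CONN_OVR=UOWCORA
CAPTURE_NODE=ORA2
"""

print(check_logger_config(PWXCCL, DBMOVER))
```

With the example values, the check returns an empty list. Changing CAPTURE_NODE to a name that has no NODE statement produces one reported problem.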
