Informatica PowerCenter 8

Level II Developer
Lab Guide
Version - PC8LIID 20060910

Informatica PowerCenter Level II Developer Lab Guide


Version 8.1
September 2006

Copyright (c) 1998-2006 Informatica Corporation.


All rights reserved. Printed in the USA.
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions
on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or
transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as
provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR
52.227-14 (ALT III), as applicable.
The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing.
Informatica Corporation does not warrant that this documentation is error-free. Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX,
and SuperGlue are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other
company and product names may be trade names or trademarks of their respective owners.
Portions of this software are copyrighted by DataDirect Technologies, 1999-2002.
Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University and
University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.
Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU Lesser General Public
License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials are provided free of charge by Informatica,
as-is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular
purpose.
Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration is a registered trademark of Meta Integration
Technology, Inc.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The Apache Software is Copyright (c) 1999-2005 The
Apache Software Foundation. All rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit and redistribution of this software is subject to terms available at
http://www.openssl.org. Copyright 1998-2003 The OpenSSL Project. All Rights Reserved.
The zlib library included with this software is Copyright (c) 1995-2003 Jean-loup Gailly and Mark Adler.
The Curl license provided with this Software is Copyright 1996-200, Daniel Stenberg, <Daniel@haxx.se>. All Rights Reserved.
The PCRE library included with this software is Copyright (c) 1997-2001 University of Cambridge. Regular expression support is provided by the PCRE library
package, which is open source software, written by Philip Hazel. The source for this library may be found at ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre.
InstallAnywhere is Copyright 2005 Zero G Software, Inc. All Rights Reserved.
Portions of the Software are Copyright (c) 1998-2005 The OpenLDAP Foundation. All rights reserved. Redistribution and use in source and binary forms, with or
without modification, are permitted only as authorized by the OpenLDAP Public License, available at http://www.openldap.org/software/release/license.html.
This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775 and other U.S. Patents Pending.
DISCLAIMER: Informatica Corporation provides this documentation as is without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information provided in this
documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or changes in the products described in this
documentation at any time without notice.

Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
About This Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Obtaining Informatica Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Visiting the Informatica Developer Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Visiting the Informatica Knowledge Base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Obtaining Informatica Professional Certification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Providing Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x

Lab 1: Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Step 1: Create Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Step 2: Preview Target Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Step 3: View Source Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Step 4: Create Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Step 5: Run Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Step 6: Verify Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Step 7: Verify Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Lab 2: Workflow Alerts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 1: Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 2: Mappings Required . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 3: Reusable Sessions Required . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 4: Create a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 5: Create a Worklet in the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 6: Create a Timer Task in the Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Step 7: Create an E-Mail Task in the Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Step 8: Create a Control Task in the Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Step 9: Add Reusable Session to the Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Step 10: Link Tasks in Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Step 11: Add Reusable Session to the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Step 12: Link Tasks in Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Step 13: Run Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


Lab 3: Dynamic Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 1: Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 2: Mapping Required . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 3: Copy Reusable Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 4: Create Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 5: Create Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 6: Add Session to Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Step 7: Create a Timer Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Step 8: Create an Assignment Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Step 9: Link Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Step 10: Run Workflow by Editing the Workflow Schedule . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Step 11: Monitor the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Lab 4: Recover a Suspended Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Step 1: Copy the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Step 2: Edit the Workflow and Session for Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Step 3: Edit the Session to Cause an Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Step 4: Run the Workflow, Fix the Session, and Recover the Workflow . . . . . . . . . . . . . . . . . . 23

Lab 5: Using the Transaction Control Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 27


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Step 1: Create Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Step 2: Create Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Step 3: Run Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Step 4: Verify Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Step 5: Verify Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Lab 6: Error Handling with Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Step 1: Create Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Step 2: Create Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Step 3: Run Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Step 4: Verify Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Step 5: Verify Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Lab 7: Handling Fatal and Non-Fatal Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Step 1: Create Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Step 2: Create Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Step 3: Run Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

Step 4: Verify Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


Step 5: Verify Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

Lab 8: Repository Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Step 1: Create a Query to Search for Targets with "Customer" . . . . . . . . . . . . . . . . . . . 50
Step 2: Validate, Save, and Run the Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Step 3: Create A Query to Search For Mapping Dependencies . . . . . . . . . . . . . . . . . . . . . . . . 52
Step 4: Validate, Save, and Run the Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Step 5: Modify and Run the Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Step 6: Run the Query Accessed by the Repository Manager . . . . . . . . . . . . . . . . . . . . . . . . . 54
Step 7: Create Your Own Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Lab 9: Performance and Tuning Workshop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57


Workshop Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Workshop Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Establish ETL Baseline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Documented Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

Lab 10: Partitioning Workshop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67


Workshop Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Scenario 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Scenario 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Scenario 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Scenario 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Answers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73


Preface

Welcome to PowerCenter, Informatica's software product that delivers an open, scalable data integration
solution addressing the complete life cycle for all data integration projects including data warehouses and
data marts, data migration, data synchronization, and information hubs. PowerCenter combines the latest
technology enhancements for reliably managing data repositories and delivering information resources in
a timely, usable, and efficient manner.
The PowerCenter metadata repository coordinates and drives a variety of core functions, including
extracting, transforming, loading, and managing data. The Integration Service can extract large volumes
of data from multiple platforms, handle complex transformations on the data, and support high-speed
loads. PowerCenter can simplify and accelerate the process of moving data warehouses from development
to test to production.


About This Guide


Welcome to the PowerCenter 8 Level II Developer course.
This course is designed for data integration and data warehousing implementers. You should be familiar
with PowerCenter, data integration and data warehousing terminology, and Microsoft Windows.

Document Conventions
This guide uses the following formatting conventions:
If you see: >
It means: Indicates a submenu to navigate to.
Example: Click Repository > Connect. In this example, you should click the Repository menu or button and choose Connect.

If you see: boldfaced text
It means: Indicates text you need to type or enter.
Example: Click the Rename button and name the new source definition S_EMPLOYEE.

If you see: UPPERCASE
It means: Database tables and column names are shown in all UPPERCASE.
Example: T_ITEM_SUMMARY

If you see: italicized text
It means: Indicates a variable you must replace with specific information.
Example: Connect to the Repository using the assigned login_id.

If you see: Note:
It means: The following paragraph provides additional facts.
Example: Note: You can select multiple objects to import by using the Ctrl key.

If you see: Tip:
It means: The following paragraph provides suggested uses or a Velocity best practice.
Example: Tip: The m_ prefix for a mapping name is a Velocity best practice.


Other Informatica Resources


In addition to the student guides, Informatica provides these other resources:

Informatica Documentation

Informatica Customer Portal

Informatica web site

Informatica Developer Network

Informatica Knowledge Base

Informatica Professional Certification

Informatica Technical Support

Obtaining Informatica Documentation


You can access Informatica documentation from the product CD or online help.

Visiting Informatica Customer Portal


As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access
to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base,
and access to the Informatica user community.

Visiting the Informatica Web Site


You can access Informatica's corporate web site at http://www.informatica.com. The site contains
information about Informatica, its background, upcoming events, and the location of your closest sales office.
You will also find product information, as well as literature and partner information. The services area of
the site includes important information on technical support, training and education, and
implementation services.

Visiting the Informatica Developer Network


The Informatica Developer Network is a web-based forum for third-party software developers. You can
access the Informatica Developer Network at the following URL:
http://devnet.informatica.com

The site contains information on how to create, market, and support customer-oriented add-on solutions
based on interoperability interfaces for Informatica products.

Visiting the Informatica Knowledge Base


As an Informatica customer, you can access the Informatica Knowledge Base at http://my.informatica.com. The Knowledge Base lets you search for documented solutions to known technical
issues about Informatica products. It also includes frequently asked questions, technical white papers, and
technical tips.

Obtaining Informatica Professional Certification


You can take, and pass, exams provided by Informatica to obtain Informatica Professional Certification.
For more information, go to:
http://www.informatica.com/services/education_services/certification/default.htm

Providing Feedback
Email any comments on this guide to aconlan@informatica.com.

Obtaining Technical Support


There are many ways to access Informatica Technical Support. You can call or email your nearest
Technical Support Center listed in the following table, or you can use our WebSupport Service.
Use the following email addresses to contact Informatica Technical Support:

support@informatica.com for technical inquiries

support_admin@informatica.com for general customer service requests

WebSupport requires a user name and password. You can request a user name and password at http://my.informatica.com.

North America / South America
Informatica Corporation Headquarters
100 Cardinal Way
Redwood City, California 94063
United States
Toll Free: 877 463 2435
Standard Rate: United States: 650 385 5800

Europe / Middle East / Africa
Informatica Software Ltd.
6 Waltham Park
Waltham Road, White Waltham
Maidenhead, Berkshire SL6 3TN
United Kingdom
Toll Free: 00 800 4632 4357
Standard Rate: Belgium: +32 15 281 702; France: +33 1 41 38 92 26; Germany: +49 1805 702 702; Netherlands: +31 306 022 797; United Kingdom: +44 1628 511 445

Asia / Australia
Informatica Business Solutions Pvt. Ltd.
301 & 302 Prestige Poseidon
139 Residency Road
Bangalore 560 025
India
Toll Free: Australia: 00 11 800 4632 4357; Singapore: 001 800 4632 4357
Standard Rate: India: +91 80 5112 5738

Lab 1: Dynamic Lookup Cache


Technical Description
You have a customer table in your target database that contains existing customer information. You also
have a flat file that contains new customer data. Some rows in the flat file contain new information on
new customers, and some contain updated information on existing customers. You need to insert the new
customers into your target table and update the existing customers.
The source file may contain multiple rows for a customer. It may also contain rows with updated
information in some columns and NULLs in the columns that do not need to be updated.
To do this, you will use a Lookup transformation with a dynamic cache that looks up data in the target
table. The Integration Service inserts new rows and updates existing rows in the lookup cache as it inserts
and updates rows in the target table. If you configure the Lookup transformation properly, the Integration
Service ignores NULLs in the source when it updates a row in the cache and target.
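A short Python sketch of the dynamic-cache behavior described above may help; it is an illustration of the logic only (the starting sequence value and the sample rows are hypothetical), not PowerCenter code:

    # Illustration of dynamic lookup cache semantics, keyed on CUST_ID.
    # NewLookupRow: 0 = row unchanged, 1 = row inserted, 2 = row updated.
    cache = {}      # CUST_ID -> cached row, seeded from the target table
    next_pk = 1     # stands in for the Sequence-ID associated with PK_KEY

    def lookup(src_row):
        global next_pk
        cust_id = src_row["CUST_ID"]
        if cust_id not in cache:
            row = dict(src_row, PK_KEY=next_pk)   # new customer: add to cache
            next_pk += 1
            cache[cust_id] = row
            return 1, row                         # NewLookupRow = 1 -> DD_INSERT path
        row = cache[cust_id]
        changed = False
        for col, val in src_row.items():
            if val is not None and row.get(col) != val:   # ignore NULL inputs
                row[col] = val
                changed = True
        return (2, row) if changed else (0, row)          # 2 -> DD_UPDATE path

    print(lookup({"CUST_ID": 67003, "ADDRESS": "555 California St"}))  # (1, ...)
    print(lookup({"CUST_ID": 67003, "ADDRESS": "120 Villa St"}))       # (2, ...)

Because the cache is updated as rows pass through, the second row for a repeated customer ID such as 67003 sees the values written by the first, which is why a dynamic (rather than static) cache is required here.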

Objectives

Use a dynamic lookup cache to update and insert rows in a customer table

Use a Router transformation to route rows based on the NewLookupRow value

Use an Update Strategy transformation to flag rows for update or insert

Duration
45 minutes

Mapping Overview


Velocity Deliverable: Mapping Specifications


Mapping Name: m_DYN_update_customer_list_xx
Source System: Flat file
Target System: EDWxx
Initial Rows:
Rows/Load:
Short Description: Update the existing customer list with new and updated information.
Load Frequency: On demand
Preprocessing: None
Post Processing: None
Error Strategy: None
Reload Strategy: None
Unique Source Fields: CUST_ID

Sources (Files)
File Name: updated_customer_list.txt (create shortcut from the DEV_SHARED folder)
File Location: In the Source Files directory on the Integration Service process machine.

Targets (Tables)
Schema/Owner: EDWxx
Table Name: CUSTOMER_LIST (create shortcut from the DEV_SHARED folder)
Insert: yes
Unique Keys: PK_KEY, CUST_ID

Source To Target Field Matrix

Target Column  Source File or Transformation  Source Column  Ignore NULL Inputs for Updates (Lookup Transformation)
PK_KEY         LKP_CUSTOMER_LIST              Sequence-ID
CUST_ID        updated_customer_list.txt      CUST_ID
FIRSTNAME      updated_customer_list.txt      FIRSTNAME      Yes
LASTNAME       updated_customer_list.txt      LASTNAME       Yes
ADDRESS        updated_customer_list.txt      ADDRESS        Yes
CITY           updated_customer_list.txt      CITY           Yes
STATE          updated_customer_list.txt      STATE          Yes
ZIP            updated_customer_list.txt      ZIP            Yes

Detailed Overview

m_DYN_update_customer_list_xx (Mapping)

Shortcut_to_updated_customer_list (Source Definition)
Flat file in the $PMSourceFileDir directory. Create shortcut from the DEV_SHARED folder.

SQ_Shortcut_to_updated_customer_list (Source Qualifier)
Connect to the input/output ports of the Lookup transformation, LKP_CUSTOMER_LIST.

LKP_CUSTOMER_LIST (Lookup)
Lookup transformation based on the target definition Shortcut_to_CUSTOMER_LIST and the target table CUSTOMER_LIST.
- Change the input/output port names: prepend them with IN_.
- Use dynamic caching.
- Define the lookup condition using the customer ID ports.
- Configure the Lookup properties so it inserts new rows and updates existing rows (Insert Else Update).
- Ignore NULL inputs for all lookup/output ports except CUST_ID and PK_KEY.
- Associate the input/output port with a similar name for each lookup/output port.
- PK_KEY must be an integer in order to specify Sequence-ID as the Associated Port.
- Connect the NewLookupRow port and all lookup/output ports to RTR_Insert_Update.

RTR_Insert_Update (Router)
Create two output groups with the following names:
- UPDATE_EXISTING: condition is NewLookupRow=2. Connect output ports to UPD_Update_Existing.
- INSERT_NEW: condition is NewLookupRow=1. Connect output ports to UPD_Insert_New.
Do not connect any of the NewLookupRow ports to any transformation.
Do not connect the Default output group ports to any transformation.

UPD_Insert_New (Update Strategy)
Update Strategy Expression: DD_INSERT. Connect all input/output ports to CUSTOMER_LIST_Insert.

UPD_Update_Existing (Update Strategy)
Update Strategy Expression: DD_UPDATE. Connect all input/output ports to CUSTOMER_LIST_Update.

CUSTOMER_LIST_Insert (Target Definition)
First instance of the target table definition in the EDWxx schema. Create a shortcut from the DEV_SHARED folder of the CUSTOMER_LIST target definition. In the mapping, rename the target instance to CUSTOMER_LIST_Insert.

CUSTOMER_LIST_Update (Target Definition)
Second instance of the target table definition in the EDWxx schema. Create a shortcut from the DEV_SHARED folder of the CUSTOMER_LIST target definition. In the mapping, rename the target instance to CUSTOMER_LIST_Update.


Instructions
Step 1: Create Mapping
1. Connect to the PC8A_DEV repository using Developerxx as the user name and developerxx as the password.
2. Create a mapping called m_DYN_update_customer_list_xx, where xx is your student number. Use the mapping details described in the Detailed Overview above for guidelines.
   Figure 1-1 shows an overview of the mapping you must create:
   Figure 1-1. m_DYN_update_customer_list_xx Mapping

Step 2: Preview Target Data

1. In the m_DYN_update_customer_list_xx mapping, preview the target data to view the rows that exist in the table.
2. Use the ODBC_EDW ODBC connection to connect to the target database. Use EDWxx as the user name and password.
   Figure 1-2. Preview Target Data for CUSTOMER_LIST Table Before Session Run

The CUSTOMER_LIST table should contain the following data:


PK_KEY  CUST_ID  FIRSTNAME  LASTNAME  ADDRESS                   CITY           STATE  ZIP
111001  55001    Melvin     Bradley   4070 Morning Trl          New York       NY     30349
111002  55002    Anish      Desai     2870 Elliott Cir Ne       New York       NY     30305
111003  55003               Anderson  1538 Chantilly Dr Ne      New York       NY     30324
111004  55004    Chris      Ernest    2406 Glnrdge Strtford Dr  New York       NY     30342
111005  55005    Rudolph    Gibiser   6917 Roswell Rd Ne        New York       NY     30328
111006  55006    Bianco     Lo        146 W 16th St             New York       NY     10011
111007  55007    Justina    Bradley   221 Colonial Homes Dr NW  New York       NY     30309
111008  55008    Monique    Freeman   260 King St               San Francisco  CA     94107
111009  55009    Jeffrey    Morton    544 9th Ave               San Francisco  CA     94118


Step 3: View Source Data

1. Navigate to the $PMSourceFileDir directory. By default, the path is:
   C:\Informatica\PowerCenter8.1.0\server\infa_shared\SrcFiles
2. Open updated_customer_list.txt in a text editor. The source file contains the following data:
   CUST_ID,FIRSTNAME,LASTNAME,ADDRESS,CITY,STATE,ZIP
   67001,Thao,Nguyen,1200 Broadway Ave,Burlingame,CA,94010
   67002,Maria,Gomez,390 Stelling Ave,Cupertino,CA,95014
   67003,Jean,Carlson,555 California St,Menlo Park,CA,94025
   67004,Chris,Park,13450 Saratoga Ave,Santa Clara,CA,95051
   55002,Anish,Desai,400 W Pleasant View Ave,Hackensack,NJ,07601
   55006,Bianco,Lo,900 Seville Dr,Clarkston,GA,30021
   55003,Janice,MacIntosh,,,,
   67003,Jean,Carlson,120 Villa St,Mountain View,CA,94043
3. Notice that the row for customer ID 55003 contains some NULL values. You do not want to insert the NULL values into the target; you only want to update the other column values.
4. Notice that the file contains two rows with customer ID 67003. Because of this, you must use a dynamic cache for the Lookup transformation.
5. Close the file.

Step 4: Create Workflow

1. Open the Workflow Manager and open your ~Developerxx folder.
2. Create a workflow named wf_DYN_update_customer_list_xx.
3. Create a session named s_m_DYN_update_customer_list_xx using the m_DYN_update_customer_list_xx mapping.
4. In the session, verify that the target connection is EDWxx.
5. Verify that the Target load type is set to Normal and the Truncate target table option is not checked.
6. Verify that the specified source file name is updated_customer_list.txt and the specified location is $PMSourceFileDir.

Step 5: Run Workflow


Run workflow wf_DYN_update_customer_list_xx.


Step 6: Verify Statistics

Step 7: Verify Results

1. Preview the target data from the mapping to verify the results.
   Figure 1-3 shows the Preview Data dialog box for the CUSTOMER_LIST table:
   Figure 1-3. Preview Target Data for CUSTOMER_LIST Table After Session Run

The CUSTOMER_LIST table should contain the following data:


PK_KEY  CUST_ID  FIRSTNAME  LASTNAME   ADDRESS                   CITY           STATE  ZIP
111001  55001    Melvin     Bradley    4070 Morning Trl          New York       NY     30349
111002  55002    Anish      Desai      400 W Pleasant View Ave   Hackensack     NJ     07601
111003  55003    Janice     MacIntosh  1538 Chantilly Dr Ne      New York       NY     30324
111004  55004    Chris      Ernest     2406 Glnrdge Strtford Dr  New York       NY     30342
111005  55005    Rudolph    Gibiser    6917 Roswell Rd Ne        New York       NY     30328
111006  55006    Bianco     Lo         900 Seville Dr            Clarkston      GA     30021
111007  55007    Justina    Bradley    221 Colonial Homes Dr NW  New York       NY     30309
111008  55008    Monique    Freeman    260 King St               San Francisco  CA     94107
111009  55009    Jeffrey    Morton     544 9th Ave               San Francisco  CA     94118
111010  67001    Thao       Nguyen     1200 Broadway Ave         Burlingame     CA     94010
111011  67002    Maria      Gomez      390 Stelling Ave          Cupertino      CA     95014
111012  67003    Jean       Carlson    120 Villa St              Mountain View  CA     94043
111013  67004    Chris      Park       13450 Saratoga Ave        Santa Clara    CA     95051

2. Look at customer ID 55003. It should not contain any NULLs.
3. Look at customer ID 67003. It should contain data from the last row for customer ID 67003 in the source file.


Lab 2: Workflow Alerts


Business Purpose
A session usually runs for under an hour. Occasionally, it will run longer. The administrator would like to
be notified via an alert if the session runs longer than an hour. A second session is to run after the first
session completes.

Technical Description
A Worklet will be created with a Worklet variable to define the time the Workflow started plus one hour.
A Timer Task will be created in the Worklet to wait for one hour before sending an email. If the session
runs for less than an hour a Control Task will be issued to stop the timer.

Objectives

Create a Workflow

Create a Worklet

Create a Timer Task

Create an Email Task

Create a Control Task

Create a condition to control the Email Task

Duration
30 minutes

Worklet Overview

Workflow Overview


Instructions
Step 1: Setup
Connect to the PC8A_DEV repository in the Designer and Workflow Manager.

Step 2: Mappings Required

If any of the following mappings do not exist in the ~Developerxx folder, copy them from the SOLUTIONS_ADVANCED folder. Rename the mappings so that the _xx reflects your Developer number.

m_DIM_CUSTOMER_ACCT_xx
m_DIM_CUSTOMER_ACCT_STATUS_xx

Step 3: Reusable Sessions Required

If any of the following sessions do not exist in the ~Developerxx folder, copy them from the SOLUTIONS_ADVANCED folder. Resolve any conflicts that may occur. Rename the sessions so that the _xx reflects your Developer number.

s_m_DIM_CUSTOMER_ACCT_xx
s_m_DIM_CUSTOMER_ACCT_STATUS_xx

Step 4: Create a Workflow


Create a Workflow called wf_DIM_CUSTOMER_ACCT_LOAD_xx.

Step 5: Create a Worklet in the Workflow

1. Create a Worklet called wl_DIM_CUSTOMER_ACCT_LOAD_xx.
2. Open the Worklet and create the following tasks.

Step 6: Create a Timer Task in the Worklet

1. Create a Timer task and name it tim_SESSION_RUN_TIME.
2. Edit the Timer task and click the Timer tab.
3. Select the Relative time: radio button.
4. Set the task to start after 1 hour from the start time of this task.

Step 7: Create an E-Mail Task in the Worklet

1. Create an Email task and name it eml_SESSION_RUN_TIME.
2. Click the Properties tab.
3. For the Email User Name, type: administrator@anycompany.com.
4. For the Email Subject, type: session s_m_DIM_CUSTOMER_ACCT_xx ran an hour or longer.
5. For the Email Text, type an appropriate message.

Step 8: Create a Control Task in the Worklet

1. Create a Control task and name it ctrl_STOP_SESS_TIMEOUT.
2. Edit the Control task and click the Properties tab.
3. Set the Control Option attribute to Stop parent.

Step 9: Add Reusable Session to the Worklet

1. Add s_m_DIM_CUSTOMER_ACCT_xx to wl_DIM_CUSTOMER_ACCT_LOAD_xx.
2. Verify that the source connection is ODS and the source file name is customer_type.txt.
3. Verify that the target connections are EDWxx.
4. Verify that the lookup connections are valid: DIM tables to EDWxx, ODS tables to ODS.
5. Truncate the target table.
6. Ensure the Target Load Type is Normal.

Step 10: Link Tasks in Worklet

1. Link Start to tim_SESSION_RUN_TIME and s_m_DIM_CUSTOMER_ACCT_xx.
2. Link tim_SESSION_RUN_TIME to eml_SESSION_RUN_TIME.
3. Link s_m_DIM_CUSTOMER_ACCT_xx to ctrl_STOP_SESS_TIMEOUT.
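Tip: These links are unconditional, so ctrl_STOP_SESS_TIMEOUT fires whenever the session finishes, whether it succeeded or failed. As an optional refinement (not required by this lab), a link condition such as $s_m_DIM_CUSTOMER_ACCT_xx.Status = SUCCEEDED on the link into the Control task would stop the timer only when the session succeeds.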

Step 11: Add Reusable Session to the Workflow

1. Add s_m_DIM_CUSTOMER_ACCT_STATUS_xx to wf_DIM_CUSTOMER_ACCT_LOAD_xx.
2. Verify that the source connection is ODS and the source file name is customer_type.txt.
3. Verify that the target connections are EDWxx.
4. Verify that the lookup connections are valid: DIM tables to EDWxx, ODS tables to ODS.
5. Truncate the target table.
6. Ensure the Target Load Type is Normal.

Step 12: Link Tasks in Workflow

1. Link Start to wl_DIM_CUSTOMER_ACCT_LOAD_xx.
2. Link wl_DIM_CUSTOMER_ACCT_LOAD_xx to s_m_DIM_CUSTOMER_ACCT_STATUS_xx.

Step 13: Run Workflow

1. In the Workflow Monitor, click the Filter Tasks button in the toolbar, or select Filters > Tasks from the menu.
2. Make sure all of the tasks are shown.
3. When you run your workflow, the Task View should look as follows.


Lab 3: Dynamic Scheduling


Business Purpose
The Department Dimension table must load sales information on an hourly basis during the business day.
It does not load during non-business hours (before 6 a.m. or after 6 p.m.). The start time of the loading
session should be calculated from the time the workflow starts.

Technical Description
Use workflow variables to calculate when the session starts. The session must start at the top of an hour,
at or after 6 a.m. and before 6 p.m. To accomplish this, the workflow will run continuously.

Objectives

Create and use workflow variables

Create an Assignment Task

Create a Timer Task

Duration
30 minutes

Workflow Overview


Instructions
Step 1: Setup
Connect to PC8A_DEV Repository in the Designer and Workflow Manager.

Step 2: Mapping Required


The following mapping will be used in this lab. If it does not exist in the ~Developerxx folder, copy it from the SOLUTIONS_ADVANCED folder. Change the xx in the mapping name to reflect your Developer number.

m_SALES_DEPARTMENT_xx

Step 3: Copy Reusable Sessions


Copy the following reusable session from the SOLUTIONS_ADVANCED folder to the ~Developerxx
folder. Change the xx in the session name to reflect the Developer Number.

s_m_SALES_DEPARTMENT_xx

Step 4: Create Workflow


Create a Workflow called wf_SALES_DEPARTMENT_xx.

Step 5: Create Workflow Variables

1. Add three workflow variables: $$TRUNC_START_TIME (Date/Time), $$HOUR_STARTED (Integer), and $$NEXT_START_TIME (Date/Time).
2. Click OK.
3. Save.

Step 6: Add Session to Workflow

1. Add reusable session s_m_SALES_DEPARTMENT_xx to the Workflow.
2. The Source Database Connection should be ODS.
3. The Target Database Connection should be EDWxx.
4. Ensure the Target Load Type is Normal.
5. Truncate the target table.

Step 7: Create a Timer Task

1. Create a Timer task called tim_SALES_DEPARTMENT_START.
2. Edit the Timer task and click the Timer tab.
3. Select the Absolute time: radio button.
4. Select the "Use this workflow date-time variable to calculate the wait" radio button.
5. Click the ellipsis button to browse variables.
6. Double-click wf_SALES_DEPARTMENT_xx.
7. Select $$NEXT_START_TIME as the workflow variable.
8. Save.

Step 8: Create an Assignment Task

1. Create an Assignment task called asgn_SALES_DEPARTMENT_START_TIME.
2. Add the following expressions:

   Calculates the absolute workflow start time to the hour:
   $$TRUNC_START_TIME = TRUNC(WORKFLOWSTARTTIME, 'HH')

   Extracts/assigns the hour from the above calculation:
   $$HOUR_STARTED = GET_DATE_PART($$TRUNC_START_TIME, 'HH')

   Calculates/assigns the start time of the session:
   $$NEXT_START_TIME = IIF($$HOUR_STARTED >= 5 AND $$HOUR_STARTED < 17,
       ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 1),
       DECODE($$HOUR_STARTED,
           0, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 6),
           1, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 5),
           2, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 4),
           3, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 3),
           4, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 2),
           17, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 13),
           18, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 12),
           19, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 11),
           20, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 10),
           21, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 9),
           22, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 8),
           23, ADD_TO_DATE($$TRUNC_START_TIME, 'HH', 7)))

   Note: The above functions could be nested together in one assignment expression if desired.
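As a quick sanity check of the branching above, the following small Python sketch (not part of the lab) reproduces the same next-start-hour arithmetic and prints the computed session start hour for every possible workflow start hour:

    def next_start_hour(hour_started):
        # Mirrors the IIF/DECODE in asgn_SALES_DEPARTMENT_START_TIME:
        # from 05:xx through 16:xx the session starts at the next hour;
        # otherwise the timer waits until 06:00 (possibly the next day).
        if 5 <= hour_started < 17:
            return hour_started + 1
        offsets = {0: 6, 1: 5, 2: 4, 3: 3, 4: 2,
                   17: 13, 18: 12, 19: 11, 20: 10, 21: 9, 22: 8, 23: 7}
        return (hour_started + offsets[hour_started]) % 24

    for h in range(24):
        print(f"workflow starts {h:02d}:xx -> session starts {next_start_hour(h):02d}:00")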

Step 9: Link Tasks

1. Create a link from the Start task to asgn_SALES_DEPARTMENT_START_TIME.
2. Create a link from asgn_SALES_DEPARTMENT_START_TIME to tim_SALES_DEPARTMENT_START.
3. Create a link from tim_SALES_DEPARTMENT_START to s_m_SALES_DEPARTMENT_xx.
4. Save the repository.

Step 10: Run Workflow by Editing the Workflow Schedule

Note: In order for the top of the hour to be calculated based on the workflow start time, the workflow must be configured to execute continuously.

1. Edit workflow wf_SALES_DEPARTMENT_xx.
2. Click the Scheduler tab.
3. Verify that the scheduler is Non Reusable.
4. Edit the schedule.
5. Click the Schedule tab.
6. Click Run Continuously.
7. Click OK.
8. Click OK.
9. Save the repository. This starts the workflow.


Step 11: Monitor the Workflow

1. Open the Gantt Chart View.
   Note: Notice that the Assignment task has already executed and the Timer task is running.
2. Browse the Workflow Log.
3. Verify the results of the Assignment expressions in the log file. For example:
   Variable [$$TRUNC_START_TIME], Value [05/23/2004 16:00:00].
   Variable [$$HOUR_STARTED], Value [16].
   Variable [$$NEXT_START_TIME], Value [05/23/2004 17:00:00].
4. Verify the Load Manager message that tells when the Timer task will complete. For example:
   INFO : LM_36606 [Sun May 23 16:05:02 2004] : (2288|2004) Timer task instance
   [TM_SALES_DEPARTMENT_START]: The timer will complete at [Sun May 23 17:00:00 2004].
5. Open Task View.
6. At or near the top of the hour, open the monitor to check the status of the session. Verify that it started at the desired time.
7. After the session completes, notice that the workflow automatically starts again.
8. If the workflow starts after 5 p.m., the timer message in the workflow log shows that the timer will end at 6 a.m. the following morning. For example:
   INFO : LM_36608 [Sun May 23 17:00:25 2004] : (2288|2392) Timer task instance
   [TM_SALES_DEPARTMENT_START]: Timer task specified to wait until absolute time [Mon May 24 06:00:00 2004], specified by variable [$$NEXT_START_TIME].
   INFO : LM_36606 [Sun May 23 17:00:25 2004] : (2288|2392) Timer task instance
   [TM_SALES_DEPARTMENT_START]: The timer will complete at [Mon May 24 06:00:00 2004].
9. Stop or abort the workflow at any time. Afterwards, edit the workflow scheduler and select Run on Demand.
10. Save the repository.


Lab 4: Recover a Suspended Workflow


Technical Description
In this lab, you will configure a mapping and its related session and workflow for recovery. Then, you will
change a session property to create an error that causes the session to suspend when you run it. You will
fix the error and recover the workflow.

Objectives

Configure a mapping, session, and workflow for recovery.

Recover a suspended workflow.

Duration
30 minutes


Instructions
Step 1: Copy the Workflow

1. Open the Repository Manager.
2. Copy the wkf_Stage_Customer_Contacts_xx workflow from the SOLUTIONS_ADVANCED folder to your folder.
3. In the Workflow Manager, open the wkf_Stage_Customer_Contacts_xx workflow.
4. Rename the workflow to replace xx with your student number.
5. Rename the session in the workflow to replace xx with your student number.
6. Save the workflow.

Step 2: Edit the Workflow and Session for Recovery

1. Open the wkf_Stage_Customer_Contacts_xx workflow.
2. Edit the workflow and, on the General tab, select Suspend on Error.
3. Edit the s_m_Stage_Customer_Contacts_xx session and click the Properties tab.
4. Scroll to the end of the General Options settings and select Resume from last checkpoint for the Recovery Strategy.
5. Click the Mapping tab and change the target load type to Normal.
   Note: When you configure a session for bulk load, the session is not recoverable using the resume recovery strategy. You must use normal load.
6. Change the target database connection to EDWxx.
7. Save the workflow.

Step 3: Edit the Session to Cause an Error

In this step, you will edit the session so that when the Integration Service runs it, there will be an error.

1. Edit the s_m_Stage_Customer_Contacts_xx session and click the Mapping tab.
   The source in the mapping uses a file list, customer_list.txt. To make the session encounter an error, you will change the value in the Source Filename session property.
2. On the Sources node, change the source file name to customer_list1234.txt.
3. Click the Config Object tab.
4. In the Error Handling settings, configure the session to stop on one error.
5. Save the workflow.

Step 4: Run the Workflow, Fix the Session, and Recover the Workflow

1. Run the workflow.
   The Workflow Monitor shows that the Integration Service suspends the workflow and fails the session.
   Suspended Workflow and Failed Session
2. Open the session log.
3. Scroll to the end of the session log. Notice that the Integration Service failed the session:
   Session run has completed with failure.
   Next, you will fix the session.
4. In the Workflow Manager, edit the session.
5. On the Mapping tab, enter customer_list.txt as the source file name.
6. Save the workflow.
7. In the Workflow Manager, right-click the workflow and choose Recover Workflow.
   The Workflow Monitor shows that the Integration Service is running the workflow and that the session is running as a recovery run.
   Running Recovery Session Run
   When the session and workflow complete, the Workflow Monitor shows that the session completed successfully as a recovery run.
   Successful Recovery Session Run
8. Open the session log.
9. Search for "session run completed with failure". Notice that the Integration Service continues to write log events to the same session log.
10. Search for "recovery run". The Integration Service writes recovery information to the session log.
11. Close the Log Viewer.


Lab 5: Using the Transaction Control Transformation


Business Purpose
Line item data is read and sorted by invoice number. We need all rows for each invoice number committed to the target database as a single transaction.

Technical Description
A flag will be created to tell PowerCenter when a new set of invoice numbers is found. A Transaction Control transformation will be created to tell the database when to issue a commit.
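A minimal Python sketch of that commit logic (an illustration with made-up rows, not PowerCenter code): rows arrive pre-sorted by INVOICE_NO, a flag marks the first row of each new invoice, and a commit is issued before that row so each invoice forms one transaction:

    # Rows are pre-sorted by INVOICE_NO (the Sorter transformation's job).
    rows = [
        {"INVOICE_NO": 100, "LINE_ITEM_NO": 1},
        {"INVOICE_NO": 100, "LINE_ITEM_NO": 2},
        {"INVOICE_NO": 101, "LINE_ITEM_NO": 3},
    ]

    prev_invoice_no = None   # plays the role of v_PREVIOUS_INVOICE_NO
    pending = []             # rows in the currently open transaction

    for row in rows:
        new_invoice = row["INVOICE_NO"] != prev_invoice_no   # v_NEW_INVOICE_NO_FLAG
        if new_invoice and pending:      # TC_COMMIT_BEFORE: commit the prior
            print("COMMIT", pending)     # invoice group before this row
            pending = []
        pending.append(row)              # TC_CONTINUE_TRANSACTION
        prev_invoice_no = row["INVOICE_NO"]

    if pending:                          # end of data commits the final group
        print("COMMIT", pending)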

Objectives

Create a flag to check for new INVOICE_NOs

Commit upon seeing a new set of INVOICE_NOs

Duration
45 minutes

Mapping Overview


Velocity Deliverable: Mapping Specifications


Mapping Name: m_DIM_LINE_ITEM_xx
Source System: ODS
Target System: EDWxx
Initial Rows:
Rows/Load:
Short Description: Commit on a new set of INVOICE_NOs.
Load Frequency: On demand
Preprocessing: None
Post Processing: None
Error Strategy: None
Reload Strategy: None
Unique Source Fields: LINE_ITEM_NO

Sources (Tables)
Table Name: ODS_LINE_ITEM (create shortcut from the DEV_SHARED folder)
Schema/Owner: ODS
Selection/Filter:

Targets (Tables)
Schema/Owner: EDWxx
Table Name: DIM_LINE_ITEM (create shortcut from the DEV_SHARED folder)
Insert: yes
Unique Key: LINE_ITEM_NO

Source To Target Field Matrix

Target Table   Target Column  Source Table   Source Column  Expression
DIM_LINE_ITEM  LINE_ITEM_NO   ODS_LINE_ITEM  LINE_ITEM_NO   Issue a commit upon a new set of Invoice Nos.
DIM_LINE_ITEM  INVOICE_NO     ODS_LINE_ITEM  INVOICE_NO     Issue a commit upon a new set of Invoice Nos.
DIM_LINE_ITEM  PRODUCT_CODE   ODS_LINE_ITEM  PRODUCT_CODE   Issue a commit upon a new set of Invoice Nos.
DIM_LINE_ITEM  QUANTITY       ODS_LINE_ITEM  QUANTITY       Issue a commit upon a new set of Invoice Nos.
DIM_LINE_ITEM  PRICE          ODS_LINE_ITEM  PRICE          Issue a commit upon a new set of Invoice Nos.
DIM_LINE_ITEM  COST           ODS_LINE_ITEM  COST           Issue a commit upon a new set of Invoice Nos.


Detailed Overview

m_DIM_LINE_ITEM_xx (Mapping)

ODS_LINE_ITEM (Source Definition)
Table source definition in the ODS schema. Create shortcut from the DEV_SHARED folder.

Shortcut_to_sq_ODS_LINE_ITEM (Source Qualifier)
Send to srt_DIM_LINE_ITEM: LINE_ITEM_NO, INVOICE_NO, PRODUCT_CODE, QUANTITY, DISCOUNT, PRICE, COST

srt_DIM_LINE_ITEM (Sorter)
Sort by INVOICE_NO.
Send to exp_DIM_LINE_ITEM: INVOICE_NO
Send to tc_DIM_LINE_ITEM: LINE_ITEM_NO, INVOICE_NO, PRODUCT_CODE, QUANTITY, DISCOUNT, PRICE, COST

exp_DIM_LINE_ITEM (Expression)
Uncheck the 'O' (output) on INVOICE_NO.
Create a variable called v_PREVIOUS_INVOICE_NO as a decimal 10,0 to house the value of the previous row's INVOICE_NO.
  Expression: INVOICE_NO
Create a variable called v_NEW_INVOICE_NO_FLAG as an integer to set a flag indicating whether the current row's INVOICE_NO is the same as the previous row's INVOICE_NO.
  Expression: IIF(INVOICE_NO = v_PREVIOUS_INVOICE_NO, 0, 1)
Move v_NEW_INVOICE_NO_FLAG above v_PREVIOUS_INVOICE_NO.
Create an output port called NEW_INVOICE_NO_FLAG_out as an integer to hold the value of the flag.
  Expression: v_NEW_INVOICE_NO_FLAG
Send to tc_DIM_LINE_ITEM: NEW_INVOICE_NO_FLAG_out

tc_DIM_LINE_ITEM (Transaction Control)
On the Ports tab, delete the _out from NEW_INVOICE_NO_FLAG_out.
On the Properties tab, enter the following Transaction Control Condition:
  IIF(NEW_INVOICE_NO_FLAG = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)
Send to DIM_LINE_ITEM: LINE_ITEM_NO, INVOICE_NO, PRODUCT_CODE, QUANTITY, DISCOUNT, PRICE, COST

Shortcut_to_DIM_LINE_ITEM (Target Definition)
Target definition in the EDWxx schema. Create a shortcut from the DEV_SHARED folder.


Instructions
Step 1: Create Mapping
Create a mapping called m_DIM_LINE_ITEM_xx, where xx is your student number. Use the mapping
details described in the previous pages for guidelines.

Step 2: Create Workflow

1. Open the ~Developerxx folder.
2. Create a workflow named wf_DIM_LINE_ITEM_xx.
3. Create a session named s_m_DIM_LINE_ITEM_xx.
4. In the session, edit the Mapping tab and expand the Sources node. Under Connections, verify that the Connection Value is ODS.
5. Expand the Targets node and verify that the Connection value is correct, the Target load type is set to Normal, and the Truncate target table option is checked.

Step 3: Run Workflow


Run workflow wf_DIM_LINE_ITEM_xx.

Step 4: Verify Statistics


Step 5: Verify Results


Lab 6: Error Handling with Transactions


Business Purpose
The IT Department would like to prevent erroneous data from being committed into the
DIM_VENDOR_PRODUCT table. They would also like to issue a commit every time a new group of
VENDOR_IDs is written. A rollback will also be issued for an entire group of vendors if any record in
that group has an error.

Technical Description
Records will be committed when a new group of VENDOR_IDs comes in. This will require a flag to be
set to determine whether a VENDOR_ID is new or not. Rows will need to be rolled back if an error
occurs. An error flag will be set when a business rule is violated.
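A small Python sketch (an illustration of the logic, not PowerCenter code) shows why the mapping sorts each vendor's error rows to the end of its group: a commit is issued before the first row of a new vendor, and a rollback after an error row discards everything accumulated for the current vendor, that row included:

    # Rows pre-sorted by (VENDOR_ID, ERROR_FLAG): error rows fall last in each group.
    rows = [
        {"VENDOR_ID": 1, "PRODUCT_CODE": "A", "ERROR_FLAG": False},
        {"VENDOR_ID": 1, "PRODUCT_CODE": "B", "ERROR_FLAG": False},
        {"VENDOR_ID": 2, "PRODUCT_CODE": "C", "ERROR_FLAG": False},
        {"VENDOR_ID": 2, "PRODUCT_CODE": None, "ERROR_FLAG": True},  # dirty row
    ]

    prev_vendor = None
    pending = []

    for row in rows:
        if row["ERROR_FLAG"]:                 # 'ROLLBACK' -> TC_ROLLBACK_AFTER:
            pending.append(row)               # the open group, this row included,
            print("ROLLBACK", pending)        # is discarded
            pending = []
        elif row["VENDOR_ID"] != prev_vendor and pending:
            print("COMMIT", pending)          # 'COMMIT' -> TC_COMMIT_BEFORE
            pending = [row]
        else:
            pending.append(row)               # 'CONTINUE' -> TC_CONTINUE_TRANSACTION
        prev_vendor = row["VENDOR_ID"]

    if pending:
        print("COMMIT", pending)              # end of data commits the last clean group

Running this prints COMMIT for vendor 1's rows and ROLLBACK for vendor 2's entire group, which matches the behavior the lab builds.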

Objectives

Use a Transaction Control Transformation to Commit based upon Vendor IDs and issue a rollback
based upon errors.

Duration
60 minutes

Mapping Overview


Velocity Deliverable: Mapping Specifications


Mapping Name: m_DIM_VENDOR_PRODUCT_TC_xx
Source System: Flat File
Target System: EDWxx
Initial Rows:
Rows/Load:
Short Description: Issue a commit based upon VENDOR_ID, but only if the PRODUCT_CODE is not null and the CATEGORY is valid for all records in the group. A rollback of the entire group should occur if Informatica comes across a null PRODUCT_CODE or an invalid CATEGORY.
Load Frequency: On demand
Preprocessing: None
Post Processing: None
Error Strategy: None
Reload Strategy: None
Unique Source Fields: None

Sources (Files)
File Name: PRODUCT.txt (create shortcut from the DEV_SHARED folder)
File Location: In the Source Files directory on the Integration Service process machine.

Targets (Tables)
Schema/Owner: EDWxx
Table Name: DIM_VENDOR_PRODUCT (create shortcut from the DEV_SHARED folder)
Insert: yes

Lookup Transformation Detail

Lookup Name: lkp_ODS_VENDOR
Lookup Table Name: ODS_VENDOR
Description: The VENDOR_NAME, FIRST_CONTACT, and VENDOR_STATE are needed to populate DIM_VENDOR_PRODUCT.
Match Condition(s): ODS.VENDOR_ID = PRODUCT.VENDOR_ID
Filter/SQL Override: N/A
Return Value(s): VENDOR_NAME, FIRST_CONTACT, and VENDOR_STATE
Location: ODS


Source To Target Field Matrix

Target Table        Target Column  Source Table  Source Column                      Expression
DIM_VENDOR_PRODUCT  PRODUCT_CODE   PRODUCT       PRODUCT_CODE
DIM_VENDOR_PRODUCT  VENDOR_ID      PRODUCT       VENDOR_ID
DIM_VENDOR_PRODUCT  VENDOR_NAME    PRODUCT       Derived value from lkp_ODS_VENDOR  Return VENDOR_NAME from lkp_ODS_VENDOR
DIM_VENDOR_PRODUCT  VENDOR_STATE   PRODUCT       Derived value from lkp_ODS_VENDOR  Return VENDOR_STATE from lkp_ODS_VENDOR
DIM_VENDOR_PRODUCT  PRODUCT_NAME   PRODUCT       PRODUCT_NAME
DIM_VENDOR_PRODUCT  CATEGORY       PRODUCT       CATEGORY
DIM_VENDOR_PRODUCT  MODEL          PRODUCT       MODEL
DIM_VENDOR_PRODUCT  PRICE          PRODUCT       PRICE
DIM_VENDOR_PRODUCT  FIRST_CONTACT  PRODUCT       Derived value from lkp_ODS_VENDOR  Return FIRST_CONTACT from lkp_ODS_VENDOR

Detailed Overview

Mapping (Mapping):
  m_DIM_VENDOR_PRODUCT_TC_xx

PRODUCT.txt (Source Definition):
  Drag in the shortcut from DEV_SHARED.

Sq_Shortcut_To_PRODUCT (Source Qualifier):
  Source Qualifier for the flat file data.
  SEND PORTS to exp_SET_ERROR_FLAG:
    PRODUCT_CODE, VENDOR_ID, CATEGORY, PRODUCT_NAME, MODEL, PRICE

exp_SET_ERROR_FLAG (Expression):
  Output port: ERROR_FLAG
  Expression:
    IIF(ISNULL(PRODUCT_CODE) OR ISNULL(CATEGORY), TRUE, FALSE)
  Send all output ports to srt_VENDOR_ID.

srt_VENDOR_ID (Sorter):
  Sort the data ascending by VENDOR_ID and ERROR_FLAG. This puts any error records at the end of each group.
  SEND all PORTS to exp_SET_TRANS_TYPE.
  SEND PORTS to lkp_ODS_VENDOR:
    VENDOR_ID

exp_SET_TRANS_TYPE (Expression):
  1. Create a variable port called v_PREV_VENDOR_ID as a Decimal with precision of 10 to hold the value of the previous vendor.
     Expression: VENDOR_ID
  2. Create a variable port called v_NEW_VENDOR_ID_FLAG as an Integer to check whether the current VENDOR_ID is new.
     Expression:
       IIF(VENDOR_ID != v_PREV_VENDOR_ID, TRUE, FALSE)
     Variable ports can be used to remember values across rows. v_PREV_VENDOR_ID must always hold the value of the previous VENDOR_ID, so it must be placed after v_NEW_VENDOR_ID_FLAG.
  3. Create an output port as a String(8) called TRANSACTION_TYPE to tell the Transaction Control transformation whether to CONTINUE, COMMIT, or ROLLBACK.
     Expression:
       IIF(ERROR_FLAG = TRUE, 'ROLLBACK',
           IIF(v_NEW_VENDOR_ID_FLAG = TRUE, 'COMMIT', 'CONTINUE'))
     Since the data was sorted to put error records at the end of each group, a ROLLBACK rolls back the whole group.
  4. SEND all output PORTS to tc_DIM_VENDOR_PRODUCT.

lkp_ODS_VENDOR (Lookup):
  Create a connected lookup to ODS.ODS_VENDOR. Create an input port for the source data field VENDOR_ID.
  Rename VENDOR_ID1 to VENDOR_ID_in.
  Set Lookup Condition:
    VENDOR_ID = VENDOR_ID_in
  SEND PORTS to tc_DIM_VENDOR_PRODUCT:
    VENDOR_NAME, FIRST_CONTACT, VENDOR_STATE

tc_DIM_VENDOR_PRODUCT (Transaction Control):
  Expression:
    DECODE(TRANSACTION_TYPE,
      'COMMIT', TC_COMMIT_BEFORE,
      'ROLLBACK', TC_ROLLBACK_AFTER,
      'CONTINUE', TC_CONTINUE_TRANSACTION)
    // If we're starting a new group, we need to COMMIT the prior group.
    // If we hit an error, we need to ROLLBACK the current group,
    // including the current record.
  PORTS to SEND to DIM_VENDOR_PRODUCT:
    All ports except TRANSACTION_TYPE

Shortcut_To_DIM_VENDOR_PRODUCT (Target Table):
  All data without errors will be routed here.
  Create shortcut from DEV_SHARED folder
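
Port order in exp_SET_TRANS_TYPE matters because the Integration Service evaluates input ports first, then variable ports, then output ports, each in display order. A minimal sketch of the intended port list (the layout is illustrative; port names, datatypes and expressions are as specified above):

    v_NEW_VENDOR_ID_FLAG  (variable, Integer)     IIF(VENDOR_ID != v_PREV_VENDOR_ID, TRUE, FALSE)
    v_PREV_VENDOR_ID      (variable, Decimal 10)  VENDOR_ID
    TRANSACTION_TYPE      (output, String 8)      IIF(ERROR_FLAG = TRUE, 'ROLLBACK',
                                                      IIF(v_NEW_VENDOR_ID_FLAG = TRUE, 'COMMIT', 'CONTINUE'))

Because v_NEW_VENDOR_ID_FLAG is evaluated before v_PREV_VENDOR_ID is refreshed, the comparison always uses the VENDOR_ID of the previous row.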


Instructions

Step 1: Create Mapping
Create a mapping called m_DIM_VENDOR_PRODUCT_TC_xx, where xx is your student number. Use the mapping details described in the previous pages as guidelines.

Step 2: Create Workflow
1. Open the ~Developerxx folder.
2. Create a workflow named wf_DIM_VENDOR_PRODUCT_TC_xx.
3. Create a session named s_m_DIM_VENDOR_PRODUCT_TC_xx.
4. The source file is found in the Source Files directory on the Integration Service machine.
5. Verify that the source filename is PRODUCT.txt (extension required).
6. Verify that the target database connection value is EDWxx.
7. Verify that the target load type is Normal.
8. Select Truncate for DIM_VENDOR_PRODUCT.
9. Set the Lookup connection to ODS.

Step 3: Run Workflow
Run workflow wf_DIM_VENDOR_PRODUCT_TC_xx.

Step 4: Verify Statistics


Step 5: Verify Results


Lab 7: Handling Fatal and Non-Fatal Errors


Business Purpose
ABC Incorporated would like to track which records are failing when trying to run a load from the PRODUCT flat file to the DIM_VENDOR_PRODUCT table. Some of the developers have also noticed dirty data being loaded into the DIM_VENDOR_PRODUCT table; as a result, users are getting dirty data in their reports.

Technical Description
Instead of using a Transaction Control Transformation, route the Fatal Errors off to a Fatal Error table
and route the Nonfatal Errors off to a Nonfatal table. All good data will be sent to the EDW.

Objectives

Trap all database errors and load them to a table called ERR_FATAL.

Trap the dirty data coming through from the CATEGORY field and write it to a table called
ERR_NONFATAL.

Write all data without fatal or nonfatal errors to DIM_VENDOR_PRODUCT.

Duration
60 minutes


Mapping Overview


Velocity Deliverable: Mapping Specifications

Mapping Name:         m_DIM_VENDOR_PRODUCT_xx
Source System:        Flat File
Target System:        EDWxx
Initial Rows:
Rows/Load:
Short Description:    If a fatal error is found, route the data to a fatal error table. If a nonfatal error is found, route the data to a nonfatal error table. If the data is free of errors, route it to DIM_VENDOR_PRODUCT.
Load Frequency:       On demand
Preprocessing:        None
Post Processing:      None
Error Strategy:       Create a flag for both fatal errors and nonfatal errors. Route bad data to its respective table.
Reload Strategy:      None
Unique Source Fields: None

Sources

Files
File Name:            PRODUCT.txt
File Location:        In the Source Files directory on the Integration Service process machine.
Create shortcut from: DEV_SHARED folder

Targets

Tables
Schema/Owner:         EDWxx
Table Name:           DIM_VENDOR_PRODUCT (Insert: yes)
Create shortcut from: DEV_SHARED folder

Schema/Owner:         EDWxx
Table Name:           ERR_NONFATAL (Insert: yes; Unique Key: ERR_ID)
Create shortcut from: DEV_SHARED folder

Schema/Owner:         EDWxx
Table Name:           ERR_FATAL (Insert: yes; Unique Key: ERR_ID)
Create shortcut from: DEV_SHARED folder


Lookup Transformation Detail

Lookup Name:          lkp_ODS_VENDOR
Lookup Table Name:    ODS_VENDOR
Location:             ODS
Description:          The VENDOR_NAME, FIRST_CONTACT and VENDOR_STATE are needed to populate DIM_VENDOR_PRODUCT.
Match Condition(s):   ODS.VENDOR_ID = PRODUCT.VENDOR_ID
Filter/SQL Override:  N/A
Return Value(s):      VENDOR_NAME, FIRST_CONTACT and VENDOR_STATE

Source To Target Field Matrix

Target table: ERR_NONFATAL (source table: PRODUCT)

  ERR_ID           Derived value. Generated from seq_ERR_ID_ERR_NONFATAL.
  REC_NBR          From PRODUCT.REC_NUM.
  ERR_RECORD       Derived value. The entire source record is concatenated.
  ERR_DESCRIPTION  Derived value. First, records must be tested for validity: run a check to see if PRODUCT_CODE is null and set a flag to True or False; run a check to see if CATEGORY is null and set a flag to True or False. Rows must then be separated into fatal, nonfatal and good data. All nonfatal errors have a description of INVALID CATEGORY.
  LOAD_DATE        Derived value. The date and time the session runs.

Target table: ERR_FATAL (source table: PRODUCT)

  ERR_ID           Derived value. Generated from seq_ERR_ID_ERR_FATAL.
  REC_NBR          From PRODUCT.REC_NUM.
  ERR_RECORD       Derived value. The entire record is concatenated and sent to the ERR_FATAL table.
  ERR_DESCRIPTION  Derived value. Records are tested for validity as described for ERR_NONFATAL. All fatal errors have a description of NULL VALUE IN KEY.
  LOAD_DATE        Derived value. The date and time the session runs.

Target table: DIM_VENDOR_PRODUCT (source table: PRODUCT). All rows written here must have a non-null PRODUCT_CODE and a valid CATEGORY.

  PRODUCT_CODE     From PRODUCT.PRODUCT_CODE.
  VENDOR_ID        From PRODUCT.VENDOR_ID.
  VENDOR_NAME      Derived value from lkp_ODS_VENDOR.
  VENDOR_STATE     Derived value from lkp_ODS_VENDOR.
  PRODUCT_NAME     From PRODUCT.PRODUCT_NAME.
  CATEGORY         From PRODUCT.CATEGORY.
  MODEL            From PRODUCT.MODEL.
  PRICE            From PRODUCT.PRICE.
  FIRST_CONTACT    Derived value from lkp_ODS_VENDOR.

Detailed Overview

Mapping (Mapping):
  m_DIM_VENDOR_PRODUCT_xx

PRODUCT.txt (Flat File Source Definition):
  Drag in the shortcut from DEV_SHARED.

Shortcut_To_sq_PRODUCT (Source Qualifier):
  Source Qualifier for the flat file.
  Create shortcut from DEV_SHARED folder.

exp_ERROR_TRAPPING (Expression):
  Check whether PRODUCT_CODE is NULL.
  Derive ISNULL_PRODUCT_CODE_out by creating an output port.
    CODE: IIF(ISNULL(PRODUCT_CODE), 'FATAL', 'GOOD DATA')
  Check whether CATEGORY is NULL.
  Derive INVALID_CATEGORY_out by creating an output port.
    CODE: IIF(ISNULL(CATEGORY), 'NONFATAL', 'GOOD DATA')
  Derive ERR_RECORD_out by creating an output port that concatenates the entire record. Use a TO_CHAR function to convert all non-strings to strings.
  SEND PORTS to lkp_ODS_VENDOR:
    VENDOR_ID
  SEND PORTS to rtr_PRODUCT_DATA:
    PRODUCT_CODE, ISNULL_PRODUCT_CODE_out, VENDOR_ID, CATEGORY, INVALID_CATEGORY_out, PRODUCT_NAME, MODEL, PRICE, REC_NUM, ERR_RECORD_out

lkp_ODS_VENDOR (Lookup):
  Create a connected lookup to ODS.ODS_VENDOR. Create an input port for the source data field VENDOR_ID.
  Rename VENDOR_ID1 to VENDOR_ID_in.
  Set Lookup Condition:
    VENDOR_ID = VENDOR_ID_in
  SEND PORTS to rtr_PRODUCT_DATA:
    VENDOR_NAME, FIRST_CONTACT, VENDOR_STATE

rtr_PRODUCT_DATA (Router):
  Create groups to route the data off to different paths:
    Group = NONFATAL_ERRORS
      CODE: INVALID_CATEGORY_out = 'NONFATAL'
    Group = FATAL_ERRORS
      CODE: ISNULL_PRODUCT_CODE_out = 'FATAL'
  The default group will contain rows that do not match the above conditions, hence all good rows.
  PORTS to SEND to exp_ERR_NONFATAL:
    NONFATAL_ERRORS.PRODUCT_CODE
  PORTS to SEND to ERR_NONFATAL:
    NONFATAL_ERRORS.REC_NUM, NONFATAL_ERRORS.ERR_RECORD
  PORTS to SEND to exp_ERR_FATAL:
    FATAL_ERRORS.PRODUCT_CODE
  PORTS to SEND to ERR_FATAL:
    FATAL_ERRORS.REC_NUM, FATAL_ERRORS.ERR_RECORD
  PORTS to SEND to DIM_VENDOR_PRODUCT:
    DEFAULT.PRODUCT_CODE, DEFAULT.VENDOR_ID, DEFAULT.VENDOR_NAME, DEFAULT.VENDOR_STATE, DEFAULT.PRODUCT_NAME, DEFAULT.CATEGORY, DEFAULT.MODEL, DEFAULT.PRICE, DEFAULT.FIRST_CONTACT

exp_ERR_FATAL (Expression):
  Derive ERR_DESCRIPTION_out by creating an output port.
    CODE: 'NULL VALUE IN KEY'
  Derive LOAD_DATE_out by creating an output port.
    CODE: SESSSTARTTIME
  PORTS to SEND to ERR_FATAL:
    LOAD_DATE_out, ERR_DESCRIPTION_out

exp_ERR_NONFATAL (Expression):
  Derive ERR_DESCRIPTION_out by creating an output port.
    CODE: 'INVALID CATEGORY'
  Derive LOAD_DATE_out by creating an output port.
    CODE: SESSSTARTTIME
  PORTS to SEND to ERR_NONFATAL:
    LOAD_DATE_out, ERR_DESCRIPTION_out

seq_ERR_FATAL (Sequence Generator):
  Generates the ERR_ID for ERR_FATAL.

seq_ERR_NONFATAL (Sequence Generator):
  Generates the ERR_ID for ERR_NONFATAL.

ERR_FATAL (Target):
  Traps all of the fatal errors.

ERR_NONFATAL (Target):
  Traps all nonfatal errors.

DIM_VENDOR_PRODUCT (Target):
  All good data to be loaded into the target table.
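
As a concrete illustration of the ERR_RECORD_out port in exp_ERROR_TRAPPING, a minimal sketch of the concatenation expression is shown below. The comma delimiter is illustrative, and it assumes VENDOR_ID, PRICE and REC_NUM are the only non-string ports; adjust it to the actual PRODUCT file layout:

    PRODUCT_CODE || ',' || TO_CHAR(VENDOR_ID) || ',' || CATEGORY || ',' ||
    PRODUCT_NAME || ',' || MODEL || ',' || TO_CHAR(PRICE) || ',' || TO_CHAR(REC_NUM)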


Instructions

Step 1: Create Mapping
Create a mapping called m_DIM_VENDOR_PRODUCT_xx, where xx is your student number. Use the mapping details described in the previous pages as guidelines.

Step 2: Create Workflow
1. Open the ~Developerxx folder.
2. Create a workflow named wf_DIM_VENDOR_PRODUCT_xx.
3. Create a session named s_m_DIM_VENDOR_PRODUCT_xx. The source file is found in the Source Files directory on the Integration Service process machine.
4. Verify that the source file name is PRODUCT.txt (PRODUCT in uppercase, with the .txt extension).
5. Verify that the target database connection is EDWxx.
6. Change the target load type to Normal.
7. Select Truncate for DIM_VENDOR_PRODUCT.
8. Set the Lookup connection to ODS.

Step 3: Run Workflow
1. Run workflow wf_DIM_VENDOR_PRODUCT_xx.

Step 4: Verify Statistics


Step 5: Verify Results

Verify the rows loaded into ERR_NONFATAL, ERR_FATAL and DIM_VENDOR_PRODUCT.


Lab 8: Repository Queries


Technical Description
In this lab, you will search for repository objects by creating and running object queries.

Objectives

Create object queries

Run object queries

Duration
15 minutes


Instructions
Step 1: Create a Query to Search for Targets with Customer
First, you will create a query that searches for target objects with the string customer in the target name.
1. In the Designer, choose Tools > Queries. The Query Browser appears.
2. Click New to create a new query. The Query Editor appears (Figure 8-4. Query Editor); its toolbar contains buttons to run the query, validate the query, add AND or OR operators, and add a new query parameter.
3. In the Query Name field, enter targets_customer.
4. In the Parameter Name column, select Object Type.
5. In the Operator column, select Is Equal To.
6. In the Value 1 column, select Target Definition.
7. Click the New Parameter button. Notice that the Query Editor automatically adds an AND operator for the two parameters.
8. Edit the new parameter to search for object names that contain the text customer.

Step 2: Validate, Save, and Run the Query

1. Click the Validate button to validate the query. The PowerCenter Client displays a dialog box stating whether the query is valid. If the query is not valid, fix the error and validate it again.
2. Click Save. The PowerCenter Client saves the query to the repository.
3. Click Run. The Query Results window shows the results of the query you created. Your query results might include more objects than the results shown here.

Some columns only apply to objects in a versioned repository, such as Version Comments, Label Name, and Purged By User.


Step 3: Create a Query to Search for Mapping Dependencies

Next, you will create a query that returns all dependent objects for a mapping. A dependent object is an object used by another object. The query will search for both parent and child dependent objects. An example child object of a mapping is a source. An example parent object of a mapping is a session.
1. Close the Query Editor, and create a new query.
2. Enter product_inventory_mapping_dependents as the query name.
3. Edit the first parameter so the object name contains product.
4. Add another parameter, and choose Include Children and Parents for the parameter name. The PowerCenter Client automatically chooses Where for the operator.
   Note: When you search for children and parents, you enter the following information in the value columns:
   Value 1. Object type(s) for the dependent object(s), the children and parents.
   Value 2. Object type(s) for the object(s) you are querying.
   Value 3. Reusable status of the dependent object(s).
5. Click the arrow in the Value 1 column, select the following objects, and click OK:
   Mapplet
   Source Definition
   Target Definition
   Transformation
6. In the Value 2 column, choose Mapping.
   Note: When you access the Query Editor from the Designer, you can only search for Designer repository objects. To search for all repository object types that use the mapping you are querying, create a query from the Repository Manager.
7. Choose Reusable Dependency in the third value column.

Step 4: Validate, Save, and Run the Query

1. Validate the query.
2. Save and run the query.

Your query results might look similar to the following results:

The query returned objects in all folders in the repository. Next, you will modify the query so it only
returns objects in your folder.

Step 5: Modify and Run the Query

1. In the Query Editor, place the cursor somewhere in the last parameter and then add a new parameter.
2. Modify the parameter so it searches for folders equal to the SOLUTIONS_ADVANCED folder.
3. Validate and save the query.
4. Run the query.


Your query results might look similar to the following results:

Notice that even though the query says to include parent and child objects, it does not display any parent objects of the mapping. Parent objects of a mapping include sessions, worklets, and workflows.
When you run a query accessed by the Designer, the query results only display Designer objects.
Similarly, when you run a query accessed by the Workflow Manager, the query results only display
Workflow Manager objects.
In the next step, you will run the same query accessed by the Repository Manager.

Step 6: Run the Query Accessed by the Repository Manager

1. Open the Repository Manager and connect to the repository.
2. Open the Query Browser. For details on how to do this, see Step 1: Create a Query to Search for Targets with Customer.
3. Select the product_inventory_mapping_dependents query, and run it by clicking Execute.

Your query results might look similar to the following results:

Notice that the query results show all parent (and child) objects, including Workflow Manager
objects, such as workflows.

Step 7: Create Your Own Queries

1. Create a new query that searches for invalid mappings.
   Tip: You might need to modify a mapping in your folder to make it invalid. You can copy the mapping with a new name, and then delete links to the target.
2. Create a new query that searches for impacted mappings.
   Tip: You can modify a source or target used in a mapping by removing a column. The Designer or Workflow Manager invalidates a parent object when you modify a child object in such a way that the parent object may not be able to run.


Lab 9: Performance and Tuning Workshop


Business Purpose
The support group within the IT Department has taken over the support of an ETL system that was recently put into production. The implementation team seemed to do a good job, but over the last few runs some of the sessions/mappings have been running very slowly and need to be optimized. Due to budget constraints, management does not want to pay consultants to optimize the sessions/mappings, so the task has fallen on the support group. It has been mandated that the group reduce the run time of one particular session/mapping by at least 30%. The Team Lead is confident that the group is up to the challenge; they have just returned from an Informatica Advanced Training course.

Technical Description
The session that needs to be optimized is wf_FACT_MKT_SEGMENT_ORDERS_xx.
This session runs a mapping that reads in a flat file of order data, finds the customer market segment
information, aggregates the orders and writes the values out to a relational table.
The support group needs to find the bottleneck(s), determine the cause of the bottleneck(s) and then
reduce the bottleneck(s). The reduction in run time must be at least 30%.
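For example (the numbers are illustrative only): if the original session completes in 600 seconds, a 30% reduction means finishing in at most 420 seconds. At 1,328,667 source rows, that corresponds to raising throughput from roughly 2,214 to roughly 3,163 rows per second.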

Objectives

Use learned techniques to determine and reduce the bottleneck(s) that exist.

Duration
120 minutes

Object Locations
ProjectX folder


Workshop Details
Overview
This workshop is designed to assist the developers with the task at hand. It does not give detailed
instructions on how to identify a bottleneck, determine the cause of a bottleneck or how to optimize the
session/mapping. The approach to take is left entirely up to the discretion of the developers. The
optimization techniques to use are also left up to the developers. The workshop will provide instructions
on establishing a typical read baseline and on running the original session.
The suggested steps to follow are:
1. Establish a typical read baseline.
2. Run the original session.
3. Identify and reduce the bottlenecks, examining, in turn, the target, the source, the mapping, and the session.

Important: For detailed information on identifying and reducing bottlenecks, see the Performance Tuning Guide in the PowerCenter online help. To access the online help, press the F1 key in any of the PowerCenter Client tools. In the online help, click the Contents tab and expand the section for the Performance Tuning Guide.

Workshop Rules
The rules of the workshop are:

Developers must work in teams of two.

Partitioning cannot be used to optimize the session.

Data results must match the initial session run.

Think out of the box.

Ask the instructor any questions that come to mind.

Establish ETL Baseline


In order to obtain a starting point for measurement purposes it is necessary to establish baselines. Ideally
a baseline should be established for the ETL process, the network and disks.
A straight throughput mapping sourcing from an RDBMS and writing to a flat file will establish a typical
read baseline.

Typical Read Baseline


In order to have a reasonable measurement for uncovering source bottlenecks a typical read baseline will
need to be established. This can be accomplished by running a straight throughput mapping that sources
a relational table and writes to a flat file. The session properties can be used to accomplish this.

1. In the Repository Manager, copy the wf_Source_Baseline_xx workflow from the ProjectX folder to your folder.
2. In the Workflow Manager, open the wf_Source_Baseline_xx workflow in your folder.
3. Edit the session named s_m_Source_Baseline_xx, and click the Mapping tab:
   a. Edit the Sources node and ensure the database connection is ODS.
   b. Edit the Targets node and change the Writer from Relational Writer to File Writer.
   c. Change the Targets properties for the Output and Reject filenames to include your assigned student number.
4. Save, start and monitor the workflow.
5. Document the results in the Documented Results table at the end of this lab.

Run Original Session

Running the original session will provide a starting point to measure progress against.
1. In the Repository Manager, copy the wf_FACT_MKT_SEGMENT_ORDERS_xx workflow from the ProjectX folder to your folder.
2. In the Workflow Manager, edit the session named s_m_FACT_MKT_SEGMENT_ORDERS_xx located in the wf_FACT_MKT_SEGMENT_ORDERS_xx workflow in your folder.
3. In the Mapping tab, edit the Sources node:
   a. Ensure the ORDER_LINE_ITEM source filename value is daily_order_line_item.dat.
   b. Ensure the ODS_INVOICE_SUMMARY database connection is ODS.
4. In the Mapping tab, edit the Targets node:
   a. Ensure the database connection is EDWxx.
   b. Ensure the Target load type is set to Normal.
   c. Ensure the Truncate target table option is checked.
5. Save, start and monitor the workflow.
6. Document the results in the Documented Results table at the end of this lab.


Velocity Deliverable: Mapping Specifications

Mapping Name:         m_FACT_MKT_SEGMENT_ORDERS_xx
Source System:        ODS and Flat File
Target System:        EDWxx
Initial Rows:         4,015,335
Rows/Load:            437,023
Short Description:    Calculates totals for quantity, revenue and cost for market segments. Values are summarized by customer, date, market segment, region and item.
Load Frequency:       On demand
Preprocessing:        None
Post Processing:      None
Error Strategy:       None
Unique Source Fields:

SOURCES

Tables
Table Name:           daily_order_line_item
Schema/Owner:         Flat File
Selection/Filter:     This is a daily order line item file that contains order information for customers. The file contains 1,328,667 rows of order data for August 29, 2003 and is sorted by order id. This file is joined to the ODS_INVOICE_SUMMARY relational table in order to retrieve the payment type that the customer uses. It is assumed that the customer uses the same payment type each time. The payment types are CREDIT CARD, DEBIT CARD, CASH and CHECK. The source file is called daily_order_line_item.dat. The location for the file can be found by checking the service variable $PMSourceFileDir.

Table Name:           ODS_INVOICE_SUMMARY
Schema/Owner:         ODS
Selection/Filter:     This is a monthly summary of customer invoice data. The table contains invoice number, customer, order date, payment type and amount. The primary key is Invoice Number. The table contains 2,686,668 rows.

TARGETS

Tables
Schema/Owner:         EDWxx
Table Name:           FACT_MKT_SEGMENT_ORDERS (Insert: yes; Unique Key: ORDER_KEY, system generated)


LOOKUPS

Lookup Name:          lkp_ITEM_ID
Table:                DIM_ITEM
Location:             EDWxx
Description:          The FACT_MKT_SEGMENT_ORDERS fact table needs to have the ITEM_KEY stored on it as a foreign key. The item id contained in the source will be matched with the item id in the DIM_ITEM table to retrieve the ITEM_KEY. The cost of each item needs to be obtained from this table and used in the calculation of item costs for each row written to the target. This table contains 27 rows.
Match Condition(s):   DIM_ITEM.ITEM_ID = ORDER_LINE_ITEM.ITEM_ID
Filter/SQL Override:  N/A
Return Value(s):      ITEM_KEY, COST

Lookup Name:          lkp_CUSTOMER_INFO
Table:                DIM_CUSTOMER_PT
Location:             EDWxx
Description:          The FACT_MKT_SEGMENT_ORDERS fact table needs to have the customer key stored on it as a foreign key. The CUSTOMER_ID contained in the source will be matched with the CUSTOMER_ID in the DIM_CUSTOMER_PT table to retrieve the customer key (C_CUSTKEY). The market segment of each customer is also retrieved and used in aggregate groupings. This table contains 1,000,000 rows.
Match Condition(s):   DIM_CUSTOMER_PT.C_CUST_ID = ORDER_LINE_ITEM.CUSTOMER_ID
Filter/SQL Override:  N/A
Return Value(s):      C_CUSTKEY, C_CUST_ID, C_MKTSEGMENT

SOURCE TO TARGET FIELD MATRIX

Target table name: FACT_MKT_SEGMENT_ORDERS

  ORDER_DATE      From ORDER_LINE_ITEM.ORDER_DATE.
  ORDER_QUANTITY  From ORDER_LINE_ITEM.QUANTITY. Sum of QUANTITY grouped by customer key, order date, market segment, region and item key.
  ORDER_REVENUE   From ORDER_LINE_ITEM.REVENUE. Sum of REVENUE grouped by customer key, order date, market segment, region and item key.
  PYMT_TYPE       From ODS_INVOICE_SUMMARY.PYMT_TYPE.
  ORDER_KEY       Derived value. Generated by a Sequence Generator.
  CUSTOMER_KEY    Derived value. Foreign key referencing the DIM_CUSTOMER_PT table, obtained via a lookup to the dimension table on the CUSTOMER_ID column.
  MKTSEGMENT      Derived value. The market segment that the customer belongs in, obtained via a lookup to the DIM_CUSTOMER_PT dimension table.
  REGION          Derived value, based on the customer id. If the customer id is < 50000 the region is 'WEST'; >= 50000 and < 95000, 'CENTRAL'; >= 95000 and < 120000, 'SOUTH'; >= 120000 and < 200501, 'EAST'; >= 200501, 'UNKNOWN'.
  ITEM_KEY        Derived value. Foreign key referencing the DIM_ITEM table, obtained via a lookup to the DIM_ITEM dimension table on the ITEM_ID column.
  ORDER_COST      Derived value. SUM of (COST * QUANTITY); COST is obtained via a lookup to the DIM_ITEM dimension table.

DETAILED OVERVIEW

Mapping (Mapping):
  m_FACT_MKT_SEGMENT_ORDERS_xx

Shortcut_to_ORDER_LINE_ITEM (Source Definition):
  Flat file containing daily order information for each customer. Contains orders for August 29, 2003. This file contains 1,328,667 rows.

Sq_Shortcut_to_ORDER_LINE_ITEM (Source Qualifier):
  Flat file Source Qualifier.
  Sent to jnr_PAYMENT_TYPE: all ports

Shortcut_to_ODS_INVOICE_SUMMARY (Source Definition):
  Relational table containing a summary of the invoices for the month. This table contains data from August 1, 2003 through August 29, 2003. The key is INVOICE_NO and the table contains 2,686,668 rows.

Sq_Shortcut_To_ODS_INVOICE_SUMMARY (Source Qualifier):
  Relational Source Qualifier.
  Sent to jnr_PAYMENT_TYPE: all ports

jnr_PAYMENT_TYPE (Joiner):
  Joiner transformation that joins the ORDER_LINE_ITEM table to the ODS_INVOICE_SUMMARY table.
  Master Source: ORDER_LINE_ITEM
  Detail Source: ODS_INVOICE_SUMMARY
  Join Condition:
    ORDER_DATE = ORDER_DATE
    CUSTOMER_ID = CUSTOMER_ID
  Sent to lkp_ITEM_ID: ORDER_LINE_ITEM.ITEM_ID
  Sent to lkp_CUSTOMER_INFO: ORDER_LINE_ITEM.CUSTOMER_ID
  Sent to exp_SET_UNKNOWN_KEYS:
    ORDER_LINE_ITEM: ORDER_DATE, QUANTITY, PRICE
    ODS_INVOICE_SUMMARY: PYMT_TYPE

lkp_ITEM_ID (Lookup):
  Lookup transformation that obtains item keys from the DIM_ITEM table. The DIM_ITEM table is located in the EDWxx schema.
  Lookup Condition:
    ITEM_ID from DIM_ITEM = ITEM_ID from ORDER_LINE_ITEM
  Sent to exp_SET_UNKNOWN_KEYS: ITEM_KEY, COST

Lkp_CUSTOMER_INFO (Lookup):
  Lookup transformation that obtains customer keys from the DIM_CUSTOMER_PT table. The DIM_CUSTOMER_PT table is located in the EDWxx schema.
  Lookup Condition:
    CUSTOMER_ID from DIM_CUSTOMER_PT = CUSTOMER_ID from ORDER_LINE_ITEM
  Sent to exp_SET_UNKNOWN_KEYS: C_CUSTKEY, C_CUST_ID, C_MKTSEGMENT

exp_SET_UNKNOWN_KEYS (Expression):
  Expression transformation that sets values for missing columns (item cost, mktsegment). It also derives the region the customer belongs in.
  Output Ports:
    MKTSEGMENT_out
      Formula: IIF(ISNULL(MKTSEGMENT), 'UNKNOWN', MKTSEGMENT)
    ITEM_COST_out
      Formula: IIF(ISNULL(ITEM_KEY), 0.00, ITEM_COST)
    REGION_out
      Formula:
        IIF(C_CUST_ID > 0 AND C_CUST_ID < 50000, 'WEST',
        IIF(C_CUST_ID >= 50000 AND C_CUST_ID < 95000, 'CENTRAL',
        IIF(C_CUST_ID >= 95000 AND C_CUST_ID < 120000, 'SOUTH',
        IIF(C_CUST_ID >= 120000 AND C_CUST_ID < 200501, 'EAST', 'UNKNOWN'))))
  Sent to agg_VALUES: all output ports

agg_VALUES (Aggregator):
  Aggregator transformation that calculates the revenue, quantity and cost.
  Group by ports:
    C_CUSTKEY, ORDER_DATE, MKTSEGMENT, REGION, ITEM_KEY
  Output ports:
    ORDER_QUANTITY
      Formula: SUM(QUANTITY)
    ORDER_REVENUE
      Formula: SUM(PRICE * QUANTITY)
    ORDER_COST
      Formula: SUM(ITEM_COST * QUANTITY)
  Sent to FACT_MKT_SEGMENT_ORDERS: all output ports

Seq_ORDER_KEY (Sequence Generator):
  Sequence Generator transformation that populates the system-generated ORDER_KEY.
  Sent to FACT_MKT_SEGMENT_ORDERS: NEXTVAL

Shortcut_to_FACT_MKT_SEGMENT_ORDERS (Target Definition):
  Fact table located in the EDWxx schema.
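
As a readability option (not something the lab requires), the nested IIFs for REGION_out could equivalently be written with the DECODE(TRUE, ...) idiom, using the same conditions; a sketch:

    DECODE(TRUE,
        C_CUST_ID > 0 AND C_CUST_ID < 50000, 'WEST',
        C_CUST_ID >= 50000 AND C_CUST_ID < 95000, 'CENTRAL',
        C_CUST_ID >= 95000 AND C_CUST_ID < 120000, 'SOUTH',
        C_CUST_ID >= 120000 AND C_CUST_ID < 200501, 'EAST',
        'UNKNOWN')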

Documented Results

For each run, record the Rows Processed, Rows Failed, Start Time, End Time, Elapsed Time (Secs) and Rows Per Second for the following sessions:

  ETL Read Baseline
  Original Session
  Write to Flat File Test (Target)
  Filter Test (Source)
  Read Mapping Test (Source or Mapping)
  Filter Test (Mapping)


Lab 10: Partitioning Workshop


Business Purpose
The support group within the IT Department has taken over the support of an ETL system that was recently put into production. During development the test data was not up to standard; therefore, serious performance testing could not be accomplished. The system has been in production for a while, and the support group has already taken some steps to optimize the sessions that have been running. The time window is still tight, so management wants the support group to look at partitioning some of the sessions to see if this would help.

Technical Description
The sessions/mappings that are in need of analysis are:

s_m_Target_Bottleneck_xx. This session reads in a relational source that contains customer account
balances for the year.

s_m_Items_Bottleneck_xx. This mapping reads a large flat file of items-sold data, filters out last year's
stock, applies some row-level manipulation, performs a lookup to get cost information and then loads
the data into an Oracle table.
Note: The s_m_Items_Bottleneck_xx mapping is a hypothetical example. It does not exist in the
repository.

s_m_Source_Bottleneck_xx. This mapping reads in one relational source that contains customer
account balances and another relational source that contains customer demographic information. The
two tables are joined at the database side.

s_m_Mapping_Bottleneck_xx. This mapping reads in a flat file of order data, finds the customer
market segment information, filters out rows that haven't sold more than one item, aggregates the
orders and writes the values out to a relational table.

The support group needs to review each one of these sessions to determine whether it makes sense to partition the session.

Objectives

Review the sessions and based on knowledge gained from the presentations determine what
partitioning, if any, should be done.

Duration
60 minutes

Object Locations
ProjectX folder


Workshop Scenarios
Scenario 1
The session in question, s_m_Target_Bottleneck_xx, has been optimized already, but it is felt that more
can be done. The machine that the session is running on has 32 GB of memory and 16 CPUs. The
mapping takes account data from a relational source, calculates various balances and then writes the data
out to the BalanceSummary table. The BalanceSummary table is an Oracle table that the DBA has
partitioned by the account_num column.
Answer the following questions:

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
IV.  What partition types should be used and where?
V.   In what way will this increase performance?

Review Partition Points

1. Edit the s_m_Target_Bottleneck_xx session located in the wf_Target_Bottleneck_xx workflow.
2. Click the Mapping > Partitions tab to see the partition points.
3. Select each transformation and look at the window at the bottom of the screen to see what partition type is being used for that particular partition point.
Partition Test
The purpose of this section is to implement partitioning on the s_m_Target_Bottleneck_xx session.
1. Copy the wf_Target_Bottleneck_xx workflow and rename it to wf_Target_Bottleneck_Partition_xx.
2. Edit the s_m_Target_Bottleneck_xx session located in the wf_Target_Bottleneck_Partition_xx workflow and rename it to s_m_Target_Bottleneck_Partition_xx.
3. Click the Mapping tab, and then click the Partitions tab.
4. On the Partitions tab, select the Shortcut_to_BalanceSummary transformation, click the Edit Partition Point icon and add two new partitions.
5. Select Key Range from the drop-down box and click OK.
6. Leave <**All**> selected in the Key Range drop-down menu.
7. Click Edit Keys. This allows you to define the columns that make up the key range.
8. Add the Account_num column to the Key Range and click OK.
9. Enter the following ranges for the 3 partitions (see the note on range boundaries after this procedure):
   Partition #1 - start range 1, end range 3500
   Partition #2 - start range 3500, end range 7000
   Partition #3 - start range 7000
10. Select the SQ_Shortcut_to_Source2 partition point and edit the partition point.
11. Select Key Range from the drop-down box.
12. Add the Account_num column to the Key Range and click OK.
13. Enter the following ranges for the 3 partitions:
    Partition #1 - start range 1, end range 3500
    Partition #2 - start range 3500, end range 7000
    Partition #3 - start range 7000
14. Save, start and monitor the workflow.
15. Compare the results against the original session results and against the indexed session results. Is there a performance gain?
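
A note on the range boundaries used above (based on how PowerCenter key range partitioning generally behaves; verify in your environment): the start of a range is inclusive and the end is exclusive, so a row with Account_num = 3500 falls into Partition #2 rather than Partition #1, and leaving the end range of Partition #3 blank makes it open-ended.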

Conclusion
The instructor will discuss the answers to the questions in the lab wrap-up.

Scenario 2
Note: The mapping shown in this scenario is a hypothetical example. It does not exist in the repository.

The session in question, s_m_Items_Bottleneck_xx, has been running slowly and the project manager
wants it optimized. The machine that this is running on has 8 GB of memory and 4 CPUs. The mapping
takes items-sold data from a large flat file, transforms it and writes it out to an Oracle table. The flat file
comes from one location and splitting it up is not an option. The second Expression transformation is
very complex and takes a long time to push the rows through.

Mapping Overview

Answer the following questions:

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
IV.  What partition types should be used and where?
V.   In what way will this increase performance?


Conclusion
The instructor will discuss the answers to the questions in the lab wrap-up.

Scenario 3
The session in question, s_m_Source_Bottleneck_xx, has been running slowly and the project manager
wants it optimized. The machine that this is running on has 2 GB of memory and 2 CPUs. The mapping
reads one relational source that contains customer account balances and another relational source that
contains customer demographic information. The tables are joined on the database side; the rows are then
pushed through an Expression transformation and loaded into an Oracle table.

Mapping Overview

Answer the following questions:

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
IV.  What partition types should be used and where?
V.   In what way will this increase performance?

Conclusion
The instructor will discuss the answers to the questions in the lab wrap-up.

Scenario 4
The session in question, s_m_Mapping_Bottleneck_Sorter_xx, is still not running quite as fast as is
needed. The machine that this is running on has 24 GB of memory and 16 CPUs. The mapping reads a
flat file source that is really 3 region-specific flat files being read from a file list. The rows are then passed
through two lookups to obtain item costs and customer information. The data is then sorted and aggregated
before being loaded into an Oracle table. The customer is part of the sort key, and the DBA has
partitioned the Oracle table by customer_key. What can be done to further optimize this session/mapping?

Mapping Overview

Answer the following questions:

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
IV.  What partition types should be used and where?
V.   In what way will this increase performance?

Conclusion
The instructor will discuss the answers to the questions in the lab wrap-up.


Answers

Scenario 1

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
     Source Qualifier, Target.
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
     Yes.
IV.  What partition types should be used and where?
     Key Range at both the source and the target.
V.   In what way will this increase performance?
     This will add multiple connections to the source and target, which will result in data being read concurrently. This will be faster.

Scenario 2

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
     Source Qualifier and Target.
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
     Yes.
IV.  What partition types should be used and where?
     An additional pass-through partition point at the exp_complex_calculations transformation.
V.   In what way will this increase performance?
     This will add one more pipeline stage, which in turn will give you an additional buffer to move data.

Scenario 3

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
     Source Qualifier, Target.
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
     No. Each partition requires at least 1-2 CPUs, and this machine has only 2.
IV.  What partition types should be used and where?
     N/A.
V.   In what way will this increase performance?
     N/A.

Scenario 4

I.   How many pipeline stages does this session contain?
II.  What default partition points does this session contain?
     Source Qualifier, Aggregator and Target.
III. Can partitions be added/deleted, or can the partition types be changed to make this more efficient?
     Yes.
IV.  What partition types should be used and where?
     3 partitions:
     - Key Range at the target.
     - Split the source into the 3 region-specific files and read each one into one of the partitions.
     - Hash Auto-Keys at the Sorter transformation. This also allows you to remove the partition point at the Aggregator if you like.
V.   In what way will this increase performance?
     Additional connections at the target will load faster.
     You need to split the source flat file into the 3 region-specific files because you can have only one connection open to a flat file.
     Hash Auto-Keys is required to make sure that there is no overlap at the Aggregator. You could also remove the partition point at the Aggregator if you like.
     If the flat files vary significantly in size, you may want to add a round-robin partition point somewhere; in this particular mapping, that would not make sense.
