Data Integrator Designer Guide

Sections

  • Data Integrator Designer Guide
  • Welcome
  • Overview of this document
  • Audience and assumptions
  • More Data Integrator product documentation
  • About this chapter
  • Creating a Data Integrator repository
  • Associating the repository with a Job Server
  • Entering repository information
  • Version restrictions
  • Oracle login
  • Microsoft SQL Server login
  • IBM DB2 login
  • Sybase ASE login
  • Resetting users
  • Data Integrator objects
  • Reusable objects
  • Single-use objects
  • Object hierarchy
  • Designer window
  • Menu bar
  • Project menu
  • Edit menu
  • View menu
  • Tools menu
  • Debug menu
  • Validation menu
  • Window menu
  • Help menu
  • Toolbar
  • Project area
  • Tool palette
  • Workspace
  • Moving objects in the workspace area
  • Connecting and disconnecting objects
  • Describing objects
  • Scaling the workspace
  • Arranging workspace windows
  • Closing workspace windows
  • Local object library
  • Object editors
  • Working with objects
  • Creating new reusable objects
  • Changing object names
  • Viewing and changing object properties
  • Creating descriptions
  • Creating annotations
  • Saving and deleting objects
  • Searching for objects
  • General and environment options
  • Designer — Environment
  • Designer — General
  • Designer — Graphics
  • Designer — Central Repository Connections
  • Data — General
  • Job Server — Environment
  • Job Server — General
  • Projects
  • Objects that make up a project
  • Creating new projects
  • Opening existing projects
  • Saving projects
  • Jobs
  • Creating jobs
  • Naming conventions for objects in jobs
  • Datastores
  • What are datastores?
  • Database datastores
  • Mainframe interface
  • Defining a database datastore
  • Changing a datastore definition
  • Browsing metadata through a database datastore
  • Importing metadata through a database datastore
  • Imported table information
  • Imported stored function and procedure information
  • Ways of importing metadata
  • Reimporting objects
  • Memory datastores
  • Memory table target options
  • Persistent cache datastores
  • Linked datastores
  • Adapter datastores
  • Defining an adapter datastore
  • Browsing metadata through an adapter datastore
  • Importing metadata through an adapter datastore
  • Creating and managing multiple datastore configurations
  • Definitions
  • Why use multiple datastore configurations?
  • Creating a new configuration
  • Adding a datastore alias
  • Portability solutions
  • Migration between environments
  • Multiple instances
  • OEM deployment
  • Multi-user development
  • Job portability tips
  • Renaming table and function owner
  • Defining a system configuration
  • What are file formats?
  • File format editor
  • Creating file formats
  • Creating a new file format
  • Modeling a file format on a sample file
  • Replicating and renaming file formats
  • Creating a file format from an existing flat table schema
  • Editing file formats
  • File format features
  • Reading multiple files at one time
  • Identifying source file names
  • Number formats
  • Ignoring rows with specified markers
  • Date formats at the field level
  • Error handling for flat-file sources
  • Creating COBOL copybook file formats
  • File transfers
  • Custom transfer system variables for flat files
  • Custom transfer options for flat files
  • Setting custom transfer options
  • Design tips
  • Web log support
  • Word_ext function
  • Concat_date_time function
  • WL_GetKeyValue function
  • Data Flows
  • What is a data flow?
  • Naming data flows
  • Data flow example
  • Steps in a data flow
  • Data flows as steps in work flows
  • Intermediate data sets in a data flow
  • Operation codes
  • Passing parameters to data flows
  • Creating and defining data flows
  • Source and target objects
  • Source objects
  • Target objects
  • Adding source or target objects to data flows
  • Template tables
  • Transforms
  • Transform editors
  • Adding transforms to data flows
  • Query transform overview
  • Adding a Query transform to a data flow
  • Query editor
  • Data flow execution
  • Push down operations to the database server
  • Distributed data flow execution
  • Load balancing
  • Caches
  • Audit Data Flow Overview
  • What is a work flow?
  • Steps in a work flow
  • Order of execution in work flows
  • Example of a work flow
  • Creating work flows
  • Conditionals
  • While loops
  • Design considerations
  • Defining a while loop
  • Using a while loop with View Data
  • Try/catch blocks
  • Categories of available exceptions
  • Scripts
  • Debugging scripts using the print function
  • Nested Data
  • What is nested data?
  • Representing hierarchical data
  • Formatting XML documents
  • Importing XML Schemas
  • Importing XML schemas
  • Importing abstract types
  • Importing substitution groups
  • Specifying source options for XML files
  • Reading multiple XML files at one time
  • Mapping optional schemas
  • Using Document Type Definitions (DTDs)
  • Generating DTDs and XML Schemas from an NRDM schema
  • Operations on nested data
  • Overview of nested data and the Query transform
  • FROM clause construction
  • Nesting columns
  • Using correlated columns in nested data
  • Distinct rows and nested data
  • Grouping values across nested schemas
  • Unnesting nested data
  • How transforms handle nested data
  • XML extraction and parsing for columns
  • Sample Scenarios
  • Overview
  • Request-response message processing
  • What is a real-time job?
  • Real-time versus batch
  • Messages
  • Real-time job examples
  • Creating real-time jobs
  • Real-time job models
  • Single data flow model
  • Multiple data flow model
  • Using real-time job models
  • Creating a real-time job
  • Real-time source and target objects
  • Secondary sources and targets
  • Transactional loading of tables
  • Design tips for data flows in real-time jobs
  • Testing real-time jobs
  • Executing a real-time job in test mode
  • Using an XML file target
  • Building blocks for real-time jobs
  • Supplementing message data
  • Branching data flow based on a data cache value
  • Calling application functions
  • Designing real-time applications
  • Reducing queries requiring back-office application access
  • Messages from real-time jobs to adapter instances
  • Real-time service invoked by an adapter instance
  • Example of when to use embedded data flows
  • Creating embedded data flows
  • Using the Make Embedded Data Flow option
  • Creating embedded data flows from existing flows
  • Using embedded data flows
  • Testing embedded data flows
  • Troubleshooting embedded data flows
  • The Variables and Parameters window
  • The Variables and Parameters window opens
  • Using local variables and parameters
  • Parameters
  • Passing values into data flows
  • Defining local variables
  • Defining parameters
  • Using global variables
  • Creating global variables
  • Viewing global variables
  • Setting global variable values
  • Local and global variable rules
  • Naming
  • Replicating jobs and work flows
  • Importing and exporting
  • Environment variables
  • Setting file names at run-time using variables
  • Overview of Data Integrator job execution
  • Preparing for job execution
  • Validating jobs and job components
  • Ensuring that the Job Server is running
  • Setting job execution options
  • Executing jobs as immediate tasks
  • Monitor tab
  • Log tab
  • Debugging execution errors
  • Using Data Integrator logs
  • Examining trace logs
  • Examining monitor logs
  • Examining error logs
  • Examining target data
  • Changing Job Server options
  • Chapter overview
  • Using the Data Profiler
  • Data sources that you can profile
  • Connecting to the profiler server
  • Profiler statistics
  • Column profile
  • Basic profiling
  • Detailed profiling
  • Relationship profile
  • Executing a profiler task
  • Submitting column profiler tasks
  • Submitting relationship profiler tasks
  • Monitoring profiler tasks using the Designer
  • Viewing the profiler results
  • Viewing column profile data
  • Viewing relationship profile data
  • Using View Data to determine data quality
  • Data tab
  • Profile tab
  • Relationship Profile or Column Profile tab
  • Using the Validation transform
  • Analyze column profile
  • Define validation rule based on column profile
  • Using Auditing
  • Auditing objects in a data flow
  • Accessing the Audit window
  • Defining audit points, rules, and action on failure
  • Guidelines to choose audit points
  • Auditing embedded data flows
  • Enabling auditing in an embedded data flow
  • Audit points not visible outside of the embedded data flow
  • Resolving invalid audit labels
  • Viewing audit results
  • Job Monitor Log
  • Job Error Log
  • Metadata Reports
  • Data Cleansing with Data Integrator Data Quality
  • Overview of Data Integrator Data Quality architecture
  • Data Quality Terms and Definitions
  • Overview of steps to use Data Integrator Data Quality
  • Creating a Data Quality datastore
  • Importing Data Quality Projects
  • Using the Data Quality transform
  • Mapping input fields from the data flow to the project
  • Creating custom projects
  • Data Quality blueprints for Data Integrator
  • Using View Where Used
  • From the object library
  • From the workspace
  • Using View Data
  • Accessing View Data
  • Viewing data in the workspace
  • View Data properties
  • Filtering
  • Sorting
  • View Data tool bar options
  • View Data tabs
  • Column Profile tab
  • Using the interactive debugger
  • Before starting the interactive debugger
  • Changing the interactive debugger port
  • Starting and stopping the interactive debugger
  • Windows
  • Filters and Breakpoints window
  • Menu options and tool bar
  • Viewing data passed by transforms
  • Push-down optimizer
  • Comparing Objects
  • Overview of the Difference Viewer window
  • To change the color scheme
  • Navigating through differences
  • Calculating usage dependencies
  • Metadata exchange
  • Importing metadata files into Data Integrator
  • Exporting metadata files from Data Integrator
  • Creating Business Objects universes
  • Mappings between repository and universe metadata
  • Attributes that support metadata exchange
  • Recovery Mechanisms
  • Recovering from unsuccessful job execution
  • Automatically recovering jobs
  • Enabling automated recovery
  • Marking recovery units
  • Running in recovery mode
  • Ensuring proper execution path
  • Using try/catch blocks with automatic recovery
  • Ensuring that data is not duplicated in targets
  • Using preload SQL to allow re-executable data flows
  • Manually recovering jobs using status tables
  • Processing data with problems
  • Using overflow files
  • Filtering missing or bad values
  • Handling facts with missing dimensions
  • Understanding changed-data capture
  • Full refresh
  • Capturing only changes
  • Source-based and target-based CDC
  • Using CDC with Oracle sources
  • Overview of CDC for Oracle databases
  • Setting up Oracle CDC
  • CDC datastores
  • Importing CDC data from Oracle
  • Viewing an imported CDC table
  • Configuring an Oracle CDC source
  • Creating a data flow with an Oracle CDC source
  • Maintaining CDC tables and subscriptions
  • Limitations
  • Using CDC with DB2 sources
  • Guaranteed delivery
  • Setting up DB2
  • Setting up Data Integrator
  • CDC Services
  • Importing CDC data from DB2
  • Configuring a DB2 CDC source
  • Using CDC with Attunity mainframe sources
  • Setting up Attunity CDC
  • Importing mainframe CDC data
  • Configuring a mainframe CDC source
  • Using mainframe check-points
  • Using CDC with Microsoft SQL Server databases
  • Overview of CDC for SQL Server databases
  • Setting up SQL Replication Server for CDC
  • Importing SQL Server CDC data
  • Configuring a SQL Server CDC source
  • Using CDC with timestamp-based sources
  • Processing timestamps
  • Overlaps
  • Overlap avoidance
  • Overlap reconciliation
  • Presampling
  • Types of timestamps
  • Create-only timestamps
  • Update-only timestamps
  • Create and update timestamps
  • Timestamp-based CDC examples
  • Preserving generated keys
  • Using the lookup function
  • Comparing tables
  • Preserving history
  • Additional job design tips
  • Header and detail synchronization
  • Capturing physical deletions
  • Using CDC for targets
  • Administrator
  • SNMP support
  • About the Data Integrator SNMP agent
  • Job Server, SNMP agent, and NMS application architecture
  • About SNMP Agent’s Management Information Base (MIB)
  • About an NMS application
  • Configuring Data Integrator to support an NMS application
  • SNMP configuration parameters
  • Job Servers for SNMP
  • System Variables
  • Access Control, v1/v2c
  • Access Control, v3
  • Traps
  • Troubleshooting
  • Index

Data Integrator Designer Guide

This document is part of a SAP study on PDF usage. See the last page of this document and find out how you can participate and help to improve our documentation.

Data Integrator 11.7.2 for Windows and UNIX

Copyright

If you find any problems with this documentation, please report them to Business Objects S.A. in writing at documentation@businessobjects.com. Copyright © Business Objects S.A. 2007. All rights reserved.

Trademarks

Business Objects, the Business Objects logo, Crystal Reports, and Crystal Enterprise are trademarks or registered trademarks of Business Objects SA or its affiliated companies in the United States and other countries. All other names mentioned herein may be trademarks of their respective owners.

Third-party contributors

Business Objects products in this release may contain redistributions of software licensed from third-party contributors. Some of these individual components may also be available under alternative licenses. A partial listing of third-party contributors that have requested or permitted acknowledgments, as well as required notices, can be found at: http://www.businessobjects.com/thirdparty

Patents

Business Objects owns the following U.S. patents, which may cover products that are offered and sold by Business Objects: 5,555,403, 6,247,008 B1, 6,578,027 B2, 6,490,593 and 6,289,352.

Date

April 26, 2007

Contents
Chapter 1 Introduction 17 Welcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Overview of this document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Audience and assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 More Data Integrator product documentation . . . . . . . . . . . . . . . . . . . . . . 19 Chapter 2 Logging in to the Designer 21

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Creating a Data Integrator repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Associating the repository with a Job Server . . . . . . . . . . . . . . . . . . . . . . . 22 Entering repository information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Version restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Oracle login . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 Microsoft SQL Server login . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 IBM DB2 login . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Sybase ASE login . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 Resetting users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 Chapter 3 Designer user interface 29

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Data Integrator objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Reusable objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Single-use objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Object hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Designer window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Menu bar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 Project menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 Edit menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 View menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Tools menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Debug menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Validation menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Window menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Help menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Toolbar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Project area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 Tool palette . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Moving objects in the workspace area . . . . . . . . . . . . . . . . . . . . . . . . . 44 Connecting and disconnecting objects . . . . . . . . . . . . . . . . . . . . . . . . . 45 Describing objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Scaling the workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 Arranging workspace windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Closing workspace windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Local object library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 Object editors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Working with objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Creating new reusable objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 Changing object names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 Viewing and changing object properties . . . . . . . . . . . . . . . . . . . . . . . . 53 Creating descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Creating annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 Saving and deleting objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Searching for objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 General and environment options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Designer — Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Designer — General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 Designer — Graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 Designer — Central Repository Connections . . . . . . . . . . . . . . . . . . . . 69 Data — General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Job Server — Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 Job Server — General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

Chapter 4 Projects and Jobs 71

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Objects that make up a project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Creating new projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Opening existing projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Saving projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 Jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 Creating jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 Naming conventions for objects in jobs . . . . . . . . . . . . . . . . . . . . . . . . 76 Chapter 5 Datastores 79

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 What are datastores? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 Database datastores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Mainframe interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Defining a database datastore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 Changing a datastore definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 Browsing metadata through a database datastore . . . . . . . . . . . . . . . 90 Importing metadata through a database datastore . . . . . . . . . . . . . . . 94 Memory datastores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 Persistent cache datastores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 Linked datastores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 Adapter datastores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 Defining an adapter datastore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 Browsing metadata through an adapter datastore . . . . . . . . . . . . . . . 114 Importing metadata through an adapter datastore . . . . . . . . . . . . . . . 114 Creating and managing multiple datastore configurations . . . . . . . . . . . . 115 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 Why use multiple datastore configurations? . . . . . . . . . . . . . . . . . . . 117 Creating a new configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 Adding a datastore alias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 Portability solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Job portability tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 Renaming table and function owner . . . . . . . . . . . . . . . . . . . . . . . . . . 126 Defining a system configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 Chapter 6 File Formats 135

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 What are file formats? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 File format editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 Creating file formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 Creating a new file format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 Modeling a file format on a sample file . . . . . . . . . . . . . . . . . . . . . . . . 143 Replicating and renaming file formats . . . . . . . . . . . . . . . . . . . . . . . . . 144 Creating a file format from an existing flat table schema . . . . . . . . . . 147 Editing file formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 File format features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 Reading multiple files at one time . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 Identifying source file names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 Number formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 Ignoring rows with specified markers . . . . . . . . . . . . . . . . . . . . . . . . . 151 Date formats at the field level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 Error handling for flat-file sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Creating COBOL copybook file formats . . . . . . . . . . . . . . . . . . . . . . . . . . 157 File transfers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 Custom transfer system variables for flat files . . . . . . . . . . . . . . . . . . 160 Custom transfer options for flat files . . . . . . . . . . . . . . . . . . . . . . . . . . 162 Setting custom transfer options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 Design tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 Web log support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 Word_ext function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 Concat_date_time function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 WL_GetKeyValue function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

Chapter 7 Data Flows 171

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 What is a data flow? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 Naming data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 Data flow example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 Steps in a data flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 Data flows as steps in work flows . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 Intermediate data sets in a data flow . . . . . . . . . . . . . . . . . . . . . . . . . 174 Operation codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 Passing parameters to data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 Creating and defining data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 Source and target objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 Source objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 Target objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 Adding source or target objects to data flows . . . . . . . . . . . . . . . . . . 180 Template tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 Transform editors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 Adding transforms to data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 Query transform overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Adding a Query transform to a data flow . . . . . . . . . . . . . . . . . . . . . . 188 Query editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 Data flow execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Push down operations to the database server . . . . . . . . . . . . . . . . . . 192 Distributed data flow execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 Load balancing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 Audit Data Flow Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 Chapter 8 Work Flows 197

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 What is a work flow? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 Steps in a work flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198

Order of execution in work flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 Example of a work flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 Creating work flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 Conditionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 While loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 Design considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 Defining a while loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 Using a while loop with View Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 Try/catch blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 Categories of available exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 Debugging scripts using the print function . . . . . . . . . . . . . . . . . . . . . 213 Chapter 9 Nested Data 215

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 What is nested data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 Representing hierarchical data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 Formatting XML documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 Importing XML Schemas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 Specifying source options for XML files . . . . . . . . . . . . . . . . . . . . . . . 228 Mapping optional schemas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 Using Document Type Definitions (DTDs) . . . . . . . . . . . . . . . . . . . . . 232 Generating DTDs and XML Schemas from an NRDM schema . . . . . 234 Operations on nested data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 Overview of nested data and the Query transform . . . . . . . . . . . . . . . 236 FROM clause construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 Nesting columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 Using correlated columns in nested data . . . . . . . . . . . . . . . . . . . . . . 241 Distinct rows and nested data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 Grouping values across nested schemas . . . . . . . . . . . . . . . . . . . . . . 243 Unnesting nested data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 How transforms handle nested data . . . . . . . . . . . . . . . . . . . . . . . . . . 246 XML extraction and parsing for columns . . . . . . . . . . . . . . . . . . . . . . . . . . 247

Sample Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 Chapter 10 Real-time jobs 253

Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254 Request-response message processing . . . . . . . . . . . . . . . . . . . . . . . . . 254 What is a real-time job? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 Real-time versus batch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256 Real-time job examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 Creating real-time jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 Real-time job models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 Using real-time job models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261 Creating a real-time job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 Real-time source and target objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266 Secondary sources and targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267 Transactional loading of tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268 Design tips for data flows in real-time jobs . . . . . . . . . . . . . . . . . . . . 269 Testing real-time jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270 Executing a real-time job in test mode . . . . . . . . . . . . . . . . . . . . . . . . 270 Using View Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 Using an XML file target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 Building blocks for real-time jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 Supplementing message data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 Branching data flow based on a data cache value . . . . . . . . . . . . . . 275 Calling application functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280 Designing real-time applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 Reducing queries requiring back-office application access . . . . . . . . 281 Messages from real-time jobs to adapter instances . . . . . . . . . . . . . 282 Real-time service invoked by an adapter instance . . . . . . . . . . . . . . 282 Chapter 11 Embedded Data Flows 283

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Example of when to use embedded data flows . . . . . . . . . . . . . . . . . . . . . 285 Creating embedded data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286 Using the Make Embedded Data Flow option . . . . . . . . . . . . . . . . . . . 286 Creating embedded data flows from existing flows . . . . . . . . . . . . . . . 289 Using embedded data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 Testing embedded data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 Troubleshooting embedded data flows . . . . . . . . . . . . . . . . . . . . . . . . 293 Chapter 12 Variables and Parameters 295

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 The Variables and Parameters window . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 Using local variables and parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301 Passing values into data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301 Defining local variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302 Defining parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302 Using global variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304 Creating global variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304 Viewing global variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305 Setting global variable values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306 Local and global variable rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Naming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Replicating jobs and work flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Importing and exporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314 Environment variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314 Setting file names at run-time using variables . . . . . . . . . . . . . . . . . . . . . . 314 Chapter 13 Executing Jobs 317

About this chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318 Overview of Data Integrator job execution . . . . . . . . . . . . . . . . . . . . . . . . 318 Preparing for job execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319 Validating jobs and job components . . . . . . . . . . . . . . . . . . . . . . . . . . 319

Ensuring that the Job Server is running . . . . . . . . . . . . . . . . . . . . . . . 320 Setting job execution options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320 Executing jobs as immediate tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321 Monitor tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323 Log tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323 Debugging execution errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324 Using Data Integrator logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325 Examining target data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329 Changing Job Server options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329 Chapter 14 Data Quality 333

Chapter overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334 Using the Data Profiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 Data sources that you can profile . . . . . . . . . . . . . . . . . . . . . . . . . . . 336 Connecting to the profiler server . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336 Profiler statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339 Executing a profiler task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342 Monitoring profiler tasks using the Designer . . . . . . . . . . . . . . . . . . . 349 Viewing the profiler results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350 Using View Data to determine data quality . . . . . . . . . . . . . . . . . . . . . . . 356 Data tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356 Profile tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357 Relationship Profile or Column Profile tab . . . . . . . . . . . . . . . . . . . . . 357 Using the Validation transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 Analyze column profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 Define validation rule based on column profile . . . . . . . . . . . . . . . . . 359 Using Auditing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362 Auditing objects in a data flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362 Accessing the Audit window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366 Defining audit points, rules, and action on failure . . . . . . . . . . . . . . . 367 Guidelines to choose audit points . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 Auditing embedded data flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372 Resolving invalid audit labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376

Chapter 15 Design and Debug 397
Chapter 16 Exchanging metadata 445
Chapter 17 Recovery Mechanisms 453
Chapter 18 Techniques for Capturing Changed Data 471
Chapter 19 Monitoring jobs 547
Index 569


Chapter 1
Introduction

and load data from databases and applications into a data warehouse used for analytic and on-demand queries. transform. and messaging concepts. or data integration. You understand your source data systems. business intelligence. 1 Introduction Welcome Welcome Welcome to the Data Integrator Designer Guide. Find out how you can participate and help to improve our documentation. 18 Data Integrator Designer Guide . frontoffice. You understand your organization’s data needs. You can also use the Designer to define logical paths for processing message-based queries and transactions from Web-based. or database administrator working on data extraction. consultant. RDBMS. and back-office applications.This document is part of a SAP study on PDF usage. The Data Integrator Designer provides a graphical user interface (GUI) development environment in which you define data application logic to extract. You are familiar with SQL (Structured Query Language). This chapter discusses these topics: • • • Overview of this document Audience and assumptions More Data Integrator product documentation Overview of this document The book contains two kinds of information: • • Conceptual information that helps you understand the Data Integrator Designer and how it works Procedural information that explains in a step-by-step manner how to accomplish a task While you are learning about the product While you are performing tasks in the design and early testing phase of your data-movement projects As a general source of information during any phase of your projects You will find this book most useful: • • • Audience and assumptions This and other Data Integrator product documentation assumes the following: • • • • You are an application developer. data warehousing.

• If you are interested in using this product to design real-time processing, you should be familiar with:
  • DTD and XML Schema formats for XML files
  • Publishing Web Services (WSDL, HTTP, and SOAP protocols, etc.)
• You are familiar with Data Integrator installation environments—Microsoft Windows or UNIX.

More Data Integrator product documentation

Consult the Data Integrator Getting Started Guide for:
• An overview of Data Integrator products and architecture
• Data Integrator installation and configuration information
• A list of product documentation and a suggested reading path

After you install Data Integrator, you can view technical documentation from many locations. To view documentation in PDF format, you can:
• If you accepted the default installation, select Start > Programs > Business Objects > Data Integrator > Data Integrator Documentation and select:
  • Release Notes—Opens this document, which includes known and fixed bugs, migration considerations, and last-minute documentation corrections
  • Release Summary—Opens the Release Summary PDF, which describes the latest Data Integrator features
  • Tutorial—Opens the Data Integrator Tutorial PDF, which you can use for basic stand-alone training purposes
• Select one of the following from the Designer's Help menu:
  • Release Notes
  • Release Summary
  • Technical Manuals
  • Tutorial

Other links from the Designer's Help menu include:
• DIZone—Opens a browser window to the DI Zone, an online resource for the Data Integrator user community
• Knowledge Base—Opens a browser window to Business Objects' Technical Support Knowledge Exchange forum (access requires registration)

You can also view and download PDF documentation, including Data Integrator documentation for previous releases (including Release Summaries and Release Notes), by visiting the Business Objects documentation Web site at http://support.businessobjects.com/documentation/.

You can also open Help using one of the following methods:
• Choose Contents from the Designer's Help menu.
• Click objects in the object library or workspace and press F1. Online Help opens to the subject you selected.

Use Online Help's links and tool bar to navigate.

Chapter 2 Logging in to the Designer

About this chapter

This chapter describes how to log in to the Data Integrator Designer. This chapter discusses:
• Creating a Data Integrator repository
• Associating the repository with a Job Server
• Entering repository information
• Resetting users

Creating a Data Integrator repository

You must configure a local repository to log in to Data Integrator. Typically, you create a repository during installation. However, you can create a repository at any time using the Data Integrator Repository Manager. Data Integrator repositories can reside on Oracle, Microsoft SQL Server, IBM DB2, or Sybase ASE. When you log in to the Data Integrator Designer, you are actually logging in to the database you defined for the Data Integrator repository.

To create a local repository
1. Define a database for the local repository using your database management system.
2. From the Start menu, choose Programs > Business Objects > Data Integrator > Repository Manager (assuming you installed Data Integrator in the Data Integrator program group).
3. In the Repository Manager window, enter the database connection information for the repository and select Local for repository type.
4. Click Create.
This adds the Data Integrator repository schema to the specified database.

Associating the repository with a Job Server

Each repository must be associated with at least one Job Server, which is the process that starts jobs. When running a job from a repository, you select one of the associated Job Servers. You can link any number of repositories to a single Job Server; the same Job Server can run jobs stored on multiple repositories. In production environments, you can balance loads appropriately.

Typically, you define a Job Server and link it to a repository during installation. However, you can define or edit Job Servers or links between repositories and Job Servers at any time using the Data Integrator Server Manager.

To create a Job Server for your local repository
1. Open the Data Integrator Server Manager. From the Start menu, choose Programs > Business Objects > Data Integrator > Server Manager (assuming you installed Data Integrator in the Data Integrator program group).
2. Add a Job Server and associate it with your local repository. See the Data Integrator Getting Started Guide for detailed instructions.

Entering repository information

To log in, enter the connection information for your Data Integrator repository. The required information varies with the type of database containing the repository. This section discusses:
• Version restrictions
• Oracle login
• Microsoft SQL Server login
• IBM DB2 login
• Sybase ASE login

Version restrictions

Your repository version must be associated with the same major release as the Designer and must be less than or equal to the version of the Designer. For example, Designer 11.7 can access repositories 11.5, 11.6, and 11.7 (equal to or less than), but not repository 6.5 (different major release version). So, in this example, repository 11.0 is the earliest repository version that could be used with Designer version 11.7.

During login, Data Integrator alerts you if there is a mismatch between your Designer version and your repository version. After you log in, you can view Data Integrator and repository versions by selecting Help > About Data Integrator.

Some features in the current release of the Designer might not be supported if you are not logged in to the latest version of the repository.

Oracle login

From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer.

In the Repository Login window, complete the following fields:
• Database type — Select Oracle.
• Database connection name — The TNSnames.ora entry or Net Service Name of the database.
• User name and Password — The user name and password for a Data Integrator repository defined in an Oracle database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.
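The Database connection name is not a host name; it is an alias defined in the Oracle client's tnsnames.ora file (or an equivalent Net Service Name). As a rough sketch only, a hypothetical entry for a repository database might look like the following, where the alias DI_REPO, the host, and the service name are placeholders rather than values supplied with Data Integrator:

    DI_REPO =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbserver.example.com)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = orcl))
      )

With an entry like this, you would type DI_REPO in the Database connection name field.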

Microsoft SQL Server login

From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer. For a Microsoft SQL Server repository, you must complete the following fields:
• Database type — Select Microsoft_SQL_Server.
• Database server name — The database server name.
• Database name — The name of the specific database to which you are connecting.
• Windows authentication — Select to have Microsoft SQL Server validate the login account name and password using information from the Windows operating system; clear to authenticate using the existing Microsoft SQL Server login account name and password and complete the User name and Password fields.
• User name and Password — The user name and password for a Data Integrator repository defined in a Microsoft SQL Server database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.

IBM DB2 login

From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer. For a DB2 repository, you must complete the following fields:
• Database type — Select DB2.
• DB2 datasource — The data source name.
• User name and Password — The user name and password for a Data Integrator repository defined in a DB2 database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.

Sybase ASE login

From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer. For a Sybase ASE repository, you must complete the following fields:
• Database type — Select Sybase ASE.
• Database server name — Enter the database's server name. Note: For UNIX Job Servers, when logging in to a Sybase repository in the Designer, the case you type for the database server name must match the associated case in the SYBASE_Home\interfaces file. If the case does not match, you might receive an error because the Job Server cannot communicate with the repository.
• Database name — Enter the name of the specific database to which you are connecting.
• User name and Password — Enter the user name and password for this database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.

Resetting users

Occasionally, more than one person may attempt to log in to a single repository. If this happens, the Reset Users window appears, listing the users and the time they logged in to the repository.

From this window, you have several options. You can:
• Reset Users to clear the users in the repository and set yourself as the currently logged in user.
• Continue to log in to the system regardless of who else might be connected.
• Exit to terminate the login attempt and close the session.

Note: Only use Reset Users or Continue if you know that you are the only user connected to the repository. Subsequent changes could corrupt the repository.


Chapter 3 Designer user interface

About this chapter

This chapter provides basic information about the Designer's graphical user interface. It contains the following topics:
• Data Integrator objects
• Designer window
• Menu bar
• Toolbar
• Project area
• Tool palette
• Workspace
• Local object library
• Object editors
• Working with objects
• General and environment options

Data Integrator objects

All "entities" you define, edit, or work with in Data Integrator Designer are called objects. The local object library shows objects such as source and target metadata, system functions, projects, and jobs.

Objects are hierarchical and consist of:
• Options, which control the operation of objects. For example, in a datastore, the name of the database to which you connect is an option for the datastore object.
• Properties, which document the object. For example, the name of the object and the date it was created are properties. Properties describe an object, but do not affect its operation.

Data Integrator has two types of objects:
• Reusable objects
• Single-use objects

The object type affects how you define and retrieve the object.

Reusable objects

You can reuse and replicate most objects defined in Data Integrator.

After you define and save a reusable object, Data Integrator stores the definition in the local repository. You can then reuse the definition as often as necessary by creating calls to the definition. Access reusable objects through the local object library.

A reusable object has a single definition; all calls to the object refer to that definition. If you change the definition of the object in one place, you are changing the object in all other places in which it appears.

The object library contains object definitions. When you drag and drop an object from the object library, you are really creating a new reference (or call) to the existing object definition.

A data flow, for example, is a reusable object. Multiple jobs, like a weekly load job and a daily load job, can call the same data flow. If the data flow changes, both jobs use the new version of the data flow.

Single-use objects

Some objects are defined only within the context of a single job or data flow, for example scripts and specific transform definitions.

Object hierarchy

Data Integrator object relationships are hierarchical. The following figure shows the relationships between major Data Integrator object types:

Designer window

The Data Integrator Designer user interface consists of a single application window and several embedded supporting windows.

The application window contains the Menu bar, Toolbar, Project area, Tool palette, tabbed Workspace, and tabbed Local object library.

Menu bar

This section contains a brief description of the Designer's menus:
• Project menu
• Edit menu
• View menu
• Tools menu
• Debug menu

• Validation menu
• Window menu
• Help menu

Project menu

The Project menu contains standard Windows as well as Data Integrator-specific options.
• New — Define a new project, batch job, real-time job, work flow, data flow, transform, datastore, file format, DTD, XML Schema, or custom function.
• Open — Open an existing project.
• Close — Close the currently open project.
• Delete — Delete the selected object.
• Save — Save the object open in the workspace.
• Save All — Save all changes to objects in the current Designer session.
• Print — Print the active workspace.
• Print Setup — Set up default printer information.
• Compact Repository — Remove redundant and obsolete objects from the repository tables.
• Exit — Exit Data Integrator Designer.

Edit menu

The Edit menu provides standard Windows commands with a few restrictions.
• Undo — Undo the last operation (text edits only).
• Cut — Cut the selected object or text and place it on the clipboard.
• Copy — Copy the selected object or text to the clipboard. Note: You cannot copy reusable objects using the Copy command; instead, use Replicate in the object library to make an independent copy of an object.
• Paste — Paste the contents of the clipboard into the active workspace or text box. Note: You can only paste clipboard contents once. To paste again, you must cut or copy the objects again.
• Delete — Delete the selected object.
• Clear All — Clear all objects in the active workspace (no undo).

View menu

A check mark indicates that the tool is active.
• Toolbar — Display or remove the toolbar in the Designer window.
• Palette — Display or remove the floating tool palette.
• Status Bar — Display or remove the status bar in the Designer window.

• Enabled Descriptions — View descriptions for objects with enabled descriptions.
• Refresh — Redraw the display. Use this command to ensure the content of the workspace represents the most up-to-date information from the repository.

Tools menu

An icon with a different color background indicates that the tool is active.
• Object Library — Open or close the object library window. For more information, see "Local object library" on page 47.
• Project Area — Display or remove the project area from the Data Integrator window. For more information, see "Project area" on page 41.
• Variables — Open or close the Variables and Parameters window. For more information, see "Variables and Parameters" on page 295.
• Output — Open or close the Output window. The Output window shows errors that occur such as during job validation or object export.
• Profiler Monitor — Display the status of Profiler tasks. For more information, see "Using the Data Profiler" on page 335.
• Custom Functions — Display the Custom Functions window. For more information, see the Data Integrator Reference Guide.
• System Configurations — Display the System Configurations editor. For more information, see "Creating and managing multiple datastore configurations" on page 115.

• Profiler Server Login — Connect to the Profiler Server. For more information, see "Connecting to the profiler server" on page 336.
• Export — Export individual repository objects to another repository or file. This command opens the Export editor in the workspace. You can drag objects from the object library into the editor for export. To export your whole repository, in the object library right-click and select Repository > Export to file. For more information, see the Data Integrator Advanced Development and Migration Guide.
• Import From File — Import objects into the current repository from a file. See the Data Integrator Advanced Development and Migration Guide.
• Metadata Reports — Display the Metadata Reports window. Select the object type, report type, and the objects in the repository that you want to list in the report. See "Metadata reporting tool" on page 427.
• Metadata Exchange — Import and export metadata to third-party systems via a file. See "Metadata exchange" on page 446.
• BusinessObjects Universes — Export (create or update) metadata in Business Objects Universes. See "Creating Business Objects universes" on page 449.
• Central Repositories — Create or edit connections to a central repository for managing object versions among multiple users. See the Data Integrator Advanced Development and Migration Guide.
• Options — Display the Options window. See "General and environment options" on page 66.

Debug menu

The only options available on this menu at all times are Show Filters/Breakpoints and Filters/Breakpoints. The Execute and Start Debug options are only active when a job is selected. All other options are available as appropriate when a job is running in the Debug mode. For more information, see "Using the interactive debugger" on page 418.
• Execute — Opens the Execution Properties window which allows you to execute the selected job.
• Start Debug — Opens the Debug Properties window which allows you to run a job in the debug mode.
• Show Filters/Breakpoints — Shows and hides filters and breakpoints in workspace diagrams.

• Filters/Breakpoints — Opens a window you can use to manage filters and breakpoints. For more information, see "Filters and Breakpoints window" on page 432.

Validation menu

The Designer displays options on this menu as appropriate when an object is open in the workspace.
• Validate — Validate the objects in the current workspace view or all objects in the job before executing the application.
• Display Language — View a read-only version of the language associated with the job.
• Display Optimized SQL — Display the SQL that Data Integrator generated for a selected data flow. See the Data Integrator Performance Optimization Guide.

Window menu

The Window menu provides standard Windows options.
• Back — Move back in the list of active workspace windows.
• Forward — Move forward in the list of active workspace windows.
• Cascade — Display window panels overlapping with titles showing.
• Tile Horizontally — Display window panels side by side.
• Tile Vertically — Display window panels one above the other.
• Close All Windows — Close all open windows.

• A list of objects open in the workspace also appears on the Window menu. The name of the currently-selected object is indicated by a check mark. Navigate to another open object by selecting its name in the list.

Help menu
• Contents — Display on-line help. Data Integrator's on-line help works with Microsoft Internet Explorer version 5.5 and higher.
• Technical Manuals — Display a PDF version of Data Integrator documentation. This file contains the same content as on-line help. This format prints graphics clearly and includes a master index and page numbers/references. It is provided for users who prefer to print out their documentation. You can also access the same file from the Help menu in the Administrator or from the <linkdir>\Doc\Books directory.
• Release Notes — Display current release notes.
• Release Summary — Display summary of new features in the current release.
• About Data Integrator — Display information about Data Integrator including versions of the Designer, Job Server and engine, copyright information, and a link to the Business Objects Web site.

Toolbar

In addition to many of the standard Windows tools, Data Integrator provides application-specific tools, including:

Close all windows — Closes all open windows in the workspace.
Local Object Library — Opens and closes the local object library window.
Central Object Library — Opens and closes the central object library window.
Variables — Opens and closes the variables and parameters creation window.
Project Area — Opens and closes the project area.
Output — Opens and closes the output window.
View Enabled Descriptions — Enables the system level setting for viewing object descriptions in the workspace.
Validate Current View — Validates the object definition open in the workspace. Other objects included in the definition are also validated.
Validate All Objects in View — Validates the object definition open in the workspace. Objects included in the definition are also validated.
Audit Objects in Data Flow — Opens the Audit window to define audit labels and rules for the data flow.
View Where Used — Opens the Output window, which lists parent objects (such as jobs) of the object currently open in the workspace (such as a data flow). Use this command to find other jobs that use the same data flow, before you decide to make design changes. To see if an object in a data flow is reused elsewhere, right-click one and select View Where Used.
Go Back — Move back in the list of active workspace windows.
Go Forward — Move forward in the list of active workspace windows.

Data Integrator Management Console — Opens and closes the Management Console window.
About — Opens the Data Integrator About box, with product component version numbers and a link to the Business Objects Web site.

Use the tools to the right of the About tool with the interactive debugger. See "Menu options and tool bar" on page 433.

Project area

The project area provides a hierarchical view of the objects used in each project. Tabs on the bottom of the project area support different tasks. Tabs include:
• Create, view and manage projects. Provides a hierarchical view of all objects used in each project.
• View the status of currently executing jobs. Selecting a specific job execution displays its status, including which steps are complete and which steps are executing. These tasks can also be done using the Data Integrator Administrator.
• View the history of complete jobs. Logs can also be viewed with the Data Integrator Administrator.

To control project area location, right-click its gray border and select/deselect Allow Docking, or select Hide from the menu.
• When you select Allow Docking, you can click and drag the project area to dock at and undock from any edge within the Designer window. When you drag the project area away from a Designer window edge, it stays undocked. To quickly switch between your last docked and undocked locations, just double-click the gray border.
• When you deselect Allow Docking, you can click and drag the project area to any location on your screen and it will not dock inside the Designer window.
• When you select Hide, the project area disappears from the Designer window. To unhide the project area, click its toolbar icon.

Here's an example of the Project window's Designer tab, which shows the project hierarchy: a project contains a job, which contains a work flow, which contains a data flow.

As you drill down into objects in the Designer workspace, the window highlights your location within the project hierarchy.

Tool palette

The tool palette is a separate window that appears by default on the right edge of the Designer workspace. You can move the tool palette anywhere on your screen or dock it on any edge of the Designer window.

The icons in the tool palette allow you to create new objects in the workspace. The icons are disabled when they are not allowed to be added to the diagram open in the workspace. To show the name of each icon, hold the cursor over the icon until the tool tip for the icon appears.

When you create an object from the tool palette, you are creating a new definition of an object. If a new object is reusable, it will be automatically available in the object library after you create it. For example, if you select the data flow icon from the tool palette and define a new data flow, later you can drag that existing data flow from the object library, adding a call to the existing definition.

The tool palette contains the following icons:

Pointer — Returns the tool pointer to a selection pointer for selecting and moving objects in a diagram. Available everywhere.
Work flow — Creates a new work flow. (reusable) Available in jobs and work flows.
Data flow — Creates a new data flow. (reusable) Available in jobs and work flows.
R/3 data flow — Used only with the SAP licensed extension.
Query transform — Creates a template for a query. Use it to define column mappings and row selections. (single-use) Available in data flows.
Template table — Creates a table for a target. (single-use) Available in data flows.
Template XML — Creates an XML template. (single-use) Available in data flows.
Data transport — Used only with the SAP licensed extension.
Script — Creates a new script object. (single-use) Available in jobs and work flows.
Conditional — Creates a new conditional object. (single-use) Available in jobs and work flows.
Try — Creates a new try object. (single-use) Available in jobs and work flows.
Catch — Creates a new catch object. (single-use) Available in jobs and work flows.
Annotation — Creates an annotation. (single-use) Available in jobs, work flows, and data flows.

Workspace

When you open or select a job or any flow within a job hierarchy, the workspace becomes "active" with your selection. The workspace provides a place to manipulate system objects and graphically assemble data movement processes. These processes are represented by icons that you drag and drop into a workspace to create a workspace diagram. This diagram is a visual representation of an entire data movement application or some part of a data movement application.

This section describes major workspace area tasks, such as:
• Moving objects in the workspace area
• Connecting and disconnecting objects
• Describing objects
• Scaling the workspace
• Arranging workspace windows
• Closing workspace windows

Moving objects in the workspace area

Use standard mouse commands to move objects in the workspace.

To move an object to a different place in the workspace area
1. Click to select the object.
2. Drag the object to where you want to place it in the workspace.


Connecting and disconnecting objects
You specify the flow of data through jobs and work flows by connecting objects in the workspace from left to right in the order you want the data to be moved:

To connect objects
1. Place the objects you want to connect in the workspace.
2. Click and drag from the triangle on the right edge of an object to the triangle on the left edge of the next object in the flow.

To disconnect objects
1. Click the connecting line.
2. Press the Delete key.

Describing objects
You can use descriptions to add comments about objects. You can use annotations to explain a job, work flow, or data flow. You can view object descriptions and annotations in the workspace. Together, descriptions and annotations allow you to document a Data Integrator application. For example, you can describe the incremental behavior of individual jobs with numerous annotations and label each object with a basic description.


For more information, see “Creating descriptions” on page 57 and “Creating annotations” on page 59.

Scaling the workspace
You can control the scale of the workspace. By scaling the workspace, you can change the focus of a job, work flow, or data flow. For example, you might want to increase the scale to examine a particular part of a work flow, or you might want to reduce the scale so that you can examine the entire work flow without scrolling.

To change the scale of the workspace
1. In the drop-down list on the tool bar, select a predefined scale or enter a custom value.
2. Alternatively, right-click in the workspace and select a desired scale.


Note: You can also select Scale to Fit and Scale to Whole:

• Select Scale to Fit and the Designer calculates the scale that fits the entire project in the current view area.
• Select Scale to Whole to show the entire workspace area in the current view area.

Arranging workspace windows
The Window menu allows you to arrange multiple open workspace windows in the following ways: cascade, tile horizontally, or tile vertically.

Closing workspace windows
When you drill into an object in the project area or workspace, a view of the object’s definition opens in the workspace area. The view is marked by a tab at the bottom of the workspace area, and as you open more objects in the workspace, more tabs appear. (You can show/hide these tabs from the Tools > Options menu. Go to Designer > General options and select/deselect Show tabs in workspace. For more information, see the “General and environment options” section.) Note: These views use system resources. If you have a large number of open views, you might notice a decline in performance. Close the views individually by clicking the close box in the top right corner of the workspace. Close all open views by selecting Window > Close All Windows or clicking the Close All Windows icon on the toolbar.

Local object library
The local object library provides access to reusable objects. These objects include built-in system objects, such as transforms, and the objects you build and save, such as datastores, jobs, data flows, and work flows.


The local object library is a window into your local Data Integrator repository and eliminates the need to access the repository directly. Updates to the repository occur through normal Data Integrator operation. Saving the objects you create adds them to the repository. Access saved objects through the local object library. To learn more about local as well as central repositories, see the Data Integrator Advanced Development and Migration Guide. To control object library location, right-click its gray border and select/deselect Allow Docking, or select Hide from the menu.

When you select Allow Docking, you can click and drag the object library to dock at and undock from any edge within the Designer window. When you drag the object library away from a Designer window edge, it stays undocked. To quickly switch between your last docked and undocked locations, just double-click the gray border. When you deselect Allow Docking, you can click and drag the object library to any location on your screen and it will not dock inside the Designer window.

When you select Hide, the object library disappears from the Designer window. To unhide the object library, click its toolbar icon.

To open the object library
Choose Tools > Object Library, or click the object library icon in the icon bar.

(The object library window shows the transform object list, with tabs for the other object types.)


The object library gives you access to the object types listed below. The tab on which each object type appears in the object library is shown, along with the Data Integrator context in which you can use each type of object.

Projects — Projects are sets of jobs available at a given time.
Jobs — Jobs are executable work flows. There are two job types: batch jobs and real-time jobs.
Work flows — Work flows order data flows and the operations that support data flows, defining the interdependencies between them.
Data flows — Data flows describe how to process a task.
Transforms — Transforms operate on data, producing output data sets from the sources you specify. The object library lists both built-in and custom transforms.
Datastores — Datastores represent connections to databases and applications used in your project. Under each datastore is a list of the tables, documents, and functions imported into Data Integrator.
Formats — Formats describe the structure of a flat file, XML file, or XML message.
Custom Functions — Custom Functions are functions written in the Data Integrator Scripting Language. You can use them in Data Integrator jobs (a minimal example appears after this list).

To display the name of each tab as well as its icon, do one of the following:
• Make the object library window wider until the names appear.
• Hold the cursor over the tab until the tool tip for the tab appears, as shown.
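As an illustration of the Custom Functions entry above, the body of a custom function is written in the Data Integrator Scripting Language in the function editor. The following is only a minimal sketch; the function name and the parameters $FirstName and $LastName are hypothetical and would be declared when you create the function:

    # CF_FullName: join two name parts with a single space between them.
    Return ($FirstName || ' ' || $LastName);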


To sort columns in the object library
Click the column heading. For example, you can sort data flows by clicking the Data Flow column heading once. Names are listed in ascending order. To list names in descending order, click the Data Flow column heading again.

Object editors
To work with the options for an object, in the workspace click the name of the object to open its editor. The editor displays the input and output schemas for the object and a panel below them listing options set for the object. If there are many options, they are grouped in tabs in the editor. A schema is a data structure that can contain columns, other nested schemas, and functions (the contents are called schema elements). A table is a schema containing only columns. A common example of an editor is the editor for the query transform, as shown in the following illustration:

(The illustration shows the query editor with the input schema and output schema at the top, parameter tabs below them, and tabs of open windows at the bottom of the workspace.)

For specific information about the query editor, see “Query editor” on page 189.
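For instance, a column in the query's output schema is typically defined by an expression entered on the Mapping tab of this editor. A hypothetical mapping that builds a display name from two input columns might look like the following; the table and column names are invented for illustration:

    upper(CUSTOMER.LAST_NAME) || ', ' || CUSTOMER.FIRST_NAME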


In an editor, you can:

• Undo or redo previous actions performed in the window (right-click and choose Undo or Redo)
• Find a string in the editor (right-click and choose Find)
• Drag-and-drop column names from the input schema into relevant option boxes
• Use colors to identify strings and comments in text boxes where you can edit expressions (keywords appear blue; strings are enclosed in quotes and appear pink; comments begin with a pound sign and appear green)

Note: You cannot add comments to a mapping clause in a Query transform. For example, the following syntax is not supported on the Mapping tab:
table.column # comment

The job will not run and you cannot successfully export it. Use the object description or workspace annotation feature instead.
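By contrast, pound-sign comments are accepted where you edit complete scripts or expressions, for example in a script object. The following fragment is only a sketch with an invented variable name; it also shows the string and comment styles described above:

    # Record the start of the load. Comments like this are valid in a
    # script, but not on a Query transform's Mapping tab.
    $start_time = sysdate();
    print('Load started at [$start_time]');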

Working with objects
This section discusses common tasks you complete when working with objects in the Designer. With these tasks, you use various parts of the Designer—the toolbar, tool palette, workspace, and local object library. Tasks in this section include:

• Creating new reusable objects
• Changing object names
• Viewing and changing object properties
• Creating descriptions
• Creating annotations
• Saving and deleting objects
• Searching for objects

Creating new reusable objects
You can create reusable objects from the object library or by using the tool palette. After you create an object, you can work with the object, editing its definition and adding calls to other objects.

To create a reusable object (in the object library)
1. Open the object library by choosing Tools > Object Library.


2. Click the tab corresponding to the object type.
3. Right-click anywhere except on existing objects and choose New.
4. Right-click the new object and select Properties. Enter options such as name and description to define the object.

To create a reusable object (using the tool palette)
1. In the tool palette, left-click the icon for the object you want to create.
2. Move the cursor to the workspace and left-click again.
The object icon appears in the workspace where you have clicked.

To open an object's definition
You can open an object's definition in one of two ways:
• From the workspace, click the object name. Data Integrator opens a blank workspace in which you define the object.
• From the project area, click the object.

You define an object using other objects. For example, if you click the name of a batch data flow, a new workspace opens for you to assemble sources, targets, and transforms that make up the actual flow.

To add an existing object (create a new call to an existing object)
1. Open the object library by choosing Tools > Object Library.
2. Click the tab corresponding to any object type.
3. Select an object.


4. Drag the object to the workspace.

Note: Objects dragged into the workspace must obey the hierarchy logic explained in “Object hierarchy” on page 31. For example, you can drag a data flow into a job, but you cannot drag a work flow into a data flow.

Changing object names
You can change the name of an object from the workspace or the object library. You can also create a copy of an existing object.
Note: You cannot change the names of built-in objects.

To change the name of an object in the workspace
1. Click to select the object in the workspace.
2. Right-click and choose Edit Name.
3. Edit the text in the name text box. Click outside the text box or press Enter to save the new name.

To change the name of an object in the object library
1. Select the object in the object library.
2. Right-click and choose Properties.
3. Edit the text in the first text box.
4. Click OK.

To copy an object
1. Select the object in the object library.
2. Right-click and choose Replicate.
Data Integrator makes a copy of the top-level object (but not of objects that it calls) and gives it a new name, which you can edit.

Viewing and changing object properties
You can view (and, in some cases, change) an object's properties through its property page.

To view, change, and add object properties
1. Select the object in the object library.
2. Right-click and choose Properties.
The General tab of the Properties window opens.


3. Complete the property sheets. The property sheets vary by object type, but General, Attributes and Class Attributes are the most common and are described in the following sections.
4. When finished, click OK to save changes you made to the object properties and to close the window. Alternatively, click Apply to save changes without closing the window.

General tab
The General tab contains two main object properties: name and description.

From the General tab, you can change the object name as well as enter or edit the object description. You can add object descriptions to single-use objects as well as to reusable objects. Note that you can toggle object descriptions on and off by right-clicking any object in the workspace and selecting/deselecting View Enabled Descriptions.

Depending on the object, other properties may appear on the General tab. Examples include:
• Execute only once — See "Creating and defining data flows" in Chapter 7: Data Flows for more information.
• Recover as a unit — See "Marking recovery units" in Chapter 17: Recovery Mechanisms for more information about this work flow property.


• Degree of parallelism — See the Data Integrator Performance Optimization Guide for more information about this advanced feature.
• Use database links — See the Data Integrator Performance Optimization Guide for more information about this advanced feature.
• Cache type — See the Data Integrator Performance Optimization Guide for more information about this advanced feature.

Attributes tab

The Attributes tab allows you to assign values to the attributes of the current object.

To assign a value to an attribute, select the attribute and enter the value in the Value box at the bottom of the window. Some attribute values are set by Data Integrator and cannot be edited. When you select an attribute with a system-defined value, the Value field is unavailable.


Class Attributes tab

The Class Attributes tab shows the attributes available for the type of object selected. For example, all data flow objects have the same class attributes.

To create a new attribute for a class of objects, right-click in the attribute list and select Add. The new attribute is now available for all of the objects of this class.

To delete an attribute, select it then right-click and choose Delete. You cannot delete the class attributes predefined by Data Integrator.

Creating descriptions

Use descriptions to document objects. You can see descriptions on workspace diagrams. Therefore, descriptions are a convenient way to add comments to workspace objects.

A description is associated with a particular object. When you import or export that repository object (for example, when migrating between development, test, and production environments), you also import or export its description.

The Designer determines when to show object descriptions based on a system-level setting and an object-level setting. Both settings must be activated to view the description for a particular object.

The system-level setting is unique to your setup. The system-level setting is disabled by default. To activate that system-level setting, select View > Enabled Descriptions, or click the View Enabled Descriptions button on the toolbar.

The object-level setting is saved with the object in the repository. The object-level setting is also disabled by default unless you add or edit a description from the workspace. To activate the object-level setting, right-click the object and select Enable object description.

To add a description to an object
1. In the project area or object library, right-click an object and select Properties.
2. Enter your comments in the Description text box.
3. Click OK.
The description for the object displays in the object library.

To display a description in the workspace
1. In the project area, select an existing object (such as a job) that contains an object to which you have added a description (such as a work flow).
2. From the View menu, select Enabled Descriptions. Alternately, you can select the View Enabled Descriptions button on the toolbar.
3. Right-click the work flow and select Enable Object Description.
The description displays in the workspace under the object.

To add a description to an object from the workspace
1. From the View menu, select Enabled Descriptions.
2. In the workspace, right-click an object and select Properties.
3. In the Properties window, enter text in the Description box.
4. Click OK.
The description displays automatically in the workspace (and the object's Enable Object Description option is selected).

An ellipsis after the text in a description indicates that there is more text. To see all the text, resize the description by clicking and dragging it. When you move an object, its description moves as well. To see which object is associated with which selected description, view the object's name in the status bar.

To hide a particular object's description
1. In the workspace diagram, right-click an object. Alternately, you can select multiple objects by:
• Pressing and holding the Control key while selecting objects in the workspace diagram, then right-clicking one of the selected objects.

• Dragging a selection box around all the objects you want to select, then right-clicking one of the selected objects.
2. In the pop-up menu, deselect Enable Object Description.
The description for the object selected is hidden, even if the View Enabled Descriptions option is checked, because the object-level switch overrides the system-level switch.

To edit object descriptions
1. In the workspace, double-click an object description.
2. Enter, cut, copy, or paste text into the description.
3. In the Project menu, select Save.
Alternately, you can right-click any object and select Properties to open the object's Properties window and add or edit its description.
Note: If you attempt to edit the description of a reusable object, Data Integrator alerts you that the description will be updated for every occurrence of the object, across all jobs. You can select the Do not show me this again check box to avoid this alert. However, after deactivating the alert, you can only reactivate the alert by calling Technical Support.

Creating annotations

Annotations describe a flow, part of a flow, or a diagram in a workspace. An annotation is associated with the job, work flow, or data flow where it appears. When you import or export that job, work flow, or data flow, you import or export associated annotations.
You can use annotations to describe any workspace such as a job, work flow, data flow, catch, conditional, or while loop.

To annotate a workspace diagram
1. Open the workspace diagram you want to annotate.
2. In the tool palette, click the annotation icon.
3. Click a location in the workspace to place the annotation.
An annotation appears on the diagram.
You can add, edit, and delete text directly on the annotation. In addition, you can resize and move the annotation by clicking and dragging. You can add any number of annotations to a diagram.

To delete an annotation
1. Right-click an annotation.
2. Select Delete.
Alternately, you can select an annotation and press the Delete key.

You cannot hide annotations that you have added to the workspace. However, you can move them out of the way or delete them.

Saving and deleting objects

"Saving" an object in Data Integrator means storing the language that describes the object to the repository. You can save reusable objects; single-use objects are saved only as part of the definition of the reusable object that calls them.

You can choose to save changes to the reusable object currently open in the workspace. When you save the object, the object properties, the definitions of any single-use objects it calls, and any calls to other reusable objects are recorded in the repository. The content of the included reusable objects is not saved; only the call is saved.

Data Integrator stores the description even if the object is not complete or contains an error (does not validate).

To save changes to a single reusable object
1. Open the project in which your object is included.
2. Choose Project > Save.
This command saves all objects open in the workspace. Repeat these steps for other individual objects you want to save.

To save all changed objects in the repository
1. Choose Project > Save All.
Data Integrator lists the reusable objects that were changed since the last save operation.
2. (optional) Deselect any listed object to avoid saving it.
3. Click OK.
Note: Data Integrator also prompts you to save all objects that have changes when you execute a job and when you exit the Designer.
Saving a reusable object saves any single-use object included in it.

To delete an object definition from the repository
1. In the object library, select the object.
2. Right-click and choose Delete.
• If you attempt to delete an object that is being used, Data Integrator provides a warning message and the option of using the View Where Used feature. For more information, see "Using View Where Used" on page 398.
• If you select Yes, Data Integrator marks all calls to the object with a red "deleted" icon to indicate that the calls are invalid. You must remove or replace these calls to produce an executable job.
Note: Built-in objects such as transforms cannot be deleted from the object library.

To delete an object call
1. Open the object that contains the call you want to delete.
2. Right-click the object call and choose Delete.
If you delete a reusable object from the workspace or from the project area, only the object call is deleted. The object definition remains in the object library.

Searching for objects

From within the object library, you can search for objects defined in the repository or objects available through a datastore.

To search for an object
1. Right-click in the object library and choose Search.
Data Integrator displays the Search window.
2. Enter the appropriate values for the search.
Options available in the Search window are described in detail following this procedure.
3. Click Search.
The objects matching your entries are listed in the window. From the search results window you can use the context menu to:
• Open an item
• View the attributes (Properties)
• Import external tables as repository metadata

You can also drag objects from the search results window and drop them in the desired location.

The Basic tab in the Search window provides you with the following options:

Name — The object name to find. If you are searching in the repository, the name is not case sensitive. If you are searching in a datastore and the name is case sensitive in that datastore, enter the name as it appears in the database or application and use double quotation marks (") around the name to preserve the case. You can designate whether the information to be located Contains the specified name or Equals the specified name using the drop-down box next to the Name field.

Description — The object description to find. Objects imported into the Data Integrator repository have a description from their source. By default, objects you create in the Designer have no description unless you add one. The search returns objects whose description attribute contains the value entered.

Type — The type of object to find. When searching the repository, choose from Tables, Files, Data flows, Work flows, Jobs, Hierarchies, IDOCs, and Domains. When searching a datastore or application, choose from object types available through that datastore.

Look in — Where to search. Choose from the repository or a specific datastore. When you designate a datastore, you can also choose to search the imported data (Internal Data) or the entire datastore (External Data).

The Search window also includes an Advanced tab. From the Advanced tab, you can choose to search for objects based on their Data Integrator attribute values. You can search by attribute values only when searching in the repository.

The Advanced tab provides the following options:

Attribute — The object attribute in which to search. The attributes are listed for the object type specified on the Basic tab.
Value — The attribute value to find.
Match — The type of search performed. Select Contains to search for any attribute that contains the value specified. Select Equals to search for any attribute that contains only the value specified.

General and environment options
To open the Options window, select Tools > Options. The window displays option groups for Designer, Data, and Job Server options. Expand the options by clicking the plus icon. As you select each option group or option, a description appears on the right.
The standard options include:
• Designer — Environment
• Designer — General
• Designer — Graphics
• Designer — Central Repository Connections
• Data — General
• Job Server — Environment
• Job Server — General
SAP options appear if you install these licensed extensions. See the Data Integrator Supplement for SAP for more information about these options.

Designer — Environment
Default Administrator for Metadata Reporting:
Administrator — Select the Administrator that the metadata reporting tool uses. An Administrator is defined by host name and port.
Default Job Server: If a repository is associated with several Job Servers, one Job Server must be defined as the default Job Server to use at login.
New — Allows you to specify a new value for the default Job Server from a drop-down list of Job Servers associated with this repository.
Current — Displays the current value of the default Job Server.
Note: Job-specific options and path names specified in Designer refer to the current default Job Server. If you change the default Job Server, modify these options and path names. Changes are effective immediately.

Designer Communication Ports:
Allow Designer to set the port for Job Server communication — If checked, Designer automatically sets an available port to receive messages from the current Job Server. The default is checked. Uncheck to specify a listening port or port range.
Specify port range — Only activated when you deselect the previous control. Allows you to specify a range of ports from which the Designer can choose a listening port. You may choose to constrain the port used for communication between Designer and Job Server when the two components are separated by a firewall. Enter port numbers in the From port and To port text boxes. To specify a specific listening port, enter the same port number in both the From port and To port text boxes. Changes will not take effect until you restart Data Integrator.
Interactive Debugger — Allows you to set a communication port for the Designer to communicate with a Job Server while running in Debug mode. For more information, see "Changing the interactive debugger port" on page 423.
Server group for local repository — If the local repository that you logged in to when you opened the Designer is associated with a server group, the name of the server group appears.

Designer — General
View data sampling size (rows) — Controls the sample size used to display the data in sources and targets in open data flows in the workspace. View data by clicking the magnifying glass icon on source and target objects. For more information, see "Using View Data" on page 404.
Number of characters in workspace icon name — Controls the length of the object names displayed in the workspace. Object names are allowed to exceed this number, but the Designer only displays the number entered here. The default is 17 characters.
Maximum schema tree elements to auto expand — The number of elements displayed in the schema tree. Element names are not allowed to exceed this number. Enter a number for the Input schema and the Output schema. The default is 100.
Default parameters to variables of the same name — When you declare a variable at the work-flow level, Data Integrator automatically passes the value as a parameter with the same name to a data flow called by a work flow.

Automatically import domains — Select this check box to automatically import domains when importing a table that references a domain.
Perform complete validation before job execution — If checked, Data Integrator performs a complete job validation before running a job. The default is unchecked. If you keep this default setting, you should validate your design manually before job execution.
Open monitor on job execution — Affects the behavior of the Designer when you execute a job. With this option enabled, the Designer switches the workspace to the monitor view during job execution; otherwise, the workspace remains as is. The default is on.
Calculate column mapping while saving data flow — Calculates information about target tables and columns and the sources used to populate them. Data Integrator automatically stores this information in the AL_COLMAP table (ALVW_MAPPING view) when you save a data flow. If you select this option, be sure to validate your entire job before saving it, because this functionality is highly sensitive to errors and will skip data flows with validation problems. You can see this information when you generate metadata reports. For more information, see "Tools" on page 467.
Show dialog when job is completed — Allows you to choose if you want to see an alert or just read the trace messages.
Show tabs in workspace — Allows you to decide if you want to use the tabs at the bottom of the workspace to navigate.

Designer — Graphics
Choose and preview stylistic elements to customize your workspaces. Using these options, you can easily distinguish your job/work flow design workspace from your data flow design workspace.
• Workspace flow type — Switch between the two workspace flow types (Job/Work Flow and Data Flow) to view default settings. Modify settings for each type using the remaining options.
• Line Type — Choose a style for object connector lines.
• Line Thickness — Set the connector line thickness.
• Background style — Choose a plain or tiled background pattern for the selected flow type.
• Color scheme — Set the background color to blue, gray, or white.
• Use navigation watermark — Add a watermark graphic to the background of the flow type selected. Note that this option is only available with a plain background style.

Designer — Central Repository Connections
Displays the central repository connections and the active central repository. To activate a central repository, right-click one of the central repository connections listed and select Activate.
Reactivate automatically — Select if you want the active central repository to be reactivated whenever you log in to Data Integrator using the current local repository.

Data — General
Century Change Year — Indicates how Data Integrator interprets the century for two-digit years. Two-digit years greater than or equal to this value are interpreted as 19##. Two-digit years less than this value are interpreted as 20##. The default value is 15. For example, if the Century Change Year is set to 15:

Two-digit year    Interpreted as
99                1999
16                1916
15                1915
14                2014

Convert blanks to nulls for Oracle bulk loader — Converts blanks to NULL values when loading data using the Oracle bulk loader utility and:
• the column is not part of the primary key
• the column is nullable

Job Server — Environment
Maximum number of engine processes — Sets a limit on the number of engine processes that this Job Server can have running concurrently.

Job Server — General
Use this window to reset Job Server options (see "Changing Job Server options" on page 329) or with guidance from Business Objects Customer Support. For contact information, visit http://www.businessobjects.com/support/.


Chapter 4: Projects and Jobs

About this chapter
Project and job objects represent the top two levels of organization for the application flows you create using the Designer. This chapter contains the following topics:
• Projects
• Jobs

Projects
A project is a reusable object that allows you to group jobs. A project is the highest level of organization offered by Data Integrator. Opening a project makes one group of objects easily accessible in the user interface. You can use a project to group jobs that have schedules that depend on one another or that you want to monitor together.
Projects have common characteristics:
• Projects are listed in the object library.
• Only one project can be open at a time.
• Projects cannot be shared among multiple users.

Objects that make up a project
The objects in a project appear hierarchically in the project area. If a plus sign (+) appears next to an object, expand it to view the lower-level objects contained in the object. Data Integrator shows you the contents as both names in the project area hierarchy and icons in the workspace. In the following example, the Job_KeyGen job contains two data flows, and the DF_EmpMap data flow contains multiple objects.

Each item selected in the project area also displays in the workspace.

Creating new projects
To create a new project
1. Choose Project > New > Project.
2. Enter the name of your new project.
The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
3. Click Create.
The new project appears in the project area. As you add jobs and other lower-level objects to the project, they also appear in the project area.

Opening existing projects
To open an existing project
1. Choose Project > Open.
2. Select the name of an existing project from the list.
3. Click Open.
Note: If another project was already open, Data Integrator closes that project and opens the new one.

Saving projects
To save all changes to a project
1. Choose Project > Save All.
Data Integrator lists the jobs, work flows, and data flows that you edited since the last save.
2. (optional) Deselect any listed object to avoid saving it.
3. Click OK.
Note: Data Integrator also prompts you to save all objects that have changes when you execute a job and when you exit the Designer. Saving a reusable object saves any single-use object included in it.

Jobs
A job is the only object you can execute. You can manually execute and test jobs in development. In production, you can schedule batch jobs and set up real-time jobs as services that execute a process when Data Integrator receives a message request.
A job is made up of steps you want executed together. Each step is represented by an object icon that you place in the workspace to create a job diagram. A job diagram is made up of two or more objects connected together. You can include any of the following objects in a job definition:
• Data flows
   • Sources
   • Targets
   • Transforms
• Work flows
   • Scripts
   • Conditionals
   • While Loops
   • Try/catch blocks
If a job becomes complex, organize its content into individual work flows, then create a single job that calls those work flows. For more information on work flows, see Chapter 8: Work Flows.

Real-time jobs use the same components as batch jobs. You can add work flows and data flows to both batch and real-time jobs. When you drag a work flow or data flow icon into a job, you are telling Data Integrator to validate these objects according to the requirements of the job type (either batch or real-time). There are some restrictions regarding the use of some Data Integrator features with real-time jobs. For more information, see Chapter 10: Real-time jobs.

Creating jobs
To create a job in the project area
1. In the project area, select the project name.
2. Right-click and choose New Batch Job or Real Time Job.
3. Edit the name.
The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
Data Integrator opens a new workspace for you to define the job.

To create a job in the object library
1. Go to the Jobs tab.
2. Right-click Batch Jobs or Real Time Jobs and choose New.

3. A new job with a default name appears.
4. Right-click and select Properties to change the object's name and add a description.
The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
5. To add the job to the open project, drag it into the project area.

Naming conventions for objects in jobs
We recommend that you follow consistent naming conventions to facilitate object identification across all systems in your enterprise. This allows you to more easily work with metadata across all applications such as:
• Data-modeling applications
• ETL applications
• Reporting applications
• Adapter software development kits

Examples of conventions recommended for use with jobs and other objects are shown in the following table:

Prefix      Suffix        Object                    Example
DF_                       Data flow                 DF_Currency
EDF_        _Input        Embedded data flow        EDF_Example_Input
EDF_        _Output       Embedded data flow        EDF_Example_Output
RTJob_                    Real-time job             RTJob_OrderStatus
WF_                       Work flow                 WF_SalesOrg
JOB_                      Job                       JOB_SalesOrg
            _DS           Datastore                 ORA_DS
DC_                       Datastore configuration   DC_DB2_production
SC_                       System configuration      SC_ORA_test
            _Memory_DS    Memory datastore          Catalog_Memory_DS
PROC_                     Stored procedure          PROC_SalesStatus

Although Data Integrator Designer is a graphical user interface with icons representing objects in its windows, other interfaces might require you to identify object types by the text alone. By using a prefix or suffix, you can more easily identify your object's type.
In addition to prefixes and suffixes, you might want to provide standardized names for objects that identify a specific action across all object types. For example: DF_OrderStatus, RTJob_OrderStatus.
In addition to prefixes and suffixes, naming conventions can also include path name identifiers. For example, the stored procedure naming convention can look like either of the following:
<datastore>.<owner>.<PROC_Name>
<datastore>.<owner>.<package>.<PROC_Name>


Chapter 5: Datastores

About this chapter
This chapter contains the following topics:
• What are datastores?
• Database datastores
• Adapter datastores
• Creating and managing multiple datastore configurations

What are datastores?
Datastores represent connection configurations between Data Integrator and databases or applications. These configurations can be direct or through adapters. Datastore configurations allow Data Integrator to access metadata from a database or application and read from or write to that database or application while Data Integrator executes a job.
Data Integrator datastores can connect to:
• Databases and mainframe file systems. See "Database datastores" on page 81.
• Applications that have pre-packaged or user-written Data Integrator adapters. See "Adapter datastores" on page 111.
• SAP R/3 and SAP BW, PeopleSoft, J.D. Edwards One World and J.D. Edwards World, Oracle Applications, and Siebel Applications. See the appropriate Data Integrator Supplement.
Note: Data Integrator reads and writes data stored in flat files through flat file formats as described in Chapter 6: File Formats. Data Integrator reads and writes data stored in XML documents through DTDs and XML Schemas. See "Formatting XML documents" on page 219.
The specific information that a datastore object can access depends on the connection configuration. When your database or application changes, make corresponding changes in the datastore information in Data Integrator; Data Integrator does not automatically detect the new information.
Note: Objects deleted from a datastore connection are identified in the project area and workspace by a red "deleted" icon. This visual flag allows you to find and update data flows affected by datastore changes.
You can create multiple configurations for a datastore. This allows you to plan ahead for the different environments your datastore may be used in and limits the work involved with migrating jobs. For example, you can add a set of configurations (DEV, TEST, and PROD) to the same datastore name.

These connection settings stay with the datastore during export or import. Group any set of datastore configurations into a system configuration. When running or scheduling a job, select a system configuration, and thus, the set of datastore configurations for your current environment. For more information, see "Creating and managing multiple datastore configurations" on page 115.

Database datastores
Database datastores can represent single or multiple Data Integrator connections with:
• Legacy systems using Attunity Connect
• IBM DB2, Microsoft SQL Server, MySQL, Netezza, Oracle, Sybase ASE, Sybase IQ, and Teradata databases (using native connections)
• Business Objects Data Federator
• Other databases (through ODBC)
• A Data Integrator repository, using a memory datastore or persistent cache datastore
This section discusses:
• Mainframe interface
• Defining a database datastore
• Browsing metadata through a database datastore
• Importing metadata through a database datastore
• Memory datastores
• Persistent cache datastores
• Linked datastores

Mainframe interface
Data Integrator provides the Attunity Connector datastore that accesses mainframe data sources through Attunity Connect. The data sources that Attunity Connect accesses are in the following list. For a complete list of sources, refer to the Attunity documentation.
• Adabas
• DB2 UDB for OS/390 and DB2 UDB for OS/400
• IMS/DB
• VSAM

• Flat files on OS/390 and flat files on OS/400

Attunity Connector accesses mainframe data using software that you must manually install on the mainframe server and the local client (Job Server) computer. Data Integrator connects to Attunity Connector using its ODBC interface. It is not necessary to purchase a separate ODBC driver manager for UNIX and Windows platforms.

Servers
Install and configure the Attunity Connect product on the server (for example, a zSeries computer).

Clients
To access mainframe data using Attunity Connector, install the Attunity Connect product. The ODBC driver is required. Attunity also offers an optional tool called Attunity Studio, which you can use for configuration and administration. Configure ODBC data sources on the client (Data Integrator Job Server).
When you install a Data Integrator Job Server on UNIX, the installer will prompt you to provide an installation directory path for Attunity connector software. In addition, you do not need to install a driver manager, because Data Integrator loads ODBC drivers directly on UNIX platforms.

Configuring an Attunity datastore
To use the Attunity Connector datastore option, upgrade your repository to Data Integrator version 6.5.1 or higher.

To create an Attunity Connector datastore
1. In the Datastores tab of the object library, right-click and select New.
2. Enter a name for the datastore.
3. In the Datastore type box, select Database.
4. In the Database type box, select Attunity Connector.
5. Finish entering values in the remainder of the dialog.

To create an Attunity Connector datastore, you must know the Attunity data source name, the location of the Attunity daemon, the Attunity daemon port number, and a unique Attunity server workspace name.
6. If you want to change any of the default options (such as Rows per Commit or Language), click the Advanced button. For general information about these options, see "Defining a database datastore" on page 85.
7. Click OK.
You can now use the new datastore connection to import metadata tables into the current Data Integrator repository.

Since a single Attunity Connector datastore can access multiple software systems that do not share the same namespace, the name of the Attunity data source must be specified when referring to a table. Data Integrator's format for accessing Attunity tables is unique to Data Integrator. The format is as follows:
AttunityDataSource:OwnerName.TableName
When using the Designer to create your jobs with imported Attunity tables, Data Integrator automatically generates the correct SQL for this format. However, when you author SQL, be sure to use this format. You can author SQL in the following constructs:
• SQL function
• SQL transform
• Pushdown_sql function
• Pre-load commands in table loader
• Post-load commands in table loader
For information about how to specify multiple data sources in one Attunity datastore, see "Specifying multiple data sources in one Attunity datastore" on page 84.
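The following script fragment illustrates the table-name format when you author SQL yourself, here through the sql() function. It is only a sketch: the datastore name (Attunity_DS), Attunity data source (DSN4), owner (OWNER1), table, column, and variable names are all hypothetical.

   # Hypothetical example: the Attunity data source and owner precede the table name
   $LastOrderDate = sql('Attunity_DS', 'SELECT MAX(ORDER_DATE) FROM DSN4:OWNER1.ORDERS');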

Specifying multiple data sources in one Attunity datastore
You can use the Attunity Connector datastore to access multiple Attunity data sources on the same Attunity Daemon location. If you have several types of data on the same computer, for example a DB2 database and VSAM, you might want to access both types of data using a single connection. For example, you can use a single connection to join tables (and push the join operation down to a remote server), which reduces the amount of data transmitted through your network (see the sketch following the requirements below).
To specify multiple sources in the Datastore Editor, separate data source names with semicolons in the Attunity data source box using the following format:
AttunityDataSourceName;AttunityDataSourceName
For example, if you have a DB2 data source named DSN4 and a VSAM data source named Navdemo, enter the following values into the Data source box:
DSN4;Navdemo
If you list multiple data source names for one Attunity Connector datastore, ensure that you meet the following requirements:
• All Attunity data sources must be accessible by the same user name and password.
• All Attunity data sources must use the same workspace. When you set up access to the data sources in Attunity Studio, use the same workspace name for each data source.
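As noted above, listing several data sources in one Attunity Connector datastore lets a single SQL statement join tables across them so that the join can be processed on the remote server. The following script fragment is a sketch only; it reuses the DSN4 and Navdemo data sources from the example above, and the datastore, owner, table, and column names are hypothetical.

   # Hypothetical example: join a DB2 table (DSN4) with a VSAM file (Navdemo) in one statement
   $CustName = sql('Attunity_DS', 'SELECT c.CUST_NAME FROM DSN4:OWNER1.ORDERS o, Navdemo:OWNER1.CUSTOMERS c WHERE o.CUST_ID = c.CUST_ID AND o.ORDER_ID = 1001');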

Requirements for an Attunity Connector datastore
Data Integrator requires the following for Attunity Connector datastores:
• For any table in Data Integrator, the maximum size of the owner name is 64 characters. In the case of Attunity tables, the maximum size of the Attunity data source name and actual owner name is 63 (the ":" accounts for 1 character). Data Integrator cannot access a table with an owner name larger than 64 characters.

Limitations
All Data Integrator features are available when you use an Attunity Connector datastore except the following:
• Bulk loading
• Imported functions (imports metadata for tables only)
• Template tables (creating tables)
• The datetime data type supports up to 2 sub-seconds only
• Data Integrator cannot load timestamp data into a timestamp column in a table because Attunity truncates varchar data to 8 characters, which is not enough to correctly represent a timestamp value.
When running a job on UNIX, the job could fail with the following error:
[D000] Cannot open file /usr1/attun/navroot/def/sys System error 13: The file access permissions do not allow the specified action. (OPEN)
This error occurs because of insufficient file permissions to some of the files in the Attunity installation directory. To avoid this error, change the file permissions for all files in the Attunity directory to 777 by executing the following command from the Attunity installation directory:
$ chmod -R 777 *

Defining a database datastore
Define at least one database datastore for each database or mainframe file system with which you are exchanging data. To define a datastore, get appropriate access privileges to the database or file system that the datastore describes. For example, to allow Data Integrator to use parameterized SQL when reading or writing to DB2 databases, authorize the user (of the datastore/database) to create, execute and drop stored procedures. If a user is not authorized to create, execute and drop stored procedures, jobs will still run. However, they will produce a warning message and will run less efficiently.

To define a Database datastore
1. In the Datastores tab of the object library, right-click and select New.
2. Enter the name of the new datastore in the Datastore Name field.
The name can contain any alphabetical or numeric characters or underscores (_). It cannot contain spaces.
3. Select the Datastore type.
Choose Database. When you select a Datastore Type, Data Integrator displays other options relevant to that type.

4. Select the Database type.
Choose from Attunity Connector, Data Federator, DB2, Memory, Microsoft SQL Server, MySQL, Netezza, ODBC, Oracle, Persistent Cache, Sybase ASE, Sybase IQ, or Teradata.
5. Enter the appropriate information for the selected database type.
6. The Enable automatic data transfer check box is selected by default when you create a new datastore and you chose Database for Datastore type.
This check box displays for all databases except Attunity Connector, Data Federator, Memory, and Persistent Cache. Keep Enable automatic data transfer selected to enable transfer tables in this datastore that the Data_Transfer transform can use to push down subsequent database operations. For more information, see the Data Integrator Performance Optimization Guide.
7. At this point, you can save the datastore or add more information to it:
• To save the datastore and close the Datastore Editor, click OK.

• To add more information, select Advanced.
To enter values for each configuration option, click the cells under each configuration name. See the Data Integrator Reference Guide for a description of the options in the grid for each database.

For the datastore as a whole, the following buttons are available:

Import unsupported data types as VARCHAR of size — The data types that Data Integrator supports are documented in the Reference Guide. If you want Data Integrator to convert a data type in your source that it would not normally support, select this option and enter the number of characters that you will allow.
Edit — Opens the Configurations for Datastore dialog. Use the tool bar on this window to add, configure, and manage multiple configurations for a datastore.
Show ATL — Opens a text window that displays how Data Integrator will code the selections you make for this datastore in its scripting language.
OK — Saves selections and closes the Datastore Editor (Create New Datastore) window.
Cancel — Cancels selections and closes the Datastore Editor window.
Apply — Saves selections.

For more information about creating multiple configurations for a single datastore, see "Creating and managing multiple datastore configurations" on page 115.

8. Click OK.
Note: On versions of Data Integrator prior to version 11.7.0, the correct database type to use when creating a datastore on Netezza was ODBC. Data Integrator 11.7.1 provides a specific Netezza option as the Database type instead of ODBC. When using Netezza as the database with Data Integrator, it is recommended that you choose Data Integrator's Netezza option as the Database type rather than ODBC.

See "Ways of importing metadata" on page 96 for the procedures you will use to import metadata from the connected database or application.

Changing a datastore definition
Like all Data Integrator objects, datastores are defined by both options and properties:
• Options control the operation of objects. For example, the name of the database to connect to is a datastore option.
• Properties document the object. For example, the name of the datastore and the date on which it was created are datastore properties. Properties are merely descriptive of the object and do not affect its operation.

To change datastore options
1. Go to the Datastores tab in the object library.
2. Right-click the datastore name and choose Edit.
The Datastore Editor appears in the workspace (the title bar for this dialog displays Edit Datastore). You can change the connection information for the current datastore configuration, click Advanced to change properties for the current configuration, or click Edit to add, edit, or delete additional configurations. Once you add a new configuration to an existing datastore, you can use the fields in the grid to change connection values and properties for the new configuration. See the Data Integrator Reference Guide for a detailed description of the options on the Configurations for Datastore dialog (opens when you select Edit in the Datastore Editor).
3. Click OK.
The options take effect immediately.

To change datastore properties
1. Go to the Datastores tab in the object library.
2. Right-click the datastore name and select Properties.
The Properties window opens.
3. Change the datastore properties.
The individual properties available for a datastore are described in the Data Integrator Reference Guide.
4. Click OK.

Browsing metadata through a database datastore
Data Integrator stores metadata information for all imported objects in a datastore. You can use Data Integrator to view metadata for imported or non-imported objects and to check whether the metadata has changed for objects already imported.

To view imported objects
1. Go to the Datastores tab in the object library.
2. Click the plus sign (+) next to the datastore name to view the object types in the datastore. For example, database datastores have functions, tables, and template tables.
3. Click the plus sign (+) next to an object type to view the objects of that type imported from the datastore. For example, click the plus sign (+) next to tables to view the imported tables.

To sort the list of objects
Click the column heading to sort the objects in each grouping and the groupings in each datastore alphabetically. Click again to sort in reverse-alphabetical order.

To view datastore metadata
1. Select the Datastores tab in the object library.
2. Choose a datastore, right-click, and select Open. (Alternatively, you can double-click the datastore icon.)
Data Integrator opens the datastore explorer in the workspace. The datastore explorer lists the tables in the datastore. You can view tables in the external database or tables in the internal repository. You can also search through them. For more information about the search feature, see "To import by searching" on page 99.
3. Select External metadata to view tables in the external database.
If you select one or more tables, you can right-click for further options:
Open (only available if you select one table) — Opens the editor for the table metadata.
Import — Imports (or re-imports) metadata from the database into the repository.
Reconcile — Checks for differences between metadata in the database and metadata in the repository.
4. Select Repository metadata to view imported tables.
If you select one or more tables, you can right-click for further options:
Open (only available if you select one table) — Opens the editor for the table metadata.
Reconcile — Checks for differences between metadata in the repository and metadata in the database.
Reimport — Reimports metadata from the database into the repository.
Delete — Deletes the table or tables from the repository.
Properties (only available if you select one table) — Shows the properties of the selected table.
View Data — Opens the View Data window which allows you to see the data currently in the table.

To browse the metadata for an external table
1. In the browser window showing the list of external tables, select the table you want to view.
2. Right-click and choose Open.
A table editor appears in the workspace and displays the schema and attributes of the table.

To determine if a schema has changed since it was imported
1. In the browser window showing the list of repository tables, select External Metadata.
2. Choose the table or tables you want to check for changes.
3. Right-click and choose Reconcile.
The Changed column displays YES to indicate that the database tables differ from the metadata imported into Data Integrator. To use the most recent metadata from Data Integrator, reimport the table. The Imported column displays YES to indicate that the table has been imported into the repository.

To view the metadata for an imported table
1. Select the table name in the list of imported tables.
2. Right-click and select Open.
A table editor appears in the workspace and displays the schema and attributes of the table.

To view secondary index information for tables
Secondary index information can help you understand the schema of an imported table.
1. From the Datastores tab in the Designer, right-click a table to open the shortcut menu.
2. From the shortcut menu, click Properties to open the Properties window.
3. In the Properties window, click the Indexes tab. The left portion of the window displays the Index list.
4. Click an index to see the contents.

Importing metadata through a database datastore
For database datastores, you can import metadata for tables and functions. This section discusses:
• Imported table information
• Imported stored function and procedure information
• Ways of importing metadata
• Reimporting objects

Imported table information
Data Integrator determines and stores a specific set of metadata information for tables. After importing metadata, you can edit column names, descriptions, and data types. The edits are propagated to all objects that call these objects.

Table name — The name of the table as it appears in the database.
Table description — The description of the table.
Column name — The name of the table column.
Column description — The description of the column.
Column data type — The data type for each column. If a column is defined as an unsupported data type, Data Integrator converts the data type to one that is supported. In some cases, if Data Integrator cannot convert the data type, it ignores the column entirely.

Primary key column — The column(s) that comprise the primary key for the table. After a table has been added to a data flow diagram, these columns are indicated in the column list by a key icon next to the column name.
Table attribute — Information Data Integrator records about the table such as the date created and date modified if these values are available.
Owner name — Name of the table owner. Note: The owner name for MySQL and Netezza data sources corresponds to the name of the database or schema where the table appears.

Varchar and Column Information from Business Objects Data Federator tables
Any decimal column imported to Data Integrator from a Business Objects Data Federator data source is converted to the decimal precision and scale (28,6). Any varchar column imported to Data Integrator from a Business Objects Data Federator data source is varchar(1024). You may change the decimal precision or scale and varchar size within Data Integrator after importing from the Business Objects Data Federator data source.

Imported stored function and procedure information
Data Integrator can import stored procedures from DB2, MS SQL Server, Oracle, Sybase ASE, and Sybase IQ databases. You can also import stored functions and packages from Oracle. You can use these functions and procedures in the extraction specifications you give Data Integrator.
Information that is imported for functions includes:
• Function parameters
• Return type
• Name, owner
Imported functions and procedures appear on the Datastores tab of the object library. Functions and procedures appear in the Function branch of each datastore tree. You can configure imported functions and procedures through the function wizard and the smart editor in a category identified by the datastore name. For more information, see the Data Integrator Reference Guide.

Ways of importing metadata
This section discusses methods you can use to import metadata:
• To import by browsing
• To import by name
• To import by searching

To import by browsing
Note: Functions cannot be imported by browsing.
1. Open the object library.
2. Go to the Datastores tab.
3. Select the datastore you want to use.
4. Right-click and choose Open.
The items available to import through the datastore appear in the workspace. In some environments, the tables are organized and displayed as a tree structure. If this is true, there is a plus sign (+) to the left of the name. Click the plus sign to navigate the structure. The workspace contains columns that indicate whether the table has already been imported into Data Integrator (Imported) and if the table schema has changed since it was imported (Changed). To verify whether the repository contains the most recent metadata for an object, right-click the object and choose Reconcile.
5. Select the items for which you want to import metadata.
For example, to import a table, you must select a table rather than a folder that contains tables.
6. Right-click and choose Import.

7. In the object library, go to the Datastores tab to display the list of imported objects.

To import by name
1. Open the object library.
2. Click the Datastores tab.
3. Select the datastore you want to use.
4. Right-click and choose Import By Name.
5. In the Import By Name window, choose the type of item you want to import from the Type list.
If you are importing a stored procedure, select Function.
6. Specify the items you want imported.
Note: Options vary by database type.
• For tables:
   • Enter a table name in the Name box to specify a particular table, or select the All check box, if available, to specify all tables. If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case.
   • Enter an owner name in the Owner box to limit the specified tables to a particular owner. If you leave the owner name blank, you specify matching tables regardless of owner (that is, any table with the specified table name).
• For functions and procedures:
   • In the Name box, enter the name of the function or stored procedure. If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case. Otherwise, Data Integrator will convert names into all upper-case characters.
   You can also enter the name of a package. An Oracle package is an encapsulated collection of related program objects (e.g., procedures, functions, variables, constants, cursors, and exceptions) stored together in the database. Data Integrator allows you to import procedures or functions created within packages and use them as top-level procedures or functions. If you enter a package name, Data Integrator imports all stored procedures and stored functions defined within the Oracle package. You cannot import an individual function or procedure defined within a package.
   • Enter an owner name in the Owner box to limit the specified functions to a particular owner. If you leave the owner name blank, you specify matching functions regardless of owner (that is, any function with the specified name).
   • If you are importing an Oracle function or stored procedure and any of the following conditions apply, clear the Callable from SQL expression check box: a stored procedure cannot be pushed down to a database inside another SQL statement when the stored procedure contains a DDL statement, ends the current transaction with COMMIT or ROLLBACK, or issues any ALTER SESSION or ALTER SYSTEM commands.
7. Click OK.

To import by searching
Note: Functions cannot be imported by searching.
1. Open the object library.
2. Click the Datastores tab.
3. Select the name of the datastore you want to use.
4. Right-click and select Search.
The Search window appears.
5. Enter the entire item name or some part of it in the Name text box.
If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case.
6. Select Contains or Equals from the drop-down list to the right depending on whether you provide a complete or partial search value.
Equals qualifies only the full search string. That is, you need to search for owner.table_name rather than simply table_name.
7. (Optional) Enter a description in the Description text box.
8. Select the object type in the Type box.
9. Select the datastore in which you want to search from the Look In box.
10. Select External from the drop-down box to the right of the Look In box.
External indicates that Data Integrator searches for the item in the entire database defined by the datastore. Internal indicates that Data Integrator searches only the items that have been imported.
11. Go to the Advanced tab to search using Data Integrator attribute values.
The advanced options only apply to searches of imported items.
12. Click Search.
Data Integrator lists the tables matching your search criteria.
13. To import a table from the returned list, select the table, right-click, and choose Import.

Reimporting objects
If you have already imported an object such as a datastore, function, or table, you can reimport it, which updates the object's metadata from your database (reimporting overwrites any changes you might have made to the object in Data Integrator).
To reimport objects in previous versions of Data Integrator, you opened the datastore, viewed the repository metadata, and selected the objects to reimport. In this version of Data Integrator, you can reimport objects using the object library at various levels:
• Individual objects — Reimports the metadata for an individual object such as a table or function
• Category node level — Reimports the definitions of all objects of that type in that datastore, for example all tables in the datastore
• Datastore level — Reimports the entire datastore and all its dependent objects including tables, functions, IDOCs, and hierarchies

To reimport objects from the object library
1. In the object library, click the Datastores tab.
2. Right-click an individual object and click Reimport, or right-click a category node or datastore name and click Reimport All.
You can also select multiple individual objects using Ctrl-click or Shift-click. The Reimport dialog box opens.
3. Click Yes to reimport the metadata.
If you selected multiple objects to reimport (for example with Reimport All), Data Integrator requests confirmation for each object unless you check the box Don't ask me again for the remaining objects. You can skip objects to reimport by clicking No for that object. If you are unsure whether to reimport (and thereby overwrite) the object, click View Where Used to display where the object is currently being used in your jobs.

Memory datastores
Data Integrator also allows you to create a database datastore using Memory as the Database type. Memory datastores are designed to enhance processing performance of data flows executing in real-time jobs. Data (typically small amounts in a real-time job) is stored in memory to provide immediate access instead of going to the original source data.
A memory datastore is a container for memory tables. A datastore normally provides a connection to a database, application, or adapter. By contrast, a memory datastore contains memory table schemas saved in the repository.
Memory tables are schemas that allow you to cache intermediate data. Memory tables can cache data from relational database tables and hierarchical data files such as XML messages and SAP IDocs (both of which contain nested schemas). Memory tables can be used to:
• Move data between data flows in real-time jobs. By caching intermediate data, the performance of real-time jobs with multiple data flows is far better than it would be if files or regular tables were used to store intermediate data.
• Store table data in memory for the duration of a job. By storing table data in memory, the LOOKUP_EXT function and other transforms and functions that do not require database operations can access data without having to read it from a remote database.
The lifetime of memory table data is the duration of the job. The data in memory tables cannot be shared between different real-time jobs. Support for the use of memory tables in batch jobs is not available.

Creating memory datastores
You can create memory datastores using the Datastore Editor window.

To define a memory datastore
1. From the Project menu, select New > Datastore.
2. In the Name box, enter the name of the new datastore.
Be sure to use the naming convention "Memory_DS". Datastore names are appended to table names when table icons appear in the workspace. Memory tables are represented in the workspace with regular table icons. Therefore, label a memory datastore to distinguish its memory tables from regular database tables in the workspace. For best performance, only use memory tables when processing small quantities of data.
3. In the Datastore type box, keep the default Database.
4. In the Database Type box, select Memory.
No additional attributes are required for the memory datastore.
5. Click OK.

Creating memory tables
When you create a memory table, you do not have to specify the table's schema or import the table's metadata. Instead, Data Integrator creates the schema for each memory table automatically based on the preceding schema, which can be either a schema from a relational database table or hierarchical data files such as XML messages. The first time you save the job, Data Integrator defines the memory table's schema and saves the table. Subsequently, the table appears with a table icon in the workspace and in the object library under the memory datastore.

To create a memory table
1. From the tool palette, click the template table icon.
2. Click inside a data flow to place the template table.
The Create Table window opens.
3. From the Create Table window, select the memory datastore.
4. Enter a table name.
5. If you want a system-generated row ID column in the table, click the Create Row ID check box. See "Create Row ID option" on page 104 for more information.
6. Click OK.
The memory table appears in the workspace as a template table icon.
7. Connect the memory table to the data flow as a target.
8. From the Project menu, select Save.
In the workspace, the memory table's icon changes to a target table icon and the table appears in the object library under the memory datastore's list of tables.

Using memory tables as sources and targets
After you create a memory table as a target in one data flow, you can use a memory table as a source or target in any data flow. See Chapter 10: Real-time jobs for an example of how to use memory tables as sources and targets in a job.

To use a memory table as a source or target
1. In the object library, click the Datastores tab.
2. Expand the memory datastore that contains the memory table you want to use.
3. Expand Tables.
A list of tables appears.
4. Select the memory table you want to use as a source or target, and drag it into an open data flow.
5. Connect the memory table as a source or target in the data flow.
If you are using a memory table as a target, open the memory table's target table editor to set table options. See "Memory table target options" on page 104 for more information.
6. Save the job.

Update Schema option
You might want to quickly update a memory target table's schema if the preceding schema changes. To do this, use the Update Schema option. Otherwise, you would have to add a new memory table to update a schema.

To update the schema of a memory target table
1. Right-click the memory target table's icon in the workspace.
2. Select Update Schema.
The schema of the preceding object is used to update the memory target table's schema. The current memory table is updated in your repository. All occurrences of the current memory table are updated with the new schema.

Memory table target options
The Delete data from table before loading option is available for memory table targets. The default is yes. If you deselect this option, new data will append to the existing table data. To set this option, open the memory target table editor.

Create Row ID option
If the Create Row ID is checked in the Create Memory Table window, Data Integrator generates an integer column called DI_Row_ID in which the first row inserted gets a value of 1, the second row inserted gets a value of 2, and so on. This new column allows you to use a LOOKUP_EXT expression as an iterator in a script.

Use the DI_Row_ID column to iterate through a table using a lookup_ext function in a script. For example:

   $NumOfRows = total_rows(memory_DS..table1);
   $I = 1;
   $count = 0;
   while ($count < $NumOfRows)
   begin
      $data = lookup_ext([memory_DS..table1, 'NO_CACHE', 'MAX'], [A], [O], [DI_Row_ID, '=', $I]);
      $I = $I + 1;
      if ($data != NULL)
      begin
         $count = $count + 1;
      end
   end

In the preceding script, table1 is a memory table. The table's name is preceded by its datastore name (memory_DS), a dot, a blank space (where a table owner would be for a regular table), and then a second dot. There are no owners for memory datastores, so tables are identified by just the datastore name and the table name as shown. Select the LOOKUP_EXT function arguments (the lookup_ext call shown above) from the function editor when you define a LOOKUP_EXT function.

The TOTAL_ROWS(DatastoreName.Owner.TableName) function returns the number of rows in a particular table in a datastore. This function can be used with any type of datastore. If used with a memory datastore, use the following syntax: TOTAL_ROWS(DatastoreName..TableName)

Data Integrator also provides a built-in function that you can use to explicitly expunge data from a memory table. This provides finer control than the active job has over your data and memory usage. The TRUNCATE_TABLE(DatastoreName..TableName) function can only be used with memory tables.
Note: The same functionality is available for other datastore types using the SQL function.

For more information about these and other Data Integrator functions, see the Data Integrator Reference Guide.
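As an illustration of how these two functions might be combined, the following script fragment is a sketch only: it reuses the memory_DS datastore and table1 memory table from the example above, and the exact way you invoke TRUNCATE_TABLE (for example, as a standalone statement or assigned to a variable) may differ in your environment.

   # Check how many rows are currently cached, then expunge them from the memory table
   $NumOfRows = total_rows(memory_DS..table1);
   if ($NumOfRows > 0)
   begin
      truncate_table(memory_DS..table1);
   end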

Troubleshooting memory tables
• One possible error, particularly when using memory tables, is that Data Integrator runs out of virtual memory space. If Data Integrator runs out of memory while executing any operation, Data Integrator exits.
• A validation and run time error occurs if the schema of a memory table does not match the schema of the preceding object in the data flow. To correct this error, use the Update Schema option or create a new memory table to match the schema of the preceding object in the data flow.
• Two log files contain information specific to memory tables: trace_memory_reader log and trace_memory_loader log.

Persistent cache datastores
Data Integrator also allows you to create a database datastore using Persistent cache as the Database type. Persistent cache datastores provide the following benefits for data flows that process large volumes of data.
• You can store a large amount of data in persistent cache which Data Integrator quickly loads into memory to provide immediate access during a job. For example, you can access a lookup table or comparison table locally (instead of reading from a remote database).
• You can create cache tables that multiple data flows can share (unlike a memory table which cannot be shared between different real-time jobs). For example, if a large lookup table used in a lookup_ext function rarely changes, you can create a cache once and subsequent jobs can use this cache instead of creating it each time.
A persistent cache datastore is a container for cache tables. In Data Integrator, a datastore normally provides a connection to a database, application, or adapter. By contrast, a persistent cache datastore contains cache table schemas saved in the repository.
Persistent cache tables allow you to cache large amounts of data. Persistent cache tables can cache data from relational database tables and files.
Note: You cannot cache data from hierarchical data files such as XML messages and SAP IDocs (both of which contain nested schemas). You cannot perform incremental inserts, deletes, or updates on a persistent cache table.
You create a persistent cache table by loading data into the persistent cache target table using one data flow. You can then subsequently read from the cache table in another data flow. When you load data into a persistent cache table, Data Integrator always truncates and recreates the table.
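As an illustration of the shared-lookup benefit described above, once one data flow has loaded a persistent cache table, a later script or mapping can read it with lookup_ext much as in the memory-table example earlier in this chapter. The following sketch is hypothetical: the datastore name Persist_DS, the table PRODUCT_LOOKUP, its columns, and the variables are invented for this example, the owner component between the dots may differ for persistent cache tables in your environment, and the argument layout simply mirrors the earlier example (see the Data Integrator Reference Guide for the full lookup_ext syntax).

   # Look up a product name from a previously loaded persistent cache table.
   $ProductName = lookup_ext([Persist_DS..PRODUCT_LOOKUP,
      'PRE_LOAD_CACHE','MAX'],[PRODUCT_NAME],[NULL],[PRODUCT_ID,'=',$ProductId]);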

Creating persistent cache datastores
You can create persistent cache datastores using the Datastore Editor window.

To define a persistent cache datastore
1. From the Project menu, select New > Datastore.
2. In the Name box, enter the name of the new datastore. Be sure to use a naming convention such as "Persist_DS". Datastore names are appended to table names when table icons appear in the workspace, and persistent cache tables are represented in the workspace with regular table icons. Therefore, label a persistent cache datastore to distinguish its persistent cache tables from regular database tables in the workspace.
3. In the Datastore type box, keep the default Database.
4. In the Database Type box, select Persistent cache.
5. In the Cache directory box, you can either type or browse to a directory where you want to store the persistent cache.
6. Click OK.

Creating persistent cache tables
When you create a persistent cache table, you do not have to specify the table's schema or import the table's metadata. Instead, Data Integrator creates the schema for each persistent cache table automatically based on the preceding schema. The first time you save the job, Data Integrator defines the persistent cache table's schema and saves the table. Subsequently, the table appears with a table icon in the workspace and in the object library under the persistent cache datastore.

You create a persistent cache table in one of the following ways:
• As a target template table in a data flow (see "To create a persistent cache table as a target in a data flow" below)
• As part of the Data_Transfer transform during the job execution (see the Data Integrator Reference Guide)

To create a persistent cache table as a target in a data flow
1. Use one of the following methods to open the Create Template window:
• From the tool palette:
a. Click the template table icon.
b. Click inside a data flow to place the template table in the workspace.

c. On the Create Template window, select the persistent cache datastore.
• From the object library:
a. Expand a persistent cache datastore.
b. Click the template table icon and drag it to the workspace.
c. On the Create Template window, select the persistent cache datastore.
2. On the Create Template window, enter a table name.
3. Click OK.
The persistent cache table appears in the workspace as a template table icon.
4. Connect the persistent cache table to the data flow as a target (usually a Query transform).
5. In the Query transform, map the Schema In columns that you want to include in the persistent cache table.
6. Open the persistent cache table's target table editor to set table options. For more information, see the Data Integrator Reference Guide.
7. On the Options tab of the persistent cache target table editor, you can change the following options for the persistent cache table.
• Column comparison — Specifies how the input columns are mapped to persistent cache table columns. There are two options:
• Compare_by_position — Data Integrator disregards the column names and maps source columns to target columns by position.
• Compare_by_name — Data Integrator maps source columns to target columns by name. This option is the default.

• Include duplicate keys — Select this check box to cache duplicate keys. This option is selected by default.
8. On the Keys tab, specify the key column or columns to use as the key in the persistent cache table. For more information, see the Data Integrator Reference Guide.
9. From the Project menu select Save. In the workspace, the template table's icon changes to a target table icon and the table appears in the object library under the persistent cache datastore's list of tables.
For information about the persistent cache table options, see the Data Integrator Reference Guide.

Using persistent cache tables as sources
After you create a persistent cache table as a target in one data flow, you can use the persistent cache table as a source in any data flow. You can also use it as a lookup table or comparison table.

Linked datastores
Various database vendors support one-way communication paths from one database server to another. Oracle calls these paths database links. In DB2, the one-way communication path from a database server to another database server is provided by an information server that allows a set of servers to get data from remote data sources. In Microsoft SQL Server, linked servers provide the one-way communication path from one database server to another. These solutions allow local users to access data on a remote database, which can be on the local or a remote computer and of the same or different database type.

For example, a local Oracle database server, called Orders, can store a database link to access information in a remote Oracle database, Customers. Users logged into database Customers must define a separate link, stored in the data dictionary of database Customers, to access data on Orders. Users connected to Customers, however, cannot use the same link to access data in Orders.

Data Integrator refers to communication paths between databases as database links. The datastores in a database link relationship are called linked datastores. Data Integrator uses linked datastores to enhance its performance by pushing down operations to a target database using a target datastore. For more information, see the Data Integrator Performance Optimization Guide.

Relationship between database links and Data Integrator datastores
A database link stores information about how to connect to a remote data source, such as its host name, database name, user name, password, and database type. The same information is stored in a Data Integrator database datastore. You can associate the datastore to another datastore and then import an external database link as an option of a Data Integrator datastore. The datastores must connect to the databases defined in the database link.
Additional requirements are as follows:
• A local server for database links must be a target server in Data Integrator
• A remote server for database links must be a source server in Data Integrator
• An external (exists first in a database) database link establishes the relationship between any target datastore and a source datastore
• A local datastore can be related to zero or multiple datastores using a database link for each remote database
• Two datastores can be related to each other using one link only

Data Integrator. However. you can create multiple external database links that connect to the same remote source.This document is part of a SAP study on PDF usage. For information about creating a linked datastore. Dblink4 relates Ds1 with Ds3. Dblink2 is not mapped to any datastore in Data Integrator because it relates Ds1 with Ds2. which are also related by Dblink1. Find out how you can participate and help to improve our documentation. This relationship is called linked datastore Dblink1 (the linked datastore has the same name as the external database link). • • • • Dblink1 relates datastore Ds1 to datastore Ds2. are on database DB1 and Data Integrator reads them through datastore Ds1. Adapter datastores Depending on the adapter implementation. For example. Data Integrator adapters allow you to: • • Browse application metadata Import application metadata into a Data Integrator repository Data Integrator Designer Guide 111 . see the Data Integrator Reference Guide. Although it is not a regular case. DBLink 1 through 4. allows only one database link between a target datastore and a source datastore pair. you cannot import DBLink2 to do the same. Dblink3 is not mapped to any datastore in Data Integrator because there is no datastore defined for the remote data source to which the external database link refers. Datastores Adapter datastores 5 The following diagram shows the possible relationships between database links and linked datastores: Remote Servers Local Server D s 2 D s 1 DBLink1 DBLink2 DBLink4 D s 3 DB2 DBLink1 DBLink2 DBLink3 DBLink4 DB1 DB3 Four database links. if you select DBLink1 to link target datastore DS1 with source datastore DS2.

• Move batch and real-time data between Data Integrator and applications

Business Objects offers an Adapter Software Development Kit (SDK) to develop your own custom adapters. Also, you can buy Data Integrator prepackaged adapters to access application metadata and data in any application. For more information on these products, contact your Business Objects Sales Representative.

Adapters can provide access to an application's data and metadata or just metadata. For example, if the data source is SQL-compatible, the adapter might be designed to access metadata, while Data Integrator extracts data from or loads data directly to the application. Adapters are represented in Designer by adapter datastores.

Data Integrator jobs provide batch and real-time data movement between Data Integrator and applications through an adapter datastore's subordinate objects:
For batch data movement:
• Tables — Use as source or target
• Documents — Use as source or target
• Functions — Use as function call in query
For real-time data movement:
• Message functions — Use as function call in query
• Outbound messages — Use as target only
These objects are described in "Source and target objects" on page 178 and "Real-time source and target objects" on page 266.

For information about installing, configuring, and starting adapters, see the Data Integrator Getting Started Guide. For information about configuring adapter connections for a Job Server, see the Data Integrator Management Console: Administrator Guide.

Defining an adapter datastore
You need to define at least one datastore for each adapter through which you are extracting or loading data. To define a datastore, you must have appropriate access privileges to the application that the adapter serves.

To define an adapter datastore
1. In the Object Library, click to select the Datastores tab.
2. Right-click and select New.
The Datastore Editor dialog opens (the title bar reads, Create new Datastore).

configure. The datastore name appears in the Designer only. and ensure that the Job Server’s service is running. Show ATL OK Cancel Apply 8. Data Integrator Designer Guide 113 . 5. After you complete your datastore connection. select Adapter. It can be the same as the adapter name. you can browse and/or import metadata from the data source through the adapter. you must first install the adapter on the Job Server computer. Opens a text window that displays how Data Integrator will code the selections you make for this datastore in its scripting language. configure the Job Server to support local adapters using Data Integrator’s System Manager utility. Adapters residing on the Job Server computer and registered with the selected Job Server appear in the Job server list. Find out how you can participate and help to improve our documentation. To create an adapter datastore. Saves selections. and manage multiple configurations for a datastore. 7. Click OK. 6. Data Integrator displays it below the grid. Select a Job server from the list. the following buttons are available: Buttons Edit Description Opens the Configurations for Datastore dialog. In the Datastore type list. Enter all adapter information required to complete the datastore connection. Select an adapter instance from the Adapter instance name list. The datastore configuration is saved in your metadata repository and the new datastore appears in the object library. Datastores Adapter datastores 5 3.This document is part of a SAP study on PDF usage. Use the tool bar on this window to add. For the datastore as a whole. Note: If the developer included a description for each option. Cancels selections and closes the Datastore Editor window. 4. Enter a unique identifying name for the datastore. Also the adapter documentation should list all information required for a datastore connection. Saves selections and closes the Datastore Editor (Create New Datastore) window.

To change an adapter datastore's configuration
1. Right-click the datastore you want to browse and select Edit to open the Datastore Editor window.
2. Edit configuration information. When editing an adapter datastore, enter or select a value. Data Integrator looks for the Job Server and adapter instance name you specify. If the Job Server and adapter instance both exist, and the Designer can communicate to get the adapter's properties, then it displays them accordingly. If the Designer cannot get the adapter's properties, then it retains the previous properties.
3. Click OK.
The edited datastore configuration is saved in your metadata repository.

To delete an adapter datastore and associated metadata objects
1. Right-click the datastore you want to delete and select Delete.
2. Click OK in the confirmation window.
Data Integrator removes the datastore and all metadata objects contained within that datastore from the metadata repository. If these objects exist in established flows, they appear with a deleted icon.

Browsing metadata through an adapter datastore
The metadata you can browse depends on the specific adapter.

To browse application metadata
1. Right-click the datastore you want to browse and select Open. A window opens showing source metadata.
2. Scroll to view metadata name and description attributes.
3. Click plus signs [+] to expand objects and view subordinate objects.
4. Right-click any object to check importability.

Importing metadata through an adapter datastore
The metadata you can import depends on the specific adapter. After importing metadata, you can edit it. Your edits propagate to all objects that call these objects.

Click OK. The object is imported into one of the adapter datastore containers (documents. and PROD) Multi-instance (databases with different versions or locales) Multi-user (databases for central and local repositories) For more information about how to use multiple datastores to support these scenarios. 3. Then. see “Portability solutions” on page 120. The Import by name window appears containing import parameters with corresponding text boxes. Creating and managing multiple datastore configurations Creating multiple configurations for a single datastore allows you to consolidate separate datastore connections for similar sources or targets into one source or target datastore with multiple configurations. Click each import parameter text box and enter specific information related to the object you want to import. 3. then select Import by name. or message functions). Find out how you can participate and help to improve our documentation. Any object(s) matching your parameter constraints are imported to one of the corresponding Data Integrator categories specified under the datastore. To import application metadata while browsing Right-click the datastore you want to browse. Right-click the object and select Import. functions. tables. 2. To import application metadata by name Right-click the datastore from which you want metadata. 2. Datastores Creating and managing multiple datastore configurations 5 1. then select Open. 4. you can select a set of configurations that includes the sources and targets you want by selecting a system configuration when you execute or schedule the job. Find the metadata object you want to import from the browsable list. This section covers the following topics: • Definitions Data Integrator Designer Guide 115 .This document is part of a SAP study on PDF usage. 1. such as: • • • • OEM (different databases for design and distribution) Migration (different connections for DEV. The ability to create multiple datastore configurations provides greater easeof-use for job portability scenarios. TEST. outbound messages.

• Why use multiple datastore configurations?
• Creating a new configuration
• Adding a datastore alias
• Portability solutions
• Job portability tips
• Renaming table and function owner
• Defining a system configuration

Definitions
Refer to the following terms when creating and managing multiple datastore configurations:
Datastore configuration — Allows you to provide multiple metadata sources or targets for datastores. Each configuration is a property of a datastore that refers to a set of configurable options (such as database connection name, database type, user name, password, and locale) and their values.
Default datastore configuration — The datastore configuration that Data Integrator uses for browsing and importing database objects (tables and functions) and executing jobs if no system configuration is specified. If a datastore has more than one configuration, select a default configuration, as needed. If a datastore has only one configuration, Data Integrator uses it as the default configuration.
Current datastore configuration — The datastore configuration that Data Integrator uses to execute a job. If you define a system configuration, Data Integrator will execute the job using the system configuration. Specify a current configuration for each system configuration. If you do not create a system configuration, or the system configuration does not specify a configuration for a datastore, Data Integrator uses the default datastore configuration as the current configuration at job execution time.
Database objects — The tables and functions that are imported from a datastore. Database objects usually have owners. Some database objects do not have owners. For example, database objects in an ODBC datastore connecting to an Access database do not have owners.
Owner name — Owner name of a database object (for example, a table) in an underlying database. Also known as database owner name or physical owner name.
Alias — A logical owner name. Create an alias for objects that are in different database environments if you have different owner names in those environments. You can create an alias from the datastore editor for any datastore configuration.

Dependent objects — Dependent objects are the jobs, work flows, data flows, and custom functions in which a database object is used. Dependent object information is generated by the where-used utility.

Why use multiple datastore configurations?
By creating multiple datastore configurations, you can decrease end-to-end development time in a multi-source, 24x7, enterprise data warehouse environment because you can easily port jobs among different database types, versions, and instances. For example, porting can be as simple as:
1. Creating a new configuration within an existing source or target datastore.
2. Adding a datastore alias and then mapping configurations with different object owner names to it.
3. Defining a system configuration and then adding the datastore configurations required for a particular environment. Select a system configuration when you execute a job.

Creating a new configuration
You can create multiple configurations for all datastore types except memory datastores. Use the Datastore Editor to create and edit datastore configurations. For Datastore Editor details, see the Data Integrator Reference Guide. Each datastore must have at least one configuration. If only one configuration exists, it is the default configuration.

To create a new datastore configuration
1. From the Datastores tab of the object library, right-click any existing datastore and select Edit.
2. Click Advanced to view existing configuration information.
3. Click Edit to open the Configurations for Datastore window.
4. Click the Create New Configuration icon on the toolbar. The Create New Configuration window opens.
5. In the Create New Configuration window:
a. Enter a unique, logical configuration Name.
b. Select a Database type from the drop-down menu.
c. Select a Database version from the drop-down menu.

d. In the Values for table targets and SQL transforms section, Data Integrator pre-selects the Use values from value based on the existing database type and version. The Designer automatically uses the existing SQL transform and target values for the same database type and version. Further, if the database you want to associate with a new configuration is a later version than that associated with other existing configurations, the Designer automatically populates the Use values from with the earlier version. However, if database type and version are not already specified in an existing configuration, or if the database version is older than your existing configuration, you can choose to use the values from another existing configuration or the default for the database type and version.
e. Select or deselect the Restore values if they already exist option.
When you delete datastore configurations, Data Integrator saves all associated target values and SQL transforms. If you create a new datastore configuration with the same database type and version as the one previously deleted, the Restore values if they already exist option allows you to access and take advantage of the saved value settings.
• If you keep this option (selected as default), Data Integrator uses customized target and SQL transform values from previously deleted datastore configurations.
• If you deselect Restore values if they already exist, Data Integrator does not attempt to restore target and SQL transform values, allowing you to provide new values.
f. Click OK to save the new configuration.
If your datastore contains pre-existing data flows with SQL transforms or target objects, Data Integrator must add any new database type and version values to these transform and target objects. Under these circumstances, when you add a new datastore configuration, Data Integrator displays the Added New Values Modified Objects window which provides detailed information about affected data flows and modified objects. These same results also display in the Output window of the Designer.

For each datastore, Data Integrator requires that one configuration be designated as the default configuration. Data Integrator uses the default configuration to import metadata and also preserves the default configuration during export and multi-user operations. Your first datastore configuration is automatically designated as the default; however, after adding one or more additional datastore configurations, you can use the datastore editor to flag a different configuration as the default.

When you export a repository, Data Integrator preserves all configurations in all datastores including related SQL transform text and target table editor settings. If the datastore you are exporting already exists in the target repository, Data Integrator overrides configurations in the target with source configurations. Data Integrator exports system configurations separate from other job related objects.

Adding a datastore alias
From the datastore editor, you can also create multiple aliases for a datastore then map datastore configurations to each alias.

To create an alias
1. From within the datastore editor, click Advanced, then click Aliases (Click here to create). The Create New Alias window opens.
2. Under Alias Name in Designer, use only alphanumeric characters and the underscore symbol (_) to enter an alias name.
3. Click OK.
The Create New Alias window closes and your new alias appears underneath the Aliases category.
When you define a datastore alias, Data Integrator substitutes your specified datastore configuration alias for the real owner name when you import metadata for database objects. You can also rename tables and functions after you import them. For more information, see "Renaming table and function owner" on page 126.

Data Integrator provides six functions that are useful when working with multiple source and target datastore configurations:
Function — Category — Description
db_type — Miscellaneous — Returns the database type of the current datastore configuration.
db_version — Miscellaneous — Returns the database version of the current datastore configuration.
db_database_name — Miscellaneous — Returns the database name of the current datastore configuration if the database type is MS SQL Server or Sybase ASE.

db_owner — Miscellaneous — Returns the real owner name that corresponds to the given alias name under the current datastore configuration.
current_configuration — Miscellaneous — Returns the name of the datastore configuration that is in use at runtime.
current_system_configuration — Miscellaneous — Returns the name of the current system configuration. If no system configuration is defined, returns a NULL value.

Data Integrator links any SQL transform and target table editor settings used in a data flow to datastore configurations. For more information, see "Job portability tips" on page 125.

Use the Administrator to select a system configuration as well as view the underlying datastore configuration associated with it when you:
• Execute batch jobs
• Schedule batch jobs
• View batch job history
• Create real-time jobs
For more information, see the Data Integrator Management Console: Administrator Guide.

You can also use variable interpolation in SQL text with these functions to enable a SQL transform to perform successfully regardless of which configuration the Job Server uses at job execution time. For more information, see "SQL" on page 355 and the Data Integrator Reference Guide.
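For example, a script step can record which configuration a job is actually running against before any data flows execute. This is a minimal sketch rather than text from this guide: the datastore name Sales_DS and the variables ($type, $version, $config, $sys_config) are hypothetical and assumed to be declared at the job level, the functions are assumed to take the datastore name as their argument as their descriptions above suggest, and the exact values they return are documented in the Data Integrator Reference Guide.

   # Log the database type, version, and configuration in use at runtime.
   $type = db_type('Sales_DS');
   $version = db_version('Sales_DS');
   $config = current_configuration('Sales_DS');
   $sys_config = current_system_configuration();
   print('Sales_DS is running as [$type] [$version] using configuration [$config]');
   print('Active system configuration: [$sys_config]');

The same functions can be embedded in SQL transform text through variable substitution so that a single transform serves every configuration; see "SQL" on page 355 for the supported interpolation syntax.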

Portability solutions
Set multiple source or target configurations for a single datastore if you want to quickly change connections to a different source or target database. To use multiple configurations successfully, design your jobs so that you do not need to change schemas, data types, functions, variables, and so on when you switch between datastore configurations. For example, if you have a datastore with a configuration for Oracle sources and SQL sources, make sure that the table metadata schemas match exactly. Use the same table names, alias names, number and order of columns, as well as the same column names and data types.
Data Integrator provides several different solutions for porting jobs:
• Migration between environments
• Multiple instances
• OEM deployment
• Multi-user development
For more information on Data Integrator migration and multi-user development, see the Data Integrator Advanced Development and Migration Guide.

Migration between environments
When you must move repository metadata to another environment (for example from development to test or from test to production) which uses different source and target databases, the process typically includes the following characteristics:
• The environments use the same database type but may have unique database versions or locales.
• Database objects (tables and functions) can belong to different owners.
• Each environment has a unique database connection name, user name, password, other connection properties, and owner mapping.
• You use a typical repository migration procedure. Either you export jobs to an ATL file then import the ATL file to another repository, or you export jobs directly from one repository to another repository.
Because Data Integrator overwrites datastore configurations during export, you should add configurations for the target environment (for example, add configurations for the test environment when migrating from development to test) to the source repository (for example, add to the development repository before migrating to the test environment). The Export utility saves additional configurations in the target environment, which means that you do not have to edit datastores before running ported jobs in the target environment.
This solution offers the following advantages:
• Minimal production down time: You can start jobs as soon as you export them.
• Minimal security issues: Testers and operators in production do not need permission to modify repository objects.

Multiple instances
If you must load multiple instances of a data source to a target data warehouse, the task is the same as in a migration scenario except that you are using only one Data Integrator repository.

To load multiple instances of a data source to a target data warehouse
1. Create a datastore that connects to a particular instance.

2. Define the first datastore configuration. This datastore configuration contains all configurable properties such as database type, database connection name, user name, password, database version, and locale information. When you define a configuration for an Adapter datastore, make sure that the relevant Job Server is running so the Designer can find all available adapter instances for the datastore.
3. Define a set of alias-to-owner mappings within the datastore configuration. When you use an alias for a configuration, Data Integrator imports all objects using the metadata alias rather than using real owner names. This allows you to use database objects for jobs that are transparent to other database instances.
4. Use the database object owner renaming tool to rename owners of any existing database objects. (See "Renaming table and function owner" on page 126 for details.)
5. Import database objects and develop jobs using those objects, then run the jobs.
6. To support executing jobs under different instances, add datastore configurations for each additional instance.
7. Map owner names from the new database instance configurations to the aliases that you defined in step 3.
8. Run the jobs in all database instances.

OEM deployment
If you design jobs for one database type and deploy those jobs to other database types as an OEM partner, the deployment typically has the following characteristics:
• The instances require various source database types and versions.
• Since a datastore can only access one instance at a time, you may need to trigger functions at run-time to match different instances. If this is the case, Data Integrator requires different SQL text for functions (such as lookup_ext and sql) and transforms (such as the SQL transform). Data Integrator also requires different settings for the target table (configurable in the target table editor).
• The instances may use different locales.
• Database tables across different databases belong to different owners.
• Each instance has a unique database connection name, user name, password, other connection properties, and owner mappings.
• You export jobs to ATL files for deployment.

To deploy jobs to other database types as an OEM partner
1. Develop jobs for a particular database type following the steps described in the Multiple instances scenario.
To support a new instance under a new database type, Data Integrator copies target table and SQL transform database properties from the previous configuration to each additional configuration when you save it.
If you selected a bulk loader method for one or more target tables within your job's data flows, and new configurations apply to different database types, open your targets and manually set the bulk loader option (assuming you still want to use the bulk loader method with the new database type). Data Integrator does not copy bulk loader options for targets from one database type to another.
When Data Integrator saves a new configuration it also generates a report that provides a list of targets automatically set for bulk loading. Reference this report to make manual changes as needed.
2. If the SQL text in any SQL transform is not applicable for the new database type, modify the SQL text for the new database type.
If the SQL text contains any hard-coded owner names or database names, consider replacing these names with variables to supply owner names or database names for multiple database types. This way, you will not have to modify the SQL text for each environment.
3. Because Data Integrator does not support unique SQL text for each database type or version of the sql(), lookup_ext(), and pushdown_sql() functions, use the db_type() and similar functions to get the database type and version of the current datastore configuration and provide the correct SQL text for that database type and version using the variable substitution (interpolation) technique.
For an example of how to apply the db_type and SQL functions within an interpolation script, see the Data Integrator Reference Guide.
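As a rough sketch of the substitution technique described in steps 2 and 3, a script can resolve the owner name for the current configuration and branch on the database type before calling the sql() function. The datastore name OEM_DS, the alias SALES_ALIAS, the ORDERS table, the 'Oracle' return value, and the variables are all hypothetical, and the db_owner argument order is an assumption based on its description earlier in this section; the authoritative interpolation example is in the Data Integrator Reference Guide.

   # Provide database-specific SQL text without hard-coding owner names.
   $owner = db_owner('OEM_DS', 'SALES_ALIAS');
   if (db_type('OEM_DS') = 'Oracle')
   begin
      $row_count = sql('OEM_DS', 'SELECT COUNT(*) FROM ' || $owner || '.ORDERS');
   end
   else
   begin
      $row_count = sql('OEM_DS', 'SELECT COUNT(*) FROM ORDERS');
   end
   print('ORDERS contains [$row_count] rows under the current configuration');

Because the owner name and SQL text come from variables, the same job can be exported to an ATL file and run against a different database type without editing each SQL transform or function call.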

Multi-user development
If you are using a central repository management system, allowing multiple developers, each with their own local repository, to check in and check out jobs, the development environment typically has the following characteristics:
• It has a central repository and a number of local repositories.
• Multiple development environments get merged (via central repository operations such as check in and check out) at times.
• The instances share the same database type but may have different versions and locales.
• Database objects may belong to different owners.
• Each instance has a unique database connection name, user name, password, other connection properties, and owner mapping.
In the multi-user development scenario you must define aliases so that Data Integrator can properly preserve the history for all objects in the shared environment.

When porting jobs in a multi-user environment
1. Use the Renaming table and function owner tool to consolidate database object owner names into aliases.
• Renaming occurs in local repositories. To rename the database objects stored in the central repository, check out the datastore to a local repository and apply the renaming tool in the local repository.
• If the objects to be renamed have dependent objects, Data Integrator displays a message, which gives you the option to proceed or cancel the operation.
• If all the dependent objects can be checked out, renaming creates a new object that carries the alias in place of the original owner name.
• If all the dependent objects cannot be checked out (data flows are checked out by another user) and you cannot check out some of the dependent objects, the renaming tool only affects the flows that you can check out, and the original object will co-exist with the new object. The number of flows affected by the renaming process will affect the Usage Count and Where-Used information in the Designer for both the original object and the new object.
2. Maintain the datastore configurations of all users by not overriding the configurations they created. Instead, add a configuration and make it your default configuration while working in your own environment. Use caution because checking in datastores and checking them out as multi-user operations can override datastore configurations.
3. Data Integrator does not delete original objects from the central repository when you check in the new objects. Checking in the new objects does not automatically check in the dependent objects that were checked out.

When your group completes the development phase, Business Objects recommends that the last developer delete the configurations that apply to the development environments and add the configurations that apply to the test or production environments.

Job portability tips
• Data Integrator assumes that the metadata of a table or function is the same across different database types and versions specified in different configurations in the same datastore. For instance, if you import a table when the default configuration of the datastore is Oracle, then later use the table in a job to extract from DB2, your job will run. Import metadata for a database object using the default configuration and use that same metadata with all configurations defined in the same datastore.
• Data Integrator supports options in some database types or versions that it does not support in others. For example, Data Integrator supports parallel reading on Oracle hash-partitioned tables, not on DB2 or other database hash-partitioned tables. If you import an Oracle hash-partitioned table and set your data flow to run in parallel, Data Integrator will read from each partition in parallel. However, when you run your job using sources from a DB2 environment, parallel reading will not occur.
The following Data Integrator features support job portability:
• Enhanced SQL transform — With the enhanced SQL transform, you can enter different SQL text for different database types/versions and use variable substitution in the SQL text to allow Data Integrator to read the correct text for its associated datastore configuration.
• Enhanced target table editor — Using enhanced target table editor options, you can configure database table targets for different database types/versions to match their datastore configurations.
• Enhanced datastore editor — Using the enhanced datastore editor, when you create a new datastore configuration you can choose to copy the database properties (including the datastore and table target options as well as the SQL transform text) from an existing configuration or use the current values.

• When you design a job that will be run from different database types or versions, name database tables, functions, and stored procedures the same for all sources. If you create configurations for both case-insensitive databases and case-sensitive databases in the same datastore, Business Objects recommends that you name the tables, functions, and stored procedures using all upper-case characters.
• Table schemas should match across the databases in a datastore. This means the number of columns, the column names, and column positions should be exactly the same. The column data types should be the same or compatible. For example, if you have a VARCHAR column in an Oracle source, use a VARCHAR column in the Microsoft SQL Server source too. If you have a DATE column in an Oracle source, use a DATETIME column in the Microsoft SQL Server source. Define primary and foreign keys the same way.
• Stored procedure schemas should match. When you import a stored procedure from one datastore configuration and try to use it for another datastore configuration, Data Integrator assumes that the signature of the stored procedure is exactly the same for the two databases. For example, if a stored procedure is a stored function (only Oracle supports stored functions), then you have to use it as a function with all other configurations in a datastore (in other words, all databases must be Oracle). If your stored procedure has three parameters in one database, it should have exactly three parameters in the other databases. Further, the names, positions, data types, and in/out types of the parameters must match exactly.
To learn more about migrating Data Integrator projects, see the Data Integrator Advanced Development and Migration Guide.

Renaming table and function owner
Data Integrator allows you to rename the owner of imported tables, template tables, or functions. This process is called owner renaming. Use owner renaming to assign a single metadata alias instead of the real owner name for database objects in the datastore. Consolidating metadata under a single alias name allows you to access accurate and consistent dependency information at any time while also allowing you to more easily switch between configurations when you move jobs to different environments. When using objects stored in a central repository, a shared alias makes it easy to track objects checked in by multiple users. If all users of local repositories use the same alias, Data Integrator can track dependencies for objects that your team checks in and out of the central repository.

When you rename an owner, the instances of a table or function in a data flow are affected, not the datastore from which they were imported.

Data Integrator supports both case-sensitive and case-insensitive owner renaming.
• If the objects you want to rename are from a case-sensitive database, the owner renaming mechanism preserves case sensitivity.
• If the objects you want to rename are from a datastore that contains both case-sensitive and case-insensitive databases, Data Integrator will base the case-sensitivity of new owner names on the case sensitivity of the default configuration. To ensure that all objects are portable across all configurations in this scenario, enter all owner names and object names using uppercase characters.

During the owner renaming process:
• Data Integrator updates the dependent objects (jobs, work flows, and data flows that use the renamed object) to use the new owner name.
• The object library shows the entry of the object with the new owner name. Displayed Usage Count and Where-Used information reflect the number of updated dependent objects.

To rename the owner of a table or function
1. From the Datastore tab of the local object library, expand a table, template table, or function category.
2. Right-click the table or function and select Rename Owner. The Rename Owner window opens.
3. Enter a New Owner Name then click Rename.
When you enter a New Owner Name, Data Integrator uses it as a metadata alias for the table or function.
Note: If the object you are renaming already exists in the datastore, Data Integrator determines whether the two objects have the same schema. If they are the same, then Data Integrator proceeds. If they are different, then Data Integrator displays a message to that effect. You may need to choose a different object name.

If Data Integrator successfully updates all the dependent objects, it deletes the metadata for the object with the original owner name from the object library and the repository.

Using the Rename window in a multi-user scenario
This section provides a detailed description of Rename Owner window behavior in a multi-user scenario. Using an alias for all objects stored in a central repository allows Data Integrator to track all objects checked in by multiple users. When you are checking objects in and out of a central repository, there are several behaviors possible when you select the Rename button, depending upon the check out state of a renamed object and whether that object is associated with any dependent objects.

• Case 1: Object is not checked out, and object has no dependent objects in the local or central repository.
Behavior: When you click Rename, Data Integrator renames the object owner.
• Case 2: Object is not checked out, and object has one or more dependent objects (in the local repository).
Behavior: When you click Rename, Data Integrator displays a second window listing the dependent objects (that use or refer to the renamed object).
• Case 3: Object is checked out, and object has no dependent objects in the local or central repository.
Behavior: Same as Case 1.

If you click Continue, Data Integrator renames the objects and modifies the dependent objects to refer to the renamed object using the new owner name. If you click Cancel, the Designer returns to the Rename Owner window.
Note: An object may still have one or more dependent objects in the central repository. However, if the object to be renamed is not checked out, the Rename Owner mechanism (by design) does not affect the dependent objects in the central repository.

• Case 4: Object is checked out and has one or more dependent objects.
Behavior: This case contains some complexity.
• If you are not connected to the central repository, the status message reads: "This object is checked out from central repository "X". Please select Tools | Central Repository… to activate that repository before renaming."
• If you are connected to the central repository, the Rename Owner window opens. When you click Rename, a second window opens to display the dependent objects and a status indicating their check-out state and location. If a dependent object is located in the local repository only, the status message reads: "Used only in local repository. No check out necessary."

If the dependent object is in the central repository, and it is not checked out, the status message reads: "Not checked out."
If you have the dependent object checked out or it is checked out by another user, the status message shows the name of the checked out repository. For example: "Oracle.production.user1"
As in Case 2, the purpose of this second window is to show the dependent objects. In addition, this window allows you to check out the necessary dependent objects from the central repository, without having to go to the Central Object Library window.
Click the Refresh List button to update the check out status in the list. This is useful when Data Integrator identifies a dependent object in the central repository but another user has it checked out. When that user checks in the dependent object, click Refresh List to update the status and verify that the dependent object is no longer checked out.
To use the Rename Owner feature to its best advantage, check out associated dependent objects from the central repository. This helps avoid having dependent objects that refer to objects with owner names that do not exist. From the central repository, select one or more objects, then right-click and select Check Out. After you check out the dependent object, the Designer updates the status. If the check out was successful, the status shows the name of the local repository.

• Case 4a: You click Continue, and all dependent objects are checked out from the central repository.
Data Integrator renames the owner of the selected object and modifies all dependent objects to refer to the new owner name. Although to you it looks as if the original object has a new owner name, in reality Data Integrator has not modified the original object; it created a new object identical to the original, but using the new owner name. The original object with the old owner name still exists. Data Integrator then performs an "undo checkout" on the original object. It becomes your responsibility to check in the renamed object.
• Case 4b: You click Continue, but one or more dependent objects are not checked out from the central repository.
In this situation, Data Integrator displays another dialog box that warns you about objects not yet checked out and to confirm your desire to continue. Click No to return to the previous dialog box showing the dependent objects. Click Yes to proceed with renaming the selected object and to edit its dependent objects. Data Integrator modifies objects that are not checked out in the local repository to refer to the new owner name. It is your responsibility to maintain consistency with the objects in the central repository.

When the rename operation is successful, in the Datastore tab of the local object library, Data Integrator updates the table or function with the new owner name, including references from dependent objects, and the Output window displays the following message:
Object <Object_Name>: owner name <Old_Owner> successfully renamed to <New_Owner>.
If Data Integrator does not successfully rename the owner, the Output window displays the following message:
Object <Object_Name>: Owner name <Old_Owner> could not be renamed to <New_Owner>.

Defining a system configuration
What is the difference between datastore configurations and system configurations?
• Datastore configurations — Each datastore configuration defines a connection to a particular database from a single datastore.

• System configurations — Each system configuration defines a set of datastore configurations that you want to use together when running a job. You can define a system configuration if your repository contains at least one datastore with multiple configurations. Select a system configuration to use at run-time.
In many enterprises, a job designer defines the required datastore and system configurations, then a system administrator determines which system configuration to use when scheduling or starting a job.
Create datastore configurations using the Datastore Editor. (See "Creating a new configuration" on page 117 for details.) Datastore configurations are part of a datastore object. Data Integrator includes datastore configurations when you import or export a job. Similarly, when you check in or check out a datastore to a central repository (in a multi-user design environment), Data Integrator also checks in or checks out the corresponding datastore configurations.
Data Integrator maintains system configurations separate from jobs. You cannot check in or check out system configurations in a multi-user environment. However, you can export system configurations to a separate flat file which you can later import. By maintaining system configurations in a separate file, you avoid modifying your datastore each time you import or export a job.
When designing jobs, determine and create datastore configurations and system configurations depending on your business environment and rules. Create datastore configurations for the datastores in your repository before you create system configurations to organize and associate them.

To create a system configuration
1. In the Designer, select Tools > System Configurations.
The System Configuration Editor window opens. To use this window:
a. Use the first column to the left (Configuration name) to list system configuration names. Enter a system configuration name in the first column. The other columns indicate the name of a datastore containing multiple configurations.

c. Click OK to save your system configuration settings. Select a list box under any of the datastore columns and click the down-arrow to view a list of available configurations in that datastore. Data Integrator Designer Guide 133 . Click a blank cell under Configuration name and enter a new. Under each listed datastore. 2. right-click a datastore. 2. 3. and Delete commands for that row. unique system configuration name. Click to select a datastore configuration in the list. If you do not map a datastore configuration to a system configuration. Or. click to select an existing system configuration name and enter a new name to change it. Datastores Creating and managing multiple datastore configurations 5 b. particularly when exporting. 3. Find out how you can participate and help to improve our documentation. Right-click the gray box at the beginning of a row to use the Cut. Click OK. Select Repository > Export system configuration.atl file to easily identify that file as a system configuration. 1. Copy. select the datastore configuration you want to use when you run a job using the associated system configuration.This document is part of a SAP study on PDF usage. To export a system configuration In the object library. 4. the Job Server uses the default datastore configuration at run-time. Paste. Business Objects recommends that you add the SC_ prefix to each exported system configuration . Business Objects recommends that you use the SC_ prefix in each system configuration name so that you can easily identify this file as a system configuration.


Chapter 6: File Formats

About this chapter
This chapter contains the following topics:
• What are file formats?
• File format editor
• Creating file formats
• Editing file formats
• File format features
• Creating COBOL copybook file formats
• File transfers
• Web log support
For full details of file format properties, see the Data Integrator Reference Guide.

What are file formats?
A file format is a set of properties describing the structure of a flat file (ASCII). File formats describe the metadata structure. A file format describes a specific file. A file format template is a generic description that can be used for many data files.

Data Integrator can use data stored in files for data sources and targets. A file format defines a connection to a file. Therefore, you use a file format to connect Data Integrator to source or target data when the data is stored in a file rather than a database table. The object library stores file format templates that you use to define specific file formats as sources and targets in data flows.

When working with file formats, you must:
• Create a file format template that defines the structure for a file.
• Create a specific source or target file format in a data flow. The source or target file format is based on a template and specifies connection information such as the file name.

File format objects can describe files in:
• Delimited format — Characters such as commas or tabs separate each field
• Fixed width format — The column width is specified by the user
• SAP R/3 format — For details, see the Data Integrator Supplement for SAP

File format editor
Use the file format editor to set properties for file format templates and source and target file formats. Available properties vary by the mode of the file format editor:
• New mode — Create a new file format template
• Edit mode — Edit an existing file format template
• Source mode — Edit the file format of a particular source file
• Target mode — Edit the file format of a particular target file

The file format editor has three work areas:
• Properties-Values — Edit the values for file format properties. Expand and collapse the property groups by clicking the leading plus or minus.
• Column Attributes — Edit and define the columns or fields in the file. Field-specific formats override the default format set in the Properties-Values area.
• Data Preview — View how the settings affect sample data.

The file format editor contains "splitter" bars to allow resizing of the window and all the work areas. You can expand the file format editor to the full screen size. The properties and appearance of the work areas vary with the format of the file.

[Screenshot omitted: the file format editor showing the Properties-Values, Column Attributes, and Data Preview work areas, separated by splitter bars.]

For more information about the properties in the file format editor, see the Data Integrator Reference Guide.

You can navigate within the file format editor as follows:
• Switch between work areas using the Tab key.
• Navigate through fields in the Data Preview area with the Page Up, Page Down, and arrow keys.

• Open a drop-down menu in the Properties-Values area by pressing the ALT-down arrow key combination.
• When the file format type is fixed-width, you can also edit the column metadata structure in the Data Preview area.

Note: The Show ATL button displays a view-only copy of the Transformation Language file generated for your file format. You might be directed to use this by Business Objects Technical Support.

Creating file formats
To specify a source or target file:
• Create a file format template that defines the structure for a file.
• Create a specific source or target file format in a data flow. When you drag and drop a file format into a data flow, the format represents a file that is based on the template and specifies connection information such as the file name. To use a file format to create a metadata file, see "To create a specific source or target file" on page 148.

Create a file format template using any of the following methods:
• Creating a new file format
• Modeling a file format on a sample file
• Replicating and renaming file formats
• Creating a file format from an existing flat table schema

Creating a new file format

To create a new file format
1. In the local object library, go to the Formats tab, right-click Flat Files, and select New.
   The file format editor opens.
2. In Type, specify the file type:
   • Delimited — Select Delimited if the file uses a character sequence to separate columns
   • Fixed width — Select Fixed width if the file uses specified widths for each column
   Note: Data Integrator represents column sizes (field-size) in number of characters for all sources except fixed-width file formats, which it always represents in bytes. Consequently, if a fixed-width file format uses a multi-byte code page, then no data is displayed in the data preview section of the file format editor for its files. For more information about multi-byte support, see the Data Integrator Reference Guide.
3. In Name, enter a name that describes this file format template.
   After you save this file format template, you cannot change the name.
4. If you want to read and load files using a third-party file-transfer program, select YES for Custom transfer program and see "File transfers" on page 160.
5. Complete the other properties to describe files that this template represents.
   Properties vary by file type. Look for properties available when the file format editor is in new mode. All properties are described in the Data Integrator Reference Guide.
6. For source files, specify the structure of the columns in the Column Attributes work area:
   a. Enter field name.
   b. Set data types.
   c. Enter field lengths for VarChar data types.
   d. Enter scale and precision information for Numeric and Decimal data types.
   e. Enter Format field information for appropriate data types, if desired. This information overrides the default format set in the Properties-Values area for that data type.
   Note:
   • You do not need to specify columns for files used as targets. If you do specify columns and they do not match the output schema from the preceding transform, Data Integrator writes to the target file using the transform's output schema.
   • For a decimal or real data type, if you only specify a source column format, and the column names and data types in the target schema do not match those in the source schema, Data Integrator cannot use the source column format specified. Instead, it defaults to the format used by the code page on the computer where the Job Server is installed.
   You can also model a file format on a sample file. See "To model a file format on a sample file" on page 143.
7. Click Save & Close to save the file format template and close the file format editor.

Modeling a file format on a sample file

To model a file format on a sample file
1. From the Formats tab in the local object library, create a new flat file format template or edit an existing flat file format template.
2. Under Data File(s):
   • If the sample file is on your Designer computer, set Location to Local. Browse to set the Root directory and File(s) to specify the sample file.
     Note: During design, you can specify a file located on the computer where the Designer runs or on the computer where the Job Server runs. Indicate the file location in the Location property. During execution, you must specify a file located on the Job Server computer that will execute the job.
   • If the sample file is on the current Job Server computer, set Location to Job Server. Enter the Root directory and File(s) to specify the sample file. When you select Job Server, the Browse icon is disabled, so you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it. For example, a path on UNIX might be /usr/data/abc.txt. A path on Windows might be C:\DATA\abc.txt.
     Note: In the Windows operating system, files are not case sensitive; however, file names are case sensitive in the UNIX environment. (For example, abc.txt and aBc.txt would be two different files in the same UNIX directory.)
     To reduce the risk of typing errors, you can telnet to the Job Server (UNIX or Windows) computer and find the full path name of the file you want to use. Then, copy and paste the path name from the telnet application directly into the Root directory text box in the file format editor. You cannot use the Windows Explorer to determine the exact file location on Windows.
3. If the file type is delimited, set the appropriate column delimiter for the sample file.
4. Under Input/Output, set Skip row header to Yes if you want to use the first row in the file to designate field names.
   The file format editor will show the column names in the Data Preview area and create the metadata structure automatically.
5. Edit the metadata structure as needed.
   For both delimited and fixed-width files, you can edit the metadata structure in the Column Attributes work area:
   a. Right-click to insert or delete fields.
   b. Rename fields.
   c. Set data types.
   d. Enter field lengths for the VarChar data type.
   e. Enter scale and precision information for Numeric and Decimal data types.
   f. Enter Format field information for appropriate data types, if desired. This format information overrides the default format set in the Properties-Values area for that data type.
   For fixed-width files, you can also edit the metadata structure in the Data Preview area:
   a. Click to select and highlight columns.
   b. Right-click to insert or delete fields.
6. Click Save & Close to save the file format template and close the file format editor.

Replicating and renaming file formats
After you create one file format schema, you can quickly create another file format object with the same schema by replicating the existing file format and renaming it. To save time in creating file format objects, replicate and rename instead of configuring from scratch.

To create a file format from an existing file format
1. In the Formats tab of the object library, right-click an existing file format and choose Replicate from the menu.
   The File Format Editor opens, displaying the schema of the copied file format.

2. Double-click to select the Name property value (which contains the same name as the original file format object).
3. Type a new, unique name for the replicated file format.
   Note: You must enter a new name for the replicated file. Data Integrator does not allow you to save the replicated file with the same name as the original (or any other existing File Format object). Also, this is your only opportunity to modify the Name property value. Once saved, you cannot modify the name again.
4. Edit other properties as desired.
   Look for properties available when the file format editor is in new mode. Properties are described in the Data Integrator Reference Guide.
5. To save and view your new file format schema, click Save.
   To terminate the replication process (even after you have changed the name and clicked Save), click Cancel or press the Esc button on your keyboard.
6. Click Save & Close.

Creating a file format from an existing flat table schema

To create a file format from an existing flat table schema
1. From the Query editor, right-click a schema and select Create File format.
   The File Format editor opens populated with the schema you selected.
2. Edit the new schema as appropriate and click Save & Close.

Data Integrator saves the file format in the repository. You can access it from the Formats tab of the object library.

To create a specific source or target file
1. Select a flat file format template on the Formats tab of the local object library.
2. Drag the file format template to the data flow workspace.
3. Select Make Source to define a source file format, or select Make Target to define a target file format.
4. Click the name of the file format object in the workspace to open the file format editor.
5. Enter the properties specific to the source or target file.
   Look for properties available when the file format editor is in source mode or target mode. For a description of available properties, refer to the Data Integrator Reference Guide.
   Under File name(s), be sure to specify the file name and location in the File and Location properties.
   Note: You can use variables as file names. Refer to "Setting file names at run-time using variables" on page 314.
6. Connect the file format object to other objects in the data flow as appropriate.

Editing file formats
You can modify existing file format templates to match changes in the format or structure of a file. You cannot change the name of a file format template.
For example, if you have a date field in a source or target file that is formatted as mm/dd/yy and the data for this field changes to the format dd-mm-yy due to changes in the program that generates the source file, you can edit the corresponding file format template and change the date format information.
For specific source or target file formats, you can edit properties that uniquely define that source or target such as the file name and location.

To edit a file format template
1. In the object library Formats tab, double-click an existing flat file format (or right-click and choose Edit).
   The file format editor opens with the existing format values.
2. Edit the values as needed.
   Look for properties available when the file format editor is in edit mode. Properties are described in the Data Integrator Reference Guide.
3. Click Save.
   Any changes affect every source or target file that is based on this file format template.

To edit a source or target file
1. From the workspace, click the name of a source or target file.
   The file format editor opens, displaying the properties for the selected source or target file. Look for properties available when the file format editor is in source or target mode as appropriate. Properties are described in the Data Integrator Reference Guide.
2. Edit the desired properties.
   Any changes you make to values in a source or target file editor override those on the original file format. To change properties that are not available in source or target mode, you must edit the file's file format template.
3. Click Save.

File format features
Data Integrator offers several capabilities for processing files:
• Reading multiple files at one time
• Identifying source file names
• Number formats
• Ignoring rows with specified markers
• Date formats at the field level
• Error handling for flat-file sources

Reading multiple files at one time
Data Integrator can read multiple files with the same format from a single directory using a single source object.

To specify multiple files to read
1. Open the editor for your source file format.
2. Under Data File(s) in the file format editor:
   a. Set the Location of the source files to Local or Job Server.

   b. Set the root directory in Root directory.
      Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the root directory. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
   c. Under File(s), enter one of the following:
      • A list of file names separated by commas, or
      • A file name containing a wild card character (* or ?).
        For example:
        1999????.txt might read files from the year 1999
        *.txt reads all files with the txt extension from the specified Root directory

Identifying source file names
You might want to identify the source file for each row in your target in the following situations:
• You specified a wildcard character to read multiple source files at one time
• You load from different source files on different runs

To identify the source file for each row in the target
1. Under Source Information in the file format editor, set Include file name to Yes.
   This option generates a column named DI_FILENAME that contains the name of the source file.
2. In the Query editor, map the DI_FILENAME column from Schema In to Schema Out.
3. When you run the job, the DI_FILENAME column for each row in the target contains the source file name.

Number formats
The period (.) and the comma (,) are the two most common formats used to determine decimal and thousand separators for numeric data types. When formatting files in Data Integrator, data types in which these symbols can be used include Decimal, Numeric, Int (integer), and Double. You can use either symbol for the thousands indicator and either symbol for the decimal separator. For example: 2,098.65 or 2.098,65. Leading and trailing decimal signs are also supported. For example: +12,000.00 or 32.32-.

Ignoring rows with specified markers
The file format editor provides a way to ignore rows containing a specified marker (or markers) when reading files. For example, you might want to ignore comment line markers such as # and //.
Associated with this feature, two special characters — the semicolon (;) and the backslash (\) — make it possible to define multiple markers in your ignore row marker string. Use the semicolon to delimit each marker, and use the backslash to indicate special characters as markers (such as the backslash and the semicolon).
The default marker value is an empty string. When you specify the default value, no rows are ignored.

To specify markers for rows to ignore
1. Open the file format editor from the Object Library or by opening a source object in the workspace.
2. Find Ignore row marker(s) under the Format Property.
3. Click in the associated text box and enter a string to indicate one or more markers representing rows that Data Integrator should skip during file read and/or metadata creation.

The following table provides some ignore row marker(s) examples. (Each value is delimited by a semicolon unless the semicolon is preceded by a backslash.)

Marker Value(s)       Row(s) Ignored
(empty string)        None (this is the default value)
abc                   Any that begin with the string abc
abc;def;hi            Any that begin with abc or def or hi
abc;\;                Any that begin with abc or ;
abc;\\;\;             Any that begin with abc or \ or ;

Date formats at the field level
You can specify a date format at the field level to overwrite the default date, time, or date-time formats set in the Properties-Values area. For example, when the Data Type is set to Date, you can edit the value in the corresponding Format field to a different date format such as:
• yyyy.mm.dd
• mm/dd/yy
• dd.mm.yy

Error handling for flat-file sources
During job execution, Data Integrator processes rows from flat-file sources one at a time. You can configure the File Format Editor to identify rows in flat-file sources that contain the following types of errors:
• Data-type conversion errors — For example, a field might be defined in the File Format Editor as having a data type of integer but the data encountered is actually varchar.
• Row-format errors — For example, in the case of a fixed-width file, Data Integrator identifies a row that does not match the expected width value.
These error-handling properties apply to flat-file sources only. For complete details of all file format properties, see the Data Integrator Reference Guide.

Error-handling options
In the File Format Editor, the Error Handling set of properties allows you to choose whether or not to have Data Integrator:
• check for either of the two types of flat-file source error
• write the invalid row(s) to a specified error file
• stop processing the source file after reaching a specified number of invalid rows
• log data-type conversion or row-format warnings to the Data Integrator error log; if so, you can limit the number of warnings to log without stopping the job

About the error file
If enabled, the error file will include both types of errors. The format is a semicolon-delimited text file. You can have multiple input source files for the error file. The file resides on the same computer as the Job Server.
Entries in an error file have the following syntax:
source file path and name; row number in source file; Data Integrator error; column number where the error occurred; all columns from the invalid row
The following entry illustrates a row-format error:
d:/acl_work/in_test.txt;2;-80104: 1-3-A column delimiter was seen after column number <3> for row number <2> in file <d:/acl_work/in_test.txt>. The total number of columns defined is <3>, so a row delimiter should be seen after column number <3>. Please check the file for bad data, or redefine the input schema for the file by editing the file format in the UI.;3;defg,234,def
where 3 indicates an error occurred after the third column, and defg, 234, def are the three columns of data from the invalid row.
Note: If you set the file format's Parallel process thread option to any value greater than 0 or {none}, the row number in source file value will be -1.

Configuring the File Format Editor for error handling
Follow these procedures to configure the error-handling options.

To capture data-type conversion or row-format errors
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
   The File Format Editor opens.
3. To capture data-type conversion errors, under the Error Handling properties for Capture data conversion errors, click Yes.
4. To capture errors in row formats, for Capture row format errors click Yes.
5. Click Save or Save & Close.

To write invalid rows to an error file
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
   The File Format Editor opens.
3. Under the Error Handling properties, click Yes for either or both of the Capture data conversion errors or Capture row format errors properties.
4. For Write error rows to file, click Yes.
   Two more fields appear: Error file root directory and Error file name.
5. Type an Error file root directory in which to store the error file.
   If you type a directory path here, then enter only the file name in the Error file name property.
6. Type an Error file name.
   If you leave Error file root directory blank, then type a full path and file name here.
7. Click Save or Save & Close.
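Because the error file is plain, semicolon-delimited text, it can be post-processed with any scripting tool after a job run. The following Python sketch is not part of Data Integrator; it simply illustrates reading entries that follow the five-field layout described under "About the error file", and it assumes the error-message field contains no embedded semicolons. The path d:/acl_work/errors.txt is a hypothetical example.

import sys

def read_error_file(path):
    # Each entry: source file; row number; error text; column number; invalid row data
    entries = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            parts = line.split(";", 4)  # keep the row data intact even if it has semicolons
            if len(parts) == 5:
                src, row, error, column, row_data = parts
                entries.append({"source": src, "row": row, "error": error,
                                "column": column, "data": row_data})
    return entries

for e in read_error_file("d:/acl_work/errors.txt"):  # hypothetical error file name
    print(e["source"], "row", e["row"], "column", e["column"], file=sys.stdout)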


For added flexibility when naming the error file, you can enter a variable that is set to a particular file with full path name. Use variables to specify file names that you cannot otherwise enter, such as those that contain multibyte characters.

To limit the number of invalid rows Data Integrator processes before stopping the job
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
   The File Format Editor opens.
3. Under the Error Handling properties, click Yes for either or both the Capture data conversion errors or Capture row format errors properties.
4. For Maximum errors to stop job, type a number.
   Note: This property was previously known as Bad rows limit.
5. Click Save or Save & Close.

To log data-type conversion warnings in the Data Integrator error log
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
   The File Format Editor opens.
3. Under the Error Handling properties, for Log data conversion warnings, click Yes.
4. Click Save or Save & Close.

To log row-format warnings in the Data Integrator error log
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
   The File Format Editor opens.
3. Under the Error Handling properties, for Log row format warnings, click Yes.
4. Click Save or Save & Close.

To limit the number of warning messages to log
If you choose to log either data-type or row-format warnings, you can limit the total number of warnings to log without interfering with job execution.
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.

   The File Format Editor opens.
3. Under the Error Handling properties, for Log data conversion warnings and/or Log row format warnings, click Yes.
4. For Maximum warnings to log, type a number.
5. Click Save or Save & Close.

Creating COBOL copybook file formats
When creating a COBOL copybook format, you can:

• create just the format, then configure the source after you add the format to a data flow, or
• create the format and associate it with a data file at the same time

This section also describes how to:
• create rules to identify which records represent which schemas using a field ID option
• identify the field that contains the length of the schema's record using a record length field option

To create a new COBOL copybook file format
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click New.
   The Import COBOL copybook window opens.
2. Name the format by typing a name in the Format name field.
3. On the Format tab for File name, specify the COBOL copybook file format to import, which usually has the extension .cpy.
   During design, you can specify a file in one of the following ways:
   • For a file located on the computer where the Designer runs, you can use the Browse button.
   • For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Click OK.
   Data Integrator adds the COBOL copybook to the object library. The COBOL Copybook schema name(s) dialog box displays.
5. If desired, select or double-click a schema name to rename it.
6. Click OK.



When you later add the format to a data flow, you can use the options in the source editor to define the source. See the Data Integrator Reference Guide.

To create a new COBOL copybook file format and a data file
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click New.
   The Import COBOL copybook window opens.
2. Name the format by typing a name in the Format name field.
3. On the Format tab for File name, specify the COBOL copybook file format to import, which usually has the extension .cpy.
   During design, you can specify a file in one of the following ways:
   • For a file located on the computer where the Designer runs, you can use the Browse button.
   • For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Click the Data File tab.
5. For Directory, type or browse to the directory that contains the COBOL copybook data file to import.
   If you include a directory path here, then enter only the file name in the Name field.
6. Specify the COBOL copybook data file Name.
   If you leave Directory blank, then type a full path and file name here.
   During design, you can specify a file in one of the following ways:
   • For a file located on the computer where the Designer runs, you can use the Browse button.
   • For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
7. If the data file is not on the same computer as the Job Server, click the Data Access tab. Select FTP or Custom and enter the criteria for accessing the data file.
   For details on these options, see the Data Integrator Reference Guide.
8. Click OK.
9. The COBOL Copybook schema name(s) dialog box displays. If desired, select or double-click a schema name to rename it.
10. Click OK.


The Field ID tab allows you to create rules for identifying which records represent which schemas.

To create rules to identify which records represent which schemas
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click Edit.
   The Edit COBOL Copybook window opens.
2. In the top pane, select a field to represent the schema.
3. Click the Field ID tab.
4. On the Field ID tab, select the check box Use field <schema name.field name> as ID.
5. Click Insert below to add an editable value to the Values list.
6. Type a value for the field.
7. Continue inserting values as necessary.
8. Select additional fields and insert values as necessary.
9. Click OK.

To identify the field that contains the length of the schema's record
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click Edit.
   The Edit COBOL Copybook window opens.
2. Click the Record Length Field tab.
3. For the schema to edit, click in its Record Length Field column to enable a drop-down menu.
4. Select the field (one per schema) that contains the record's length.
   The offset value automatically changes to the default of 4; however, you can change it to any other numeric value. The offset is the value that results in the total record length when added to the value in the Record length field.
5. Click OK.

For a complete description of all the options available on the Import COBOL copybook or Edit COBOL copybook dialog boxes, see the Data Integrator Reference Guide.
To edit the source, open the source editor; see the Data Integrator Reference Guide.
To see the list of data type conversions between Data Integrator and COBOL copybooks, see the Data Integrator Reference Guide.


File transfers
Data Integrator can read and load files using a third-party file transfer program for flat files. You can use third-party (custom) transfer programs to:

• Incorporate company-standard file-transfer applications as part of Data Integrator job execution
• Provide high flexibility and security for files transferred across a firewall

The custom transfer program option allows you to specify:
• A custom transfer program (invoked during job execution)
• Additional arguments, based on what is available in your program, such as:
  • Connection data
  • Encryption/decryption mechanisms
  • Compression mechanisms

Custom transfer system variables for flat files
When you set Custom Transfer program to YES in the Property column of the file format editor, the following options are added to the column. To view them, scroll the window down.

When you set custom transfer options for external file sources and targets, some transfer information, like the name of the remote server that the file is being transferred to or from, may need to be entered literally as a transfer program argument. You can enter other information using the following Data Integrator system variables:

Data entered for:    Is substituted for this variable if it is defined in the Arguments field
User name            $AW_USER
Password             $AW_PASSWORD
Local directory      $AW_LOCAL_DIR
File(s)              $AW_FILE_NAME


By using these variables as custom transfer program arguments, you can collect connection information entered in Data Integrator and use that data at run-time with your custom transfer program. For example, the following custom transfer options use a Windows command file (Myftp.cmd) with five arguments. Arguments 1 through 4 are Data Integrator system variables:

• User and Password variables are for the external server
• The Local Directory variable is for the location where the transferred files will be stored in Data Integrator
• The File Name variable is for the names of the files to be transferred

Argument 5 provides the literal external server name.

The content of the Myftp.cmd script is as follows: Note: If you do not specify a standard output file (such as ftp.out in the example below), Data Integrator writes the standard output into the job’s trace log.
@echo off
set USER=%1
set PASSWORD=%2
set LOCAL_DIR=%3
set FILE_NAME=%4
set LITERAL_HOST_NAME=%5
set INP_FILE=ftp.inp
echo %USER%>%INP_FILE%
echo %PASSWORD%>>%INP_FILE%
echo lcd %LOCAL_DIR%>>%INP_FILE%
echo get %FILE_NAME%>>%INP_FILE%
echo bye>>%INP_FILE%
ftp -s:%INP_FILE% %LITERAL_HOST_NAME%>ftp.out
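For a UNIX Job Server, or if you prefer not to call the Windows ftp client, an equivalent wrapper can be written in any language, as long as it accepts the same arguments and returns 0 on success and non-zero on failure (see "Design tips" below). The following Python sketch is an illustration only, not something shipped with Data Integrator; it assumes the same five-argument convention as Myftp.cmd.

import os
import sys
from ftplib import FTP

def main():
    # Positional arguments mirror Myftp.cmd: user, password, local directory,
    # file name, and the literal host name passed as the fifth argument.
    user, password, local_dir, file_name, host = sys.argv[1:6]
    ftp = FTP(host)
    ftp.login(user, password)
    with open(os.path.join(local_dir, file_name), "wb") as out:
        ftp.retrbinary("RETR " + file_name, out.write)
    ftp.quit()

if __name__ == "__main__":
    try:
        main()
    except Exception as exc:
        print(exc, file=sys.stderr)
        sys.exit(1)  # non-zero exit tells Data Integrator the transfer failed
    sys.exit(0)      # zero exit indicates success

In the Arguments box you would then pass the same values, for example: $AW_USER $AW_PASSWORD $AW_LOCAL_DIR $AW_FILE_NAME your.ftp.server (the host name here is only a placeholder).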


Custom transfer options for flat files
Of the custom transfer program options, only the Program executable option is mandatory.

Entering User Name, Password, and Arguments values is optional. These options are provided for you to specify arguments that your custom transfer program can process (such as connection data). You can also use Arguments to enable or disable your program's built-in features such as encryption/decryption and compression mechanisms. For example, you might design your transfer program so that when you enter -sSecureTransportOn or -CCompressionYES security or compression is enabled.

Note: Available arguments depend on what is included in your custom transfer program. See your custom transfer program documentation for a valid argument list.

You can use the Arguments box to enter a user name and password. However, Data Integrator also provides separate User name and Password boxes. By entering the $AW_USER and $AW_PASSWORD variables as Arguments and then using the User and Password boxes to enter literal strings, these extra boxes are useful in two ways:

• You can more easily update users and passwords in Data Integrator both when you configure Data Integrator to use a transfer program and when you later export the job. For example, when you migrate the job to another environment, you might want to change login information without scrolling through other arguments.
• You can use the mask and encryption properties of the Password box. Data entered in the Password box is masked in log files and on the screen, stored in the repository, and encrypted by Data Integrator.

Note: Data Integrator sends password data to the custom transfer program in clear text. If you do not allow clear passwords to be exposed as arguments in command-line executables, then set up your custom program to either:

• Pick up its password from a trusted location
• Inherit security privileges from the calling program (in this case, Data Integrator)


Setting custom transfer options
The custom transfer option allows you to use a third-party program to transfer flat file sources and targets. You can configure your custom transfer program in the File Format Editor window. Like other file format settings, you can override custom transfer program settings if they are changed for a source or target in a particular data flow. You can also edit the custom transfer option when exporting a file format.

To configure a custom transfer program in the file format editor
1. Select the Formats tab in the object library.
2. Right-click Flat Files in the tab and select New.
   The File Format Editor opens.
3. Select either the Delimited or the Fixed width file type.
   Note: While the custom transfer program option is not supported with R/3 file types, you can use it as a data transport method for an R/3 data flow. See the Data Integrator Supplement for SAP for more information.
4. Enter a format name.
5. Select Yes for the Custom transfer program option.
6. Enter the custom transfer program name and arguments.
7. Complete the other boxes in the file format editor window. See the Data Integrator Reference Guide for more information.
   In the Data File(s) section, specify the location of the file in Data Integrator.
   To specify system variables for Root directory and File(s) in the Arguments box:
   • Associate the Data Integrator system variable $AW_LOCAL_DIR with the local directory argument of your custom transfer program.


   • Associate the Data Integrator system variable $AW_FILE_NAME with the file name argument of your custom transfer program.
   For example, enter: -l$AW_LOCAL_DIR\$AW_FILE_NAME
   When the program runs, the Root directory and File(s) settings are substituted for these variables and read by the custom transfer program.
   Note: The flag -l used in the example above is a custom program flag. Arguments you can use as custom program arguments in Data Integrator depend upon what your custom transfer program expects.
8. Click Save.

Design tips
Keep the following concepts in mind when using the custom transfer options:

• Variables are not supported in file names when invoking a custom transfer program for the file.
• You can only edit custom transfer options in the File Format Editor (or Datastore Editor in the case of SAP R/3) window before they are exported. You cannot edit updates to file sources and targets at the data flow level when exported. After they are imported, you can adjust custom transfer option settings at the data flow level. They override file format level settings.

When designing a custom transfer program to work with Data Integrator, keep in mind that:

• Data Integrator expects the called transfer program to return 0 on success and non-zero on failure.
• Data Integrator provides trace information before and after the custom transfer program executes. The full transfer program and its arguments with masked password (if any) is written in the trace log. When "Completed Custom transfer" appears in the trace log, the custom transfer program has ended.
• If the custom transfer program finishes successfully (the return code = 0), Data Integrator checks the following:
  • For an R/3 data flow, if the transport file does not exist in the local directory, it throws an error and Data Integrator stops. See the Data Integrator Supplement for SAP for information about file transfers from SAP R/3.
  • For a file source, if the file or files to be read by Data Integrator do not exist in the local directory, Data Integrator writes a warning message into the trace log.


• If the custom transfer program throws an error or its execution fails (return code is not 0), then Data Integrator produces an error with return code and stdout/stderr output.
• If the custom transfer program succeeds but produces standard output, Data Integrator issues a warning, logs the first 1,000 bytes of the output produced, and continues processing.
• The custom transfer program designer must provide valid option arguments to ensure that files are transferred to and from the Data Integrator local directory (specified in Data Integrator). This might require that the remote file and directory name be specified as arguments and then sent to the Data Integrator Designer interface using Data Integrator system variables.

Web log support
Web logs are flat files generated by Web servers and are used for business intelligence. Web logs typically track details of Web site hits such as:

• Client domain names or IP addresses
• User names
• Timestamps
• Requested action (might include search string)
• Bytes transferred
• Referred address
• Cookie ID

Web logs use a common file format and an extended common file format.
Figure 6-1: Common Web Log Format

Figure 6-2: Extended Common Web Log Format

Data Integrator supports both common and extended common Web log formats as sources. The file format editor also supports the following:

• Dash as NULL indicator
• Time zone in date-time, e.g. 01/Jan/1997:13:06:51 –0600
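The figures referenced above show sample log lines. As a rough illustration only (not taken from the product documentation), a common-format entry typically looks like the line parsed below; the extended format appends further fields such as the referred address. The field names in this Python sketch are informal labels, not Data Integrator column names.

import re

# Illustrative parse of one common-format Web log line; "-" marks NULL values.
line = '127.0.0.1 - jsmith [01/Jan/1997:13:06:51 -0600] "GET /index.html HTTP/1.0" 200 2326'
pattern = (r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
           r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
           r'(?P<status>\d{3}) (?P<bytes>\S+)')
match = re.match(pattern, line)
if match:
    fields = {k: (None if v == "-" else v) for k, v in match.groupdict().items()}
    print(fields["host"], fields["timestamp"], fields["bytes"])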


Data Integrator includes several functions for processing Web log data:

• Word_ext function
• Concat_date_time function
• WL_GetKeyValue function

Word_ext function
The word_ext is a Data Integrator string function that extends the word function by returning the word identified by its position in a delimited string. This function is useful for parsing URLs or file names.

Format
word_ext(string, word_number, separator(s))

A negative word number means count from right to left

Examples
word_ext('www.bodi.com', 2, '.') returns 'bodi'.
word_ext('www.cs.wisc.edu', -2, '.') returns 'wisc'.
word_ext('www.cs.wisc.edu', 5, '.') returns NULL.
word_ext('aaa+=bbb+=ccc+zz=dd', 4, '+=') returns 'zz'. If 2 separators are specified (+=), the function looks for either one.
word_ext(',,,,,aaa,,,,bb,,,c ', 2, ',') returns 'bb'. This function skips consecutive delimiters.
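Outside the Data Integrator Scripting Language, the same behavior is easy to reproduce for testing or comparison. The following Python sketch mirrors the rules described above (any of the separator characters splits the string, consecutive delimiters are skipped, and a negative position counts from the right); it is an illustration, not the product's implementation.

import re

def word_ext(text, word_number, separators):
    # Split on any separator character and drop empty pieces,
    # which is what "skips consecutive delimiters" means above.
    words = [w for w in re.split("[" + re.escape(separators) + "]", text) if w]
    if word_number == 0 or abs(word_number) > len(words):
        return None  # corresponds to NULL
    return words[word_number - 1] if word_number > 0 else words[word_number]

# word_ext('aaa+=bbb+=ccc+zz=dd', 4, '+=') -> 'zz'
# word_ext('www.cs.wisc.edu', -2, '.')     -> 'wisc'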

Concat_date_time function
The concat_date_time is a Data Integrator date function that returns a datetime from separate date and time inputs.

Format
concat_date_time(date, time)

Example
concat_date_time(MS40."date",MS40."time")
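The MS40."date" and MS40."time" arguments above are columns from a source schema. Outside Data Integrator, the same idea, combining separate date and time values into one datetime, can be sketched in Python; the literal values below are placeholders only.

from datetime import date, time, datetime

# Combine a separate date and a separate time into a single datetime value.
dt = datetime.combine(date(1997, 1, 1), time(13, 6, 51))
print(dt)  # 1997-01-01 13:06:51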

WL_GetKeyValue function
The WL_GetKeyValue is a custom function (written in the Data Integrator Scripting Language) that returns the value of a given keyword. It is useful for parsing search strings.


Format
WL_GetKeyValue(string, keyword)

Example
A search in Google for bodi B2B is recorded in a Web log as:
GET "http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search"
WL_GetKeyValue('http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search', 'q') returns 'bodi+B2B'.
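WL_GetKeyValue is supplied as a Data Integrator custom function. For comparison only, the same keyword extraction could be sketched outside the product as follows; this is an illustration, not the function's actual implementation.

from urllib.parse import urlparse, parse_qs

def get_key_value(url, keyword):
    # Return the value of `keyword` from the URL's query string.
    # parse_qs decodes '+' as a space, so re-encode to keep the logged form.
    values = parse_qs(urlparse(url).query)
    return values[keyword][0].replace(" ", "+") if keyword in values else None

print(get_key_value(
    "http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search",
    "q"))  # prints bodi+B2B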

Sample Web log formats in Data Integrator
This is a file with a common Web log file format:
[Sample common-format Web log file shown in the original figure.]
This is the file format editor view of this Web log:
[Screenshot of the file format editor for this Web log shown in the original figure.]

This is a representation of a sample data flow for this Web log:
[Sample data flow diagram shown in the original figure.]
Data flows are described in Chapter 7: Data Flows.


Chapter 7: Data Flows

About this chapter
This chapter contains the following topics:
• What is a data flow?
• Data flows as steps in work flows
• Intermediate data sets in a data flow
• Passing parameters to data flows
• Creating and defining data flows
• Source and target objects
• Transforms
• Query transform overview
• Data flow execution
• Audit Data Flow Overview

What is a data flow?
Data flows extract, transform, and load data. Everything having to do with data, including reading sources, transforming data, and loading targets, occurs inside a data flow. The lines connecting objects in a data flow represent the flow of data through data transformation steps.
After you define a data flow, you can add it to a job or work flow. From inside a work flow, a data flow can send and receive information to and from other objects through input and output parameters.

Naming data flows
Data flow names can include alphanumeric characters and underscores (_). They cannot contain blank spaces.

Data flow example
Suppose you want to populate the fact table in your data warehouse with new data from two tables in your source transaction database. Your data flow consists of the following:
• Two source tables
• A join between these tables, defined in a query transform
• A target table where the new rows are placed
You indicate the flow of data through these components by connecting them in the order that data moves through them. The resulting data flow looks like the following: [diagram shown in the original figure]

Steps in a data flow
Each icon you place in the data flow diagram becomes a step in the data flow. This chapter discusses objects that you can use as steps in a data flow:
• Source and target objects
• Transforms
The connections you make between the icons determine the order in which Data Integrator completes the steps.

Data flows as steps in work flows
Data flows are closed operations, even when they are steps in a work flow. Data sets created within a data flow are not available to other steps in the work flow.
A work flow does not operate on data sets and cannot provide more data to a data flow. However, a work flow can do the following:
• Call data flows to perform data movement operations
• Define the conditions appropriate to run data flows
• Pass parameters to and from data flows

Intermediate data sets in a data flow
Each step in a data flow—up to the target definition—produces an intermediate result (for example, the results of a SQL statement containing a WHERE clause), which flows to the next step in the data flow. The intermediate result consists of a set of rows from the previous operation and the schema in which the rows are arranged. This result is called a data set.
This data set may, in turn, be further "filtered" and directed into yet another data set.

Operation codes
Each row in a data set is flagged with an operation code that identifies the status of the row. The operation codes are as follows:
• NORMAL — Creates a new row in the target. All rows in a data set are flagged as NORMAL when they are extracted from a source. If a row is flagged as NORMAL when loaded into a target, it is inserted as a new row in the target.
• INSERT — Creates a new row in the target. Rows can be flagged as INSERT by transforms in the data flow to indicate that a change occurred in a data set as compared with an earlier image of the same data set. The change is recorded in the target separately from the existing data.
• DELETE — Is ignored by the target. Rows flagged as DELETE are not loaded. Rows can be flagged as DELETE only by the Map_Operation transform.
• UPDATE — Overwrites an existing row in the target. Rows can be flagged as UPDATE by transforms in the data flow to indicate that a change occurred in a data set as compared with an earlier image of the same data set. The change is recorded in the target in the same row as the existing data.

Passing parameters to data flows
Data does not flow outside a data flow, not even when you add a data flow to a work flow. You can, however, pass parameters into and out of a data flow. Parameters evaluate single values rather than sets of values.
When a data flow receives parameters, the steps inside the data flow can reference those parameters as variables. Parameters make data flow definitions more flexible. For example, a parameter can indicate the last time a fact table was updated. You can use this value in a data flow to extract only rows modified since the last update. The following figure shows the parameter last_update used in a query to determine the data set used to load the fact table. [Figure shown in the original.]

For more information about parameters, see "Variables and Parameters" on page 295.

Creating and defining data flows
You can create data flows using objects from:
• The object library
• The tool palette
After creating a data flow, you can change its properties. For details, see "To change properties of a data flow" on page 177.

To define a new data flow using the object library
1. In the object library, go to the Data Flows tab.
2. Select the data flow category, right-click and select New.
3. Select the new data flow.
4. Drag the data flow into the workspace for a job or a work flow.
5. Add the sources, transforms, and targets you need.

To define a new data flow using the tool palette
1. Select the data flow icon in the tool palette.
2. Click the workspace for a job or work flow to place the data flow.
   You can add data flows to batch and real-time jobs. When you drag a data flow icon into a job, you are telling Data Integrator to validate these objects according to the requirements of the job type (either batch or real-time).
3. Add the sources, transforms, and targets you need.

To change properties of a data flow
1. Right-click the data flow and select Properties.
   The Properties window opens for the data flow.
2. You can change the following properties of a data flow:
   a. Execute only once
      When you specify that a data flow should only execute once, a batch job will never re-execute that data flow after the data flow completes successfully, even if the data flow is contained in a work flow that is a recovery unit that re-executes. Business Objects recommends that you do not mark a data flow as Execute only once if a parent work flow is a recovery unit. For more information about how Data Integrator processes data flows with multiple conditions such as execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.
   b. Use database links
      Database links are communication paths between one database server and another. Database links allow local users to access data on a remote database, which can be on the local or a remote computer of the same or different database type. For more information, see the Data Integrator Performance Optimization Guide.
   c. Degree of parallelism
      Degree Of Parallelism (DOP) is a property of a data flow that defines how many times each transform within a data flow replicates to process a parallel subset of data. For more information, see the Data Integrator Performance Optimization Guide.
   d. Cache type
      You can cache data to improve performance of operations such as joins, groups, sorts, filtering, lookups, and table comparisons. You can select one of the following values for the Cache type option on your data flow Properties window:
      • In-Memory—Choose this value if your data flow processes a small amount of data that can fit in the available memory.
      • Pageable—This value is the default.
      For more information, see the Data Integrator Performance Optimization Guide.
3. Click OK.

Source and target objects
A data flow directly reads and loads data using two types of objects:
Source objects — Define sources from which you read data

• Target objects — Define targets to which you write (or load) data

Source objects
Source objects represent the data sources that data flows read from:
• Table: A file formatted with columns and rows as used in relational databases. Data Integrator access: direct or through adapter.
• Template table: A template table that has been created and saved in another data flow (used in development). Access: direct. See "Template tables" on page 181.
• File: A delimited or fixed-width flat file. Access: direct.
• Document: A file with an application-specific format (not readable by SQL or XML parser). Access: through adapter.
• XML file: A file formatted with XML tags. Access: direct.
• XML message: Used as a source in real-time jobs (see "Real-time source and target objects" on page 266). Access: direct.
If you have the SAP licensed extension, you can also use IDocs as sources. For more information, see the Data Integrator Supplement for SAP.

Target objects
Target objects represent the data targets that can be written to in data flows:
• Table: A file formatted with columns and rows as used in relational databases. Access: direct or through adapter.
• Template table: A table whose format is based on the output of the preceding transform (used in development). Access: direct.
• File: A delimited or fixed-width flat file. Access: direct.
• Document: A file with an application-specific format (not readable by SQL or XML parser). Access: through adapter.
• XML file: A file formatted with XML tags. Access: direct.

• XML template file: An XML file whose format is based on the preceding transform output (used in development, primarily for debugging data flows). Access: direct.
• XML message: See "Real-time source and target objects" on page 266.
• Outbound message: See "Real-time source and target objects" on page 266.
If you have the SAP licensed extension, you can also use IDocs as targets. For more information, see the Data Integrator Supplement for SAP.

Adding source or target objects to data flows
Fulfill the following prerequisites before using a source or target object in a data flow:
• Tables accessed directly from a database: define a database datastore and import table metadata. See "Database datastores" on page 81.
• Template tables: define a database datastore. See "Template tables" on page 181.
• Files: define a file format and import the file. See "File Formats" on page 135.
• XML files and messages: import an XML file format. See "To import a DTD or XML Schema format" on page 233.
• Objects accessed through an adapter: define an adapter datastore and import object metadata. See "Adapter datastores" on page 111.

To add a source or target object to a data flow
1. Open the data flow in which you want to place the object.
2. If the object library is not already open, select Tools > Object Library to open it.
3. Select the appropriate object library tab:
   • Formats tab for flat files, DTDs, or XML Schemas
   • Datastores tab for database and adapter objects

4. Select the object you want to add as a source or target. (Expand collapsed lists by clicking the plus sign next to a container icon.)
   For a new template table, select the Template Table icon from the tool palette.
   For a new XML template file, select the Template XML icon from the tool palette.
5. Drop the object in the workspace.
6. For objects that can be either sources or targets, when you release the cursor, a popup menu appears. Select the kind of object to make.
   For new template tables and XML template files, when you release the cursor, a secondary window appears. Enter the requested information for the new template object. Names can include alphanumeric characters and underscores (_). Template tables cannot have the same name as an existing table within a datastore.
7. The source or target object appears in the workspace.
8. Click the object name in the workspace. Data Integrator opens the editor for the object. Set the options you require for the object.
Note: Ensure that any files that reference flat file, DTD, or XML Schema formats are accessible from the Job Server where the job will be run, and specify the file location relative to this computer.

Template tables
During the initial design of an application, you might find it convenient to use template tables to represent database tables. With template tables, you do not have to initially create a new table in your DBMS and import the metadata into Data Integrator. Instead, Data Integrator automatically creates the table in the database with the schema defined by the data flow when you execute a job.
After creating a template table as a target in one data flow, you can use it as a source in other data flows. Though a template table can be used as a source table in multiple data flows, it can only be used as a target in one data flow.

To create a target template table
1. Use one of the following methods to open the Create Template window:
   • From the tool palette:
     a. Click the template table icon.

     b. Click inside a data flow to place the template table in the workspace.
     c. On the Create Template window, select a datastore.
   • From the object library:
     a. Expand a datastore.
     b. Click the template table icon and drag it to the workspace.
2. On the Create Template window, enter a table name.
3. Click OK.
   The table appears in the workspace as a template table icon.
4. Connect the template table to the data flow as a target (usually a Query transform).
5. In the Query transform, map the Schema In columns that you want to include in the target table.
6. From the Project menu select Save.
   In the workspace, the template table's icon changes to a target table icon and the table appears in the object library under the datastore's list of tables.

Template tables are particularly useful in early application development when you are designing and testing a project. If you modify and save the data transformation operation in the data flow where the template table is a target, the schema of the template table automatically changes. Any updates to the schema are automatically made to any other instances of the template table. During the validation process, Data Integrator warns you of any errors, such as those resulting from changing the schema.

After you are satisfied with the design of your data flow, save it. When the job is executed, Data Integrator uses the template table to create a new table in the database you specified when you created the template table. Once a template table is created in the database, you can convert the template table in the repository to a regular table.

Note that you must convert template tables to take advantage of some features such as bulk loading. Other features, such as exporting an object, are available for template tables. Once a template table is converted, you can no longer alter the schema.

To convert a template table into a regular table from the object library
1. Open the object library and go to the Datastores tab.
2. Click the plus sign (+) next to the datastore that contains the template table you want to convert.

   A list of objects appears.
3. Click the plus sign (+) next to Template Tables.
   The list of template tables appears.
4. Right-click a template table you want to convert and select Import Table.
   Data Integrator converts the template table in the repository into a regular table by importing it from the database.
   To update the icon in all data flows, choose View > Refresh. In the datastore object library, the table is now listed under Tables rather than Template Tables.

To convert a template table into a regular table from a data flow
1. Open the data flow containing the template table.
2. Right-click on the template table you want to convert and select Import Table.

After a template table is converted into a regular table, you can no longer change the table's schema.

Transforms
Data Integrator includes objects called transforms. Transforms operate on data sets: they manipulate input sets and produce one or more output sets. By contrast, functions operate on single values in specific columns in a data set.
Data Integrator includes many built-in transforms. These transforms are available from the object library on the Transforms tab. See the Data Integrator Reference Guide for detailed information.
• Case: Simplifies branch logic in data flows by consolidating case or decision making logic in one transform. Paths are defined in an expression table.
• Data_Transfer: Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
• Date_Generation: Generates a column filled with date values based on the start and end dates and increment that you provide.
• Effective_Date: Generates an additional "effective to" column based on the primary key's "effective date."

• Hierarchy_Flattening: Flattens hierarchical data into relational tables so that it can participate in a star schema. Hierarchy flattening can be both vertical and horizontal.
• History_Preserving: Converts rows flagged as UPDATE to UPDATE plus INSERT, so that the original values are preserved in the target. You specify in which column to look for updated data.
• Key_Generation: Generates new keys for source data, starting from a value based on existing keys in the table you specify.
• Map_CDC_Operation: Sorts input data, maps output data, and resolves before- and after-images for UPDATE rows. While commonly used to support Oracle changed-data capture, this transform supports any data stream if its input requirements are met.
• Map_Operation: Allows conversions between operation codes.
• Merge: Unifies rows from two or more sources into a single target.
• Pivot (Columns to Rows): Rotates the values in specified columns to rows. (Also see Reverse Pivot.)
• Query: Retrieves a data set that satisfies conditions that you specify. A query transform is similar to a SQL SELECT statement.
• Reverse Pivot (Rows to Columns): Rotates the values in specified rows to columns.
• Row_Generation: Generates a column filled with int values starting at zero and incrementing by one to the end value you specify.
• SQL: Performs the indicated SQL query operation.
• Table_Comparison: Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT and UPDATE.
• Validation: Ensures that the data at any stage in the data flow meets your criteria. You can filter out or replace data that fails your criteria.
• XML_Pipeline: Processes large XML inputs in small batches.

Transform editors
Transform editor layouts vary. The transform you will use most often is the Query transform, which has two panes:
• An input schema area and/or output schema area
• An options area (or parameters area) that allows you to set all the values the transform requires

Here is an example of the Query transform editor, showing the input schema area, the output schema area, and the options area.

Adding transforms to data flows
Use the Designer to add transforms to data flows.

To add a transform to a data flow
1. Open a data flow object.
   From the work flow, click the data flow name. Alternatively, from the object library, click the Data Flow tab and double-click the data flow name.
2. Open the object library if it is not already open.
3. Go to the Transforms tab.
4. Select the transform you want to add to the data flow.
5. Drag the transform icon into the data flow workspace.
6. Draw the data flow connections.
   To connect a source to a transform, click the arrow on the right edge of the source and drag the cursor to the arrow on the left edge of the transform.

   The input for the transform might be the output from another transform or the output from a source; or, the transform may not require source data. You can connect the output of the transform to the input of another transform or target.
7. Continue connecting inputs and outputs as required for the transform.
8. Click the name of the transform.
   This opens the transform editor, which lets you complete the definition of the transform. Enter option values.
   To specify a data column as a transform option, enter the column name as it appears in the input schema or drag the column name from the input schema into the option box.
   To specify dates or times as option values, use the following formats:
   • yyyy.mm.dd, where yyyy represents a four-digit year, mm represents a two-digit month, and dd represents a two-digit day
   • hh.mi.ss, where hh represents the two hour digits, mi the two minute digits, and ss the two second digits of a time

Query transform overview
The Query transform is by far the most commonly used transform, so this section provides an overview. For a full description, see the Data Integrator Reference Guide.
The Query transform can perform the following operations:
• Choose (filter) the data to extract from sources
• Join data from multiple sources
• Map columns from input to output schemas
• Perform transformations and functions on the data
• Perform data nesting and unnesting (see "Nested Data" on page 215)
• Add new columns, nested schemas, and function results to the output schema
• Assign primary keys to output columns

Adding a Query transform to a data flow
Because it is so commonly used, the Query transform icon is included in the tool palette, providing an easier way to add a Query transform.

To add a Query transform to a data flow
1. Click the Query icon in the tool palette.
2. Click anywhere in a data flow workspace.
3. Connect the Query to inputs and outputs.
   • The inputs for a Query can include the output from another transform or the output from a source.
   • The outputs from a Query can include input to another transform or input to a target.
   If you connect a target table to a Query with an empty output schema, Data Integrator automatically fills the Query's output schema with the columns from the target table, without mappings.

Query editor
The query editor, a graphical interface for performing query operations, contains the following areas:
• Input schema area (upper left)
• Output schema area (upper right)
• Parameters area (lower tabbed area)
The "i" icon indicates tabs containing user-defined entries.
The input and output schema areas can contain:
• Columns
• Nested schemas
• Functions (output only)
The Schema In and Schema Out lists display the currently selected schema in each area. The currently selected output schema is called the current schema and determines:
• The output elements that can be modified (added, mapped, or deleted)
• The scope of the Select through Order by tabs in the parameters area
The current schema is highlighted while all other (non-current) output schemas are gray.

To change the current schema
You can change the current schema in the following ways:
• Select a schema from the Output list.

• Right-click a schema, column, or function in the output schema area and select Make Current.
• Double-click one of the non-current (grayed-out) elements in the output schema area.

To modify output schema contents
You can modify the output schema in several ways:
• Drag and drop (or copy and paste) columns or nested schemas from the input schema area to the output schema area to create simple mappings.
• Use right-click menu options on output elements to:
  • Add new output columns and schemas
  • Use adapter functions or (with the SAP license extension) SAP R/3 functions to generate new output columns
  • Assign or reverse primary key settings on output columns. Primary key columns are flagged by a key icon.
  • Unnest or re-nest schemas.
• Use the Mapping tab to provide complex column mappings. Drag and drop input schemas and columns into the output schema to enable the editor. Use the function wizard and the expression editor to build expressions. When the text editor is enabled, you can access these features using the buttons above the editor.
  Note: You cannot add comments to a mapping clause in a Query transform. For example, the following syntax is not supported on the Mapping tab:
  table.column # comment
  The job will not run and you cannot successfully export it. Use the object description or workspace annotation feature instead.

• Use the Select through Order By tabs to provide additional parameters for the current schema (similar to SQL SELECT statement clauses). You can drag and drop schemas and columns into these areas.
  • Select: Specifies whether to output only distinct rows (discarding any identical duplicate rows).
  • From: Specifies all input schemas that are used in the current schema.
  • Outer Join: Specifies an inner table and outer table for any joins (in the Where sheet) that are to be treated as outer joins.
  • Where: Specifies conditions that determine which rows are output. The syntax is like an SQL SELECT WHERE clause, for example:
    TABLE1.EMPNO = TABLE2.EMPNO AND TABLE1.EMPNO > 1000 OR TABLE2.EMPNO < 9000
    Use the buttons above the editor to build expressions.
  • Group By: Specifies how the output rows are grouped (if required).
  • Order By: Specifies how the output rows are sequenced (if required).
• Use the Search tab to locate input and output elements containing a specific word or term.

Data flow execution
A data flow is a declarative specification from which Data Integrator determines the correct data to process. For example, in data flows placed in batch jobs, the transaction order is to extract, transform, then load data into a target. Data flows are similar to SQL statements: the specification declares the desired output.

Data Integrator executes a data flow each time the data flow occurs in a job. However, you can specify that a batch job execute a particular data flow only one time. In that case, Data Integrator only executes the first occurrence of the data flow; Data Integrator skips subsequent occurrences in the job. You might use this feature when developing complex batch jobs with multiple paths, such as jobs with try/catch blocks or conditionals, and you want to ensure that Data Integrator only executes a particular data flow one time. See "Creating and defining data flows" on page 176 for information on how to specify that a job execute a data flow only one time.

The following sections provide an overview of advanced features for data flows:
• "Push down operations to the database server" on page 192
• "Distributed data flow execution" on page 192
• "Load balancing" on page 193
• "Caches" on page 194

Push down operations to the database server
From the information in the data flow specification, Data Integrator produces output while optimizing performance. For example, for SQL sources and targets, Data Integrator creates database-specific SQL statements based on a job's data flow diagrams. To optimize performance, Data Integrator pushes down as many transform operations as possible to the source or target database and combines as many operations as possible into one request to the database. For example, Data Integrator tries to push down joins and function evaluations. By pushing down operations to the database, Data Integrator reduces the number of rows and operations that the Data Integrator engine must process.

Data flow design influences the number of operations that Data Integrator can push to the source or target database. Before running a job, you can examine the SQL that Data Integrator generates and alter your design to produce the most efficient results. For more information, see the Data Integrator Performance Optimization Guide.

You can use the Data_Transfer transform to push down resource-intensive operations anywhere within a data flow to the database. Resource-intensive operations include joins, GROUP BY, ORDER BY, and DISTINCT. For more information, see the Data Integrator Reference Guide.

Distributed data flow execution
Data Integrator provides capabilities to distribute CPU-intensive and memory-intensive data processing work (such as join, grouping, table comparison and lookups) across multiple processes and computers. This work distribution provides the following potential benefits:
• Better memory management by taking advantage of more CPU resources and physical memory
• Better job performance and scalability by using concurrent sub data flow execution to take advantage of grid computing

You can create sub data flows so that Data Integrator does not need to process the entire data flow in memory at one time. You can also distribute the sub data flows to different job servers within a server group to use additional memory and CPU resources.

Use the following features to split a data flow into multiple sub data flows:
• Run as a separate process option on resource-intensive operations that include the following:
  • Hierarchy_Flattening transform
  • Query operations that are CPU-intensive and memory-intensive:
    • DISTINCT
    • GROUP BY
    • Join
    • ORDER BY
  • Table_Comparison transform
  • Lookup_ext function
  • Count_distinct function
  If you select the Run as a separate process option for multiple operations in a data flow, Data Integrator splits the data flow into smaller sub data flows that use separate resources (memory and computer) from each other. When you specify multiple Run as a separate process options, the sub data flow processes run in parallel. For more information and usage scenarios for separate processes, see the Data Integrator Performance Optimization Guide.
• Data_Transfer transform
  With this transform, Data Integrator does not need to process the entire data flow on the Job Server computer. Instead, the Data_Transfer transform can push down the processing of a resource-intensive operation to the database server. This transform splits the data flow into two sub data flows and transfers the data to a table in the database server to enable Data Integrator to push down the operation. For more information, see the Data Integrator Performance Optimization Guide.

Load balancing
You can distribute the execution of a job or a part of a job across multiple Job Servers within a Server Group to better balance resource-intensive operations. You can specify the following values on the Distribution level option when you execute a job:

• Job level: A job can execute on an available Job Server.
• Data flow level: Each data flow within a job can execute on an available Job Server.
• Sub data flow level: A resource-intensive operation (such as a sort, table comparison, or table lookup) within a data flow can execute on an available Job Server.
For more information, see the Data Integrator Performance Optimization Guide.

Caches
Data Integrator provides the option to cache data in memory to improve operations such as the following in your data flows:
• Joins — Because an inner source of a join must be read for each row of an outer source, you might want to cache a source when it is used as an inner source in a join.
• Table comparisons — Because a comparison table must be read for each row of a source, you might want to cache the comparison table.
• Lookups — Because a lookup table might exist on a remote database, you might want to cache it in memory to reduce access times.
Data Integrator provides the following types of caches that your data flow can use for all of the operations it contains:
• In-memory: Use in-memory cache when your data flow processes a small amount of data that fits in memory.
• Pageable cache: Use a pageable cache when your data flow processes a very large amount of data that does not fit in memory.
If you split your data flow into sub data flows that each run on a different Job Server, each sub data flow can use its own cache type. For more information, see the Data Integrator Performance Optimization Guide.

Audit Data Flow Overview
You can audit objects within a data flow to collect run time audit statistics. You can perform the following tasks with this auditing feature:
• Collect audit statistics about data read into a Data Integrator job, processed by various transforms, and loaded into targets.

• Define rules about the audit statistics to determine if the correct data is processed.
• Generate notification of audit failures.
• Query the audit statistics that persist in the Data Integrator repository.
For a full description of auditing data flows, see "Using Auditing" on page 362.


Work Flows

About this chapter
This chapter contains the following topics:
• What is a work flow?
• Steps in a work flow
• Order of execution in work flows
• Example of a work flow
• Creating work flows
• Conditionals
• While loops
• Try/catch blocks
• Scripts

What is a work flow?
A work flow defines the decision-making process for executing data flows. For example, elements in a work flow can determine the path of execution based on a value set by a previous job or can indicate an alternative path if something goes wrong in the primary path. Ultimately, the purpose of a work flow is to prepare for executing data flows and to set the state of the system after the data flows are complete.

Jobs (introduced in Chapter 4: Projects) are special work flows. Jobs are special because you can execute them. Almost all of the features documented for work flows also apply to jobs, with one exception: jobs do not have parameters.

Steps in a work flow
Work flow steps take the form of icons that you place in the work space to create a work flow diagram. The following objects can be elements in work flows:
• Work flows

• Data flows
• Scripts
• Conditionals
• While loops
• Try/catch blocks
Work flows can call other work flows, and you can nest calls to any depth. A work flow can also call itself. The connections you make between the icons in the workspace determine the order in which work flows execute, unless the jobs containing those work flows execute in parallel.

Here is the diagram for a work flow that calls three data flows:
Note that Data_Flow1 has no connection from the left but is connected on the right to the left edge of Data_Flow2 and that Data_Flow2 is connected to Data_Flow3. There is a single thread of control connecting all three steps. Execution begins with Data_Flow1 and continues through the three data flows.

Order of execution in work flows
Steps in a work flow execute in a left-to-right sequence indicated by the lines connecting the steps. Connect steps in a work flow when there is a dependency between the steps. If there is no dependency, the steps need not be connected. In that case, Data Integrator can execute the independent steps in the work flow as separate processes. In the following work flow, Data Integrator executes data flows 1 through 3 in parallel:

To execute more complex work flows in parallel, define each sequence as a separate work flow, then call each of the work flows from another work flow, as in the following example:
Define work flow A
Define work flow B
Call work flows A and B from work flow C

You can specify that a job execute a particular work flow or data flow only one time. In that case, Data Integrator only executes the first occurrence of the work flow or data flow; Data Integrator skips subsequent occurrences in the job. You might use this feature when developing complex jobs with multiple paths, such as jobs with try/catch blocks or conditionals, and you want to ensure that Data Integrator only executes a particular work flow or data flow one time.

Example of a work flow
Suppose you want to update a fact table. You define a data flow in which the actual data transformation takes place. However, before you move data from the source, you want to determine when the fact table was last updated so that you only extract rows that have been added or changed since that date.

You need to write a script to determine when the last update was made. You can then pass this date to the data flow as a parameter.

In addition, you want to check that the data connections required to build the fact table are active when data is read from them. To do this in Data Integrator, you define a try/catch block. If the connections are not active, the catch runs a script you wrote, which automatically sends mail notifying an administrator of the problem.
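As a rough sketch, the script that determines the last update might look like the following, using the sql function described in the Data Integrator Reference Guide. The datastore, table, and variable names here (fact_ds, fact_status, $last_update) are hypothetical and used only for illustration:

# Read the most recent update time for the fact table into a variable.
$last_update = sql('fact_ds', 'SELECT MAX(last_update_date) FROM fact_status');

The work flow can then pass $last_update to the data flow as the parameter that limits which rows are extracted from the source.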

Scripts and error detection cannot execute in the data flow. Rather, they are steps of a decision-making process that influences the data flow. This decision-making process is defined as a work flow, which looks like the following:

Data Integrator executes these steps in the order that you connect them.

Creating work flows
You can create work flows using one of two methods:
• Object library
• Tool palette
After creating a work flow, you can specify that a job only execute the work flow one time. If more than one instance of a work flow appears in a job, you can improve execution performance by running the work flow only one time, even if the work flow appears in the job multiple times.

To create a new work flow using the object library
1. Open the object library.
2. Go to the Work Flows tab.
3. Right-click and choose New.
4. Drag the work flow into the diagram.
5. Add the data flows, work flows, conditionals, try/catch blocks, and scripts that you need.

To create a new work flow using the tool palette
1. Select the work flow icon in the tool palette.
2. Click where you want to place the work flow in the diagram.

To specify that a job executes the work flow one time
When you specify that a work flow should only execute once, a job will never re-execute that work flow after the work flow completes successfully, even if the work flow is contained in a work flow that is a recovery unit that re-executes. Business Objects recommends that you not mark a work flow as Execute only once if the work flow or a parent work flow is a recovery unit.

1. Right-click the work flow and select Properties.
   The Properties window opens for the work flow.
2. Select the Execute only once check box.
3. Click OK.
For more information about how Data Integrator processes work flows with multiple conditions like execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.

Conditionals
Conditionals are single-use objects used to implement if/then/else logic in a work flow. Conditionals and their components (if expressions, then and else diagrams) are included in the scope of the parent control flow's variables and parameters.

To define a conditional, you specify a condition and two logical branches:
• If: A Boolean expression that evaluates to TRUE or FALSE. You can use functions, variables, and standard operators to construct the expression.
• Then: Work flow elements to execute if the If expression evaluates to TRUE.
• Else: (Optional) Work flow elements to execute if the If expression evaluates to FALSE.
Define the Then and Else branches inside the definition of the conditional.
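For instance, an If expression can combine a variable and a function call. The following sketch assumes a hypothetical global variable $G_transfer_ok (set by an earlier script) and a hypothetical file path, and uses the file_exists function:

($G_transfer_ok = 1) and (file_exists('D:/data/orders.dat') <> 0)

If this expression evaluates to TRUE, the Then branch runs; otherwise the Else branch, if defined, runs.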

A conditional can fit in a work flow. Suppose you use a Windows command file to transfer data from a legacy system into Data Integrator. You write a script in a work flow to run the command file and return a success flag. You then define a conditional that reads the success flag to determine if the data is available for the rest of the work flow.

To implement this conditional in Data Integrator, you define two work flows, one for each branch of the conditional. If the elements in each branch are simple, you can define them in the conditional editor itself. The definition of the conditional shows the two branches as follows:

Work flow executed when If is TRUE
Work flow executed when If is FALSE

Both the Then and Else branches of the conditional can contain any object that you can have in a work flow, including other work flows, nested conditionals, try/catch blocks, and so on.

To define a conditional
1. Define the work flows that are called by the Then and Else branches of the conditional.
   Business Objects recommends that you define, test, and save each work flow as a separate object rather than constructing these work flows inside the conditional editor.
2. Open the work flow in which you want to place the conditional.
3. Click the icon for a conditional in the tool palette.
4. Click the location where you want to place the conditional in the diagram.
   The conditional appears in the diagram.
5. Click the name of the conditional to open the conditional editor.
6. Click if.
7. Enter the Boolean expression that controls the conditional.
   Continue building your expression. You might want to use the function wizard or smart editor.
8. After you complete the expression, click OK.
9. Add your predefined work flow to the Then box.
   To add an existing work flow, open the object library to the Work Flows tab, select the desired work flow, then drag it into the Then box.

10. (Optional) Add your predefined work flow to the Else box.
    If the If expression evaluates to FALSE and the Else box is blank, Data Integrator exits the conditional and continues with the work flow.
11. After you complete the conditional, choose Debug > Validate.
    Data Integrator tests your conditional for syntax errors and displays any errors encountered.
12. Click the Back button to return to the work flow that calls the conditional.
The conditional is now defined.

While loops
Use a while loop to repeat a sequence of steps in a work flow as long as a condition is true.
This section discusses:
• Design considerations
• Defining a while loop
• Using a while loop with View Data

Design considerations
The while loop is a single-use object that you can use in a work flow. The while loop repeats a sequence of steps as long as a condition is true.
(Figure: while the condition "number != 0" is TRUE, Step 1 executes and the condition is tested again; when it is FALSE, execution continues with Step 2.)

Typically, the steps done during the while loop result in a change in the condition so that the condition is eventually no longer satisfied and the work flow exits from the while loop. If the condition does not change, the while loop will not end.

For example, you might want a work flow to wait until the system writes a particular file. You can use a while loop to check for the existence of the file using the file_exists function. As long as the file does not exist, you can have the work flow go into sleep mode for a particular length of time, say one minute, before checking again.

Because the system might never write the file, you must add another check to the loop, such as a counter, to ensure that the while loop eventually exits. In other words, change the while loop to check for the existence of the file and the value of the counter. As long as the file does not exist and the counter is less than a particular value, repeat the while loop. In each iteration of the loop, put the work flow in sleep mode and then increment the counter.
(Figure: while "file_exists(temp.txt) = 0 and counter < 10" is TRUE, the loop runs sleep(60000) and then counter = counter + 1; when FALSE, the work flow continues.)

Defining a while loop
You can define a while loop in any work flow.

To define a while loop
1. Open the work flow where you want to place the while loop.

2. Click the while loop icon on the tool palette.
3. Click the location where you want to place the while loop in the workspace diagram.
   The while loop appears in the diagram.
4. Click the while loop to open the while loop editor.
5. In the While box at the top of the editor, enter the condition that must apply to initiate and repeat the steps in the while loop.
   Alternatively, click to open the expression editor, which gives you more space to enter an expression and access to the function wizard. Click OK after you enter an expression in the editor.
6. Add the steps you want completed during the while loop to the workspace in the while loop editor.
   You can add any objects valid in a work flow, including scripts, work flows, and data flows.
   Note: Although you can include the parent work flow in the while loop, recursive calls can create an infinite loop.
7. Connect these objects to represent the order that you want the steps completed.
   After defining the steps in the while loop, choose Debug > Validate. Data Integrator tests your definition for syntax errors and displays any errors encountered.
8. Close the while loop editor to return to the calling work flow.
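As a sketch of how the pieces of the file-polling example above fit together, the While box might contain a condition like the following (the file name is hypothetical), and a single script object inside the loop performs the wait and increment:

While box condition:
file_exists('D:/data/temp.txt') = 0 and $counter < 10

Script step inside the loop:
# Wait one minute, then increment the counter before the condition is tested again.
sleep(60000);
$counter = $counter + 1;

You would typically declare $counter as a variable in the calling work flow and initialize it (for example, to 0) in a script placed before the while loop.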

Using a while loop with View Data
When using View Data, a job stops when Data Integrator has retrieved the specified number of rows for all scannable objects. Depending on the design of your job, Data Integrator might not complete all iterations of a while loop if you run a job in view data mode:
• If the while loop contains scannable objects and there are no scannable objects outside the while loop (for example, if the while loop is the last object in a job), the job will complete after the scannable objects in the while loop are satisfied, possibly after the first iteration of the while loop.
• If there are scannable objects after the while loop, the while loop will complete normally. Scanned objects in the while loop will show results from the last iteration.
• If there are no scannable objects following the while loop but there are scannable objects completed in parallel to the while loop, the job will complete as soon as all scannable objects are satisfied. The while loop might complete any number of iterations.

Try/catch blocks
A try/catch block is a combination of one try object and one or more catch objects that allow you to specify alternative work flows if errors occur while Data Integrator is executing a job. Try/catch blocks:
• "Catch" classes of exceptions "thrown" by Data Integrator, the DBMS, or the operating system
• Apply solutions that you provide
• Continue execution
Try and catch objects are single-use objects.
Here's the general method to implement exception handling:
• Insert a try object before the steps for which you are handling errors.
• Insert one or more catches in the work flow after the steps.
• In each catch, do the following:
  • Indicate the group of errors that you want to catch.
  • Define the work flows that a thrown exception executes.
If an exception is thrown during the execution of a try/catch block and if no catch is looking for that exception, then the exception is handled by normal error logic.
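For instance, a catch work flow often contains nothing more than a script that notifies an administrator. The following is a minimal sketch; the recipient address and message text are placeholders, and you should verify the exact arguments of the mail_to function in the Data Integrator Reference Guide for your version:

# Record the problem in the trace log and notify an administrator.
print('An exception was caught; notifying the administrator.');
mail_to('admin@example.com', 'Data Integrator job failure', 'A connection error was caught during the nightly load. Check the error log.', 10, 10);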

The following work flow shows a try/catch block surrounding a data flow:
In this case, if the data flow BuildTable causes any system-generated exceptions handled by the catch, then the work flow defined in the catch executes.
The action initiated by the catch can be simple or complex. Here are some examples of possible exception actions:
• Send a prepared e-mail message to a system administrator.
• Rerun a failed work flow or data flow.
• Run a scaled-down version of a failed work flow or data flow.

To define a try/catch block
1. Define the work flow that is called by each catch you expect to define.
   We recommend that you define, test, and save each work flow as a separate object rather than constructing these work flows inside the catch editor.
2. Open the work flow that includes the try/catch block.
3. Click the try icon in the tool palette.
4. Click the location where you want to place the try in the diagram.
   The try icon appears in the diagram.
   Note: There is no editor for a try; the try merely initiates the try/catch block.
5. Click the catch icon in the tool palette.
6. Click the location where you want to place the catch in the diagram.
   The catch icon appears in the diagram.
7. Connect the try and catch to the objects they enclose.
8. Click the name of the catch object to open the catch editor.

   The catch editor shows the Available exceptions list, the Chosen exceptions list, and the Catch work flow box. Each catch supports one exception group selection.
9. Select a group of exceptions from the list of Available Exceptions. (For a complete list of available exceptions, see "Categories of available exceptions" on page 210.)
10. Click Set.
11. Repeat the previous two steps until you have chosen all of the exception groups for this catch.
12. Add your predefined work flow to the catch work flow box.
    To add an existing work flow, open the object library to the Work Flows tab, select the desired work flow, and then drag it into the box.
    If any error in the exception group listed in the catch occurs during the execution of this try/catch block, Data Integrator executes the catch work flow.
13. After you have completed the catch, choose Debug > Validate.
    Data Integrator tests your definition for syntax errors and displays any errors encountered.
14. Click the Back button to return to the work flow that calls the catch.
15. Repeat steps 9 to 14 for each catch in the work flow.

Categories of available exceptions
Categories of available exceptions include:
• ABAP generation errors
• Database access errors

• Email errors
• Engine abort errors
• Execution errors
• File access errors
• Connection and bulk loader errors
• Parser errors
• R/3 execution errors
• Predefined transform errors
• Repository access errors
• Resolver errors
• System exception errors
• User transform errors

Scripts
Scripts are single-use objects used to call functions and assign values to variables in a work flow. For example, you can use the SQL function in a script to determine the most recent update time for a table and then assign that value to a variable. You can then assign the variable to a parameter that passes into a data flow and identifies the rows to extract from a source.

A script can contain the following statements:
• Function calls
• If statements
• While statements
• Assignment statements
• Operators

The basic rules for the syntax of the script are as follows:
• Each line ends with a semicolon (;).
• Variable names start with a dollar sign ($).
• String values are enclosed in single quotation marks (').
• Comments start with a pound sign (#).
• Function calls always specify parameters even if the function uses no parameters.

For example, the following script statement determines today's date and assigns the value to the variable $TODAY:

$TODAY = sysdate();

You cannot use variables unless you declare them in the work flow that calls the script. For more information about scripts and the Business Objects scripting language, see the Data Integrator Reference Guide.

To create a script
1. Open the work flow.
2. Click the script icon in the tool palette.
3. Click the location where you want to place the script in the diagram.
   The script icon appears in the diagram.
4. Click the name of the script to open the script editor.
5. Enter the script statements, each followed by a semicolon.
   Click the function button to include functions in your script.
6. After you complete the script, select Debug > Validate.
   Data Integrator tests your script for syntax errors and displays any errors encountered.

The following figure shows the script editor displaying a script that determines the start time from the output of a custom function.

To save a script
If you want to save a script that you use often, create a custom function containing the script steps.
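Putting the syntax rules above together, a short script might look like the following sketch. The datastore, table, and variable names are hypothetical, and the sketch assumes the sql and nvl functions described in the Data Integrator Reference Guide:

# Determine the start of the extraction window; default to an early date if the status table is empty.
$start_date = nvl(sql('sales_ds', 'SELECT MAX(load_date) FROM load_status'), '1900.01.01');
print('Extracting rows changed since [$start_date]');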

Debugging scripts using the print function
Data Integrator has a debugging feature that allows you to print:
• The values of variables and parameters during execution
• The execution path followed within a script
You can use the print function to write the values of parameters and variables in a work flow to the trace log. For example, this line in a script:
print ('The value of parameter $x: [$x]');
produces the following output in the trace log:
The following output is being printed via the Print function in <Session job_name>.
The value of parameter $x: value
For details about the print function, see the Data Integrator Reference Guide.
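To trace the execution path, you can place print calls inside the branches of your script logic. A minimal sketch with a hypothetical variable follows; check the exact if-statement syntax in the Data Integrator Reference Guide for your version:

# Trace which branch of the script executes.
if ($row_count > 0)
begin
   print('Entering the load branch; [$row_count] rows to process.');
end
else
begin
   print('No rows found; skipping the load.');
end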


Nested Data

About this chapter
This chapter contains the following topics:
• What is nested data?
• Representing hierarchical data
• Formatting XML documents
• Operations on nested data
• XML extraction and parsing for columns
If you do not plan to use nested data sources or targets, skip this chapter.

What is nested data?
Real-world data often has hierarchical relationships that are represented in a relational database with master-detail schemas using foreign keys to create the mapping. However, some data sets, such as XML documents and SAP R/3 IDocs, handle hierarchical relationships through nested data.

Data Integrator maps nested data to a separate schema implicitly related to a single row and column of the parent schema. This mechanism is called Nested Relational Data Modelling (NRDM). NRDM provides a way to view and manipulate hierarchical relationships within data flow sources, targets, and transforms.

Sales orders are often presented using nesting: the line items in a sales order are related to a single header and are represented using a nested schema. Each row of the sales order data set contains a nested line item schema.

Representing hierarchical data
You can represent the same hierarchical data in several ways. Examples include:
• Multiple rows in a single data set
• Multiple data sets related by a join
• Nested data
Using the nested data method can be more concise (no repeated information) and can scale to present a deeper level of hierarchical complexity. For example, columns inside a nested schema can also contain columns. There is a unique instance of each nested schema for each row at each level of the relationship.

Generalizing further with nested data, each row at each level can have any number of columns containing nested schemas.

In Data Integrator, you can see the structure of nested data in the input and output schemas of sources, targets, and transforms in data flows. Nested schemas appear with a schema icon paired with a plus sign, which indicates that the object contains columns. The structure of the schema shows how the data is ordered.
• Sales is the top-level schema.
• LineItems is a nested schema. The minus sign in front of the schema icon indicates that the column list is open.

• CustInfo is a nested schema with the column list closed.

Formatting XML documents
Data Integrator allows you to import and export metadata for XML documents (files or messages), which you can use as sources or targets in jobs. XML documents are hierarchical. Their valid structure is stored in separate format documents. The format of an XML file or message (.xml) can be specified using either an XML Schema (.xsd, for example) or a document type definition (.dtd). When you import a format document's metadata, it is structured into Data Integrator's internal schema for hierarchical documents, which uses the nested relational data model (NRDM).
This section discusses:
• Importing XML Schemas
• Specifying source options for XML files
• Mapping optional schemas
• Using Document Type Definitions (DTDs)
• Generating DTDs and XML Schemas from an NRDM schema

Importing XML Schemas
Data Integrator supports the W3C XML Schema Specification 1.0.
This section describes the following topics:
• Importing XML schemas
• Importing abstract types
• Importing substitution groups

Importing XML schemas
Import the metadata for each XML Schema you use. The object library lists imported XML Schemas in the Formats tab.
For an XML document that contains information to place a sales order—order header, customer, and line items—the corresponding XML Schema includes the order structure and the relationship between data. See the Data Integrator Reference Guide for this XML Schema's URL.

When importing an XML Schema, Data Integrator reads the defined elements and attributes, then imports the following:
• Document structure
• Namespace
• Table and column names
• Data type of each column
• Nested table and column attributes
While XML Schemas make a distinction between elements and attributes, Data Integrator imports and converts them all to Data Integrator nested table and column attributes. See the Data Integrator Reference Guide for the list of Data Integrator attributes.

To import an XML Schema
1. From the object library, click the Format tab.
2. Right-click the XML Schemas icon.
3. Enter settings into the Import XML Schema Format window:
   • Enter the name you want to use for the format in Data Integrator.

   • Enter the file name of the XML Schema or its URL address.
     You can type an absolute path or a relative path, but the Job Server must be able to access it.
     Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the file path. You must type the path.
   • If the root element name is not unique within the XML Schema, select a name in the Namespace drop-down list to identify the imported XML Schema.
   • In the Root element name drop-down list, select the name of the primary node you want to import. Data Integrator only imports elements of the XML Schema that belong to this node or any subnodes.
   • If the XML Schema contains recursive elements (element A contains B, element B contains A), specify the number of levels it has by entering a value in the Circular level box. This value must match the number of recursive levels in the XML Schema's content. Otherwise, the job that uses this XML Schema will fail.
   • You can set Data Integrator to import strings as a varchar of any size. Varchar 1024 is the default.
4. Click OK.
After you import an XML Schema, you can edit its column properties such as data type using the General tab of the Column Properties window. You can also view and edit nested table and column attributes from the Column Properties window.

To view and edit nested table and column attributes for XML Schema
1. From the object library, select the Formats tab.
2. Expand the XML Schema category.
3. Double-click an XML Schema name.
   The XML Schema Format window appears in the workspace.

The Type column displays the data types that Data Integrator uses when it imports the XML document metadata. See the Data Integrator Reference Guide for more information about data types supported by XML Schema.

4. Double-click a nested table or column and select Attributes to view or edit XML Schema attributes.

Importing abstract types

An XML schema uses abstract types to force substitution for a particular element or type.
• When an element is defined as abstract, a member of the element's substitution group must appear in the instance document.
• When a type is defined as abstract, the instance document must use a type derived from it (identified by the xsi:type attribute).

For example, an abstract element PublicationType can have a substitution group that consists of complex types such as MagazineType, BookType, and NewspaperType.

The default is to select all complex types in the substitution group or all derived types for the abstract type, but you can choose to select a subset.

To limit the number of derived types to import for an abstract type
1. On the Import XML Schema Format window, when you enter the file name or URL address of an XML Schema that contains an abstract type, the Abstract type button is enabled.
   For example, the following excerpt from an xsd defines the PublicationType element as abstract with derived types BookType and MagazineType:

   <xsd:complexType name="PublicationType" abstract="true">
     <xsd:sequence>
       <xsd:element name="Title" type="xsd:string"/>
       <xsd:element name="Author" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
       <xsd:element name="Date" type="xsd:gYear"/>
     </xsd:sequence>
   </xsd:complexType>
   <xsd:complexType name="BookType">
     <xsd:complexContent>
       <xsd:extension base="PublicationType">
         <xsd:sequence>
           <xsd:element name="ISBN" type="xsd:string"/>
           <xsd:element name="Publisher" type="xsd:string"/>
         </xsd:sequence>
       </xsd:extension>
     </xsd:complexContent>
   </xsd:complexType>
   <xsd:complexType name="MagazineType">
     <xsd:complexContent>
       <xsd:restriction base="PublicationType">
         <xsd:sequence>
           <xsd:element name="Title" type="xsd:string"/>
           <xsd:element name="Author" type="xsd:string" minOccurs="0" maxOccurs="1"/>
           <xsd:element name="Date" type="xsd:gYear"/>
         </xsd:sequence>
       </xsd:restriction>
     </xsd:complexContent>
   </xsd:complexType>

2. To select a subset of derived types for an abstract type, click the Abstract type button and take the following actions:
   a. From the drop-down list on the Abstract type box, select the name of the abstract type.

   b. Select the check boxes in front of each derived type name that you want to import.
   c. Click OK.

Note: When you edit your XML schema format, Data Integrator selects all derived types for the abstract type by default. In other words, the subset that you previously selected is not preserved.

Importing substitution groups

An XML schema uses substitution groups to assign elements to a special group of elements that can be substituted for a particular named element called the head element. The list of substitution groups can have hundreds or even thousands of members, but an application typically only uses a limited number of them. The default is to select all substitution groups, but you can choose to select a subset.

For example, the following excerpt from an xsd defines the PublicationType element with substitution groups MagazineType, BookType, AdsType, and NewspaperType:

<xsd:element name="Publication" type="PublicationType"/>

<xsd:element name="BookStore">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="Publication" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:element name="Magazine" type="MagazineType" substitutionGroup="Publication"/>
<xsd:element name="Book" type="BookType" substitutionGroup="Publication"/>
<xsd:element name="Ads" type="AdsType" substitutionGroup="Publication"/>
<xsd:element name="Newspaper" type="NewspaperType" substitutionGroup="Publication"/>

To limit the number of substitution groups to import
1. On the Import XML Schema Format window, when you enter the file name or URL address of an XML Schema that contains substitution groups, the Substitution Group button is enabled.
2. Click the Substitution Group button and take the following actions:
   a. From the drop-down list on the Substitution group box, select the name of the substitution group.
   b. Select the check boxes in front of each substitution group name that you want to import.
   c. Click OK.

Note: When you edit your XML schema format, Data Integrator selects all elements for the substitution group by default. In other words, the subset that you previously selected is not preserved.

Specifying source options for XML files

After you import metadata for XML documents (files or messages), you create a data flow to use the XML documents as sources or targets in jobs.

Creating a data flow with a source XML file

To create a data flow with a source XML file
1. From the object library, click the Format tab.
2. Expand the XML Schema and drag the XML Schema that defines your source XML file into your data flow.
3. Place a query in the data flow and connect the XML source to the input of the query.
4. Double-click the XML source in the work space to open the XML Source File Editor.
5. You must specify the name of the source XML file in the XML file text box.
   • To specify multiple files, see Reading multiple XML files at one time on page 228.
   • To identify the source XML file, see Identifying source file names on page 229.
   • For information about other source options, see the Data Integrator Reference Guide.

Reading multiple XML files at one time

Data Integrator can read multiple files with the same format from a single directory using a single source object.

To read multiple XML files at one time
1. Open the editor for your source XML file.
2. In XML File on the Source tab, enter a file name containing a wild card character (* or ?).

For example:
• D:\orders\1999????.xml might read files from the year 1999
• D:\orders\*.xml reads all files with the xml extension from the specified directory

For information about other source options, see the Data Integrator Reference Guide.

Identifying source file names

You might want to identify the source XML file for each row in your source output in the following situations:
• You specified a wildcard character to read multiple source files at one time
• You load from a different source file on different days

To identify the source XML file for each row in the target
1. In the XML Source File Editor, select Include file name column, which generates a column DI_FILENAME to contain the name of the source XML file.
2. In the Query editor, map the DI_FILENAME column from Schema In to Schema Out.
3. When you run the job, the target DI_FILENAME column will contain the source XML file name for each row in the target.
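As a small illustration (the file names and order numbers below are made up for this sketch), rows loaded from two different source files would then carry their origin in the DI_FILENAME column:

  DI_FILENAME                ORDER_ID
  D:\orders\19990103.xml     1001
  D:\orders\19990217.xml     1002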

Mapping optional schemas

You can quickly specify default mapping for optional schemas without having to manually construct an empty nested table for each optional schema in the Query transform. Also, when you import XML schemas (either through DTDs or XSD files), Data Integrator automatically marks nested tables as optional if the corresponding option was set in the DTD or XSD file. Data Integrator retains this option when you copy and paste schemas into your Query transforms.

This feature is especially helpful when you have very large XML schemas with many nested levels in your jobs. When you make a schema column optional and do not provide mapping for it, Data Integrator instantiates the empty nested table when you run the job.

While a schema element is marked as optional, you can still provide a mapping for the schema by appropriately programming the corresponding sub-query block with application logic that specifies how Data Integrator should produce the output. However, if you modify any part of the sub-query block, the resulting query block must be complete and conform to normal validation rules required for a nested query block. You must map any output schema not marked as optional to a valid nested query block.

To make a nested table "optional"
• Right-click a nested table and select Optional to toggle it on. To toggle it off, right-click the nested table again and select Optional again.
• You can also right-click a nested table and select Properties, then go to the Attributes tab and set the Optional Table attribute value to yes or no. Click Apply and OK to set.
  Note: If the Optional Table value is something other than yes or no, this nested table cannot be marked as optional.

When you run a job with a nested table set to optional and you have nothing defined for any columns and nested tables beneath that table, Data Integrator generates special ATL and does not perform user interface validation for this nested table. Data Integrator generates a NULL in the corresponding PROJECT list slot of the ATL for any optional schema without an associated, defined sub-query block.

Example:

CREATE NEW Query ( EMPNO int KEY ,
ENAME varchar(10),
JOB varchar (9)
NT1 al_nested_table ( DEPTNO int KEY ,
DNAME varchar (14),
NT2 al_nested_table (C1 int) ) SET("Optional
Table" = 'yes') )
AS SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB,
NULL FROM EMP, DEPT;

Note: You cannot mark top-level schemas, unnested tables, or nested tables containing function calls optional.

Using Document Type Definitions (DTDs)

The format of an XML document (file or message) can be specified by a document type definition (DTD). The DTD describes the data contained in the XML document and the relationships among the elements in the data.

Import the metadata for each DTD you use. The object library lists imported DTDs in the Formats tab.

You can import metadata from either an existing XML file (with a reference to a DTD) or DTD file. If you import the metadata from an XML file, Data Integrator automatically retrieves the DTD for that XML file.

For an XML document that contains information to place a sales order—order header, customer, and line items—the corresponding DTD includes the order structure and the relationship between data.
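As an illustration of the sales order just described (a sketch only; these element names are hypothetical and not taken from the product documentation), a minimal DTD might look like this:

  <!ELEMENT SalesOrder (Header, Customer, LineItem+)>
  <!ELEMENT Header (OrderNo, OrderDate)>
  <!ELEMENT Customer (CustID, Name)>
  <!ELEMENT LineItem (Item, Qty)>
  <!ELEMENT OrderNo (#PCDATA)>
  <!ELEMENT OrderDate (#PCDATA)>
  <!ELEMENT CustID (#PCDATA)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Item (#PCDATA)>
  <!ELEMENT Qty (#PCDATA)>

The + on LineItem expresses the one-to-many relationship between an order and its line items; when imported, such a repeating element corresponds to a nested table in the NRDM structure.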

When importing a DTD, Data Integrator reads the defined elements and attributes. Data Integrator ignores other parts of the definition, such as text and comments. This allows you to modify imported XML data and edit the data type as needed. See the Data Integrator Reference Guide for information about Data Integrator attributes that support DTDs.

To import a DTD or XML Schema format
1. From the object library, click the Format tab.
2. Right-click the DTDs icon and select New. The Import DTD Format window opens.
3. Enter settings into the Import DTD Format window:
   • In the DTD definition name box, enter the name you want to give the imported DTD format in Data Integrator.
   • Enter the file that specifies the DTD you want to import.
     Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the file path. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
   • If importing an XML file, select XML for the File type option. If importing a DTD file, select the DTD option.

   • In the Root element name box, select the name of the primary node you want to import.
   • If the DTD contains recursive elements (element A contains B, element B contains A), specify the number of levels it has by entering a value in the Circular level box. This value must match the number of recursive levels in the DTD's content. Otherwise, the job that uses this DTD will fail.
   • You can set Data Integrator to import strings as a varchar of any size. Varchar 1024 is the default.
4. Click OK.

After you import a DTD, you can edit its column properties such as data type using the General tab of the Column Properties window. You can also view and edit DTD nested table and column attributes from the Column Properties window.

To view and edit nested table and column attributes for DTDs
1. From the object library, select the Formats tab.
2. Expand the DTDs category.
3. Double-click a DTD name. The DTD Format window appears in the workspace.
4. Double-click a nested table or column. The Column Properties window opens.
5. Select the Attributes tab to view or edit DTD attributes.

Generating DTDs and XML Schemas from an NRDM schema

You can right-click any schema from within a query editor in the Designer and generate a DTD or an XML Schema that corresponds to the structure of the selected schema (either NRDM or relational). This feature is useful if you want to stage data to an XML file and subsequently read it into another data flow. First generate a DTD/XML Schema. Then use it to set up an XML format, which in turn is used to set up an XML source for the staged file.

The DTD/XML Schema generated will be based on the following information:
• Columns become either elements or attributes based on whether the XML Type attribute is set to ATTRIBUTE or ELEMENT.
• If the Required attribute is set to NO, the corresponding element or attribute is marked optional.
• Nested tables become intermediate elements.
• The Native Type attribute is used to set the type of the element or attribute.
• While generating XML Schemas, the MinOccurs and MaxOccurs values will be set based on the Minimum Occurrence and Maximum Occurrence attributes of the corresponding nested table.

No other information is considered while generating the DTD or XML Schema. See the Data Integrator Reference Guide for details about how Data Integrator creates internal attributes when importing a DTD or XML Schema.
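For example, under these rules a nested table named LineItems with Minimum Occurrence 0 and Maximum Occurrence unbounded, containing a required Item column whose native type is xsd:string, could produce an XML Schema fragment along the following lines (a sketch under those assumptions, not output copied from the product):

  <xsd:element name="LineItems" minOccurs="0" maxOccurs="unbounded">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Item" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>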

Operations on nested data

This section discusses:
• Overview of nested data and the Query transform
• FROM clause construction
• Nesting columns
• Using correlated columns in nested data
• Distinct rows and nested data
• Grouping values across nested schemas
• Unnesting nested data
• How transforms handle nested data

Overview of nested data and the Query transform

With relational data, a Query transform allows you to execute a SELECT statement. The mapping between input and output schemas defines the project list for the statement. When working with nested data, the query provides an interface to perform SELECTs at each level of the relationship that you define in the output schema.

In Data Integrator, you use the Query transform to manipulate nested data. If you want to extract only part of the nested data, you can use the XML_Pipeline transform (see the Data Integrator Reference Guide).

Without nested schemas, the Query transform assumes that the FROM clause in the SELECT statement contains the data sets that are connected as inputs to the query object. When working with nested data, you must explicitly define the FROM clause in a query. Data Integrator assists by setting the top-level inputs as the default FROM clause values for the top-level output schema.

The other SELECT statement elements defined by the query work the same with nested data as they do with flat data. However, because a SELECT statement can only include references to relational data sets, a query that includes nested data includes a SELECT statement to define operations for each parent and child schema in the output.

The Query Editor contains a tab for each clause of the query:
• The SELECT select_list applies to the current schema, which the Schema Out text box displays.
• The FROM tab includes top-level columns by default. You can include columns from nested schemas or remove the top-level columns in the FROM list by adding schemas to the FROM tab.

The parameters you enter for the following tabs apply only to the current schema (as displayed in the Schema Out text box at the top right):
• WHERE
• GROUP BY
• ORDER BY

For information on setting the current schema and completing the parameters, see Query editor on page 189. The current schema allows you to distinguish multiple SELECT statements from each other within a single query. However, because the SELECT statements are dependent upon each other—and because the user interface makes it easy to construct arbitrary data sets—determining the appropriate FROM clauses for multiple levels of nesting can be complex.

FROM clause construction

When you include a schema in the FROM clause, you indicate that all of the columns in the schema—including columns containing nested schemas—are available to be included in the output. If you include more than one schema in the FROM clause, you indicate that the output will be formed from the cross product of the two schemas, constrained by the WHERE clause for the current schema. These FROM clause descriptions and the behavior of the query are exactly the same with nested data as with relational data.

A FROM clause can contain:
• Any top-level schema from the input
• Any schema that is a column of a schema in the FROM clause of the parent schema

The FROM clauses form a path that can start at any level of the output. The first schema in the path must always be a top-level schema from the input.

The data that a SELECT statement from a lower schema produces differs depending on whether or not a schema is included in the FROM clause at the top-level. The next two examples use the sales order data set to illustrate scenarios where FROM clause values change the data resulting from the query.

Example: FROM clause includes all top-level inputs

To include detailed customer information for all of the orders in the output, join the order schema at the top-level with a customer schema. Include both input schemas at the top-level in the FROM clause to produce the appropriate data.

This example shows:
• The FROM clause includes the two top-level schemas OrderStatus_In and cust.
• The Schema Out pane shows customer details CustID, Customer name, and Address for each SALES_ORDER_NUMBER.

Example: Lower level FROM clause contains top-level input

Suppose you want the detailed information from one schema to appear for each row in a lower level of another schema. For example, the input includes a materials schema and a nested line-item schema, and you want the output to include detailed material information for each line-item.

This example shows:
• The nested schema LineItems in Schema Out has a FROM clause that specifies only the Orders schema.
• To include the Description from the top-level Materials schema for each row in the nested LineItems schema:
  • Map Description from the top-level Materials Schema In to LineItems.
  • Specify the following join constraint:
    "Order".LineItems.Item = Materials.Item
  • In the FROM clause, specify the top-level Materials schema and the nested LineItems schema.

Nesting columns

When you nest rows of one schema inside another, the data set produced in the nested schema is the result of a query against the first one using the related values from the second one. For example, if you have sales-order information in a header schema and a line-item schema, you can nest the line items under the header schema. The line items for a single row of the header schema are equal to the results of a query including the order number:

SELECT * FROM LineItems
WHERE Header.OrderNo = LineItems.OrderNo

In Data Integrator, you can use a query transform to construct a nested data set from relational data. When you indicate the columns included in the nested schema, specify the query used to define the nested data set for each row of the parent schema.

To construct a nested data set
1. Create a data flow with the sources that you want to include in the nested data set.
2. Place a query in the data flow and connect the sources to the input of the query.
3. Indicate the FROM clause, source list, and WHERE clause to describe the SELECT statement that the query executes to determine the top-level data set.
   • FROM clause: Include the input sources in the list on the From tab.
   • Source list: Drag the columns from the input to the output. You can also include new columns or include mapping expressions for the columns.

   • WHERE clause: Include any filtering or joins required to define the data set for the top-level output.
4. Create a new schema in the output.
   • Make the top-level schema the current schema. In the output of the query, right-click and choose New Output Schema. A new schema icon appears in the output, nested under the top-level schema.
   • You can also drag an entire schema from the input to the output.
5. Change the current schema to the nested schema. (For information on setting the current schema and completing the parameters, see Query editor on page 189.) The query editor changes to display the new current schema.
6. Indicate the FROM clause, source list, and WHERE clause to describe the SELECT statement that the query executes to determine the top-level data set.
   • FROM clause: If you created a new output schema, you need to drag schemas from the input to populate the FROM clause. If you dragged an existing schema from the input to the top-level output, that schema is automatically listed if that schema is included in the FROM clause for this schema.
   • Select list: Only columns are available that meet the requirements for the FROM clause as described in FROM clause construction on page 237.
   • WHERE clause: Only columns are available that meet the requirements for the FROM clause.
7. If the output requires it, nest another schema at this level. Repeat steps 4 through 6 in this current schema.
8. If the output requires it, nest another schema under the top level.

Using correlated columns in nested data

Correlation allows you to use columns from a higher-level schema to construct a nested schema. In a nested-relational model, the columns in a nested schema are implicitly related to the columns in the parent row. To take advantage of this relationship, you can use columns from the parent schema in the construction of the nested schema. The higher-level column is a correlated column.

Including a correlated column in a nested schema can serve two purposes:
• The correlated column is a key in the parent schema. Including the key in the nested schema allows you to maintain a relationship between the two schemas after converting them from the nested data model to a relational model.
• The correlated column is an attribute in the parent schema. Including the attribute in the nested schema allows you to use the attribute to simplify correlated queries against the nested data.

Correlated columns can include columns from the parent schema and any other schemas in the FROM clause of the parent schema.

To use a correlated column in a nested schema
1. Create a data flow with a source that includes a parent schema with a nested schema. For example, the source could be an order header schema that has a LineItems column that contains a nested schema.
2. Connect a query to the output of the source.
3. In the query editor, copy all columns of the parent schema to the output. In addition to the top-level columns, Data Integrator creates a column called LineItems that contains a nested schema that corresponds to the LineItems nested schema in the input.
4. Change the current schema to the LineItems schema. (For information on setting the current schema and completing the parameters, see Query editor on page 189.)
5. Include a correlated column in the nested schema.
   To include a correlated column in a nested schema, drag the OrderNo column from the Header schema into the LineItems schema. Including the correlated column creates a new output column in the LineItems schema called OrderNo and maps it to the Order.OrderNo column. The data set created for LineItems includes all of the LineItems columns and the OrderNo.
   If the correlated column comes from a schema other than the immediate parent, the data in the nested schema includes only the rows that match both the related values in the current row of the parent schema and the value of the correlated column. You do not need to include the schema that includes the column in the FROM clause of the nested schema.

You can always remove the correlated column from the lower-level schema in a subsequent query transform.

Distinct rows and nested data

The Distinct rows option in Query transforms removes any duplicate rows at the top level of a join. This is particularly useful to avoid cross products in joins that produce nested output.

Grouping values across nested schemas

When you specify a Group By clause for a schema with a nested schema, the grouping operation combines the nested schemas for each group. For example, to assemble all the line items included in all the orders for each state from a set of orders, you can set the Group By clause in the top level of the data set to the state column (Order.State) and create an output schema that includes the State column (set to Order.State) and the LineItems nested schema. The result is a set of rows (one for each state) that has the State column and the LineItems nested schema that contains all the LineItems for all the orders for that state.

Unnesting nested data

Loading a data set that contains nested schemas into a relational (non-nested) target requires that the nested rows be unnested. For example, a sales order may use a nested schema to define the relationship between the order header and the order line items. To load the data into relational schemas, the multi-level structure must be unnested. Unnesting a schema produces a cross-product of the top-level schema (parent) and the nested schema (child).

It is also possible that you would load different columns from different nesting levels into different schemas. A sales order, for example, may be flattened so that the order number is maintained separately with each line item, and the header and line item information loaded into separate schemas.

Data Integrator allows you to unnest any number of nested schemas at any depth. No matter how many levels are involved, the result of unnesting schemas is a cross product of the parent and child schemas. When more than one level of unnesting occurs, the inner-most child is unnested first, then the result—the cross product of the parent and the inner-most child—is unnested from its parent, and so on to the top-level schema.
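As a simple illustration (the values are made up): an order header row with order number 9999 and a nested LineItems schema holding two rows unnests to two flat rows, each repeating the header columns:

  OrderNo   Item
  9999      A001
  9999      B002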

Unnesting all schemas (cross product of all data) might not produce the results you intend. For example, if an order includes multiple customer values such as ship-to and bill-to addresses, flattening a sales order by unnesting customer and line-item schemas produces rows of data that might not be useful for processing the order.

To unnest nested data
1. Create the output that you want to unnest in the output schema of a query.
   Data for unneeded columns or schemas might be more difficult to filter out after the unnesting operation. You can use the Cut command to remove columns or schemas from the top level. To remove nested schemas or columns inside nested schemas, make the nested schema the current schema, and then cut the unneeded columns or nested columns.

2. For each of the nested schemas that you want to unnest, right-click the schema name and choose Unnest.
   The output of the query (the input to the next step in the data flow) includes the data in the new relationship, as the following diagram shows.

How transforms handle nested data

Nested data included in the input to transforms (with the exception of a query or XML_Pipeline transform) passes through the transform without being included in the transform's operation. Only the columns at the first level of the input data set are available for subsequent transforms.

To transform values in lower levels of nested schemas
1. Take one of the following actions to obtain the nested data:
   • Use a query transform to unnest the data. For details, see Unnesting nested data on page 243.
   • Use an XML_Pipeline transform to select portions of the nested data. For details, see the Data Integrator Reference Guide.
2. Perform the transformation.
3. Nest the data again to reconstruct the nested relationships.

XML extraction and parsing for columns

In addition to extracting XML message and file data, representing it as NRDM data during transformation, then loading it to an XML message or file, you can also use Data Integrator to extract XML data stored in a source table or flat file column, transform it as NRDM data, then load it to a target or flat file column.

More and more database vendors allow you to store XML in one column. The field is usually a varchar, long, or clob. Data Integrator's XML handling capability also supports reading from and writing to such fields. Data Integrator provides four functions to support extracting from and loading to columns:
• extract_from_xml
• load_to_xml
• long_to_varchar
• varchar_to_long

The extract_from_xml function gets the XML content stored in a single column and builds the corresponding NRDM structure so that Data Integrator can transform it. This function takes varchar data only.

To enable extracting and parsing for columns, data from long and clob columns must be converted to varchar before it can be transformed by Data Integrator:
• Data Integrator converts a clob data type input to varchar if you select the Import unsupported data types as VARCHAR of size option when you create a database datastore connection in the Datastore Editor.
• If your source uses a long data type, use the long_to_varchar function to convert data to varchar.

Note: Data Integrator limits the size of the XML supported with these methods to 100K due to the current limitation of its varchar data type. There are plans to lift this restriction in the future.

The function load_to_xml generates XML from a given NRDM structure in Data Integrator, then loads the generated XML to a varchar column. If you want a job to convert the output to a long column, use the varchar_to_long function, which takes the output of the load_to_xml function as input.
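Conceptually, the functions compose in the way the scenarios below walk through. Reading XML from a long column can be expressed as

  extract_from_xml(long_to_varchar(content, 4000), 'PO', 1)

and writing it back to a long column as

  varchar_to_long(load_to_xml(...))

where the column name content, the schema name PO, and the parameter values simply mirror the examples that follow and are illustrative rather than required names; the full load_to_xml argument list is shown in Scenario 2.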

Sample Scenarios

The following scenarios describe how to use four Data Integrator functions to extract XML data from a source column and load it into a target column.

Scenario 1

Using long_to_varchar and extract_from_xml functions to extract XML data from a column with data of the type long.

To extract XML data from a column into Data Integrator
First, assume you have previously performed the following steps:
1. Imported an Oracle table that contains a column named Content with the data type long, which contains XML data for a purchase order.
2. Imported the XML Schema PO.xsd, which provides the format for the XML data, into the Data Integrator repository.
3. Created a Project, a job, and a data flow for your design.
4. Opened the data flow and dropped the source table with the column named content in the data flow.

From this point:
1. Create a query with an output column of data type varchar, and make sure that its size is big enough to hold the XML data.
2. Name this output column content.
3. In the query editor, map the source table column to a new output column.
4. In the Map section of the query editor, open the Function Wizard, select the Conversion function type, then select the long_to_varchar function and configure it by entering its parameters:

   long_to_varchar(content, 4000)

   The second parameter in this function (4000 in this case) is the maximum size of the XML data stored in the table column. Use this parameter with caution. If the size is not big enough to hold the maximum XML data for the column, Data Integrator will truncate the data and cause a runtime error. Conversely, do not enter a number that is too big, which would waste computer memory at runtime.
5. Create a second query that uses the function extract_from_xml to extract the XML data.

   To invoke the function extract_from_xml, right-click the current context in the query and choose New Function Call….

   Note: You can only use the extract_from_xml function in a new function call. Otherwise, this function is not displayed in the function wizard.

   a. When the Function Wizard opens, select Conversion and extract_from_xml.
   b. Click Next.
   c. Enter values for the input parameters:
      • Enter content, which is the output column in the previous query that holds the XML data.
      • The second parameter is the DTD or XML Schema name. Enter the name of the purchase order schema (in this case PO).
      • The third parameter is Enable validation. Enter 1 if you want Data Integrator to validate the XML with the specified Schema. Enter 0 if you do not.

   d. For the function, select a column or columns that you want to use on output.
      Imagine that this purchase order schema has five top-level elements: orderDate, shipTo, billTo, comment, and items. You can select any number of the top-level columns from an XML schema, which include either scalar or NRDM column data. The return type of the column is defined in the schema. If the function fails due to an error when trying to produce the XML output, Data Integrator returns NULL for scalar columns and empty nested tables for NRDM columns.
      The extract_from_xml function also adds two columns:
      • AL_ERROR_NUM — returns error codes: 0 for success and a non-zero integer for failures
      • AL_ERROR_MSG — returns an error message if AL_ERROR_NUM is not 0. Returns NULL if AL_ERROR_NUM is 0.
      Choose one or more of these columns as the appropriate output for the extract_from_xml function.
   e. Click Finish.
      Data Integrator generates the function call in the current context and populates the output schema of the query with the output columns you specified.
      Note: If you find that you want to modify the function call, right-click the function call in the second query and choose Modify Function Call.
6. With the data converted into the Data Integrator NRDM structure, you are ready to do appropriate transformation operations on it. For example, if you want to load the NRDM structure to a target XML file, create an XML file target and connect the second query to it.

In this example, to extract XML data from a column of data type long, we created two queries: the first query to convert the data using the long_to_varchar function and the second query to add the extract_from_xml function.

Alternatively, you can use just one query by entering the function expression long_to_varchar directly into the first parameter of the function extract_from_xml. The first parameter of the function extract_from_xml can take a column of data type varchar or an expression that returns data of type varchar.

If the data type of the source column is not long but varchar, do not include the function long_to_varchar in your data flow.

Scenario 2

Using the load_to_xml function and the varchar_to_long function to convert a Data Integrator NRDM structure to scalar data of the varchar type in an XML format and load it to a column of the data type long.

In this example, you want to convert an NRDM structure for a purchase order to XML data using the function load_to_xml, and then load the data to an Oracle table column called content, which is of the long data type. Because the function load_to_xml returns a value of varchar data type, you use the function varchar_to_long to convert the value of varchar data type to a value of the data type long.

To load XML data into a column of the data type long
1. Create a query and connect a previous query or source (that has the NRDM structure of a purchase order) to it. In this query, create an output column of the data type varchar called content. Make sure the size of the column is big enough to hold the XML data.
2. From the Mapping area open the function wizard, click the category Conversion Functions, and then select the function load_to_xml.
3. Click Next.
4. Enter values for the input parameters. The function load_to_xml has seven parameters.
5. Click Finish.
   In the mapping area of the Query window, notice the function expression:

   load_to_xml(PO, 'PO', 1, '<?xml version="1.0" encoding = "UTF-8" ?>', NULL, 1, 4000)

   In this example, this function converts the NRDM structure of purchase order PO to XML data and assigns the value to output column content.
6. Create another query with output columns matching the columns of the target table. Assume the column is called content and it is of the data type long.
   a. Open the function wizard from the mapping section of the query and select the Conversion Functions category.
   b. Use the function varchar_to_long to map the input column content to the output column content. The function varchar_to_long takes only one input parameter.
   c. Enter a value for the input parameter:

      varchar_to_long(content)

7. Connect this query to a database target.

Like the example using the extract_from_xml function, in this example you used two queries. You used the first query to convert an NRDM structure to XML data and to assign the value to a column of varchar data type. You used the second query to convert the varchar data type to long.

You can use just one query if you use the two functions in one expression:

varchar_to_long( load_to_xml(PO, 'PO', 1, '<?xml version="1.0" encoding = "UTF-8" ?>', NULL, 1, 4000) )

If the data type of the column in the target database table that stores the XML data is varchar, there is no need for varchar_to_long in the transformation. For more information, see the Data Integrator Reference Guide.

Real-time jobs

Overview

Data Integrator supports real-time data transformation. Real-time means that Data Integrator can receive requests from ERP systems and Web applications and send replies immediately after getting the requested data from a data cache or a second application. You define operations for processing on-demand messages by building real-time jobs in the Designer.

This chapter contains the following topics:
• Request-response message processing
• What is a real-time job?
• Creating real-time jobs
• Real-time source and target objects
• Testing real-time jobs
• Building blocks for real-time jobs
• Designing real-time applications

Request-response message processing

The message passed through a real-time system includes the information required to perform a business transaction. The content of the message can vary:
• It could be a sales order or an invoice processed by an ERP system destined for a data cache.
• It could be an order status request produced by a Web application that requires an answer from a data cache or back-office system.

Two Data Integrator components support request-response message processing:
• Access Server — Listens for messages and routes each message based on message type.
• Real-time job — Performs a predefined set of operations for that message type and creates a response.

The Data Integrator Access Server constantly listens for incoming messages. When a message is received, the Access Server routes the message to a waiting process that performs a predefined set of operations for the message type. The Access Server then receives a response for the message and replies to the originating application.

Processing might require that additional data be added to the message from a data cache or that the message data be loaded to a data cache. The Access Server returns the response to the originating application.

What is a real-time job?

The Data Integrator Designer allows you to define the processing of real-time messages using a real-time job. You create a different real-time job for each type of message your system can produce.

Real-time versus batch

Like a batch job, a real-time job extracts, transforms, and loads data. Real-time jobs "extract" data from the body of the message received and from any secondary sources used in the job. Each real-time job can extract data from a single message type. It can also extract data from other sources such as tables or files. Also in real-time jobs, Data Integrator writes data to message targets and secondary targets in parallel. This ensures that each message receives a reply as soon as possible.

The same powerful transformations you can define in batch jobs are available in real-time jobs. However, you might use transforms differently in real-time jobs. For example, you might use branches and logic controls more often than you would in batch jobs. If a customer wants to know when they can pick up their order at your distribution center, you might want to create a CheckOrderStatus job using a look-up function to count order items and then a case transform to provide status in the form of strings: "No items are ready for pickup" or "X items in your order are ready for pickup" or "Your order is ready for pickup".

Unlike batch jobs, real-time jobs do not execute in response to a schedule or internal trigger; instead, real-time jobs execute as real-time services started through the Administrator. Real-time services then wait for messages from the Access Server. When the Access Server receives a message, it passes the message to a running real-time service designed to process this message type. The real-time service processes the message and returns a response. The real-time service continues to listen and process messages on demand until it receives an instruction to shut down.

Messages

How you design a real-time job depends on what message you want it to process. Typical messages include information required to implement a particular business operation and to produce an appropriate response.

For example, suppose a message includes information required to determine order status for a particular order. The message contents might be as simple as the sales order number. The corresponding real-time job might use the input to query the right sources and return the appropriate product information. In this case, the message contains data that can be represented as a single column in a single-row table.

In a second case, a message could be a sales order to be entered into an ERP system. The message might include the order number, customer information, and the line-item details for the order. The message processing could return confirmation that the order was submitted successfully.
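For illustration only (the element names are hypothetical; a real message follows whatever DTD or XML Schema you import), such a sales order message might look like:

  <SalesOrder>
    <Header>
      <OrderNo>9999</OrderNo>
      <CustomerName>Motor Parts Inc.</CustomerName>
    </Header>
    <LineItems>
      <Item>
        <ItemNum>A001</ItemNum>
        <Quantity>2</Quantity>
      </Item>
      <Item>
        <ItemNum>B002</ItemNum>
        <Quantity>1</Quantity>
      </Item>
    </LineItems>
  </SalesOrder>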

In this case, the message contains data that cannot be represented in a single table. In this sales order, the order header information can be represented by a table and the line items for the order can be represented by a second table. Data Integrator represents the header and line item data in the message in a nested relationship. When processing the message, the real-time job processes all of the rows of the nested table for each row of the top-level table. In this sales order, both of the line items are processed for the single row of header information.

Data Integrator data flows support the nesting of tables within other tables. See Chapter 9: Nested Data for details.

Real-time jobs can send only one row of data in a reply message (message target). However, you can structure message targets so that all data is contained in a single row by nesting tables within columns of a single, top-level table.

Real-time job examples

These examples provide a high-level description of how real-time jobs address typical real-time scenarios. Later sections describe the actual objects that you would use to construct the logic in the Designer.

Loading transactions into a back-office application

A real-time job can receive a transaction from a Web application and load it to a back-office application (ERP, SCM, legacy). Using a query transform, you can include values from a data cache to supplement the transaction before applying it against the back-office application (such as an ERP system).

Collecting back-office data into a data cache

You can use messages to keep the data cache current. Real-time jobs can receive messages from a back-office application and load them into a data cache or data warehouse.

Retrieving values, data cache, back-office apps

You can create real-time jobs that use values from a data cache to determine whether or not to query the back-office application (such as an ERP system) directly.

Creating real-time jobs

You can create real-time jobs using the same objects as batch jobs (data flows, work flows, conditionals, scripts, while loops, etc.). However, object usage must adhere to a valid real-time job model.

Real-time job models

In contrast to batch jobs, which typically move large amounts of data at scheduled times, a real-time job, once started as a real-time service, listens for a request. When a real-time job receives a request (typically to access a small number of records), Data Integrator processes the request, returns a reply, and continues listening. This listen-process-listen logic forms a processing loop.

A real-time job is divided into three processing components: initialization, a real-time processing loop, and clean-up.

Figure 10-1: Real-time job

• The initialization component (optional) can be a script, work flow, data flow, or a combination of objects. It runs only when a real-time service starts.
• The real-time processing loop is a container for the job's single process logic. You can specify any number of work flows and data flows inside it.
• The clean-up component (optional) can be a script, work flow, data flow, or a combination of objects. It runs only when a real-time service is shut down.

In a real-time processing loop, a single message source must be included in the first step and a single message target must be included in the last step. The following models support this rule:
• Single data flow model
• Multiple data flow model
• Request/Acknowledge data flow model (see the Data Integrator Supplement for SAP)

Single data flow model

With the single data flow model, you create a real-time job using a single data flow in its real-time processing loop. This single data flow must include a single message source and a single message target.

Figure 10-1: Real-time processing loop

Multiple data flow model

The multiple data flow model allows you to create a real-time job using multiple data flows in its real-time processing loop. If you use multiple data flows in a real-time processing loop:
• The first object in the loop must be a data flow. This data flow must have one and only one message source.
• The last object in the loop must be a data flow. This data flow must have a message target.

• Additional data flows cannot have message sources or targets.
• You can add any number of additional data flows to the loop, and you can add them inside any number of work flows.
• All data flows can use input and/or output memory tables to pass data sets on to the next data flow. Memory tables store data in memory while a loop runs. They improve the performance of real-time jobs with multiple data flows.

Figure 10-2: Real-time processing loop

By using multiple data flows, you can ensure that data in each message is completely processed in an initial data flow before processing for the next data flows starts. For example, if the data represents 40 items, all 40 must pass through the first data flow to a staging or memory table before passing to a second data flow. This allows you to control and collect all the data in a message at any point in a real-time job for design and troubleshooting purposes.

Using real-time job models

Single data flow model

When you use a single data flow within a real-time processing loop your data flow diagram might look like this:

Notice that the data flow has one message source and one message target.

Multiple data flow model

When you use multiple data flows within a real-time processing loop your data flow diagrams might look like those in the following example scenario in which Data Integrator writes data to several targets according to your multiple data flow design.

Example scenario requirements:
Your job must do the following tasks, completing each one before moving on to the next:
• Receive requests about the status of individual orders from a web portal and record each message to a backup flat file
• Perform a query join to find the status of the order and write to a customer database table.
• Reply to each message with the query join results

Solution:

First, create a real-time job and add a data flow, a work flow, and another data flow to the real-time processing loop. Second, add a data flow to the work flow. Next, set up the tasks in each data flow:
• The first data flow receives the XML message (using an XML message source) and records the message to the flat file (flat file format target). Meanwhile, this same data flow writes the data into a memory table (table target).
  Note: You might want to create a memory table to move data to sequential data flows. For more information, see "Memory datastores" on page 102.
• The second data flow reads the message data from the memory table (table source), performs a join with stored data (table source), and writes the results to a database table (table target) and a new memory table (table target). Notice this data flow has neither a message source nor a message target.
• The last data flow sends the reply. It reads the result of the join in the memory table (table source) and loads the reply (XML message target).

For more information about building real-time jobs, see the "Building blocks for real-time jobs" and "Designing real-time applications" sections.

Creating a real-time job

To create a real-time job
1. In the Designer, create or open an existing project.
2. From the project area, right-click the white space and select New Realtime job from the shortcut menu.
New_RTJob1 appears in the project area. The workspace displays the job's structure, which consists of two markers:
• RT_Process_begins
• Step_ends
These markers represent the beginning and end of a real-time processing loop.
3. In the project area, rename New_RTJob1.
Always add a prefix to job names with their job type. In this case, use the naming convention RTJOB_JobName. Although saved real-time jobs are grouped together under the Job tab of the object library, job names may also appear in text editors used to create adapter or Web Services calls. In these cases, a prefix saved with the job name will help you identify it.

4. If you want to create a job with a single data flow:
a. Click the data flow icon in the tool palette.
You can add data flows to either a batch or real-time job. When you place a data flow icon into a job, you are telling Data Integrator to validate the data flow according to the requirements of the job type (batch or real-time).
b. Click inside the loop.
The boundaries of a loop are indicated by begin and end markers. One message source and one message target are allowed in a real-time processing loop.
c. Connect the begin and end markers to the data flow.
d. Build the data flow, including a message source and message target.
For more information about sources and targets, see "Real-time source and target objects" on page 266.
e. Add, configure, and connect initialization object(s) and clean-up object(s) as needed.
A real-time job with a single data flow might look like this:
5. If you want to create a job with multiple data flows:
a. Drop and configure a data flow inside the loop. This data flow must include one message source.

b. After this data flow, drop other objects such as work flows, data flows, scripts, or conditionals from left to right between the first data flow and the end of the real-time processing loop.
c. Just before the end of the loop, drop and configure your last data flow. This data flow must include one message target.
d. Drop, configure, and connect the initialization and clean-up objects outside the real-time processing loop as needed.
e. Return to the real-time job window and connect all the objects. Connected objects run in sequential order.
Note: Objects at the real-time job level in Designer diagrams must be connected.
f. To include parallel processing in a real-time job, drop data flows within job-level work flows. Do not connect these secondary-level data flows. These data flows will run in parallel when job processing begins.
A real-time job with multiple data flows might look like this:
6. Open each object and configure it.
7. After adding and configuring all objects, validate your job.
8. Save the job.
9. Assign test source and target files for the job and execute it in test mode. For more information, see "Testing real-time jobs" on page 270.

Finally, use the Administrator to configure a service and service providers for the job and run it in your test environment.

Real-time source and target objects
Real-time jobs must contain a real-time source and/or target object. Those normally available are:
• XML message — An XML message structured in a DTD or XML Schema format. Used as a source or target. Data Integrator accesses it directly or through adapters.
• Outbound message — A real-time message with an application-specific format (not readable by an XML parser). Used as a target. Data Integrator accesses it through an adapter.
If you have the SAP licensed extension, you can also use IDoc messages as real-time sources. For more information, see the Data Integrator Supplement for SAP.
Adding sources and targets to real-time jobs is similar to adding them to batch jobs, with the following additions:
• XML messages — Prerequisite: import a DTD or XML Schema to define a format (see "To import a DTD or XML Schema format" on page 233). Object library location: Formats tab.
• Outbound messages — Prerequisite: define an adapter datastore and import object metadata (see "Adapter datastores" on page 111). Object library location: Datastores tab, under the adapter datastore.

To view an XML message source or target schema
In the workspace of a real-time job, click the name of an XML message source or XML message target to open its editor. If the XML message source or target contains nested data, the schema displays nested tables to represent the relationships among the data.

(The editor shows the root element, columns at the top level, a nested table with columns nested one level, and a sample file for testing.)

Secondary sources and targets
Real-time jobs can also have secondary sources or targets (see "Source and target objects" on page 178). For example, suppose you are processing a message that contains a sales order from a Web application. The order contains the customer name, but when you apply the order against your ERP system, you need to supply more detailed customer information. Inside a data flow of a real-time job, you can supplement the message with the customer information to produce the complete document to send to the ERP system. The supplementary information might come from the ERP system itself or from a data cache containing the same information.

Tables and files (including XML files) as sources can provide this supplementary information. Data Integrator reads data from secondary sources according to the way you design the data flow, and loads data to secondary targets in parallel with a target message. Add secondary sources and targets to data flows in real-time jobs as you would to data flows in batch jobs (see "Adding source or target objects to data flows" on page 180).

Transactional loading of tables
Target tables in real-time jobs support transactional loading, in which the data resulting from the processing of a single data flow can be loaded into multiple tables as a single transaction. No part of the transaction applies if any part fails. You can specify the order in which tables in the transaction are included using the target table editor. This feature supports a scenario in which you have a set of tables with foreign keys that depend on one table with primary keys.
Note: Target tables in batch jobs also support transactional loading. However, use caution when you consider enabling this option for a batch job because it requires the use of memory, which can reduce performance when moving large amounts of data.

(In the target table editor, you turn on transactional loading and assign the order of the table in the transaction.)
You can use transactional loading only when all the targets in a data flow are in the same datastore. If the data flow loads tables in more than one datastore, targets in each datastore load independently. If you use transactional loading, you cannot use bulk loading, pre-load, or post-load commands.

Design tips for data flows in real-time jobs
Keep in mind the following when you are designing data flows:
• If you include a table in a join with a real-time source, Data Integrator includes the data set from the real-time source as the outer loop of the join. If more than one supplementary source is included in the join, you can control which table is included in the next outer-most loop of the join using the join ranks for the tables.
• In real-time jobs, do not cache data from secondary sources unless the data is static. The data will be read when the real-time job starts and will not be updated while the job is running.

• If no rows are passed to the XML target, the real-time job returns an empty response to the Access Server. For example, if a request comes in for a product number that does not exist, your job might be designed in such a way that no data passes to the reply message. You might want to provide appropriate instructions to your user (exception handling in your job) to account for this type of scenario.
• If more than one row passes to the XML target, the target reads the first row and discards the other rows. To avoid this issue, use your knowledge of Data Integrator's Nested Relational Data Model (NRDM) and structure your message source and target formats so that one "row" equals one message. With NRDM, you can structure any amount of data into a single "row" because columns in tables can contain other tables. See Chapter 9: Nested Data for more information.
• Recovery mechanisms are not supported in real-time jobs.
For more detailed information about real-time job processing, see the Data Integrator Reference Guide.

Testing real-time jobs
There are several ways to test real-time jobs during development. These include:
• Executing a real-time job in test mode
• Using View Data
• Using an XML file target

Executing a real-time job in test mode
You can test real-time job designs without configuring the job as a service associated with an Access Server. In test mode, you can execute a real-time job using a sample source message from a file to determine if Data Integrator produces the expected target message.

To specify a sample XML message and target test file
1. In the XML message source and target editors, enter a file name in the XML test file box.
Enter the full path name for the source file that contains your sample data. Use paths for both test files relative to the computer that runs the Job Server for the current repository.
2. Execute the job.

Test mode is always enabled for real-time jobs. Data Integrator reads data from the source test file and loads it into the target test file.

Using View Data
To ensure that your design returns the results you expect, execute your job using View Data. With View Data, you can capture a sample of your output data to ensure your design is working. See Chapter 15: Design and Debug for more information.

Using an XML file target
You can use an "XML file target" to capture the message produced by a data flow while allowing the message to be returned to the Access Server.
Just like an XML message, you define an XML file by importing a DTD or XML Schema for the file, then dragging the format into the data flow definition. Unlike XML messages, you can include XML files as sources or targets in batch and real-time jobs.

To use a file to capture output from steps in a real-time job
1. In the Formats tab of the object library, drag the DTD or XML Schema into a data flow of a real-time job.
A menu prompts you for the function of the file.
2. Choose Make XML File Target.
The XML file target appears in the workspace.
3. In the file editor, specify the location to which Data Integrator writes data.
Enter a file name relative to the computer running the Job Server.
4. Connect the output of the step in the data flow that you want to capture to the input of the file.

Building blocks for real-time jobs
This section describes some of the most common operations that real-time jobs can perform and how to define them in the Designer:
• Supplementing message data
• Branching data flow based on a data cache value
• Calling application functions
Also read about "Embedded Data Flows" on page 283 and the Case transform in the Data Integrator Reference Guide.

Supplementing message data
The data included in messages from real-time sources might not map exactly to your requirements for processing or storing the information. If not, you can define steps in the real-time job to supplement the message information.
One technique for supplementing the data in a real-time source includes these steps:
1. Include a table or file as a source.
In addition to the real-time source, include the files or tables from which you require supplementary information.
2. Use a query to extract the necessary data from the table or file.
3. Use the data in the real-time source to find the necessary supplementary data.
You can include a join expression in the query to extract the specific values required from the supplementary source.

To supplement message data
In this example, a request message includes sales order information and its reply message returns order status. The business logic uses the customer number and priority rating to determine the level of status to return. The message includes only the customer name and the order number. A real-time job is then defined to retrieve the customer number and rating from other sources before determining the order status.
(In the query shown in the figure, one input comes from the Web application.) The WHERE clause joins the two inputs, resulting in output for only the sales document and line items included in the input from the application.
Be careful to use data in the join that is guaranteed to return a value. If no value returns from the join, the query produces no rows and the message returns to the Access Server empty. If you cannot guarantee that a value returns, consider these alternatives:
• Lookup function call — Returns a default value if no match is found
• Outer join — Always returns a value, even if no match is found
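As a sketch of the lookup alternative, a column mapping might call the lookup function with a default value. The datastore, table, and column names below are hypothetical, and the exact argument list can vary by version; see the Data Integrator Reference Guide for the authoritative signature:

lookup(DS_Cache.DBO.CUST_STATUS, PRIORITY, 'STANDARD', 'NO_CACHE', CUSTNAME, Message.CustName)

Because a default value ('STANDARD' here) is returned when no matching customer is found, the query still produces a row and the reply message is not empty.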

1. Include the real-time source in the real-time job.
2. Include the supplementary source in the real-time job.
This source could be a table or file. In this example, the supplementary information required doesn't change very often, so it is reasonable to extract the data from a data cache rather than going to an ERP system directly.
3. Join the sources.
In a query transform, construct a join on the customer name:
Message.CustName = Cust_Status.CustName
You can construct the output to include only the columns that the real-time job needs to determine order status.
4. Complete the real-time job to determine order status.
The example shown here determines order status in one of two methods based on the customer status value. Order status for the highest-ranked customers is determined directly from the ERP. Order status for other customers is determined from a data cache of sales order information.
The logic can be arranged in a single or multiple data flows. The illustration below shows a single data flow model. Both branches return order status for each line item in the order. The data flow merges the results and constructs the response. The next section describes how to design branch paths in a data flow.

(Figure callouts: supplement the order from a data cache; join with a customer table to determine customer priority; determine order status, either from a data cache or the ERP.)

Branching data flow based on a data cache value
One of the most powerful things you can do with a real-time job is to design logic that determines whether responses should be generated from a data cache or if they must be generated from data in a back-office application (ERP, SCM, CRM). One technique for constructing this logic includes these steps:
1. Determine the rule for when to access the data cache and when to access the back-office application.
2. Compare data from the real-time source with the rule.
3. Define each path that could result from the outcome.
You might need to consider the case where the rule indicates back-office application access, but the system is not currently available.
4. Merge the results from each path into a single data set.
5. Route the single result to the real-time target.
You might need to consider error-checking and exception-handling to make sure that a value passes to the target. If the target receives an empty set, the real-time job returns an empty response (begin and end XML tags only) to the Access Server.
This example describes a section of a real-time job that processes a new sales order. The section is responsible for checking the inventory available for the ordered products—it answers the question, "is there enough inventory on hand to fill this order?"

The rule controlling access to the back-office application indicates that the inventory (Inv) must be more than a pre-determined value (IMargin) greater than the ordered quantity (Qty) to consider the data cached inventory value acceptable; in other words, the cached value is acceptable only when Inv exceeds Qty by at least IMargin. Data Integrator makes a comparison for each line item in the order.

To branch a data flow based on a rule
1. Create a real-time job and drop a data flow inside it.
2. Add the XML source in the data flow.
See "To import a DTD or XML Schema format" on page 233 to define the format of the data in the XML message. The XML source contains the entire sales order, yet the data flow compares values for line items inside the sales order.
3. Determine the values you want to return from the data flow.
The XML target that ultimately returns a response to the Access Server requires a single row at the top-most level. Because this data flow needs to be able to determine inventory values for multiple line items, the structure of the output requires the inventory information to be nested. The input is already nested under the sales order; the output can use the same convention. In addition, the output needs to include some way to indicate whether the inventory is or is not available.

4. Connect the output of the XML source to the input of a query and map the appropriate columns to the output.
You can drag all of the columns and nested tables from the input to the output, then delete any unneeded columns or nested tables from the output.
5. Add the comparison table from the data cache to the data flow as a source.
6. Construct the query so you extract the expected data from the inventory data cache table.
Without nested data, you would add a join expression in the WHERE clause of the query. Because the comparison occurs between a nested table and another top-level table, you have to define the join more carefully:
• Change context to the LineItem table
• Include the Inventory table in the FROM clause in this context (the LineItem table is already in the From list)
• Define an outer join with the Inventory table as the inner table
• Add the join expression in the WHERE clause in this context
In this example, you can assume that there will always be exactly one value in the Inventory table for each line item and can therefore leave out the outer join definition.
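Assuming the line items and the inventory cache table share an item-number column (the column names here are hypothetical, not from the sample data), the join expression added in the WHERE clause of the LineItem context might look like this:

Inventory.ItemNum = LineItem.ItemNum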

After changing contexts, the nested table is active while any other tables in the output schema are grayed out. (In the query editor, the From tab list includes the Inventory table, and the Where tab expression applies only in this schema.)
7. Include the values from the Inventory table that you need to make the comparison.
Drag the Inv and IMargin columns from the input to the LineItem table.
8. Split the output of the query based on the inventory comparison. Add two queries to the data flow:
• Query to process valid inventory values from the data cache
The WHERE clause at the nested level (LineItem) of the query ensures that the quantities specified in the incoming line item rows are appropriately accounted for by inventory values from the data cache.
• Query to retrieve inventory values from the ERP
The WHERE clause at the nested level (LineItem) of the query ensures that the quantities specified in the incoming line item rows are not accounted for by inventory values from the data cache. The inventory values in the ERP inventory table are then substituted for the data cache inventory values in the output.
There are several ways to return values from the ERP. For example, you could use a lookup function or a join on the specific table in the ERP system. This example uses a join so that the processing can be performed by the ERP system rather than Data Integrator.

As in the previous join, if you cannot guarantee that a value will be returned by the join, make sure to define an outer join so that the line item row is not lost.
The query that processes valid inventory values from the data cache (CacheOK) uses the following, at the nested-level context:
WHERE Compare2cache.LineItems.Qty < (Compare2cache.LineItems.INV + Compare2cache.LineItems.IMARGIN)
FROM Compare2cache.LineItems
The query that retrieves inventory values from the ERP (CheckERP) uses the following, at the nested-level context:
WHERE Compare2cache.LineItems.Qty >= (Compare2cache.LineItems.INV + Compare2cache.LineItems.IMARGIN)
FROM Compare2cache.LineItems, ERP_Inventory
The goal of this section of the data flow was an answer to the question, "is there enough inventory to fill this order?" Each branch returns an inventory value that can then be compared to the order quantity to answer that question. To complete the order processing:
9. Show inventory levels only if less than the order quantity.
The "CacheOK" branch of this example always returns line-item rows that include enough inventory to account for the order quantity, so you can remove the inventory value from the output of these rows.

The "CheckERP" branch can return line item rows without enough inventory to account for the order quantity; in that case, the available inventory value can be useful if customers want to change their order quantity to match the inventory available.
Change the mapping of the Inv column in each of the branches to show available inventory values only if they are less than the order quantity:
• For data cache OK: Inv maps from 'NULL'
• For CheckERP: Inv maps from ERP_Inventory.INV – ERP_Inventory.IMARGIN
10. Merge the branches into one response.
The Merge transform combines the results of the two branches into a single data set. Both branches of the data flow include the same columns and nested tables.
11. Complete the processing of the message.
Add the XML target to the output of the Merge transform.

Calling application functions
A real-time job can use application functions to operate on data. You can include tables as input or output parameters to the function. Application functions require input values for some parameters and some can be left unspecified. You must determine the requirements of the function to prepare the appropriate inputs.
To make up the input, you can specify the top-level table, top-level columns, and any tables nested one level down relative to the tables listed in the FROM clause of the context calling the function. If the application function includes a structure as an input parameter, you must specify the individual columns that make up the structure. A data flow may contain several steps that call a function, retrieve results, then shape the results into the columns and tables required for a response.

Designing real-time applications
Data Integrator provides a reliable and low-impact connection between a Web application and back-office applications such as an enterprise resource planning (ERP) system. Because each implementation of an ERP system is different and because Data Integrator includes versatile decision support logic, you have many opportunities to design a system that meets your internal and external information and resource needs. This section discusses:
• Reducing queries requiring back-office application access
• Messages from real-time jobs to adapter instances
• Real-time service invoked by an adapter instance

Reducing queries requiring back-office application access
This section provides a collection of recommendations and considerations that can help reduce the time you spend experimenting in your development cycles.
The information you allow your customers to access through your Web application can impact the performance that your customers see on the Web. You can maximize performance through your Web application design decisions. In particular, you can structure your application to reduce the number of queries that require direct back-office (ERP, SCM, Legacy) application access. ERP system access is likely to be much slower than direct database access, reducing the performance your customer experiences with your Web application.
For example, if your ERP system supports a complicated pricing structure that includes dependencies such as customer priority, product availability, or order quantity, you might not be able to depend on values from a data cache for pricing information. The alternative might be to request pricing information directly from the ERP system. To reduce the impact of queries requiring direct ERP system access, modify your Web application. Using the pricing example, design the application to avoid displaying price information along with standard product information, and instead show pricing only after the customer has chosen a specific product and quantity. These techniques are evident in the way airline reservations systems provide pricing information—a quote for a specific flight—contrasted with other retail Web sites that show pricing for every item displayed as part of product catalogs.

Messages from real-time jobs to adapter instances
If a real-time job will send a message to an adapter instance, refer to the adapter documentation to decide if you need to create a message function call or an outbound message.
• Message function calls allow the adapter instance to collect requests and send replies.
• Outbound message objects can only send outbound messages. They cannot be used to receive messages.
For information on importing message function calls and outbound messages, see "Importing metadata through an adapter datastore" on page 114. Using these objects in real-time jobs is the same as in batch jobs. See "To modify output schema contents" on page 190.

Real-time service invoked by an adapter instance
This section uses terms consistent with Java programming. (Please see your adapter SDK documentation for more information about terms such as operation instance and information resource.)
When an operation instance (in an adapter) gets a message from an information resource, it translates it to XML (if necessary), then sends the XML message to a real-time service. The real-time service processes the message from the information resource (relayed by the adapter) and returns a response.
In the real-time service, the message from the adapter is represented by a DTD or XML Schema object (stored in the Formats tab of the object library). The DTD or XML Schema represents the data schema for the information resource.
In the example data flow below, the Query processes a message (here represented by "Employment") received from a source (an adapter instance), and returns the response to a target (again, an adapter instance).

Embedded Data Flows

About this chapter
Data Integrator provides an easy-to-use option to create embedded data flows. This chapter covers the following topics:
• Overview
• Example of when to use embedded data flows
• Creating embedded data flows
• Using embedded data flows
• Testing embedded data flows
• Troubleshooting embedded data flows

Overview
An embedded data flow is a data flow that is called from inside another data flow. Data passes into or out of the embedded data flow from the parent flow through a single source or target. The embedded data flow can contain any number of sources or targets, but only one input or one output can pass data to or from the parent data flow.
You can create the following types of embedded data flows:
• One input — use when you want to add an embedded data flow at the end of a data flow
• One output — use when you want to add an embedded data flow at the beginning of a data flow
• No input or output — use when you want to replicate an existing data flow
An embedded data flow is a design aid that has no effect on job execution. When Data Integrator executes the parent data flow, it expands any embedded data flows, optimizes the parent data flow, then executes it.
Use embedded data flows to:
• Simplify data flow display. Group sections of a data flow in embedded data flows to allow clearer layout and documentation.
• Reuse data flow logic. Save logical sections of a data flow so you can use the exact logic in other data flows, or provide an easy way to replicate the logic and modify it for other flows.
• Debug data flow logic. Replicate sections of a data flow as embedded data flows so you can execute them independently.

Example of when to use embedded data flows
In this example, a data flow uses a single source to load three different target systems. The Case transform sends each row from the source to different transforms that process it to get a unique target output. You can simplify the parent data flow by using embedded data flows for the three different cases.

Creating embedded data flows
There are two ways to create embedded data flows:
• Select objects within a data flow, right-click, and select Make Embedded Data Flow.
• Drag a complete and fully validated data flow from the object library into an open data flow in the workspace. Then open the data flow you just added, right-click one object you want to use as an input or as an output port, and select Make Port for that object. Data Integrator marks the object you select as the connection point for this embedded data flow.
Note: You can specify only one port, which means that the embedded data flow can appear only at the beginning or at the end of the parent data flow.

Using the Make Embedded Data Flow option

To create an embedded data flow
1. Select objects from an open data flow using one of the following methods:
• Click the white space and drag the rectangle around the objects
• CTRL-click each object
Ensure that the set of objects you select are:
• All connected to each other
• Connected to other objects according to the type of embedded data flow you want to create: one input, one output, or no input or output
In the example shown in step 2, the embedded data flow is connected to the parent by one input object.
2. Right-click and select Make Embedded Data Flow.

The Create Embedded Data Flow window opens.
3. Name the embedded data flow using the convention EDF_EDFName, for example EDF_ERP.
If Replace objects in original data flow is selected, the original data flow becomes a parent data flow, which has a call to the new embedded data flow. If you deselect the Replace objects in original data flow box, Data Integrator will not make a change in the original data flow. You can use an embedded data flow created without replacement as a stand-alone data flow for troubleshooting.
4. Click OK.
The embedded data flow is represented as a single object in the new parent data flow. Data Integrator saves the new embedded data flow object to the repository and displays it in the object library under the Data Flows tab.

5. Click the name of the embedded data flow to open it.
6. Notice that Data Integrator created a new object, EDF_ERP_Input, which is the input port that connects this embedded data flow to the parent data flow.
When you use the Make Embedded Data Flow option, Data Integrator automatically creates an input or output object based on the object that is connected to the embedded data flow when it is created. For example, if an embedded data flow has an output connection, the embedded data flow will include a target XML file object labeled EDFName_Output.
The naming conventions for each embedded data flow type are:
• One input: EDFName_Input
• One output: EDFName_Output
• No input or output: Data Integrator creates an embedded data flow without an input or output object

Creating embedded data flows from existing flows
To call an existing data flow from inside another data flow, put the data flow inside the parent data flow, then mark which source or target to use to pass data between the parent and the embedded data flows.

To create an embedded data flow out of an existing data flow
1. Drag an existing valid data flow from the object library into a data flow that is open in the workspace.
2. Consider renaming the flow using the EDF_EDFName naming convention.
The embedded data flow appears without any arrowheads (ports) in the workspace.
3. Open the embedded data flow.
4. Right-click a source or target object (file or table) and select Make Port.
Note: Ensure that you specify only one input or output port.
Different types of embedded data flow ports are indicated by directional markings on the embedded data flow icon (input port, output port, or no port).

Using embedded data flows
When you create and configure an embedded data flow using the Make Embedded Data Flow option, Data Integrator creates a new input or output XML file and saves the schema in the repository as an XML Schema. You can reuse an embedded data flow by dragging it from the Data Flow tab of the object library into other data flows. To save mapping time, you might want to use the Update Schema option or the Match Schema option.
The following example scenario uses both options:
• Create data flow 1.
• Select objects in data flow 1, and create embedded data flow 1 so that parent data flow 1 calls embedded data flow 1.
• Create data flow 2 and data flow 3 and add embedded data flow 1 to both of them.

• Go back to data flow 1. Change the schema of the object preceding embedded data flow 1 and use the Update Schema option with embedded data flow 1. It updates the schema of embedded data flow 1 in the repository.
• Now the schemas in data flow 2 and data flow 3 that are feeding into embedded data flow 1 will be different from the schema the embedded data flow expects.
• Use the Match Schema option for embedded data flow 1 in both data flow 2 and data flow 3 to resolve the mismatches at runtime. The Match Schema option only affects settings in the current data flow.
The following sections describe the use of the Update Schema and Match Schema options in more detail.

Updating Schemas
Data Integrator provides an option to update an input schema of an embedded data flow. This option updates the schema of an embedded data flow's input object with the schema of the preceding object in the parent data flow. All occurrences of the embedded data flow update when you use this option.

To update a schema
1. Open the embedded data flow's parent data flow.
2. Right-click the embedded data flow object and select Update Schema.
For example, in the data flow shown below, Data Integrator copies the schema of Case to the input of EDF_ERP.

Matching data between parent and embedded data flow
The schema of an embedded data flow's input object can match the schema of the preceding object in the parent data flow by name or position. A match by position is the default.

To specify how schemas should be matched
1. Open the embedded data flow's parent data flow.
2. Right-click the embedded data flow object and select Match Schema > By Name or Match Schema > By Position.
The Match Schema option only affects settings for the current data flow.
Data Integrator also allows the schema of the preceding object in the parent data flow to have more or fewer columns than the embedded data flow. The embedded data flow ignores additional columns and reads missing columns as NULL. Columns in both schemas must have identical or convertible data types. See the section on "Type conversion" in the Data Integrator Reference Guide for more information.

Deleting embedded data flow objects
You can delete embedded data flow ports, or remove entire embedded data flows.

To remove a port
Right-click the input or output object within the embedded data flow and deselect Make Port. Data Integrator removes the connection to the parent object.
Note: You cannot remove a port simply by deleting the connection in the parent flow.

To remove an embedded data flow
Select it from the open parent data flow and choose Delete from the right-click menu or edit menu.
If you delete embedded data flows from the object library, the embedded data flow icon appears with a red circle-slash flag in the parent data flow. Delete these defunct embedded data flow objects from the parent data flows.

Testing embedded data flows
You might find it easier to test embedded data flows by running them separately as regular data flows.

To separately test an embedded data flow
1. Specify an XML file for the input port or output port.
When you use the Make Embedded Data Flow option, an input or output XML file object is created and then (optionally) connected to the preceding or succeeding object in the parent data flow. To test the XML file without a parent data flow, click the name of the XML file to open its source or target editor and specify a file name. For more configuration information, see the Data Integrator Reference Guide.
2. Put the embedded data flow into a job.
3. Run the job.
You can also use the following features to test embedded data flows:
• View Data, to sample data passed into an embedded data flow.
• Auditing statistics about the data read from sources, transformed, and loaded into targets, and rules about the audit statistics, to verify the expected data is processed.
For more information on both of these features, see Chapter 15: Design and Debug.

Troubleshooting embedded data flows
The following situations produce errors:
• Both an input port and output port are specified in an embedded data flow.
• Trapped defunct data flows. See "To remove an embedded data flow" on page 292.
• Deleted connection to the parent data flow while the Make Port option, in the embedded data flow, remains selected. See "To remove a port" on page 292.
• Transforms with splitters (such as the Case transform) specified as the output port object, because a splitter produces multiple outputs and embedded data flows can only have one.
• Variables and parameters declared in the embedded data flow that are not also declared in the parent data flow.
• Embedding the same data flow at any level within itself.
You can, however, have unlimited embedding levels. For example, DF1 data flow calls EDF1 embedded data flow, which calls EDF2.


Variables and Parameters

About this chapter
This chapter covers creating local and global variables for Data Integrator jobs. It also introduces the use of environment variables. This chapter contains the following topics:
• Overview
• The Variables and Parameters window
• Using local variables and parameters
• Using global variables
• Local and global variable rules
• Environment variables
• Setting file names at run-time using variables

Overview
You can increase the flexibility and reusability of work flows and data flows using local and global variables when you design your jobs. Variables are symbolic placeholders for values. The data type of a variable can be any supported by Data Integrator, such as an integer, decimal, date, or text string.
You can use variables in expressions to facilitate decision-making or data manipulation (using arithmetic or character substitution). For example, a variable can be used in a LOOP or IF statement to check a variable's value to decide which step to perform:
If $amount_owed > 0 print('$invoice.doc');
If you define variables in a job or work flow, Data Integrator typically uses them in a script, catch, or conditional process.

(Example work flow: the variables $AA and $BB, both of type int, are defined. A script runs: If $AA < 0 $AA = 0; $AA = $AA + $BB. A conditional tests the expression $AA >= $BB. A catch runs: If $BB < 0 $BB = 0; $BB = $AA + $BB.)
You can use variables inside data flows. For example, use them in a custom function or in the WHERE clause of a query transform.
In Data Integrator, local variables are restricted to the object in which they are created (job or work flow). You must use parameters to pass local variables to child objects (work flows and data flows). Global variables are restricted to the job in which they are created; however, they do not require parameters to be passed to work flows and data flows.
Parameters are expressions that pass to a work flow or data flow when they are called in a job.
You create local variables, parameters, and global variables using the Variables and Parameters window in the Designer. You can set values for local or global variables in script objects. You can also set global variable values using external job, execution, or schedule properties.
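As a sketch of these uses (the variable, datastore, table, and column names here are illustrative only, not part of the product), an initialization script might set a global variable that a query transform then references in its WHERE clause:

# Script at the start of the job
$G_Start_Date = sql('DS_Target', 'SELECT MAX(LOAD_DATE) FROM SALES_FACT');

# WHERE clause of a query transform in a data flow
SALES.ORDER_DATE > $G_Start_Date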

Local variable parameters can only be set at the work flow and data flow level.This document is part of a SAP study on PDF usage. The Variables and Parameters window contains two tabs. data type. For more information about setting global variable values in SOAP calls. or from the project area click an object to open it in the workspace. If there is no object selected. 12 Variables and Parameters The Variables and Parameters window Using global variables provides you with maximum flexibility. see the Data Integrator Management Console: Administrator Guide. Global variables can only be set at the job level. To view the variables and parameters in each job. select Variables. 298 Data Integrator Designer Guide . The Variables and Parameters window opens. 2. during production you can change values for default global variables at runtime from a job's schedule or SOAP call without having to open a job in the Designer. The Context box in the window changes to show the object you are viewing. and parameter type) for an object type. or data flow In the Tools menu. double-click an object. 1. Find out how you can participate and help to improve our documentation. For example. work flow. Variables can be used as file names for: • • • • • Flat file sources and targets XML file sources and targets XML message targets (executed in the Designer in test mode) IDoc file sources and targets (in an SAP R/3 environment) IDoc message sources and targets (SAP R/3 environment) The Variables and Parameters window Data Integrator displays the variables and parameters defined for an object in the Variables and Parameters window. the window does not indicate a context. From the object library. The Definitions tab allows you to create and view variables (name and data type) and parameters (name.

The following list shows what type of variables and parameters you can create using the Variables and Parameters window when you select different objects:
• Job — you can create local variables and global variables. Local variables are used by a script or conditional in the job; global variables can be used by any object in the job.
• Work flow — you can create local variables and parameters. Local variables are used by this work flow or passed down to other work flows or data flows using a parameter. Parameters are used by parent objects to pass local variables. Work flows may also return variables or parameters to parent objects.
• Data flow — you can create parameters. Parameters are used by a WHERE clause, column mapping, or a function in the data flow. Data flows cannot return output values.
The Calls tab allows you to view the name of each parameter defined for all objects in a parent object's definition. You can also enter values for each parameter. For the input parameter type, values in the Calls tab can be constants, variables, or another parameter. For the output or input/output parameter type, values in the Calls tab can be variables or parameters.
Values in the Calls tab must also use:
• The same data type as the variable if they are placed inside an input or input/output parameter type, and a compatible data type if they are placed inside an output parameter type
• Data Integrator scripting language rules and syntax
The following illustration shows the relationship between an open work flow called DeltaFacts, the Context box in the Variables and Parameters window, and the content in the Definition and Calls tabs.

(In the illustration, the definition of work flow WF_DeltaFacts is open in the workspace. The parent work flow, not shown, passes values to or receives values from WF_DeltaFacts. The Definition tab shows parameters defined in WF_DeltaFacts; the Calls tab shows parameters defined in WF_DeltaWrapB, which is called by WF_DeltaFacts. WF_DeltaFacts can pass values to or receive values from WF_DeltaWrapB using these parameters.)

Using local variables and parameters
To pass a local variable to another object, define the local variable, then from the calling object, create a parameter and map the parameter to the local variable by entering a parameter value. For example, to use a local variable inside a data flow, define the variable in a parent work flow and then pass the value of the variable as a parameter of the data flow.

Parameters
Parameters can be defined to:
• Pass their values into and out of work flows
• Pass their values into data flows
Each parameter is assigned a type: input, output, or input/output. The value passed by the parameter can be used by any object called by the work flow or data flow.
Note: You can also create local variables and parameters for use in custom functions. For more information, see the Data Integrator Reference Guide.

Passing values into data flows
You can use a value passed as a parameter into a data flow to control the data transformed in the data flow. For example, the data flow DF_PartFlow processes daily inventory values. It can process all of the part numbers in use or a range of part numbers based on external requirements, such as the range of numbers processed most recently.
If the work flow that calls DF_PartFlow records the range of numbers processed, it can pass the end value of the range, $EndRange, as a parameter to the data flow to indicate the start value of the range to process next. Data Integrator can calculate a new end value based on a stored number of parts to process each time, such as $SizeOfSet, and pass that value to the data flow as the end value. A query transform in the data flow uses the parameters passed in to filter the part numbers extracted from the source.
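For instance, if the data flow defines input parameters $StartRange and $EndRange (the parameter and column names here are illustrative), the query transform's WHERE clause might restrict the source rows like this:

PART.PART_NUM >= $StartRange AND PART.PART_NUM <= $EndRange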

The data flow could be used by multiple calls contained in one or more work flows to perform the same task on different part number ranges by specifying different parameters for the particular calls.

Defining local variables
Variables are defined in the Variables and Parameters window.

To define a local variable
1. Click the name of the job or work flow in the project area or workspace, or double-click one from the object library.
2. Choose Tools > Variables to open the Variables and Parameters window.
3. Go to the Definition tab.
4. Select Variables.
5. Right-click and choose Insert.
6. Select the new variable (for example, $NewVariable0).
7. Right-click and choose Properties.
8. Enter the name of the new variable.
The name can include any alpha or numeric character or underscores (_), but cannot contain blank spaces. Always begin the name with a dollar sign ($).
9. Select the data type for the variable.
10. Click OK.

Defining parameters
There are two steps for setting up a parameter for a work flow or data flow:
• Add the parameter definition to the flow.
• Set the value of the parameter in the flow call.

To add the parameter to the flow definition
1. Click the name of the work flow or data flow.
2. Open the Variables and Parameters window.
3. Go to the Definition tab.
4. Select Parameters.
5. Right-click and choose Insert.
6. Select the new parameter (for example, $NewArgument1).
7. Right-click and choose Properties.
8. Enter the name of the parameter using alphanumeric characters with no blank spaces.
9. Select the parameter type (input, output, or input/output).
10. Select the data type for the parameter.
The parameter must have the same data type as the variable if it is an input or input/output parameter; it must have a compatible data type if it is an output parameter type.
11. Click OK.

To set the value of the parameter in the flow call
1. Open the calling job, work flow, or data flow.
2. Open the Variables and Parameters window.
3. Select the Calls tab.
The Calls tab shows all the objects that are called from the open job, work flow, or data flow.
4. Click the plus sign (+) next to the object that contains the parameter you want to set.
A list of parameters passed to that object appears.
5. Select the parameter, right-click, and choose Properties.
6. Enter the expression the parameter will pass in the Value box.

If the parameter type is input, then its value can be an expression that contains a constant (for example, 0 or 'string1'), a variable, or another parameter (for example, $startID or $parm1). If the parameter type is output or input/output, then the value is a variable or parameter.
Use the following syntax to indicate special values:
• Variable: $variable_name
• String: 'string'
7. Click OK.

Using global variables
Global variables are global within a job. Setting parameters is not necessary when you use global variables. However, once you use a name for a global variable in a job, that name becomes reserved for the job. Global variables are exclusive within the context of the job in which they are created.
This section discusses:
• Creating global variables
• Viewing global variables
• Setting global variable values

Creating global variables
Define variables in the Variables and Parameters window.

To create a global variable
1. Click the name of a job in the project area or double-click a job from the object library.
2. Choose Tools > Variables to open the Variables and Parameters window.
3. Go to the Definition tab.
4. Right-click Global Variables to open a shortcut menu.
5. From the shortcut menu, click Insert.
$NewJobGlobalVariable appears inside the global variables tree.

6. Right-click $NewJobGlobalVariable and select Properties from the shortcut menu.
   The Global Variable Properties window opens.
7. Rename the variable and select a data type.
8. Click OK.
   The Variables and Parameters window displays the renamed global variable.

Viewing global variables
Global variables, defined in a job, are visible to those objects relative to that job. A global variable defined in one job is not available for modification or viewing from another job.
You can view global variables from the Variables and Parameters window (with an open job in the workspace) or from the Properties dialog of a selected job.

To view global variables in a job from the Properties dialog
1. In the object library, select the Jobs tab.
2. Right-click and select Properties.
3. Click the Global Variables tab.
   Global variables appear on this tab.

Setting global variable values
In addition to setting a variable inside a job using an initialization script, you can set and maintain global variable values outside a job. Values for global variables can be set outside a job:
• As a job property
• As an execution or schedule property
Note: You cannot pass global variables as command line arguments for real-time jobs.
Global variables without defined values are also allowed. They are read as NULL.
All values defined as job properties are shown in the Properties and the Execution Properties dialogs of the Designer and in the Execution Options and Schedule pages of the Administrator. By setting values outside a job, you can rely on these dialogs for viewing values set for global variables and easily edit values when testing or scheduling a job.
Values set outside a job are processed the same way as those set in an initialization script. However, if you set a value for the same variable both inside and outside a job, the internal value will override the external job value.

To set a global variable value as a job property
1. Right-click a job in the object library or project area.
2. Click Properties.
3. Click the Global Variable tab.
   All global variables created in the job appear.
4. Enter values for the global variables in this job.
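For example, the entries on the Global Variable tab might look like the following; the variable names and values are illustrative only, and the expression in the last row assumes the sysdate() function described in the Data Integrator Reference Guide:

    Name           Value
    $YEAR          2003
    $MONTH         'JANUARY'
    $START_DATE    sysdate() - 30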

   You can use any statement used in a script with this option. See the Data Integrator Reference Guide for syntax information and example scripts.
5. Click OK.
   Data Integrator saves values in the repository as job properties.
   You can also view and edit these default values in the Execution Properties dialog of the Designer and in the Execution Options and Schedule pages of the Administrator. This allows you to override job property values at run-time.

To set a global variable value as an execution property
1. Execute a job from the Designer, or execute or schedule a batch job from the Administrator.
   Note: For testing purposes, you can execute real-time jobs from the Designer in test mode. Make sure to set the execution properties for a real-time job.

2. View the global variables in the job and their default values (if available).


   If no global variables exist in a job, the Global Variable sections in these windows do not appear.
3. Edit values for global variables as desired.
4. If you are using the Designer, click OK. If you are using the Administrator, click Execute or Schedule.

The job runs using the values you enter. Values entered as execution properties are not saved. Values entered as schedule properties are saved but can only be accessed from within the Administrator.

Automatic ranking of global variable values in a job
Using the methods described in the previous section, if you enter different values for a single global variable, Data Integrator selects the highest ranking value for use in the job. A value entered as a job property has the lowest rank. A value defined inside a job has the highest rank.
• If you set a global variable value as both a job and an execution property, the execution property value overrides the job property value and becomes the default value for the current job run. You cannot save execution property global variable values.
  For example, assume that a job, JOB_Test1, has three global variables declared: $YEAR, $MONTH, and $DAY. Variable $YEAR is set as a job property with a value of 2003.
  For the job run, you set variables $MONTH and $DAY as execution properties to values 'JANUARY' and 31 respectively. Data Integrator executes a list of statements which includes default values for JOB_Test1:
    $YEAR=2003;
    $MONTH='JANUARY';
    $DAY=31;
  For the second job run, if you set variables $YEAR and $MONTH as execution properties to values 2002 and 'JANUARY' respectively, then the statement $YEAR=2002 will replace $YEAR=2003. Consequently, Data Integrator executes the following list of statements:
    $YEAR=2002;
    $MONTH='JANUARY';
  Note: In this scenario, $DAY is not defined and Data Integrator reads it as NULL. You set $DAY to 31 during the first job run; however, execution properties for global variable values are not saved.
• If you set a global variable value for both a job property and a schedule property, the schedule property value overrides the job property value and becomes the external, default value for the current job run.
  Data Integrator saves schedule property values in the repository. However, these values are only associated with a job schedule, not the job itself. Consequently, these values are viewed and edited from within the Administrator.

• A global variable value defined inside a job always overrides any external values. However, the override does not occur until Data Integrator attempts to apply the external values to the job being processed with the internal value. Up until that point, Data Integrator processes execution, schedule, or job property values as default values.
  For example, suppose you have a job called JOB_Test2 that has three work flows, each containing a data flow. The first and third data flows have the same global variable with no value defined. The second data flow is inside a work flow that is preceded by a script in which $MONTH is defined as 'MAY'.
  The execution property $MONTH = 'APRIL' is the global variable value. In this scenario, 'APRIL' becomes the default value for the job. 'APRIL' remains the value for the global variable until it encounters the other value for the same variable in the second work flow. Since the value in the script is inside the job, 'MAY' overrides 'APRIL' for the variable $MONTH. Data Integrator continues processing the job with this new value.
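A minimal sketch of the script described above, the one that precedes the second data flow in JOB_Test2; the print() calls are illustrative additions and are not part of the original example:

    # Script in the second work flow of JOB_Test2
    print('Before the script runs, $MONTH = [$MONTH]');   # 'APRIL', from the execution property
    $MONTH = 'MAY';                                        # a value set inside the job overrides the external value
    print('After the script runs, $MONTH = [$MONTH]');     # 'MAY' for the rest of the job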

Advantages to setting values outside a job
While you can set values inside jobs, there are advantages to defining values for global variables outside a job. By setting values outside a job, you can rely on these dialogs for viewing all global variables and their values: values defined as job properties are shown in the Properties and the Execution Properties dialogs of the Designer and in the Execution Options and Schedule pages of the Administrator. You can also easily edit them for testing and scheduling.
In the Administrator, you can set global variable values when creating or editing a schedule without opening the Designer. For example, use global variables as file names and start and end dates.

Local and global variable rules
When defining local or global variables, consider rules for:
• Naming
• Replicating jobs and work flows
• Importing and exporting
For information about how Data Integrator processes variables in work flows with multiple conditions like execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.

Naming
• Local and global variables must have unique names within their job context.
• Any name modification to a global variable can only be performed at the job level.

Replicating jobs and work flows
• When you replicate all objects, the local and global variables defined in that job context are also replicated.
• When you replicate a data flow or work flow, all parameters and local and global variables are also replicated. However, you must validate these local and global variables within the job context in which they were created. If you attempt to validate a data flow or work flow containing global variables without a job, Data Integrator reports an error.

Importing and exporting
• When you export a job object, you also export all local and global variables defined for that job.
• When you export a lower-level object (such as a data flow) without the parent job, the global variable is not exported. Only the call to that global variable is exported. If you use this object in another job without defining the global variable in the new job, a validation error will occur.

Environment variables
You can use system-environment variables inside Data Integrator jobs, work flows, or data flows. The get_env, set_env, and is_set_env functions provide access to underlying operating system variables that behave as the operating system allows.
You can temporarily set the value of an environment variable inside a job, work flow, or data flow. Once set, the value is visible to all objects in that job.
Use the get_env, set_env, and is_set_env functions to set, retrieve, and test the values of environment variables. For more information about these functions, see the Data Integrator Reference Guide.

Setting file names at run-time using variables
You can set file names at runtime by specifying a variable as the file name.
Variables can be used as file names for:
• The following sources and targets:
  • Flat files
  • XML files and messages
  • IDoc files and messages (in an SAP R/3 environment)
• The lookup_ext function (for a flat file used as a translate table parameter)

To use a variable in a flat file name
1. Create a local or global variable using the Variables and Parameters window.
2. Create a script to set the value of a local or global variable, or call a system environment variable (a sketch of such a script follows these steps).
3. Declare the variable in the file format editor or in the Function editor as a lookup_ext parameter.
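A minimal sketch of step 2, assuming a global variable $FILEINPUT and a hypothetical system environment variable named SRC_DIR; the path, file name, and the assumption that is_set_env returns 1 when the variable is set are illustrative only (see the Data Integrator Reference Guide for the exact function behavior):

    # Script step: build the input file name, optionally from an environment variable
    $FILEINPUT = ifthenelse(is_set_env('SRC_DIR') = 1,
                            get_env('SRC_DIR') || '/orders.dat',
                            'd:/data/orders.dat');
    print('Input file(s): [$FILEINPUT]');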

• When you set a variable value for a flat file, specify both the file name and the directory name. Enter the variable in the File(s) property under Data File(s) in the File Format Editor. You cannot enter a variable in the Root directory property.
• For lookups, substitute the path and file name in the Translate table box in the lookup_ext function editor with the variable name. For more information, see the Data Integrator Reference Guide.
The following script shows how you can set values for variables in flat file sources and targets:

    $FILEINPUT = 'd:/version5.0/Vfilenames/goldlog/KNA1comma.*, d:/version5.0/Vfilenames/goldlog/KNA1c?mma.in';
    $FILEOUTPUT = 'd:/version5.0/Vfilenames/work/VF0015.out';

When you use variables as sources and targets, you can also use multiple file names and wild cards. Neither is supported when using variables in the lookup_ext function.
The script above provides an example of how to use multiple file names and wild cards. Notice that the $FILEINPUT variable includes two file names (separated by a comma). The two names (KNA1comma.* and KNA1c?mma.in) also make use of the wild cards (* and ?) supported by Data Integrator.
See the Data Integrator Reference Guide for more information about creating scripts.


Chapter 13: Executing Jobs

About this chapter
This chapter contains the following topics:
• Overview of Data Integrator job execution
• Preparing for job execution
• Executing jobs as immediate tasks
• Debugging execution errors
• Changing Job Server options

Overview of Data Integrator job execution
You can run Data Integrator jobs in three different ways. Depending on your needs, you can configure:
• Immediate jobs
  Data Integrator initiates both batch and real-time jobs and runs them immediately from within the Data Integrator Designer. For these jobs, both the Designer and designated Job Server (where the job executes, usually many times on the same machine) must be running. You will most likely run immediate jobs only during the development cycle.
• Scheduled jobs
  Batch jobs are scheduled. To schedule a job, use the Data Integrator Administrator or use a third-party scheduler.
  When jobs are scheduled by third-party software:
  • The job initiates outside of Data Integrator.
  • The job operates on a batch job (or shell script for UNIX) that has been exported from Data Integrator.
  When a job is invoked by a third-party scheduler:
  • The corresponding Job Server must be running.
  • The Data Integrator Designer does not need to be running.
• Services
  Real-time jobs are set up as services that continuously listen for requests from an Access Server and process requests on-demand as they are received. Use the Data Integrator Administrator to create a service from a real-time job.

Preparing for job execution
Follow these preparation procedures before you execute, schedule, or export a job to be executed as a scheduled task:
• Validating jobs and job components
• Ensuring that the Job Server is running
• Setting job execution options

Validating jobs and job components
You can set the Designer options (Tools > Options > Designer > General) to validate jobs started in Designer before job execution. The default is not to validate.
You can also explicitly validate jobs and their components as you create them by:
• Clicking the Validate All button from the toolbar (or choosing Validate > All Objects in View from the Debug menu). This command checks the syntax of the object definition for the active workspace and for all objects that are called from the active workspace view recursively.
• Clicking the Validate Current View button from the toolbar (or choosing Validate > Current View from the Debug menu). This command checks the syntax of the object definition for the active workspace.
Data Integrator also validates jobs before exporting them.
If during validation Data Integrator discovers an error in an object definition, it opens a dialog box indicating that an error exists, then opens the Output window to display the error.
If there are errors, double-click the error in the Output window to open the editor of the object containing the error.
If you are unable to read the complete error text in the window, you can access additional information by right-clicking the error listing and selecting View from the context menu.

Error messages have these levels of severity:

Severity      Description
Information   Informative message only; does not prevent the job from running. No action is required.
Warning       The error is not severe enough to stop job execution, but you might get unexpected results. For example, if the data type of a source column in a transform within a data flow does not match the data type of the target column in the transform, Data Integrator alerts you with a warning message.
Error         The error is severe enough to stop job execution. You must fix the error before the job will execute.

Ensuring that the Job Server is running
Before you execute a job (either as an immediate or scheduled task), ensure that the Job Server is associated with the repository where the client is running.
When the Designer starts, it displays the status of the Job Server for the repository to which you are connected:
• Job Server is running
• Job Server is inactive
The name of the active Job Server and port number display in the status bar when you move the cursor over the Job Server icon.

Setting job execution options
Options for jobs include Debug and Trace. Although these are object options (they affect the function of the object), they are located in either the Property or the Execution window associated with the job.

Execution options for jobs can either be set for a single instance or as a default value.
• The right-click Execute menu sets the options for a single execution only and overrides the default settings.
• The right-click Properties menu sets the default settings.

To set execution options for every execution of the job
1. From the Project area, right-click the job name and choose Properties.
2. Select options on the Properties window:
   • For an introduction to object properties, see “Viewing and changing object properties” on page 53.
   • For information about Debug and Trace properties, see the Data Integrator Reference Guide.
   • For more information about using the Global Variable tab, see the Data Integrator Designer Guide.

Executing jobs as immediate tasks
Immediate or “on demand” tasks are initiated from the Data Integrator Designer. Both the Designer and Job Server must be running for the job to execute.

To execute a job as an immediate task
1. In the project area, select the job name.
2. Right-click and choose Execute.
   Data Integrator prompts you to save any objects that have changes that have not been saved.
3. The next step depends on whether you selected the Perform complete validation before job execution check box in the Designer Options (see “Designer — General” on page 67):
   • If you have not selected this check box, a window opens showing execution properties (debug and trace) for the job. Proceed to the next step.
   • If you have selected this check box, Data Integrator validates the job before it runs. You must correct any serious errors before the job will run. There might also be warning messages, for example messages indicating that date values will be converted to datetime

     values. Correct them if you want (they will not prevent job execution) or click OK to continue.
     After the job validates, a window opens showing the execution properties (debug and trace) for the job.
4. Set the execution properties.
   You can choose the Job Server that you want to process this job, datastore profiles for sources and targets if applicable, enable automatic recovery, override the default trace properties, or select global variables at runtime.
   Note: Setting execution properties here affects a temporary change for the current execution only.
   For more information, see:
   • the Data Integrator Reference Guide
   • “Setting global variable values” on page 306
5. Click OK.
   As Data Integrator begins execution, the execution window opens with the trace log button active.
   Use the buttons at the top of the log window to display the trace log, monitor log, and error log (if there are any errors). For more information about execution logs, see “Debugging execution errors” on page 324.

After the job is complete, use an RDBMS query tool to check the contents of the target table or file. See “Examining target data” on page 329.

Monitor tab
The Monitor tab lists the trace logs of all current or most recent executions of a job.
The traffic-light icons in the Monitor tab have the following meanings:
• A green light indicates that the job is running.
  You can right-click and select Kill Job to stop a job that is still running.
• A red light indicates that the job has stopped.
  You can right-click and select Properties to add a description for a specific trace log. This description is saved with the log, which can be accessed later from the Log tab.
• A red cross indicates that the job encountered an error.

Log tab
You can also select the Log tab to view a job’s trace log history.

Click on a trace log to open it in the workspace.
Use the trace, monitor, and error log icons (left to right at the top of the job execution window in the workspace) to view each type of available log for the date and time that the job was run.

Debugging execution errors
The following table lists tools that can help you understand execution errors:

Tool          Definition
Trace log     Itemizes the steps executed in the job and the time execution began and ended.
Monitor log   Displays each step of each data flow in the job, the number of rows streamed through each step, and the duration of each step.
Error log     Displays the name of the object being executed when a Data Integrator error occurred and the text of the resulting error message. If the job ran against SAP data, some of the ABAP errors are also available in the Data Integrator error log.
Target data   Always examine your target data to see if your job produced the results you expected.

The following sections describe how to use these tools:
• Using Data Integrator logs

• Examining trace logs
• Examining monitor logs
• Examining error logs
• Examining target data

Using Data Integrator logs
This section describes how to use Data Integrator logs in the Designer. For information about administering logs from the Administrator, see the Data Integrator Management Console: Administrator Guide.

To access a log during job execution
If your Designer is running when job execution begins, the execution window opens automatically, displaying the trace log information.
Use the monitor and error log icons (middle and right icons at the top of the execution window) to view these logs.
• To open the trace log on job execution, select Tools > Options > Designer > General > Open monitor on job execution.
• To copy log content from an open log, select one or multiple lines and use the key commands [Ctrl+C].

The execution window stays open until you explicitly close it.

To access a log after the execution window has been closed
1. In the project area, click the Log tab.
2. Click a job name to view all trace, monitor, and error log files in the workspace. Alternatively, expand the job you are interested in to view the list of trace log files and click one.
   Log indicators (icons shown next to each log) signify the following:
   • The job executed successfully on this explicitly selected Job Server.
   • The job executed successfully by a server group. The Job Server listed executed the job.
   • The job encountered an error on this explicitly selected Job Server.
   • The job encountered an error while being executed by a server group. The Job Server listed executed the job.

3. Click the log icon for the execution of the job you are interested in. (Identify the execution from the position in sequence or datetime stamp.)
4. Use the list box to switch between log types or to view No logs or All logs.

To delete a log
You can set how long to keep logs in the Data Integrator Administrator. For more information, see the Data Integrator Management Console: Administrator Guide.
If you want to delete logs from the Designer manually:
1. In the project area, click the Log tab.
2. Right-click the log you want to delete and select Delete Log.

Examining trace logs
Use the trace logs to determine where an execution failed, whether the execution steps occur in the order you expect, and which parts of the execution are the most time consuming.
For information about examining trace logs from the Administrator, see the Data Integrator Management Console: Administrator Guide.
The following figure shows an example of a trace log.

Examining monitor logs
The monitor log quantifies the activities of the components of the job. It lists the time spent in a given component of a job and the number of data rows that streamed through the component.
The following screen shows an example of a monitor log.

Examining error logs
Data Integrator produces an error log for every job execution. Use the error logs to determine how an execution failed. If the execution completed without error, the error log is blank.
The following screen shows an example of an error log.

Examining target data
The best measure of the success of a job is the state of the target data. Always examine your data to make sure the data movement operation produced the results you expect. Be sure that:
• Data was not converted to incompatible types or truncated.
• Data was not duplicated in the target.
• Data was not lost between updates of the target.
• Generated keys have been properly incremented.
• Updated values were handled properly.

Changing Job Server options
There are many options available in Data Integrator for troubleshooting and tuning a job. After you familiarize yourself with the more technical aspects of how Data Integrator handles data (using the Data Integrator Reference Guide) and some of its interfaces like those for adapters and SAP R/3, you might want to return to the Designer and change values for the following Job Server options.

Table 13-1: Job Server Options

Adapter Data Exchange Timeout (default: 10800000, or 3 hours)
  (For adapters) Defines the time a function call or outbound message will wait for the response from the adapter operation.

Adapter Start Timeout (default: 90000, or 90 seconds)
  (For adapters) Defines the time that the Administrator or Designer will wait for a response from the Job Server that manages adapters (start/stop/status).

AL_JobServerLoadBalanceDebug (default: FALSE)
  Enables a Job Server to log server group information if the value is set to TRUE. Information is saved in: $LINK_DIR/log/<JobServerName>/server_eventlog.txt

AL_JobServerLoadOSPolling (default: 60)
  Sets the polling interval (in seconds) that Data Integrator uses to get status information used to calculate the load balancing index. This index is used by server groups.

Display DI Internal Jobs (default: FALSE)
  Displays Data Integrator’s internal datastore CD_DS_d0cafae2 and its related jobs in the object library. The CD_DS_d0cafae2 datastore supports two internal jobs. The first calculates usage dependencies on repository tables and the second updates server group configurations.
  If you change your repository password, user name, or other connection information, change the default value of this option to TRUE, close and reopen the Designer, then update the CD_DS_d0cafae2 datastore configuration to match your new repository configuration. This enables the calculate usage dependency job (CD_JOBd0cafae2) and the server group job (di_job_al_mach_info) to run without a connection error.

FTP Number of Retry (default: 0)
  Sets the number of retries for an FTP connection that initially fails.

FTP Retry Interval (default: 1000)
  Sets the FTP connection retry interval in milliseconds.

Global_DOP (default: 1)
  Sets the Degree of Parallelism for all data flows run by a given Job Server. You can also set the Degree of parallelism for individual data flows from each data flow’s Properties window. If a data flow’s Degree of parallelism value is 0, then the Job Server will use the Global_DOP value. The Job Server will use the data flow’s Degree of parallelism value if it is set to any value except zero because it overrides the Global_DOP value. For more information, see the Data Integrator Performance Optimization Guide.

Ignore Reduced Msg Type (default: FALSE)
  (For SAP R/3) Disables IDoc reduced message type processing for all message types if the value is set to TRUE.

Ignore Reduced Msg Type_foo (default: FALSE)
  (For SAP R/3) Disables IDoc reduced message type processing for a specific message type (such as foo) if the value is set to TRUE.

OCI Server Attach Retry (default: 3)
  The engine calls the Oracle OCIServerAttach function each time it makes a connection to Oracle. If the engine calls this function too fast (processing parallel data flows, for example), the function may fail. To correct this, increase the retry value to 5.

Splitter Optimization (default: FALSE)
  If you create a job in which a file source feeds into two queries, Data Integrator might hang. If this option is set to TRUE, the engine internally creates two source files that feed the two queries instead of a splitter that feeds the two queries.

Use Explicit Database Links (default: TRUE)
  Jobs with imported database links normally will show improved performance because Data Integrator uses these links to push down processing to a database. If you set this option to FALSE, all data flows will not use linked datastores.
  The use of linked datastores can also be disabled from any data flow properties dialog. The data flow level option takes precedence over this Job Server level option. For more information, see the Data Integrator Performance Optimization Guide.

Use Domain Name (default: TRUE)
  Adds a domain name to a Job Server name in the repository. This creates a fully qualified server name and allows the Designer to locate a Job Server on a different domain.

To change option values for an individual Job Server
1. Select the Job Server you want to work with by making it your default Job Server.
   a. Select Tools > Options > Designer > Environment.
   b. Select a Job Server from the Default Job Server section.
   c. Click OK.
2. Select Tools > Options > Job Server > General.
3. Enter the section and key you want to use from the following list of value pairs:

   Section        Key
   int            AdapterDataExchangeTimeout
   int            AdapterStartTimeout
   AL_JobServer   AL_JobServerLoadBalanceDebug
   AL_JobServer   AL_JobServerLoadOSPolling
   string         DisplayDIInternalJobs
   AL_Engine      FTPNumberOfRetry
   AL_Engine      FTPRetryInterval
   AL_Engine      Global_DOP
   AL_Engine      IgnoreReducedMsgType
   AL_Engine      IgnoreReducedMsgType_foo
   AL_Engine      OCIServerAttach_Retry

6. To save the settings and close the Options window. enter the following to change the default value for the number of times a Job Server will retry to make an FTP connection if it initially fails: These settings will change the default value for the FTPNumberOfRetry option from zero to two. as needed. Enter a value. 5. Find out how you can participate and help to improve our documentation. 13 Executing Jobs Changing Job Server options Section AL_Engine AL_Engine Repository Key SPLITTER_OPTIMIZATION UseExplicitDatabaseLinks UseDomainName 4.This document is part of a SAP study on PDF usage. Re-select a default Job Server by repeating step 1. click OK. 332 Data Integrator Designer Guide . For example.

Chapter 14: Data Quality

Chapter overview
With operational systems frequently changing, data quality control becomes critical in your extract, transform and load (ETL) jobs. The Data Integrator Designer provides data quality controls that act as a firewall to identify and fix errors in your data. These features can help ensure that you have “trusted” information.
The Data Integrator Designer provides the following features that you can use to determine and improve the quality and structure of your source data:
• Use the Data Profiler to determine:
  • The quality of your source data before you extract it. The Data Profiler can identify anomalies in your source data to help you better define corrective actions in the validation transform, data cleansing or other transforms.
  • The distribution, relationship, and structure of your source data to better design your Data Integrator jobs and data flows, as well as your target data warehouse.
  • The content of your source and target data so that you can verify that your data extraction job returns the results you expect.
• Use the View Data feature to:
  • View your source data before you execute a job to help you create higher quality job designs.
  • Compare sample data from different steps of your job to verify that your data extraction job returns the results you expect.
• Use the Validation transform to:
  • Verify that your source data meets your business rules.
  • Take appropriate actions when the data does not meet your business rules.
• Use the auditing data flow feature to:
  • Define rules that determine if a source, transform, or target object processes correct data.
  • Define the actions to take when an audit rule fails.
• Use data cleansing transforms to improve the quality of your data. For more information, see Chapter 18: Data Cleansing.
• Use Data Validation dashboards in the Metadata Reporting tool to evaluate the reliability of your target data based on the validation rules you created in your Data Integrator batch jobs. This feedback allows business users to quickly review, assess, and identify potential inconsistencies or errors in source data.

For more information about Data Validation dashboards, see the Data Integrator Management Console: Metadata Reports User’s Guide.
This chapter contains the following topics:
• Using the Data Profiler
• Using View Data to determine data quality
• Using the Validation transform
• Using Auditing

Using the Data Profiler
The Data Profiler executes on a profiler server to provide the following data profiler information that multiple users can view:
• Column analysis: The Data Profiler provides two types of column profiles:
  • Basic profiling: This information includes minimum value, maximum value, average value, minimum string length, and maximum string length.
  • Detailed profiling: Detailed column analysis includes distinct count, distinct percent, median, median string length, pattern count, and pattern percent.
• Relationship analysis: This information identifies data mismatches between any two columns for which you define a relationship, including columns that have an existing primary key and foreign key relationship. You can save two levels of data:
  • Save the data only in the columns that you select for the relationship.
  • Save the values in all columns in each row.
For the most recent list of profile information, refer to the Data Integrator Release Notes.
The topics in this section include:
• Connecting to the profiler server
• Profiler statistics
• Executing a profiler task
• Monitoring profiler tasks using the Designer
• Viewing the profiler results

Data sources that you can profile
You can execute the Data Profiler on data contained in the following sources. See the Data Integrator Release Notes for the complete list of sources that the Data Profiler supports.
• Databases, which include:
  • Attunity Connector for mainframe databases
  • DB2
  • Oracle
  • SQL Server
  • Sybase IQ
  • Teradata
• Applications, which include:
  • JDE One World
  • JDE World
  • Oracle Applications
  • PeopleSoft
  • SAP R/3
  • Siebel
• Flat files

Connecting to the profiler server
You must install and configure the profiler server before you can use the Data Profiler. For details, see the Data Integrator Management Console: Administrator Guide.
The Data Integrator Designer must connect to the profiler server to run the Data Profiler and view the profiler results. You provide this connection information on the Profiler Server Login window.

To connect to a Data Profiler Server from the Data Integrator Designer
1. Use one of the following methods to invoke the Profiler Server Login window:
   • From the tool bar menu, select Tools > Profiler Server Login.
   • On the bottom status bar, double-click the Profiler Server icon, which is to the right of the Job Server icon.

2. In the Profiler Server Login window, enter the Data Profiler Server connection information.

   Field   Description
   Host    The name of the computer where the Data Profiler Server exists.
   Port    Port number through which the Designer connects to the Data Profiler Server.

3. Click Test to validate the Profiler Server location.
   If the host name is valid, you receive a message that indicates that the profiler server is running.
   Note: When you click Test, the drop-down list in User Name displays the user names that belong to the profiler server. To add profiler users, see the Data Integrator Management Console: Administrator Guide.

4. Enter the user information in the Profiler Server Login window.

   Field       Description
   User Name   The user name for the Profiler Server login. You can select a user name from the drop-down list or enter a new name.
   Password    The password for the Profiler Server login.

5. Click Connect.
   When you successfully connect to the profiler server, the Profiler Server icon on the bottom status bar no longer has the red X on it. In addition, when you move the pointer over this icon, the status bar displays the location of the profiler server.

Profiler statistics
You can calculate and generate two types of data profiler statistics:
• Column profile
• Relationship profile

Column profile
You can generate statistics for one or more columns. The columns can all belong to one data source or to multiple data sources. If you generate statistics for multiple sources in one profile task, all sources must be in the same datastore.
The Data Profiler provides two types of column profiles:
• Basic profiling
• Detailed profiling
This section also includes Examples of using column profile statistics to improve data quality.

Basic profiling
By default, the Data Profiler generates the following basic profiler attributes for each column that you select. For details, see “Submitting column profiler tasks” on page 342.

   Basic Attribute         Description
   Min                     Of all values, the lowest value in this column.
   Min count               Number of rows that contain this lowest value in this column.
   Max                     Of all values, the highest value in this column.
   Max count               Number of rows that contain this highest value in this column.
   Average                 For numeric columns, the average value in this column.
   Min string length       For character columns, the length of the shortest string value in this column.
   Max string length       For character columns, the length of the longest string value in this column.
   Average string length   For character columns, the average length of the string values in this column.
   Nulls                   Number of NULL values in this column.
   Nulls %                 Percentage of rows that contain a NULL value in this column.
   Zeros                   Number of 0 values in this column.

   Zeros %                 Percentage of rows that contain a 0 value in this column.
   Blanks                  For character columns, the number of rows that contain a blank in this column.
   Blanks %                Percentage of rows that contain a blank in this column.

Detailed profiling
You can generate more detailed attributes in addition to the above attributes, but detailed attributes generation consumes more time and computer resources. Therefore, Business Objects recommends that you do not select the detailed profile unless you need the following attributes:

   Detailed Attribute      Description
   Median                  The value that is in the middle row of the source table.
   Median string length    For character columns, the value that is in the middle row of the source table.
   Distincts               Number of distinct values in this column.
   Distinct %              Percentage of rows that contain each distinct value in this column.
   Patterns                Number of different patterns in this column.
   Pattern %               Percentage of rows that contain each pattern in this column.

For more information, see the following:
• For the most recent list of profiler attributes, see the Data Integrator Release Notes.
• To generate the profiler attributes, see “Submitting column profiler tasks” on page 342.
• To view the profiler attributes, see “Viewing column profile data” on page 350.

Examples of using column profile statistics to improve data quality
You can use the column profile attributes to assist you in different tasks, including the following tasks:
• Obtain basic statistics, frequencies, ranges, and outliers. For example, these profile statistics might show that a column value is markedly higher than the other values in a data source. You might then decide to define a validation transform to set a flag in a different table when you load this outlier into the target table.

• Identify variations of the same content. For example, part number might be an integer data type in one data source and a varchar data type in another data source. You might then decide which data type you want to use in your target data warehouse.
• Discover data patterns and formats. For example, the profile statistics might show that phone number has several different formats. With this profile information, you might decide to define a validation transform to convert them all to use the same target format.
• Analyze the numeric range. For example, customer number might have one range of numbers in one source, and a different range in another source. Your target will need to have a data type that can accommodate the maximum range.
• Identify missing information, nulls, and blanks in the source system. For example, the profile statistics might show that nulls occur for fax number. You might then decide to define a validation transform to replace the null value with a phrase such as “Unknown” in the target table.

Relationship profile
A relationship profile shows the percentage of non matching values in columns of two sources. The sources can be:
• Tables
• Flat files
• A combination of a table and a flat file
The key columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).
You can choose between two levels of relationship profiles to save:
• Save key columns data only
  By default, the Data Profiler saves the data only in the columns that you select for the relationship. For details, see “Submitting relationship profiler tasks” on page 346.
• Save all columns data
  You can save the values in the other columns in each row, but this processing will take longer and consume more computer resources to complete.
When you view the relationship profile results, you can drill down to see the actual data that does not match (see “Viewing the profiler results”).
You can use the relationship profile to assist you in different tasks, including the following tasks:

• Identify missing data in the source system. For example, one data source might include region, but another source might not.
• Identify redundant data across data sources. For example, duplicate names and addresses might exist between two sources or no name might exist for an address in one source.
• Validate relationships across data sources. For example, two different problem tracking systems might include a subset of common customer-reported problems, but some problems only exist in one system or the other.

Executing a profiler task
The Data Profiler allows you to calculate profiler statistics for any set of columns you choose.
Note: This optional feature is not available for columns with nested schemas, LONG or TEXT data type.
You can execute the following profiler tasks:
• Submitting column profiler tasks
• Submitting relationship profiler tasks
You cannot execute a column profile task with a relationship profile task.

Submitting column profiler tasks
For a list of profiler attributes that the Data Profiler generates, see “Column profile” on page 339.

To generate profile statistics for columns in one or more data sources
1. In the Object Library of the Data Integrator Designer, you can select either a table or flat file.
   For a table, go to the Datastores tab and select a table. If you want to profile all tables within a datastore, select the datastore name. To select a subset of tables in the datastore tab, hold down the Ctrl key as you select each table.
   For a flat file, go to the Formats tab and select a file. To select multiple files in the Formats tab, hold down the Ctrl key as you select each file.
2. After you select your data source, you can generate column profile statistics in one of the following ways:
   • Right-click and select Submit Column Profile Request.
     Reasons to submit profile tasks this way include:

     • Some of the profile statistics can take a long time to calculate.
     • The profile task runs asynchronously and you can perform other Designer tasks while the profile task executes.
     • You can profile multiple sources in one profile task.
   • Right-click, select View Data, click the Profile tab, and click Update.
     This option submits a synchronous profile task, and you must wait for the task to complete before you can perform other tasks on the Designer.
     You might want to use this option if you are already on the View Data window and you notice that either:
     • The profile statistics have not yet been generated, or
     • The date that the profile statistics were generated is older than you want.
3. (Optional) Edit the profiler task name.
   The Data Profiler generates a default name for each profiler task. You can edit the task name to create a more meaningful name, a unique name, or to remove dashes, which are allowed in column names but not in task names.
   If you select one source, the default name has the following format:
      username_t_sourcename
   If you select multiple sources, the default name has the following format:
      username_t_firstsourcename_lastsourcename

   Column            Description
   username          Name of the user that Data Integrator uses to access system services.
   t                 Type of profile. The value is C for column profile that obtains attributes (such as low value and high value) for each selected column.
   firstsourcename   Name of first source in alphabetic order.
   lastsourcename    Name of last source in alphabetic order if you select multiple sources.

4. If you select a single source, the Submit Column Profile Request window lists the columns and data types.
   Keep the check in front of each column that you want to profile and remove the check in front of each column that you do not want to profile.

   Alternatively, you can click the check box at the top in front of Name to deselect all columns and then select the check boxes.
5. If you selected multiple sources, the Submit Column Profiler Request window lists the sources on the left.
   a. Select a data source to display its columns on the right side.

   b. On the right side of the Submit Column Profile Request window, keep the check in front of each column that you want to profile, and remove the check in front of each column that you do not want to profile.
      Alternatively, you can click the check box at the top in front of Name to deselect all columns and then select the individual check box for the columns you want to profile.
   c. Repeat steps a and b for each data source.
6. (Optional) Select Detailed profiling for a column.
   Note: The Data Profiler consumes a large amount of resources when it generates detailed profile statistics. Choose Detailed profiling only if you want these attributes: distinct count, distinct percent, median value, median string length, pattern, pattern count. If you choose Detailed profiling, ensure that you specify a pageable cache directory that contains enough disk space for the amount of data you profile. See “Configuring Job Server runtime resources” in the Data Integrator Getting Started Guide.
   If you want detailed attributes for all columns in all sources listed, click Detailed profiling and select Apply to all columns of all sources.
   If you want to remove Detailed profiling for all columns, click Detailed profiling and select Remove from all columns of all sources.
7. Click Submit to execute the profile task.
   Note: If the table metadata changed since you imported it (for example, a new column was added), you must re-import the source table before you execute the profile task.
   If you clicked the Submit Column Profile Request option to reach this Submit Column Profiler Request window, the Profiler monitor pane appears automatically when you click Submit. For details, see “Monitoring profiler tasks using the Designer” on page 349.
   If you clicked Update on the Profile tab of the View Data window, the Profiler monitor window does not appear when you click Submit. Instead, a profile task is submitted and you must wait for it to complete before you can do other tasks on the Designer.
   You can also monitor your profiler task by name in the Data Integrator Administrator. For details, see the Data Integrator Management Console Administrator Guide.
8. When the profiler task has completed, you can view the profile results in the View Data option.
   For details, see “Viewing the profiler results”.

Submitting relationship profiler tasks
A relationship profile shows the percentage of non matching values in columns of two sources. The sources can be any of the following:
• Tables
• Flat files
• A combination of a table and a flat file
For more details, see “Data sources that you can profile” on page 336.
The columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).
The two columns do not need to be the same data type, but they must be convertible. For example, if you run a relationship profile task on an integer column and a varchar column, the Data Profiler converts the integer value to a varchar value to make the comparison.
Note: The Data Profiler consumes a large amount of resources when it generates relationship values. If you plan to use Relationship profiling, ensure that you specify a pageable cache directory that contains enough disk space for the amount of data you profile. See “Configuring Job Server runtime resources” in the Data Integrator Getting Started Guide.

To generate a relationship profile for columns in two sources
1. In the Object Library of the Data Integrator Designer, select two sources.
   To select two sources in the same datastore or file format:
   a. Go to the Datastore or Format tab in the Object Library.
   b. Right-click on the first source.
   c. Hold the Ctrl key down as you select the second table.
   d. Right-click and select Submit Relationship Profile Request.
   To select two sources from different datastores or files:
   a. Go to the Datastore or Format tab in the Object Library.
   b. Right-click on the first source, select Submit Relationship Profile Request > Relationship with.
   c. Change to a different Datastore or Format in the Object Library.
   d. Click on the second source.
   The Submit Relationship Profile Request window appears.

Note: You cannot create a relationship profile for the same column in the same source or for columns with a LONG or TEXT data type.
2. (Optional) Edit the profiler task name.
The default name that the Data Profiler generates for multiple sources has the following format:
username_t_firstsourcename_lastsourcename

Column — Description
username — Name of the user that Data Integrator uses to access system services.
t — Type of profile. The value is R for a Relationship profile, which obtains non-matching values in the two selected columns.
firstsourcename — Name of the first selected source.
lastsourcename — Name of the last selected source.

You can edit the task name to create a more meaningful name, a unique name, or to remove dashes, which are allowed in column names but not in task names.
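For example (an illustrative, made-up name, not output from an actual repository): a user named dwuser who selects ODS_CUSTOMER first and ODS_SALESORDER second would get a default task name such as:
dwuser_R_ODS_CUSTOMER_ODS_SALESORDER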

3. By default, the upper pane of the Submit Relationship Profile Request window shows a line between the primary key column and foreign key column of the two sources, if the relationship exists. You can resize each data source to show all columns.
The bottom half of the Submit Relationship Profile Request window shows that the profile task will use the equal (=) operation to compare the two columns. The Data Profiler will determine which values are not equal and calculate the percentage of non-matching values.
4. To delete an existing relationship between two columns, select the line, right-click, and select Delete Selected Relation.
To delete all existing relationships between the two sources, do one of the following actions:
• Right-click in the upper pane and click Delete All Relations.
• Click Delete All Relations near the bottom of the Submit Relationship Profile Request window.
5. If a primary key and foreign key relationship does not exist between the two data sources, specify the columns that you want to profile. You can change the columns to profile.
To specify or change the columns for which you want to see relationship values:
a. Move the cursor to the first column that you want to select.
b. Hold down the cursor and draw a line to the other column that you want to select.
If you deleted all relations and you want the Data Profiler to select an existing primary-key and foreign-key relationship, do one of the following actions:
• Right-click in the upper pane and click Propose Relation.
• Click Propose Relation near the bottom of the Submit Relationship Profile Request window.
6. By default, the Save key columns data only option is selected. This option indicates that the Data Profiler saves the data only in the columns that you select for the relationship, and you will not see any sample data in the other columns when you view the relationship profile.
If you want to see values in the other columns in the relationship profile, select Save all columns data.
7. Click Submit to execute the profiler task.
For the profile results, see "Viewing relationship profile data" on page 354.

Note: If the table metadata changed since you imported it (for example, a new column was added), you must re-import the source table before you execute the profile task.
8. The Profiler monitor pane appears automatically when you click Submit. If you clicked Update on the Profile tab of the View Data window, you must click Tools > Profiler monitor on the Menu bar to view the Profiler monitor window.
You can also monitor your profiler task by name in the Data Integrator Administrator. For details, see the Data Integrator Management Console Administrator Guide.
For details about the Profiler monitor, see "Monitoring profiler tasks using the Designer".
9. When the profiler task has completed, you can view the profile results in the View Data option when you right-click on a table in the Object Library. For details, see "Viewing the profiler results" on page 350.

Monitoring profiler tasks using the Designer

The Profiler monitor window appears automatically when you submit a profiler task (see "Executing a profiler task" on page 342). You can dock this profiler monitor pane in the Designer or keep it separate.
The Profiler monitor pane displays the currently running task and all of the profiler tasks that have executed within a configured number of days. For more information about these parameters, see the Data Integrator Management Console Administrator Guide.
You can click on the icons in the upper-left corner of the Profiler monitor to display the following information:
• Refreshes the Profiler monitor pane to display the latest status of profiler tasks.
• Sources that the selected task is profiling. If the task failed, the Information window also displays the error message.

The Profiler monitor shows the following columns:

Name — Name of the profiler task that was submitted from the Designer.
If the profiler task is for a single source, the default name has the following format: username_t_sourcename
If the profiler task is for multiple sources, the default name has the following format: username_t_firstsourcename_lastsourcename
Type — The type of profiler task can be:
• Column
• Relationship
Status — The status of a profiler task can be:
• Done — The task completed successfully.
• Pending — The task is on the wait queue because the maximum number of concurrent tasks has been reached or another task is profiling the same table.
• Running — The task is currently executing.
• Error — The task terminated with an error. Double-click on the value in this Status column to display the error message.
Timestamp — Date and time that the profiler task executed.
Sources — Names of the tables for which the profiler task executes.

Viewing the profiler results

The Data Profiler calculates and saves the profiler attributes into a profiler repository that multiple users can view. This section describes:
• Viewing column profile data on the Profile tab in View Data.
• Viewing relationship profile data on the Relationship tab in View Data.

Viewing column profile data

To view the column attributes generated by the Data Profiler
1. In the Object Library, select the table for which you want to view profiler attributes.
2. Right-click and select View Data.
3. Click the Profile tab (second) to view the column profile attributes.

a. The Profile tab shows the number of physical records that the Data Profiler processed to generate the values in the profile grid.
b. The profile grid contains the column names in the current source and profile attributes for each column. To populate the profile grid, do one of the following actions:
• Select names from this column, then click Update.
• Perform the steps in "Executing a profiler task" on page 342.
c. You can sort the values in each attribute column by clicking the column heading. The value n/a in the profile grid indicates an attribute does not apply to a data type.

Basic Profile attribute — Description (relevant data types)
Min — Of all values, the lowest value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Min count — Number of rows that contain this lowest value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Max — Of all values, the highest value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Max count — Number of rows that contain this highest value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Average — For numeric columns, the average value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Min string length — For character columns, the length of the shortest string value in this column. (Character: Yes, Numeric: No, Datetime: No)
Max string length — For character columns, the length of the longest string value in this column. (Character: Yes, Numeric: No, Datetime: No)
Average string length — For character columns, the average length of the string values in this column. (Character: Yes, Numeric: No, Datetime: No)
Nulls — Number of NULL values in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Nulls % — Percentage of rows that contain a NULL value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Zeros — Number of 0 values in this column. (Character: No, Numeric: Yes, Datetime: No)

Zeros % — Percentage of rows that contain a 0 value in this column. (Character: No, Numeric: Yes, Datetime: No)
Blanks — For character columns, the number of rows that contain a blank in this column. (Character: Yes, Numeric: No, Datetime: No)
Blanks % — Percentage of rows that contain a blank in this column. (Character: Yes, Numeric: No, Datetime: No)

If you selected the Detailed profiling option on the Submit Column Profile Request window, the Profile tab also displays the following detailed attribute columns.

Detailed Profile attribute — Description (relevant data types)
Distincts — Number of distinct values in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Distinct % — Percentage of rows that contain each distinct value in this column. (Character: Yes, Numeric: Yes, Datetime: Yes)
Median — The value that is in the middle row of the source table. The Data Profiler uses the following calculation to obtain the median value: (Total number of rows / 2) + 1. A worked example follows step 4 below. (Character: Yes, Numeric: Yes, Datetime: Yes)
Median string length — For character columns, the string length of the value that is in the middle row of the source table. (Character: Yes, Numeric: No, Datetime: No)
Patterns — Number of different patterns in this column. (Character: Yes, Numeric: No, Datetime: No)
Pattern % — The format of each unique pattern in this column and the percentage of rows that contain each pattern. (Character: Yes, Numeric: No, Datetime: No)

4. Click an attribute value to view the entire row in the source table. The bottom half of the View Data window displays the rows that contain the attribute value that you clicked. You can hide columns that you do not want to view by clicking the Show/Hide Columns icon.
For example, your target ADDRESS column might only be 45 characters, but the Profiling data for this Customer source table shows that the maximum string length is 46. Click on the value 46 to view the actual data. You can resize the width of the column to display the entire string.
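To make the Median calculation in the detailed attributes table above concrete, here is a small worked example (the row count is made up for illustration): for a source table with 12 rows, (12 / 2) + 1 = 7, so the Data Profiler reports the value that is in row 7 of the source table as the median, and the string length of that value as the median string length.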

5. (Optional) Click Update if you want to update the profile attributes. Reasons to update at this point include:
• The profile attributes have not yet been generated.
• The date that the profile attributes were generated is older than you want. The Last updated value in the bottom left corner of the Profile tab is the timestamp when the profile attributes were last generated.
Note: The Update option submits a synchronous profile task, and you must wait for the task to complete before you can perform other tasks on the Designer.
The Submit Column Profile Request window appears. Select only the column names you need for this profiling operation because Update calculations impact performance. You can also click the check box at the top in front of Name to deselect all columns and then select each check box in front of each column you want to profile.
6. Click a statistic in either Distincts or Patterns to display the percentage of each distinct value or pattern value in a column. The pattern values, number of records for each pattern value, and percentages appear on the right side of the Profile tab.
For example, the following Profile tab for table CUSTOMERS shows the profile attributes for column REGION. The Distincts attribute for the REGION column shows the statistic 19, which means 19 distinct values for REGION exist.

7. Click the statistic in the Distincts column to display each of the 19 values and the percentage of rows in table CUSTOMERS that have that value for column REGION. In addition, the bars in the right-most column show the relative size of each percentage.
8. The Profiling data on the right side shows that a very large percentage of values for REGION is Null. Your business rules might dictate that REGION should not contain Null values in your target data warehouse. Therefore, decide what value you want to substitute for Null values when you define a validation transform. For details, see "Define validation rule based on column profile" on page 359.
9. Click on either Null under Value or 60 under Records to display the other columns in the rows that have a Null value in the REGION column.

Viewing relationship profile data

Relationship profile data shows the percentage of non-matching values in columns of two sources. The sources can be tables, flat files, or a combination of a table and a flat file. The columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).

To view the relationship profile data generated by the Data Profiler
1. In the Object Library, select the table or file for which you want to view relationship profile data.
2. Right-click and select View Data.
3. Click the Relationship tab (third) to view the relationship profile results.
Note: The Relationship tab is visible only if you executed a relationship profile task.
4. Click the nonzero percentage in the diagram to view the key values that are not contained within the other table.
For example, the following View Data Relationship tab shows the percentage (16.67) of customers that do not have a sales order. The relationship profile was defined on the CUST_ID column in table ODS_CUSTOMER and the CUST_ID column in table ODS_SALESORDER. The value in the left oval indicates that 16.67% of rows in table ODS_CUSTOMER have CUST_ID values that do not exist in table ODS_SALESORDER.
Click the 16.67 percentage in the ODS_CUSTOMER oval to display the CUST_ID values that do not exist in the ODS_SALESORDER table. The non-matching values KT03 and SA03 display on the right side of the Relationship tab. Each row displays a non-matching CUST_ID value, the number of records with that CUST_ID value, and the percentage of total customers with this CUST_ID value.

5. Click one of the values on the right side to display the other columns in the rows that contain that value.
The bottom half of the Relationship Profile tab displays the values in the other columns of the row that has the value KT03 in the column CUST_ID.
Note: If you did not select Save all column data on the Submit Relationship Profile Request window, you cannot view the data in the other columns. See step 6 in "Submitting relationship profiler tasks" on page 346.

Using View Data to determine data quality

Use View Data to help you determine the quality of your source and target data. View Data provides the capability to:
• View sample source data before you execute a job to create higher quality job designs.
• Compare sample data from different steps of your job to verify that your data extraction job returns the results you expect.
For an example, see "Define validation rule based on column profile" on page 359.
You can see the data in different ways from the three tabs on the View Data panel:
• Data tab
• Profile tab
• Relationship Profile or Column Profile tab
For more information about View Data options and how to use View Data to design and debug your jobs, see "Using View Data" on page 404.

Data tab

The Data tab is always available and displays the data contents of sample rows. You can display a subset of columns in each row and define filters to display a subset of rows (see "View Data properties" on page 408).
For example, your business rules might dictate that all phone and fax numbers be in one format for each country. The following Data tab shows a subset of rows for the customers that are in France.

median. For more information. and pattern percent. such as average value. You can now decide which format you want to use in your target data warehouse and define a validation transform accordingly (see “Define validation rule based on column profile” on page 359). and maximum string length. the Profile tab displays the following column attributes: distinct values. distinct count. the Profile tab displays the same above column attributes plus many more calculated statistics. distinct percent. For more information. Relationship Profile or Column Profile tab The third tab that displays depends on whether or not you configured and use the Data Profiler. NULLs. minimum string length. the Column Profile tab allows you to calculate statistical information for a single column. see “Viewing column profile data” on page 350. Profile tab Two displays are available on the Profile tab: • • Without the Data Profiler. pattern count. see “Profile tab” on page 415.This document is part of a SAP study on PDF usage. Data Integrator Designer Guide 357 . see “Data tab” on page 414. For more information. Find out how you can participate and help to improve our documentation. minimum value. median string length. • If you do not use the Data Profiler. If you configured and use the Data Profiler. see “Column Profile tab” on page 417. Data Quality Using View Data to determine data quality 14 Notice that the PHONE and FAX columns displays values with two different formats. and maximum value. For information about other options on the Data tab.

• If you use the Data Profiler, the Relationship tab displays the data mismatches between two columns, from which you can determine the integrity of your data between two sources. For more information, see "Viewing relationship profile data" on page 354.

Using the Validation transform

The validation transform provides the ability to compare your incoming data against a set of pre-defined business rules and, if needed, take any corrective actions. The Data Profiler and View Data features can identify anomalies in the incoming data to help you better define corrective actions in the validation transform.

Analyze column profile

To obtain column profile information, follow the procedure "Submitting column profiler tasks" on page 342. For example, suppose you want to analyze the data in the Customer table in the Microsoft SQL Server Northwinds sample database.

To analyze column profile attributes
1. In the Object Library of the Designer, select the View Data right-click option on the table that you profiled.
2. Access the Profile tab on the View Data window. The Profile tab shows the column profile attributes. The Patterns attribute for the PHONE column shows the value 20, which means 20 different patterns exist.

3. Click the value 20 under the Patterns attribute to display the individual patterns and the percentage of rows in table CUSTOMERS that have that pattern for column PHONE.
4. Suppose that your business rules dictate that all phone numbers in France should have the format 99.99.99.99. However, the profiling data shows that two records have the format (9) 99.99.99.99. To display the columns in these two records, click either the value (9) 99.99.99.99 under Pattern or click the value 2 under Records.
5. You see that some phone numbers in France have a prefix of '(1)'. To remove this '(1)' prefix when you load the customer records into your target table, define a validation rule with the Match pattern option. For details, see "Define validation rule based on column profile" below.

Define validation rule based on column profile

This section takes the Data Profiler results and defines the validation transform according to the sample business rules. For more information about the validation transform, see the Data Integrator Reference Guide.

To define a validation rule to substitute a different value for a specific pattern
1. In the validation transform editor, select the column for which you want to replace a specific pattern. For the example in "Analyze column profile" on page 358, select the PHONE column.

2. Click the Enable validation check box.
3. In the Condition area, select Match pattern and enter the specific pattern that you want to pass per your business rules. Using the phone example above, enter the following pattern:
'99.99.99.99'
4. In the Action on Failure area, select Send to Pass and check the box For Pass, Substitute with.
5. Either manually enter the replace_substr function in the text box or click Function to have the Define Input Parameter(s) window help you set up the replace_substr function:
replace_substr(CUSTOMERS.PHONE, '(1) ', null)
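If you prefer to type the condition yourself rather than use the Match pattern option, the same check can be expressed as a custom validation condition. The following one-line sketch assumes the match_pattern function of the Data Integrator expression language and the pattern syntax shown above; confirm both in the Data Integrator Reference Guide for your version:
match_pattern(CUSTOMERS.PHONE, '99.99.99.99') = 1
Rows whose PHONE value matches the four-pair pattern pass the rule; the two records that carry the '(1) ' prefix fail the condition and, because of the Send to Pass substitution defined above, are loaded with the prefix removed by replace_substr.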

Note: This replace_substr function does not replace any number enclosed in parenthesis. It only replaces occurrences of the value (1) because the Data Profiler results show only that specific value in the source data.
6. Click Function and select the string function replace_substr.
7. Click Next.
8. In the Define Input Parameter(s) window, click the Input string dropdown list to display source tables.
9. In the Input Parameter window, double-click the source table and double-click the column name. In our example, double-click the Customer table and double-click the Phone column name.
10. For Search string on the Define Input Parameter(s) window, enter the value of the string you want to replace. For the phone example, enter: '(1) '
11. For Replace string on the Define Input Parameter(s) window, enter your replacement value. For the phone example, enter: null
12. Click Finish.
13. Repeat steps 1 through 5 to define a similar validation rule for the FAX column.
After you execute the job, use the View Data icons to verify that the string was substituted correctly.

Using Auditing

Auditing provides a way to ensure that a data flow loads correct data into the warehouse. Use auditing to perform the following tasks:
• Define audit points to collect run time statistics about the data that flows out of objects. Auditing stores these statistics in the Data Integrator repository.
• Define rules with these audit statistics to ensure that the data at the following points in a data flow is what you expect:
  • Extracted from sources
  • Processed by transforms
  • Loaded into targets
• Generate a run time notification that includes the audit rule that failed and the values of the audit statistics at the time of failure.
• Display the audit statistics after the job execution to help identify the object in the data flow that might have produced incorrect data.
Note: If you add an audit point prior to an operation that is usually pushed down to the database server, performance might degrade because pushdown operations cannot occur after an audit point. For details, see "Guidelines to choose audit points" on page 371.
This section describes the following topics:
• Auditing objects in a data flow
• Accessing the Audit window
• Defining audit points, rules, and action on failure
• Guidelines to choose audit points
• Auditing embedded data flows
• Resolving invalid audit labels
• Viewing audit results

Auditing objects in a data flow

You can collect audit statistics on the data that flows out of any Data Integrator object, such as a source, transform, or target. If a transform has multiple distinct or different outputs (such as Validation or Case), you can audit each output independently.

To use auditing, you define the following objects in the Audit window:
• Audit point — The object in a data flow where you collect audit statistics. You can audit a source, a transform, or a target. You identify the object to audit when you define an audit function on it.
• Audit function — The audit statistic that Data Integrator collects for a table, output schema, or column. For more information, see "Audit function" on page 363.
• Audit label — The unique name in the data flow that Data Integrator generates for the audit statistics collected for each audit function that you define. You use these labels to define audit rules for the data flow. For more information, see "Audit label" on page 364.
• Audit rule — A Boolean expression in which you use audit labels to verify the Data Integrator job. If you define multiple rules in a data flow, all rules must succeed or the audit fails. For more information, see "Audit rule" on page 365.
• Actions on audit failure — One or more of three ways to generate notification of an audit rule (or rules) failure: email, custom script, raise exception. For more information, see "Audit notification" on page 365.

Audit function

This section describes the data types for the audit functions and the error count statistics. The following table shows the audit functions that you can define.

Data Object — Audit Function — Description
Table or output schema — Count — This function collects two statistics:
• Good count for rows that were successfully processed.
• Error count for rows that generated some type of error if you enabled error handling.
Column — Sum — Sum of the numeric values in the column. Applicable data types include decimal, double, integer, and real. This function only includes the Good rows.
Column — Average — Average of the numeric values in the column. Applicable data types include decimal, double, integer, and real. This function only includes the Good rows.
Column — Checksum — Checksum of the values in the column.

Data types

The following table shows the default data type for each audit function and the permissible data types. You can change the data type in the Properties window for each audit function in the Data Integrator Designer.

Audit Function — Default Data Type — Allowed Data Types
Count — INTEGER — INTEGER
Sum — Type of audited column — INTEGER, DECIMAL, DOUBLE, REAL
Average — Type of audited column — INTEGER, DECIMAL, DOUBLE, REAL
Checksum — VARCHAR(128) — VARCHAR(128)

Error count statistic

When you enable a Count audit function, Data Integrator collects two types of statistics:
• Good row count for rows processed without any error.
• Error row count for rows that the Data Integrator job could not process but ignores those rows to continue processing. One way that error rows can result is when you specify the Use overflow file option in the Source Editor or Target Editor.

Audit label

Data Integrator generates a unique name for each audit function that you define on an audit point. You can edit the label names. You might want to edit a label name to create a shorter meaningful name or to remove dashes, which are allowed in column names but not in label names.

Generating label names

If the audit point is on a table or output schema, Data Integrator generates the following two labels for the audit function Count:
$Count_objectname
$CountError_objectname
If the audit point is on a column, Data Integrator generates an audit label with the following format:
$auditfunction_objectname
If the audit point is in an embedded data flow, the labels have the following formats:
$Count_objectname_embeddedDFname
$CountError_objectname_embeddedDFname
$auditfunction_objectname_embeddedDFname

Editing label names

You can edit the audit label name when you create the audit function and before you create an audit rule that uses the label. If you edit the label name after you use it in an audit rule, the audit rule does not automatically use the new name. You must redefine the rule with the new name.

Audit rule

An audit rule is a Boolean expression which consists of a Left-Hand-Side (LHS), a Boolean operator, and a Right-Hand-Side (RHS).
• The LHS can be a single audit label, multiple audit labels that form an expression with one or more mathematical operators, or a function with audit labels as parameters.
• The RHS can be a single audit label, multiple audit labels that form an expression with one or more mathematical operators, a function with audit labels as parameters, or a constant.
The following Boolean expressions are examples of audit rules:
$Count_CUSTOMER = $Count_CUSTDW
$Sum_ORDER_US + $Sum_ORDER_EUROPE = $Sum_ORDER_DW
round($Avg_ORDER_TOTAL) >= 10000

Audit notification

You can choose any combination of the following actions for notification of an audit failure. If you choose all three actions, Data Integrator executes them in this order:
• Email to list — Data Integrator sends a notification of which audit rule failed to the email addresses that you list in this option. Use a comma to separate the list of email addresses. You can specify a variable for the email list. This option uses the smtp_to function to send email, so you must define the server and sender for the Simple Mail Transfer Protocol (SMTP) in the Data Integrator Server Manager.
• Script — Data Integrator executes the custom script that you create in this option.
• Raise exception — The job fails if an audit rule fails, and the error log shows which audit rule failed. This action is the default. You can use this audit exception in a try/catch block and continue the job execution in the try/catch block. The job stops at the first audit rule that fails.
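As an illustration of the Script action, the following short custom script is one way to record the failure and notify an operator. It is only a sketch: the recipient address and message text are invented, and you should verify the exact smtp_to parameters (the function also accepts options for attaching log lines) for your Data Integrator version.
# Record the failure in the trace log, then notify the on-call address.
print('Audit rule failed in data flow Case_DF');
smtp_to('dw.admin@example.com', 'Audit failure: Case_DF', 'Source and target row counts do not match.');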

If you uncheck this action and an audit rule fails, the job completes successfully and the audit does not write messages to the job log. You can view which rule failed in the Auditing Details report in the Metadata Reporting tool. For more information, see "Viewing audit results" on page 377.

Accessing the Audit window

Access the Audit window from one of the following places in the Data Integrator Designer:
• From the Data Flows tab of the object library, right-click on a data flow name and select the Auditing option.
• In the workspace, right-click on a data flow icon and select the Auditing option.
• When a data flow is open in the workspace, click the Audit icon in the toolbar.
When you first access the Audit window, the Label tab displays the sources and targets in the data flow. If your data flow contains multiple consecutive query transforms, the Audit window shows the first query.
Click the icons on the upper left corner of the Label tab to change the display.

Icon (tool tip) — Description
Collapse All — Collapses the expansion of the source, transform, and target objects.

Show All Objects — Displays all the objects within the data flow.
Show Source, Target and first query — Default display, which shows the source, target, and first query objects in the data flow. If the data flow contains multiple consecutive query transforms, only the first query displays.
Show Labelled — Displays the objects that have audit labels defined.

Defining audit points, rules, and action on failure

To define auditing in a data flow
1. Access the Audit window. Use one of the methods that the section "Accessing the Audit window" on page 366 describes.
2. Define audit points. On the Label tab, right-click on an object that you want to audit and choose an audit function or Properties.
When you define an audit point, Data Integrator generates the following:
• An audit icon on the object in the data flow in the workspace
• An audit label that you use to define audit rules. For the format of this label, see "Auditing objects in a data flow" on page 362.
In addition to choosing an audit function, the Properties window allows you to edit the audit label and change the data type of the audit function.
For example, the data flow Case_DF has the following objects, and you want to verify that all of the source rows are processed by the Case transform.
• Source table ODS_CUSTOMER
• Four target tables:
  • R1 contains rows where ODS_CUSTOMER.REGION_ID = 1
  • R2 contains rows where ODS_CUSTOMER.REGION_ID = 2
  • R3 contains rows where ODS_CUSTOMER.REGION_ID = 3
  • R123 contains rows where ODS_CUSTOMER.REGION_ID IN (1, 2 or 3)

a. Right-click on source table ODS_CUSTOMER and choose Count.
Data Integrator creates the audit labels $Count_ODS_CUSTOMER and $CountError_ODS_CUSTOMER, and an audit icon appears on the source object in the workspace.
b. Similarly, right-click on each of the target tables and choose Count.
The Audit window shows the following audit labels.

c. If you want to remove an audit label, right-click on the label; the audit function that you previously defined displays with a check mark in front of it. Click the function to remove the check mark and delete the associated audit label.
When you right-click on the label, you can also select Properties and select the value (No Audit) in the Audit function drop-down list.
3. Define audit rules. On the Rule tab in the Audit window, click Add, which activates the expression editor of the Auditing Rules section.
If you want to compare audit statistics for one object against one other object, use the expression editor, which consists of three text boxes with drop-down lists:
a. Select the label of the first audit point in the first drop-down list.
b. Choose a Boolean operator from the second drop-down list. The options in the editor provide common Boolean operators. If you require a Boolean operator that is not in this list, use the Custom expression box with its function and smart editors to type in the operator.
c. Select the label for the second audit point from the third drop-down list. If you want to compare the first audit value to a constant instead of a second audit value, use the Custom expression box.
For example, to verify that the count of rows from the source table is equal to the rows in the target table, select audit labels and the Boolean operation in the expression editor as follows:

If you want to compare audit statistics for one or more objects against statistics for multiple other objects or a constant, select the Custom expression box.
a. Click the ellipsis button to open the full-size smart editor window.
b. Click the Variables tab on the left and expand the Labels node.
c. Drag the first audit label of the object to the editor pane.
d. Type a Boolean operator.
e. Drag the audit labels of the other objects to which you want to compare the audit statistics of the first object and place appropriate mathematical operators between them.
f. Click OK to close the smart editor.
g. The audit rule displays in the Custom editor. To update the rule in the top Auditing Rule box, click on the title "Auditing Rule" or on another option.
h. Click Close in the Audit window.
For example, to verify that the count of rows from the source table is equal to the sum of rows in the first three target tables, type in the Boolean operation and plus signs in the smart editor as follows:
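The screenshot that originally illustrated this expression is not reproduced here; based on the Case_DF example and the Count labels generated earlier, the custom expression reads:
$Count_ODS_CUSTOMER = ($Count_R1 + $Count_R2 + $Count_R3)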

4. Define the action to take if the audit fails. You can choose one or more of the following actions:
• Raise exception — The job fails if an audit rule fails and the error log shows which audit rule failed. This action is the default. If you clear this option and an audit rule fails, the job completes successfully and the audit does not write messages to the job log. You can view which rule failed in the Auditing Details report in the Metadata Reporting tool. For more information, see "Viewing audit results" on page 377.
• Email to list — Data Integrator sends a notification of which audit rule failed to the email addresses that you list in this option. Use a comma to separate the list of email addresses. You can specify a variable for the email list.
• Script — Data Integrator executes the script that you create in this option.
5. Execute the job. The Execution Properties window has the Enable auditing option checked by default. Uncheck this box if you do not want to collect audit statistics for this specific job execution. If you turn on the audit trace on the Trace tab in the Execution Properties window, you can view all audit results on the Job Monitor Log.
6. Look at the audit results. You can view passed and failed audit rules in the metadata reports. For details, see "Viewing audit results" on page 377.

Guidelines to choose audit points

The following are guidelines to choose audit points:
• When you audit the output data of an object, the Data Integrator optimizer cannot push down operations after the audit point. Therefore, if the performance of a query that is pushed to the database server is more important than gathering audit statistics from the source, define the first audit point on the query or later in the data flow.
For example, suppose your data flow has source, query, and target objects, and the query has a WHERE clause that is pushed to the database server and that significantly reduces the amount of data that returns to Data Integrator. Define the first audit point on the query, rather than on the source, to obtain audit statistics on the query results.

• If a pushdown_sql function is after an audit point, Data Integrator cannot execute it.
• You can only audit a bulkload that uses the Oracle API method. For the other bulk loading methods, the number of rows loaded is not available to Data Integrator.
• Auditing is disabled when you run a job with the debugger.
• You cannot audit NRDM schemas or real-time jobs.
• You cannot audit within an SAP R/3 data flow, but you can audit the output of an SAP R/3 data flow.
• If you use the CHECKSUM audit function in a job that normally executes in parallel, Data Integrator disables the DOP for the whole data flow. The order of rows is important for the result of CHECKSUM, and DOP processes the rows in a different order than in the source.

Auditing embedded data flows

This section describes the following considerations when you audit embedded data flows:
• Enabling auditing in an embedded data flow
• Audit points not visible outside of the embedded data flow

Enabling auditing in an embedded data flow

If you want to collect audit statistics on an embedded data flow when you execute the parent data flow, you must enable the audit label of the embedded data flow.

To enable auditing in an embedded data flow
1. Open the parent data flow in the Data Integrator Designer workspace.
2. Click on the Audit icon in the toolbar to open the Audit window.
3. On the Label tab, expand the objects to display any audit functions defined within the embedded data flow.

The following Audit window shows an example of an embedded audit function that does not have an audit label defined in the parent data flow.
4. Right-click on the Audit function and choose Enable. You can also choose Properties to change the label name and enable the label.

5. You can define audit rules with the enabled label.

Audit points not visible outside of the embedded data flow

When you embed a data flow at the beginning of another data flow, data passes from the embedded data flow to the parent data flow through a single source. When you embed a data flow at the end of another data flow, data passes into the embedded data flow from the parent through a single target. In either case, some of the objects are not visible in the parent data flow.
Because some of the objects are not visible in the parent data flow, the audit points on these objects are also not visible in the parent data flow. For example, the following embedded data flow has an audit function defined on the source SQL transform and an audit function defined on the target table.

The following Audit window shows these two audit points.
When you embed this data flow, the target Output becomes a source for the parent data flow and the SQL transform is no longer visible. An audit point still exists for the entire embedded data flow, but the label is no longer applicable. The following Audit window for the parent data flow shows the audit function defined in the embedded data flow, but does not show an Audit Label.

If you want to audit the embedded data flow, right-click on the audit function in the Audit window and select Enable.

Resolving invalid audit labels

An audit label can become invalid in the following situations:
• If you delete the audit label in an embedded data flow that the parent data flow has enabled.
• If you delete or rename an object that had an audit point defined on it.
The following Audit window shows the invalid label that results when an embedded data flow deletes an audit label that the parent data flow had enabled.

To resolve invalid audit labels
1. Open the Audit window.
2. Expand the Invalid Labels node to display the individual labels.
3. Note any labels that you would like to define on any new objects in the data flow.
4. After you define a corresponding audit label on a new object, right-click on the invalid label and choose Delete.
5. If you want to delete all of the invalid labels at once, right-click on the Invalid Labels node and click on Delete All.

Viewing audit results

You can see the audit status in one of the following places:
• Job Monitor Log
• If the audit rule fails, the places that display audit information depend on the Action on failure option that you chose:

Action on failure — Places where you can view audit information
Raise exception — Job Error Log, Metadata Reports
Email to list — Email message, Metadata Reports
Script — Wherever the custom script sends the audit messages, Metadata Reports

Job Monitor Log

If you set Audit Trace to Yes on the Trace tab in the Execution Properties window, audit messages appear in the Job Monitor Log. You can see messages for audit rules that passed and failed.
The following sample audit success messages appear in the Job Monitor Log when Audit Trace is set to Yes:
Audit Label $Count_ODS_CUSTOMER = 12. Data flow <Case_DF>.
Audit Label $CountError_ODS_CUSTOMER = 0. Data flow <Case_DF>.
Audit Label $Count_R1 = 5. Data flow <Case_DF>.
Audit Label $CountError_R1 = 0. Data flow <Case_DF>.
Audit Label $Count_R2 = 4. Data flow <Case_DF>.
Audit Label $CountError_R2 = 0. Data flow <Case_DF>.
Audit Label $Count_R3 = 3. Data flow <Case_DF>.
Audit Label $CountError_R3 = 0. Data flow <Case_DF>.
Audit Label $Count_R123 = 12. Data flow <Case_DF>.
Audit Label $CountError_R123 = 0. Data flow <Case_DF>.
Audit Rule passed ($Count_ODS_CUSTOMER = (($CountR1 + $CountR2 + $Count_R3)): LHS=12, RHS=12. Data flow <Case_DF>.
Audit Rule passed ($Count_ODS_CUSTOMER = $CountR123): LHS=12, RHS=12. Data flow <Case_DF>.

Job Error Log

When you choose the Raise exception option and an audit rule fails, the Job Error Log shows the rule that failed. The following sample message appears in the Job Error Log:
Audit rule failed <($Count_ODS_CUSTOMER = $CountR1)> for <Data flow Case_DF>.

Metadata Reports

You can look at the Audit Status column in the Data Flow Execution Statistics reports of the Metadata Report tool. This Audit Status column has the following values:
• Not Audited
• Passed — All audit rules succeeded. This value is a link to the Auditing Details report, which shows the audit rules and values of the audit labels.
• Information Collected — This status occurs when you define audit labels to collect statistics but do not define audit rules. This value is a link to the Auditing Details report, which shows the values of the audit labels.
• Failed — Audit rule failed. This value is a link to the Auditing Details report, which shows the rule that failed and values of the audit labels.
For examples of these Metadata Reports, see the Data Integrator Management Console: Metadata Reports User's Guide.

Data Cleansing with Data Integrator Data Quality

Data Integrator Data Quality integrates the data cleansing functionality of the Business Objects Data Quality application with Data Integrator. Data Quality Projects and datastores are imported into the Data Integrator Designer and used to call Data Quality Projects from a server. Data is passed to the Data Quality Projects, cleansed, and passed back to the Data Integrator job. This data cleansing functionality is initiated and viewed in the Data Integrator Designer.
This section covers the following topics:
• Overview of Data Integrator Data Quality architecture
• Data Quality Terms and Definitions
• Creating a Data Quality datastore
• Importing Data Quality Projects
• Using the Data Quality transform
• Mapping input fields from the data flow to the project
• Creating custom projects

Overview of Data Integrator Data Quality architecture

Data Quality Projects are imported into a Data Quality datastore in Data Integrator and used as Data Quality transforms in data flows. At execution time, the data flow streams the input data to the Data Quality server, where the data is cleansed and then sent back to Data Integrator via the reader and writer socket threads, and the cleansed data is further processed by the data flow.
Data is passed to a running Data Quality workflow via a Data Integrator reader socket thread to the Data Quality socket-based reader. Cleansed data is passed back to the Data Integrator job via the Data Quality socket-based writer.
The following diagram illustrates the flow of data from the Data Integrator source, through the Data Integrator Query transform, the Data Integrator Data Quality transform, and the Data Quality server.
The remainder of this chapter explains how to use the Data Integrator Designer to implement data cleansing as provided by the above integration of Data Integrator and Data Quality.


Data Quality Terms and Definitions

Data Quality Datastore: A datastore that represents a connection to a Data Quality server. It contains imported projects that are available on the associated Data Quality server.
Data Integrator Data Quality Project: A reusable (first class) object that can be dropped onto data flows to provide a specific data cleansing function (as defined by the project). An imported project, when used in a Data Integrator data flow, is usually called a transform.
Blueprint: A sample Data Quality Project that can be used by Data Integrator without modification.

Overview of steps to use Data Integrator Data Quality

Use the following steps to cleanse data in a Data Quality Project with Data Integrator:
1. Create a Data Quality datastore in Data Integrator
2. Import a project from Data Quality into the datastore
3. Call the imported Data Quality Project in a Data Integrator data flow as a transform
4. Map the Input and Output fields
These steps are explained in detail in the remaining sections of this chapter.

Creating a Data Quality datastore

Creating a Data Quality datastore is the first step in the data cleansing workflow. This datastore allows you to connect to the Data Quality server and to import Data Quality Projects (or blueprints). These objects are imported and grouped within a data quality datastore, which must be created first.

To create a new Data Quality datastore
1. Click the Datastore tab in the Local Object Library, then choose File > New > Datastore. The Create New Datastore dialog box appears.

2. Fill out the datastore properties window as shown in the following table:

Option — Description
Datastore name — Type a descriptive name for the new datastore.
Datastore type — Choose BusinessObjects Data Quality from the drop-down list.
Server name — Type the host name you selected when installing Data Quality. For example, if you installed to your local computer, type localhost here.
Port number — Type the port number you use for the Data Quality server.
Repository path — Type the path to the configuration_rules folder. The default path to the configuration_rules folder is: C:\dqxi\11_5\repository\configuration_rules
If you store your integrated batch projects elsewhere, provide that path here.
Note: The "Repository path" is a path on the machine where the Data Quality server runs (it might not be the same machine where the Data Integrator client is running).
Timeout — Type a number that represents the maximum number of seconds to wait for a connection to be made between Data Integrator and Data Quality.

Note:
• This is the same process you use to set up any new datastore, but the values for the settings must coincide with the necessary Data Quality configurations.

• The Data Quality server must be up and available when importing a Data Quality Project, or when a Data Integrator Data Quality job is being executed. You cannot start and stop the Data Quality server from Data Integrator.

Importing Data Quality Projects

After you have created the new Data Quality datastore, next import a project from the Data Quality server.

To import the Data Quality Project into Data Integrator
1. Double-click the datastore in the Local Object Library. A list of XML files (Data Quality Projects and blueprints) appears in the Workspace on the right of the Designer window, as shown below.
2. Right-click the project you wish to import and choose Import, as shown below.
Figure 14-3: Importing a Data Quality Project from the Designer
Note: You can also import multiple projects at a time by holding the Shift key and selecting a range of projects.

Note that after you import the Data Quality Project, it appears as a child of the Data Quality datastore, as shown below:
Figure 14-4: Data Quality Projects shown as children in the Designer
After you import a Data Quality Project into Data Integrator, you can drag and drop it into a data flow to call it like any typical Data Integrator transform.

Using the Data Quality transform

After a Data Quality Project is imported and dropped onto a data flow, it behaves like any other Data Integrator transform. You can drill into the transform to set its properties and configure data mappings. You can decide which fields are sent to the data quality engine and which fields bypass the Data Quality server.
The following graphic shows a simple data flow that contains an input reader, a query, a data quality transform, and a file writer:
Figure 14-5: Data flow containing the Data Quality transform

Drill into the Data Quality transform and click on the Properties tab to see the following view:
Figure 14-6: Data Quality transform Properties view

Enable Passthrough
Check this option if you need to define output fields that copy their data directly from the input without any cleansing performed. After this option is enabled, you can drag fields from the upper left to the upper right window.
If there are fields that do not require cleansing but should appear in the output, those fields should be identified as passthrough. When a field is identified as passthrough, its data will not be modified from its source to its target.
Note: Passthrough should not be used when the Data Quality Project changes the order of records, for example when a sorter is used or when the number of records is changed in the Data Quality Project. In that case, Data Integrator will not match the passthrough fields with the correct original records.

Substitution File
The substitution file is used by the Data Quality Project during execution. Substitution files are located in the configuration_rules folder of your repository on the Data Quality server machine. If you do not specify a filename here, Data Integrator uses the filename Substitutions.xml, which might not exist on the server machine. See the Business Objects Data Quality Project Architect manual for more information about customizing your Data Quality Projects.

Mapping input fields from the data flow to the project
The following graphic displays the mapping view of a data quality transform that uses the AddressCleanseUSA project.

Figure 14-7: Mapping view of a data quality transform

The fields visible in the lower left window above are input fields expected by the project's socket reader. The Data Quality Project defines the name and the meaning of the input fields. The project also defines what data quality operations are performed on these fields. Note that any unmapped fields pass through the engine as an empty string.

To map input fields from the data flow to the Data Quality Project, drag columns from the upper left window to the lower left window. In the example above, the field Address1 is passed to the transform from the previous query. Address1 is then mapped to the Data Quality Project field ADDRESS_LINE1. Note that not all fields need to be mapped. Unmapped fields will be passed as NULL values to the Data Quality server.

The fields visible in the lower right window, below, are the output fields from the Data Quality Project's socket writer. The project defines the field names, but not the data type. To map the cleansed output fields from the project back to the data flow, drag columns from the lower right window to the upper right window (shown in the picture above). You can change the name of the output columns. Also, you can add a description here to help document your output fields.

If passthrough is enabled you can also map your passthrough columns with this view. To create a passthrough column, drag the field from the upper left window to the upper right window. In the example above, the fields "ID", "Name", and "Company" are defined as passthrough.

To examine the final output schema, drill into the query transform from the data flow view. The following view of the output schema is displayed:

Figure 14-8: Final output schema

Creating custom projects

You can use the Data Quality blueprints, if they suit your needs, or you can create custom Data Quality Projects in the Data Quality Project Architect.

To create a custom project
1. Open the Data Quality Project Architect.
   Note: To open the Data Quality Project Architect from Data Integrator at any time, you can also right-click on a project and choose Launch Data Quality Project Architect.
2. Choose New > Project > Integrated batch project.
3. Drag and drop the transforms or compound transforms you need onto the canvas to begin creating your project.

Figure 14-9: Data Quality Project Architect menu

This drop-down menu only appears if the Data Quality Project Architect application is installed on the same machine as Data Integrator. When the Project Architect is launched, it opens the selected project. For more information, see your Business Objects Data Quality user documentation.

Using integrated batch projects

Data Integrator uses Data Quality Projects of the type "integrated batch project" for data cleansing. In order to use integrated batch projects, you must follow the rules explained below when you define the Data Quality Project.

Adhere to the following rules when you set up an integrated batch project:
• Number of threads. By default, all transforms are set to run on one thread; plugins are set to zero threads. (This is set in the Common > Performance > Num_Of_Threads option.) Setting threads to a number greater than one can result in records being output in a different order. You should not change these settings if the output order of records must match the input order of records, which is a requirement for using passthrough.

For example, suppose you set the number of threads to 2 in the Match transform, and the first collection contains 1000 records and the second collection contains two records. Each thread processes a collection, and the second collection will most likely finish processing before the first. The records at this point are not in their original order. However, if order is of secondary importance to performance, and if you are able to use passthrough, then you can set the number of threads to a number greater than one.

• If you use a Sorter transform, the Misc_Options > Input_Mode option must be set to Batch.
• If you use an Associate transform, you can adjust the Transform_Performance > Buffer_Size_Kilobytes option value to increase performance.
• Do not use the Unique_ID or Observer transform in your projects. The functionality available in these transforms can be replicated using the Data Integrator Query transform.

Data Quality blueprints for Data Integrator

Business Objects Data Quality provides Data Integrator users with Data Quality blueprints to set up your Data Quality Projects. These blueprints reside on the Data Quality server. If you must edit a blueprint, you must edit it in the Data Quality Project Architect. You can access the Project Architect through the Start menu or you can access it from within Data Integrator (see "Creating custom projects" on page 389). After creating a datastore to connect with the Data Quality repository, you will have a list of blueprints to choose from. The names and descriptions are listed below:

address_cleanse_usa: A sample integrated batch project configured to cleanse address data in the USA.
address_data_cleanse_usa: A sample integrated batch project configured to cleanse address data in the USA, and to cleanse name, title, firm, date, phone, SSN, and email data using the English-based USA data quality rules.
consumer_match_family_name_address_usa_full: A sample integrated batch project configured to cleanse consumer data and identify matching data records based on similar family name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.

consumer_match_family_name_address_usa_pass1: A sample integrated batch project configured to cleanse consumer data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar family name and address data.
consumer_match_family_name_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar family name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
consumer_match_name_address_usa_full: A sample integrated batch project configured to cleanse consumer data and identify matching data records based on similar name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
consumer_match_name_address_usa_pass1: A sample integrated batch project configured to cleanse consumer data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar name and address data.
consumer_match_name_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
corporate_match_firm_address_usa_full: A sample integrated batch project configured to cleanse corporate data and identify matching data records based on similar firm and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
corporate_match_firm_address_usa_pass1: A sample integrated batch project configured to cleanse corporate data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar firm and address data.
corporate_match_firm_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar firm and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
corporate_match_name_firm_address_usa_full: A sample integrated batch project configured to cleanse corporate data and identify matching data records based on similar name, firm, and address data. The output data includes all records from the data source, with match results fields providing information on the match process.

corporate_match_name_firm_address_usa_pass1: A sample integrated batch project configured to cleanse corporate data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar name, firm, and address data.
corporate_match_name_firm_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar name, firm, and address data.
data_cleanse_usa: A sample integrated batch project configured to cleanse name, title, firm, date, phone, SSN, and email data using the English-based USA data quality rules.

Mapping blueprint fields

The following table lists the extra fields in the integrated batch Reader transforms of the Data Quality blueprints. If you want to use any of these, you must map them.

Note: The maximum field length allowed to be passed from Data Integrator to Data Quality is 512 characters (1024 bytes). Fields mapped from Data Integrator to Data Quality that are larger than that will generate an error at runtime.

Field name and description:
Address_Locality3: City, town, or suburb.
Address_Post_Office_Box_Number: Post office box number.
Address_Primary_Name1: Street name data.
Address_Primary_Postfix1: Address data that comes at the end of a street name, such as a directional.
Address_Primary_Prefix1: Address data that comes at the beginning of a street name, such as a directional.
Address_Primary_Type: Data that tells what type of street it is (street, lane, boulevard, and so on).
Firm1_Firm_Name_Match_STD1-3: Firm match standards.
Match_Apply_Blank_Penalty: A field that contains the indicator to apply blank penalties.
Match_Data_Source_ID: Specifies the source ID, which match reports use for identifying statistics about match groups.

Match_Perform_Data_Salvage: Specifies the indicator (Y/N) for performing a data salvage.
Match_Person1_Gender, Match_Person2_Gender, Match_Person3_Gender: Specifies the gender of the persons in your data record (up to three persons).
Match_Priority: Specifies the best record priority of the record.
Match_Qualification_Table: Specifies the values for use in qualification tables (driver and passenger ID, for example).
Match_Source_ID: Specifies the source ID within match groups. The Match_Source_ID is specific to statistics within match groups themselves; Data_Source_ID is specific to the metadata repository report statistics that are generated, and is tied to the MDR statistics. Many times, the same data is used for both, and the Data_Source_ID value can be used to fill this, but there may be times when a reader is not enough for Source_ID identification; a user may want to qualify multiple Source_IDs within a reader source. If you are familiar with Match/Consolidate, this equates to the List_ID value.
Person1_GivenName2_Match_STD2, Person1_GivenName2_Match_STD3: The second and third match standards for the given (middle) name of the first person in the data record.
Person1_Honorary_Postname, Person2_Honorary_Postname, Person3_Honorary_Postname: Honorary postname for up to three persons in the data record indicating certification, academic degree, or affiliation. For example, CPA.
Person2_GivenName1_Match_STD1-3: Given name 1 (first name) match standards for the second person in your data record.
Person2_GivenName2_Match_STD1-3: Match standards for the given (middle) name of the second person in the data record.
Person3_GivenName1_Match_STD1-3: Given name 1 (first name) match standards for the third person in your data record.
Person3_GivenName2_Match_STD1-3: Match standards for the given (middle) name of the third person in the data record.

UDPM1-4: Input of data that you have defined in your pattern file. For example, CN244-56.
User_Defined_01-20: Use these fields to map custom fields that you want to pass into Data Quality but that do not have a Data Quality counterpart.

Data Quality documentation

For information about using Data Quality, including setting up projects, creating substitution files, mapping fields, and so on, refer to the Data Quality documentation found in Start > Programs > BusinessObjects XI Release 2 > Data Quality 11.5 > Documentation.


Design and Debug

About this chapter

This chapter covers the following Designer features that you can use to design and debug jobs:
• Use the View Where Used feature to determine the impact of editing a metadata object (for example, a table). See which data flows use the same object.
• Use the View Data feature to view sample source, transform, and target data in a data flow after a job executes.
• Use the Interactive Debugger to set breakpoints and filters between transforms within a data flow and view job data row-by-row during a job execution.
• Use the Difference Viewer to compare the metadata for similar objects and their properties.
• Use the auditing data flow feature to verify that correct data is processed by a source, transform, or target object. For more information, see "Using Auditing" on page 362.

This chapter contains the following topics:
• Using View Where Used
• Using View Data
• Using the interactive debugger
• Comparing Objects
• Calculating usage dependencies

Using View Where Used

When you save a job, work flow, or data flow, Data Integrator also saves the list of objects used in them in your repository. Parent/child relationship data is preserved. For example, when the following parent data flow is saved, Data Integrator also saves pointers between it and its three children:
• a table source
• a query transform
• a file target

You can use this parent/child relationship data to determine what impact a table change, for example, will have on other data flows that are using the same table. For example, while maintaining a data flow, you may need to delete a source table definition and re-import the table (or edit the table schema). Before doing this, find all the data flows that are also using the table and update them as needed. The data can be accessed using the View Where Used option.

To access the View Where Used option in the Designer you can work from the object library or the workspace.

From the object library

You can view how many times an object is used and then view where it is used.

To access parent/child relationship information from the object library
1. View an object in the object library to see the number of times that it has been used.

The Usage Count column is displayed on all object library tabs except:
• Projects
• Jobs
• Transforms

Click the Usage Count column heading to sort values, for example, to find objects that are not used.

2. If the Usage count is greater than zero, right-click the object and select View Where Used.

The Output window opens. The Information tab displays rows for each parent of the object you selected. The type and name of the selected object is displayed in the first column's heading. In the following example, table DEPT is used by data flow DF1.

The As column provides additional context. The As column tells you how the selected object is used by the parent. For example, in data flow DF1, table DEPT is used as a Source.

Other possible values for the As column are:
• For XML files and messages, tables, flat files, etc.:
  • Source
  • Target

• For flat files and tables only:
  Lookup(): Translate table/file used in a lookup function
  Lookup_ext(): Translate table/file used in a lookup_ext function
  Lookup_seq(): Translate table/file used in a lookup_seq function
• For tables only:
  Comparison: Table used in the Table Comparison transform
  Key Generation: Table used in the Key Generation transform

3. From the Output window, double-click a parent object.
   The workspace diagram opens highlighting the child object the parent is using.
   Once a parent is open in the workspace, you can double-click a row in the output window again.
   • If the row represents a different parent, the workspace diagram for that object opens.

• If the row represents a child object in the same parent, this object is simply highlighted in the open diagram. This is an important option because a child object in the Output window might not match the name used in its parent. You can customize workspace object names for sources and targets. Data Integrator saves both the name used in each parent and the name used in the object library. The Information tab on the Output window displays the name used in the object library. The names of objects used in parents can only be seen by opening the parent in the workspace.

From the workspace

From an open diagram of an object in the workspace (such as a data flow), you can view where a parent or child object is used:
• To view information for the open (parent) object, select View > Where Used, or from the tool bar, select the View Where Used button. The Output window opens with a list of jobs (parent objects) that use the open data flow.
• To view information for a child object, right-click an object in the workspace diagram and select the View Where Used option. The Output window opens with a list of parent objects that use the selected object. For example, if you select a table, the Output window displays a list of data flows that use the table.

Limitations
• This feature is not supported in central repositories.
• Only parent and child pairs are shown in the Information tab of the Output window. For example, for a table, a data flow is the parent. If the table is also used by a grandparent (a work flow, for example), these are not listed in the Output window display for a table. To see the relationship between a data flow and a work flow, open the work flow in the workspace, then right-click a data flow and select the View Where Used option.
• Transforms are not supported. This includes custom ABAP transforms that you might create to support an SAP R/3 environment.
• The Designer counts an object's usage as the number of times it is used for a unique purpose. For example, in data flow DF1 if table DEPT is used as a source twice and a target once, the object library displays its Usage count as 2. This occurrence should be rare. For example, a table is not often joined to itself in a job design.
• Data Integrator does not save parent/child relationships between functions.
  • If function A calls function B, and function A is not in any data flows or scripts, the Usage count in the object library will be zero for both functions. The fact that function B is used once in function A is not counted.
  • If function A is saved in one data flow, the usage count in the object library will be 1 for both functions A and B.

You can also use the Metadata Reports tool to run a Where Used dependency report for any object. This report lists all related objects, not just parent/child pairs. For more information see "Where Used" on page 465.

Using View Data

View Data provides a way to scan and capture a sample of the data produced by each step in a job, even when the job does not execute successfully. View imported source data, changed data from transformations, and ending data at your targets. At any point after you import a data source, you can check on the status of that data, before and after processing your data flows.

Use View Data to check the data while designing and testing jobs to ensure that your design returns the results you expect. Using one or more View Data panes, you can view and compare sample data from different steps. View Data information is displayed in embedded panels for easy navigation between your flows and the data.

Use View Data to look at:
• Sources and targets
  View Data allows you to see data before you execute a job. You can scan and analyze imported table and file data from the object library as well as see the data for those same objects within existing jobs. Of course, after you execute the job, you can refer back to the source data again. Armed with data details, you can create higher quality job designs.
• Transforms (For more information, see "Viewing data passed by transforms" on page 435)
• Lines in a diagram (For more information, see "Using the interactive debugger" on page 418)

Note: View Data is not supported for SAP R/3 IDocs. For SAP R/3 and PeopleSoft, the Table Profile tab and Column Profile tab options are not supported for hierarchies.

The topics in this section include:
• Accessing View Data
• Viewing data in the workspace
• View Data properties
• View Data tabs

Accessing View Data

There are multiple places throughout Designer where you can open a View Data pane.

Sources and targets

You can view data for sources and targets from two different locations:
• View Data button
  View Data buttons appear on source and target objects when you drag them into the workspace. Click the View data button (magnifying glass icon) to open a View Data pane for that source or target object. (See "Viewing data in the workspace" on page 406 for more information.)

• Object library
  View Data in potential source or target objects from the Datastores or Formats tabs. There are two ways to open a View Data pane from the object library:
  • Right-click a table object and select View Data.
  • Right-click a table and select Open or Properties. The Table Metadata, XML Format Editor, or Properties window opens. From any of these windows, you can select the View Data tab.
  To view data for a file, the file must physically exist and be available from your computer's operating system. To view data for a table, the table must be from a supported database.

Transforms

To view data after transformation, see "Viewing data passed by transforms" on page 435.

Viewing data in the workspace

View Data can be accessed from the workspace when magnifying glass buttons appear over qualified objects in a data flow. This means:

• For sources and targets, files must physically exist and be accessible, and tables must be from a supported database.
• For transforms, see "Viewing data passed by transforms" on page 435.

To open a View Data pane in the Designer workspace, click the magnifying glass button on a data flow object. A large View Data pane appears beneath the current workspace area.

You can open two View Data panes for simultaneous viewing. Click the magnifying glass button for another object and a second pane appears below the workspace area (note that the first pane area shrinks to accommodate the presence of the second pane). When both panes are filled and you click another View Data button, a small menu appears containing window placement icons.

The black area in each icon indicates the pane you want to replace with a new set of data (Replace left pane or Replace right pane). Click a menu option and the data from the latest selected object replaces the data in the corresponding pane.

The description or path for the selected View Data button displays at the top of the pane.
• For sources and targets, the description is the full object name:
  • ObjectName(Datastore.Owner) for tables
  • FileName(File Format Name) for files
• For View Data buttons on a line, the path consists of the object name on the left, an arrow, and the object name to the right. For example, if you select a View Data button on the line between the query named Query and the target named ALVW_JOBINFO(joes.DI_REPO), the path would indicate: Query -> ALVW_JOBINFO(joes.DI_REPO)

You can also find the View Data pane that is associated with an object or line by:
• Rolling your cursor over a View Data button on an object or line. The Designer highlights the View Data pane for the object.
• Looking for grey View Data buttons on objects and lines. The Designer displays View Data buttons on open objects with grey rather than white backgrounds.

View Data properties

You can access View Data properties from tool bar buttons or the right-click menu.

View Data displays your data in the rows and columns of a data grid. The number of rows displayed is determined by a combination of several conditions:
• Sample size — The number of rows sampled in memory. Default sample size is 1000 rows for imported source and target objects. Maximum sample size is 5000 rows. Set sample size for sources and targets from Tools > Options > Designer > General > View Data sampling size. When using the interactive debugger, Data Integrator uses the Data sample rate option instead of sample size. For more information, see "Starting and stopping the interactive debugger" on page 424.
• Filtering
• Sorting

If your original data set is smaller or if you use filters, the number of returned rows could be less than the default.

Filtering

You can focus on different sets of rows in a local or new data sample by placing fetch conditions on columns to create filters. You can see which conditions have been applied in the navigation bar.

To view and add filters
1. In the View Data tool bar, click the Filters button, or right-click the grid and select Filters.
   The Filters window opens.

2. Create filters.
   The Filters window has three columns:
   a. Column—Select a name from the first column.
   b. Operator—Select an operator from the second column.
   c. Value—Enter a value in the third column that uses one of the following data type formats:
      integer, double, real: standard
      date: yyyy.mm.dd
      time: hh24:mm:ss
      datetime: yyyy.mm.dd hh24:mm:ss
      varchar: 'abc'
   Each row in this window is considered a filter.
3. In the Concatenate all filters using list box, select an operator (AND, OR) for the engine to use in concatenating filters.
   Select {remove filter} to delete the filter.
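For illustration only, here are two filter rows that use the formats above; the column names are hypothetical. Concatenated with AND, they would return only rows for hires on or after the given date in the Sales department:

Column: HIRE_DATE    Operator: >=    Value: 2006.01.15 09:30:00
Column: DEPT_NAME    Operator: =     Value: 'Sales'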

4. To see how the filter affects the current set of returned rows, click Apply.
5. To save filters and close the Filters window, click OK.
   Your filters are saved for the current object and the local sample updates to show the data filtered as specified in the Filters dialog. To use filters with a new sample, see "Using Refresh" on page 411.

To remove filters from an object, go to the View Data tool bar and click the Remove Filters button, or right-click the grid and select Remove Filters. All filters are removed for the current object.

To add a filter for a selected cell
1. Select a cell from the sample data grid.
2. In the View Data tool bar, click the Add Filter button, or right-click the cell and select Add Filter.
   The Add Filter option adds the new filter condition, <column> = <cell value>, then opens the Filters window so you can view or edit the new filter.
3. When you are finished, click OK.

Sorting

You can click one or more column headings in the data grid to sort your data. An arrow appears on the heading to indicate sort order: ascending (up arrow) or descending (down arrow). To change sort order, click the column heading again. The priority of a sort is from left to right on the grid.

To remove sorting for an object, from the tool bar click the Remove Sort button, or right-click the grid and select Remove Sort. To use sorts with a new sample, see "Using Refresh" on page 411.

Using Refresh

To fetch another data sample from the database using new filter and sort settings, use the Refresh command. In the tool bar, click the Refresh button, or right-click the data grid and select Refresh.

While Data Integrator is refreshing the data, all View Data controls except the Stop button are disabled. To stop a refresh operation, click the Stop button.

Using Show/Hide Columns

You can limit the number of columns displayed in View Data by using the Show/Hide Columns option from:
• The tool bar.
• The right-click menu.
• The arrow shortcut menu, located to the right of the Show/Hide Columns tool bar button. Select a column to display it. This option is only available if the total number of columns in the table is ten or fewer.

To show or hide columns
1. Click the Show/Hide columns tool bar button, or right-click the data grid and select Show/Hide Columns.
   The Column Settings window opens.
2. Select the columns that you want to display or click one of the following buttons: Show, Hide, Show All, or Hide All.
3. Click OK.

You can also "quick hide" a column by right-clicking the column heading and selecting Hide from the menu.

Opening a new window

To see more of the data sample that you are viewing in a View Data pane, open a full-sized View Data window. From any View Data pane, click the Open Window tool bar button to activate a separate, full-sized View Data window. Alternatively, you can right-click and select Open in new window from the menu.

View Data tool bar options

The following options are available on View Data panes.

Open in new window: Opens the View Data pane in a larger window. See "Opening a new window" on page 412.
Save As: Saves the data in the View Data pane.
Print: Prints View Data pane data.
Copy Cell: Copies View Data pane cell data.
Refresh data: Fetches another data sample from existing data in the View Data pane using new filter and sort settings. See "Using Refresh" on page 411.
Open Filters window: Opens the Filters window. See "Filtering" on page 409.
Add a Filter: See "To add a filter for a selected cell" on page 411.
Remove Filter: Removes all filters in the View Data pane.
Remove Sort: Removes sort settings for the object you select. See "Sorting" on page 411.
Show/hide navigation: Shows or hides the navigation bar which appears below the data table.
Show/hide columns: See "Using Show/Hide Columns" on page 412.

View Data tabs

The View Data panel for objects contains three tabs:
• Data tab

• Profile tab
• Column Profile tab

Use tab options to give you a complete profile of a source or target object. The Data tab is always available. The Profile and Relationship tabs are supported with the Data Profiler (see "Viewing the profiler results" on page 350 for more information). Without the Data Profiler, the Profile and Column Profile tabs are supported for some sources and targets (see release notes for more information).

Data tab

The Data tab allows you to use the properties described in "View Data properties" on page 408. It also indicates nested schemas such as those used in XML files and messages. When a column references nested schemas, that column is shaded yellow and a small table icon appears in the column heading.

To view a nested schema
1. Double-click a cell.
   Alternatively:
   a. Click a cell in a marked column. The Drill Down button (an ellipsis) appears in the cell.
   b. Click the Drill Down button.
   The data grid updates to show the data in the selected cell or nested table.

Design and Debug Using View Data 15 In the Schema area. for example <CompanyName>. Continue to use the data grid side of the panel to navigate. see “Executing a profiler task” on page 342. tables and columns in the selected path are displayed in blue. the Profile tab displays the profile attributes that you selected on the Submit Column Profile Request option. To use the Profile tab without the Data Profiler Select one or more columns. Also. The Profile tab allows you to calculate statistical information for any set of columns you choose. Click the Drill Up button at the top of the data grid to move up in the hierarchy. Profile tab If you use the Data Profiler. In the Data area. For details. For example: • • Select a lower-level nested column and double-click a cell to update the data grid. Data Integrator Designer Guide 415 . while nested schema references are displayed in grey. Nested schema references are shown in angle brackets. 1. This optional feature is not available for columns with nested schemas or for the LONG data type. Find out how you can participate and help to improve our documentation. the selected cell value is marked by a special icon. data is shown for columns. See the entire path to the selected column or table displayed to the right of the Drill Up button.This document is part of a SAP study on PDF usage. Use the path and the data grid to navigate through nested schemas.

Data Integrator saves previously calculated values in the repository and displays them until the next update. The total number of distinct values in this column. The grid contains six columns: Column Column Description Names of columns in the current table. Select names from this column. 3. the minimum value in this column. 15 Design and Debug Using View Data Select only the column names you need for this profiling operation because Update calculations impact performance. Of all values. the maximum value in this column. In addition to updating statistics. Note that Min and Max columns are not sortable. The statistics display in the Profile grid. Click Update. Of all values. You can also right-click to use the Select All and Deselect All menu options. then click Update to populate the profile grid. The time that this statistic was calculated.This document is part of a SAP study on PDF usage. 2. you can click the Records button on the Profile tab to count the total number of physical records in the object you are profiling. 416 Data Integrator Designer Guide . The total number of NULL values in this column. Distinct Values NULLs Min Max Last Updated Sort values in this grid by clicking the column headings. Find out how you can participate and help to improve our documentation.

Column Profile tab

The Column Profile tab allows you to calculate statistical information for a single column. If you use the Data Profiler, the Relationship tab displays instead of the Column Profile (see "Viewing relationship profile data" on page 354 for more information).

Note: This optional feature is not available for columns with nested schemas or the LONG data type.

To calculate value usage statistics for a column
1. Enter a number in the Top box.
   This number is used to find the most frequently used values in the column. The default is 10, which means that Data Integrator returns the top 10 most frequently used values.
2. Select a column name in the list box.
3. Click Update.
   The Column Profile grid displays statistics for the specified column. The grid contains three columns:
   • Value: A "top" (most frequently used) value found in your specified column, or "Other" (remaining values that are not used as frequently).
   • Total: The total number of rows in the specified column that contain this value.
   • Percentage: The percentage of rows in the specified column that have this value compared to the total number of values in the column.

Data Integrator returns a number of values up to the number specified in the Top box, plus an additional value called "Other." So, if you enter 5 in the Top box, you will get up to 6 returned values (the top 5 used values in the specified column, plus the "Other" category). Results are saved in the repository and displayed until you perform a new update.
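The guide illustrates the result with a screenshot of this grid for a Name column. That figure is not reproduced here, but a grid consistent with the example discussed below might look like the following; the counts shown for Item1 and Item4 are hypothetical:

Value    Total    Percentage
Item3    500      50%
Item2    200      20%
Item1    150      15%
Item4    50       5%
Other    100      10%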

For this example, the total number of rows counted during the calculation for each top value is 1000. The statistical results in the preceding table indicate that of the four most frequently used values in the Name column, 50 percent use the value Item3, 20 percent use the value Item2, and so on. You can also see that the four most frequently used values (the "top four") are used in 90 percent of all cases, as only 10 percent is shown in the Other category.

Using the interactive debugger

The Designer includes an interactive debugger that allows you to examine and modify data row-by-row (during a debug mode job execution) by placing filters and breakpoints on lines in a data flow diagram. The interactive debugger provides powerful options to debug a job.

Topics in this section are:
• Before starting the interactive debugger
• Starting and stopping the interactive debugger
• Windows
• Menu options and tool bar
• Viewing data passed by transforms
• Push-down optimizer
• Limitations

Note: A repository upgrade is required to use this feature.

Before starting the interactive debugger

Like executing a job, you can start the interactive debugger from the Debug menu when a job is active in the workspace. Select Start debug, set properties for the execution, then click OK. The debug mode begins.

Debug mode provides the interactive debugger's windows, menus, and tool bar buttons that you can use to control the pace of the job and view data by pausing the job execution using filters and breakpoints.

While in debug mode, all other Designer features are set to read-only. To exit the debug mode and return other Designer features to read/write, click the Stop debug button on the interactive debugger toolbar.

All interactive debugger commands are listed in the Designer's Debug menu. The Designer enables the appropriate commands as you progress through an interactive debugging session.

Before you start a debugging session, however, you might want to set the following:
• Filters and breakpoints
• Interactive debugger port between the Designer and an engine

If you do not set predefined filters or breakpoints:
• The Designer will optimize the debug job execution. This often means that the first transform in each data flow of a job is pushed down to the source database. Consequently, you cannot view the data in a job between its source and the first transform unless you set a predefined breakpoint on that line. For more information, see "Push-down optimizer" on page 435.
• You can pause a job manually by using a debug option called Pause Debug (the job pauses before it encounters the next transform).

Setting filters and breakpoints

You can set any combination of filters and breakpoints in a data flow before you start the interactive debugger. The debugger uses the filters and pauses at the breakpoints you set.

To set a filter or breakpoint
1. In the workspace, open the job that you want to debug.
2. Open one of its data flows.
3. Right-click the line that you want to examine and select Set Filter/Breakpoint.
   A line is a line between two objects in a workspace diagram.

4. you can double-click the line or click the line then: • • • Press F9 Click the Set Filter/Breakpoint button on the tool bar Select Debug > Set Filter/Breakpoint. 420 Data Integrator Designer Guide . Set and enable a filter or a breakpoint using the options in this window. the following window represents the line between AL_ATTR (a source table) and Query (a Query transform). 15 Design and Debug Using the interactive debugger Alternatively. Find out how you can participate and help to improve our documentation. Its title bar displays the objects to which the line connects. The Breakpoint window opens.This document is part of a SAP study on PDF usage. For example.

A debug filter functions as a simple Query transform with a WHERE clause. Use a filter to reduce a data set in a debug job execution. Note that complex expressions are not supported in a debug filter. Place a debug filter on a line between a source and a transform or two transforms.

A breakpoint is the location where a debug job execution pauses and returns control to you. Like a filter, you can set a breakpoint between a source and transform or two transforms. If you set a filter and a breakpoint on the same line, Data Integrator applies the filter first. The breakpoint can only see the filtered rows.

Choose to use a breakpoint with or without conditions:
• If you use a breakpoint without a condition, the job execution pauses for the first row passed to a breakpoint.
• If you use a breakpoint with a condition, the job execution pauses for the first row passed to the breakpoint that meets the condition. A breakpoint condition applies to the after image for UPDATE, NORMAL and INSERT row types and to the before image for a DELETE row type.

Instead of selecting a conditional or unconditional breakpoint, you can also use the Break after 'n' row(s) option. In this case, the execution pauses when the number of rows you specify pass through the breakpoint.

5. Click OK.
   The appropriate icon appears on the selected line.

Data Integrator provides the following filter and breakpoint conditions, each indicated by its own icon on the line:
• Breakpoint disabled
• Breakpoint enabled
• Filter disabled
• Filter enabled
• Filter and breakpoint disabled
• Filter and breakpoint enabled
• Filter enabled and breakpoint disabled
• Filter disabled and breakpoint enabled

In addition to the filter and breakpoint icons that can appear on a line, the debugger highlights a line when it pauses there. A red locator box also indicates your current location in the data flow. For example, when you start the interactive debugger, the job pauses at your breakpoint. The locator box appears over the breakpoint icon as shown in the following diagram. A View Data button also appears over the breakpoint. You can use this button to open and close the View Data panes.

As the debugger steps through your job's data flow logic, it highlights subsequent lines and displays the locator box at your current position. For more information, see "Windows" on page 427.

Changing the interactive debugger port

The Designer uses a port to an engine to start and stop the interactive debugger. The interactive debugger port is set to 5001 by default.

To change the interactive debugger port setting
1. Select Tools > Options > Designer > Environment.
2. Enter a value in the Interactive Debugger box.
3. Click OK.

Starting and stopping the interactive debugger

A job must be active in the workspace before you can start the interactive debugger. You can select a job from the object library or from the project area to activate it in the workspace. Once a job is active, the Designer enables the Start Debug option on the Debug menu and tool bar.

To start the interactive debugger
1. In the project area, right-click a job and select Start debug.
   Alternatively, in the project area you can click a job and then:
   • Press Ctrl+F8
   • From the Debug menu, click Start debug
   • Click the Start debug button on the tool bar
   The Debug Properties window opens.

The Debug Properties window includes three parameters similar to the Execution Properties window (used when you just want to run a job): Print all trace messages, Monitor sample rate, and Job Server options. See the Data Integrator Reference Guide for a description of the Monitor sample rate. You will also find more information about the Trace and Global Variable options.

The options unique to the Debug Properties window are:
• Data sample rate — The number of rows cached for each line when a job executes using the interactive debugger. For example, in the following data flow diagram, if the source table has 1000 rows and you set the Data sample rate to 500, then the Designer displays up to 500 of the last rows that pass through a selected line.

   The debugger displays the last row processed when it reaches a breakpoint.
• Exit the debugger when the job is finished — Click to stop the debugger and return to normal mode after the job executes. Defaults to cleared.

2. Enter the debug properties that you want to use or use the defaults.
3. Click OK.
   The job you selected from the project area starts to run in debug mode. The Designer:
   • Displays the interactive debugger windows.
   • Adds Debugging Job <JobName> to its title bar.
   • Enables the appropriate Debug menu and tool bar options.
   • Displays the debug icon in the status bar.
   • Sets the user interface to read-only.
   Note: You cannot perform any operations that affect your repository (such as dropping objects into a data flow) when you execute a job in debug mode.

When the debugger encounters a breakpoint, it pauses the job execution. You now have control of the job execution. The interactive debugger windows display information about the job execution up to this point. They also update as you manually step through the job or allow the debugger to continue the execution.

To stop a job in debug mode and exit the interactive debugger
Click the Stop Debug button on the tool bar, press Shift+F8, or from the Debug menu, click Stop debug.

Windows

When you start a job in the interactive debugger, the Designer displays three additional windows as well as the View Data panes beneath the workspace:
• Call Stack window
• Trace window
• Variable window

The following diagram shows the default locations for these windows. Each window is docked in the Designer's window. To move a debugger window, double-click on the window's control bar to release it, then click and drag its title bar to re-dock it.

The Designer saves the layout you create when you stop the interactive debugger. Your layout is preserved for your next Designer session.

You can resize or hide a debugger window using its control buttons. To show or hide a debugger window manually, use the Debug menu or the tool bar. See "Menu options and tool bar" on page 433.

Call stack window

The Call Stack window lists the objects in the path encountered so far (before either the job completes, encounters a breakpoint, or you pause it).

For example, the following Call Stack window indicates that the data you are currently viewing is in a data flow called aSimple and shows that the path taken began with a job called Simple and passed through a condition called Switch before it entered the data flow.

You can double-click an object in the Call Stack window to open it in the workspace. Similarly, if you click an object in a diagram, the Call Stack window highlights the object.

Trace window

The Trace window displays the debugger's output status and errors. When the job completes, this window displays the following:
Job <JobName> finished. Stop debugger.

When the job completes, the debugger gives you a final opportunity to examine data. When you must exit the debugger, select the Stop Debug button on the tool bar, press Shift+F8, or select Debug > Stop Debug.

Debug Variables window

The Debug Variables window displays global variables in use by the job at each breakpoint.

View Data pane

The View Data pane for lines uses the same tool bar and navigation options described for the View Data feature. For more information, see "Using View Data" on page 404.

The following View Data pane options are unique to the interactive debugger:
• Allows you to view data that passes through lines.
• Displays (above the View Data tool bar) the names of objects to which a line connects using the format: TableName(DatastoreName.TableOwnerName) -> QueryName.
• Displays data one row at a time by default.
• Provides the All check box which allows you to see more than one row of processed data.
• Uses a property called the Data sample rate.
• Allows you to flag a row that you do not want the next transform to process. To discard a row from the next step in a data flow process, select it and click Discard Row. Discarded row data appears in the strike-through style in the View Data pane (for example, 100345). If you accidentally discard a row, you can undo the discard immediately afterwards. Select the discarded row and click Undo Discard Row.
• Allows you to edit data in a cell. You might want to fix an error temporarily to continue with a debugger run. You can fix the job design later to eliminate the error permanently. To edit cell data:
  • Deselect the All check box so that only one row is displayed.
  • Double-click a cell or right-click it and select Edit cell.

Alternatively, right-click a row and select either Discard Row or Undo Discard Row from the shortcut menu.

For example, if a source in a data flow has four rows and you set the Data sample rate to 2 when you start the debugger, it displays the first row processed at a pre-defined breakpoint.

Data Integrator Designer Guide 431 . select the All check box on the upper-right corner of this pane. If you click Get Next Row again. Find out how you can participate and help to improve our documentation. then the next row at the same breakpoint is displayed: If you want to see both rows. Design and Debug Using the interactive debugger 15 If you use the Get Next Row option. At this point. The row displayed at the bottom of the table is the last row processed. only the last two rows processed are displayed because you set the sample size to 2.This document is part of a SAP study on PDF usage. . you have viewed two rows that have passed through a line.

Filters and Breakpoints window
You can manage interactive debugger filters and breakpoints using the Filters/Breakpoints window. You can open this window from the Debug menu or tool bar.

Lines that contain filters or breakpoints are listed on the far-left side of the Filters/Breakpoints window. To manage these, select the line(s) that you want to edit, select a command from the list, and click Execute. You can also select a single line on the left and view or edit its filters and breakpoints on the right side of this window. When you are finished using the Filters/Breakpoints window, click OK.

Menu options and tool bar
Once you start the interactive debugger, you can access appropriate options from the Designer's Debug menu and tool bar.

Table 15-2: Debug menu and tool bar options

Execute (F8) — Opens the Execution Properties window from which you can select job properties, then execute a job outside of debug mode. Available when a job is active in the workspace.

Start debug (Ctrl+F8) — Opens the Debug Properties window from which you can select job properties, then execute a job in debug mode (start the debugger). Available when a job is active in the workspace. Other Designer operations are set to read-only until you stop the debugger.

Stop debug (Shift+F8) — Stops a debug mode execution and exits the debugger. All Designer operations are reset to read/write.

Pause debug (no key command) — Allows you to manually pause the debugger. You can use this option instead of a breakpoint.

Step over (F10) — Allows you to manually move to the next line in a data flow by stepping over a transform in the workspace. If the transform you step over has multiple outputs, the Designer provides a popup menu from which you can select the logic branch you want to take. The workspace displays a red square on the line to indicate the path you are using. Use this option to see the first row in a data set after it is transformed. Available when a data flow is active in the workspace.

Get next row (F11) — Allows you to stay at the current breakpoint and view the next row of data in the data set.

Continue (Ctrl+F10) — Allows you to give control of the job back to the Designer. The debugger continues until:
• You use the Pause debug option
• Another breakpoint is encountered
• The job completes

Show Filters/Breakpoints (no key command) — Shows all filters and breakpoints that exist in a job. When not selected, all filters and breakpoints are hidden from view. This option is always available in the Designer.

Set Filter/Breakpoints... (F9) — Opens a dialog from which you can set, edit, remove, enable, or disable filters and breakpoints. You can also set conditions for breakpoints. From the workspace, you can right-click a line and select the same option from a shortcut menu. This option is always available in the Designer.

Filters/Breakpoints... (Alt+F9) — Opens a dialog with which you can manage multiple filters and breakpoints in a data flow. Also offers the same functionality as the Set Filters/Breakpoints window.

Call Stack (no key command) — Shows or hides the Call Stack window.

Variables (no key command) — Shows or hides the Debug Variables window.

Trace (no key command) — Shows or hides the Trace window.

Viewing data passed by transforms
To view the data passed by transforms, execute the job in debug mode.

To view data passed by transforms
1. In the project area, right-click a job and click Start debug.
   The Debug Properties window opens.
2. Clear the Exit the debugger when the job is finished check box.
3. You can enter a value in the Data sample rate text box or leave the default value, which is 500.
4. Click OK.

To view sample data in debug mode
1. While still in debug mode after the job completes, in the project area, click the name of the data flow to view.
2. Click the View Data button displayed on a line in the data flow.
3. Navigate through the data to review it.
4. When done, click the Stop debug button on the toolbar.

Push-down optimizer
When Data Integrator executes a job, it normally pushes down as many operations as possible to the source database to maximize performance. Because the interactive debugger requires a job execution, the following push-down rules apply:
• Query transforms — The first transform after a source object in a data flow is optimized in the interactive debugger and pushed down to the source database if both objects meet the push-down criteria and if you have not placed a breakpoint on the line before the first transform.

• Breakpoints — Data Integrator does not push down any operations if you set a pre-defined breakpoint. Pre-defined breakpoints are breakpoints defined before you start the interactive debugger.
• Filters — If the input of a pre-defined filter is a database source, it is pushed down. Pre-defined filters are interactive debugger filters defined before you start the interactive debugger.

For more information about push-down criteria, see the Data Integrator Performance Optimization Guide.

Limitations
• The interactive debugger can be used to examine data flows. Debug options are not available at the work flow level.
• A repository upgrade is required to use this feature.
• The debugger cannot be used with SAP R/3 data flows.
• All objects in a data flow must have a unique name. For example, after the interactive debugger is started, if there are several outputs for a transform you can choose which path to use; if any of these objects have the same name, the result of your selection is unpredictable.
• If the first transform is pushed down, the line is disabled during the debugging session. You cannot place a breakpoint on this line, and you cannot use the View Data pane.

Comparing Objects
Data Integrator allows you to compare any two objects and their properties by using the Difference Viewer utility. You can compare:
• two different objects
• different versions of the same object
• an object in the local object library with its counterpart in the central object library

You can compare just the top-level objects, or you can include the object's dependents in the comparison. Objects must be of the same type; for example, you can compare a job to another job or a custom function to another custom function, but you cannot compare a job to a data flow.

Depending on the object type, the panes show items such as the object's properties and the properties of and connections (links) between its child objects. The window identifies changed items with a combination of icons, color, and background shading. Some of these properties are configurable.

To compare two different objects
1. In the local or central object library, right-click an object name.
2. From the shortcut menu, highlight Compare, and from the submenu, click one of the following options (availability depends on the object you selected):
   • Object to central — Compares the selected object to its counterpart in the central object library
   • Object with dependents to central — Compares the selected object and its dependent objects to its counterpart in the central object library
   • Object to... — Compares the selected object to another similar type of object
   • Object with dependents to... — Compares the selected object and its dependents to another similar type of object
   The cursor changes to a target icon. When you move the cursor over an object that is eligible for comparison, the target cursor changes color.
3. Click on the desired object.
   The Difference Viewer window opens in the workspace.

(Figure: the Difference Viewer window, showing the object names and their locations, the navigation bar, the toolbar, and the status bar.)

To compare two versions of the same object
If you are working in a multiuser environment and using a central object library, you can compare two objects that have different versions or labels.
1. In the central object library, right-click an object name, and from the shortcut menu click Show History.
2. In the History window, Ctrl-click the two versions or labels you want to compare.
3. Click Show Differences or Show Differences with Dependents.
   The Difference Viewer window opens in the workspace.
4. Close the History window.

For more information about using the History window, see the Data Integrator Advanced Development and Migration Guide.

Overview of the Difference Viewer window
The first object you selected appears in the left pane of the window, and the second object appears on the right. Following each object name is its location.

The Difference Viewer window includes the following features:
• toolbar
• navigation bar
• status bar
• shortcut menu

Also, when a Difference Viewer window is active, the main Designer window contains a menu called Difference Viewer. The next sections describe these features.

You can have multiple Difference Viewer windows open at a time in the workspace. To refresh a Difference Viewer window, press F5. Expanding or collapsing any property set also expands or collapses the compared object's corresponding property set.

Toolbar
The toolbar includes the following buttons.

Navigation buttons:
• First Difference (Alt+Home)
• Previous Difference (Alt+left arrow)
• Current Difference
• Next Difference (Alt+right arrow)
• Last Difference (Alt+End)

Filter buttons:
• Enable filter(s) — Click to open the Filters dialog box.
  • Hide non-executable elements — Select this option to remove from view those elements that do not affect job execution.
  • Hide identical elements — Select this option to remove from view those elements that do not have differences.
• Disable filters — Removes all filters applied to the comparison.
• Show levels — Show Level 1 shows only the objects you selected for comparison, Show Level 2 expands to the next level, etc. Show All Levels expands all levels of both trees.
• Find (Ctrl+F) — Click to open a text search dialog box.

• Open in new window — Click to open the currently active Difference Viewer in a separate window. You must close this window before continuing in Data Integrator.

Navigation bar
The vertical navigation bar contains colored bars that represent each of the differences throughout the comparison. The colors correspond to those in the status bar for each difference. An arrow in the navigation bar indicates the difference that is currently highlighted in the panes. The purple brackets in the bar indicate the portion of the comparison that is currently in view in the panes. You can click on the navigation bar to select a difference (the cursor point will have a star on it). See the next section for more information on how to navigate through differences.

Status bar
The status bar at the bottom of the window includes a key that illustrates the color scheme and icons that identify the differences between the two objects:
• Deleted — The item does not appear in the object in the right pane.
• Changed — The differences between the items are highlighted in blue (the default) text.
• Inserted — The item has been added to the object in the right pane.
• Consolidated — This icon appears next to an item if items within it have differences. Expand the item by clicking its plus sign to view the differences.

You can change the color of these icons by right-clicking in the Difference Viewer window and clicking Configuration. See the next section, "Shortcut menu".

The status bar also includes a reference for which difference is currently selected in the comparison (for example, the currently highlighted difference is 9 of 24 total differences in the comparison), and indicates whether at least one filter is applied to the comparison.

Shortcut menu
Right-clicking in the body of the Difference Viewer window displays a shortcut menu that contains all the toolbar commands plus:
• View — Toggle to display or hide the status bar, navigation bar, or secondary toolbar (an additional toolbar that appears at the top of the window; for example, you might find this useful if you have the Difference Viewer open in a separate window).
• Layout — Use to reposition the navigation bar.
• Configuration — Click to modify viewing options for elements with differences. For example, see "To change the color scheme" on page 441.

To change the color scheme
The status bar at the bottom of the Difference Viewer window shows the current color scheme being used to identify deleted, changed, inserted, or consolidated items in the comparison panes. You can customize this color scheme as follows.
1. Right-click in the body of the Difference Viewer window to display the shortcut toolbar.
2. Click Configuration to open the Configuration window.
3. Click a marker (Inserted, Deleted, Changed, or Consolidated) to change.
4. Click the Color sample to open the Color palette.
5. Click a Basic color or create a custom color.
6. Click OK.
7. Click another marker to change it, or click OK to close the Configuration window.

To change the background shading
Items with differences appear with a background default color of grey. You can customize this background.
1. Right-click in the body of the Difference Viewer window to display the shortcut toolbar.
2. Click Configuration to open the Configuration window.
3. Click a marker to change, or select the Apply for all markers check box.
4. Click the Background sample to open the Color palette.
5. Click a Basic color or create a custom color.
6. Click OK.
7. To apply different background colors to different markers, click the marker to configure and repeat steps 4 through 6.
8. Click OK to close the Configuration window.

Difference Viewer menu
When a Difference Viewer window is active in the workspace, the main Designer window contains a menu called Difference Viewer. The menu contains the same commands as the toolbar.

Navigating through differences
The Difference Viewer window offers several options for navigating through differences.

You can navigate through the differences between the objects using the navigation buttons on the toolbar. For example, clicking the Next Difference button highlights the next item that differs in some way from the compared object. The item is marked with the appropriate icon and only the differing text appears highlighted in the color assigned to that type of difference.

You can also use the navigation bar. Select an item in either pane that has a difference; note that an arrow appears next to the colored bar that corresponds to that item. You can click on these bars to jump to different places in the comparison, for example to view only inserted items (with a default color of green). The purple brackets in the bar indicate the portion of the comparison that is currently in view in the panes. Use the scroll bar in either pane to adjust the bracketed view.

For text-based items such as scripts, click the magnifying glass to view the text in a set of new panes that appear below the main object panes. Use the scroll bars for these panes to navigate within them. Click the magnifying glass (or any other item) to close the text panes.

Calculating usage dependencies
You can calculate usage dependencies from the Designer at any time.

• To calculate usage dependencies from the Designer
  Right-click in the object library of the current repository and select Repository > Calculate Usage Dependencies.
  The Calculate Usage Dependency option populates the internal AL_USAGE table and ALVW_PARENT_CHILD view.
  Note: If you change configuration settings for your repository, you must also change the internal datastore configuration for the calculate usage dependencies operation.

• To calculate column mappings from the Designer
  Right-click the object library and select Repository > Calculate column mappings, or select Tools > Options > Designer > General > Calculate column mappings when saving data flows.
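Once populated, these internal objects can be queried like any other repository table. The following is a minimal sketch only: the datastore name repo_ds is a hypothetical datastore pointing at the repository database and is not defined by this guide.

    # Sketch: confirm that the usage-dependency view has been populated.
    # 'repo_ds' is a hypothetical datastore for the repository database.
    $dep_count = sql('repo_ds', 'SELECT COUNT(*) FROM ALVW_PARENT_CHILD');
    print('Parent/child dependency rows: [$dep_count]');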


Exchanging metadata

About this chapter
Data Integrator allows you to import metadata from AllFusion ERwin Data Modeler (ERwin) from Computer Associates and export metadata for use with reporting tools like those available with the BusinessObjects 2000 BI Platform. You can use the Metadata Exchange option or the Business Objects Universes option to export metadata.

This chapter discusses these topics:
• Metadata exchange
• Creating Business Objects universes
• Attributes that support metadata exchange

Metadata exchange
You can exchange metadata between Data Integrator and third-party tools using XML files and the Metadata Exchange option.
• Using the Metadata Exchange option, you can export metadata into an XML file. After you create the file, you must manually import it into another tool.
• Using the Business Objects Universes option, you can export metadata directly from a repository into a universe using the Create or Update data mode.

Data Integrator supports two built-in metadata exchange formats:
• CWM 1.0 XML/XMI 1.1 — CWM (the Common Warehouse Metamodel) is a specification that enables easy interchange of data warehouse metadata between tools, platforms, and repositories in distributed heterogeneous environments.
• ERwin 4.x XML

Data Integrator can also use:
• MIMB (the Meta Integration® Model Bridge) — MIMB is a Windows stand-alone utility that converts metadata models among design tool formats. By using MIMB with Data Integrator, you can exchange metadata with all formats that MIMB supports. If MIMB is installed, the additional formats it supports are listed in the Metadata Exchange window.

• BusinessObjects Universe Builder — Converts Data Integrator repository metadata to Business Objects universe metadata. See "Creating Business Objects universes" on page 449.

This section discusses:
• Importing metadata files into Data Integrator
• Exporting metadata files from Data Integrator

Importing metadata files into Data Integrator
You can import metadata from ERwin Data Modeler 4.x XML into a Data Integrator datastore.

To import metadata into Data Integrator using Metadata Exchange
1. From the Tools menu, select Metadata Exchange.
2. In the Metadata Exchange window, select Import metadata from file.
3. In the Metadata format box, select ERwin 4.x XML from the list of available formats.
4. Specify the Source file name (enter directly or click Browse to search).
5. Select the Target datastore name from the list of Data Integrator datastores.
6. Click OK to complete the import.

Exporting metadata files from Data Integrator
You can export Data Integrator metadata into a file that other tools can read.

To export metadata from Data Integrator using Metadata Exchange
1. From the Tools menu, select Metadata Exchange.

2. In the Metadata Exchange window, select Export Data Integrator metadata to file.
3. Select a Metadata format for the target from the list of available formats.
   If you have MIMB installed and you select an MIMB-supported format, select the Visual check box to open the MIMB application when completing the export process. If you do not select the Visual check box, the metadata is exported without opening the MIMB application. Using the MIMB application provides more configuration options for structuring the metadata in the exported file.
4. Specify the target file name (enter directly or click Browse to search).
   When you search for the file, you open a typical browse window.

   Find any of the following file formats/types:
   • DI CWM 1.0 XML/XMI 1.1 — file type XML
   • DI ERwin 4.x XML — file type XML
   • MIMB format (only if installed) — file type All
   After you select a file, click Open.
5. Select the Source datastore name from the list of Data Integrator datastores.
6. Click OK to complete the export.

Creating Business Objects universes
Data Integrator allows you to easily export its metadata to Business Objects universes for use with business intelligence tools. A universe is a layer of metadata used to translate physical metadata into logical metadata. For example, the physical column name deptno might become Department Number according to a given universe design.

Note: To use this export option, first install BusinessObjects Universe Builder on the same computer as BusinessObjects Designer and Data Integrator Designer. You can install Universe Builder using the installer for Data Integrator Designer or using the separate Universe Builder CD.

You can create Business Objects universes using the Tools menu or the object library.

To create universes using the Tools menu
1. Select Tools > Business Objects Universes.
2. Select either Create or Update.
   The Create Universe or Update Universe window opens.
3. Select the datastore(s) that contain the tables and columns to export and click OK.
   Data Integrator launches the Universe Builder application and provides repository information for the selected datastores.

For more information, refer to the BusinessObjects Universe Builder Guide.

To create universes using the object library
1. Select the Datastores tab.
2. Right-click a datastore and select Business Objects Universes.
3. Select either Create or Update.
   Data Integrator launches the Universe Builder application and provides repository information for the selected datastores.

For more information, refer to the BusinessObjects Universe Builder Guide.

Mappings between repository and universe metadata
Data Integrator metadata maps to BusinessObjects Universe metadata as follows:

Data Integrator -> BusinessObjects Universe
Table -> Class, table
Column -> Object, column
Owner -> Schema
Column data type (see next table) -> Object data type
Primary key/foreign key relationship -> Join expression
Table description -> Class description
Table Business Description -> Class description
Table Business Name -> Class name
Column description -> Object description
Column Business description -> Object description
Column Business Name -> Object name
Column mapping -> Object description
Column source information (lineage) -> Object description

Data types also map:

Data Integrator data type -> BusinessObjects type
Date/Datetime/Time -> Date
Decimal -> Number
Int -> Number
Double/Real -> Number

Interval -> Number
Varchar -> Character
Long -> Long Text

Attributes that support metadata exchange
The attributes Business_Name and Business_Description exist in Data Integrator for both tables and columns. These attributes support metadata exchanged between Data Integrator and BusinessObjects through the Universe Builder (UB) 1.1.
• A Business_Name is a logical field. Data Integrator stores it as a separate and distinct field from physical table or column names. Use this attribute to define and run jobs that extract, transform, and load physical data while the Business Name data remains intact.
• A Business_Description is a business-level description of a table or column. Data Integrator transfers this information separately and adds it to a BusinessObjects Class description.

Data Integrator includes two additional column attributes that support metadata exchanged between Data Integrator and BusinessObjects:
• Column_Usage
• Associated_Dimension

For more information, see the Data Integrator Reference Guide.


Recovery Mechanisms

About this chapter
Recovery mechanisms are available in Data Integrator for batch jobs only. This chapter contains the following topics:
• Recovering from unsuccessful job execution
• Automatically recovering jobs
• Manually recovering jobs using status tables
• Processing data with problems

Recovering from unsuccessful job execution
If a Data Integrator job does not complete properly, you must fix the problems that prevented the successful execution of the job and run the job again. However, during the failed job execution, some data flows in the job may have completed and some tables may have been loaded, partially loaded, or altered. Therefore, you need to design your data movement jobs so that you can recover—that is, rerun the job and retrieve all the data without duplicate or missing data.

You can use various techniques to recover from unsuccessful job executions. This section discusses two techniques:
• Automatically recovering jobs — A Data Integrator feature that allows you to run unsuccessful jobs in recovery mode.
• Manually recovering jobs using status tables — A design technique that allows you to rerun jobs without regard to partial results in a previous run.

You might need to use a combination of these techniques depending on the relationships between data flows in your application. If you do not use these techniques, you might need to roll back changes manually from target tables if interruptions occur during job execution.

Automatically recovering jobs
With automatic recovery, Data Integrator records the result of each successfully completed step in a job. If a job fails, you can choose to run the job again in recovery mode. During recovery mode, Data Integrator retrieves the results for successfully completed steps and reruns uncompleted or failed steps under the same conditions as the original job. For recovery purposes, Data Integrator considers steps that raise exceptions as failed steps, even if the step is caught in a try/catch block.

Enabling automated recovery
To use the automatic recovery feature, you must enable the feature during initial execution of a job. Data Integrator saves the results from successfully completed steps when the automatic recovery feature is enabled.

To run a job from Designer with recovery enabled
1. In the project area, select the job name.
2. Right-click and choose Execute.
   Data Integrator prompts you to save any changes.
3. Make sure that the Enable Recovery check box is selected on the Execution Properties window.
   If this check box is not selected, Data Integrator does not record the results from the steps during the job and cannot recover the job if it fails. In that case, you must perform any recovery operations manually.

To run a job with recovery enabled from the Administrator
When you schedule or execute a job from the Administrator, select the Enable Recovery check box.

Marking recovery units
In some cases, steps in a work flow depend on each other and must be executed together. Because of the dependency, you should designate the work flow as a "recovery unit." When a work flow is a recovery unit, the entire work flow must complete successfully. If the work flow does not complete successfully, Data Integrator executes the entire work flow during recovery, even steps that executed successfully in prior work flow runs.

However, there are some exceptions to recovery unit processing. For example, when you specify that a work flow or a data flow should only execute once, a job will never re-execute that work flow or data flow after it completes successfully, even if that work flow or data flow is contained within a recovery unit work flow that re-executes. Therefore, Business Objects recommends that you not mark a work flow or data flow as Execute only once when the work flow or a parent work flow is a recovery unit. For more information about how Data Integrator processes data flows and work flows with multiple conditions like execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.

To specify a work flow as a recovery unit
1. In the project area, select the work flow.
2. Right-click and choose Properties.
3. Select the Recover as a unit check box, then click OK.

During recovery, Data Integrator considers this work flow a unit. If the entire work flow completes successfully—that is, without an error—during a previous execution, then Data Integrator retrieves the results from the previous execution. If any step in the work flow did not complete successfully, then the entire work flow re-executes during recovery.

On the workspace diagram, the black "x" and green arrow symbol indicate that a work flow is a recovery unit.

Running in recovery mode
If a job with automated recovery enabled fails during execution, you can re-execute the job in recovery mode. As with any job execution failure, you need to determine and remove the cause of the failure and rerun the job in recovery mode. If you need to make any changes to the job itself to correct the failure, you cannot use automatic recovery but must run the job as if it is a first run.

In recovery mode, Data Integrator executes the steps or recovery units that did not complete successfully in a previous execution—this includes steps that failed and steps that threw an exception but completed successfully, such as those in a try/catch block. As in normal job execution, Data Integrator executes the steps in parallel if they are not connected in the work flow diagrams and in serial if they are connected.

To run a job in recovery mode from Designer
1. In the project area, select the (failed) job name.
2. Right-click and choose Execute.
   Data Integrator prompts you to save any objects that have unsaved changes.
3. Make sure that the Recover from last failed execution check box is selected.
   When you select Recover from last failed execution, Data Integrator retrieves the results from any steps that were previously executed successfully and executes or re-executes any other steps. This option is not available when a job has not yet been executed, when the previous run succeeded, or when recovery mode was disabled during the previous run.

If you clear this option, Data Integrator runs this job anew, performing all steps.

When you schedule or execute a (failed) job from the Administrator, select the Recover from last failed execution check box.

Ensuring proper execution path
The automated recovery system requires that a job in recovery mode runs again exactly as it ran previously. If the job was allowed to run under changed conditions—suppose a sysdate function returns a new date to control what data is extracted—then the new data loaded into the targets will no longer match data successfully loaded into the target during the first execution of the job. In addition, if the recovery job uses new values, then the job execution may follow a completely different path through conditional steps or try/catch blocks.

For example, suppose a daily update job running overnight successfully loads dimension tables in a warehouse. However, while the job is running, the database log overflows and stops the job from loading fact tables. The next day, the administrator truncates the log file and runs the job again in recovery mode. The recovery job does not reload the dimension tables because the original, failed run successfully loaded them. To ensure that the fact tables are loaded with the data that corresponds properly to the data already loaded in the dimension tables, the recovery job must use the same extraction criteria that the original job used when loading the dimension tables. If the recovery job used new extraction criteria—such as basing data extraction on the current system date—the data in the fact tables would not correspond to the data previously extracted into the dimension tables.

To ensure that the recovery job follows the exact execution path that the original job followed, Data Integrator records any external inputs to the job—return values for systime and sysdate, results from scripts, and so forth—and the recovery job uses the stored values.

When recovery is enabled, Data Integrator stores results from the following types of steps:
• Work flows
• Batch data flows
• Script statements
• Custom functions (stateless type only)

• SQL function
• exec function
• get_env function
• rand function
• sysdate function
• systime function

Using try/catch blocks with automatic recovery
Data Integrator does not save the result of a try/catch block for reuse during recovery. If an exception is thrown inside a try/catch block, then during recovery Data Integrator executes the step that threw the exception and subsequent steps.

Because the execution path through the try/catch block might be different in the recovered job, using variables set in the try/catch block could alter the results during automatic recovery.

For example, suppose you create a job that defines a variable, $i, that you set within a try/catch block. If an exception occurs, you set an alternate value for $i. Subsequent steps are based on the value of $i.

(Figure: job execution logic. A script sets $i = 10; the catch block sets $i = 0; a conditional then tests IF $i < 1, with one work flow on the TRUE branch and another on the FALSE branch.)

During the first job execution, the first work flow contains an error that throws an exception, which is caught. However, the job fails in the subsequent work flow.

(Figure: the first job execution. An exception is thrown and caught in the first work flow, and an error occurs while processing a subsequent work flow.)

You fix the error and run the job in recovery mode.

(Figure: the recovery execution. No exception is thrown in the recovery execution, so the execution path changes because of the results from the try/catch block.)

During the recovery execution, the first work flow no longer throws the exception. Thus the value of the variable, $i, is different, and the job selects a different subsequent work flow, producing different results.

To ensure proper results with automatic recovery when a job contains a try/catch block, do not use values set inside the try/catch block in any subsequent steps.

Ensuring that data is not duplicated in targets
Define work flows to allow jobs correct recovery. A data flow might be partially completed during an incomplete run; as a result, only some of the required rows could be inserted in a table. You do not want to insert duplicate rows during recovery when the data flow re-executes.

You can use several methods to ensure that you do not insert duplicate rows:

• Design the data flow to completely replace the target table during each execution.

  This technique can be optimal when the changes to the target table are numerous compared to the size of the table. You can use tuning techniques such as bulk loading options to improve overall performance.

• Set the auto correct load option for the target table.
  The auto correct load option checks the target table for existing rows before adding new rows to the table. Using the auto correct load option, however, can needlessly slow jobs executed in non-recovery mode. Consider this technique when the target table is large and the changes to the table are relatively few.

• Include a SQL command to execute before the table loads.
  Preload SQL commands can remove partial database updates that occur during incomplete execution of a step in a job. Typically, the preload SQL command deletes rows based on a variable that is set before the partial insertion step began.

Using preload SQL to allow re-executable data flows
To use preload SQL commands to remove partial database updates, tables must contain a field that allows you to tell when a row was inserted. Create a preload SQL command that deletes rows based on the value in that field.

For example, suppose a table contains a column that records the time stamp of any row insertion. You can create a script with a variable that records the current time stamp before any new rows are inserted. In the target table options, add a preload SQL command that deletes any rows with a time-date stamp greater than that recorded by the variable.

During initial execution, no rows match the deletion criteria. During recovery, the variable value is not reset. (The variable value is set in a script, which is executed successfully during the initial run.) The rows inserted during the previous, partial database load would match this criteria, and the preload SQL command would delete them.

To use preload SQL commands properly, you must define variables and pass them to data flows correctly.

To use preload SQL commands to ensure proper recovery
1. Determine appropriate values that you can use to track records inserted in your tables.

   For example, if each row in a table is marked with the insertion time stamp, then you can use the value from the sysdate() function to determine when a row was added to that table.
2. Create variables that can store the "tracking" values.
   Variables are either job or work-flow specific. Generally, you do not want tracking variables reset during recovery because, when they reset, the preload SQL command will not work properly. If a work flow is a recovery unit, create the "tracking" variables for that work flow at the job level; otherwise, create your tracking variables at the work flow level. For information about creating a variable, see "Defining local variables" on page 302.
3. Create scripts that set the variables to the appropriate values.
   Scripts are unique steps in jobs or work flows. You need to create a separate script that sets the required variables before each data flow or work flow that loads a table. If a work flow is a recovery unit, create the scripts for that work flow at the job level; otherwise, create the scripts at the work flow level. For information about creating scripts, see "Scripts" on page 211.
4. Connect the scripts to the corresponding data flows or work flows.
   (Figure: a job showing the two cases. When a work flow is a recovery unit, create scripts that set tracking variables outside the work flow, typically at the job level, and connect the script to the work flow. When a work flow is not a recovery unit, create scripts that set tracking variables inside the work flow, before the data flow that requires the value, and connect the script directly to the appropriate data flow.)

5. Create parameters to pass the variable information from the job or work flow where you created the variable to the data flow that uses the tracking variable in the preload SQL command.
   For information about creating parameters, see "Defining parameters" on page 302.
6. Insert appropriate preload SQL commands that remove any records inserted during earlier unsuccessful runs.
   The preload SQL commands reference the parameter containing the tracking variable, deleting rows that were inserted after the variable was set.
   For example, suppose the PO_ITEM table records the date-time stamp in the TIMESTMP column. You created a variable $load_time that records the value from the sysdate() function before the load starts, and you passed that variable to the data flow that loads the PO_ITEM table in a parameter named $load_time. Then, your preload SQL command must delete any records in the table where the value in TIMESTMP is larger than the value in $load_time:

   delete from PO_ITEM where TIMESTMP > [$load_time]

   For information about creating preload SQL commands, see the Data Integrator Reference Guide.
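Putting these steps together, the script that runs just before the data flow only needs to capture the tracking value; the preload SQL command on the target then references it. The following is a minimal sketch: $load_time, PO_ITEM, and TIMESTMP come from the example above, while any other detail (where the script sits, how the parameter is passed) is an assumption.

    # Sketch: script placed immediately before the data flow that loads PO_ITEM.
    # Record the tracking value once; it is not reset during a recovery run.
    $load_time = sysdate();

    # Preload SQL command entered on the PO_ITEM target table (not a script line):
    # delete from PO_ITEM where TIMESTMP > [$load_time]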

Manually recovering jobs using status tables
You can design your jobs and work flows so that you can manually recover from an unsuccessful run. A job designed for manual recovery must have certain characteristics:
• You can run the job repeatedly.
• The job implements special steps to recover data when a step did not complete successfully during a previous run.

You can use an execution status table to produce jobs that can be run multiple times without duplicating target rows. The table records a job's execution status. A "failure" value signals Data Integrator to take a recovery execution path.

To implement a work flow with a recovery execution path:
• Define a flag that indicates when the work flow is running in recovery mode. Store the flag value in a status table.
• Check the flag value in the status table before executing a work flow to determine which path to execute in the work flow.
• Update the flag value when the work flow executes successfully.

For example, you could design a work flow that uses the auto correct load option when a previous run does not complete successfully. This work flow would have five steps, as illustrated:

(Figure: a work flow with five numbered steps. The first step (1) is a script that reads the status table and sets the recovery flag:

$StopStamp = sql('target_ds', 'SELECT stop_timestamp FROM status_table WHERE start_timestamp = (SELECT MAX(start_timestamp) FROM status_table)');
IF (($StopStamp = NULL) OR ($StopStamp = '')) $recovery_needed = 1;
ELSE $recovery_needed = 0;

The last step (5) is a script that records successful completion:

$stop_date = sql('target_ds', 'UPDATE status_table SET stop_timestamp = SYSDATE WHERE start_timestamp = (SELECT MAX(start_timestamp) FROM status_table)');)
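The queries above assume that a row recording the start of the current run was inserted earlier in the job. A minimal sketch of such a step follows; the table and column names are taken from the queries above, while everything else about its placement is an assumption.

    # Sketch: record the start of the run. A NULL stop_timestamp left behind by a
    # failed run is what signals that recovery is needed.
    sql('target_ds', 'INSERT INTO status_table (start_timestamp, stop_timestamp) VALUES (SYSDATE, NULL)');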

The work flow:
1. Retrieve the flag value, which indicates the success or failure of the previous execution, from the status table. Store this value in a variable such as $recovery_needed.
2. In a conditional, evaluate the $recovery_needed variable.
3. If recovery is required, execute the recovery data flow recover_customer. This data flow loads the data using the auto correct load option.
4. If recovery is not required, execute the non-recovery data flow load_customer. This data flow loads the data without the auto correct load option.
5. Update the flag value in the status table to indicate successful execution.

Processing data with problems
Jobs might not produce the results you expect because of problems with data. In some cases, Data Integrator is unable to insert a row. In other cases, Data Integrator might insert rows with missing information. You can design your data flows to anticipate and process these types of problems. For example, you might have a data flow write rows with missing information to a special file that you can inspect later.

This section describes mechanisms you can use to anticipate and process data problems. In particular, this section discusses three techniques:
• Using overflow files
• Filtering missing or bad values
• Handling facts with missing dimensions

Using overflow files
A row that cannot be inserted is a common data problem. Use the overflow file to process this type of data problem. When you specify an overflow file and Data Integrator cannot load a row into a table, Data Integrator writes the row to the overflow file instead. The trace log indicates the data flow in which the load failed and the location of the file.

For any table used as a target, you can set the option to use an overflow file in the Options tab. When you specify an overflow file, give a full path name to ensure that Data Integrator creates a unique file when more than one file is created in the same job. By default, the name of the overflow file is the target table name.

When you select the overflow file option, you choose what Data Integrator writes to the file about the rows that failed to load: either the data from the row or the SQL commands required to load the row. If you select data, you can use Data Integrator to read the data from the overflow file, cleanse it, and load it into the target table. If you select SQL commands, you can use the commands to load the target manually when the target is accessible.

There are many reasons for loading to fail, for example:
• Out of memory for the target
• Overflow column settings
• Duplicate key values

You can use the overflow information to identify invalid data in your source or problems introduced in the data movement. Every new run will overwrite the existing overflow file.

Note: You cannot use overflow files when loading to a BW Transfer Structure.

Filtering missing or bad values
A missing or invalid value in the source data is another common data problem. Using queries in data flows, you can identify missing or invalid values in source data. You can also choose to include this data in the target or to disregard it.

For example, suppose you are extracting data from a source and you know that some phone numbers and customer names are missing. You can use a data flow to extract data from the source, load the data into a target, and filter the NULL values into a file for your inspection.




This data flow has five steps, as illustrated.

(Figure: a data flow with five numbered steps. Step 2 applies new keys with the mapping key_generation('target_ds.owner.Customer', 'Customer_Gen_Key', 1). Step 4 selects the rows with missing values:

SELECT Query.CustomerID, Query.NAME, Query.PHONE FROM Query WHERE (NAME = NULL) OR (PHONE = NULL);

Step 5 writes the customer IDs for rows with missing data to a file, for example:
10002, ,(415)366-1864
20030,Tanaka,
21101,Navarro,
17001, ,(213)433-2219
16401, ,(609)771-5123)

The data flow:
1. Extracts data from the source
2. Selects the data set to load into the target and applies new keys. (It does this by using the Key_Generation function.)
3. Loads the data set into the target, using the bulk load option for best performance
4. Uses the same data set for which new keys were generated in step 2, and selects rows with missing customer names and phone numbers
5. Writes the customer IDs for the rows with missing data to a file

Now, suppose you do not want to load rows with missing customer names into your target. You can insert another query into the data flow to ensure that Data Integrator does not insert incomplete rows into the target. The new query filters the rows with missing customer names before loading any rows into the target. The missing data query still collects those rows along with the rows containing missing phone numbers. In this version of the example, the Key_Generation transform adds keys for new rows before inserting the filtered data set into the target.





The data flow now has six steps, as shown.

(Figure: the revised data flow with six numbered steps. Step 2 filters out rows with no customer name:

SELECT * FROM source WHERE (NAME <> NULL);

Step 5 still selects the rows with missing values for later inspection:

SELECT Query.CustomerID, Query.NAME, Query.PHONE FROM Query WHERE (NAME = NULL) OR (PHONE = NULL);)

1. Extracts data from the source
2. Selects the data set to load into the target by filtering out rows with no customer name values
3. Generates keys for rows with customer names
4. Loads the valid data set (rows with customer names) into the target using the bulk load option for best performance
5. Uses a separate query transform to select rows from the source that have no names or phones
   Note that Data Integrator does not load rows with missing customer names into the target; however, Data Integrator does load rows with missing phone numbers.
6. Writes the customer IDs for the rows with missing data to a file.

You could add more queries into the data flow to select additional missing or invalid values for later inspection.
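You could, for instance, add a query that flags suspiciously short phone numbers. The following WHERE clause is a sketch only: the column names follow the example above, and the length test itself is purely illustrative rather than a rule from this guide.

    SELECT Query.CustomerID, Query.NAME, Query.PHONE
    FROM Query
    WHERE (Query.PHONE <> NULL) AND (length(Query.PHONE) < 7);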






Handling facts with missing dimensions
Another data problem occurs when Data Integrator searches a dimension table and cannot find the values required to complete a fact table. You can approach this problem in several ways:

• Leave the problem row out of the fact table. Typically, this is not a good idea because analysis done on the facts will be missing the contribution from this row.

• Note the row that generated the error, but load the row into the target table anyway. You can mark the row as having an error, or pass the row information to an error file as in the examples from “Filtering missing or bad values” on page 467.

• Fix the problem programmatically. Depending on the data missing, you can insert a new row in the dimension table (see the sketch below), add information from a secondary source, or use some other method of providing data outside of the normal, high-performance path.
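As an illustration of inserting a new dimension row, a script step before the fact load could create a placeholder for the missing key. This is a minimal sketch in which every table, column, and variable name is hypothetical; only the general technique comes from this guide.

    # Sketch: all table, column, and variable names here are hypothetical.
    # Insert a placeholder dimension row so the incoming fact row has something to join to.
    sql('target_ds', 'INSERT INTO CUST_DIM (CUST_ID) VALUES ([$missing_cust_id])');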





Techniques for Capturing Changed Data





About this chapter
This chapter contains the following topics:

• Understanding changed-data capture
• Using CDC with Oracle sources
• Using CDC with DB2 sources
• Using CDC with Attunity mainframe sources
• Using CDC with Microsoft SQL Server databases
• Using CDC with timestamp-based sources
• Using CDC for targets

Understanding changed-data capture
When you have a large amount of data to update regularly and only a small amount of system down time for scheduled maintenance on a data warehouse, update the data incrementally over time; this is known as a delta load. Two commonly used delta load methods are full refresh and changed-data capture (CDC).

Full refresh
Full refresh is easy to implement and easy to manage. This method ensures that no data will be overlooked or left out due to technical or programming errors. For an environment with a manageable amount of source data, full refresh is an easy method you can use to perform a delta load to a target system.

Capturing only changes
After an initial load is complete, you can choose to extract only new or modified data and update the target system. Identifying and loading only changed data is called changed-data capture (CDC). This includes only incremental data that has changed since the last refresh cycle. Data Integrator acts as a mechanism to locate and extract only the incremental data that changed since the last refresh. Improving performance and preserving history are the most important reasons for using changed-data capture.

Performance improves because with less data to extract, transform, and load, the job typically takes less time.




18

If the target system has to track the history of changes so that data can be correctly analyzed over time, the changed-data capture method can provide a record of these changes. For example, if a customer moves from one sales region to another, simply updating the customer record to reflect the new region negatively affects any analysis by region over time because the purchases made by that customer before the move are attributed to the new region.

This chapter discusses both general concepts and specific procedures for performing changed-data capture in Data Integrator.

Source-based and target-based CDC
Changed-data capture can be either source-based or target-based.

Source-based CDC
Source-based changed-data capture extracts only the changed rows from the source. It is sometimes called incremental extraction. This method is preferred because it improves performance by extracting the least number of rows. Data Integrator offers access to source-based changed data that various software vendors provide. The following table shows the data sources that Data Integrator supports.
Table 18-3: Data Sources and Changed Data Capture Products and Techniques

• Oracle 9i and higher: Use Oracle's CDC packages to create and manage CDC tables. These packages use a publish and subscribe model. You can create a CDC datastore for Oracle sources using the Data Integrator Designer. You can also use the Designer to create CDC tables in Oracle, then import them for use in Data Integrator jobs. For more information, refer to “Using CDC with Oracle sources” on page 475.

• DB2 UDB for Windows, UNIX, and Linux: Use the following products to capture changed data from DB2 sources: DB2 Information Integrator for Replication Edition 8.2 (DB2 II Replication Edition), IBM WebSphere Message Queue 5.3.1 (MQ), and Data Integrator's real-time IBM Event Publisher adapter. DB2 II Replication Edition publishes changes from DB2 onto WebSphere Message Queues. Use the Data Integrator Designer to create a CDC datastore for DB2 sources, and use the Data Integrator Administrator to configure an IBM Event Publisher adapter and create Data Integrator real-time jobs that capture the changed data from the MQ queues. For more information, refer to “Using CDC with DB2 sources” on page 495.

• Mainframe data sources (Adabas, DB2 UDB for z/OS, IMS, SQL/MP, VSAM, flat files) accessed with Attunity Connect: Use Attunity Streams 4.6. For more information, refer to “Using CDC with Attunity mainframe sources” on page 505.

• Microsoft SQL Server databases: Use Microsoft SQL Replication Server to capture changed data from SQL Server databases. For more information, refer to “Using CDC with Microsoft SQL Server databases” on page 513.

• Other sources: Use date and time fields to compare source-based changed-data capture job runs. This technique relies on a creation and/or modification timestamp on every row; you compare rows using the time of the last update as a reference. This method is called timestamp-based CDC (see the sketch after this table). For more information, refer to “Using CDC with timestamp-based sources” on page 522.
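For the timestamp-based technique in the last row of the table, the extraction reduces to a filter on a modification timestamp. The following sketch assumes a LAST_MODIFIED column and a bind variable holding the previous refresh time; both names are hypothetical.

-- Read only the rows created or modified since the previous job run.
-- :last_refresh is the timestamp saved at the end of the prior run.
SELECT *
FROM   ORDERS_SRC
WHERE  LAST_MODIFIED > :last_refresh;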

Target-based CDC
Target-based changed-data capture extracts all the data from the source, but loads only the changed rows into the target.


Target-based changed-data capture is useful when you want to capture history but do not have the option to use source-based changed-data capture. Data Integrator offers table comparison to support this method.
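Conceptually, target-based CDC behaves like a merge of the full source extract against the target: every source row is read, but only new or changed rows modify the target. A hedged SQL sketch with hypothetical table and column names:

-- Compare the full extract with the target and apply only the differences.
MERGE INTO CUSTOMER_DIM tgt
USING CUSTOMER_SRC src
ON (tgt.CUST_ID = src.CUST_ID)
WHEN MATCHED THEN
  UPDATE SET tgt.CUST_NAME = src.CUST_NAME,
             tgt.REGION    = src.REGION
WHEN NOT MATCHED THEN
  INSERT (CUST_ID, CUST_NAME, REGION)
  VALUES (src.CUST_ID, src.CUST_NAME, src.REGION);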

Using CDC with Oracle sources
If your environment must keep large amounts of data current, the Oracle Change Data Capture (CDC) feature is a simple solution to limiting the number of rows that Data Integrator reads on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads. This section includes the following topics:

• Overview of CDC for Oracle databases
• Setting up Oracle CDC
• Importing CDC data from Oracle
• Configuring an Oracle CDC source
• Creating a data flow with an Oracle CDC source
• Maintaining CDC tables and subscriptions

Overview of CDC for Oracle databases
With Oracle 9i or higher, Data Integrator manages the CDC environment by accessing Oracle's CDC packages. Oracle's packages use the publish and subscribe model with CDC tables: Oracle publishes changed data from the original table to its CDC table. Data Integrator Designer allows you to create or import CDC tables and create subscriptions to access the data in the CDC table. Separate subscriptions allow each user to keep track of the last changed row that he or she accessed. You can also enable check-points for subscriptions so that Data Integrator reads only the latest changes in the CDC table. Oracle uses the following terms for Change Data Capture:

• Change (CDC) table—A relational table that contains changed data that results from DML operations performed on a source table.
• Change set—A group of CDC tables that are transactionally consistent. For example, SalesOrder and SalesItem tables should be in a change set to ensure that changes to an order and its line items are captured together.


• Change source—The database that contains one or more change sets.
• Publisher—The person who captures and publishes the changed data. The publisher is usually a database administrator (DBA) who creates and maintains the schema objects that make up the source database and staging database.
• Publishing mode—Specifies when and how to capture the changed data. For details, see the following table of publishing modes.
• Source database—The production database that contains the data that you extracted for your initial load. The source database contains the source tables.
• Staging database—The database where the changed data is published. Depending on the publishing mode, the staging database can be the same as, or different from, the source database.
• Subscriber—A user that can access the published data in the CDC tables.
• Subscription—Controls access to the change data from one or more source tables within a single change set. A subscription contains one or more subscriber views.
• Subscriber view—The changed data that the publisher has granted the subscriber access to use.

Oracle 10G supports the following publishing modes:

Synchronous
• How changes are captured: Uses internal triggers on the source tables to store the changes in CDC tables.
• When captured data is available: Real time.
• Location of CDC tables: CDC tables must reside in the source database.
• Considerations: Adds overhead to the source database at capture time. Available in Oracle 9i and Oracle 10G.


Asynchronous HotLog
• How changes are captured: Uses redo or archive logs for the source database.
• When captured data is available: Near real time.
• Location of CDC tables: A change set contains multiple CDC tables and must reside locally in the source database.
• Considerations: Improves performance because data is captured offline. Available in Oracle 10G only.

Asynchronous AutoLog
• How changes are captured: Uses redo logs managed by log transport services that automate transfer from the source database to the staging database.
• When captured data is available: Depends on the frequency of redo log switches on the source database.
• Location of CDC tables: A change set contains multiple CDC tables and can be remote or local to the source database.
• Considerations: Improves performance because data is captured offline. Available in Oracle 10G only.

Oracle CDC in synchronous mode
The following diagram shows how the changed data flows from Oracle CDC tables to Data Integrator in synchronous mode.

When a transaction changes a source table, internal triggers capture the changed data and store it in the corresponding CDC table.


Oracle CDC in asynchronous HotLog mode
The following diagram shows how the changed data flows from Oracle CDC tables to Data Integrator in asynchronous HotLog mode.

When a transaction changes a source table, the Logwriter records the changes in the Online Log Redo files. Oracle Streams processes automatically populate the CDC tables when transactions are committed.

Oracle CDC in asynchronous AutoLog mode
The following diagram shows how the changed data flows from Oracle CDC tables to Data Integrator in asynchronous AutoLog mode.


When the log switches on the source database, Oracle archives the redo log file and copies the Online Log Redo files to the staging database. Oracle Streams processes populate the CDC tables from the copied log files.

Note: The Oracle archive process requires uninterrupted connectivity through Oracle Net to send the redo log files to the remote file server (RFS).

Setting up Oracle CDC
Use the following system requirements on your Oracle source database server to track changes:

• Install Oracle's CDC packages. These packages are installed by default. However, if a CDC package needs to be re-installed, open Oracle's Admin directory, then find and run Oracle's SQL script initcdc.sql.
• Synchronous CDC is available with Oracle Standard Edition and Enterprise Edition.
• Asynchronous CDC is available with Oracle Enterprise Edition only.
• Enable Java.
• Set source table owner privileges so CDC tables can be created, purged, and dropped as needed.
• Give datastore owners the SELECT privilege for CDC tables and the SELECT_CATALOG_ROLE and EXECUTE_CATALOG_ROLE privileges (see the example after this list).
• For synchronous CDC, enable Oracle's system triggers.
• For asynchronous AutoLog CDC:
  • The source database DBA must build a LogMiner data dictionary to enable the log transport services to send this data dictionary to the staging database. Oracle automatically updates the data dictionary with any source table DDL operations that occur during CDC to keep the staging tables consistent with the source tables.
  • The source database DBA must also obtain the SCN value of the data dictionary build. If you will use the Data Integrator Designer to create CDC tables, you need to specify the SCN in the wizard (see step 12 of procedure “To invoke the New CDC table wizard in the Designer” on page 481).
  • The publisher (usually the source database DBA) must configure log transport services to copy the redo log files from the source database system to the staging database system and to automatically register the redo log files.
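For example, the privilege grants for a hypothetical datastore user DI_USER and a hypothetical CDC table CDCADMIN.CDC__CUSTOMER could look like the following; adjust the names to your environment.

-- Let the datastore user read the CDC table and give it the catalog roles it needs.
GRANT SELECT ON CDCADMIN.CDC__CUSTOMER TO DI_USER;
GRANT SELECT_CATALOG_ROLE TO DI_USER;
GRANT EXECUTE_CATALOG_ROLE TO DI_USER;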

CDC datastores
To gain access to CDC tables, create a CDC datastore using the Designer. A CDC datastore is a read-only datastore that can only access tables. Like other datastores, you can create, edit, and access a CDC datastore from the Datastores tab of the object library.

To create a CDC datastore for Oracle
1. Create a database datastore with the Database Type option set to Oracle.
2. Select the CDC check box.
3. Select an Oracle version. The Designer only allows you to select the Oracle versions that support CDC packages.
4. Specify the name of your staging database (the change source database where the changed data is published) in Connection name.
5. Enter the User and Password for your staging database and click OK.
You can use this datastore to browse and import CDC tables.

Importing CDC data from Oracle
You must create a CDC table in Oracle for every source table you want to read from before you can import that CDC table using Data Integrator. Use one of the following ways:
• Use an Oracle utility to create CDC tables (a sketch follows below)
• Use the Data Integrator Designer to create CDC tables
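If you use an Oracle utility rather than the Designer wizard, the change table is typically created with Oracle's DBMS_CDC_PUBLISH package (Oracle 10G). The call below is only a sketch with hypothetical schema and table names; parameter names and defaults can vary by Oracle release, so verify the signature against your Oracle documentation.

BEGIN
  DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE(
    owner             => 'CDCADMIN',                          -- publisher schema (hypothetical)
    change_table_name => 'CDC__CUSTOMER',
    change_set_name   => 'SYNC_SET',                          -- built-in synchronous change set
    source_schema     => 'SALES',                             -- hypothetical source schema
    source_table      => 'CUSTOMER',
    column_type_list  => 'CUST_ID NUMBER, CUST_NAME VARCHAR2(50)',
    capture_values    => 'both',                              -- keep before- and after-images
    rs_id             => 'y',
    row_id            => 'n',
    user_id           => 'n',
    timestamp         => 'y',
    object_id         => 'n',
    source_colmap     => 'y',
    target_colmap     => 'y',
    options_string    => NULL);
END;
/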

Using existing Oracle CDC tables
When CDC tables exist in Oracle, import an Oracle CDC table by right-clicking the CDC datastore name in the object library and selecting Open, Import by Name, or Search. If you select Open, you can browse the datastore for existing CDC tables using the Datastore Explorer. When you find the table that you want to import, right-click it and select Import.

Creating CDC tables in Data Integrator
The Data Integrator Designer provides you the ability to create Oracle CDC tables for all publishing modes:
• Synchronous CDC
• HotLog Asynchronous CDC
• AutoLog Asynchronous CDC

To invoke the New CDC table wizard in the Designer
1. In the object library, right-click a CDC datastore and select Open.
2. In the Datastore Explorer, right-click the white space in the External Metadata section, and select New.
The New CDC table wizard opens. This wizard allows you to add a CDC table.

Note: If the Datastore Explorer opens and no CDC tables exist in your datastore, this wizard opens automatically.

3. Select the publishing mode on the first page of the wizard.
   If your source database is Oracle 9i, you can only select the Synchronous mode. The Asynchronous modes are disabled.
   If your source database is Oracle 10G, the wizard selects the Asynchronous HotLog mode by default.
   If your source database uses Asynchronous AutoLog publishing mode, select Asynchronous AutoLog and provide the following source database connection information:
   • Connection name—The name of the database where the Change Source resides. Use the service name of the Oracle Net service configuration.
   • User Name—The user name for the source database DBA.
   • Password—The password for the Change Source user.
   Click Next. The second page of the wizard appears.
4. Specify the source table information in the second page of the wizard.
   a. To filter a search, enter values for a table Owner and/or Name. You can use a wild-card character (%) to perform pattern matching for Name or Owner values.
   b. Click the Search button to see a list of non-CDC external tables available in this datastore.
   c. Click a name in the list of returned tables and click Next to create a CDC table using the selected table as a source table.
5. Specify the CDC table owner for the new CDC table. By default, the owner name of the new CDC table is the owner name of the datastore. The source table owner name is also displayed in the CDC table owner list box. If the owner name you want to use is not in the list, enter a different owner name.
6. Specify the CDC table name for the new CDC table. By default, Data Integrator generates a table name using the following convention: CDC__SourceTableName.
7. (Optional) Select Generate before-images if you want to track before- and after-images in the new CDC table. For more information about this option, see “Using before-images” on page 490.
8. Specify which columns to include or exclude from the CDC table in one of the following ways. By default, all columns are selected.
   • Remove the check mark from the box next to the name of each column that you want to exclude.
   • Click Unselect All and place a check mark next to the name of each column that you want to include.
9. Click Next.
10. For synchronous publishing mode:
   a. Click Finish. The Designer connects to the Oracle instance, creates the CDC table on the Oracle server, and imports the table's metadata into Data Integrator's repository.
   b. Click OK on the information dialog. This dialog confirms that Oracle created a new CDC table, then imported it successfully into Data Integrator.
   Note: All tables that Data Integrator imports through a CDC datastore contain a column that indicates which operation to perform for each row. For an Oracle CDC table, this column is called Operation$. In addition to this column, Oracle adds other columns when it creates a CDC table. These columns all use a dollar sign as a suffix.
   For asynchronous (HotLog or AutoLog) publishing mode, click Next.
11. For asynchronous HotLog publishing mode, specify the change set information in the fourth page of the wizard.
   a. If you would like to add this change table to an existing change set to keep the changes transactionally consistent with the tables in the change set, select a name from the drop-down list for Change set name. Alternatively, you can create a new change set by typing in the name.
   b. Select Stop capture on DDL if a DDL error occurs and you do not want to capture data.
   c. Select Define retention period to enable the Begin Date and End Date text boxes.
   d. Click Finish. The Designer connects to the Oracle instance, creates the CDC table on the Oracle server, and imports the table's metadata into Data Integrator's repository.
   Note: All tables that Data Integrator imports through a CDC datastore contain a column that indicates which operation to perform for each row. For an Oracle CDC table, this column is called Operation$. In addition to this column, Oracle adds other columns when it creates a CDC table. These columns all use a dollar sign as a suffix.
12. For asynchronous AutoLog publishing mode, specify the change set and change source information in the fourth page of the wizard.
   a. If you would like to add this change table to an existing change source, select a name from the drop-down list for Change source name.
   b. If you want to create a new change source, type the following information:
      • Change source name—Name of the CDC change source.
      • Source database—Name of the source database. You can obtain this name from the source database Global_Name table.
      • SCN of data dictionary build—SCN value of the data dictionary build.
      For more information about these parameters, refer to your Oracle documentation.
   c. If you would like to add this change table to an existing change set to keep the changes transactionally consistent with the tables in the change set, select a name from the drop-down list for Change set name. Alternatively, you can create a new change set by typing in the name.
   d. Select Stop capture on DDL if a DDL error occurs during data capture and you do not want to capture data.
   e. Select Define retention period to enable the Begin Date and End Date text boxes.

   f. Click Finish. The Designer connects to the Oracle staging database, creates the CDC table on the change source, and imports the table's metadata into Data Integrator's repository.
   Note: All tables that Data Integrator imports through a CDC datastore contain a column that indicates which operation to perform for each row. For an Oracle CDC table, this column is called Operation$. In addition to this column, Oracle adds other columns when it creates a CDC table. These columns all use a dollar sign as a suffix.

Viewing an imported CDC table
When Data Integrator imports a CDC table, it also adds two columns to the table's schema: DI_SEQUENCE_NUMBER with the data type integer and DI_OPERATION_TYPE with the data type varchar(1).

To view an imported CDC table
1. Find your CDC datastore in the object library.
2. Expand the Tables folder.
3. Double-click a table name or right-click and select Open.
An imported Oracle CDC table schema shows the Oracle CDC table columns (the Data Integrator and Oracle control columns) followed by the Oracle source columns.

This example has eight control columns added to the original table:
• Two generated by Data Integrator
• Six Oracle control columns
Note: The Oracle control columns vary, depending on the options that were selected when the CDC table is created. All Oracle control columns end with a dollar sign ($).

The DI_SEQUENCE_NUMBER column
The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row except when it encounters a pair of before- and after-images for an UPDATE operation. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they are separated as a result of the data flow design. For information about when to consider using before-images, see “Using before-images” on page 490.

The DI_OPERATION_TYPE column
The possible values for the DI_OPERATION_TYPE column are:
• I for INSERT
• D for DELETE
• B for before-image of an UPDATE
• U for after-image of an UPDATE
When Data Integrator reads rows from Oracle, it checks the values in column Operation$ and translates them to Data Integrator values in the DI_OPERATION_TYPE column. The translation is as follows:
• Operation$ value I becomes DI_OPERATION_TYPE I
• Operation$ value D becomes DI_OPERATION_TYPE D
• Operation$ values UO or UU become DI_OPERATION_TYPE B
• Operation$ value UN becomes DI_OPERATION_TYPE U
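The same mapping expressed as a SQL CASE expression over a hypothetical CDC table, shown only to make the translation explicit (Data Integrator performs it internally):

SELECT CASE OPERATION$
         WHEN 'I'  THEN 'I'   -- INSERT
         WHEN 'D'  THEN 'D'   -- DELETE
         WHEN 'UO' THEN 'B'   -- before-image of an UPDATE
         WHEN 'UU' THEN 'B'   -- before-image of an UPDATE
         WHEN 'UN' THEN 'U'   -- after-image of an UPDATE
       END AS DI_OPERATION_TYPE
FROM   CDCADMIN.CDC__CUSTOMER;   -- hypothetical CDC table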

Configuring an Oracle CDC source
When you drag a CDC datastore table into a data flow, it automatically becomes a source object.

To configure a CDC table
1. Drag a CDC datastore table into a data flow.
2. Click the name of this source object to open its Source Table Editor.
3. Click the CDC Options tab.
4. Specify a value for the CDC subscription name. This value is required. Select from the list or create a new subscription.

There are three CDC table options in the Source Table Editor's CDC Options tab:
• CDC subscription name—The name that marks a set of changed data in a continuously growing Oracle CDC table. Subscriptions are created in Oracle and saved for each CDC table. A subscription name is unique to a datastore, owner, and table name. For example, you can use the same subscription name (without conflict) with different tables in the same datastore if they have different owner names.
• Enable check-point—Enables Data Integrator to restrict CDC subscription reads using check-points. After a job completes successfully, Data Integrator moves the check-point forward to mark the last row read. Once a check-point is enabled, the next time the CDC job runs, it reads only the rows inserted into the CDC table since the last check-point. Business Objects recommends that you enable check-pointing for a subscription name in a production environment. For more information, see “Using check-points” on page 490.
• Get before-image for each update row—Oracle allows a before-image and an after-image to be associated with an UPDATE row. By default, only after-images are retrieved. If you want to read before-images (for a CDC table set to capture them), enable this option. For more information, see “Using before-images” on page 490.

Using check-points
When a job in Data Integrator runs with check-pointing enabled, Data Integrator uses the source table's subscription name to read the most recent set of appended rows. If you do not enable check-pointing, then the job reads all the rows in the table, which increases processing time.
To use check-points, on the Source Table Editor, enter a name in the CDC Subscription name box and select the Enable check-point option.
Note: To avoid data corruption problems, do not reuse data flows that use CDC datastores, because each time a source table extracts data it uses the same subscription name. This means that identical jobs, depending upon when they run, can get different results and leave check-points in different locations in the table. When you migrate CDC jobs from test to production, a best practice is to change the subscription name for the production job so that the test job, if it ever runs again, will not affect the production job's results.

Using before-images
If you want to retrieve the before-images of UPDATE rows, prior to when the update operation is applied to the target, Data Integrator can expand the UPDATE row into two rows: one row for the before-image of the update, and one row for the after-image of the update. The before-image of an update row is the image of the row before the row is changed, and the after-image of an update row refers to the image of the row after the change is applied.
The default behavior is that a CDC reader retrieves after-images only. By not retrieving before-images, fewer rows pass through the engine, which allows the job to execute in less time.
You can use before-images to:
• Update primary keys. However, under most circumstances, when source tables are updated, their primary keys do not need to be updated.
• Calculate change logic between data in columns. For example, you can calculate the difference between an employee's new and old salary by looking at the difference between the values in salary fields (see the sketch below).
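Because the before-image (B) and after-image (U) of an update share a DI_SEQUENCE_NUMBER, the change logic can be sketched as a self-join; the table and column names (EMP_CDC, EMP_ID, SALARY) are hypothetical.

-- Pair each update's before- and after-image by sequence number and compute the delta.
SELECT after_img.EMP_ID,
       after_img.SALARY - before_img.SALARY AS SALARY_DELTA
FROM   EMP_CDC before_img
JOIN   EMP_CDC after_img
  ON   before_img.DI_SEQUENCE_NUMBER = after_img.DI_SEQUENCE_NUMBER
WHERE  before_img.DI_OPERATION_TYPE = 'B'
  AND  after_img.DI_OPERATION_TYPE  = 'U';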

When you want to capture before-images for update rows:
• At CDC table creation time, make sure the Oracle CDC table is also set up to retrieve full before-images. If you create an Oracle CDC table using Data Integrator Designer, you can select the Generate before-images check box to do this. For more information, see “To invoke the New CDC table wizard in the Designer” on page 481.
• Select the Get before-images for each update row option in the CDC table's source editor. If the underlying CDC table is not set up properly, enabling the Get before-images for each update row option has no effect.
Once you select the Get before-images for each update row option, Data Integrator processes two rows for every update. In addition to the performance impact of this data volume increase, the before- and after-image pairs may be separated or lost depending on the design of your data flow. This would cause data integrity issues.
When using functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow (for example, due to the use of the group by or order by clauses in a query), be aware of the possible impact to targets. The Map_CDC_Operation transform can resolve problems, but undesirable results may still occur due to programming errors.

Creating a data flow with an Oracle CDC source
To use an Oracle CDC source, you use a Query transform to remove the Oracle control columns and the Map_CDC_Operation transform to interpret the Data Integrator control columns and take appropriate actions.
Note: A data flow can only contain one CDC source.

To define a data flow with an Oracle CDC table source
1. From the Designer object library, drag the Oracle CDC table, Query and Map_CDC_Operation transforms to the data flow workspace.
2. Use the procedure in “Configuring an Oracle CDC source” on page 488 to configure the CDC table.
3. Add the appropriate target table and connect the objects.

4. In the Query Editor, map only the Data Integrator control columns and the source table columns that you want in your target table.
5. For an Oracle CDC source table, the DI_OPERATION_TYPE column is automatically selected as the Row operation column.
The Map_CDC_Operation transform uses the values in the column in the Row Operation Column box to perform the appropriate operation on the source row for the target table. The operations can be INSERT, DELETE, or UPDATE. For example, if the operation is DELETE, the corresponding row is deleted from the target table.
For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.
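In plain SQL terms, the effect of the Map_CDC_Operation step on the target is roughly the following; this is only a sketch with hypothetical names (CDC_EXTRACT, CUSTOMER_TGT), and the transform itself applies the operations row by row.

-- Remove target rows flagged for deletion.
DELETE FROM CUSTOMER_TGT
WHERE  CUST_ID IN (SELECT CUST_ID FROM CDC_EXTRACT WHERE DI_OPERATION_TYPE = 'D');

-- Apply the after-image of each update.
UPDATE CUSTOMER_TGT t
SET    CUST_NAME = (SELECT c.CUST_NAME FROM CDC_EXTRACT c
                    WHERE  c.CUST_ID = t.CUST_ID AND c.DI_OPERATION_TYPE = 'U')
WHERE  EXISTS (SELECT 1 FROM CDC_EXTRACT c
               WHERE  c.CUST_ID = t.CUST_ID AND c.DI_OPERATION_TYPE = 'U');

-- Add newly inserted rows.
INSERT INTO CUSTOMER_TGT (CUST_ID, CUST_NAME)
SELECT CUST_ID, CUST_NAME FROM CDC_EXTRACT WHERE DI_OPERATION_TYPE = 'I';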

Maintaining CDC tables and subscriptions
This section discusses purging CDC tables and dropping CDC subscriptions and tables.

Purging CDC tables
Periodically purge CDC tables so they do not grow indefinitely. Refer to your Oracle documentation for how to purge data that is no longer being used by any subscribers (a sketch follows the procedure below).

Dropping CDC subscriptions and tables
You can drop Oracle CDC tables and their subscriptions from the Datastore Explorer window in Data Integrator Designer.

To drop Oracle CDC subscriptions or tables
1. From the object library, right-click a CDC datastore and select Open.
2. In the Datastore Explorer window, click Repository Metadata.
3. Right-click a table and select CDC maintenance.
4. Choose:
   • Drop Subscription—This option opens the list of subscriptions you created in Data Integrator for the selected table. Oracle subscriptions are associated with these subscription names. Select each subscription name to drop it from Oracle and delete it from the Data Integrator repository.
   • Drop table—This option drops the Oracle CDC table and also deletes it from the Data Integrator repository.
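Returning to purging: on the Oracle side this is usually done by the publisher with the DBMS_CDC_PUBLISH package. The call below is only a sketch with hypothetical names; procedure availability and arguments depend on your Oracle release, so check your Oracle documentation.

BEGIN
  -- Remove rows from one change table that no subscriber still needs.
  DBMS_CDC_PUBLISH.PURGE_CHANGE_TABLE(
    owner             => 'CDCADMIN',       -- hypothetical publisher schema
    change_table_name => 'CDC__CUSTOMER');
END;
/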

Limitations
The following limitations exist when using CDC with Oracle sources:
• You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
  • Table_Comparison, Key_Generation, and SQL transforms
  • All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
• You can only create one CDC source in a data flow.
• Oracle CDC captures DML statements, including INSERT, DELETE, and UPDATE. However, Oracle CDC does not support the following operations because they disable all database triggers:
  • Direct-path INSERT statements
  • The multi_table_insert statement in parallel DML mode
• If you are using check-pointing and running your job in recovery mode, the recovered job will begin to review the job at the start of the CDC table. Check-points are ignored.

Using CDC with DB2 sources
If your environment must keep large amounts of data current, the DB2 CDC feature is a simple solution to limiting the number of rows that Data Integrator reads on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads.
Data Integrator captures changed data on an IBM DB2 database server and applies it on-demand to a target system. The data uses the following path:
• The Q Capture program on the Replication Server sends data from the DB2 log to the WebSphere MQ Queue Manager.
• Data Integrator's IBM Event Publisher (EP) adapter listens for messages from IBM WebSphere MQ.
• The IBM EP adapter converts the IBM Event Publisher messages into Data Integrator internal messages and sends the internal messages to different real-time jobs.

The input message is forwarded to multiple real-time services if they process data from some or all the tables in the message.

Guaranteed delivery
To achieve guaranteed delivery, the IBM EP Adapter:
• Discards table data if there is no real-time service for it.
• Stops processing messages if it cannot deliver them to an enabled real-time service.
• Stops processing messages if one message fails.

Setting up DB2
Data Integrator supports reading changed data from DB2 Universal Database (UDB) on Windows and UNIX. Data Integrator connects to a stack of IBM products. Use DB2 Information Integrator (II) Replication to create a pathway between a DB2 server log and IBM WebSphere MQ, which it provides to publish changed data from a variety of IBM sources including DB2 UDB for Windows and UNIX.
MQ is a peer-to-peer product, which means that the data must transfer from one MQ system to another MQ system. You can install it on one computer if DB2 UDB, DB2 II Replication, the Data Integrator Job Server, and its IBM EP Adapter are also on the same computer.

If your configuration is spread over more than one computer, for example if DB2 and DB2 II Replication are on a different computer than Data Integrator, then install MQ on both computers.

The following steps summarize the procedure for setting up the source side and make recommendations for settings. It assumes that DB2, DB2 II Replication, and MQ are on one computer while a second computer has Data Integrator and MQ installed.

• Configure MQ by creating two local queues (admin and restart) on the DB2 II Replication computer and one local queue (data) on the Data Integrator computer. The Administrator queue receives control and status messages from the Q Capture program. The Restart queue keeps track of where to start reading the DB2 recovery log after the Q Capture program restarts. For each local queue, select persistent as the value for Default Persistence and set the value for Max Message Length to match the size of a table row. A message must be able to hold at least one row. Also define a remote data queue on the DB2 II Replication computer.
• Select the Event publishing view from the DB2 II Replication Center Launchpad and use the wizard to create Q capture control tables.
• Specify a server, which will publish table data out to MQ. The Q Capture Server is the DB2 database that contains source data. The Replication Center creates control tables on this server.
• Specify a schema name to identify the queue capture program and its unique set of control tables. Name a new schema for these tables.
• Complete the Q Capture program configuration by specifying XML publishing options: specify the DB2 server name, the names of the MQ queues, the names of the tables that you want to publish, and the properties of the XML messages (for example, specify 4000 for Max Message Length. The value that you specify here should be less than or equal to the Max Message Length you set earlier for MQ). Enter the names for the queues (that you created in MQ) that will function as the administration and restart queues in DB2 II Replication. Choose to send both changed and unchanged columns to Data Integrator. This ensures that when there is a change in any column Data Integrator receives the whole row in the request message.
  Note: Do not select any option for XML publication that restricts the message content to changed columns only.
• Start the Q Capture program.

To use Data Integrator to read and load the DB2 changed data, you then perform the following tasks.

Using the Designer:
• Create a CDC datastore for DB2
• Import metadata for DB2 tables
• Build real-time jobs using metadata

Using the CDC Services node in the Administrator:
• Enable a real-time service
• Start the IBM EP Adapter (starts real-time services too)
• Monitor the real-time services and adapter

CDC Services
This section describes procedures that are unique to the DB2 CDC feature.

Uses CDC Services to create an IBM Event Publisher adapter instance
Unlike other Data Integrator supported adapters (for which you would create an adapter instance using the Administrator's Adapter Instances node and create an adapter datastore connection using the Designer), create an adapter instance using the Administrator's CDC Services node and configure a database DB2 CDC datastore in the Designer.

Uses different connections for Import/Execute commands
Unlike other adapters, DB2 CDC metadata is imported directly from the DB2 database. Data Integrator imports regular table metadata with a DB2 CDC datastore connection, which Data Integrator uses to both import metadata for jobs and process those jobs.

Does not use messages in real-time jobs
Unlike other Data Integrator real-time jobs, which require one message source and target, if one or more DB2 CDC tables exist in a real-time job, Data Integrator does not support message sources or targets in that job. In fact, Data Integrator imports and processes a source in a DB2 CDC job as a regular table. Data Integrator uses the IBM EP Adapter only to process the data.

Uses CDC Services to configure adapters and real-time jobs
Unlike other Data Integrator adapters and real-time services, which are configured separately under the Adapter Instance and Real-time > Access Server > Real-time Services nodes in the Administrator, you must configure and monitor an IBM EP Adapter and its services as a CDC Service. To configure a CDC service, use the Real-time > Access Server > CDC Services node. Data Integrator automatically sets adapter parameters and provides short cuts for configuring and monitoring associated real-time services. For more information, see the Data Integrator Management Console: Administrator Guide.

CDC datastores
DB2 II control tables use the publish/subscribe model. DB2 II reads data from the DB2 log and publishes it to MQ, which pushes it out to applications like Data Integrator's IBM EP Adapter using messages. Data Integrator allows you to import DB2 tables and create real-time jobs for maintaining changed data.
To gain access to DB2 CDC tables, create a DB2 CDC datastore using the Designer. A CDC datastore is a read-only datastore that can only access tables. Like other datastores, you can create, edit, and access a CDC datastore from the Datastores tab of the object library. Change-data tables are only available from DB2 UDB 8.x or higher.
Before you can create a DB2 CDC datastore, you must create an IBM EP Adapter instance using the Administrator.

To create a CDC datastore for DB2
1. Enter a datastore name.
2. Select Database as the Datastore Type and DB2 as the Database Type.
3. Select a Database version.
4. Enter a Data source (use the name of the Replication server).
5. Enter a database User name and Password.

6. Select the Enable CDC check box. When you check this box, the Advanced options display.
7. Enter the name of the control table schema that you created for this datastore using DB2 II.
8. In the Event Publisher Configuration section of the Advanced options, enter the name of the Job Server (that manages the adapter instance) and the adapter instance name (that will access changed data). You must enter corresponding configuration information for the adapter. See the Data Integrator Management Console: Administrator Guide.
9. (Optional) Enter a name for a test file, which requires the full path of a test file name on the Job Server computer. When you run a real-time job from the Designer, Data Integrator runs it in test mode. If you configure a test file name here, Data Integrator accepts a wild card in the file name (*.txt). Every file matching the file name becomes an input message.

10. Click OK.
11. If you want to create more than one configuration for this datastore, click Apply, then click Edit and follow steps 6 through 8 again for any additional configurations.
You can use this datastore to import CDC tables.

Importing CDC data from DB2
To import CDC table metadata from DB2
1. Right-click the CDC datastore name in the object library and select Open, Import by Name, or Search. If you select Open, you can browse the datastore for existing CDC tables using the Datastore Explorer.
2. When you find the table that you want to import, right-click it and select Import.

Configuring a DB2 CDC source
When Data Integrator imports a CDC table, it adds four columns. Data Integrator preserves two columns from DB2 II:
• DI_Db2_TRANS_ISN (transaction sequence number)
• DI_DB2_TRANS_TS (time stamp)
Data Integrator generates the other two additional columns:
• DI_SEQUENCE_NUMBER
• DI_OPERATION_TYPE

The DI_SEQUENCE_NUMBER column
The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row except when it encounters a pair of before- and after-images. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they become separated as a result of the data flow design.
You can configure DB2 II to create before-images. If Data Integrator encounters before-images, it retrieves them before applying the after-image UPDATE operation to the target. For information about when to consider using before-images, see “Using before-images” on page 490. In addition to the performance impact of this data volume increase, the before- and after-image pairs could be separated or lost depending on the design of your data flow,

which would cause data integrity issues. When using functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow, you can lose row order. If during the course of a data flow the before- and after-images become separated or get multiplied into many rows (for example, using GROUP BY or ORDER BY clauses in a query), be aware of the possible impact to targets. The Map_CDC_Operation transform allows you to restore the original ordering of image pairs by using the DI_SEQUENCE_NUMBER column as its Sequencing column. The Map_CDC_Operation transform can resolve problems, but undesirable results can still occur due to programming errors. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

The DI_OPERATION_TYPE column
Data Integrator generates values in the DI_OPERATION_TYPE column. Valid values for this column are:
• I for INSERT
• D for DELETE
• B for before-image of an UPDATE
• U for after-image of an UPDATE
Data Integrator receives each row from DB2 II as a message. It checks and translates the tags in the message to Data Integrator values for the DI_OPERATION_TYPE column.

DB2 CDC tables
When you drag a CDC datastore table into a data flow, it automatically becomes a source object. There are no additional source table options for DB2 CDC tables.

To configure a DB2 CDC table
1. Drag a CDC datastore table into a data flow.
2. If you want to set a Join Rank for this table, click the name of this source object to open its Source Target Editor.

Limitations
The following limitations exist for this feature:
• You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
  • Table_Comparison, Key_Generation, and SQL transforms
  • All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
• You can only create one CDC source in a data flow.
• The View Data feature is not supported for DB2 CDC tables.
• The embedded data flow feature is not supported.
• Subscription, before-image, and checkpoint information is not configurable in Data Integrator as it is for Oracle CDC. Use the Replication Server's Event Publishing options in DB2 II to select tables and columns (subscribers) to be published. You can also use DB2 II to specify that you want to track before-images.
• Checkpoints are not supported. Checkpoints allow Data Integrator to check to see if a column value has changed since the last update. If the data has not changed, Data Integrator knows not to load that data. With DB2 CDC, Data Integrator loads it regardless.
• DB2 II publishes each row of a table as a message which contains data for all columns in the row. Data Integrator only requires that you publish all rows for columns that you actually publish. However, because you can use DB2 II to allow some columns to be published while disallowing others, you can limit the amount of data in a message. To decrease load time, limit the number of columns that you publish. This filtering step can limit update processing time. For example, if you have 10 columns in a table and 5 rows of data, you will have 5 messages with 10 values each, or 50 values to update. If you limit the number of columns to 4, then you will have 5 messages with 4 values each, or 20 values to update.

Using CDC with Attunity mainframe sources
If your environment must keep large amounts of data current, the mainframe CDC feature is a simple solution to limiting the number of rows that must be read on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads.
Data Integrator captures changed data on Attunity mainframe data sources and applies it to a target system. The following diagram shows the path that the data takes from Attunity CDC to Data Integrator.
• The Attunity CDC Agent monitors the database journal for changes to specific tables. After the first request to capture changes, the CDC agent stores a context that the agent uses as a marker to not recapture changes prior to it.
• The CDC Agent sends the changed data to an optional staging area. The advantages of a staging area are:
  • A single journal scan can extract changes to more than one table. Without a staging area, multiple journal scans, one for each changed table, are required to extract changes.
  • Extracts only committed changes, which is less processing than extracting every change.
  • Less processing also occurs during recovery of a failed job because the recovery process does not need to back out the uncommitted changes.

However, a staging area requires additional storage and processing overhead.
• Attunity Connect CDC sends the changes to the CDC data sources through which Data Integrator can access the changes using standard ODBC or JDBC.

Setting up Attunity CDC
If you currently use Attunity as the connection to Data Integrator to extract data from mainframe sources, create an Attunity CDC data source in Attunity Studio. Refer to the Attunity CDC documentation for details. The following steps summarize the procedure for using the Attunity Studio wizard to create a CDC data source.
• Specify your data source.
• Based on your data source, choose one of the following methods to capture changes and specify the location of the journal:
  • VSAM under CICS—By CICS Log stream
  • DB2 on OS/390 and z/OS platforms—By DB2 Journal
  • DB2 on OS/400—By DB400 Journal
  • DISAM on Windows—By Journal
  For a complete list of supported data sources, see the Attunity Connect CDC document.
• Select a name for your CDC agent.
• Specify if you want to capture before-images for update operations. If you do not specify this option in Attunity Studio, you will not capture before-images even if you specify the Data Integrator option Get before-image for each update row.
• Select the tables to monitor for changes.

The Attunity Studio wizard generates the following components that you need to specify on the Data Integrator Datastore Editor when you define an Attunity CDC datastore:
• A CDC data source name that you specify in the Data Integrator option Data source. Attunity generates the CDC data source on the same computer as the CDC agent by default. You have the option of placing the CDC data source on the client (same computer as Data Integrator).
• The host name of the computer where the CDC data source resides. Obtain the host name of this computer to specify in the Data Integrator option Host location.
• A workspace for the CDC agent to manage the change capture event queue. You specify the workspace name in the Data Integrator option Attunity workspace.

For more information, refer to the CDC setup chapter in the Attunity Connect: The Change Data Capture Solution.

Setting up Data Integrator
To use Data Integrator to read and load changed data from mainframe sources using Attunity, do the following procedures on the Data Integrator Designer:
• Create a CDC datastore for Attunity
• Import metadata for Attunity tables
• Configure a mainframe CDC source
• Build real-time jobs using metadata

Creating CDC datastores
The CDC datastore option is available for all mainframe interfaces to Data Integrator. Refer to the “Mainframe interface” section of Chapter 5: Datastores for a list of mainframe data sources and an introduction to creating database datastores.

To create a CDC datastore for Attunity
1. Open the Datastore Editor.
2. Enter a name for the datastore.
3. In the Datastore type box, select Database.
4. In the Database type box, select Attunity_Connector.

5. Check the Enable CDC box to enable the CDC feature. You can enable CDC for the following data sources. For the current list of data sources, refer to the Attunity web site.
   • VSAM under CICS
   • DB2 UDB for z/OS
   • DB2 UDB for OS/400
6. In the Data source box, specify the name of the Attunity CDC data source. You can specify more than one data source for one datastore, but you cannot join two CDC tables. You might want to specify multiple data sources in one Attunity datastore for easier management. If you can access all of the CDC tables through one Attunity data source, it is easier to create one datastore, enter the connection information once, and import the tables.
   If you list multiple data source names for one Attunity Connector datastore, ensure that you meet the following requirements:
   • Do not specify regular Attunity data sources with CDC data sources in the same Data Integrator datastore. Data Integrator imports data from regular Attunity data sources differently than from CDC data sources.
   • All Attunity data sources must be accessible by the same user name and password.
   • All Attunity data sources must use the same workspace. When you set up access to the data sources in Attunity Studio, use the same workspace name for each data source.
7. In the Host location box, specify the name of the host on which the Attunity data source daemon exists.
8. In the Port box, specify the Attunity daemon port number. The default value is 2551.
9. Specify the Attunity server workspace name that the CDC agent uses to manage the change capture event queue for the CDC data source.
10. Complete the rest of the dialog and click OK.
Once saved, this datastore becomes a CDC datastore. You can now use the new datastore connection to import metadata tables into the current Data Integrator repository.

Importing mainframe CDC data

After you create a CDC datastore, you can use it to import CDC table metadata. In the object library, right-click the datastore name and select Open, Import by Name, or Search. For mainframe CDC, only the CDC tables that you selected in the procedure “Setting up Attunity CDC” on page 506 are visible when you browse external metadata. Functions and templates are not available because the Attunity CDC datastore is read-only.

The Data Integrator import operation adds the following columns to the original table:

Column name           Data type      Source of column
DI_SEQUENCE_NUMBER    integer        Generated by Data Integrator
DI_OPERATION_TYPE     varchar(1)     Generated by Data Integrator
Context               varchar(26)    Supplied by Attunity Streams
Timestamp             varchar(4)     Supplied by Attunity Streams
TransactionID         varchar(12)    Supplied by Attunity Streams
Operation             varchar(64)    Supplied by Attunity Streams
tableName             varchar(128)   Supplied by Attunity Streams

The DI_SEQUENCE_NUMBER column

The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row, except when it encounters a pair of before- and after-images. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they become separated as a result of the data flow design.

You can configure Attunity Streams to retrieve before-images of UPDATE rows before Data Integrator applies the UPDATE operation to the target. Note that if you do not configure Attunity Streams to capture before-images in the database, Data Integrator will discard the rows. For information about when to consider using before-images, see “Using before-images” on page 490.

If during the course of a data flow the before- and after-images become separated or get multiplied into many rows (for example, using GROUP BY or ORDER BY clauses in a query), you can lose row order.

The Map_CDC_Operation transform allows you to restore the original ordering of image pairs by using the DI_SEQUENCE_NUMBER column as its Sequencing column. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

The DI_OPERATION_TYPE column

Data Integrator generates values in the DI_OPERATION_TYPE column. Valid values for this column are:

•	I for INSERT
•	D for DELETE
•	B for before-image of an UPDATE
•	U for after-image of an UPDATE

Configuring a mainframe CDC source

When you drag a CDC datastore table into a data flow, it automatically becomes a source object.

To configure a mainframe CDC table
1.	Drag a CDC datastore table into a data flow. The table automatically becomes a source object.
2.	Click the name of this source object to open its Source Table Editor.
3.	Click the CDC Options tab.
4.	Specify a value for the CDC subscription name.

The Source Table Editor’s CDC Options tab shows the following three CDC table options:

CDC subscription name
A name that Data Integrator uses to keep track of the position in the continuously growing Attunity CDC table. Attunity CDC uses the subscription name to mark the last row read so that the next Data Integrator job starts reading the CDC table from that position. You can use multiple subscription names to identify different users who read from the same imported Attunity CDC table; Attunity CDC uses the subscription name to save the position of each user. Select from the list or type a new name to create a new subscription. A subscription name must be unique within a datastore, owner, and table name. For example, you can use the same subscription name (without conflict) with different tables that have the same name in the same datastore if they have different owner names. This field is required.

Enable check-point
Enables Data Integrator to restrict CDC reads using check-points. Once a check-point is placed, the next time the CDC job runs, it reads only the rows inserted into the CDC table since the last check-point. For more information, see “Using mainframe check-points” on page 511. By default, check-points are not enabled.

Get before-image for each update row
Some databases allow two images to be associated with an UPDATE row: a before-image and an after-image. If your source can log before-images and you want to read them during change-data capture jobs, enable this option. By default, only after-images are retrieved. For more information, see “Using before-images” on page 490.

Using mainframe check-points

Attunity CDC agents read mainframe sources and load changed data either into a staging area or directly into the CDC data source. Rows of changed data append to the previous load in the CDC data source.

When you enable check-points, a CDC job in Data Integrator uses the subscription name to read the most recent set of appended rows and to mark the end of the read. If check-points are not enabled, the CDC job reads all the rows in the Attunity CDC data source and processing time increases.

To use check-points, on the Source Table Editor enter the CDC Subscription name and select the Enable check-point option.

If you enable check-points and you run your CDC job in recovery mode, the recovered job begins to review the CDC data source at the last check-point.

Note: To avoid data corruption problems, do not reuse data flows that use CDC datastores, because each time a source table extracts data it uses the same subscription name. This means that identical jobs, depending upon when they run, can get different results and leave check-points in different locations in the file. When you migrate CDC jobs from test to production, a best-practice scenario is to change the subscription name for the production job. Therefore, if the test job ever runs again, it does not affect the production job’s results.

Using before-images from mainframe sources

For an introduction to before- and after-images, see “Using before-images” on page 490. When you must capture before-image update rows:

•	Make sure Attunity Streams is set up to retrieve full before-images. The underlying, log-based CDC capture software must be set up properly; otherwise, enabling the Get before-images for each update row option in Data Integrator has no effect.
•	Select the Get before-images for each update row option in the CDC table’s source editor.

After you check the Get before-images for each update row option, Data Integrator processes two rows for every update. In addition to the performance impact of this data volume increase, the before- and after-image pairs could be separated or lost depending on the design of your data flow, which would cause data integrity issues.

The Map_CDC_Operation transform can resolve problems, but undesirable results can still occur due to programming errors. When you use functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow, be aware of the possible impact to targets. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

Limitations

The following limitations exist for this feature:

•	You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
	•	Table_Comparison, Key_Generation, and SQL transforms
	•	All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
•	You can only create one CDC source in a data flow.

Using CDC with Microsoft SQL Server databases

If your environment must keep large amounts of data current, the CDC feature is a simple solution to limit the number of rows that must be read on a regular basis. A source that reads only the most recent operations (INSERTs, UPDATEs, DELETEs) allows you to design smaller, faster delta loads.

Overview of CDC for SQL Server databases

Data Integrator captures changed data on SQL Server databases and applies it to a target system. To capture changed data, Data Integrator interacts with SQL Replication Server. Microsoft uses the following terms for the SQL Replication Server:

•	Article—An article is a table, a partition, or a database object that the DBA specifies for replication. An article can be any of the following:
	•	An entire table
	•	Certain columns (using a vertical filter)
	•	Certain rows (using a horizontal filter)
	•	A stored procedure or view definition
	•	The execution of a stored procedure
	•	A view
	•	An indexed view
	•	A user-defined function

•	Distributor—The Distributor is a server that stores metadata, history data, and transactions in the distribution database. Data Integrator reads the distribution database to obtain changed data.
•	Publisher—The Publisher is a server that makes data available for replication to other servers.
•	Publication—A publication is a collection of one or more articles from one database. A publication makes it easier to specify a logically related set of data and database objects that you want to replicate together.
•	Subscriber—A subscriber is a server that receives replicated data. Subscribers subscribe to publications, not to individual articles within a publication. They subscribe only to the publications that they need, not to all of the publications available on a Publisher.

Data Integrator obtains changed data from the Distribution database in the MS SQL Replication Server. The following diagram shows how the changed data flows from MS SQL Replication Server to Data Integrator.

•	An application makes changes to a database, and the Publisher within the MS SQL Replication Server captures these changes within a transaction log.
•	The Log Reader Agent in the Distributor reads the Publisher’s transaction log and saves the changed data in the Distribution database.
•	Data Integrator reads the data from the command table within the Distribution database, applies appropriate filters, and creates input rows for a target data warehouse table.

Data Integrator accesses the following tables within the Distribution database:

•	MSarticles—contains one row for each article that a Publisher replicates.
•	MSpublications—contains one row for each publication that a Publisher replicates.
•	MSpublisher_databases—contains one row for each Publisher and Publisher database pair that the local Distributor services.
•	MSrepl_commands—contains rows of replicated commands (changes to data).

When you enable a database for replication, Replication Server creates tables on the source database. One of these tables is Sysarticles, which contains a row for each article defined in this specific database. One of the columns in Sysarticles indicates which columns in a source table are being published.

Setting up SQL Replication Server for CDC

If your Data Integrator currently connects to SQL Server to extract data, configure the Distribution database in SQL Replication Server to capture changes on these tables. The following steps summarize the procedure to configure SQL Replication Server for your SQL Server database.

•	On the Replication node of the Microsoft SQL Enterprise Manager, select the Configure publishing, subscribers, and the Distribution option. Follow the wizard to create the Distributor and Distribution database. This MS SQL wizard generates the following components that you need to specify on the Data Integrator Datastore Editor when you define an SQL Server CDC datastore:
	•	MSSQL distribution server name
	•	MSSQL distribution database name
	•	MSSQL distribution user name
	•	MSSQL distribution password
•	Select the New Publications option on the Replication node of the Microsoft SQL Enterprise Manager to create new publications that specify the tables that you want to publish. Data Integrator requires the following settings in the Advanced Options:
	•	Select Transactional publication on the Select Publication Type window. This type updates data at the Publisher and sends changes incrementally to the Subscriber.

	•	In the Commands tab of the Table Article Properties window:
		•	If you want before-images for UPDATE and DELETE commands, select XCALL. Otherwise, select CALL.
		•	Clear the options Create the stored procedures during initial synchronization of subscriptions and Send parameters in binary format because Data Integrator does not use stored procedures and has its own internal format.
	•	In the Snapshot tab of the Table Article Properties window:
		•	Select Keep the existing table unchanged because Data Integrator treats the table as a log.
		•	Clear Clustered indexes because Data Integrator treats the table as a log and reads sequentially from it.
	•	Specify a publication name and description. You specify this publication name on the Data Integrator Datastore Editor when you define an MSSQL CDC datastore.
	•	Select option Yes, allow anonymous subscriptions to save all transactions in the Distribution database.

For more information, refer to the Microsoft SQL Enterprise Manager online help.

Setting up Data Integrator

To use Data Integrator to read and load changed data from SQL Server databases, do the following procedures in the Data Integrator Designer:

•	Create a CDC datastore for SQL Server
•	Import metadata for SQL Server tables
•	Configure a CDC source

Creating CDC datastores

The CDC datastore option is available for SQL Server connections to Data Integrator. Refer to “Defining a database datastore” on page 85 for an introduction to creating database datastores.

To create a CDC datastore for SQL Server
1.	Open the Datastore Editor.
2.	Enter a name for the datastore.
3.	In the Datastore type box, select Database.
4.	In the Database type box, select Microsoft SQL Server.
5.	Check the Enable CDC box to enable the CDC feature.


6.	Select a Database version. Change-data tables are only available from SQL Server 2000 Enterprise.
7.	Enter a Database name (use the name of the Replication server).
8.	Enter a database User name and Password.

9.	In the CDC section, enter the following names that you created for this datastore when you configured the Distributor and Publisher in the MS SQL Replication Server:
	•	MSSQL distribution server name
	•	MSSQL distribution database name
	•	MSSQL publication name
	•	MSSQL distribution user name
	•	MSSQL distribution password

10.	If you want to create more than one configuration for this datastore, click Apply, then click Edit and follow step 9 again for any additional configurations.
11.	Click OK.


You can now use the new datastore connection to import metadata tables into the current Data Integrator repository.

Importing SQL Server CDC data
After you create a CDC datastore, you can use it to import CDC table metadata. In the object library, right-click the datastore name and select Open, Import by Name, or Search. Only the CDC tables that you selected in the procedure “Setting up SQL Replication Server for CDC” on page 515 are visible when you browse external metadata. Data Integrator uses the MSpublications and MSarticles tables in the Distribution database of SQL Replication Server to create a list of published tables. When you import each CDC table, Data Integrator uses the Sysarticles table in the Publisher database of SQL Replication Server to display only published columns.

The Data Integrator import operation adds the following columns to the original table:

Column name             Data type      Source of column
DI_SEQUENCE_NUMBER      integer        Generated by Data Integrator
DI_OPERATION_TYPE       varchar(1)     Generated by Data Integrator
MSSQL_TRAN_SEQNO        varchar(256)   Supplied by SQL Replication Server
MSSQL_TRAN_TIMESTAMP    timestamp      Supplied by SQL Replication Server

The DI_SEQUENCE_NUMBER column
The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row, except when it encounters a pair of before- and after-images. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they become separated as a result of the data flow design.

You can configure SQL Replication Server to retrieve before-images of UPDATE rows before Data Integrator applies the UPDATE operation to the target. Note that if you do not configure SQL Replication Server to capture before-images in the database, only after-images are captured by default. For information about when to consider using before-images, see “Using before-images” on page 490.

If during the course of a data flow the before- and after-images become separated or get multiplied into many rows (for example, using GROUP BY or ORDER BY clauses in a query), you can lose row order. The Map_CDC_Operation transform allows you to restore the original ordering of image pairs by using the DI_SEQUENCE_NUMBER column as its Sequencing column. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

The DI_OPERATION_TYPE column
Data Integrator generates values in the DI_OPERATION_TYPE column. Valid values for this column are:

•	I for INSERT
•	D for DELETE
•	B for before-image of an UPDATE
•	U for after-image of an UPDATE

Configuring a SQL Server CDC source
When you drag a CDC datastore table into a data flow, it automatically becomes a source object.


To configure a SQL Server CDC table
1.	Drag a CDC datastore table into a data flow. The table automatically becomes a source object.
2.	Click the name of this source object to open its Source Table Editor.
3.	Click the CDC Options tab.
4.	Specify a value for the CDC subscription name.

The Source Table Editor’s CDC Options tab shows the following three CDC table options:

CDC subscription name
A name that Data Integrator uses to keep track of the position in the continuously growing SQL Server CDC table. SQL Server CDC uses the subscription name to mark the last row read so that the next Data Integrator job starts reading the CDC table from that position. You can use multiple subscription names to identify different users who read from the same imported SQL Server CDC table; SQL Server CDC uses the subscription name to save the position of each user. Select from the list or type a new name to create a new subscription. A subscription name must be unique within a datastore, owner, and table name. For example, you can use the same subscription name (without conflict) with different tables that have the same name in the same datastore if they have different owner names. This value is required.

Enable check-point
Enables Data Integrator to restrict CDC reads using check-points. Once a check-point is placed, the next time the CDC job runs, it reads only the rows inserted into the CDC table since the last check-point. For more information, see “Using mainframe check-points” on page 511. By default, check-points are not enabled.


Get before-image for each update row
Some databases allow two images to be associated with an UPDATE row: a before-image and an after-image. If your source can log before-images and you want to read them during change-data capture jobs, enable this option. By default, only after-images are retrieved. For more information, see “Using before-images” on page 490.

Using check-points
A Log Reader Agent in SQL Replication Server reads the transaction log of the Publisher and saves the changed data into the Distribution database, which Data Integrator uses as the CDC data source. Rows of changed data append to the previous load in the CDC data source.

When you enable check-points, a CDC job in Data Integrator uses the subscription name to read the most recent set of appended rows and to mark the end of the read. If check-points are not enabled, the CDC job reads all the rows in the CDC data source and processing time increases. To use check-points, on the Source Table Editor enter the CDC Subscription name and select the Enable check-point option.

If you enable check-points and you run your CDC job in recovery mode, the recovered job begins to review the CDC data source at the last check-point.

Note: To avoid data corruption problems, do not reuse data flows that use CDC datastores, because each time a source table extracts data it uses the same subscription name. This means that identical jobs, depending upon when they run, can get different results and leave check-points in different locations in the file.

Using before-images from SQL Server sources
For an introduction to before- and after-images, see “Using before-images” on page 490. When you must capture before-image update rows:

•	Make sure SQL Replication Server is set up to retrieve full before-images. When you create a Publication in SQL Replication Server, specify XCALL for UPDATE commands and DELETE commands to obtain before-images.

•	Select the Get before-images for each update row option in the CDC table’s source editor.


SQL Replication Server must be set up properly; otherwise, enabling the Get before-images for each update row option in Data Integrator has no effect.

After you check the Get before-images for each update row option, Data Integrator processes two rows for every update. In addition to the performance impact of this data volume increase, the before- and after-image pairs could be separated or lost depending on the design of your data flow, which would cause data integrity issues.

The Map_CDC_Operation transform can resolve problems, but undesirable results can still occur due to programming errors. When you use functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow, be aware of the possible impact to targets. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

Limitations
The following limitations exist for this feature:

•	You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
	•	Table_Comparison, Key_Generation, and SQL transforms
	•	All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
•	You can only create one CDC source in a data flow.

Using CDC with timestamp-based sources
Use Timestamp-based CDC to track changes:

•	If you are using sources other than Oracle 9i, DB2 8.2, mainframes accessed through IBM II Classic Federation, or mainframes accessed through Attunity, and
•	If the following conditions are true:
	•	There are date and time fields in the tables being updated
	•	You are updating a large table that has a small percentage of changes between extracts and an index on the date and time fields
	•	You are not concerned about capturing intermediate results of each transaction between extracts (for example, if a customer changes regions twice in the same day).


Business Objects does not recommend using the Timestamp-based CDC when:

•	You have a large table, a large percentage of it changes between extracts, and there is no index on the timestamps.
•	You need to capture physical row deletes.
•	You need to capture multiple events occurring on the same row between extracts.

This section discusses what you need to consider when using source-based, time-stamped, changed-data capture:

•	Processing timestamps
•	Overlaps
•	Types of timestamps

In these sections, the term timestamp refers to date, time, or datetime values. The discussion in this section applies to cases where the source table has either CREATE or UPDATE timestamps for each row. Timestamps can indicate whether a row was created or updated. Some tables have both create and update timestamps; some tables have just one. This section assumes that tables contain at least an update timestamp. For other situations, see “Types of timestamps” on page 533.

Some systems have timestamps with dates and times, some with just the dates, and some with monotonically generated increasing numbers. You can treat dates and generated numbers the same. It is important to note that for timestamps based on real time, time zones can become important. If you keep track of timestamps using the nomenclature of the source system (that is, using the source time or source-generated number), you can treat both temporal (specific time) and logical (time relative to another time or event) timestamps the same way.

Processing timestamps
The basic technique for using timestamps to determine changes is to save the highest timestamp loaded in a given job and start the next job with that timestamp. To do this, create a status table that tracks the timestamps of rows loaded in a job. At the end of a job, UPDATE this table with the latest loaded timestamp. The next job then reads the timestamp from the status table and selects only the rows in the source for which the timestamp is later than the status table timestamp.
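The general shape of this pattern, sketched here in generic SQL, uses the status_table, Last_Timestamp, source_table, target_table, and Update_Timestamp names from the example that follows; adapt the names and data types to your own schema.

	-- One-row table that records the highest timestamp loaded so far.
	CREATE TABLE status_table (Last_Timestamp datetime);

	-- At the start of a job: pick up only rows changed since the last load.
	SELECT *
	FROM   source_table
	WHERE  Update_Timestamp > (SELECT Last_Timestamp FROM status_table);

	-- At the end of the job: advance the marker to the highest timestamp loaded.
	UPDATE status_table
	SET    Last_Timestamp = (SELECT MAX(Update_Timestamp) FROM target_table);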


The following example illustrates the technique. Assume that the last load occurred at 2:00 PM on January 1, 1998. At that time, the source table had only one row (key=1) with a timestamp earlier than the previous load. Data Integrator loads this row into the target table and updates the status table with the highest timestamp loaded: 1:10 PM on January 1, 1998. After 2:00 PM Data Integrator adds more rows to the source table:

Source table
Key	Data	Update_Timestamp
1	Alvarez	01/01/98 01:10 PM
2	Tanaka	01/01/98 02:12 PM
3	Lani	01/01/98 02:39 PM

Target table
Key	Data	Update_Timestamp
1	Alvarez	01/01/98 01:10 PM

Status table
Last_Timestamp
01/01/98 01:10 PM

At 3:00 PM on January 1, 1998, the job runs again. This time the job does the following:
1.	Reads the Last_Timestamp field from the status table (01/01/98 01:10 PM).
2.	Selects rows from the source table whose timestamps are later than the value of Last_Timestamp. The SQL command to select these rows is:
SELECT * FROM Source WHERE 'Update_Timestamp' > '01/01/98 01:10 pm'

This operation returns the second and third rows (key=2 and key=3).
3.	Loads these new rows into the target table.
4.	Updates the status table with the latest timestamp in the target table (01/01/98 02:39 PM) with the following SQL statement:
UPDATE STATUS SET 'Last_Timestamp' = SELECT MAX('Update_Timestamp') FROM target_table

The target shows the new data:


Source table
Key	Data	Update_Timestamp
1	Alvarez	01/01/98 01:10 PM
2	Tanaka	01/01/98 02:12 PM
3	Lani	01/01/98 02:39 PM

Target table
Key	Data	Update_Timestamp
1	Alvarez	01/01/98 01:10 PM
2	Tanaka	01/01/98 02:12 PM
3	Lani	01/01/98 02:39 PM

Status table
Last_Timestamp
01/01/98 02:39 PM

To specify these operations, a Data Integrator data flow requires the following objects (and assumes all the required metadata for the source and target tables has been imported):

•	A data flow to extract the changed data from the source table and load it into the target table:
Data flow: Changed data with timestamps

The query selects rows from SOURCE_TABLE to load to TARGET_TABLE.

The query includes a WHERE clause to filter rows with older timestamps.

•	A work flow to perform the following:
	1.	Read the status table
	2.	Set the value of a variable to the last timestamp
	3.	Call the data flow with the variable passed to it as a parameter
	4.	Update the status table with the new timestamp

Work flow: Changed data with timestamps

The work flow uses script expressions such as the following to read and update the status table (steps 2 and 4 above):

$Last_Timestamp_var = sql('target_ds', 'SELECT to_char(last_timestamp, \'YYYY.MM.DD HH24:MI:SS\') FROM status_table');

$Last_Timestamp_var = sql('target_ds', 'UPDATE status_table SET last_timestamp = (SELECT MAX(target_table.update_timestamp) FROM target_table)');

•	A job to execute the work flow

Overlaps
Unless source data is rigorously isolated during the extraction process (which typically is not practical), there is a window of time when changes can be lost between two extraction runs. This overlap period affects source-based changed-data capture because this kind of data capture relies on a static timestamp to determine changed data.

For example, suppose a table has 1000 rows (ordered 1 to 1000). The job starts with timestamp 3:00 and extracts each row. While the job is executing, someone updates two rows (1 and 1000) with timestamps 3:01 and 3:02, respectively. The job extracts row 200 when someone updates row 1. When the job extracts row 300, someone updates row 1000. When complete, the job extracts the latest timestamp (3:02) from row 1000 but misses the update to row 1.


Here is the data in the table:

Row Number	Column A
1		...
2		...
3		...
...		...
200		...
...		...
600		...
...		...
1000		...

Here is the timeline of events (assume the job extracts 200 rows per minute):

3:00	Start job extraction at row 1
3:01	Extract row 200; update row 1 (original row 1 already extracted)
3:02	Update row 1000
3:03	Extract row 600
3:05	Extract row 1000; job done

There are three techniques for handling this situation:

•	Overlap avoidance
•	Overlap reconciliation
•	Presampling

The following sections describe these techniques and their implementations in Data Integrator. This section continues on the assumption that there is at least an update timestamp. For other situations, see “Types of timestamps” on page 533.

Find out how you can participate and help to improve our documentation. Overlap reconciliation Overlap reconciliation requires a special extraction process that reapplies changes that could have occurred during the overlap period. if it takes at most two hours to run the job. the overlap data flow must check whether the rows exist in the target and insert only the ones that are missing. perform it for as few rows as possible. but rows flagged as UPDATE rarely are. While this regular job does not give you up-to-the-minute updates. Therefore. the regular data flow selects the new rows from the source. you can run a job at 1:00 AM every night that selects only the data updated the previous day until midnight. If the data volume is sufficiently low. it is possible to set up a system where there is no possibility of an overlap. rows flagged as INSERT are often loaded into a fact table. overlap reconciliation reapplies the data updated between 9:30 PM and 10:30 PM on January 1. For example. an overlap period of at least two hours is recommended. Because the overlap data flow is likely to apply the same rows again. an overlap period of n (or n plus some small increment) hours is recommended. Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 Overlap avoidance In some cases. The overlap period is usually equal to the maximum possible extraction time. and uses the database loader to add the new facts to the target database. For example. it guarantees that you never have an overlap and greatly simplifies timestamp management. If it can take up to n hours to extract the data from the source system. There is an advantage to creating a separate overlap data flow.This document is part of a SAP study on PDF usage. it cannot blindly bulk load them or it creates duplicates. This lookup affects performance. you can load the entire new data set using this technique of checking before loading. avoiding the need to create two different data flows. 1998. A “regular” data flow can assume that all the changes are new and make assumptions to simplify logic and improve performance. therefore. For example. generates new keys for them. You can avoid overlaps if there is a processing interval where no updates are occurring on the target system. if you can guarantee that the data extraction from the source system does not last more than one hour. This extraction can be executed separately from the regular extraction. Data Integrator Designer Guide 529 . Thus. if the highest timestamp loaded from the previous job was 01/01/98 10:30 PM and the overlap period is one hour. For example.

The main difference is that the status table now contains a start and an end timestamp. and then extracting rows up to that timestamp. Find out how you can participate and help to improve our documentation. and the next job runs. it does the following: 1. The start timestamp is the latest timestamp extracted by the previous job.This document is part of a SAP study on PDF usage. 1998. To return to the example: The last extraction job loaded data from the source table to the target table and updated the status table with the latest timestamp loaded: Source table Key 1 2 3 Data Alvarez Tanaka Lani Update_Timestamp 01/01/98 01:10 PM 01/01/98 02:12 PM 01/01/98 02:39 PM Target table Key 1 Data Alvarez Update_Timestamp 01/01/98 01:10 PM Status table Start_Timestamp End_Timestamp 01/01/98 01:10 PM NULL Now it’s 3:00 PM on January 1. saving it. the end timestamp is the timestamp selected by the current job. The SQL command to select one row is: SELECT MAX(Update_Timestamp) FROM source table 530 Data Integrator Designer Guide . Selects the most recent timestamp from the source table and inserts it into the status table as the End Timestamp. The technique is an extension of the simple timestamp processing technique described previously in “Processing timestamps” on page 523. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources Presampling Presampling eliminates the overlap by first identifying the most recent timestamp in the system.

Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 The status table becomes: Status table Start_Timestamp End_Timestamp 01/01/98 01:10 PM 01/01/98 02:39 PM 2. The SQL command to select these rows is: SELECT * FROM source table WHERE Update_Timestamp > '1/1/98 1:10pm' AND Update_Timestamp <= '1/1/98 2:39pm' This operation returns the second and third rows (key=2 and key=3) 3. 4. Loads these new rows into the target table. Updates the status table by setting the start timestamp to the previous end timestamp and setting the end timestamp to NULL.This document is part of a SAP study on PDF usage. Selects rows from the source table whose timestamps are greater than the start timestamp but less than or equal to the end timestamp. The table values end up as follows: Source table Key 1 2 3 Data Alvarez Tanaka Lani Update_Timestamp 01/01/98 01:10 PM 01/01/98 02:12 PM 01/01/98 02:39 PM Target table Key 1 2 3 Data Alvarez Tanaka Lani Update_Timestamp 01/01/98 01:10 PM 01/01/98 02:12 PM 01/01/98 02:39 PM Status table Start_Timestamp End_Timestamp 01/01/98 02:39 PM NULL Data Integrator Designer Guide 531 . Find out how you can participate and help to improve our documentation.

• A work flow to perform the following: 1.This document is part of a SAP study on PDF usage. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources To enhance the previous example to consider the overlap time requires the following changes to the work flow: • A data flow to extract the changes since the last update and before the most recent timestamp. 532 Data Integrator Designer Guide . The query includes a WHERE clause to filter rows between timestamps. Read the source table to find the most recent timestamp. Update the start timestamp with the value from end timestamp and set the end timestamp to NULL. Set the value of two variables to the start of the overlap time and to the end of the overlap time. respectively. Call the data flow with the variables passed to it as parameters. 3. Data flow: Changed data with overlap The query selects rows from SOURCE_TABLE to load to TARGET_TABLE. Find out how you can participate and help to improve our documentation. 4. 2.

'UPDATE status_table SET start_timestamp = end_stamp'). Types of timestamps Some systems have timestamps that record only when rows are created. Others have timestamps that record only when rows are updated. 'UPDATE status_table SET end_timestamp = \'\' '). $End_timestamp_var = sql('target_ds'. Find out how you can participate and help to improve our documentation. This section discusses these timestamps: • • • Create-only timestamps Update-only timestamps Create and update timestamps Data Integrator Designer Guide 533 . (Typically. update-only systems set the update timestamp when the row is created or updated. (('UPDATE status_table SET end_timestamp = \' ' || to_char($End_timestamp_var. Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 Work flow: Changed data with overlap $End_timestamp_var = sql('target_ds'. 2 1 3 4 $Start_timestamp_var = sql('target_ds'. 'yyyy-mm-dd hh24:mi:ss')) || '\' ')). there are systems that keep separate timestamps that record when rows are created and when they are updated. 'SELECT MAX(update_stamp) FROM source_table').) Finally.This document is part of a SAP study on PDF usage. $Start_timestamp_var = sql('target_ds'.

If the table never gets updated. The section “Using CDC for targets” on page 545. Accomplish these extractions in Data Integrator by adding the WHERE clause from the following SQL commands into an appropriate query transform: • • Find new rows: SELECT * FROM source_table WHERE Create_Timestamp > $Last_Timestamp Find updated rows: SELECT * FROM source_table WHERE Create_Timestamp <= $Last_Timestamp AND Update_Timestamp > $Last_Timestamp) From here.This document is part of a SAP study on PDF usage. Less frequently (for example. Update-only timestamps Using only an update timestamp helps minimize the impact on the source systems. Create and update timestamps Both timestamps allow you to easily separate new data from updates to the existing data. If the system provides only an update timestamp and there is no way to tell new rows from updated rows. 534 Data Integrator Designer Guide . and the updated rows go through the key-lookup process and are updated in the target. you can extract only the new rows. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources Create-only timestamps If the source system provides only create timestamps. the new rows go through the key-generation process and are inserted into the target. you can process the entire table to identify the changes. weekly) extract the updated rows by processing the entire table. your job has to reconcile the new data set against the existing data using the techniques described in the section “Using CDC for targets” on page 545. describes how to identify changes. but it makes loading the target systems more difficult. you can combine the following two techniques: • • Periodically (for example. you have these options: • • • If the table is small enough. Find out how you can participate and help to improve our documentation. daily) extract only the new rows. The job extracts all the changed rows and then filters unneeded rows using their timestamps. If the table is large and gets updated.

For example. you cannot simply reload it every time a record changes: unless you assign the generated key of 123 to the customer ABC. Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 For performance reasons. the customer dimension table contains generated keys. Timestamp-based CDC examples This section discusses the following techniques for time-stamped based CDC: • • Preserving generated keys Preserving history Preserving generated keys For performance reasons.This document is part of a SAP study on PDF usage. many data warehouse dimension tables use generated keys to join with the fact table. When you run a job to update this table. All facts for customer ABC have 123 as the customer key. the customer dimension table and the fact tables do not correlate. If you do not find the key. customer ABC has a generated key 123 in the customer dimension table. the simplest technique is to look up the key for all rows using the lookup function in a query. The updated rows cannot be loaded by bulk into the same target at the same time. Even if the customer dimension is small. In the following example. you might want to separate the extraction of new rows into a separate data flow to take advantage of bulk loading into the target. You can preserve generated keys by: • • Using the lookup function Comparing tables Using the lookup function If history preservation is not an issue and the only goal is to generate the correct keys for the existing rows. Source customer table Company Name ABC DEF GHI JKL Customer ID 001 002 003 004 Data Integrator Designer Guide 535 . the source customer rows must match the existing keys. generate a new one. Find out how you can participate and help to improve our documentation.

3.This document is part of a SAP study on PDF usage. Data flow: Replace generated keys 1 Source data without generated keys 3 Source data with generated keys when they exist 2 536 Data Integrator Designer Guide . Loads the result into a file (to be able to test this stage of the data flow before adding the next steps). 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources Target dimension table Gen_Key Company Name 123 124 125 ABC DEF GHI Customer ID 001 002 003 This example data flow does the following: 1. Extracts the source rows. Find out how you can participate and help to improve our documentation. 2. Retrieves the existing keys using a lookup function in the mapping of a new column in a query.

2. The column name in the target table containing the generated keys. Caching option to optimize the lookup performance.customer GKey NULL 'PRE_LOAD_CACHE' Customer_ID Customer_ID Fully qualified name of the target table containing the generated keys. Find out how you can participate and help to improve our documentation. A query to select the rows with NULL generated keys.This document is part of a SAP study on PDF usage. 3. The column in the target table containing the value to use to match rows. NULL value to insert in the key column if no existing key is found. The arguments for the function are as follows: lookup function arguments Description target_ds. A Key_Generation transform to determine the appropriate key to add. The column in the source table containing the values to use to match rows. this requires the following steps: 1. The resulting data set contains all the rows from the source with generated keys where available: Result data set Gen_Key Company Name 123 124 125 NULL ABC DEF GHI JKL Customer ID 001 002 003 004 Adding a new generated key to the new records requires filtering out the new rows from the existing and updated rows. Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 The lookup function compares the source rows with the target.owner. Data Integrator Designer Guide 537 . A target to load the new rows into the customer dimension table. In the data flow.

the rows from the source whose keys were found in the target table might contain updated data. Find out how you can participate and help to improve our documentation. The data flow requires new steps to handle updated rows. A query to filter the rows with existing keys from the rows with no keys. Data Integrator loads all rows from the source into the target. as follows: 1.This document is part of a SAP study on PDF usage. Data flow: Adding new generated keys 2 1 3 Customer dimension table—new rows with new generated keys This data flow handles the new rows. Because this example assumes that preserving history is not a requirement. however. A target to load the rows into the customer dimension table. 2. A new line leaving the query that looked up the existing keys. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources The data flow expands as follows. 538 Data Integrator Designer Guide . 3.

You can then run the result through the key-generation transform to assign a new key for every INSERT. a table-comparison transform provides a better alternative by allowing the data flow to load only changed rows. The table-comparison transform examines all source rows and performs the following operations: • • • • Generates an INSERT for any new row not in the target table. Find out how you can participate and help to improve our documentation. Ignores any row that is in the target table and has not changed. This is the data set that Data Integrator loads into the target table. Data Integrator Designer Guide 539 . Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 The data flow expands as follows: Data flow: Loading all rows into the target 3 1 Customer dimension table: all rows with existing generated keys 2 Comparing tables The drawback of the generated-keys method is that even if the row has not been changed. Fills in the generated key for the updated rows. Generates an UPDATE for any row in the target table that has changed.This document is part of a SAP study on PDF usage. it generates an UPDATE and is loaded into the target. If the amount of data is large.

A key-generation transform to generate new keys. Find out how you can participate and help to improve our documentation. Most likely. you will perform history preservation on dimension tables. 4. 5. 540 Data Integrator Designer Guide . A target to load the rows into the customer dimension table Data flow: Load only updated or new rows 1 2 3 5 4 Preserving history History preserving allows the data warehouse or data mart to maintain the history of data so you can analyze it over time. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources The data flow that accomplishes this transformation includes the following steps: 1. A table-comparison transform to generate INSERT and UPDATE rows and to fill in existing keys. 2. 3. A source to extract the rows from the source table(s).This document is part of a SAP study on PDF usage. A query to map columns from the source.

flags the row as INSERT. if the values have changed. the data flow preserves the history for the Region column but does not preserve history for the Phone column. Data Integrator Designer Guide 541 . The History_Preserving transform ignores everything but rows flagged as UPDATE. if a customer moves from one sales region to another. The original row describing the customer remains in the customer dimension table with a unique generated key. the History_Preserving transform generates a new row for that customer. Source Customer table Customer Fred's Coffee Region East Phone (212) 123-4567 (650) 222-1212 (115) 231-1233 Jane's Donuts West Sandy's Candy Central Target Customer table GKey 1 2 3 Customer Fred's Coffee Region East Phone (212) 123-4567 (201) 777-1717 (115) 454-8000 Jane's Donuts East Sandy's Candy Central In this example. it compares the values of specified columns and. A source to extract the rows from the source table(s). In the following example. 2. The data flow contains the following steps: 1. one customer moved from the East region to the West region.This document is part of a SAP study on PDF usage. For these rows. Data Integrator provides a special transform that preserves data history to prevent this kind of situation. simply updating the customer record to reflect the new region would give you misleading results in an analysis by region over time because all purchases made by a customer before the move would incorrectly be attributed to the new region. Techniques for Capturing Changed Data Using CDC with timestamp-based sources 18 For example. and another customer’s phone number changed. This produces a second row in the target instead of overwriting the first row. Find out how you can participate and help to improve our documentation. A Key_Generation transform gives the new row a new generated key and loads the row into the customer dimension table. To expand on how Data Integrator would handle the example of the customer who moves between regions: • • • If Region is a column marked for comparison. A query to map columns from the source.

Find out how you can participate and help to improve our documentation. A table-comparison transform to generate INSERTs and UPDATEs and to fill in existing keys.This document is part of a SAP study on PDF usage. 5. 6. the change in the Sandy's Candy row did not create a new row but updated the existing one. Data flow: Preserve history in the target 1 2 3 5 6 4 The resulting dimension table is as follows: Target Customer table GKey 1 2 3 4 Customer Fred's Coffee Region East Phone (212) 123-4567 (201) 777-1717 (115) 231-1233 (650) 222-1212 New row Updated rows Jane's Donuts East Sandy's Candy Central Jane's Donuts West Because the Region column was set as a Compare column in the History_Preserving transform. the change in the Jane's Donuts row created a new row in the customer dimension table. A History_Preserving transform to convert certain UPDATE rows to INSERT rows. A key-generation transform to generate new keys for the updated rows that are now flagged as INSERT. A target to load the rows into the customer dimension table. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources 3. Because the Phone column was not used in the comparison. 542 Data Integrator Designer Guide . 4.

Now that there are two rows for Jane's Donuts, correlations between the dimension table and the fact table must use the highest key value. Note that updates to non-history-preserving columns update all versions of the row if the update is performed on the natural key (for example, Customer), and only update the latest version if the update is on the generated key (for example, GKey). You can control which key to use for updating by appropriately configuring the loading options in the target editor.

Valid_from date and valid_to date

To support temporal queries like “What was the customer's billing address on May 24, 1998?”, Data Integrator supports Valid from and Valid to date columns. In history-preserving techniques, there are multiple records in the target table with the same source primary key values. A record from the source table is considered valid in the dimension table for all date values t such that the Valid from date is less than or equal to t, which is less than the Valid to date. (Valid in this sense means that the record's generated key value is used to load the fact table during this time interval.)

When you specify the Valid from and Valid to entries, the History_Preserving transform generates an UPDATE record before it generates an INSERT statement for history-preservation reasons (it converts an UPDATE into an INSERT). The UPDATE record will set the Valid to date column on the current record (the one with the same primary key as the INSERT) to the value in the Valid from date column in the INSERT record.

Update flag

To support slowly changing dimension techniques, Data Integrator enables you to set an update flag to mark the current record in a dimension table. The value Set value in column Column identifies the current valid record in the target table for a given source table primary key. When you specify Column, the History_Preserving transform generates an UPDATE record before it generates an INSERT statement. This UPDATE record sets the Column value to Reset value in the target table record with the same source primary key as the INSERT statement. In the INSERT statement the Column will be set to Set value.

When you specify entries in both groups, the History_Preserving transform generates only one extra UPDATE statement for every INSERT statement it produces. This UPDATE statement updates the Valid to value.
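As an illustration of the kind of temporal query the Valid from and Valid to columns support, a lookup against the customer dimension from the earlier example might look like the following. The table and column names here (customer_dim, valid_from, valid_to) are only examples, not fixed product names.

	-- Which dimension record (and generated key) was current for this customer
	-- on May 24, 1998?
	SELECT GKey, Customer, Region
	FROM   customer_dim
	WHERE  Customer = 'Jane''s Donuts'
	  AND  valid_from <= '1998-05-24'
	  AND  '1998-05-24' < valid_to;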

you must consider: • • Header and detail synchronization Capturing physical deletions Header and detail synchronization Typically.LAST_MODIFIED BETWEEN $G_SDATE AND $G_EDATE OR DETAIL. You might opt to relax that clause by removing one of the upper bounds. 544 Data Integrator Designer Guide . For example. In some instances.LAST_MODIFIED BETWEEN $G_SDATE AND $G_EDATE OR DETAIL. however. you might choose to extract all header and detail information whenever any changes occur at the header level or in any individual line item. Conversely. 18 Techniques for Capturing Changed Data Using CDC with timestamp-based sources Additional job design tips When designing a job to implement changed-data capture (CDC). In these cases. when rows are physically deleted). Find out how you can participate and help to improve our documentation. or you might not have access to such information (for example. but it might improve the final performance of your system while not altering the result of your target database.LAST_MODIFIED >= $G_SDATE) … This might retrieve a few more rows than originally intended. but the same column at the order header level does not update.ID = DETAIL. source systems keep track of header and detail information changes in an independent way.This document is part of a SAP study on PDF usage.ID AND (HEADER.LAST_MODIFIED BETWEEN $G_SDATE AND $G_EDATE) For some databases.ID = DETAIL. such as in: … WHERE HEADER. To extract all header and detail rows when any of these elements have changed. use logic similar to this SQL statement: SELECT … FROM HEADER. a change to the default ship-to address in the order header might impact none of the existing line items. this WHERE clause is not well optimized and might cause serious performance degradation. if a line-item status changes. DETAIL WHERE HEADER. its “last modified date” column updates.ID AND (HEADER. your source system might not consistently update those tracking columns.

Capturing physical deletions

When your source system allows rows to be physically deleted, your job should include logic to update your target database correspondingly. There are several ways to do this:

•  Scan a log of operations — If your system logs transactions in a readable format or if you can alter the system to generate such a log, then you can scan that log to identify the rows you need to delete.

•  Perform a full refresh — Simply reload all of the data, therefore fully synchronizing the source system and the target database.

•  Perform a partial refresh based on a data-driven time-window — For example, suppose that the source system only allows physical deletion of orders that have not been closed. If the first non-closed order in your source table occurred six months ago, then by refreshing the last six months of data you are guaranteed to have achieved synchronization.

•  Perform a partial refresh based on a business-driven time-window — For example, suppose that the business that the job supports usually deletes orders shortly after creating them. In this case, refreshing the last month of orders is appropriate to maintain integrity.

•  Check every order that could possibly be deleted — You must verify whether any non-closed order has been deleted. To be efficient, this technique requires you to keep a record of the primary keys for every object that is a candidate for deletion (a SQL sketch appears below).

When physical deletions of detail information in a header-detail relationship are possible (for example, removing line items from an existing order), then you must capture these physical deletions when synchronizing header and detail information.

Using CDC for targets

Source-based changed-data capture is almost always preferable to target-based capture for performance reasons. Some source systems, however, do not provide enough information to make use of the source-based changed-data capture techniques. Target-based changed-data capture allows you to use the technique when source-based change information is limited.
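Returning to the "Check every order that could possibly be deleted" option above, one hedged SQL sketch of the comparison is shown here. It assumes the candidate keys have been saved on the target side; ORDERS (source), TARGET_ORDERS (target copy of the keys), and the CLOSED flag are illustrative names only.

    -- Orders recorded in the target that no longer exist in the source
    -- were physically deleted and must be handled by the job.
    SELECT t.ORDER_ID
      FROM TARGET_ORDERS t
     WHERE t.CLOSED = 'N'               -- only non-closed orders can be deleted
       AND NOT EXISTS (SELECT 1
                         FROM ORDERS s
                        WHERE s.ORDER_ID = t.ORDER_ID);

The rows returned identify orders (and, by extension, their line items) whose target records need to be deleted or flagged.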



Monitoring jobs

About this chapter

This chapter contains the following topics:
•  Administrator
•  SNMP support

Administrator

The Data Integrator Administrator is your primary monitoring resource for all jobs designed in the Data Integrator Designer. For detailed information, see the Data Integrator Administrator Guide.

SNMP support

Data Integrator includes an SNMP (simple network management protocol) agent that allows you to connect to third-party applications to monitor its jobs. You can use an SNMP-supported application to monitor Data Integrator job status and receive error events.

Topics in this section include:
•  About the Data Integrator SNMP agent
•  Job Server, SNMP agent, and NMS application architecture
•  About SNMP Agent's Management Information Base (MIB)
•  About an NMS application
•  Configuring Data Integrator to support an NMS application
•  Troubleshooting

About the Data Integrator SNMP agent

When you enable SNMP (simple network management protocol) for a Job Server, that Job Server sends information about the jobs it runs to the SNMP agent. The SNMP agent monitors and records information about the Job Server and the jobs it is running. You can configure NMS (network management software) applications to communicate with the SNMP agent. Thus, you can use your NMS application to monitor the status of Data Integrator jobs.

The SNMP agent is a license-controlled feature of the Data Integrator Job Server. When you have a Data Integrator SNMP license on a computer, you can enable SNMP for any number of Job Servers running on that computer. Like Job Servers, the SNMP agent starts when the Data Integrator Service starts.

Job Server, SNMP agent, and NMS application architecture

You must configure one Job Server to communicate with the SNMP agent and to manage the communication for SNMP-enabled Job Servers. You must also enable at least one Job Server for SNMP. This Job Server does not need to be the same one configured with the communication port. When you enable a Job Server for SNMP, it will send events to the SNMP agent via the Job Server with the communication port.

When you configure the SNMP agent, you specify one agent port and any number of trap receiver ports. The SNMP agent uses the agent port to communicate with NMS applications using UDP (user datagram protocol). The agent listens for requests from the NMS applications and responds to requests. The agent uses the trap receiver ports to send error events (traps or notifications) to NMS applications. The Data Integrator SNMP agent sends proactive messages (traps) to NMS applications. While you use an NMS application, traps notify you about potential problems.
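As a rough illustration of the receiving side, the sketch below configures the net-snmp trap daemon (snmptrapd, a third-party tool that is not part of Data Integrator) to accept the agent's traps on the standard trap port. The community string trap_co is taken from the sample UNIX configuration screen later in this chapter; your own trap community and port may differ.

    # snmptrapd.conf -- accept and log traps carrying the assumed community
    authCommunity log trap_co

    # Run the daemon in the foreground, logging to stdout, on UDP port 162
    snmptrapd -f -Lo -c ./snmptrapd.conf udp:162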

About SNMP Agent's Management Information Base (MIB)

The Data Integrator SNMP agent uses a management information base (MIB) to store information about SNMP-enabled Job Servers and the jobs they run. Metadata for the Data Integrator MIB is stored in two files which are located in the LINK_DIR/bin/snmp/mibs directory:

    BOBJ-ROOT-MIB.txt
    BOBJ-DI-MIB.txt

Consult these files for more detailed descriptions and up-to-date information about the structure of objects in the Data Integrator MIB.

The Data Integrator MIB contains five scalar variables and two tables. The scalar variables list the installed version of Data Integrator, the time the agent started, and the current system time. The tables contain information about the status of Job Servers and the jobs they run. Tables include:

Table 19-1: Data Integrator MIB Job Server table

jsIndex — A unique index that identifies each row in the table
jsName — Name of Job Server
jsStatus — Status of the Job Server. Possible values are: notEnabled, initializing, optimizing, ready, proceed, wait, stop, stopRunOnce, stopRecovered, stopError, notResponding, error, warning, trace

Table 19-2: Data Integrator MIB Job table

jobIdDate — The part of a job's identifier that matches the date
jobIdTime — The part of a job's identifier that matches the time
jobIdN — The final integer in a job's identifier
jobRowN — A unique index that identifies an object in a job
jType — Associated object for this row. Possible values include: job — A job; wf — A work flow; df — A data flow; error — An error message; trace — A trace message
jName — The name or identifier for this object, such as the job or work flow name or the error message identifier
jStatus — The status of the object. Possible values include: notEnabled, initializing, optimizing, ready, proceed, wait, stop, stopRunOnce, stopRecovered, stopError, notResponding, error, warning, trace
jRowsIn — Depends on the type of object: Data flow — The number of input rows read; Work flow — Always zero; Job — Sum of values for all data flows in the job; Error, warning, trace — Always zero

jRowsOut — Depends on the type of object: Data flow — The number of output rows written; Work flow — Always zero; Job — Sum of values for all data flows in the job; Error, warning, trace — Number of times that the error, warning, or trace has occurred during this job
jStatusTime — The time when the object's jStatus, jRowsIn, or jRowsOut last changed
jExecTime — The number of milliseconds between the beginning of the object's execution and jStatusTime
jInitTime — The number of milliseconds necessary to compile the object (job, work flow, or data flow)
jMessage — Depends on the type of object: for jobs, work flows, and data flows, either empty or an information message; for errors, warnings, or trace messages, jMessage contains the message text

The Data Integrator SNMP agent receives data about jobs and Job Servers from SNMP-enabled Job Servers and maintains this data in the Data Integrator MIB for currently running jobs and recently completed jobs. The MIB is stored in memory. To provide some historical context, each time the agent starts it loads data into the Job table for each Job Server. The data is from jobs that ran just before the Job Servers were shut down.

During configuration, you set a job lifetime and a maximum table size. The SNMP agent maintains the data for completed jobs for the specified lifetime. If the MIB's size reaches the maximum table size, the agent eliminates 20 percent of the completed jobs, starting with the oldest jobs.

The agent summarizes and eliminates individual data flow and work flow records for completed jobs periodically to reduce the size of the MIB. The data that remains includes:
•  One Job table row with the statistics for the entire job
•  For a successful job, zero additional rows
•  For a failed job, additional error rows as needed

About an NMS application

An NMS application can query the Data Integrator SNMP agent for the information stored in the Data Integrator MIB (iso.org.dod.internet.private.enterprises.businessObjects.dataIntegrator) or one of the standard SNMP MIBs:
•  iso.org.dod.internet.mgmt.mib-2.system
•  iso.org.dod.internet.mgmt.mib-2.snmp
•  iso.org.dod.internet.snmpv2.snmpModules

The agent listens on the agent port for commands from an NMS application, which communicates commands as PDUs (protocol data units). The agent responds to SNMP GetRequest, GetNextRequest, GetBulkRequest, and SetRequest commands that specify valid object identifiers (OIDs). Because there are no writable objects in the Data Integrator MIB, the agent gracefully rejects SetRequest commands for that MIB.

Note: Status for real-time services does not appear until the real-time services have processed enough messages to reach their cycle counts. Similarly, Data Integrator does not send traps for real-time jobs until the jobs have reached their cycle count. After the jobs reach their cycle count, Data Integrator refreshes status or sends additional traps.

The agent also sends an SNMPv2-Trap PDU to the SNMP ports that you have configured. Specifically, the agent sends traps when:
•  Errors occur during batch or real-time jobs
•  Job Servers fail
•  Agent starts
•  Agent has an internal error
•  Agent has an orderly shut down
•  Agent restarts and the previous agent was unable to send a trap caused by a job error (these traps include a historical annotation)
   Note: This can occur if the machine fails unexpectedly or is halted without an orderly shutdown.
•  Agent denies a request due to an authentication failure (if configured to do so)

While you use an NMS application, traps notify you about potential problems. See "Traps" on page 560.
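For a concrete idea of what such a query looks like from a command line, the hedged example below uses the net-snmp client tools (not shipped with Data Integrator) to walk the enterprises subtree of an agent listening on the default agent port. The host name jobserver01 and the community read_v2 are assumptions (read_v2 appears in the sample configuration later in this chapter); the -M/-m options load the BOBJ MIB files named above so that symbolic names such as jsName and jStatus resolve, assuming the module names match the file names.

    snmpwalk -v 2c -c read_v2 \
             -M +$LINK_DIR/bin/snmp/mibs -m +BOBJ-ROOT-MIB:BOBJ-DI-MIB \
             jobserver01:161 enterprises

A GetRequest for a single value works the same way with snmpget, and because the Data Integrator MIB has no writable objects, an equivalent snmpset against it is rejected.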

Configuring Data Integrator to support an NMS application

If you have an NMS application that monitors systems across your network using SNMP, you can use that application to monitor Data Integrator jobs. To do so:

•  Select one Data Integrator Job Server on each computer to support SNMP communication. When you select a Job Server to support SNMP, you must also specify the communication port that connects the Job Server to the SNMP agent. Exactly one Job Server must be configured to support adapters and SNMP communication.
•  Enable SNMP on each Job Server that you want to monitor. When you enable SNMP for a Job Server, you are telling Data Integrator to send events to the SNMP agent via the communication Job Server on the same machine.
•  Configure the Data Integrator SNMP agent on the same computer for which you configured Job Servers.
•  Configure your NMS application to query the Data Integrator MIB for job status using the agent port. Refer to the documentation for the NMS application. If the Data Integrator SNMP agent does not respond to the NMS application (for example, if the application gets a time-out), check the agent.

Note: Supporting SNMP communication and enabling SNMP are separate configurations.

SNMP configuration in Windows

To select a Job Server to support SNMP communication
1. Open the Data Integrator Server Manager (Start > Programs > BusinessObjects Data Integrator 6.1 > Server Manager).
2. In the Data Integrator Server Manager, click Edit Job Server Config.
   The Job Server Configuration Editor opens. This window lists the Job Servers currently configured. The summary lists the Job Server name, the Job Server port, whether the Job Server supports adapters and SNMP communication, the communication port, and whether SNMP is enabled.
   If a Job Server is already configured to support adapters and SNMP communication, verify that the communication port is correct. Otherwise, continue with this procedure.
3. Select a Job Server and click Edit. If you want to add a new Job Server, click Add.

4. Select the Support adapter and SNMP communication check box.
5. In the Communication port box, enter the port you want to use for communication between the Data Integrator SNMP agent and Job Servers on this computer. Enter a port number that is not used for other applications. The default value is 4001. Data Integrator uses the same port to communicate with adapters.
6. Click OK.
7. Do one of the following:
   •  If you want to enable SNMP for the current Job Server:
      a. Select the Enable SNMP check box.
      b. In the Job Server Configuration Editor, select OK.
      c. Verify that the repositories associated with this Job Server are correct.
      d. In the Server Manager window, select Restart.
   •  If you want to configure the SNMP agent (including enabling SNMP for Job Servers):
      a. In the Job Server Configuration Editor, select OK.
      b. Skip to step 2 in the procedure "To configure the SNMP agent" on page 556.

To enable SNMP on a Job Server
You can enable SNMP for a Job Server from the Server Manager or from the SNMP agent. The SNMP agent allows you to enable or disable more than one Job Server at a time. To use the SNMP agent, see "To configure the SNMP agent" on page 556. To use the Server Manager:
1. Open the Data Integrator Server Manager (Start > Programs > BusinessObjects Data Integrator 6.1 > Server Manager).
2. In the Data Integrator Server Manager, click Edit Job Server Config.
   The Job Server Configuration Editor opens. This window lists the Job Servers currently configured. The summary indicates whether the Job Server is SNMP-enabled.

3. If you want to add a new Job Server, click Add.
   You can enable SNMP on any Job Server. SNMP-enabled Job Servers send the SNMP agent messages about job status and job errors.
4. Select a Job Server and click Edit.
5. Select the Enable SNMP check box.
6. Click OK.
7. To enable SNMP on additional Job Servers, repeat steps 4 through 6.
8. In the Job Server Configuration Editor, select OK.
9. In the Server Manager window, select Restart.
   The Data Integrator Service restarts, which restarts the Job Servers using the new configuration information.

To configure the SNMP agent
After you enable the SNMP agent, you can modify current or default SNMP configuration parameters.
1. Open the Data Integrator Server Manager (Start > Programs > BusinessObjects Data Integrator 6.1 > Server Manager).
2. In the Data Integrator Server Manager, click Edit SNMP Config.
   The SNMP Configuration Editor opens.
3. Select the Enable SNMP on this machine check box to enable the Data Integrator SNMP agent.
4. Select a category and set configuration parameters for your SNMP agent. Parameter categories include:
   •  Job Servers for SNMP — Job Servers enabled for SNMP on this machine
   •  System Variables — Parameters that affect basic agent operation
   •  Access Control, v1/v2c — Parameters that affect how the agent grants NMS applications using the v1 or v2c version of SNMP access to the Data Integrator MIB and Data Integrator supported standard MIBs
   •  Access Control, v3 — Parameters that affect how the agent grants NMS applications using the v3 version of SNMP access to the Data Integrator MIB and Data Integrator supported standard MIBs
   •  Traps — Parameters that determine where the agent sends trap messages
   For details on each parameter category, see the next section, "SNMP configuration parameters".
5. Click OK after you enter the correct configuration parameters for the Data Integrator SNMP agent.

6. In the Data Integrator Server Manager window, click Restart.
   The Data Integrator Service restarts, which restarts the Data Integrator SNMP agent using the new configuration information.

SNMP configuration parameters

Job Servers for SNMP

Use this category to select any number of Job Servers and enable SNMP for them. When SNMP is enabled for a Job Server, the SNMP agent maintains and reports data for jobs that run on that Job Server.

The editor lists each configured Job Server in one of two columns: Not enabled for SNMP or Enabled for SNMP. To enable SNMP for a Job Server that is not enabled, select that Job Server and click Enable. Similarly, to enable SNMP for all the configured Job Servers, click Enable All. To disable SNMP for a Job Server that is enabled, select that Job Server and click Disable; to disable SNMP for all configured Job Servers, click Disable All.

You can also enable or disable SNMP for individual Job Servers using the Job Server Configuration Editor. See "To enable SNMP on a Job Server" on page 555.

System Variables

Use this category to set parameters that affect the SNMP agent's operation.

Minimum SNMP version — Select the earliest version of SNMP that NMS applications will use to query the agent: v1, v2c, or v3. Business Objects recommends that you not use v1. The security mechanism used by v1 is not robust, and trap messages sent by the agent are not compatible with v1.
Note: Some NMS applications use v1 by default. If other devices or agents that the application monitors support v2c or v3, Business Objects recommends that you reconfigure the NMS application. If not, then you must set this value to v1.

Agent port — Enter the port at which the agent listens for commands (PDUs) from NMS applications and responds to those commands. The default port is 161, the standard SNMP input port.

System name — Enter the name of the computer. This text is reported to the NMS application.

System contact — Enter text that describes the person to contact regarding this system. Network monitors might contact this person to resolve identified problems.

System location — Optional. Enter text that identifies this computer, such as physical location information.

JobTable cache lifetime (in min) — Enter the maximum number of minutes a job will remain in the Data Integrator MIB after the job completes. Default lifetime is 1440 (one day). The agent summarizes jobs (that is, eliminates individual data flow and work flow records) after one-eighth of a job's lifetime. The agent eliminates jobs completely after reaching the lifetime limit.

JobTable cache max size (in KB) — Enter the maximum number of kilobytes that the agent can use for the Data Integrator MIB. The default is 819 (0.8 Megabytes), which will store approximately 1000 jobs. If the MIB reaches this size, the agent reduces the MIB by 20 percent by eliminating completed jobs, starting with the oldest jobs.

Access Control, v1/v2c

Use this category to enter the information that allows NMS applications using SNMP version v1 or v2c to access the MIBs controlled by the Data Integrator SNMP agent. If an NMS application monitoring the Data Integrator SNMP agent uses SNMP version v1 or v2c, you must set the Minimum SNMP version to either v1 or v2c under the System Variables category.

The editor lists community names and the type of access permitted:

Read-only — Select Read-only to permit this community to read the Data Integrator MIB only. With this setting, this community is not permitted to send SetRequest commands to any MIB or GetRequest commands to a standard SNMP MIB for trees that contain security information such as community strings, user passwords, or encryption pass phrases.

Read-write — Select Read-write to permit this community to send SetRequest commands for all read-write variables in any MIB and GetRequest commands for variables in any MIB. Remember that selecting this option gives the community the capability of reading and then modifying variables in the standard SNMP MIBs.

To enable access for a new community
1. Click Add.

2. In Community name, enter the community name permitted to send requests to this agent. Typically, the administrator of the NMS application assigns the name. The NMS application includes this name in all requests to the agent. Names are case-sensitive and must be unique.
3. Select the type of access: Read-only or Read-write.
4. Click OK.

To edit a community's name or access
1. Select the community name and click Edit.
2. Change access type and community name as desired.
3. Click OK.

To delete access for a particular community
1. Select the community name.
2. Click Delete.

Access Control, v3

Use this category to enter the information that allows NMS applications using SNMP version v3 to access the MIBs controlled by the Data Integrator SNMP agent. The editor lists user names along with properties of each user:

Read-only — Select Read-only to permit this user to read the Data Integrator MIB only. With this setting, this user is not permitted to send SetRequest commands to any MIB or GetRequest commands to a standard SNMP MIB for a tree that contains security information such as community strings, user passwords, or encryption passphrases.

Read-write — Select Read-write to permit this user to send SetRequest commands for all read-write variables in any MIB and GetRequest commands for variables in any MIB. Remember that selecting this option gives the user the capability of reading and then modifying variables in the standard SNMP MIB.

To enable access for a new user
1. Click Add.
2. Enter appropriate information for the user.

User name — Enter a name of a user to which the Data Integrator SNMP agent will respond. Typically, the administrator of the NMS application assigns the name. The NMS application includes this name in all requests to the agent. Names are case-sensitive, and each name must be unique.

Password — Enter the password for the user. The password is case-sensitive.

Confirm password — Re-enter the password to safeguard against typing errors.

3. Click OK.

To edit a user's name or access data
1. Select the user name and click Edit.
2. Change access data, user name, and password as desired.
3. Click OK.

To delete access for a user
1. Select the user name.
2. Click Delete.

Traps

Use this category to configure where to send traps. A receiver is an NMS application identified by a machine and port. The editor lists the receivers of the trap messages sent by the Data Integrator SNMP agent.

Select the Enable traps for authentication failures check box if you want the agent to send traps when requests fail due to authentication errors, such as incorrect passwords or community names, in addition to traps about job errors.

To add a new trap receiver
1. Click Add.
2. Enter identifying information about the trap receiver.

Machine name — Enter the name of the computer or the IP address of the computer where the agent sends trap messages. This is a computer where an NMS application is installed.

/al_env. To delete a trap receiver Select the trap receiver. 3. Update the identifying information about the trap receiver. Click OK. Find out how you can participate and help to improve our documentation. SNMP configuration on UNIX This section lists the procedures to configure SNMP on UNIX. To change information for a trap receiver Select the trap receiver and click Edit. For more detailed descriptions about the options mentioned here. Monitoring jobs SNMP support 19 Parameters Port Description Enter the port where the NMS application listens for trap messages. . Enter the community name that the NMS application expects in trap messages.sh $ . Community name 3. 1. Click OK. 1. To select a Job Server to support SNMP communication Run the Server Manager. 2. The default value is 162. Enter: $ cd $LINK_DIR/bin/ $ . 1.This document is part of a SAP study on PDF usage./svrcfg Note: The second command sets the environment variables before running the Server Manager. see “SNMP configuration in Windows” on page 554. 2. Click Delete. the standard SNMP output port. Data Integrator Designer Guide 561 .

2. Select option 2 to configure a Job Server.

   ** Data Integrator Server Manager Utility **
   1 : Control Job Service
   2 : Configure Job Server
   3 : Configure Runtime Resources
   4 : Configure Access Server
   5 : Configure Web Server
   6 : Configure SNMP Agent
   7 : Configure SMTP
   8 : Configure HACMP (AIX only)
   x : Exit
   Enter Option: 2

3. Enter option e : Edit a JOB SERVER entry.
4. Enter the serial number of the Job Server you want to work with when you see the following question:
   Enter serial number of Job Server to edit: 1
5. Enter a number that will be used as the SNMP communication port when you see the following question:
   Enter TCP Port Number for Job Server <S1> [19111]:
6. Enter 'y' when prompted with the following question:
   Do you want to manage adapters and SNMP communication for the Job Server 'Server1' 'Y|N' [Y]?:

7. When you return to the Current Job Server Information page, the Job Server set to manage adapters or SNMP is marked with an asterisk and noted below the list of Job Servers.

   ** Current Job Server Information **
   S#   Job Server Name   TCP Port   Enable SNMP   Repository Connection
   --   ---------------   --------   -----------   ---------------------
   1*   Server1           19111      Y             repo1@orasvr1
   2    Server2           19112      N             repo2@orasvr1
   *: JobServer <S1> supports adapter and SNMP communication on port: 19110

   c: Create a new JOB SERVER entry    a: Add a REPO to job server
   e: Edit a JOB SERVER entry          y: Resync a REPO
   d: Delete a JOB SERVER entry        r: Remove a REPO from job server
   u: UPDATE a REPO                    s: Set default REPO
   q: Quit
   Enter Option: q

   To exit the Server Manager, enter q, then enter x.

To enable SNMP on a Job Server
1. Run the Server Manager. Enter:
   $ cd $LINK_DIR/bin/
   $ . ./al_env.sh
   $ ./svrcfg
2. Select option 2 to configure a Job Server.
3. Enter option e : Edit a JOB SERVER entry.
4. Enter the serial number of the Job Server you want to work with.
5. Enter y when prompted with the following question:
   Do you want to Enable SNMP for this JobServer 'Y|N' [N]:
   To exit the Server Manager, enter q, then enter x.

To configure an agent
1. Run the Server Manager. Enter:
   $ cd $LINK_DIR/bin/
   $ . ./al_env.sh
   $ ./svrcfg

2. Select option 6 : Configure SNMP Agent.

   ** Data Integrator Server Manager Utility **
   1 : Control Job Service
   2 : Configure Job Server
   3 : Configure Runtime Resources
   4 : Configure Access Server
   5 : Configure Web Server
   6 : Configure SNMP Agent
   7 : Configure SMTP
   8 : Configure HACMP (AIX only)
   x : Exit
   Enter Option:

3. One of these options appears based on the current configuration:
   SNMP is Disabled for this installation of Data Integrator [ D = Keep Disabled / E = Enable ]? :
   SNMP is Enabled for this installation of Data Integrator [ E = Keep Enabled / D = Disable ]? :
   Once you enable SNMP for Data Integrator, the SNMP configuration menu appears.

The following is a sample SNMP configuration menu screen.

   SYSTEM VARIABLES
   ----------------
   Minimum SNMP Version: v2c
   System Name: hpsrvr3
   System Contact: sysadmtr
   System Location:
   JobTable Cache Lifetime: 1440 (in min)
   JobTable Cache Max Size: 819 (in KB)
   Default Port: 4961

   Access Control, v1/v2c
   ----------------------
   Read Community string : read_v2
   Write Community string : write_v2

   Access Control, v3
   ------------------
   Read User : read_v3
   Write User : write_v3

   SNMP TRAPS
   ----------
   Authentication Traps : Enabled
   Agent is configured to send trap to: aixserver3:20162 with Community String trap_co

   1 - Modify System Variables
   2 - Access Control, v1/v2c
   3 - Access Control, v3
   4 - Modify Traps
   X - Exit to previous Menu
   Enter Option:

•  To modify system variables, choose 1. Values for each variable display. Press ENTER to keep the default values or enter new values for each variable.

•  To modify SNMP v1 and v2c community names, choose 2. A submenu displays. At the prompt, either enter a new value or press RETURN to keep the original value.

   Access Control, v1/v2c
   ----------------------
   Read Community string : read_v2
   R - Add READ Community String
   W - Add WRITE Community String
   D - Delete Community String
   S - Save Changes
   X - Exit without save to Previous Menu
   Enter Option:

•  To modify SNMP v3 USER names, choose 3. A submenu displays. At the prompt, either enter a new value or press RETURN to keep the original value.

   Access Control, v3
   ------------------
   Read User : read_v3
   R - Add READ User
   W - Add WRITE User
   D - Delete User
   S - Save Changes
   X - Exit without save to Previous Menu
   Enter Option:

•  To modify TRAP setup, choose 4. A submenu displays.

   TRAP SETUP
   ----------
   Authentication Traps : Enabled
   Agent is configured to send trap to: aixserver3:20162 with Community String trap_co
   A - Add TRAP receiver
   D - Delete TRAP receiver
   E - Enable sending traps on Authentication failures
   F - Disable sending traps on Authentication failures
   S - Save Changes
   X - Exit without save to Previous Menu
   Enter Option:

   Use this menu to:
   •  Enable/disable authentication traps.

   •  Configure a trap receiver (host name and port number on which the trap receiver is listening, and community string the trap receiver will use to authenticate the traps).

After you enter "S", all additions and deletions display and you are prompted to confirm them. A "+" indicates newly added names. A "-" indicates deleted names.

Troubleshooting

To troubleshoot the Data Integrator SNMP agent
1. Check that you are using a valid v1/v2c community name or v3 user name.
   The SNMP agent does not reply to unauthorized requests; unauthorized requests will fail due to a time-out. To determine whether the agent regards a request as unauthorized:
   a. Under the Traps category of the SNMP Configuration Editor:
      •  Set a trap receiver
      •  Select the Enable traps for authentication failures check box
   b. Restart the Data Integrator SNMP agent.
   c. Inspect the output of the trap receiver for authorization traps.
2. Increase the SNMP agent's time-out (a command-line probe sketch follows this procedure). Repeat until the agent responds or the time-out exceeds 20 seconds. If the agent does not respond and the time-out is more than 20 seconds, revert to the original time-out setting and try the next step.
3. Verify that the agent and the NMS application are using compatible versions of SNMP. Under System Variables in the SNMP Configuration Editor, the Minimum SNMP version must not be greater than the version of the messages that the NMS application sends. For example, if the NMS application sends messages using SNMP version v2c or v3, you can set the Minimum SNMP version to v2c. If the versions are incompatible, change the NMS application setting or the configuration of the Data Integrator SNMP agent.

   Note: Some NMS applications use v1 by default. If other devices or agents that the NMS application monitors support v2c or v3, Business Objects recommends that you reconfigure the NMS application. If not, then you must configure the Data Integrator SNMP agent to accept version v1 commands.
4. Check errors in the SNMP error log: "installation directory"/bin/snmp/snmpd.log. Use the Server Manager to resolve errors.
5. Check that encryption pass phrases are accessible.
   Encryption passphrases are locked to a given SNMP engine ID based on the IP address of the computer running the SNMP agent. If you change the IP address of that computer, you must re-create all your SNMP users in the Data Integrator Server Manager.
6. Contact Business Objects technical support. In your defect report, include:
   •  The exact NMS command you are trying to run and the exact error message
   •  Name and version of the NMS application you are running and the version of SNMP protocol that application uses, if possible
   •  Copies of the following four files from the "installation directory"/bin/snmp directory: snmp.conf, snmpd.conf, snmpd.p.conf, and snmpd.log
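For step 2 of this procedure, a hedged command-line way to probe the agent with an increasing time-out (again assuming the net-snmp tools and the read_v2 community and jobserver01 host from the sample configuration) is:

    # -t = time-out in seconds, -r = retries; raise -t on each attempt,
    # and stop once it exceeds 20 seconds without a reply.
    snmpget -v 2c -c read_v2 -t 5 -r 0 jobserver01:161 SNMPv2-MIB::sysUpTime.0

sysUpTime is part of the standard mib-2 system group that the agent supports, so a valid, authorized request should return promptly regardless of job activity.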

364 B blanks. converting to NULL for Oracle loader 69 Bridge to Business Objects metadata 447 browsing adapter datastore 114 datastores 90 metadata 90 bulk loading. Data Quality 379 ASCII format files 136 attributes creating 56 deleting 56 object 53 setting values 55 table. datastores 111 AL_JobServerLoadBalanceDebug 329 AL_JobServerLoadOSPolling 329 annotations adding 60 deleting 60 resizing 60 using 45 application metadata 111 arechitecture. See semicolons ’ See single quotation marks audit label definition 363 editing 364. Oracle 69 business name table attribute 451 Data Integrator Designer Guide 569 . Find out how you can participate and help to improve our documentation. description 366 auditing data flow description 362 enabling for job execution 371 guidelines 371 viewing failed rule 377 viewing results 377 viewing status 378 auto correct loading 461 A about Data Integrator command 39 Access Server message processing 254 real-time jobs 256 specifying for metadata reporting 66 Adapter Data Exchange Timeout 329 Adapter SDK 112 Adapter Start Timeout 329 adapters. Index Symbols $ See dollar signs .This document is part of a SAP study on PDF usage. 367 definition 363 audit rule Boolean expression examples 365 defining 369 definition 363 audit trace 371 Audit window. 365 generating 364 removing 369 resolving invalid 376 audit notification defining 371 ways to generate 365 audit point defining 366. business name 451 audit function data types 364 definition 363 enabling in embedded data flow 373 error row count statistics 363 errow row count statistics 364 good row count statistics 363.

distributing 192 data flows accessible from tool palette 43 adding sources 180 connecting objects in 45 creating from object library 176 defining 176 defining parameters 302 description 172 designing for automatic recovery 461–463 embedded 283 executing only once 456 execution order 191 570 Data Integrator Designer Guide . 513 close command 34 closing windows 47 columns. using with sources 522 using with mainframes 505 using with Oracle 475–495 using with SQL Server 513. extractiong and parsing XML to 247 comments 51. 75 projects 73 repository 22 while loop 206 work flows 201 current Job Server 66 current schema 189 custom functions displaying 36 object library access 49 saving scripts 212 Cut command 35 CWM 446 D data capturing changes 472 loading 179 missing or bad values 467 problems with.This document is part of a SAP study on PDF usage. overview 473 source-based. processing 466–470 recovering 454–466 testing 329 data cache. database requirements 479 changed-data capture 505 overview 472 source-based. 543 timestamp-based. auditing 194 data flow. adapter 112 file formats 139 jobs 74. displaying 69 century change year 69 change-data capture Oracle. 545 timestamp-based. Index C caching comparison tables 194 lookup tables 194 source tables 194 Calculate Usage Dependency 443 Calculating usage dependencies 443 Callable from SQL statement check box 99 calling reusable objects 31 calls to objects in the object library 31 catch editor 209 catch. defined 474 timestamp-based. 211 Compact Repository command 34 concat_date_time function 166 conditionals defining 204 description 202 editor 203–205 example 203 using for manual recovery 466 configurations See also datastore configurations. selecting over ERP for data 275–280 data flow. Find out how you can participate and help to improve our documentation. See try/catch blocks central repositories. examples 535. system configurations configuring Data Profiling Servers 336 connecting objects 45 contents command 39 converting data types 94 Copy command 35 copying objects 53 creating data flows 176 datastores for databases 85 datastores. when to use 473 target-based 475.

changing 89 persistent cache 106 properties. adapter 112 description 80 exporting 119 importing metadata 94. 191 Data Integrator Designer window 32 objects 30 Data options 69 Data Profiler. View Data 356 data sets. purpose 115 object library access 49 objects in. 404 debugger 418 limitations 436 managing filters and breakpoints 432 setting filters and breakpoints 419 show/hide filters and breakpoints 434 starting and stopping 424 tool bar options 413 using the View Data pane 429 windows 427 debugging scripts 212 decimals 151 declarative specifications 191 default configuration 118 default Job Server. executing only once 178. implementing 80 database connection. overview 87 datastores adapters 111 and database links 110 browsing metadata 91 connecting to databases 85 custom 90 database changes. setting 66 defining conditionals 204 data flows 176 datastores for databases 85 datastores. Index in jobs 173 object library access 49 parameter values. changing 89 datastore editor. Windows 336 Data Quality 378 Data Quality datastore 381 Data Quality projects 383 Data quality. changing 89 default configuration 118 defining. setting 303 parameters in 172. sorting 90 objects in. implementing in datastore 80 connections to datastore. operation codes in 174 data transformations in real-time jobs 254 data types. changing 89 requirements for database links 110 Sybase 26 using multiple configurations 80 date formats 152 DB2 logging in to repository 25 using with parameterized SQL 85 Debug menu 37 Debug options Interactive Debugger 418 View Data 404 View Where Used 398. data generated 335 Data Profiling Server configuring.This document is part of a SAP study on PDF usage. converting 94 database links and datastores 110 defined 109 requirements for datastores 110 databases changes to. 175 passing parameters to 175 resizing in workspace 46 sources 178 steps in 173 targets 179 in work flows 173 data flows in batch jobs. 100 memory 102 multiple configurations. viewing 90 options. 75 nested tables 216 objects 31 Data Integrator Designer Guide 571 . adapter 112 jobs 74. Find out how you can participate and help to improve our documentation.

This document is part of a SAP study on PDF usage. importing 232 object library access 49 duplicate rows 243 E Edit menu 35 editing schemas 94. overview 185 embedded data flow audit points not visible 374 creating by selecting objects within a data flow 286 definition 284 enabling auditing 372 embedded data flows description 283 troubleshooting 293 enabling descriptions. 58 displaying 58 editing 59 enabling system setting 36 hiding 58 resizing 58 using 45 viewing 57 design phase. description 324 errors catching and correcting 208 categories 210 data. system setting 36 object descriptions. system level 57 ending lines in scripts 211 engine processes. specifying files during 157. displaying 69 monitoring job execution 68 options 66 port 67 schema levels displayed 67 window 32 dimension tables. Global_DOP. processing 466–470 572 Data Integrator Designer Guide . opening 181 transform. Index parameters 302 projects 73 reusable objects 51 try/catch blocks 209 variables 302. 304 while loops 206 Degree of parallelism. maximum number 69 environment variables 314 ERP system reducing direct requests to 281 selecting over data cache 275–280 error log files. missing values in 470 disabling object descriptions 59 disconnecting objects 45 Display DI Internal Jobs 330 distinct rows and nested data 243 document type definition. setting 330 Delete command Edit menu 35 Project menu 34 DELETE operation code 175 deleting annotations 60 lines in diagrams 45 objects 62 reusable objects from diagrams 63 descriptions adding 58. importing 96 DTD See also XML messages format of XML message 232 metadata. object level 57 descriptions. Find out how you can participate and help to improve our documentation. opening 187 transform. See DTD dollar signs ($) in variable names 211 variables in parameters 304 domains importing automatically 68 metadata. description 328 error logs. 114 editor catch 209 conditional 203–205 file format 137 object. description of 50 query 189 script 212 table. 158 Designer central repositories.

missing dimension values 470 Field ID 159 file format file transfers using a custom program 160 file format editor editing file format 148 modes 137 navigation 138 specifying multiple files 149 work areas 137 file formats creating 139 creating from a table 147 date formats for fields 152 delimited 141 editing 137. creating 148 target. 49 FROM clause for nested tables 237–239 FTP Number of Retry 330 FTP. connection retry interval. variables in 156.This document is part of a SAP study on PDF usage. recovering from 454–466 Exit command 34 exporting files. 230 new 140. 158. Index debugging object definitions 320 messages 320 not caught by try/catch blocks 208 sample solutions 209 severity 320 exceptions See also try/catch blocks automatic recovery and 454 available 210 categories 210 implementing handling 208 sample solutions 209 try/catch blocks and 210 exchanging metadata 446 executing jobs data flows in 191 immediately 321 work flows in 200 execution enabling recovery for 455 order in data flows 191 order in work flows 199 unsuccessful. using 143 source. viewing 92 F fact tables. 157. metadata exchange 447 external tables. 230 multi-byte characters in name 156 reading multiple 149 reading multiple XML files 228 specifying in Data Integrator 139 filtering to find missing or bad values 467 formats 49. 148 file names. 158 fixed-width 136 identifying source 150. specifying 157. creating 148 variables in name 314 Web log. setting 330 functions application 280 concat_date_time 166 contrasted with transforms 184 editing 94. Find out how you can participate and help to improve our documentation. using in 211 WL_GetKeyValue 166 word_ext 166 G global variables creating 304 viewing 305 Data Integrator Designer Guide 573 . metadata. 314 fixed width 141 identifying source 150. example 167 Web logs 165 file transfers 160 Custom transfer program options for flat files 162 files delimited 136 design phase. 114 metadata imported 95 scripts. 158 number 151 object library access 49 overview 136 reading multiple files 149 replicating 145 sample files.

100 History_Preserving transform 541 I icons. displaying names 42. 49 IDoc reduced message type. enabling 455 recovery mode. See conditionals Ignore Reduced Msg Type 330 Ignore Reduced Msg Type_fooSAP R/3 reduced message type. connecting to 81 lines connecting objects in diagrams 45 ending script 211 Linked datastores 109 linked datastores 110 loading data changed data 472 objects 179 local object library 47 local repository. ignoring a specific type 330 importing metadata adapters 114 into datastore 96. metadata. Index graphical user interface See Designer H Help menu 39 hiding object descriptions 59 hierarchies. creating 22 log files statistics 328 viewing during job execution 325 logging in DB2 25 Designer 22–27 Oracle 24 repository version 23 SQL Server 25 Sybase 26 logs. 100 DTD 232 using Metadata Exchange 447 information messages 320 INSERT operation code 175 intermediate results. See data sets object library access 49 objects in 74 organizing complex 74 parameter values. setting 303 recovery mode. from the Designer 323 testing 318. using for 535–539 J Job Server associating with repository 23 default 66 default. copying 325 lookup function generating keys. importing 94. under Tools > Options 69 SNMP configuration 549 job server LoadBalanceDebug option 329 LoadOSPolling option 329 jobs creating 74 debugging execution 324–329 defining 75 executing 321 monitoring 68 M mainframes 574 Data Integrator Designer Guide . running in 457–458 resizing in workspace 46 stopping. 321–323 troubleshooting file sources feeding 2 queries 330 validation 68 K Key_Generation transform 535 L legacy systems.This document is part of a SAP study on PDF usage. ignoring 330 reduced message type. ignoring a specific type 330 if/then/else in work flows. changing options for 329 options. Find out how you can participate and help to improve our documentation.

information imported 95 imported tables. 114 reporting 66 tables. information imported 94 Universe Builder 447. removing duplicate rows 243 nested tables creating 240 FROM clause 237–239 in real-time jobs 257 SELECT statement 236 unnesting data 243–246 unnesting for use in transforms 246 viewing structure 218 New command 34 NMS application. Find out how you can participate and help to improve our documentation. 107 script functions for 105 troubleshooting 105 update schema option 104 menu bar 33 menus Debug 37 Edit 35 Help 39 Project 34 Tools 36 Validation 38 View 35 Window 38 messages See also real-time jobs error 320 information 320 warning 320 metadata analysis categories 443 application 111. deleting 56 attributes. determining 92 exchanging files with external applications 446 external tables. creating 56 attributes. creating 102. characters displayed 67 naming 53 Data Integrator Designer Guide 575 . setting 55 calling existing 52 connecting 45 copying 53 Data Integrator 30 defining 52 descriptions 57 editors 50 imported. 92 functions. viewing 90 in jobs 74 names. logging in to repository 25 MIMB 446 Monitor tab 323 monitor. 114 changes in. viewing 91. opening on job execution 68 multi-user owner renaming 128 using aliases and owner renaming 126 N naming conventions. 449 metadata exchange exporting a file 447 importing a file 447 Microsoft SQL Server. creating with 201 objects annotations 59 attributes. 100. Index connecting to 81 using changed-data capture 505 memory datastores. viewing 91 importing 96. objects 76 naming objects 53 nested data. relationship to SNMP agent 548 NORMAL operation code 175 number formats 151 O object library creating reusable object in 51 deleting objects from 62 local 47 objects available in 49 opening 48 tabs in 49 work flows. 107 memory tables 102 create row ID option 104 creating 103.This document is part of a SAP study on PDF usage.

175 dates as 187 default 67 defining 303 example 175 passing automatically 67 passing to data flows 175 setting values passed in 303 syntax for values 304 times as 187 Paste command 35 PeopleSoft. Data Quality 383 projects defining 73 definition 72 object library access 49 propagating schema edits 94. 304 R reading data. with database links 110 persistent cache datastore 106 Persistent cache datastores 106 ports. importing metadata 96 performance changed-data capture and 472. converting blanks 69 logging into repository 24 package 98 troubleshooting parallel data flows 330 using changed-data capture 475–495 output schema. using for 461–463 pre-packaged adapters 112 preserving history changed-data capture and 473 Print command 34 print function 213 Print Setup command 34 project area 41 project menu 34 Project. viewing 53 relationships 31 relationships among 31 renaming 53 reusable 30 searching for 63 single-use 31 sorting in object library 90 OCI Server Attach Retry 330 Open command 34 opening projects 73 operation codes. 535 improving. 114 properties definition 30 object. See sources real-time jobs Access Server.This document is part of a SAP study on PDF usage. single 211. limiting 67 query transforms compared to SQL SELECT statements 191 output schema. listing of 175 options Designer 66 versus properties 30 Options window 66 Oracle bulk loading. See tool palette Palette command 35 parameters in data flows 172. importing procedure from 98 palette. displaying 36 overflow files 466–467 P package. Index properties. viewing 53 versus options 30 pushing operations to database stored procedure restrictions 99 Q query editor description 189 schema tree. auto filling 188 overview 187 in real-time jobs 257 quotation marks. filling automatically 188 output window. Designer 67 preload SQL commands job recovery and 461 recovery. requirement for 254 adding to a project 263 576 Data Integrator Designer Guide . Find out how you can participate and help to improve our documentation.

456 work flows. executing once 456 recovery. associating with 23 Microsoft SQL Server. specifying as unit 456–457 recovery. Find out how you can participate and help to improve our documentation. in work flows 199 Refresh command 36 renaming objects 53 replicating file format templates 145 objects 53 repository creating 22 DB2. using for 461 data flows in batch jobs. ignoring 330 Save All command 34 Save command 34 saving projects 74 reusable objects 62 scripts 212 scaling workspace 46 schemas changes in imported data 92 editing 94. supplementary 267. 114 levels displayed 67 tree elements in editor. 461– 463 results saved 458 starting 457 try/catch block and 459 variables for 462–463 work flows. See execution S SAP R/3 reduced message type. limiting 67 script editor 212 scripting language 313 scripts adding from tool palette 43 debugging 212 elements of 211 examples 211 saving 212 syntax 211 writing 212 searching for objects 63 secondary index information for tables 93 SELECT statements Data Integrator Designer Guide 577 . logging in 25 object library. automatic auto correct load option. correcting 457 overview 454 preload SQL commands. logging in 25 Job Server. automatic for batch jobs data flows. logging in 24 storing object definitions in 31 versions 23 Reset Users window 26 reusable objects calling existing 52 creating 51 defining 51 deleting from object library 62 list of 49 reusing 31 saving 62 single definition 31 storing 31 run. using for 463 recursions. Index branching 275–280 cached data or ERP data. manual conditionals. executing once 201. choosing 275–280 compared to batch jobs 255 creating 263 description 255 examples 257–258 message types 256–257 message-response processing 254 RFC calls in 280 sources. using for 466 designing work flows for 463–466 status table. relationship to 48 Oracle. 272–274 testing 270 transactional loading 268 record length field 159 recovery.This document is part of a SAP study on PDF usage. executing once 178 enabling 455 executing path during 458 failures. using for 461.

restriction on SQL 99 storing reusable objects 31 strings comments in 211 in scripts 211 Sybase datastore 26 log in 26 syntax debugging object definitions 319 values in parameters 304 system configurations creating 132 defining 131 displaying 36 exporting 133. defined 556 Access Control.) in scripts 211 simple network management protocol 548 single quotation marks (') in scripts 211 string values in parameters 304 single-use objects description 31 list of 42 SNMP enable for a Job Server on UNIX 563 enable for a Job Server on Windows 555 SNMP agent configuration and architecture 549 configure on UNIX 563 configure on Windows 556 defined 548 events and NMS commands 553 real-time jobs and cycle count 553 relationship to Job Server 549 relationship to MIB 550 relationship to NMS application 548 status of jobs and Job Servers. importing automatically 68 editing 94. See environment variables T table comparison 475 Table_Comparison transform changed data capture. 199 stored procedures. viewing metadata for 91 importing domains 68 importing metadata 94. Find out how you can participate and help to improve our documentation. 133 system variables. defined 560 sources data flows 178 editor. description 328 statistics logs. using for 539 tables adding to data flows as sources 180 caching for comparisons 194 caching for inner joins 194 caching for lookups 194 domains. viewing metadata 92 external. 114 editor. opening 181 files 139 Splitter Optimization 330 SQL Server log in 25 SQL Server. defined 556 traps. 100 loading in single transaction 268 memory 102 metadata imported 94 schema. defined 550 troubleshooting 567 SNMP agent parameters 556–561 Access Control v3. defined 559 Access Control. using changed-data capture 513 statistics log files. 557 System Variables. description 324 Status Bar command 35 status table 463 steps in data flows 173 in jobs 198 in work flows 198. defined 556 Access Control.This document is part of a SAP study on PDF usage. Index equivalent in Data Integrator 191 for nested tables 236 semicolons (. opening 181 external. determining changes in 92 template 181 target-based changed-data capture 474 578 Data Integrator Designer Guide .v1/v2c. defined 556. viewing metadata for 91 imported. v3. defined 557 Traps. defined 558 Job Servers for SNMP. v1/v2c.

limiting in editor 67 trees. description 327 trace logs description 324 open on job execution 325 transactional loading 268 transforms contrasted with functions 184 editors 185 inputs 187 and nested data 246 object library access 49 query 187 in real-time jobs 254 schema tree. Find out how you can participate and help to improve our documentation. rules for 313 linking to parameters 67 local 300–304 local. Index targets changed-data capture 545 data flows 178 evaluating results 324. See try/catch blocks true/false in work flows. See Designer users. resetting 26 V validating jobs before execution 68 Validation menu 38 variables environment 314 file names.This document is part of a SAP study on PDF usage. importing 96 tries. avoiding 529 overlaps. using in 156 global 304–313 global. rules for 313 in names of file formats 314 overview 296–298 Data Integrator Designer Guide 579 . See conditionals try/catch blocks automatic recovery restriction 459 catch editor 209 defining 209 description 208 example 209 from tool palette 43 U Undo command 35 Universe Builder 447. creating with 201 toolbar 39 Toolbar command 35 Tools menu 36 trace log files. PeopleSoft metadata. 329 files 139 generating keys 535 overflow files for 466 preserving history 540 template tables converting to tables 182–184 using 181 testing real-time jobs 270 Timestamp-based change-data capture with sources overview 523 Timestamp-based changed-data capture with sources create and update 534 create-only 534 examples 524 overlaps 527 overlaps. reconciling 529 presampling 530 processing timestamps 523 sample data flows and work flows 525 update-only 534 tool palette defining data flows with 177 description 42 displaying 35 work flows. 449 create or update a universe 449 mappings between repository and universe data types 450 metadata mappings between a repository and universe 450 unnesting data in nested tables 243–246 UPDATE operation code 175 UseDomainName 331 UseExplicitDatabaseLinks 331 user interface.

variables (continued)
    in R/3 data flows 302
    in scripts 211
    passing automatically 67
    recovery, use for 462–463
    system 314
Variables and Parameters window, using 298
variables as file names
    for lookup_ext function 314
    for sources and targets 314
versions, repository 23
view data 356, 404
    overview 404
    set sample size 409
    tool bar options 413
    using with while loops 208
View menu 35
View where used, Designer option 398, 404

W
warning messages 320
Web logs
    Data Integrator support for 165
    overview 165
Where used
    Designer option 398, 404
    selecting before deleting an object 62
while loops
    defining 206
    design considerations 205
    view data 208
Window menu 38
windows
    closing 47
    Options 66
WL_GetKeyValue function 166
word_ext function 166
work flows
    adding parameters 303
    calling other work flows 199, 200
    conditionals in 202
    connecting objects in 45
    connecting steps 199
    creating 201
    data flows in 173
    defining parameters 302
    defining variables 302
    description 198
    designing for automatic recovery 461–463
    designing for manual recovery 463–466
    example 200
    executing only once 200, 201, 456
    execution order 199
    from tool palette 43
    independent steps in 199
    multiple steps in 200
    object library access 49
    parameter values, setting 303
    purpose of 198
    recovering as a unit 456–457
    resizing in workspace 46
    scripts in 211
    steps in 198
    try/catch blocks in 208
    variables, passing automatically 67
workspace
    annotations 45
    arranging windows 47
    characters displayed 67
    closing windows 47
    description 44
    descriptions in 45
    scaling 46
writing data 179

X
XML data, extracting and parsing to columns 247
XML files
    editing 94, 114
    reading multiple files 228
    as targets 271
XML messages 219
    editing 94, 114
    sample for testing 270
    viewing schema 266
XML Schema
    importing metadata 94
    object library access 49
XML source editor, specifying multiple files 228

Y
years, interpreting two digits 69


