Sections

  • Data Integrator Designer Guide
  • Welcome
  • Overview of this document
  • Audience and assumptions
  • More Data Integrator product documentation
  • About this chapter
  • Creating a Data Integrator repository
  • Associating the repository with a Job Server
  • Entering repository information
  • Version restrictions
  • Oracle login
  • Microsoft SQL Server login
  • IBM DB2 login
  • Sybase ASE login
  • Resetting users
  • Data Integrator objects
  • Reusable objects
  • Single-use objects
  • Object hierarchy
  • Designer window
  • Menu bar
  • Project menu
  • Edit menu
  • View menu
  • Tools menu
  • Debug menu
  • Validation menu
  • Window menu
  • Help menu
  • Toolbar
  • Project area
  • Tool palette
  • Workspace
  • Moving objects in the workspace area
  • Connecting and disconnecting objects
  • Describing objects
  • Scaling the workspace
  • Arranging workspace windows
  • Closing workspace windows
  • Local object library
  • Object editors
  • Working with objects
  • Creating new reusable objects
  • Changing object names
  • Viewing and changing object properties
  • Creating descriptions
  • Creating annotations
  • Saving and deleting objects
  • Searching for objects
  • General and environment options
  • Designer — Environment
  • Designer — General
  • Designer — Graphics
  • Designer — Central Repository Connections
  • Data — General
  • Job Server — Environment
  • Job Server — General
  • Projects
  • Objects that make up a project
  • Creating new projects
  • Opening existing projects
  • Saving projects
  • Jobs
  • Creating jobs
  • Naming conventions for objects in jobs
  • Datastores
  • What are datastores?
  • Database datastores
  • Mainframe interface
  • Defining a database datastore
  • Changing a datastore definition
  • Browsing metadata through a database datastore
  • Importing metadata through a database datastore
  • Imported table information
  • Imported stored function and procedure information
  • Ways of importing metadata
  • Reimporting objects
  • Memory datastores
  • Memory table target options
  • Persistent cache datastores
  • Linked datastores
  • Adapter datastores
  • Defining an adapter datastore
  • Browsing metadata through an adapter datastore
  • Importing metadata through an adapter datastore
  • Creating and managing multiple datastore configurations
  • Definitions
  • Why use multiple datastore configurations?
  • Creating a new configuration
  • Adding a datastore alias
  • Portability solutions
  • Migration between environments
  • Multiple instances
  • OEM deployment
  • Multi-user development
  • Job portability tips
  • Renaming table and function owner
  • Defining a system configuration
  • What are file formats?
  • File format editor
  • Creating file formats
  • Creating a new file format
  • Modeling a file format on a sample file
  • Replicating and renaming file formats
  • Creating a file format from an existing flat table schema
  • Editing file formats
  • File format features
  • Reading multiple files at one time
  • Identifying source file names
  • Number formats
  • Ignoring rows with specified markers
  • Date formats at the field level
  • Error handling for flat-file sources
  • Creating COBOL copybook file formats
  • File transfers
  • Custom transfer system variables for flat files
  • Custom transfer options for flat files
  • Setting custom transfer options
  • Design tips
  • Web log support
  • Word_ext function
  • Concat_date_time function
  • WL_GetKeyValue function
  • Data Flows
  • What is a data flow?
  • Naming data flows
  • Data flow example
  • Steps in a data flow
  • Data flows as steps in work flows
  • Intermediate data sets in a data flow
  • Operation codes
  • Passing parameters to data flows
  • Creating and defining data flows
  • Source and target objects
  • Source objects
  • Target objects
  • Adding source or target objects to data flows
  • Template tables
  • Transforms
  • Transform editors
  • Adding transforms to data flows
  • Query transform overview
  • Adding a Query transform to a data flow
  • Query editor
  • Data flow execution
  • Push down operations to the database server
  • Distributed data flow execution
  • Load balancing
  • Caches
  • Audit Data Flow Overview
  • What is a work flow?
  • Steps in a work flow
  • Order of execution in work flows
  • Example of a work flow
  • Creating work flows
  • Conditionals
  • While loops
  • Design considerations
  • Defining a while loop
  • Using a while loop with View Data
  • Try/catch blocks
  • Categories of available exceptions
  • Scripts
  • Debugging scripts using the print function
  • Nested Data
  • What is nested data?
  • Representing hierarchical data
  • Formatting XML documents
  • Importing XML Schemas
  • Importing XML schemas
  • Importing abstract types
  • Importing substitution groups
  • Specifying source options for XML files
  • Reading multiple XML files at one time
  • Mapping optional schemas
  • Using Document Type Definitions (DTDs)
  • Generating DTDs and XML Schemas from an NRDM schema
  • Operations on nested data
  • Overview of nested data and the Query transform
  • FROM clause construction
  • Nesting columns
  • Using correlated columns in nested data
  • Distinct rows and nested data
  • Grouping values across nested schemas
  • Unnesting nested data
  • How transforms handle nested data
  • XML extraction and parsing for columns
  • Sample Scenarios
  • Overview
  • Request-response message processing
  • What is a real-time job?
  • Real-time versus batch
  • Messages
  • Real-time job examples
  • Creating real-time jobs
  • Real-time job models
  • Single data flow model
  • Multiple data flow model
  • Using real-time job models
  • Creating a real-time job
  • Real-time source and target objects
  • Secondary sources and targets
  • Transactional loading of tables
  • Design tips for data flows in real-time jobs
  • Testing real-time jobs
  • Executing a real-time job in test mode
  • Using an XML file target
  • Building blocks for real-time jobs
  • Supplementing message data
  • Branching data flow based on a data cache value
  • Calling application functions
  • Designing real-time applications
  • Reducing queries requiring back-office application access
  • Messages from real-time jobs to adapter instances
  • Real-time service invoked by an adapter instance
  • Example of when to use embedded data flows
  • Creating embedded data flows
  • Using the Make Embedded Data Flow option
  • Creating embedded data flows from existing flows
  • Using embedded data flows
  • Testing embedded data flows
  • Troubleshooting embedded data flows
  • The Variables and Parameters window
  • The Variables and Parameters window opens
  • Using local variables and parameters
  • Parameters
  • Passing values into data flows
  • Defining local variables
  • Defining parameters
  • Using global variables
  • Creating global variables
  • Viewing global variables
  • Setting global variable values
  • Local and global variable rules
  • Naming
  • Replicating jobs and work flows
  • Importing and exporting
  • Environment variables
  • Setting file names at run-time using variables
  • Overview of Data Integrator job execution
  • Preparing for job execution
  • Validating jobs and job components
  • Ensuring that the Job Server is running
  • Setting job execution options
  • Executing jobs as immediate tasks
  • Monitor tab
  • Log tab
  • Debugging execution errors
  • Using Data Integrator logs
  • Examining trace logs
  • Examining monitor logs
  • Examining error logs
  • Examining target data
  • Changing Job Server options
  • Chapter overview
  • Using the Data Profiler
  • Data sources that you can profile
  • Connecting to the profiler server
  • Profiler statistics
  • Column profile
  • Basic profiling
  • Detailed profiling
  • Relationship profile
  • Executing a profiler task
  • Submitting column profiler tasks
  • Submitting relationship profiler tasks
  • Monitoring profiler tasks using the Designer
  • Viewing the profiler results
  • Viewing column profile data
  • Viewing relationship profile data
  • Using View Data to determine data quality
  • Data tab
  • Profile tab
  • Relationship Profile or Column Profile tab
  • Using the Validation transform
  • Analyze column profile
  • Define validation rule based on column profile
  • Using Auditing
  • Auditing objects in a data flow
  • Accessing the Audit window
  • Defining audit points, rules, and action on failure
  • Guidelines to choose audit points
  • Auditing embedded data flows
  • Enabling auditing in an embedded data flow
  • Audit points not visible outside of the embedded data flow
  • Resolving invalid audit labels
  • Viewing audit results
  • Job Monitor Log
  • Job Error Log
  • Metadata Reports
  • Data Cleansing with Data Integrator Data Quality
  • Overview of Data Integrator Data Quality architecture
  • Data Quality Terms and Definitions
  • Overview of steps to use Data Integrator Data Quality
  • Creating a Data Quality datastore
  • Importing Data Quality Projects
  • Using the Data Quality transform
  • Mapping input fields from the data flow to the project
  • Creating custom projects
  • Data Quality blueprints for Data Integrator
  • Using View Where Used
  • From the object library
  • From the workspace
  • Using View Data
  • Accessing View Data
  • Viewing data in the workspace
  • View Data properties
  • Filtering
  • Sorting
  • View Data tool bar options
  • View Data tabs
  • Column Profile tab
  • Using the interactive debugger
  • Before starting the interactive debugger
  • Changing the interactive debugger port
  • Starting and stopping the interactive debugger
  • Windows
  • Filters and Breakpoints window
  • Menu options and tool bar
  • Viewing data passed by transforms
  • Push-down optimizer
  • Comparing Objects
  • Overview of the Difference Viewer window
  • To change the color scheme
  • Navigating through differences
  • Calculating usage dependencies
  • Metadata exchange
  • Importing metadata files into Data Integrator
  • Exporting metadata files from Data Integrator
  • Creating Business Objects universes
  • Mappings between repository and universe metadata
  • Attributes that support metadata exchange
  • Recovery Mechanisms
  • Recovering from unsuccessful job execution
  • Automatically recovering jobs
  • Enabling automated recovery
  • Marking recovery units
  • Running in recovery mode
  • Ensuring proper execution path
  • Using try/catch blocks with automatic recovery
  • Ensuring that data is not duplicated in targets
  • Using preload SQL to allow re-executable data flows
  • Manually recovering jobs using status tables
  • Processing data with problems
  • Using overflow files
  • Filtering missing or bad values
  • Handling facts with missing dimensions
  • Understanding changed-data capture
  • Full refresh
  • Capturing only changes
  • Source-based and target-based CDC
  • Using CDC with Oracle sources
  • Overview of CDC for Oracle databases
  • Setting up Oracle CDC
  • CDC datastores
  • Importing CDC data from Oracle
  • Viewing an imported CDC table
  • Configuring an Oracle CDC source
  • Creating a data flow with an Oracle CDC source
  • Maintaining CDC tables and subscriptions
  • Limitations
  • Using CDC with DB2 sources
  • Guaranteed delivery
  • Setting up DB2
  • Setting up Data Integrator
  • CDC Services
  • Importing CDC data from DB2
  • Configuring a DB2 CDC source
  • Using CDC with Attunity mainframe sources
  • Setting up Attunity CDC
  • Importing mainframe CDC data
  • Configuring a mainframe CDC source
  • Using mainframe check-points
  • Using CDC with Microsoft SQL Server databases
  • Overview of CDC for SQL Server databases
  • Setting up SQL Replication Server for CDC
  • Importing SQL Server CDC data
  • Configuring a SQL Server CDC source
  • Using CDC with timestamp-based sources
  • Processing timestamps
  • Overlaps
  • Overlap avoidance
  • Overlap reconciliation
  • Presampling
  • Types of timestamps
  • Create-only timestamps
  • Update-only timestamps
  • Create and update timestamps
  • Timestamp-based CDC examples
  • Preserving generated keys
  • Using the lookup function
  • Comparing tables
  • Preserving history
  • Additional job design tips
  • Header and detail synchronization
  • Capturing physical deletions
  • Using CDC for targets
  • Administrator
  • SNMP support
  • About the Data Integrator SNMP agent
  • Job Server, SNMP agent, and NMS application architecture
  • About SNMP Agent’s Management Information Base (MIB)
  • About an NMS application
  • Configuring Data Integrator to support an NMS application
  • SNMP configuration parameters
  • Job Servers for SNMP
  • System Variables
  • Access Control, v1/v2c
  • Access Control, v3
  • Traps
  • Troubleshooting
  • Index

Data Integrator Designer Guide


This document is part of a SAP study on PDF usage. See the last page of this document and find out how you can participate and help to improve our documentation.

Data Integrator 11.7.2 for Windows and UNIX


Copyright

If you find any problems with this documentation, please report them to Business Objects S.A. in writing at documentation@businessobjects.com. Copyright © Business Objects S.A. 2007. All rights reserved.

Trademarks

Business Objects, the Business Objects logo, Crystal Reports, and Crystal Enterprise are trademarks or registered trademarks of Business Objects SA or its affiliated companies in the United States and other countries. All other names mentioned herein may be trademarks of their respective owners.

Third-party contributors

Business Objects products in this release may contain redistributions of software licensed from third-party contributors. Some of these individual components may also be available under alternative licenses. A partial listing of third-party contributors that have requested or permitted acknowledgments, as well as required notices, can be found at: http://www.businessobjects.com/thirdparty

Patents

Business Objects owns the following U.S. patents, which may cover products that are offered and sold by Business Objects: 5,555,403; 6,247,008 B1; 6,578,027 B2; 6,490,593; and 6,289,352.

Date

April 26, 2007


Contents
Chapter 1     Introduction   17
Chapter 2     Logging in to the Designer   21
Chapter 3     Designer user interface   29
Chapter 4     Projects and Jobs   71
Chapter 5     Datastores   79
Chapter 6     File Formats   135
Chapter 7     Data Flows   171
Chapter 8     Work Flows   197
Chapter 9     Nested Data   215
Chapter 10    Real-time jobs   253
Chapter 11    Embedded Data Flows   283
Chapter 12    Variables and Parameters   295
Chapter 13    Executing Jobs   317
Chapter 14    Data Quality   333
Chapter 15    Design and Debug   397
Chapter 16    Exchanging metadata   445
Chapter 17    Recovery Mechanisms   453
Chapter 18    Techniques for Capturing Changed Data   471
Chapter 19    Monitoring jobs   547
Index   569

Chapter 1: Introduction

This chapter discusses these topics: • • • Overview of this document Audience and assumptions More Data Integrator product documentation Overview of this document The book contains two kinds of information: • • Conceptual information that helps you understand the Data Integrator Designer and how it works Procedural information that explains in a step-by-step manner how to accomplish a task While you are learning about the product While you are performing tasks in the design and early testing phase of your data-movement projects As a general source of information during any phase of your projects You will find this book most useful: • • • Audience and assumptions This and other Data Integrator product documentation assumes the following: • • • • You are an application developer. RDBMS. business intelligence. and load data from databases and applications into a data warehouse used for analytic and on-demand queries. or database administrator working on data extraction. data warehousing. You can also use the Designer to define logical paths for processing message-based queries and transactions from Web-based. transform. and messaging concepts.This document is part of a SAP study on PDF usage. 18 Data Integrator Designer Guide . You understand your organization’s data needs. The Data Integrator Designer provides a graphical user interface (GUI) development environment in which you define data application logic to extract. You understand your source data systems. You are familiar with SQL (Structured Query Language). and back-office applications. consultant. or data integration. frontoffice. 1 Introduction Welcome Welcome Welcome to the Data Integrator Designer Guide. Find out how you can participate and help to improve our documentation.

• If you are interested in using this product to design real-time processing, you should be familiar with:
  • DTD and XML Schema formats for XML files
  • Publishing Web Services (WSDL, HTTP, and SOAP protocols, etc.)
• You are familiar with Data Integrator installation environments—Microsoft Windows or UNIX.

More Data Integrator product documentation
Consult the Data Integrator Getting Started Guide for:
• An overview of Data Integrator products and architecture
• Data Integrator installation and configuration information
• A list of product documentation and a suggested reading path

After you install Data Integrator, you can view technical documentation from many locations. To view documentation in PDF format, you can:
• If you accepted the default installation, select Start > Programs > Business Objects > Data Integrator > Data Integrator Documentation and select:
  • Release Notes—Opens this document, which includes known and fixed bugs, migration considerations, and last-minute documentation corrections
  • Release Summary—Opens the Release Summary PDF, which describes the latest Data Integrator features
  • Technical Manuals
  • Tutorial—Opens the Data Integrator Tutorial PDF, which you can use for basic stand-alone training purposes
• Select one of the following from the Designer's Help menu:
  • Release Notes
  • Release Summary
  • Technical Manuals
  • Tutorial

Other links from the Designer's Help menu include:
• DIZone—Opens a browser window to the DI Zone, an online resource for the Data Integrator user community
• Knowledge Base—Opens a browser window to Business Objects' Technical Support Knowledge Exchange forum (access requires registration)

You can also view and download PDF documentation, including Data Integrator documentation for previous releases (including Release Summaries and Release Notes), by visiting the Business Objects documentation Web site at http://support.businessobjects.com/documentation/.

You can also open Help, using one of the following methods:
• Choose Contents from the Designer's Help menu.
• Click objects in the object library or workspace and press F1. Online Help opens to the subject you selected.
Use Online Help's links and tool bar to navigate.

Chapter 2: Logging in to the Designer

About this chapter
This chapter describes how to log in to the Data Integrator Designer. When you log in to the Data Integrator Designer, you are actually logging in to the database you defined for the Data Integrator repository. This chapter discusses:
• Creating a Data Integrator repository
• Associating the repository with a Job Server
• Entering repository information
• Resetting users

Creating a Data Integrator repository
You must configure a local repository to log in to Data Integrator. Typically, you create a repository during installation. However, you can create a repository at any time using the Data Integrator Repository Manager. Data Integrator repositories can reside on Oracle, Microsoft SQL Server, IBM DB2, or Sybase ASE.

To create a local repository
1. Define a database for the local repository using your database management system.
2. From the Start menu, choose Programs > Business Objects > Data Integrator > Repository Manager (assuming you installed Data Integrator in the Data Integrator program group).
3. In the Repository Manager window, enter the database connection information for the repository and select Local for repository type.
4. Click Create. This adds the Data Integrator repository schema to the specified database.
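Step 1 is done outside Data Integrator with your database tools, and the details depend on your database platform and site standards. Purely as a rough, hypothetical sketch for an Oracle repository (the user name, password, tablespace, and privileges below are placeholders to confirm with your DBA):

    -- Create a dedicated schema to hold the Data Integrator repository tables
    CREATE USER di_repo IDENTIFIED BY di_repo_pwd
      DEFAULT TABLESPACE users
      QUOTA UNLIMITED ON users;
    -- Grant the basic privileges the repository schema needs to create its objects
    GRANT CREATE SESSION, CREATE TABLE, CREATE VIEW, CREATE SEQUENCE, CREATE PROCEDURE TO di_repo;

In steps 3 and 4, the Repository Manager connects as this user and builds the repository tables in that schema.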

Associating the repository with a Job Server
Each repository must be associated with at least one Job Server, which is the process that starts jobs. When running a job from a repository, you select one of the associated Job Servers. You can link any number of repositories to a single Job Server; the same Job Server can run jobs stored on multiple repositories. In production environments, you can balance loads appropriately.

Typically, you define a Job Server and link it to a repository during installation. However, you can define or edit Job Servers or links between repositories and Job Servers at any time using the Data Integrator Server Manager. To create a Job Server for your local repository, open the Data Integrator Server Manager: from the Start menu, choose Programs > Business Objects > Data Integrator > Server Manager (assuming you installed Data Integrator in the Data Integrator program group). See the Data Integrator Getting Started Guide for detailed instructions.

Entering repository information
To log in, enter the connection information for your Data Integrator repository. The required information varies with the type of database containing the repository. This section discusses:
• Version restrictions
• Oracle login
• Microsoft SQL Server login
• IBM DB2 login
• Sybase ASE login

Version restrictions
Your repository version must be associated with the same major release as the Designer and must be less than or equal to the version of the Designer. During login, Data Integrator alerts you if there is a mismatch between your Designer version and your repository version. After you log in, you can view Data Integrator and repository versions by selecting Help > About Data Integrator. For example, Designer 11.7 can access repositories 11.0, 11.5, 11.6, and 11.7 (equal to or less than), but not repository 6.5 (different major release version). So, in this example, repository 11.0 is the earliest repository version that could be used with Designer version 11.7.

Some features in the current release of the Designer might not be supported if you are not logged in to the latest version of the repository.

Oracle login
From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer.
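For an Oracle repository, the Database connection name field described below is resolved by the Oracle client on the Designer machine, typically through a TNSnames.ora entry or Net Service Name. The following is only a minimal sketch of such an entry; the alias, host, port, and service name are placeholders for your environment:

    DI_REPO =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = orcl))
      )

With an entry like this in place, you would type DI_REPO in the Database connection name field.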

In the Repository Login window, complete the following fields:
• Database type — Select Oracle.
• Database connection name — The TNSnames.ora entry or Net Service Name of the database.
• User name and Password — The user name and password for a Data Integrator repository defined in an Oracle database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.

Microsoft SQL Server login
From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer. For a Microsoft SQL Server repository, you must complete the following fields:
• Database type — Select Microsoft_SQL_Server.
• Database server name — The database server name.
• Database name — The name of the specific database to which you are connecting.
• Windows authentication — Select to have Microsoft SQL Server validate the login account name and password using information from the Windows operating system; clear to authenticate using the existing Microsoft SQL Server login account name and password and complete the User name and Password fields.
• User name and Password — The user name and password for a Data Integrator repository defined in a Microsoft SQL Server database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.

IBM DB2 login
From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer. For a DB2 repository, you must complete the following fields:
• Database type — Select DB2.
• DB2 datasource — The data source name.
• User name and Password — The user name and password for a Data Integrator repository defined in a DB2 database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.

Sybase ASE login
From the Windows Start menu, select Programs > Business Objects > Data Integrator > Data Integrator Designer. For a Sybase ASE repository, you must complete the following fields:
• Database type — Select Sybase ASE.
• Database server name — Enter the database's server name. Note: For UNIX Job Servers, when logging in to a Sybase repository in the Designer, the case you type for the database server name must match the associated case in the SYBASE_Home\interfaces file. If the case does not match, you might receive an error because the Job Server cannot communicate with the repository.
• Database name — Enter the name of the specific database to which you are connecting.
• User name and Password — Enter the user name and password for this database.
• Remember — Check this box if you want the Designer to store this information for the next time you log in.
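The Note above refers to the Sybase interfaces file on the UNIX Job Server machine, where the server name lookup is case sensitive. Purely as a sketch (the exact layout varies by platform and Sybase version; the server name, host, and port below are placeholders), an interfaces entry might look like this:

    DI_REPO
        master tcp ether sybhost.example.com 5000
        query tcp ether sybhost.example.com 5000

If the entry is defined as DI_REPO, type DI_REPO, not di_repo, in the Database server name field.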

Resetting users
Occasionally, more than one person may attempt to log in to a single repository. If this happens, the Reset Users window appears, listing the users and the time they logged in to the repository. From this window, you have several options. You can:
• Reset Users to clear the users in the repository and set yourself as the currently logged in user.
• Continue to log in to the system regardless of who else might be connected.
• Exit to terminate the login attempt and close the session.
Note: Only use Reset Users or Continue if you know that you are the only user connected to the repository. Subsequent changes could corrupt the repository.


Chapter 3: Designer user interface

About this chapter
This chapter provides basic information about the Designer's graphical user interface. It contains the following topics:
• Data Integrator objects
• Designer window
• Menu bar
• Toolbar
• Project area
• Tool palette
• Workspace
• Local object library
• Object editors
• Working with objects
• General and environment options

Data Integrator objects
All "entities" you define, edit, or work with in Data Integrator Designer are called objects. The local object library shows objects such as source and target metadata, system functions, projects, and jobs. Objects are hierarchical and consist of:
• Options, which control the operation of objects. For example, in a datastore, the name of the database to which you connect is an option for the datastore object.
• Properties, which document the object. For example, the name of the object and the date it was created are properties. Properties describe an object, but do not affect its operation.

Data Integrator has two types of objects:
• Reusable objects
• Single-use objects
The object type affects how you define and retrieve the object.

Reusable objects
You can reuse and replicate most objects defined in Data Integrator.

A data flow, for example, is a reusable object. A reusable object has a single definition; all calls to the object refer to that definition. If you change the definition of the object in one place, you are changing the object in all other places in which it appears. Multiple jobs, like a weekly load job and a daily load job, can call the same data flow. If the data flow changes, both jobs use the new version of the data flow.

After you define and save a reusable object, Data Integrator stores the definition in the local repository. You can then reuse the definition as often as necessary by creating calls to the definition. Access reusable objects through the local object library. The object library contains object definitions. When you drag and drop an object from the object library, you are really creating a new reference (or call) to the existing object definition.

Single-use objects
Some objects are defined only within the context of a single job or data flow, for example scripts and specific transform definitions.

Object hierarchy
Data Integrator object relationships are hierarchical. The following figure shows the relationships between major Data Integrator object types:

Designer window
The Data Integrator Designer user interface consists of a single application window and several embedded supporting windows.

The application window contains the Menu bar, Toolbar, Project area, tabbed Workspace, Tool palette, and tabbed Local object library.

Menu bar
This section contains a brief description of the Designer's menus:
• Project menu
• Edit menu
• View menu
• Tools menu
• Debug menu

• Validation menu
• Window menu
• Help menu

Project menu
The Project menu contains standard Windows as well as Data Integrator-specific options:
• New — Define a new project, batch job, real-time job, work flow, data flow, transform, datastore, file format, DTD, XML Schema, or custom function.
• Open — Open an existing project.
• Close — Close the currently open project.
• Delete — Delete the selected object.
• Save — Save the object open in the workspace.
• Save All — Save all changes to objects in the current Designer session.
• Print — Print the active workspace.
• Print Setup — Set up default printer information.
• Compact Repository — Remove redundant and obsolete objects from the repository tables.
• Exit — Exit Data Integrator Designer.

Edit menu
The Edit menu provides standard Windows commands with a few restrictions.
• Undo — Undo the last operation (text edits only).
• Cut — Cut the selected object or text and place it on the clipboard.
• Copy — Copy the selected object or text to the clipboard. Note: You cannot copy reusable objects using the Copy command; instead, use Replicate in the object library to make an independent copy of an object.
• Paste — Paste the contents of the clipboard into the active workspace or text box. Note: You can only paste clipboard contents once. To paste again, you must cut or copy the objects again.
• Delete — Delete the selected object.
• Clear All — Clear all objects in the active workspace (no undo).

View menu
A check mark indicates that the tool is active.
• Toolbar — Display or remove the toolbar in the Designer window.
• Status Bar — Display or remove the status bar in the Designer window.
• Palette — Display or remove the floating tool palette.

• Enabled Descriptions — View descriptions for objects with enabled descriptions.
• Refresh — Redraw the display. Use this command to ensure the content of the workspace represents the most up-to-date information from the repository.

Tools menu
An icon with a different color background indicates that the tool is active.
• Object Library — Open or close the object library window. For more information, see “Local object library” on page 47.
• Project Area — Display or remove the project area from the Data Integrator window. For more information, see “Project area” on page 41.
• Variables — Open or close the Variables and Parameters window. For more information, see “Variables and Parameters” on page 295.
• Output — Open or close the Output window. The Output window shows errors that occur such as during job validation or object export.
• Profiler Monitor — Display the status of Profiler tasks. For more information, see “Using the Data Profiler” on page 335.
• Custom Functions — Display the Custom Functions window. For more information, see the Data Integrator Reference Guide.
• System Configurations — Display the System Configurations editor. For more information, see “Creating and managing multiple datastore configurations” on page 115.

• Profiler Server Login — Connect to the Profiler Server. For more information, see “Connecting to the profiler server” on page 336.
• Export — Export individual repository objects to another repository or file. This command opens the Export editor in the workspace. You can drag objects from the object library into the editor for export. To export your whole repository, in the object library right-click and select Repository > Export to file. For more information, see the Data Integrator Advanced Development and Migration Guide.
• Import From File — Import objects into the current repository from a file. For more information, see the Data Integrator Advanced Development and Migration Guide.
• Metadata Reports — Display the Metadata Reports window. Select the object type, report type, and the objects in the repository that you want to list in the report. See “Metadata reporting tool” on page 427.
• Metadata Exchange — Import and export metadata to third-party systems via a file. See “Metadata exchange” on page 446.
• BusinessObjects Universes — Export (create or update) metadata in Business Objects Universes. See “Creating Business Objects universes” on page 449.
• Central Repositories — Create or edit connections to a central repository for managing object versions among multiple users. See the Data Integrator Advanced Development and Migration Guide.
• Options — Display the Options window. See “General and environment options” on page 66.

Debug menu
The only options available on this menu at all times are Show Filters/Breakpoints and Filters/Breakpoints. The Execute and Start Debug options are only active when a job is selected. All other options are available as appropriate when a job is running in the Debug mode. For more information, see “Using the interactive debugger” on page 418.
• Execute — Opens the Execution Properties window which allows you to execute the selected job.
• Start Debug — Opens the Debug Properties window which allows you to run a job in the debug mode.
• Show Filters/Breakpoints — Shows and hides filters and breakpoints in workspace diagrams.

• Filters/Breakpoints — Opens a window you can use to manage filters and breakpoints. For more information, see “Filters and Breakpoints window” on page 432.

Validation menu
The Designer displays options on this menu as appropriate when an object is open in the workspace.
• Validate — Validate the objects in the current workspace view or all objects in the job before executing the application.
• Display Language — View a read-only version of the language associated with the job.
• Display Optimized SQL — Display the SQL that Data Integrator generated for a selected data flow. See the Data Integrator Performance Optimization Guide.

Window menu
The Window menu provides standard Windows options.
• Back — Move back in the list of active workspace windows.
• Forward — Move forward in the list of active workspace windows.
• Cascade — Display window panels overlapping with titles showing.
• Tile Horizontally — Display window panels side by side.
• Tile Vertically — Display window panels one above the other.
• Close All Windows — Close all open windows.
A list of objects open in the workspace also appears on the Window menu. The name of the currently-selected object is indicated by a check mark. Navigate to another open object by selecting its name in the list.
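The Display Optimized SQL option on the Validation menu shows the SQL that Data Integrator generates and, where possible, pushes down to the source database for the selected data flow. Purely as an illustration (the table, columns, and filter here are invented), a simple flow that filters a source table and maps a column to upper case might be pushed down as a single statement such as:

    SELECT CUST_ID, UPPER(NAME), REGION
    FROM CUSTOMER
    WHERE STATUS = 'A'

Reviewing this output is a quick way to see how much of a data flow's logic the optimizer was able to push to the database.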

Help menu
• Contents — Display on-line help. Data Integrator's on-line help works with Microsoft Internet Explorer version 5.5 and higher.
• Technical Manuals — Display a PDF version of Data Integrator documentation. This file contains the same content as on-line help. It is provided for users who prefer to print out their documentation. This format prints graphics clearly and includes a master index and page numbers/references. You can also access the same file from the Help menu in the Administrator or from the <linkdir>\Doc\Books directory.
• Release Notes — Display current release notes.
• Release Summary — Display summary of new features in the current release.
• About Data Integrator — Display information about Data Integrator including versions of the Designer, Job Server and engine, copyright information, and a link to the Business Objects Web site.

Toolbar
In addition to many of the standard Windows tools, Data Integrator provides application-specific tools, including:

• Close all windows — Closes all open windows in the workspace.
• Local Object Library — Opens and closes the local object library window.
• Central Object Library — Opens and closes the central object library window.
• Variables — Opens and closes the variables and parameters creation window.
• Project Area — Opens and closes the project area.
• Output — Opens and closes the output window.
• View Enabled Descriptions — Enables the system level setting for viewing object descriptions in the workspace.
• Validate Current View — Validates the object definition open in the workspace. Other objects included in the definition are also validated.
• Validate All Objects in View — Validates the object definition open in the workspace. Objects included in the definition are also validated.
• Audit Objects in Data Flow — Opens the Audit window to define audit labels and rules for the data flow.
• View Where Used — Opens the Output window, which lists parent objects (such as jobs) of the object currently open in the workspace (such as a data flow). Use this command to find other jobs that use the same data flow. To see if an object in a data flow is reused elsewhere, right-click one and select View Where Used before you decide to make design changes.
• Go Back — Move back in the list of active workspace windows.
• Go Forward — Move forward in the list of active workspace windows.

• Data Integrator Management Console — Opens and closes the Management Console window.
• About — Opens the Data Integrator About box, with product component version numbers and a link to the Business Objects Web site.
Use the tools to the right of the About tool with the interactive debugger. See “Menu options and tool bar” on page 433.

Project area
The project area provides a hierarchical view of the objects used in each project. Tabs on the bottom of the project area support different tasks. Tabs include:
• Create, view and manage projects. Provides a hierarchical view of all objects used in each project.
• View the status of currently executing jobs. Selecting a specific job execution displays its status, including which steps are complete and which steps are executing. These tasks can also be done using the Data Integrator Administrator.
• View the history of complete jobs. Logs can also be viewed with the Data Integrator Administrator.

To control project area location, right-click its gray border and select/deselect Allow Docking, or select Hide from the menu.
• When you select Allow Docking, you can click and drag the project area to dock at and undock from any edge within the Designer window. When you drag the project area away from a Designer window edge, it stays undocked. To quickly switch between your last docked and undocked locations, just double-click the gray border.
• When you deselect Allow Docking, you can click and drag the project area to any location on your screen and it will not dock inside the Designer window.
• When you select Hide, the project area disappears from the Designer window. To unhide the project area, click its toolbar icon.

Here's an example of the Project window's Designer tab, which shows the project hierarchy: Project > Job > Work flow > Data flow. As you drill down into objects in the Designer workspace, the window highlights your location within the project hierarchy.

Tool palette
The tool palette is a separate window that appears by default on the right edge of the Designer workspace. You can move the tool palette anywhere on your screen or dock it on any edge of the Designer window. To show the name of each icon, hold the cursor over the icon until the tool tip for the icon appears.

The icons in the tool palette allow you to create new objects in the workspace. The icons are disabled when they are not allowed to be added to the diagram open in the workspace. When you create an object from the tool palette, you are creating a new definition of an object. If a new object is reusable, it will be automatically available in the object library after you create it. For example, if you select the data flow icon from the tool palette and define a new data flow, later you can drag that existing data flow from the object library, adding a call to the existing definition.

The tool palette contains the following icons:

• Pointer — Returns the tool pointer to a selection pointer for selecting and moving objects in a diagram. Available everywhere.
• Work flow (reusable) — Creates a new work flow. Available in jobs and work flows.
• Data flow (reusable) — Creates a new data flow. Available in jobs and work flows.
• R/3 data flow (reusable) — Used only with the SAP licensed extension.
• Query transform (single-use) — Creates a template for a query. Use it to define column mappings and row selections. Available in data flows.
• Template table (single-use) — Creates a table for a target. Available in data flows.
• Template XML (single-use) — Creates an XML template. Available in data flows.
• Data transport (single-use) — Used only with the SAP licensed extension. Available in data flows.
• Script (single-use) — Creates a new script object. Available in jobs and work flows.
• Conditional (single-use) — Creates a new conditional object. Available in jobs and work flows.
• Try (single-use) — Creates a new try object. Available in jobs and work flows.
• Catch (single-use) — Creates a new catch object. Available in jobs and work flows.
• Annotation (single-use) — Creates an annotation. Available in jobs, work flows, and data flows.

Workspace
When you open or select a job or any flow within a job hierarchy, the workspace becomes "active" with your selection. The workspace provides a place to manipulate system objects and graphically assemble data movement processes. These processes are represented by icons that you drag and drop into a workspace to create a workspace diagram. This diagram is a visual representation of an entire data movement application or some part of a data movement application. This section describes major workspace area tasks, such as:
• Moving objects in the workspace area
• Connecting and disconnecting objects
• Describing objects
• Scaling the workspace
• Arranging workspace windows
• Closing workspace windows

Moving objects in the workspace area
Use standard mouse commands to move objects in the workspace.

To move an object to a different place in the workspace area
1. Click to select the object.
2. Drag the object to where you want to place it in the workspace.


Connecting and disconnecting objects
You specify the flow of data through jobs and work flows by connecting objects in the workspace from left to right in the order you want the data to be moved.

To connect objects
1. Place the objects you want to connect in the workspace.
2. Click and drag from the triangle on the right edge of an object to the triangle on the left edge of the next object in the flow.

To disconnect objects
1. Click the connecting line.
2. Press the Delete key.

Describing objects
You can use descriptions to add comments about objects. You can use annotations to explain a job, work flow, or data flow. You can view object descriptions and annotations in the workspace. Together, descriptions and annotations allow you to document a Data Integrator application. For example, you can describe the incremental behavior of individual jobs with numerous annotations and label each object with a basic description.


For more information, see “Creating descriptions” on page 57 and “Creating annotations” on page 59.

Scaling the workspace
You can control the scale of the workspace. By scaling the workspace, you can change the focus of a job, work flow, or data flow. For example, you might want to increase the scale to examine a particular part of a work flow, or you might want to reduce the scale so that you can examine the entire work flow without scrolling.

To change the scale of the workspace
1. In the drop-down list on the tool bar, select a predefined scale or enter a custom value.

2. Alternatively, right-click in the workspace and select a desired scale.


Note: You can also select Scale to Fit and Scale to Whole:

• Select Scale to Fit and the Designer calculates the scale that fits the entire project in the current view area.
• Select Scale to Whole to show the entire workspace area in the current view area.

Arranging workspace windows
The Window menu allows you to arrange multiple open workspace windows in the following ways: cascade, tile horizontally, or tile vertically.

Closing workspace windows
When you drill into an object in the project area or workspace, a view of the object’s definition opens in the workspace area. The view is marked by a tab at the bottom of the workspace area, and as you open more objects in the workspace, more tabs appear. (You can show/hide these tabs from the Tools > Options menu. Go to Designer > General options and select/deselect Show tabs in workspace. For more information, see the “General and environment options” section.) Note: These views use system resources. If you have a large number of open views, you might notice a decline in performance. Close the views individually by clicking the close box in the top right corner of the workspace. Close all open views by selecting Window > Close All Windows or clicking the Close All Windows icon on the toolbar.

Local object library
The local object library provides access to reusable objects. These objects include built-in system objects, such as transforms, and the objects you build and save, such as datastores, jobs, data flows, and work flows.


The local object library is a window into your local Data Integrator repository and eliminates the need to access the repository directly. Updates to the repository occur through normal Data Integrator operation. Saving the objects you create adds them to the repository. Access saved objects through the local object library. To learn more about local as well as central repositories, see the Data Integrator Advanced Development and Migration Guide. To control object library location, right-click its gray border and select/deselect Allow Docking, or select Hide from the menu.

When you select Allow Docking, you can click and drag the object library to dock at and undock from any edge within the Designer window. When you drag the object library away from a Designer window edge, it stays undocked. To quickly switch between your last docked and undocked locations, just double-click the gray border. When you deselect Allow Docking, you can click and drag the object library to any location on your screen and it will not dock inside the Designer window.

When you select Hide, the object library disappears from the Designer window. To unhide the object library, click its toolbar icon.

To open the object library
Choose Tools > Object Library, or click the object library icon in the icon bar.
[Figure: Object library window, showing the transform object list and tabs for other object types]


The object library gives you access to the object types listed in the following table. The table shows the tab on which the object type appears in the object library and describes the Data Integrator context in which you can use each type of object.
• Projects — Projects are sets of jobs available at a given time.
• Jobs — Jobs are executable work flows. There are two job types: batch jobs and real-time jobs.
• Work flows — Work flows order data flows and the operations that support data flows, defining the interdependencies between them.
• Data flows — Data flows describe how to process a task.
• Transforms — Transforms operate on data, producing output data sets from the sources you specify. The object library lists both built-in and custom transforms.
• Datastores — Datastores represent connections to databases and applications used in your project. Under each datastore is a list of the tables, documents, and functions imported into Data Integrator.
• Formats — Formats describe the structure of a flat file, XML file, or XML message.
• Custom Functions — Custom Functions are functions written in the Data Integrator Scripting Language. You can use them in Data Integrator jobs.

To display the name of each tab as well as its icon, do one of the following:
• Make the object library window wider until the names appear.

• Hold the cursor over the tab until the tool tip for the tab appears.


To sort columns in the object library
Click the column heading. For example, you can sort data flows by clicking the Data Flow column heading once. Names are listed in ascending order. To list names in descending order, click the Data Flow column heading again.

Object editors
To work with the options for an object, in the workspace click the name of the object to open its editor. The editor displays the input and output schemas for the object and a panel below them listing options set for the object. If there are many options, they are grouped in tabs in the editor. A schema is a data structure that can contain columns, other nested schemas, and functions (the contents are called schema elements). A table is a schema containing only columns. A common example of an editor is the editor for the query transform, as shown in the following illustration:

[Figure: Query transform editor, showing the input schema, output schema, parameter tabs, and tabs of open windows]

For specific information about the query editor, see “Query editor” on page 189.


In an editor, you can:

• Undo or redo previous actions performed in the window (right-click and choose Undo or Redo)
• Find a string in the editor (right-click and choose Find)
• Drag-and-drop column names from the input schema into relevant option boxes
• Use colors to identify strings and comments in text boxes where you can edit expressions (keywords appear blue; strings are enclosed in quotes and appear pink; comments begin with a pound sign and appear green)

Note: You cannot add comments to a mapping clause in a Query transform. For example, the following syntax is not supported on the Mapping tab:
table.column # comment

The job will not run and you cannot successfully export it. Use the object description or workspace annotation feature instead.

Working with objects
This section discusses common tasks you complete when working with objects in the Designer. With these tasks, you use various parts of the Designer—the toolbar, tool palette, workspace, and local object library. Tasks in this section include:

• Creating new reusable objects
• Changing object names
• Viewing and changing object properties
• Creating descriptions
• Creating annotations
• Saving and deleting objects
• Searching for objects

Creating new reusable objects
You can create reusable objects from the object library or by using the tool palette. After you create an object, you can work with the object, editing its definition and adding calls to other objects.

To create a reusable object (in the object library)
1. Open the object library by choosing Tools > Object Library.


2. Click the tab corresponding to the object type.
3. Right-click anywhere except on existing objects and choose New.
4. Right-click the new object and select Properties. Enter options such as name and description to define the object.

To create a reusable object (using the tool palette)
1. In the tool palette, left-click the icon for the object you want to create.
2. Move the cursor to the workspace and left-click again. The object icon appears in the workspace where you have clicked.

To open an object's definition
You can open an object's definition in one of two ways:

• From the workspace, click the object name. Data Integrator opens a blank workspace in which you define the object.
• From the project area, click the object.

You define an object using other objects. For example, if you click the name of a batch data flow, a new workspace opens for you to assemble sources, targets, and transforms that make up the actual flow.

To add an existing object (create a new call to an existing object)
1. Open the object library by choosing Tools > Object Library.
2. Click the tab corresponding to any object type.
3. Select an object.


4. Drag the object to the workspace.

Note: Objects dragged into the workspace must obey the hierarchy logic explained in “Object hierarchy” on page 31. For example, you can drag a data flow into a job, but you cannot drag a work flow into a data flow.

Changing object names
You can change the name of an object from the workspace or the object library. You can also create a copy of an existing object. Note: You cannot change the names of built-in objects.

To change the name of an object in the workspace
1. Click to select the object in the workspace.
2. Right-click and choose Edit Name.
3. Edit the text in the name text box. Click outside the text box or press Enter to save the new name.

To change the name of an object in the object library
1. Select the object in the object library.
2. Right-click and choose Properties.
3. Edit the text in the first text box.
4. Click OK.

To copy an object
1. Select the object in the object library.
2. Right-click and choose Replicate. Data Integrator makes a copy of the top-level object (but not of objects that it calls) and gives it a new name, which you can edit.

Viewing and changing object properties
You can view (and, in some cases, change) an object's properties through its property page.

To view, change, and add object properties
1. Select the object in the object library.
2. Right-click and choose Properties. The General tab of the Properties window opens.


3. Complete the property sheets. The property sheets vary by object type, but General, Attributes and Class Attributes are the most common and are described in the following sections.
4. When finished, click OK to save changes you made to the object properties and to close the window. Alternatively, click Apply to save changes without closing the window.

General tab
The General tab contains two main object properties: name and description.

From the General tab, you can change the object name as well as enter or edit the object description. You can add object descriptions to single-use objects as well as to reusable objects. Note that you can toggle object descriptions on and off by right-clicking any object in the workspace and selecting/deselecting View Enabled Descriptions. Depending on the object, other properties may appear on the General tab. Examples include:

• Execute only once — See “Creating and defining data flows” in Chapter 7: Data Flows for more information.
• Recover as a unit — See “Marking recovery units” in Chapter 17: Recovery Mechanisms for more information about this work flow property.


• Degree of parallelism — See the Data Integrator Performance Optimization Guide for more information about this advanced feature.
• Use database links — See the Data Integrator Performance Optimization Guide for more information about this advanced feature.
• Cache type — See the Data Integrator Performance Optimization Guide for more information about this advanced feature.

Attributes tab

The Attributes tab allows you to assign values to the attributes of the current object.

To assign a value to an attribute, select the attribute and enter the value in the Value box at the bottom of the window. Some attribute values are set by Data Integrator and cannot be edited. When you select an attribute with a system-defined value, the Value field is unavailable.


Class Attributes tab
The Class Attributes tab shows the attributes available for the type of object selected. For example, all data flow objects have the same class attributes. To create a new attribute for a class of objects, right-click in the attribute list and select Add. The new attribute is now available for all of the objects of this class. To delete an attribute, select it then right-click and choose Delete. You cannot delete the class attributes predefined by Data Integrator.

Creating descriptions
Use descriptions to document objects. You can see descriptions on workspace diagrams. Therefore, descriptions are a convenient way to add comments to workspace objects. A description is associated with a particular object. When you import or export that repository object (for example, when migrating between development, test, and production environments), you also import or export its description.

The Designer determines when to show object descriptions based on a system-level setting and an object-level setting. Both settings must be activated to view the description for a particular object. The system-level setting is unique to your setup and is disabled by default. To activate that system-level setting, select View > Enabled Descriptions, or click the View Enabled Descriptions button on the toolbar. The object-level setting is saved with the object in the repository and is also disabled by default unless you add or edit a description from the workspace. To activate the object-level setting, right-click the object and select Enable object description.

An ellipsis after the text in a description indicates that there is more text. To see all the text, resize the description by clicking and dragging it. When you move an object, its description moves as well. To see which object is associated with which selected description, view the object's name in the status bar.

To add a description to an object
1. In the project area or object library, right-click an object and select Properties.
2. Enter your comments in the Description text box.
3. Click OK.
The description for the object displays in the object library.

To display a description in the workspace
1. In the project area, select an existing object (such as a job) that contains an object to which you have added a description (such as a work flow).
2. From the View menu, select Enabled Descriptions. Alternately, you can select the View Enabled Descriptions button on the toolbar.
3. Right-click the work flow and select Enable Object Description.
The description displays in the workspace under the object.

To add a description to an object from the workspace
1. From the View menu, select Enabled Descriptions.
2. In the workspace, right-click an object and select Properties.
3. In the Properties window, enter text in the Description box.
4. Click OK.
The description displays automatically in the workspace (and the object's Enable Object Description option is selected).

To hide a particular object's description
1. In the workspace diagram, right-click an object. Alternately, you can select multiple objects by:
• Pressing and holding the Control key while selecting objects in the workspace diagram, then right-clicking one of the selected objects.

• Dragging a selection box around all the objects you want to select, then right-clicking one of the selected objects.
2. In the pop-up menu, deselect Enable Object Description.
The description for the object selected is hidden, even if the View Enabled Descriptions option is checked, because the object-level switch overrides the system-level switch.

To edit object descriptions
1. In the workspace, double-click an object description.
2. Enter, cut, copy, or paste text into the description.
3. In the Project menu, select Save.
Alternately, you can right-click any object and select Properties to open the object's Properties window and add or edit its description.
Note: If you attempt to edit the description of a reusable object, Data Integrator alerts you that the description will be updated for every occurrence of the object, across all jobs. You can select the Do not show me this again check box to avoid this alert. However, after deactivating the alert, you can only reactivate the alert by calling Technical Support.

Creating annotations
Annotations describe a flow, part of a flow, or a diagram in a workspace. An annotation is associated with the job, work flow, or data flow where it appears. When you import or export that job, work flow, or data flow, you import or export associated annotations.

You can use annotations to describe any workspace such as a job, work flow, data flow, catch, conditional, or while loop.

To annotate a workspace diagram
1. Open the workspace diagram you want to annotate.
2. In the tool palette, click the annotation icon.
3. Click a location in the workspace to place the annotation.
An annotation appears on the diagram. You can add, edit, and delete text directly on the annotation. In addition, you can resize and move the annotation by clicking and dragging. You can add any number of annotations to a diagram.

To delete an annotation
1. Right-click an annotation.
2. Select Delete.
Alternately, you can select an annotation and press the Delete key. You cannot hide annotations that you have added to the workspace. However, you can move them out of the way or delete them.

Saving and deleting objects
"Saving" an object in Data Integrator means storing the language that describes the object to the repository. You can save reusable objects; single-use objects are saved only as part of the definition of the reusable object that calls them. You can choose to save changes to the reusable object currently open in the workspace. When you save the object, the object properties, the definitions of any single-use objects it calls, and any calls to other reusable objects are recorded in the repository. The content of the included reusable objects is not saved; only the call is saved. Data Integrator stores the description even if the object is not complete or contains an error (does not validate).

To save changes to a single reusable object
1. Open the project in which your object is included.
2. Choose Project > Save. This command saves all objects open in the workspace.
Repeat these steps for other individual objects you want to save.

To save all changed objects in the repository
1. Choose Project > Save All. Data Integrator lists the reusable objects that were changed since the last save operation.
2. (optional) Deselect any listed object to avoid saving it.
3. Click OK.
Note: Data Integrator also prompts you to save all objects that have changes when you execute a job and when you exit the Designer. Saving a reusable object saves any single-use object included in it.

To delete an object definition from the repository
1. In the object library, select the object.
2. Right-click and choose Delete.
• If you attempt to delete an object that is being used, Data Integrator provides a warning message and the option of using the View Where Used feature. For more information, see “Using View Where Used” on page 398.
• If you select Yes, Data Integrator marks all calls to the object with a red "deleted" icon to indicate that the calls are invalid. You must remove or replace these calls to produce an executable job.
Note: Built-in objects such as transforms cannot be deleted from the object library.

To delete an object call
1. Open the object that contains the call you want to delete.
2. Right-click the object call and choose Delete.
If you delete a reusable object from the workspace or from the project area, only the object call is deleted. The object definition remains in the object library.

Searching for objects
From within the object library, you can search for objects defined in the repository or objects available through a datastore.

To search for an object
1. Right-click in the object library and choose Search. Data Integrator displays the Search window.
2. Enter the appropriate values for the search. Options available in the Search window are described in detail following this procedure.
3. Click Search. The objects matching your entries are listed in the window.

From the search results window you can use the context menu to:
• Open an item
• View the attributes (Properties)
• Import external tables as repository metadata

You can also drag objects from the search results window and drop them in the desired location.

The Basic tab in the Search window provides you with the following options:
• Name — The object name to find. If you are searching in the repository, the name is not case sensitive. If you are searching in a datastore and the name is case sensitive in that datastore, enter the name as it appears in the database or application and use double quotation marks (") around the name to preserve the case. You can designate whether the information to be located Contains the specified name or Equals the specified name using the drop-down box next to the Name field.
• Description — The object description to find. Objects imported into the Data Integrator repository have a description from their source. By default, objects you create in the Designer have no description unless you add one. The search returns objects whose description attribute contains the value entered.
• Type — The type of object to find. When searching the repository, choose from Tables, Files, Data flows, Work flows, Jobs, Hierarchies, IDOCs, and Domains. When searching a datastore or application, choose from object types available through that datastore.
• Look in — Where to search. Choose from the repository or a specific datastore. When you designate a datastore, you can also choose to search the imported data (Internal Data) or the entire datastore (External Data).

The Search window also includes an Advanced tab. From the Advanced tab, you can choose to search for objects based on their Data Integrator attribute values. You can search by attribute values only when searching in the repository.

The type of search performed. The attribute value to find. Select Equals to search for any attribute that contains only the value specified. Find out how you can participate and help to improve our documentation. Designer user interface Working with objects 3 The Advanced tab provides the following options: Option Attribute Description The object attribute in which to search.This document is part of a SAP study on PDF usage. The attributes are listed for the object type specified on the Basic tab. Select Contains to search for any attribute that contains the value specified. Value Match Data Integrator Designer Guide 65 .

General and environment options

To open the Options window, select Tools > Options. The window displays option groups for Designer, Data, and Job Server options. Expand the options by clicking the plus icon. As you select each option group or option, a description appears on the right.

SAP options appear if you install these licensed extensions. See the Data Integrator Supplement for SAP for more information about these options.

The standard options include:
• Designer — Environment
• Designer — General
• Designer — Graphics
• Designer — Central Repository Connections
• Data — General
• Job Server — Environment
• Job Server — General

Designer — Environment

Default Administrator for Metadata Reporting:
Administrator — Select the Administrator that the metadata reporting tool uses. An Administrator is defined by host name and port.

Default Job Server: If a repository is associated with several Job Servers, one Job Server must be defined as the default Job Server to use at login.
Current — Displays the current value of the default Job Server.
New — Allows you to specify a new value for the default Job Server from a drop-down list of Job Servers associated with this repository. Changes are effective immediately.
Note: Job-specific options and path names specified in Designer refer to the current default Job Server. If you change the default Job Server, modify these options and path names.

Designer Communication Ports:
Allow Designer to set the port for Job Server communication — If checked, Designer automatically sets an available port to receive messages from the current Job Server. The default is checked. Uncheck to specify a listening port or port range. You may choose to constrain the port used for communication between Designer and Job Server when the two components are separated by a firewall. Changes will not take effect until you restart Data Integrator.
Specify port range — Only activated when you deselect the previous control. Allows you to specify a range of ports from which the Designer can choose a listening port. Enter port numbers in the From port and To port text boxes. To specify a specific listening port, enter the same port number in both the From port and To port text boxes. Changes will not take effect until you restart Data Integrator.
Interactive Debugger — Allows you to set a communication port for the Designer to communicate with a Job Server while running in Debug mode. For more information, see "Changing the interactive debugger port" on page 423.
Server group for local repository — If the local repository that you logged in to when you opened the Designer is associated with a server group, the name of the server group appears.

Designer — General

View data sampling size (rows) — Controls the sample size used to display the data in sources and targets in open data flows in the workspace. View data by clicking the magnifying glass icon on source and target objects. For more information, see "Using View Data" on page 404.
Number of characters in workspace icon name — Controls the length of the object names displayed in the workspace. Object names are allowed to exceed this number, but the Designer only displays the number entered here. The default is 17 characters.
Maximum schema tree elements to auto expand — The number of elements displayed in the schema tree. Element names are not allowed to exceed this number. Enter a number for the Input schema and the Output schema. The default is 100.
Default parameters to variables of the same name — When you declare a variable at the work-flow level, Data Integrator automatically passes the value as a parameter with the same name to a data flow called by the work flow.
Automatically import domains — Select this check box to automatically import domains when importing a table that references a domain.
Perform complete validation before job execution — If checked, Data Integrator performs a complete job validation before running a job. The default is unchecked. If you keep this default setting, you should validate your design manually before job execution.
Open monitor on job execution — Affects the behavior of the Designer when you execute a job. With this option enabled, the Designer switches the workspace to the monitor view during job execution; otherwise, the workspace remains as is. The default is on.
Calculate column mapping while saving data flow — Calculates information about target tables and columns and the sources used to populate them. Data Integrator automatically stores this information in the AL_COLMAP table (ALVW_MAPPING view) when you save a data flow. You can see this information when you generate metadata reports. If you select this option, be sure to validate your entire job before saving it, because this functionality is highly sensitive to errors and will skip data flows with validation problems. For more information, see "Tools" on page 467.
Show dialog when job is completed — Allows you to choose if you want to see an alert or just read the trace messages.
Show tabs in workspace — Allows you to decide if you want to use the tabs at the bottom of the workspace to navigate.

Designer — Graphics

Choose and preview stylistic elements to customize your workspaces. Using these options, you can easily distinguish your job/work flow design workspace from your data flow design workspace.
• Workspace flow type — Switch between the two workspace flow types (Job/Work Flow and Data Flow) to view default settings. Modify settings for each type using the remaining options.
• Line Type — Choose a style for object connector lines.
• Line Thickness — Set the connector line thickness.
• Background style — Choose a plain or tiled background pattern for the selected flow type.
• Color scheme — Set the background color to blue, gray, or white.
• Use navigation watermark — Add a watermark graphic to the background of the flow type selected. Note that this option is only available with a plain background style.

Designer — Central Repository Connections

Displays the central repository connections and the active central repository. To activate a central repository, right-click one of the central repository connections listed and select Activate.
Reactivate automatically — Select if you want the active central repository to be reactivated whenever you log in to Data Integrator using the current local repository.

Data — General

Century Change Year — Indicates how Data Integrator interprets the century for two-digit years. Two-digit years greater than or equal to this value are interpreted as 19##. Two-digit years less than this value are interpreted as 20##. The default value is 15. For example, if the Century Change Year is set to 15:

Two-digit year — Interpreted as
99 — 1999
16 — 1916
15 — 1915
14 — 2014

Convert blanks to nulls for Oracle bulk loader — Converts blanks to NULL values when loading data using the Oracle bulk loader utility and:
• the column is not part of the primary key
• the column is nullable

Job Server — Environment

Maximum number of engine processes — Sets a limit on the number of engine processes that this Job Server can have running concurrently.

Job Server — General

Use this window to reset Job Server options (see "Changing Job Server options" on page 329) or with guidance from Business Objects Customer Support. For contact information, visit http://www.businessobjects.com/support/.


Projects and Jobs

About this chapter

Project and job objects represent the top two levels of organization for the application flows you create using the Designer. This chapter contains the following topics:
• Projects
• Jobs

Projects

A project is a reusable object that allows you to group jobs. A project is the highest level of organization offered by Data Integrator. Opening a project makes one group of objects easily accessible in the user interface. You can use a project to group jobs that have schedules that depend on one another or that you want to monitor together.

Projects have common characteristics:
• Projects are listed in the object library.
• Only one project can be open at a time.
• Projects cannot be shared among multiple users.

Objects that make up a project

The objects in a project appear hierarchically in the project area. If a plus sign (+) appears next to an object, expand it to view the lower-level objects contained in the object. Data Integrator shows you the contents as both names in the project area hierarchy and icons in the workspace. In the following example, the Job_KeyGen job contains two data flows, and the DF_EmpMap data flow contains multiple objects. Each item selected in the project area also displays in the workspace.

Creating new projects

To create a new project
1. Choose Project > New > Project.
2. Enter the name of your new project.
The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
3. Click Create.
The new project appears in the project area. As you add jobs and other lower-level objects to the project, they also appear in the project area.

Opening existing projects

To open an existing project
1. Choose Project > Open.
2. Select the name of an existing project from the list.
3. Click Open.
Note: If another project was already open, Data Integrator closes that project and opens the new one.

Saving projects

To save all changes to a project
1. Choose Project > Save All.
Data Integrator lists the jobs, work flows, and data flows that you edited since the last save.
2. (optional) Deselect any listed object to avoid saving it.
3. Click OK.
Note: Data Integrator also prompts you to save all objects that have changes when you execute a job and when you exit the Designer. Saving a reusable object saves any single-use object included in it.

Jobs

A job is the only object you can execute. You can manually execute and test jobs in development. In production, you can schedule batch jobs and set up real-time jobs as services that execute a process when Data Integrator receives a message request.

A job is made up of steps you want executed together. Each step is represented by an object icon that you place in the workspace to create a job diagram. A job diagram is made up of two or more objects connected together. You can include any of the following objects in a job definition:
• Data flows
  • Sources
  • Targets
  • Transforms
• Work flows
  • Scripts
  • Conditionals
  • While Loops
  • Try/catch blocks

If a job becomes complex, organize its content into individual work flows, then create a single job that calls those work flows. For more information on work flows, see Chapter 8: Work Flows.

Real-time jobs use the same components as batch jobs. You can add work flows and data flows to both batch and real-time jobs. When you drag a work flow or data flow icon into a job, you are telling Data Integrator to validate these objects according to the requirements of the job type (either batch or real-time). There are some restrictions regarding the use of some Data Integrator features with real-time jobs. For more information, see Chapter 10: Real-time jobs.

Creating jobs

To create a job in the project area
1. In the project area, select the project name.
2. Right-click and choose New Batch Job or Real Time Job.
3. Edit the name.
The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
Data Integrator opens a new workspace for you to define the job.

To create a job in the object library
1. Go to the Jobs tab.
2. Right-click Batch Jobs or Real Time Jobs and choose New.
A new job with a default name appears.
3. Right-click and select Properties to change the object's name and add a description.
The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
4. To add the job to the open project, drag it into the project area.

Naming conventions for objects in jobs

We recommend that you follow consistent naming conventions to facilitate object identification across all systems in your enterprise. This allows you to more easily work with metadata across all applications such as:
• Data-modeling applications
• ETL applications
• Reporting applications
• Adapter software development kits

Examples of conventions recommended for use with jobs and other objects are shown in the following table:

Prefix/Suffix — Object — Example
DF_ (prefix) — Data flow — DF_Currency
EDF_ (prefix), _Input (suffix) — Embedded data flow — EDF_Example_Input
EDF_ (prefix), _Output (suffix) — Embedded data flow — EDF_Example_Output
RTJob_ (prefix) — Real-time job — RTJob_OrderStatus
WF_ (prefix) — Work flow — WF_SalesOrg
JOB_ (prefix) — Job — JOB_SalesOrg
_DS (suffix) — Datastore — ORA_DS
DC_ (prefix) — Datastore configuration — DC_DB2_production
SC_ (prefix) — System configuration — SC_ORA_test
_Memory_DS (suffix) — Memory datastore — Catalog_Memory_DS
PROC_ (prefix) — Stored procedure — PROC_SalesStatus

Although Data Integrator Designer is a graphical user interface with icons representing objects in its windows, other interfaces might require you to identify object types by the text alone. By using a prefix or suffix, you can more easily identify your object's type.

In addition to prefixes and suffixes, you might want to provide standardized names for objects that identify a specific action across all object types, for example: DF_OrderStatus, RTJob_OrderStatus.

Naming conventions can also include path name identifiers. For example, the stored procedure naming convention can look like either of the following:
<datastore>.<owner>.<PROC_Name>
<datastore>.<owner>.<package>.<PROC_Name>


Datastores

About this chapter

This chapter contains the following topics:
• What are datastores?
• Database datastores
• Adapter datastores
• Creating and managing multiple datastore configurations

What are datastores?

Datastores represent connection configurations between Data Integrator and databases or applications. These configurations can be direct or through adapters. Datastore configurations allow Data Integrator to access metadata from a database or application and read from or write to that database or application while Data Integrator executes a job.

Data Integrator datastores can connect to:
• Databases and mainframe file systems. See "Database datastores" on page 81.
• Applications that have pre-packaged or user-written Data Integrator adapters. See "Adapter datastores" on page 111.
• J.D. Edwards One World and J.D. Edwards World, Oracle Applications, PeopleSoft, SAP R/3 and SAP BW, and Siebel Applications. See the appropriate Data Integrator Supplement.

Note: Data Integrator reads and writes data stored in flat files through flat file formats as described in Chapter 6: File Formats. Data Integrator reads and writes data stored in XML documents through DTDs and XML Schemas. See "Formatting XML documents" on page 219.

The specific information that a datastore object can access depends on the connection configuration. When your database or application changes, make corresponding changes in the datastore information in Data Integrator—Data Integrator does not automatically detect the new information.

Note: Objects deleted from a datastore connection are identified in the project area and workspace by a red "deleted" icon. This visual flag allows you to find and update data flows affected by datastore changes.

You can create multiple configurations for a datastore. This allows you to plan ahead for the different environments your datastore may be used in and limits the work involved with migrating jobs. For example, you can add a set of configurations (DEV, TEST, and PROD) to the same datastore name. These connection settings stay with the datastore during export or import. Group any set of datastore configurations into a system configuration. When running or scheduling a job, select a system configuration, and thus the set of datastore configurations for your current environment. For more information, see "Creating and managing multiple datastore configurations" on page 115.

Database datastores

Database datastores can represent single or multiple Data Integrator connections with:
• Legacy systems using Attunity Connect
• IBM DB2, Business Objects Data Federator, Microsoft SQL Server, MySQL, Netezza, Oracle, Sybase ASE, Sybase IQ, and Teradata databases (using native connections)
• Other databases (through ODBC)
• A Data Integrator repository, using a memory datastore or persistent cache datastore

This section discusses:
• Mainframe interface
• Defining a database datastore
• Browsing metadata through a database datastore
• Importing metadata through a database datastore
• Memory datastores
• Persistent cache datastores
• Linked datastores

Mainframe interface

Data Integrator provides the Attunity Connector datastore that accesses mainframe data sources through Attunity Connect. The data sources that Attunity Connect accesses are in the following list. For a complete list of sources, refer to the Attunity documentation.
• Adabas
• DB2 UDB for OS/390 and DB2 UDB for OS/400
• IMS/DB
• VSAM
• Flat files on OS/390 and flat files on OS/400

Attunity Connector accesses mainframe data using software that you must manually install on the mainframe server and the local client (Job Server) computer. Data Integrator connects to Attunity Connector using its ODBC interface. It is not necessary to purchase a separate ODBC driver manager for UNIX and Windows platforms.

Servers
Install and configure the Attunity Connect product on the server (for example, a zSeries computer).

Clients
To access mainframe data using Attunity Connector, install the Attunity Connect product on the client. The ODBC driver is required. Attunity also offers an optional tool called Attunity Studio, which you can use for configuration and administration. For more information about how to install and configure these products, refer to their documentation.

Configure ODBC data sources on the client (Data Integrator Job Server). When you install a Data Integrator Job Server on UNIX, the installer will prompt you to provide an installation directory path for the Attunity connector software. You do not need to install a driver manager, because Data Integrator loads ODBC drivers directly on UNIX platforms.

Configuring an Attunity datastore

To use the Attunity Connector datastore option, upgrade your repository to Data Integrator version 6.5.1 or higher. To create an Attunity Connector datastore, you must know the Attunity data source name, the location of the Attunity daemon, and the Attunity daemon port number. You also specify a unique Attunity server workspace name.

To create an Attunity Connector datastore
1. In the Datastores tab of the object library, right-click and select New.
2. Enter a name for the datastore.
3. In the Datastore type box, select Database.
4. In the Database type box, select Attunity Connector.
5. Finish entering values in the remainder of the dialog.
6. If you want to change any of the default options (such as Rows per Commit or Language), click the Advanced button. For general information about these options, see "Defining a database datastore" on page 85.
7. Click OK.
You can now use the new datastore connection to import metadata tables into the current Data Integrator repository.

Data Integrator's format for accessing Attunity tables is unique to Data Integrator. Because a single datastore can access multiple software systems that do not share the same namespace, the name of the Attunity data source must be specified when referring to a table. With an Attunity Connector, precede the table name with the data source and owner names separated by a colon. The format is as follows:

AttunityDataSource:OwnerName.TableName

When using the Designer to create your jobs with imported Attunity tables, Data Integrator automatically generates the correct SQL for this format. However, when you author SQL yourself, be sure to use this format. You can author SQL in the following constructs:
• SQL function
• SQL transform
• Pushdown_sql function
• Pre-load commands in table loader
• Post-load commands in table loader

For information about how to specify multiple data sources in one Attunity datastore, see "Specifying multiple data sources in one Attunity datastore" on page 84.
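As an illustrative sketch only (not taken from the product documentation), hand-authored SQL in a script might use the sql function together with this format; the datastore name (Attunity_DS), owner, and table names below are hypothetical:

    # Hand-authored SQL against an Attunity Connector datastore must use the
    # AttunityDataSource:OwnerName.TableName format. All names here are
    # hypothetical examples.
    $row_count = sql('Attunity_DS', 'SELECT COUNT(*) FROM DSN4:OWNER1.CUSTOMERS');
    print('DSN4:OWNER1.CUSTOMERS contains [$row_count] rows');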

Specifying multiple data sources in one Attunity datastore

You can use the Attunity Connector datastore to access multiple Attunity data sources on the same Attunity Daemon location. If you have several types of data on the same computer, for example a DB2 database and VSAM, you might want to access both types of data using a single connection. For example, you can use a single connection to join tables (and push the join operation down to a remote server), which reduces the amount of data transmitted through your network.

To specify multiple sources in the Datastore Editor, separate data source names with semicolons in the Attunity data source box using the following format:

AttunityDataSourceName;AttunityDataSourceName

For example, if you have a DB2 data source named DSN4 and a VSAM data source named Navdemo, enter the following values into the Data source box:

DSN4;Navdemo

If you list multiple data source names for one Attunity Connector datastore, ensure that you meet the following requirements:
• All Attunity data sources must be accessible by the same user name and password.
• All Attunity data sources must use the same workspace. When you set up access to the data sources in Attunity Studio, use the same workspace name for each data source.
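Continuing the example, a hand-written query that joins the two data sources through the single datastore connection might look like the following sketch; only the DSN4 and Navdemo data source names come from the example above, and the datastore, owner, table, and column names are hypothetical:

    # Hypothetical join across the DSN4 (DB2) and Navdemo (VSAM) data sources
    # defined in the same Attunity Connector datastore. Attunity can push the
    # join down to the remote server.
    $order_count = sql('Attunity_DS',
        'SELECT COUNT(*) FROM DSN4:OWNER1.ORDERS O, Navdemo:OWNER2.CUSTOMERS C WHERE O.CUST_ID = C.CUST_ID');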
Requirements for an Attunity Connector datastore

Data Integrator requires the following for Attunity Connector datastores:
• For any table in Data Integrator, the maximum size of the owner name is 64 characters. In the case of Attunity tables, the maximum combined size of the Attunity data source name and the actual owner name is 63 characters (the ":" accounts for 1 character). Data Integrator cannot access a table with an owner name larger than 64 characters.
• When running a job on UNIX, the job could fail with the following error:
[D000] Cannot open file /usr1/attun/navroot/def/sys System error 13: The file access permissions do not allow the specified action. (OPEN)
This error occurs because of insufficient file permissions on some of the files in the Attunity installation directory. To avoid this error, change the file permissions for all files in the Attunity directory to 777 by executing the following command from the Attunity installation directory:
$ chmod -R 777 *

Limitations

All Data Integrator features are available when you use an Attunity Connector datastore except the following:
• Bulk loading
• Imported functions (imports metadata for tables only)
• Template tables (creating tables)
• The datetime data type supports up to 2 sub-seconds only
• Data Integrator cannot load timestamp data into a timestamp column in a table, because Attunity truncates varchar data to 8 characters, which is not enough to correctly represent a timestamp value.

Defining a database datastore

Define at least one database datastore for each database or mainframe file system with which you are exchanging data. To define a datastore, get appropriate access privileges to the database or file system that the datastore describes. For example, to allow Data Integrator to use parameterized SQL when reading or writing to DB2 databases, authorize the user (of the datastore/database) to create, execute, and drop stored procedures. If a user is not authorized to create, execute, and drop stored procedures, jobs will still run; however, they will produce a warning message and will run less efficiently.

To define a Database datastore
1. In the Datastores tab of the object library, right-click and select New.
2. Enter the name of the new datastore in the Datastore Name field.
The name can contain any alphabetical or numeric characters or underscores (_). It cannot contain spaces.
3. Select the Datastore type.
Choose Database. When you select a Datastore Type, Data Integrator displays other options relevant to that type.
4. Select the Database type.
Choose from Attunity Connector, Data Federator, DB2, Memory, Microsoft SQL Server, MySQL, Netezza, ODBC, Oracle, Persistent Cache, Sybase ASE, Sybase IQ, or Teradata.
Note: On versions of Data Integrator prior to version 11.7.0, the correct database type to use when creating a datastore on Netezza was ODBC. Data Integrator 11.7.1 provides a specific Netezza option as the Database type instead of ODBC. When using Netezza as the database with Data Integrator, it is recommended that you choose Data Integrator's Netezza option as the Database type rather than ODBC.
5. Enter the appropriate information for the selected database type.
6. The Enable automatic data transfer check box is selected by default when you create a new datastore and you choose Database for Datastore type. This check box displays for all databases except Attunity Connector, Data Federator, Memory, and Persistent Cache.
Keep Enable automatic data transfer selected to enable transfer tables in this datastore that the Data_Transfer transform can use to push down subsequent database operations. For more information, see the Data Integrator Performance Optimization Guide.
7. At this point, you can save the datastore or add more information to it:
• To save the datastore and close the Datastore Editor, click OK.
• To add more information, select Advanced.
To enter values for each configuration option, click the cells under each configuration name. See the Data Integrator Reference Guide for a description of the options in the grid for each database.
For the datastore as a whole, the following buttons are available:
Import unsupported data types as VARCHAR of size — The data types that Data Integrator supports are documented in the Reference Guide. If you want Data Integrator to convert a data type in your source that it would not normally support, select this option and enter the number of characters that you will allow.
Edit — Opens the Configurations for Datastore dialog. Use the tool bar on this window to add, configure, and manage multiple configurations for a datastore. For more information about creating multiple configurations for a single datastore, see "Creating and managing multiple datastore configurations" on page 115.
Show ATL — Opens a text window that displays how Data Integrator will code the selections you make for this datastore in its scripting language.
OK — Saves selections and closes the Datastore Editor (Create New Datastore) window.
Cancel — Cancels selections and closes the Datastore Editor window.
Apply — Saves selections.
8. Click OK.
See "Ways of importing metadata" on page 96 for the procedures you will use to import metadata from the connected database or application.

Changing a datastore definition

Like all Data Integrator objects, datastores are defined by both options and properties:
• Options control the operation of objects. For example, the name of the database to connect to is a datastore option.
• Properties document the object. For example, the name of the datastore and the date on which it was created are datastore properties. Properties are merely descriptive of the object and do not affect its operation.

To change datastore options
1. Go to the Datastores tab in the object library.
2. Right-click the datastore name and choose Edit.
The Datastore Editor appears in the workspace (the title bar for this dialog displays Edit Datastore). You can change the connection information for the current datastore configuration, click Advanced and change properties for the current configuration, or click Edit to add, edit, or delete additional configurations. See the Data Integrator Reference Guide for a detailed description of the options on the Configurations for Datastore dialog (which opens when you select Edit in the Datastore Editor). Once you add a new configuration to an existing datastore, you can use the fields in the grid to change connection values and properties for the new configuration.
3. Click OK.
The options take effect immediately.

To change datastore properties
1. Go to the Datastores tab in the object library.
2. Right-click the datastore name and select Properties.
The Properties window opens.
3. Change the datastore properties.
The individual properties available for a datastore are described in the Data Integrator Reference Guide.
4. Click OK.

Browsing metadata through a database datastore

Data Integrator stores metadata information for all imported objects in a datastore. You can use Data Integrator to view metadata for imported or non-imported objects and to check whether the metadata has changed for objects already imported.

To view imported objects
1. Go to the Datastores tab in the object library.
2. Click the plus sign (+) next to the datastore name to view the object types in the datastore. For example, database datastores have functions, tables, and template tables.
3. Click the plus sign (+) next to an object type to view the objects of that type imported from the datastore. For example, click the plus sign (+) next to tables to view the imported tables.

To sort the list of objects
Click the column heading to sort the objects in each grouping and the groupings in each datastore alphabetically. Click again to sort in reverse-alphabetical order.

To view datastore metadata
1. Select the Datastores tab in the object library.
2. Choose a datastore, right-click, and select Open. (Alternatively, you can double-click the datastore icon.)
Data Integrator opens the datastore explorer in the workspace. The datastore explorer lists the tables in the datastore. You can view tables in the external database or tables in the internal repository. You can also search through them. For more information about the search feature, see "To import by searching" on page 99.
3. Select External metadata to view tables in the external database.
If you select one or more tables, you can right-click for further options:
• Open (only available if you select one table) — Opens the editor for the table metadata.
• Import — Imports (or re-imports) metadata from the database into the repository.
• Reconcile — Checks for differences between metadata in the database and metadata in the repository.
4. Select Repository metadata to view imported tables.
If you select one or more tables, you can right-click for further options:
• Open (only available if you select one table) — Opens the editor for the table metadata.
• Reconcile — Checks for differences between metadata in the repository and metadata in the database.
• Reimport — Reimports metadata from the database into the repository.
• Delete — Deletes the table or tables from the repository.
• Properties (only available if you select one table) — Shows the properties of the selected table.
• View Data — Opens the View Data window, which allows you to see the data currently in the table.

To determine if a schema has changed since it was imported
1. In the browser window showing the list of repository tables, select External Metadata.
2. Choose the table or tables you want to check for changes.
3. Right-click and choose Reconcile.
The Imported column displays YES to indicate that the table has been imported into the repository. The Changed column displays YES to indicate that the database tables differ from the metadata imported into Data Integrator. To use the most recent metadata from Data Integrator, reimport the table.

To browse the metadata for an external table
1. In the browser window showing the list of external tables, select the table you want to view.
2. Right-click and choose Open.
A table editor appears in the workspace and displays the schema and attributes of the table.

To view the metadata for an imported table
1. Select the table name in the list of imported tables.
2. Right-click and select Open.
A table editor appears in the workspace and displays the schema and attributes of the table.

To view secondary index information for tables
Secondary index information can help you understand the schema of an imported table.
1. From the Datastores tab in the Designer, right-click a table to open the shortcut menu.
2. From the shortcut menu, click Properties to open the Properties window.
3. In the Properties window, click the Indexes tab. The left portion of the window displays the Index list.
4. Click an index to see the contents.

Importing metadata through a database datastore

For database datastores, you can import metadata for tables and functions. This section discusses:
• Imported table information
• Imported stored function and procedure information
• Ways of importing metadata
• Reimporting objects

Imported table information

Data Integrator determines and stores a specific set of metadata information for tables. After importing metadata, you can edit column names, descriptions, and data types. The edits are propagated to all objects that call these objects.

Metadata — Description
Table name — The name of the table as it appears in the database.
Table description — The description of the table.
Column name — The name of the table column.
Column description — The description of the column.
Column data type — The data type for each column. If a column is defined as an unsupported data type, Data Integrator converts the data type to one that is supported. In some cases, if Data Integrator cannot convert the data type, it ignores the column entirely.
Primary key column — The column(s) that comprise the primary key for the table. After a table has been added to a data flow diagram, these columns are indicated in the column list by a key icon next to the column name.
Table attribute — Information Data Integrator records about the table, such as the date created and date modified, if these values are available.
Owner name — Name of the table owner. Note: The owner name for MySQL and Netezza data sources corresponds to the name of the database or schema where the table appears.

Varchar and column information from Business Objects Data Federator tables
Any decimal column imported to Data Integrator from a Business Objects Data Federator data source is converted to the decimal precision and scale (28,6). Any varchar column imported to Data Integrator from a Business Objects Data Federator data source is varchar(1024). You may change the decimal precision or scale and the varchar size within Data Integrator after importing from the Business Objects Data Federator data source.

Imported stored function and procedure information

Data Integrator can import stored procedures from DB2, MS SQL Server, Oracle, Sybase ASE, and Sybase IQ databases. You can also import stored functions and packages from Oracle. You can use these functions and procedures in the extraction specifications you give Data Integrator.

Information that is imported for functions includes:
• Function parameters
• Return type
• Name, owner

Imported functions and procedures appear on the Datastores tab of the object library. Functions and procedures appear in the Function branch of each datastore tree. You can configure imported functions and procedures through the function wizard and the smart editor in a category identified by the datastore name. For more information, see the Data Integrator Reference Guide.
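As a rough illustration only (not taken from the product documentation), an imported stored procedure can typically be invoked by its datastore- and owner-qualified name once it appears under the datastore's Function branch; every name and parameter below is hypothetical, and in practice the function wizard generates the call for you:

    # Hypothetical call to an imported Oracle stored procedure from a script;
    # the function wizard generates the qualified name and parameter list.
    $status = ORA_DS.HR_OWNER.PROC_SalesStatus('WEST', $G_LoadDate);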

Ways of importing metadata

This section discusses methods you can use to import metadata:
• To import by browsing
• To import by name
• To import by searching

To import by browsing
Note: Functions cannot be imported by browsing.
1. Open the object library.
2. Go to the Datastores tab.
3. Select the datastore you want to use.
4. Right-click and choose Open.
The items available to import through the datastore appear in the workspace. In some environments, the tables are organized and displayed as a tree structure. If this is true, there is a plus sign (+) to the left of the name. Click the plus sign to navigate the structure.
The workspace contains columns that indicate whether the table has already been imported into Data Integrator (Imported) and if the table schema has changed since it was imported (Changed). To verify whether the repository contains the most recent metadata for an object, right-click the object and choose Reconcile.
5. Select the items for which you want to import metadata.
For example, to import a table, you must select a table rather than a folder that contains tables.
6. Right-click and choose Import.
7. In the object library, go to the Datastores tab to display the list of imported objects.

To import by name
1. Open the object library.
2. Click the Datastores tab.
3. Select the datastore you want to use.
4. Right-click and choose Import By Name.
5. In the Import By Name window, choose the type of item you want to import from the Type list.
If you are importing a stored procedure, select Function.
6. Specify the items you want imported.
Note: Options vary by database type.
For tables:
• Enter a table name in the Name box to specify a particular table, or select the All check box, if available, to specify all tables. If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case.
• Enter an owner name in the Owner box to limit the specified tables to a particular owner. If you leave the owner name blank, you specify matching tables regardless of owner (that is, any table with the specified table name).
For functions and procedures:
• In the Name box, enter the name of the function or stored procedure. If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case. Otherwise, Data Integrator will convert names into all upper-case characters.
You can also enter the name of a package. An Oracle package is an encapsulated collection of related program objects (e.g., procedures, functions, variables, constants, cursors, and exceptions) stored together in the database. Data Integrator allows you to import procedures or functions created within packages and use them as top-level procedures or functions. If you enter a package name, Data Integrator imports all stored procedures and stored functions defined within the Oracle package. You cannot import an individual function or procedure defined within a package.
• Enter an owner name in the Owner box to limit the specified functions to a particular owner. If you leave the owner name blank, you specify matching functions regardless of owner (that is, any function with the specified name).
• If you are importing an Oracle function or stored procedure and any of the following conditions apply, clear the Callable from SQL expression check box. A stored procedure cannot be pushed down to a database inside another SQL statement when the stored procedure contains a DDL statement, ends the current transaction with COMMIT or ROLLBACK, or issues any ALTER SESSION or ALTER SYSTEM commands.
7. Click OK.

To import by searching
Note: Functions cannot be imported by searching.
1. Open the object library.
2. Click the Datastores tab.
3. Select the name of the datastore you want to use.
4. Right-click and select Search.
The Search window appears.
5. Enter the entire item name or some part of it in the Name text box.
If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case.
6. Select Contains or Equals from the drop-down list to the right, depending on whether you provide a complete or partial search value.
Equals qualifies only the full search string. That is, you need to search for owner.table_name rather than simply table_name.
7. (Optional) Enter a description in the Description text box.
8. Select the object type in the Type box.
9. Select the datastore in which you want to search from the Look In box.
10. Select External from the drop-down box to the right of the Look In box.
External indicates that Data Integrator searches for the item in the entire database defined by the datastore. Internal indicates that Data Integrator searches only the items that have been imported.
11. Go to the Advanced tab to search using Data Integrator attribute values.
The advanced options only apply to searches of imported items.
12. Click Search.
Data Integrator lists the tables matching your search criteria.
13. To import a table from the returned list, select the table, right-click, and choose Import.

Reimporting objects

If you have already imported an object such as a datastore, function, or table, you can reimport it, which updates the object's metadata from your database (reimporting overwrites any changes you might have made to the object in Data Integrator).

To reimport objects in previous versions of Data Integrator, you opened the datastore, viewed the repository metadata, and selected the objects to reimport. In this version of Data Integrator, you can reimport objects using the object library at various levels:
• Individual objects — Reimports the metadata for an individual object such as a table or function
• Category node level — Reimports the definitions of all objects of that type in that datastore, for example all tables in the datastore
• Datastore level — Reimports the entire datastore and all its dependent objects including tables, functions, IDOCs, and hierarchies

To reimport objects from the object library
1. In the object library, click the Datastores tab.
2. Right-click an individual object and click Reimport, or right-click a category node or datastore name and click Reimport All.
You can also select multiple individual objects using Ctrl-click or Shift-click. The Reimport dialog box opens.
3. Click Yes to reimport the metadata.
If you selected multiple objects to reimport (for example with Reimport All), Data Integrator requests confirmation for each object unless you check the box Don't ask me again for the remaining objects. You can skip objects to reimport by clicking No for that object. If you are unsure whether to reimport (and thereby overwrite) the object, click View Where Used to display where the object is currently being used in your jobs.

Memory datastores

Data Integrator also allows you to create a database datastore using Memory as the Database type. Memory datastores are designed to enhance processing performance of data flows executing in real-time jobs. Data (typically small amounts in a real-time job) is stored in memory to provide immediate access instead of going to the original source data.

A memory datastore is a container for memory tables. A datastore normally provides a connection to a database, application, or adapter. By contrast, a memory datastore contains memory table schemas saved in the repository.

Memory tables are schemas that allow you to cache intermediate data. Memory tables can cache data from relational database tables and hierarchical data files such as XML messages and SAP IDocs (both of which contain nested schemas). Memory tables can be used to:
• Move data between data flows in real-time jobs. By caching intermediate data, the performance of real-time jobs with multiple data flows is far better than it would be if files or regular tables were used to store intermediate data. For best performance, only use memory tables when processing small quantities of data.
• Store table data in memory for the duration of a job. By storing table data in memory, the LOOKUP_EXT function and other transforms and functions that do not require database operations can access data without having to read it from a remote database.

The lifetime of memory table data is the duration of the job. The data in memory tables cannot be shared between different real-time jobs. Support for the use of memory tables in batch jobs is not available.

Creating memory datastores

You can create memory datastores using the Datastore Editor window. Datastore names are appended to table names when table icons appear in the workspace, and memory tables are represented in the workspace with regular table icons. Therefore, label a memory datastore to distinguish its memory tables from regular database tables in the workspace.

To define a memory datastore
1. From the Project menu, select New > Datastore.
2. In the Name box, enter the name of the new datastore. Be sure to use the naming convention "Memory_DS".
3. In the Datastore type box, keep the default Database.
4. In the Database Type box, select Memory.
No additional attributes are required for the memory datastore.
5. Click OK.

Creating memory tables

When you create a memory table, you do not have to specify the table's schema or import the table's metadata. Instead, Data Integrator creates the schema for each memory table automatically based on the preceding schema, which can be either a schema from a relational database table or hierarchical data files such as XML messages. The first time you save the job, Data Integrator defines the memory table's schema and saves the table. Subsequently, the table appears with a table icon in the workspace and in the object library under the memory datastore.

To create a memory table
1. From the tool palette, click the template table icon.
2. Click inside a data flow to place the template table.
The Create Table window opens.
3. From the Create Table window, select the memory datastore.
4. Enter a table name.
5. If you want a system-generated row ID column in the table, click the Create Row ID check box. See "Create Row ID option" on page 104 for more information.
6. Click OK.
The memory table appears in the workspace as a template table icon.
7. Connect the memory table to the data flow as a target.
8. From the Project menu, select Save.
In the workspace, the memory table's icon changes to a target table icon and the table appears in the object library under the memory datastore's list of tables.

Using memory tables as sources and targets

After you create a memory table as a target in one data flow, you can use a memory table as a source or target in any data flow. See Chapter 10: Real-time jobs for an example of how to use memory tables as sources and targets in a job.

To use a memory table as a source or target
1. In the object library, click the Datastores tab.
2. Expand the memory datastore that contains the memory table you want to use.
3. Expand Tables.
A list of tables appears.
4. Select the memory table you want to use as a source or target, and drag it into an open data flow.
5. Connect the memory table as a source or target in the data flow.
If you are using a memory table as a target, open the memory table's target table editor to set table options. See "Memory table target options" on page 104 for more information.
6. Save the job.

Update Schema option

You might want to quickly update a memory target table's schema if the preceding schema changes. To do this, use the Update Schema option. Otherwise, you would have to add a new memory table to update a schema.

To update the schema of a memory target table
1. Right-click the memory target table's icon in the workspace.
2. Select Update Schema.
The schema of the preceding object is used to update the memory target table's schema. The current memory table is updated in your repository. All occurrences of the current memory table are updated with the new schema.

Memory table target options

The Delete data from table before loading option is available for memory table targets. The default is yes. If you deselect this option, new data will append to the existing table data.

Create Row ID option

If Create Row ID is checked in the Create Memory Table window, Data Integrator generates an integer column called DI_Row_ID in which the first row inserted gets a value of 1, the second row inserted gets a value of 2, and so on. This new column allows you to use a LOOKUP_EXT expression as an iterator in a script.

Note: The same functionality is available for other datastore types using the SQL function.

Use the DI_Row_ID column to iterate through a table using a lookup_ext function in a script. For example:

    $NumOfRows = total_rows(memory_DS..table1);
    $I = 1;
    $count = 0;
    while ($count < $NumOfRows)
    begin
        $data = lookup_ext([memory_DS..table1, 'NO_CACHE', 'MAX'],
            [A], [O], [DI_Row_ID, '=', $I]);
        $I = $I + 1;
        if ($data != NULL)
        begin
            $count = $count + 1;
        end
    end

In the preceding script, table1 is a memory table. The table's name is preceded by its datastore name (memory_DS), a dot, a blank space (where a table owner would be for a regular table), then a second dot. There are no owners for memory datastores, so tables are identified by just the datastore name and the table name as shown. Select the LOOKUP_EXT function arguments (line 7) from the function editor when you define a LOOKUP_EXT function.

The TOTAL_ROWS(DatastoreName.Owner.TableName) function returns the number of rows in a particular table in a datastore. This function can be used with any type of datastore. If used with a memory datastore, use the following syntax: TOTAL_ROWS(DatastoreName..TableName).

Data Integrator also provides a built-in function that you can use to explicitly expunge data from a memory table. This provides finer control than the active job has over your data and memory usage. The TRUNCATE_TABLE(DatastoreName..TableName) function can only be used with memory tables.

For more information about these and other Data Integrator functions, see the Data Integrator Reference Guide.

Troubleshooting memory tables
• One possible error, particularly when using memory tables, is that Data Integrator runs out of virtual memory space. If Data Integrator runs out of memory while executing any operation, Data Integrator exits.
• A validation and run time error occurs if the schema of a memory table does not match the schema of the preceding object in the data flow. To correct this error, use the Update Schema option or create a new memory table to match the schema of the preceding object in the data flow.
• Two log files contain information specific to memory tables: trace_memory_reader log and trace_memory_loader log.

Persistent cache datastores
Data Integrator also allows you to create a database datastore using Persistent cache as the Database type. Persistent cache datastores provide the following benefits for data flows that process large volumes of data.
• You can store a large amount of data in persistent cache which Data Integrator quickly loads into memory to provide immediate access during a job. For example, you can access a lookup table or comparison table locally (instead of reading from a remote database).
• You can create cache tables that multiple data flows can share (unlike a memory table which cannot be shared between different real-time jobs). For example, if a large lookup table used in a lookup_ext function rarely changes, you can create a cache once and subsequent jobs can use this cache instead of creating it each time.
A persistent cache datastore is a container for cache tables. In Data Integrator, a datastore normally provides a connection to a database, application, or adapter. By contrast, a persistent cache datastore contains cache table schemas saved in the repository.
Persistent cache tables allow you to cache large amounts of data. Persistent cache tables can cache data from relational database tables and files.
Note: You cannot cache data from hierarchical data files such as XML messages and SAP IDocs (both of which contain nested schemas).
You create a persistent cache table by loading data into the persistent cache target table using one data flow. You can then subsequently read from the cache table in another data flow. When you load data into a persistent cache table, Data Integrator always truncates and recreates the table. You cannot perform incremental inserts, deletes, or updates on a persistent cache table.

Creating persistent cache datastores
You can create persistent cache datastores using the Datastore Editor window.

To define a persistent cache datastore
1. From the Project menu, select New > Datastore.
2. In the Name box, enter the name of the new datastore. Be sure to use a naming convention such as "Persist_DS". Datastore names are appended to table names when table icons appear in the workspace. Persistent cache tables are represented in the workspace with regular table icons. Therefore, label a persistent cache datastore to distinguish its persistent cache tables from regular database tables in the workspace.
3. In the Datastore type box, keep the default Database.
4. In the Database Type box, select Persistent cache.
5. In the Cache directory box, you can either type or browse to a directory where you want to store the persistent cache.
6. Click OK.

Creating persistent cache tables
When you create a persistent cache table, you do not have to specify the table's schema or import the table's metadata. Instead, Data Integrator creates the schema for each persistent cache table automatically based on the preceding schema. The first time you save the job, Data Integrator defines the persistent cache table's schema and saves the table. Subsequently, the table appears with a table icon in the workspace and in the object library under the persistent cache datastore.

You create a persistent cache table in one of the following ways:
• As a target template table in a data flow (see "To create a persistent cache table as a target in a data flow" below)
• As part of the Data_Transfer transform during the job execution (see the Data Integrator Reference Guide)

To create a persistent cache table as a target in a data flow
1. Use one of the following methods to open the Create Template window:
• From the tool palette:
a. Click the template table icon.
b. Click inside a data flow to place the template table in the workspace.

c. On the Create Template window, select the persistent cache datastore.
• From the object library:
a. Expand a persistent cache datastore.
b. Click the template table icon and drag it to the workspace.
2. On the Create Template window, enter a table name.
3. Click OK.
The persistent cache table appears in the workspace as a template table icon.
4. Connect the persistent cache table to the data flow as a target (usually a Query transform).
5. In the Query transform, map the Schema In columns that you want to include in the persistent cache table.
6. Open the persistent cache table's target table editor to set table options. For more information, see the Data Integrator Reference Guide.
7. On the Options tab of the persistent cache target table editor, you can change the following options for the persistent cache table.
a. Column comparison — Specifies how the input columns are mapped to persistent cache table columns. There are two options:
• Compare_by_position — Data Integrator disregards the column names and maps source columns to target columns by position.
• Compare_by_name — Data Integrator maps source columns to target columns by name. This option is the default.

b. Include duplicate keys — Select this check box to cache duplicate keys. This option is selected by default.
8. On the Keys tab, specify the key column or columns to use as the key in the persistent cache table. For more information about the persistent cache table options, see the Data Integrator Reference Guide.
9. From the Project menu select Save. In the workspace, the template table's icon changes to a target table icon and the table appears in the object library under the persistent cache datastore's list of tables.

Using persistent cache tables as sources
After you create a persistent cache table as a target in one data flow, you can use the persistent cache table as a source in any data flow. You can also use it as a lookup table or comparison table. For more information, see the Data Integrator Reference Guide.
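As a rough illustration only, a lookup from a persistent cache table follows the same lookup_ext pattern shown earlier in this chapter for memory tables. All names below (Persist_DS, ORDER_STATUS, STATUS, ORDER_ID, $OrderID) are hypothetical, and the exact owner qualification and cache options for persistent cache tables are described in the Data Integrator Reference Guide:

# Sketch: look up a value from a persistent cache table in a script or
# mapping expression; every name here is an assumed example.
$Status = lookup_ext([Persist_DS..ORDER_STATUS, 'PRE_LOAD_CACHE', 'MAX'],
[STATUS], ['none'], [ORDER_ID, '=', $OrderID]);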

Linked datastores
Various database vendors support one-way communication paths from one database server to another. Oracle calls these paths database links. In DB2, the one-way communication path from a database server to another database server is provided by an information server that allows a set of servers to get data from remote data sources. In Microsoft SQL Server, linked servers provide the one-way communication path from one database server to another. These solutions allow local users to access data on a remote database, which can be on the local or a remote computer and of the same or different database type.

For example, a local Oracle database server, called Orders, can store a database link to access information in a remote Oracle database, Customers. Users connected to Customers, however, cannot use the same link to access data in Orders. Users logged into database Customers must define a separate link, stored in the data dictionary of database Customers, to access data on Orders.

Data Integrator refers to communication paths between databases as database links. The datastores in a database link relationship are called linked datastores. Data Integrator uses linked datastores to enhance its performance by pushing down operations to a target database using a target datastore. For more information, see the Data Integrator Performance Optimization Guide.

Relationship between database links and Data Integrator datastores
A database link stores information about how to connect to a remote data source, such as its host name, database name, user name, password, and database type. The same information is stored in a Data Integrator database datastore. You can associate the datastore to another datastore and then import an external database link as an option of a Data Integrator datastore. The datastores must connect to the databases defined in the database link. Additional requirements are as follows:
• A local server for database links must be a target server in Data Integrator
• A remote server for database links must be a source server in Data Integrator
• An external (exists first in a database) database link establishes the relationship between any target datastore and a source datastore
• A local datastore can be related to zero or multiple datastores using a database link for each remote database
• Two datastores can be related to each other using one link only

The following diagram shows the possible relationships between database links and linked datastores:
[Diagram: local server DB1 (read through datastore Ds1) connected to remote servers DB2 (datastore Ds2) and DB3 (datastore Ds3) by database links DBLink1, DBLink2, DBLink3, and DBLink4.]
Four database links, DBLink1 through 4, are on database DB1 and Data Integrator reads them through datastore Ds1.
• Dblink1 relates datastore Ds1 to datastore Ds2. This relationship is called linked datastore Dblink1 (the linked datastore has the same name as the external database link).
• Dblink2 is not mapped to any datastore in Data Integrator because it relates Ds1 with Ds2, which are also related by Dblink1. Although it is not a regular case, you can create multiple external database links that connect to the same remote source. However, Data Integrator allows only one database link between a target datastore and a source datastore pair. For example, if you select DBLink1 to link target datastore DS1 with source datastore DS2, you cannot import DBLink2 to do the same.
• Dblink3 is not mapped to any datastore in Data Integrator because there is no datastore defined for the remote data source to which the external database link refers.
• Dblink4 relates Ds1 with Ds3.
For information about creating a linked datastore, see the Data Integrator Reference Guide.

Adapter datastores
Depending on the adapter implementation, Data Integrator adapters allow you to:
• Browse application metadata
• Import application metadata into a Data Integrator repository

• Move batch and real-time data between Data Integrator and applications

Business Objects offers an Adapter Software Development Kit (SDK) to develop your own custom adapters. Also, you can buy Data Integrator prepackaged adapters to access application metadata and data in any application. For more information on these products, contact your Business Objects Sales Representative.

Adapters can provide access to an application's data and metadata or just metadata. For example, if the data source is SQL-compatible, the adapter might be designed to access metadata, while Data Integrator extracts data from or loads data directly to the application.

Adapters are represented in Designer by adapter datastores. Data Integrator jobs provide batch and real-time data movement between Data Integrator and applications through an adapter datastore's subordinate objects:

Subordinate Objects | Use as | For
Tables | Source or target | Batch data movement
Documents | Source or target | Batch data movement
Functions | Function call in query | Batch data movement
Message functions | Function call in query | Real-time data movement
Outbound messages | Target only | Real-time data movement

These objects are described in "Source and target objects" on page 178 and "Real-time source and target objects" on page 266.

For information about installing, configuring, and starting adapters, see the Data Integrator Getting Started Guide. For information about configuring adapter connections for a Job Server, see the Data Integrator Management Console: Administrator Guide.

Defining an adapter datastore
You need to define at least one datastore for each adapter through which you are extracting or loading data. To define a datastore, you must have appropriate access privileges to the application that the adapter serves.

To define an adapter datastore
1. In the Object Library, click to select the Datastores tab.
2. Right-click and select New. The Datastore Editor dialog opens (the title bar reads, Create new Datastore).

3. Enter a unique identifying name for the datastore. The datastore name appears in the Designer only. It can be the same as the adapter name.
4. In the Datastore type list, select Adapter.
5. Select a Job server from the list. To create an adapter datastore, you must first install the adapter on the Job Server computer, configure the Job Server to support local adapters using Data Integrator's System Manager utility, and ensure that the Job Server's service is running. Adapters residing on the Job Server computer and registered with the selected Job Server appear in the Job server list.
6. Select an adapter instance from the Adapter instance name list.
7. Enter all adapter information required to complete the datastore connection.
Note: If the developer included a description for each option, Data Integrator displays it below the grid. Also the adapter documentation should list all information required for a datastore connection.
For the datastore as a whole, the following buttons are available:
Buttons | Description
Edit | Opens the Configurations for Datastore dialog. Use the tool bar on this window to add, configure, and manage multiple configurations for a datastore.
Show ATL | Opens a text window that displays how Data Integrator will code the selections you make for this datastore in its scripting language.
OK | Saves selections and closes the Datastore Editor (Create New Datastore) window.
Cancel | Cancels selections and closes the Datastore Editor window.
Apply | Saves selections.
8. Click OK. The datastore configuration is saved in your metadata repository and the new datastore appears in the object library.
After you complete your datastore connection, you can browse and/or import metadata from the data source through the adapter.

To change an adapter datastore's configuration
1. Right-click the datastore whose configuration you want to change and select Edit to open the Datastore Editor window.
2. Edit configuration information. When editing an adapter datastore, enter or select a value. Data Integrator looks for the Job Server and adapter instance name you specify. If the Job Server and adapter instance both exist, and the Designer can communicate to get the adapter's properties, then it displays them accordingly. If the Designer cannot get the adapter's properties, then it retains the previous properties.
3. Click OK. The edited datastore configuration is saved in your metadata repository.

To delete an adapter datastore and associated metadata objects
1. Right-click the datastore you want to delete and select Delete.
2. Click OK in the confirmation window. Data Integrator removes the datastore and all metadata objects contained within that datastore from the metadata repository. If these objects exist in established flows, they appear with a deleted icon.

Browsing metadata through an adapter datastore
The metadata you can browse depends on the specific adapter.

To browse application metadata
1. Right-click the datastore you want to browse and select Open. A window opens showing source metadata.
2. Scroll to view metadata name and description attributes.
3. Click plus signs [+] to expand objects and view subordinate objects.
4. Right-click any object to check importability.

Importing metadata through an adapter datastore
The metadata you can import depends on the specific adapter. After importing metadata, you can edit it. Your edits propagate to all objects that call these objects.

To import application metadata while browsing
1. Right-click the datastore you want to browse, then select Open.
2. Find the metadata object you want to import from the browsable list.
3. Right-click the object and select Import.
4. The object is imported into one of the adapter datastore containers (documents, functions, tables, outbound messages, or message functions).

To import application metadata by name
1. Right-click the datastore from which you want metadata, then select Import by name. The Import by name window appears containing import parameters with corresponding text boxes.
2. Click each import parameter text box and enter specific information related to the object you want to import.
3. Click OK. Any object(s) matching your parameter constraints are imported to one of the corresponding Data Integrator categories specified under the datastore.

Creating and managing multiple datastore configurations
Creating multiple configurations for a single datastore allows you to consolidate separate datastore connections for similar sources or targets into one source or target datastore with multiple configurations. Then, you can select a set of configurations that includes the sources and targets you want by selecting a system configuration when you execute or schedule the job. The ability to create multiple datastore configurations provides greater ease-of-use for job portability scenarios, such as:
• OEM (different databases for design and distribution)
• Migration (different connections for DEV, TEST, and PROD)
• Multi-instance (databases with different versions or locales)
• Multi-user (databases for central and local repositories)
For more information about how to use multiple datastores to support these scenarios, see "Portability solutions" on page 120.
This section covers the following topics:
• Definitions

• Why use multiple datastore configurations?
• Creating a new configuration
• Adding a datastore alias
• Portability solutions
• Job portability tips
• Renaming table and function owner
• Defining a system configuration

Definitions
Refer to the following terms when creating and managing multiple datastore configurations:
Datastore configuration — Allows you to provide multiple metadata sources or targets for datastores. Each configuration is a property of a datastore that refers to a set of configurable options (such as database connection name, database type, user name, password, and locale) and their values.
Default datastore configuration — The datastore configuration that Data Integrator uses for browsing and importing database objects (tables and functions) and executing jobs if no system configuration is specified. If a datastore has more than one configuration, select a default configuration, as needed. If a datastore has only one configuration, Data Integrator uses it as the default configuration.
Current datastore configuration — The datastore configuration that Data Integrator uses to execute a job. If you define a system configuration, Data Integrator will execute the job using the system configuration. Specify a current configuration for each system configuration. If you do not create a system configuration, or the system configuration does not specify a configuration for a datastore, Data Integrator uses the default datastore configuration as the current configuration at job execution time.
Database objects — The tables and functions that are imported from a datastore. Database objects usually have owners. Some database objects do not have owners. For example, database objects in an ODBC datastore connecting to an Access database do not have owners.
Owner name — Owner name of a database object (for example, a table) in an underlying database. Also known as database owner name or physical owner name.
Alias — A logical owner name. Create an alias for objects that are in different database environments if you have different owner names in those environments. You can create an alias from the datastore editor for any datastore configuration.

Dependent objects — Dependent objects are the jobs, work flows, data flows, and custom functions in which a database object is used. Dependent object information is generated by the where-used utility.

Why use multiple datastore configurations?
By creating multiple datastore configurations, you can decrease end-to-end development time in a multi-source, 24x7, enterprise data warehouse environment because you can easily port jobs among different database types, versions, and instances. For example, porting can be as simple as:
1. Creating a new configuration within an existing source or target datastore.
2. Adding a datastore alias then map configurations with different object owner names to it.
3. Defining a system configuration then adding datastore configurations required for a particular environment. Select a system configuration when you execute a job.

Creating a new configuration
You can create multiple configurations for all datastore types except memory datastores. Use the Datastore Editor to create and edit datastore configurations. For Datastore Editor details, see the Data Integrator Reference Guide.

To create a new datastore configuration
1. From the Datastores tab of the object library, right-click any existing datastore and select Edit.
2. Click Advanced to view existing configuration information. Each datastore must have at least one configuration. If only one configuration exists, it is the default configuration.
3. Click Edit to open the Configurations for Datastore window.
4. Click the Create New Configuration icon on the toolbar. The Create New Configuration window opens.
5. In the Create New Configuration window:
a. Enter a unique, logical configuration Name.
b. Select a Database type from the drop-down menu.
c. Select a Database version from the drop-down menu.

d. In the Values for table targets and SQL transforms section, Data Integrator pre-selects the Use values from value based on the existing database type and version. The Designer automatically uses the existing SQL transform and target values for the same database type and version. Further, if the database you want to associate with a new configuration is a later version than that associated with other existing configurations, the Designer automatically populates the Use values from with the earlier version. However, if database type and version are not already specified in an existing configuration, or if the database version is older than your existing configuration, you can choose to use the values from another existing configuration or the default for the database type and version.
e. Select or deselect the Restore values if they already exist option. When you delete datastore configurations, Data Integrator saves all associated target values and SQL transforms. If you create a new datastore configuration with the same database type and version as the one previously deleted, the Restore values if they already exist option allows you to access and take advantage of the saved value settings.
• If you keep this option (selected as default) Data Integrator uses customized target and SQL transform values from previously deleted datastore configurations.
• If you deselect Restore values if they already exist, Data Integrator does not attempt to restore target and SQL transform values, allowing you to provide new values.
f. Click OK to save the new configuration. If your datastore contains pre-existing data flows with SQL transforms or target objects, Data Integrator must add any new database type and version values to these transform and target objects. Under these circumstances, when you add a new datastore configuration, Data Integrator displays the Added New Values - Modified Objects window which provides detailed information about affected data flows and modified objects. These same results also display in the Output window of the Designer.
For each datastore, Data Integrator requires that one configuration be designated as the default configuration. Data Integrator uses the default configuration to import metadata and also preserves the default configuration during export and multi-user operations. Your first datastore configuration is automatically designated as the default; however, after adding one or more additional datastore configurations, you can use the datastore editor to flag a different configuration as the default.

When you export a repository, Data Integrator preserves all configurations in all datastores including related SQL transform text and target table editor settings. If the datastore you are exporting already exists in the target repository, Data Integrator overrides configurations in the target with source configurations. Data Integrator exports system configurations separate from other job related objects.

Adding a datastore alias
From the datastore editor, you can also create multiple aliases for a datastore then map datastore configurations to each alias.

To create an alias
1. From within the datastore editor, click Advanced, then click Aliases (Click here to create). The Create New Alias window opens.
2. Under Alias Name in Designer, use only alphanumeric characters and the underscore symbol (_) to enter an alias name.
3. Click OK. The Create New Alias window closes and your new alias appears underneath the Aliases category.
When you define a datastore alias, Data Integrator substitutes your specified datastore configuration alias for the real owner name when you import metadata for database objects. You can also rename tables and functions after you import them. For more information, see "Renaming table and function owner" on page 126.

Data Integrator provides six functions that are useful when working with multiple source and target datastore configurations:

Function | Category | Description
db_type | Miscellaneous | Returns the database type of the current datastore configuration.
db_version | Miscellaneous | Returns the database version of the current datastore configuration.
db_database_name | Miscellaneous | Returns the database name of the current datastore configuration if the database type is MS SQL Server or Sybase ASE.

db_owner | Miscellaneous | Returns the real owner name that corresponds to the given alias name under the current datastore configuration.
current_configuration | Miscellaneous | Returns the name of the datastore configuration that is in use at runtime.
current_system_configuration | Miscellaneous | Returns the name of the current system configuration. If no system configuration is defined, returns a NULL value.

Data Integrator links any SQL transform and target table editor settings used in a data flow to datastore configurations. For more information, see "SQL" on page 355 and the Data Integrator Reference Guide. You can also use variable interpolation in SQL text with these functions to enable a SQL transform to perform successfully regardless of which configuration the Job Server uses at job execution time.
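For instance, a script might resolve the owner alias for whichever configuration is current and interpolate it into the SQL text passed to the sql() function. The datastore name (ORDERS_DS), alias (ALIAS1), and table (ORDERS) below are hypothetical; this is a sketch, not an excerpt from the product documentation:

# Sketch: resolve the owner name for the current configuration, then
# interpolate it into SQL text; all object names are assumed examples.
$Owner = db_owner('ORDERS_DS', 'ALIAS1');
print('Current configuration: ' || current_configuration('ORDERS_DS') || ' on ' || db_type('ORDERS_DS'));
$RowCount = sql('ORDERS_DS', 'SELECT COUNT(*) FROM [$Owner].ORDERS');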

Use the Administrator to select a system configuration as well as view the underlying datastore configuration associated with it when you:
• Execute batch jobs
• Schedule batch jobs
• View batch job history
• Create real-time jobs
For more information, see the Data Integrator Management Console: Administrator Guide.

Portability solutions
Set multiple source or target configurations for a single datastore if you want to quickly change connections to a different source or target database. To use multiple configurations successfully, design your jobs so that you do not need to change schemas, data types, functions, variables, and so on when you switch between datastore configurations. For example, if you have a datastore with a configuration for Oracle sources and SQL sources, make sure that the table metadata schemas match exactly. Use the same table names, alias names, number and order of columns, as well as the same column names and data types. For more information, see "Job portability tips" on page 125.

Data Integrator provides several different solutions for porting jobs:
• Migration between environments
• Multiple instances
• OEM deployment
• Multi-user development
For more information on Data Integrator migration and multi-user development, see the Data Integrator Advanced Development and Migration Guide.

Migration between environments
When you must move repository metadata to another environment (for example from development to test or from test to production) which uses different source and target databases, the process typically includes the following characteristics:
• The environments use the same database type but may have unique database versions or locales.
• Database objects (tables and functions) can belong to different owners.
• Each environment has a unique database connection name, user name, password, other connection properties, and owner mapping.
• You use a typical repository migration procedure. Either you export jobs to an ATL file then import the ATL file to another repository, or you export jobs directly from one repository to another repository.
Because Data Integrator overwrites datastore configurations during export, you should add configurations for the target environment (for example, add configurations for the test environment when migrating from development to test) to the source repository (for example, add to the development repository before migrating to the test environment). The Export utility saves additional configurations in the target environment, which means that you do not have to edit datastores before running ported jobs in the target environment.
This solution offers the following advantages:
• Minimal production down time: You can start jobs as soon as you export them.
• Minimal security issues: Testers and operators in production do not need permission to modify repository objects.

Multiple instances
If you must load multiple instances of a data source to a target data warehouse, the task is the same as in a migration scenario except that you are using only one Data Integrator repository.

To load multiple instances of a data source to a target data warehouse
1. Create a datastore that connects to a particular instance.

2. Define the first datastore configuration. This datastore configuration contains all configurable properties such as database type, database connection name, user name, password, database version, and locale information. When you define a configuration for an Adapter datastore, make sure that the relevant Job Server is running so the Designer can find all available adapter instances for the datastore.
3. Define a set of alias-to-owner mappings within the datastore configuration. When you use an alias for a configuration, Data Integrator imports all objects using the metadata alias rather than using real owner names. This allows you to use database objects for jobs that are transparent to other database instances.
4. Use the database object owner renaming tool to rename owners of any existing database objects. (See "Renaming table and function owner" on page 126 for details.)
5. Import database objects and develop jobs using those objects, then run the jobs.
6. To support executing jobs under different instances, add datastore configurations for each additional instance.
7. Map owner names from the new database instance configurations to the aliases that you defined in step 3.
8. Run the jobs in all database instances.

OEM deployment
If you design jobs for one database type and deploy those jobs to other database types as an OEM partner, the deployment typically has the following characteristics:
• The instances require various source database types and versions.
• Since a datastore can only access one instance at a time, you may need to trigger functions at run-time to match different instances. If this is the case, Data Integrator requires different SQL text for functions (such as lookup_ext and sql) and transforms (such as the SQL transform). Data Integrator also requires different settings for the target table (configurable in the target table editor).
• The instances may use different locales.
• Database tables across different databases belong to different owners.
• Each instance has a unique database connection name, user name, password, other connection properties, and owner mappings.
• You export jobs to ATL files for deployment.

To deploy jobs to other database types as an OEM partner
1. Develop jobs for a particular database type following the steps described in the Multiple instances scenario. To support a new instance under a new database type, Data Integrator copies target table and SQL transform database properties from the previous configuration to each additional configuration when you save it.
If you selected a bulk loader method for one or more target tables within your job's data flows, and new configurations apply to different database types, open your targets and manually set the bulk loader option (assuming you still want to use the bulk loader method with the new database type). Data Integrator does not copy bulk loader options for targets from one database type to another. When Data Integrator saves a new configuration it also generates a report that provides a list of targets automatically set for bulk loading. Reference this report to make manual changes as needed.
2. If the SQL text in any SQL transform is not applicable for the new database type, modify the SQL text for the new database type. If the SQL text contains any hard-coded owner names or database names, consider replacing these names with variables to supply owner names or database names for multiple database types. This way, you will not have to modify the SQL text for each environment.
3. Because Data Integrator does not support unique SQL text for each database type or version of the sql(), lookup_ext(), and pushdown_sql() functions, use the db_type() and similar functions to get the database type and version of the current datastore configuration and provide the correct SQL text for that database type and version using the variable substitution (interpolation) technique. For an example of how to apply the db_type and SQL functions within an interpolation script, see the Data Integrator Reference Guide.
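The Reference Guide contains the complete interpolation example; the fragment below is only a rough sketch of the idea. The datastore name DW_DS and the SQL statements are assumed, and the exact strings that db_type returns are listed in the Reference Guide:

# Sketch: pick database-specific SQL text at run time and interpolate it
# into a sql() call; DW_DS and the type string are assumed examples.
if (db_type('DW_DS') = 'Oracle')
begin
$SqlText = 'SELECT SYSDATE FROM DUAL';
end
else
begin
$SqlText = 'SELECT GETDATE()';
end
$LoadDate = sql('DW_DS', '[$SqlText]');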

Multi-user development
If you are using a central repository management system, allowing multiple developers, each with their own local repository, to check in and check out jobs, the development environment typically has the following characteristics:
• It has a central repository and a number of local repositories.
• Multiple development environments get merged (via central repository operations such as check in and check out) at times. When this occurs, Data Integrator preserves object history (versions and labels).
• The instances share the same database type but may have different versions and locales.
• Each instance has a unique database connection name, user name, password, other connection properties, and owner mapping.
• Database objects may belong to different owners.
In the multi-user development scenario you must define aliases so that Data Integrator can properly preserve the history for all objects in the shared environment.

When porting jobs in a multi-user environment
1. Use the Renaming table and function owner feature to consolidate database object owner names into aliases. Renaming occurs in local repositories. To rename the database objects stored in the central repository, check out the datastore to a local repository and apply the renaming tool in the local repository. If the objects to be renamed have dependent objects, Data Integrator will ask you to check out the dependent objects. If all the dependent objects can be checked out, renaming will create a new object that has the alias and delete the original object that has the original owner name. If all the dependent objects cannot be checked out (data flows are checked out by another user), Data Integrator displays a message, which gives you the option to proceed or cancel the operation. If you cannot check out some of the dependent objects, the renaming tool only affects the flows that you can check out. After renaming, the original object will co-exist with the new object. The number of flows affected by the renaming process will affect the Usage Count and Where-Used information in the Designer for both the original object and the new object.
2. You are responsible for checking in all the dependent objects that were checked out during the owner renaming process. Checking in the new objects does not automatically check in the dependent objects that were checked out. Data Integrator does not delete original objects from the central repository when you check in the new objects.
3. Use caution because checking in datastores and checking them out as multi-user operations can override datastore configurations. Maintain the datastore configurations of all users by not overriding the configurations they created. Instead, add a configuration and make it your default configuration while working in your own environment.

When your group completes the development phase, Business Objects recommends that the last developer delete the configurations that apply to the development environments and add the configurations that apply to the test or production environments.

Job portability tips
• Data Integrator assumes that the metadata of a table or function is the same across different database types and versions specified in different configurations in the same datastore. For instance, if you import a table when the default configuration of the datastore is Oracle, then later use the table in a job to extract from DB2, your job will run.
• Import metadata for a database object using the default configuration and use that same metadata with all configurations defined in the same datastore.
• Data Integrator supports options in some database types or versions that it does not support in others. For example, Data Integrator supports parallel reading on Oracle hash-partitioned tables, not on DB2 or other database hash-partitioned tables. If you import an Oracle hash-partitioned table and set your data flow to run in parallel, Data Integrator will read from each partition in parallel. However, when you run your job using sources from a DB2 environment, parallel reading will not occur.
• The following Data Integrator features support job portability:
• Enhanced SQL transform
With the enhanced SQL transform, you can enter different SQL text for different database types/versions and use variable substitution in the SQL text to allow Data Integrator to read the correct text for its associated datastore configuration.
• Enhanced target table editor
Using enhanced target table editor options, you can configure database table targets for different database types/versions to match their datastore configurations.
• Enhanced datastore editor
Using the enhanced datastore editor, when you create a new datastore configuration you can choose to copy the database properties (including the datastore and table target options as well as the SQL transform text) from an existing configuration or use the current values.

• When you design a job that will be run from different database types or versions, name database tables, functions, and stored procedures the same for all sources. If you create configurations for both case-insensitive databases and case-sensitive databases in the same datastore, Business Objects recommends that you name the tables, functions, and stored procedures using all upper-case characters.
• Table schemas should match across the databases in a datastore. This means the number of columns, the column names, and column positions should be exactly the same. The column data types should be the same or compatible. For example, if you have a VARCHAR column in an Oracle source, use a VARCHAR column in the Microsoft SQL Server source too. If you have a DATE column in an Oracle source, use a DATETIME column in the Microsoft SQL Server source. Define primary and foreign keys the same way.
• Stored procedure schemas should match. When you import a stored procedure from one datastore configuration and try to use it for another datastore configuration, Data Integrator assumes that the signature of the stored procedure is exactly the same for the two databases. For example, if a stored procedure is a stored function (only Oracle supports stored functions), then you have to use it as a function with all other configurations in a datastore (in other words, all databases must be Oracle). If your stored procedure has three parameters in one database, it should have exactly three parameters in the other databases. Further, the names, positions, data types, and in/out types of the parameters must match exactly.

To learn more about migrating Data Integrator projects, see the Data Integrator Advanced Development and Migration Guide.

Renaming table and function owner
Data Integrator allows you to rename the owner of imported tables, template tables, or functions. This process is called owner renaming. Use owner renaming to assign a single metadata alias instead of the real owner name for database objects in the datastore. Consolidating metadata under a single alias name allows you to access accurate and consistent dependency information at any time while also allowing you to more easily switch between configurations when you move jobs to different environments.

When using objects stored in a central repository, a shared alias makes it easy to track objects checked in by multiple users. If all users of local repositories use the same alias, Data Integrator can track dependencies for objects that your team checks in and out of the central repository.

To rename the owner of a table or function
1. From the Datastore tab of the local object library, expand a table, template table, or function category.
2. Right-click the table or function and select Rename Owner. The Rename Owner window opens.
3. Enter a New Owner Name then click Rename. When you enter a New Owner Name, Data Integrator uses it as a metadata alias for the table or function.
Note: If the object you are renaming already exists in the datastore, Data Integrator determines whether the two objects have the same schema. If they are the same, then Data Integrator proceeds. If they are different, then Data Integrator displays a message to that effect. You may need to choose a different object name.

When you rename an owner, the instances of a table or function in a data flow are affected, not the datastore from which they were imported.

Data Integrator supports both case-sensitive and case-insensitive owner renaming.
• If the objects you want to rename are from a case-sensitive database, the owner renaming mechanism preserves case sensitivity.
• If the objects you want to rename are from a datastore that contains both case-sensitive and case-insensitive databases, Data Integrator will base the case-sensitivity of new owner names on the case sensitivity of the default configuration. To ensure that all objects are portable across all configurations in this scenario, enter all owner names and object names using uppercase characters.

During the owner renaming process:
• Data Integrator updates the dependent objects (jobs, work flows, and data flows that use the renamed object) to use the new owner name.
• The object library shows the entry of the object with the new owner name. Displayed Usage Count and Where-Used information reflect the number of updated dependent objects.

• If Data Integrator successfully updates all the dependent objects, it deletes the metadata for the object with the original owner name from the object library and the repository.

Using an alias for all objects stored in a central repository allows Data Integrator to track all objects checked in by multiple users. If all local repository users use the same alias, Data Integrator can track dependencies for objects that your team checks in and out of the central repository.

Using the Rename window in a multi-user scenario
This section provides a detailed description of Rename Owner window behavior in a multi-user scenario. When you are checking objects in and out of a central repository, there are several behaviors possible when you select the Rename button, depending upon the check out state of a renamed object and whether that object is associated with any dependent objects.
• Case 1: Object is not checked out, and object has no dependent objects in the local or central repository.
Behavior: When you click Rename, Data Integrator renames the object owner.
• Case 2: Object is checked out, and object has no dependent objects in the local or central repository.
Behavior: Same as Case 1.
• Case 3: Object is not checked out, and object has one or more dependent objects (in the local repository).
Behavior: When you click Rename, Data Integrator displays a second window listing the dependent objects (that use or refer to the renamed object).

If you click Continue, Data Integrator renames the objects and modifies the dependent objects to refer to the renamed object using the new owner name. If you click Cancel, the Designer returns to the Rename Owner window.
Note: An object may still have one or more dependent objects in the central repository. However, if the object to be renamed is not checked out, the Rename Owner mechanism (by design) does not affect the dependent objects in the central repository.
• Case 4: Object is checked out and has one or more dependent objects.
Behavior: This case contains some complexity.
• If you are not connected to the central repository, the status message reads: "This object is checked out from central repository "X". Please select Tools | Central Repository… to activate that repository before renaming."
• If you are connected to the central repository, the Rename Owner window opens. When you click Rename, a second window opens to display the dependent objects and a status indicating their check-out state and location. If a dependent object is located in the local repository only, the status message reads: "Used only in local repository. No check out necessary."

• If the dependent object is in the central repository, and it is not checked out, the status message reads: "Not checked out."
• If you have the dependent object checked out or it is checked out by another user, the status message shows the name of the checked out repository. For example: "Oracle.production.user1"
The window with dependent objects looks like this:
[Screenshot: the second Rename Owner window listing dependent objects and their check-out status.]
As in Case 3, the purpose of this second window is to show the dependent objects. In addition, this window allows you to check out the necessary dependent objects from the central repository, without having to go to the Central Object Library window.
Click the Refresh List button to update the check out status in the list. This is useful when Data Integrator identifies a dependent object in the central repository but another user has it checked out. When that user checks in the dependent object, click Refresh List to update the status and verify that the dependent object is no longer checked out.
To use the Rename Owner feature to its best advantage, check out associated dependent objects from the central repository. This helps avoid having dependent objects that refer to objects with owner names that do not exist. From the central repository, select one or more objects, then right-click and select Check Out. After you check out the dependent object, the Designer updates the status. If the check out was successful, the status shows the name of the local repository.

• Case 4a: You click Continue, and all dependent objects are checked out from the central repository.
Data Integrator renames the owner of the selected object, and modifies all dependent objects to refer to the new owner name. Although to you, it looks as if the original object has a new owner name, in reality Data Integrator has not modified the original object; it created a new object identical to the original, but uses the new owner name. The original object with the old owner name still exists. Data Integrator then performs an "undo checkout" on the original object. It becomes your responsibility to check in the renamed object.
• Case 4b: You click Continue, but one or more dependent objects are not checked out from the central repository.
In this situation, Data Integrator displays another dialog box that warns you about objects not yet checked out and to confirm your desire to continue. Click No to return to the previous dialog box showing the dependent objects. Click Yes to proceed with renaming the selected object and to edit its dependent objects. Data Integrator modifies objects that are not checked out in the local repository to refer to the new owner name. It is your responsibility to maintain consistency with the objects in the central repository.
When the rename operation is successful, in the Datastore tab of the local object library, Data Integrator updates the table or function with the new owner name and the Output window displays the following message:
Object <Object_Name>: owner name <Old_Owner> successfully renamed to <New_Owner>, including references from dependent objects.
If Data Integrator does not successfully rename the owner, the Output window displays the following message:
Object <Object_Name>: Owner name <Old_Owner> could not be renamed to <New_Owner>.

Defining a system configuration
What is the difference between datastore configurations and system configurations?
• Datastore configurations — Each datastore configuration defines a connection to a particular database from a single datastore.

• System configurations — Each system configuration defines a set of datastore configurations that you want to use together when running a job. You can define a system configuration if your repository contains at least one datastore with multiple configurations. Select a system configuration to use at run-time.
Create datastore configurations using the Datastore Editor. (See "Creating a new configuration" on page 117 for details.) Datastore configurations are part of a datastore object. Create datastore configurations for the datastores in your repository before you create system configurations to organize and associate them.
When designing jobs, determine and create datastore configurations and system configurations depending on your business environment and rules. In many enterprises, a job designer defines the required datastore and system configurations then a system administrator determines which system configuration to use when scheduling or starting a job.
Data Integrator maintains system configurations separate from jobs. Data Integrator includes datastore configurations when you import or export a job. Similarly, when you check in or check out a datastore to a central repository (in a multi-user design environment), Data Integrator also checks in or checks out the corresponding datastore configurations. You cannot check in or check out system configurations in a multi-user environment. However, you can export system configurations to a separate flat file which you can later import. By maintaining system configurations in a separate file, you avoid modifying your datastore each time you import or export a job.

To create a system configuration
1. In the Designer, select Tools > System Configurations. The System Configuration Editor window opens. To use this window:
a. Use the first column to the left (Configuration name) to list system configuration names. Enter a system configuration name in the first column. The other columns indicate the name of a datastore containing multiple configurations.

b. Select a list box under any of the datastore columns and click the down-arrow to view a list of available configurations in that datastore. Click to select a datastore configuration in the list.
c. Right-click the gray box at the beginning of a row to use the Cut, Copy, Paste, and Delete commands for that row.
2. Click a blank cell under Configuration name and enter a new, unique system configuration name. Or, click to select an existing system configuration name and enter a new name to change it. Business Objects recommends that you use the SC_ prefix in each system configuration name so that you can easily identify this file as a system configuration, particularly when exporting.
3. Under each listed datastore, select the datastore configuration you want to use when you run a job using the associated system configuration. If you do not map a datastore configuration to a system configuration, the Job Server uses the default datastore configuration at run-time.
4. Click OK to save your system configuration settings.

To export a system configuration
1. In the object library, right-click a datastore and select Repository > Export system configuration. Business Objects recommends that you add the SC_ prefix to each exported system configuration .atl file to easily identify that file as a system configuration.
2. Click OK.
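As a purely illustrative sketch (the datastore and configuration names below are hypothetical, not taken from this guide), a repository containing a datastore ODS_DS with two configurations, Config_Dev and Config_Prod, might define two system configurations:
SC_Dev maps ODS_DS to Config_Dev
SC_Prod maps ODS_DS to Config_Prod
At run-time an administrator then selects SC_Dev or SC_Prod when scheduling or starting the job, so the same job definition runs against the development or the production database without changing the datastore or the job itself.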


File Formats

About this chapter
This chapter contains the following topics:
• What are file formats?
• File format editor
• Creating file formats
• Editing file formats
• File format features
• Creating COBOL copybook file formats
• File transfers
• Web log support
For full details of file format properties, see the Data Integrator Reference Guide.

What are file formats?
A file format is a set of properties describing the structure of a flat file (ASCII). File formats describe the metadata structure. Data Integrator can use data stored in files for data sources and targets. A file format defines a connection to a file. Therefore, you use a file format to connect Data Integrator to source or target data when the data is stored in a file rather than a database table. The object library stores file format templates that you use to define specific file formats as sources and targets in data flows.
When working with file formats, you must:
• Create a file format template that defines the structure for a file. A file format template is a generic description that can be used for many data files.
• Create a specific source or target file format in a data flow. The source or target file format is based on a template and specifies connection information such as the file name. A file format describes a specific file.
File format objects can describe files in:
• Delimited format — Characters such as commas or tabs separate each field
• Fixed width format — The column width is specified by the user
• SAP R/3 format — For details, see the Data Integrator Supplement for SAP

File format editor
Use the file format editor to set properties for file format templates and source and target file formats. Available properties vary by the mode of the file format editor:
• New mode — Create a new file format template
• Edit mode — Edit an existing file format template
• Source mode — Edit the file format of a particular source file
• Target mode — Edit the file format of a particular target file
The file format editor has three work areas:
• Properties-Values — Edit the values for file format properties. Expand and collapse the property groups by clicking the leading plus or minus.
• Column Attributes — Edit and define the columns or fields in the file. Field-specific formats override the default format set in the Properties-Values area.
• Data Preview — View how the settings affect sample data.
The file format editor contains "splitter" bars to allow resizing of the window and all the work areas. You can expand the file format editor to the full screen size. The properties and appearance of the work areas vary with the format of the file.

(The file format editor window shows the Properties-Values, Column Attributes, and Data Preview work areas, separated by splitter bars.)
For more information about the properties in the file format editor, see the Data Integrator Reference Guide.
You can navigate within the file format editor as follows:
• Switch between work areas using the Tab key.
• Navigate through fields in the Data Preview area with the Page Up, Page Down, and arrow keys.

• Open a drop-down menu in the Properties-Values area by pressing the ALT-down arrow key combination.
• When the file format type is fixed-width, you can also edit the column metadata structure in the Data Preview area.
Note: The Show ATL button displays a view-only copy of the Transformation Language file generated for your file format. You might be directed to use this by Business Objects Technical Support.

Creating file formats
To specify a source or target file:
• Create a file format template that defines the structure for a file.
• Create a specific source or target file format in a data flow. When you drag and drop a file format into a data flow, the format represents a file that is based on the template and specifies connection information such as the file name.
Create a file format template using any of the following methods:
• Creating a new file format
• Modeling a file format on a sample file
• Replicating and renaming file formats
• Creating a file format from an existing flat table schema
To use a file format to create a metadata file, see "To create a specific source or target file" on page 148.

Creating a new file format
To create a new file format
1. In the local object library, go to the Formats tab, right-click Flat Files, and select New.

The file format editor opens.
2. In Type, specify the file type:
• Delimited — Select Delimited if the file uses a character sequence to separate columns
• Fixed width — Select Fixed width if the file uses specified widths for each column
Note: Data Integrator represents column sizes (field-size) in number of characters for all sources except fixed-width file formats, which it always represents in bytes. Consequently, if a fixed-width file format uses a multi-byte code page, then no data is displayed in the data preview section of the file format editor for its files. For more information about multi-byte support, see the Data Integrator Reference Guide.
3. In Name, enter a name that describes this file format template. After you save this file format template, you cannot change the name.
4. If you want to read and load files using a third-party file-transfer program, select YES for Custom transfer program and see "File transfers" on page 160.

5. Complete the other properties to describe files that this template represents. Look for properties available when the file format editor is in new mode. Properties vary by file type. All properties are described in the Data Integrator Reference Guide.
6. For source files, specify the structure of the columns in the Column Attributes work area:
a. Enter field name.
b. Set data types.
c. Enter field lengths for VarChar data types.
d. Enter scale and precision information for Numeric and Decimal data types.
e. Enter Format field information for appropriate data types, if desired. This information overrides the default format set in the Properties-Values area for that data type.
Note:
• You do not need to specify columns for files used as targets. If you do specify columns and they do not match the output schema from the preceding transform, Data Integrator writes to the target file using the transform's output schema.
• For a decimal or real data type, if you only specify a source column format, and the column names and data types in the target schema do not match those in the source schema, Data Integrator cannot use the source column format specified. Instead, it defaults to the format used by the code page on the computer where the Job Server is installed.
You can model a file format on a sample file. See "To model a file format on a sample file" on page 143.
7. Click Save & Close to save the file format template and close the file format editor.
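As an illustration only (the field names, data types, and format value below are hypothetical and not from this guide), the Column Attributes work area for a delimited source might be filled in along these lines:
Field Name      Data Type    Field Size    Format
CUST_ID         int
CUST_NAME       varchar      50
ORDER_DATE      date                       dd.mm.yyyy
Here the Format value for ORDER_DATE overrides whatever default date format is set in the Properties-Values area, as described in step 6e above.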

Modeling a file format on a sample file
To model a file format on a sample file
1. From the Formats tab in the local object library, create a new flat file format template or edit an existing flat file format template.
2. Under Data File(s):
• If the sample file is on your Designer computer, set Location to Local. Browse to set the Root directory and File(s) to specify the sample file.
Note: During design, you can specify a file located on the computer where the Designer runs or on the computer where the Job Server runs. Indicate the file location in the Location property. During execution, you must specify a file located on the Job Server computer that will execute the job.
• If the sample file is on the current Job Server computer, set Location to Job Server. Enter the Root directory and File(s) to specify the sample file. When you select Job Server, the Browse icon is disabled, so you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it. For example, a path on UNIX might be /usr/data/abc.txt. A path on Windows might be C:\DATA\abc.txt.
Note: In the Windows operating system, files are not case sensitive; however, file names are case sensitive in the UNIX environment. (For example, abc.txt and aBc.txt would be two different files in the same UNIX directory.) You cannot use the Windows Explorer to determine the exact file location on Windows. To reduce the risk of typing errors, you can telnet to the Job Server (UNIX or Windows) computer and find the full path name of the file you want to use. Then, copy and paste the path name from the telnet application directly into the Root directory text box in the file format editor.
3. If the file type is delimited, set the appropriate column delimiter for the sample file.
4. Under Input/Output, set Skip row header to Yes if you want to use the first row in the file to designate field names.
The file format editor will show the column names in the Data Preview area and create the metadata structure automatically.
5. Edit the metadata structure as needed.

For both delimited and fixed-width files, you can edit the metadata structure in the Column Attributes work area:
a. Right-click to insert or delete fields.
b. Rename fields.
c. Set data types.
d. Enter field lengths for the VarChar data type.
e. Enter scale and precision information for Numeric and Decimal data types.
f. Enter Format field information for appropriate data types, if desired. This format information overrides the default format set in the Properties-Values area for that data type.
For fixed-width files, you can also edit the metadata structure in the Data Preview area:
a. Click to select and highlight columns.
b. Right-click to insert or delete fields.
6. Click Save & Close to save the file format template and close the file format editor.

Replicating and renaming file formats
After you create one file format schema, you can quickly create another file format object with the same schema by replicating the existing file format and renaming it. To save time in creating file format objects, replicate and rename instead of configuring from scratch.

To create a file format from an existing file format
1. In the Formats tab of the object library, right-click an existing file format and choose Replicate from the menu.
The File Format Editor opens, displaying the schema of the copied file format.

2. Double-click to select the Name property value (which contains the same name as the original file format object).
3. Type a new, unique name for the replicated file format.
Note: You must enter a new name for the replicated file. Data Integrator does not allow you to save the replicated file with the same name as the original (or any other existing File Format object). Also, this is your only opportunity to modify the Name property value. Once saved, you cannot modify the name again.
4. Edit other properties as desired. Look for properties available when the file format editor is in new mode. Properties are described in the Data Integrator Reference Guide.
5. To save and view your new file format schema, click Save. To terminate the replication process (even after you have changed the name and clicked Save), click Cancel or press the Esc button on your keyboard.
6. Click Save & Close.

Creating a file format from an existing flat table schema
To create a file format from an existing flat table schema
1. From the Query editor, right-click a schema and select Create File format.
The File Format editor opens populated with the schema you selected.
2. Edit the new schema as appropriate and click Save & Close.

Data Integrator saves the file format in the repository. You can access it from the Formats tab of the object library.

To create a specific source or target file
1. Select a flat file format template on the Formats tab of the local object library.
2. Drag the file format template to the data flow workspace.
3. Select Make Source to define a source file format, or select Make Target to define a target file format.
4. Click the name of the file format object in the workspace to open the file format editor.
5. Enter the properties specific to the source or target file. Look for properties available when the file format editor is in source mode or target mode. For a description of available properties, refer to the Data Integrator Reference Guide. Under File name(s), be sure to specify the file name and location in the File and Location properties.
Note: You can use variables as file names. Refer to "Setting file names at run-time using variables" on page 314.
6. Connect the file format object to other objects in the data flow as appropriate.

Editing file formats
You can modify existing file format templates to match changes in the format or structure of a file. You cannot change the name of a file format template. For example, if you have a date field in a source or target file that is formatted as mm/dd/yy and the data for this field changes to the format dd-mm-yy due to changes in the program that generates the source file, you can edit the corresponding file format template and change the date format information. For specific source or target file formats, you can edit properties that uniquely define that source or target such as the file name and location.

To edit a file format template
1. In the object library Formats tab, double-click an existing flat file format (or right-click and choose Edit).
The file format editor opens with the existing format values.
2. Edit the values as needed.

Look for properties available when the file format editor is in edit mode. Properties are described in the Data Integrator Reference Guide. Any changes affect every source or target file that is based on this file format template.
3. Click Save.

To edit a source or target file
1. From the workspace, click the name of a source or target file.
The file format editor opens, displaying the properties for the selected source or target file.
2. Edit the desired properties. Look for properties available when the file format editor is in source or target mode as appropriate. Properties are described in the Data Integrator Reference Guide.
Any changes you make to values in a source or target file editor override those on the original file format. To change properties that are not available in source or target mode, you must edit the file's file format template.
3. Click Save.

File format features
Data Integrator offers several capabilities for processing files:
• Reading multiple files at one time
• Identifying source file names
• Number formats
• Ignoring rows with specified markers
• Date formats at the field level
• Error handling for flat-file sources

Reading multiple files at one time
Data Integrator can read multiple files with the same format from a single directory using a single source object.

To specify multiple files to read
1. Open the editor for your source file format.
2. Under Data File(s) in the file format editor:
a. Set the Location of the source files to Local or Job Server.

b. Set the root directory in Root directory.
Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the root directory. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
c. Under File(s), enter one of the following:
• A list of file names separated by commas, or
• A file name containing a wild card character (* or ?).
For example:
1999????.txt might read files from the year 1999
*.txt reads all files with the txt extension from the specified Root directory

Identifying source file names
You might want to identify the source file for each row in your target in the following situations:
• You specified a wildcard character to read multiple source files at one time
• You load from different source files on different runs

To identify the source file for each row in the target
1. Under Source Information in the file format editor, set Include file name to Yes. This option generates a column named DI_FILENAME that contains the name of the source file.

2. In the Query editor, map the DI_FILENAME column from Schema In to Schema Out.
3. When you run the job, the DI_FILENAME column for each row in the target contains the source file name.

Number formats
The period (.) and the comma (,) are the two most common formats used to determine decimal and thousand separators for numeric data types. When formatting files in Data Integrator, data types in which these symbols can be used include Decimal, Numeric, Int (integer), and Double. You can use either symbol for the thousands indicator and either symbol for the decimal separator. For example: 2,098.65 or 2.098,65. Leading and trailing decimal signs are also supported. For example: +12,000.00 or 32.32-.

Ignoring rows with specified markers
The file format editor provides a way to ignore rows containing a specified marker (or markers) when reading files. For example, you might want to ignore comment line markers such as # and //.
Associated with this feature, two special characters — the semicolon (;) and the backslash (\) — make it possible to define multiple markers in your ignore row marker string. Use the semicolon to delimit each marker, and use the backslash to indicate special characters as markers (such as the backslash and the semicolon). The default marker value is an empty string. When you specify the default value, no rows are ignored.

To specify markers for rows to ignore
1. Open the file format editor from the Object Library or by opening a source object in the workspace.
2. Find Ignore row marker(s) under the Format Property.
3. Click in the associated text box and enter a string to indicate one or more markers representing rows that Data Integrator should skip during file read and/or metadata creation.

The following table provides some ignore row marker(s) examples. (Each value is delimited by a semicolon unless the semicolon is preceded by a backslash.)
Marker Value(s)    Row(s) Ignored
(empty)            None (this is the default value)
abc                Any that begin with the string abc
abc;def;hi         Any that begin with abc or def or hi
abc;\;             Any that begin with abc or ;
abc;\\;\;          Any that begin with abc or \ or ;

Date formats at the field level
You can specify a date format at the field level to overwrite the default date, time, or date-time formats set in the Properties-Values area. For example, when the Data Type is set to Date, you can edit the value in the corresponding Format field to a different date format such as:
• yyyy.mm.dd
• mm/dd/yy
• dd.mm.yy
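To illustrate (the sample value is hypothetical), if a date field's Format is set to dd.mm.yy, the text 25.01.99 is read as January 25, 1999; if no field-level format is entered, the same text is interpreted according to the default date format set in the Properties-Values area instead.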

Error handling for flat-file sources
During job execution, Data Integrator processes rows from flat-file sources one at a time. You can configure the File Format Editor to identify rows in flat-file sources that contain the following types of errors:
• Data-type conversion errors — For example, a field might be defined in the File Format Editor as having a data type of integer but the data encountered is actually varchar.
• Row-format errors — For example, in the case of a fixed-width file, Data Integrator identifies a row that does not match the expected width value.
These error-handling properties apply to flat-file sources only. For complete details of all file format properties, see the Data Integrator Reference Guide.

Error-handling options
In the File Format Editor, the Error Handling set of properties allows you to choose whether or not to have Data Integrator:
• check for either of the two types of flat-file source error

• write the invalid row(s) to a specified error file
• stop processing the source file after reaching a specified number of invalid rows
• log data-type conversion or row-format warnings to the Data Integrator error log; if so, you can limit the number of warnings to log without stopping the job

About the error file
If enabled, the error file will include both types of errors. The format is a semicolon-delimited text file. You can have multiple input source files for the error file. The file resides on the same computer as the Job Server.
Entries in an error file have the following syntax:
source file path and name; row number in source file; Data Integrator error; column number where the error occurred; all columns from the invalid row
The following entry illustrates a row-format error:
d:/acl_work/in_test.txt;2;-80104: 1-3-A column delimiter was seen after column number <3> for row number <2> in file <d:/acl_work/in_test.txt>. The total number of columns defined is <3>, so a row delimiter should be seen after column number <3>. Please check the file for bad data, or redefine the input schema for the file by editing the file format in the UI.;3;defg;234;def
where 3 indicates an error occurred after the third column, and defg;234;def are the three columns of data from the invalid row.
Note: If you set the file format's Parallel process thread option to any value greater than 0 or {none}, the row number in source file value will be -1.

Configuring the File Format Editor for error handling
Follow these procedures to configure the error-handling options.

To capture data-type conversion or row-format errors
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
The File Format Editor opens.
3. To capture data-type conversion errors, under the Error Handling properties for Capture data conversion errors, click Yes.
4. To capture errors in row formats, for Capture row format errors, click Yes.

5. Click Save or Save & Close.

To write invalid rows to an error file
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
The File Format Editor opens.
3. Under the Error Handling properties, click Yes for either or both of the Capture data conversion errors or Capture row format errors properties.
4. For Write error rows to file, click Yes.
Two more fields appear: Error file root directory and Error file name.
5. Type an Error file root directory in which to store the error file. If you type a directory path here, then enter only the file name in the Error file name property.
6. Type an Error file name. If you leave Error file root directory blank, then type a full path and file name here.
7. Click Save or Save & Close.


For added flexibility when naming the error file, you can enter a variable that is set to a particular file with full path name. Use variables to specify file names that you cannot otherwise enter, such as those that contain multibyte characters.

To limit the number of invalid rows Data Integrator processes before stopping the job
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
The File Format Editor opens.
3. Under the Error Handling properties, click Yes for either or both the Capture data conversion errors or Capture row format errors properties.
4. For Maximum errors to stop job, type a number.
Note: This property was previously known as Bad rows limit.
5. Click Save or Save & Close.

To log data-type conversion warnings in the Data Integrator error log
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
The File Format Editor opens.
3. Under the Error Handling properties, for Log data conversion warnings, click Yes.
4. Click Save or Save & Close.

To log row-format warnings in the Data Integrator error log
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.
The File Format Editor opens.
3. Under the Error Handling properties, for Log row format warnings, click Yes.
4. Click Save or Save & Close.


To limit the number of warning messages to log
If you choose to log either data-type or row-format warnings, you can limit the total number of warnings to log without interfering with job execution.
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit.


The File Format Editor opens.
3. Under the Error Handling properties, for Log data conversion warnings and/or Log row format warnings, click Yes.
4. For Maximum warnings to log, type a number.
5. Click Save or Save & Close.

Creating COBOL copybook file formats
When creating a COBOL copybook format, you can:

• create just the format, then configure the source after you add the format to a data flow, or
• create the format and associate it with a data file at the same time
This section also describes how to:
• create rules to identify which records represent which schemas using a field ID option
• identify the field that contains the length of the schema's record using a record length field option

To create a new COBOL copybook file format
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click New.
The Import COBOL copybook window opens.
2. Name the format by typing a name in the Format name field.
3. On the Format tab for File name, specify the COBOL copybook file format to import, which usually has the extension .cpy.
During design, you can specify a file in one of the following ways:
• For a file located on the computer where the Designer runs, you can use the Browse button.
• For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Click OK.
Data Integrator adds the COBOL copybook to the object library.
5. The COBOL Copybook schema name(s) dialog box displays. If desired, select or double-click a schema name to rename it.
6. Click OK.


When you later add the format to a data flow, you can use the options in the source editor to define the source. See the Data Integrator Reference Guide.

To create a new COBOL copybook file format and a data file
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click New.
The Import COBOL copybook window opens.
2. Name the format by typing a name in the Format name field.
3. On the Format tab for File name, specify the COBOL copybook file format to import, which usually has the extension .cpy.
During design, you can specify a file in one of the following ways:
• For a file located on the computer where the Designer runs, you can use the Browse button.
• For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Click the Data File tab.
5. For Directory, type or browse to the directory that contains the COBOL copybook data file to import.
If you include a directory path here, then enter only the file name in the Name field.
6. Specify the COBOL copybook data file Name.
If you leave Directory blank, then type a full path and file name here.
During design, you can specify a file in one of the following ways:
• For a file located on the computer where the Designer runs, you can use the Browse button.
• For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
7. If the data file is not on the same computer as the Job Server, click the Data Access tab. Select FTP or Custom and enter the criteria for accessing the data file. For details on these options, see the Data Integrator Reference Guide.
8. Click OK.
9. The COBOL Copybook schema name(s) dialog box displays. If desired, select or double-click a schema name to rename it.
10. Click OK.


The Field ID tab allows you to create rules for identifying which records represent which schemas.

To create rules to identify which records represent which schemas
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click Edit.
The Edit COBOL Copybook window opens.
2. In the top pane, select a field to represent the schema.
3. Click the Field ID tab.
4. On the Field ID tab, select the check box Use field <schema name.field name> as ID.
5. Click Insert below to add an editable value to the Values list.
6. Type a value for the field.
7. Continue inserting values as necessary.
8. Select additional fields and insert values as necessary.
9. Click OK.

To identify the field that contains the length of the schema's record
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click Edit.
The Edit COBOL Copybook window opens.
2. Click the Record Length Field tab.
3. For the schema to edit, click in its Record Length Field column to enable a drop-down menu.
4. Select the field (one per schema) that contains the record's length.
The offset value automatically changes to the default of 4; however, you can change it to any other numeric value. The offset is the value that results in the total record length when added to the value in the Record length field.
5. Click OK.
For a complete description of all the options available on the Import COBOL copybook or Edit COBOL copybook dialog boxes, see the Data Integrator Reference Guide. To edit the source, open the source editor; see the Data Integrator Reference Guide. To see the list of data type conversions between Data Integrator and COBOL copybooks, see the Data Integrator Reference Guide.


File transfers
Data Integrator can read and load files using a third-party file transfer program for flat files. You can use third-party (custom) transfer programs to:

• Incorporate company-standard file-transfer applications as part of Data Integrator job execution
• Provide high flexibility and security for files transferred across a firewall
The custom transfer program option allows you to specify:
• A custom transfer program (invoked during job execution)
• Additional arguments, based on what is available in your program, such as:
• Connection data
• Encryption/decryption mechanisms
• Compression mechanisms

Custom transfer system variables for flat files
When you set Custom Transfer program to YES in the Property column of the file format editor, the following options are added to the column. To view them, scroll the window down.

When you set custom transfer options for external file sources and targets, some transfer information, like the name of the remote server that the file is being transferred to or from, may need to be entered literally as a transfer program argument. You can enter other information using the following Data Integrator system variables:
Data entered for:    Is substituted for this variable if it is defined in the Arguments field
User name            $AW_USER
Password             $AW_PASSWORD
Local directory      $AW_LOCAL_DIR
File(s)              $AW_FILE_NAME


By using these variables as custom transfer program arguments, you can collect connection information entered in Data Integrator and use that data at run-time with your custom transfer program. For example, the following custom transfer options use a Windows command file (Myftp.cmd) with five arguments. Arguments 1 through 4 are Data Integrator system variables:

• User and Password variables are for the external server
• The Local Directory variable is for the location where the transferred files will be stored in Data Integrator
• The File Name variable is for the names of the files to be transferred

Argument 5 provides the literal external server name.

The content of the Myftp.cmd script is as follows: Note: If you do not specify a standard output file (such as ftp.out in the example below), Data Integrator writes the standard output into the job’s trace log.
@echo off
set USER=%1
set PASSWORD=%2
set LOCAL_DIR=%3
set FILE_NAME=%4
set LITERAL_HOST_NAME=%5
set INP_FILE=ftp.inp
echo %USER%>%INP_FILE%
echo %PASSWORD%>>%INP_FILE%
echo lcd %LOCAL_DIR%>>%INP_FILE%
echo get %FILE_NAME%>>%INP_FILE%
echo bye>>%INP_FILE%
ftp -s:%INP_FILE% %LITERAL_HOST_NAME%>ftp.out
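To connect a script like this to a file format, the custom transfer entries in the file format editor might look like the following sketch; the program path and server name are hypothetical placeholders, not values from this guide:
Program executable: C:\scripts\Myftp.cmd
Arguments: $AW_USER $AW_PASSWORD $AW_LOCAL_DIR $AW_FILE_NAME ftp.example.com
At run-time the first four arguments are replaced with the values entered in the User name, Password, Root directory, and File(s) boxes, and the fifth argument passes the literal server name through to the script.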


Custom transfer options for flat files
Of the custom transfer program options, only the Program executable option is mandatory.

Entering User Name, Password, and Arguments values is optional. These options are provided for you to specify arguments that your custom transfer program can process (such as connection data). You can also use Arguments to enable or disable your program's built-in features such as encryption/decryption and compression mechanisms. For example, you might design your transfer program so that when you enter -sSecureTransportOn or -CCompressionYES, security or compression is enabled.
Note: Available arguments depend on what is included in your custom transfer program. See your custom transfer program documentation for a valid argument list.
You can use the Arguments box to enter a user name and password. However, Data Integrator also provides separate User name and Password boxes. By entering the $AW_USER and $AW_PASSWORD variables as Arguments and then using the User and Password boxes to enter literal strings, these extra boxes are useful in two ways:

• You can more easily update users and passwords in Data Integrator both when you configure Data Integrator to use a transfer program and when you later export the job. For example, when you migrate the job to another environment, you might want to change login information without scrolling through other arguments.
• You can use the mask and encryption properties of the Password box. Data entered in the Password box is masked in log files and on the screen, stored in the repository, and encrypted by Data Integrator.
Note: Data Integrator sends password data to the custom transfer program in clear text. If you do not allow clear passwords to be exposed as arguments in command-line executables, then set up your custom program to either:

• Pick up its password from a trusted location
• Inherit security privileges from the calling program (in this case, Data Integrator)


Setting custom transfer options
The custom transfer option allows you to use a third-party program to transfer flat file sources and targets. You can configure your custom transfer program in the File Format Editor window. Like other file format settings, you can override custom transfer program settings if they are changed for a source or target in a particular data flow. You can also edit the custom transfer option when exporting a file format.

To configure a custom transfer program in the file format editor
1. Select the Formats tab in the object library.
2. Right-click Flat Files in the tab and select New.
The File Format Editor opens.
3. Select either the Delimited or the Fixed width file type.
Note: While the custom transfer program option is not supported with R/3 file types, you can use it as a data transport method for an R/3 data flow. See the Data Integrator Supplement for SAP for more information.
4. Enter a format name.
5. Select Yes for the Custom transfer program option.
6. Enter the custom transfer program name and arguments.
7. Complete the other boxes in the file format editor window. See the Data Integrator Reference Guide for more information.
In the Data File(s) section, specify the location of the file in Data Integrator.
To specify system variables for Root directory and File(s) in the Arguments box:
• Associate the Data Integrator system variable $AW_LOCAL_DIR with the local directory argument of your custom transfer program.


• Associate the Data Integrator system variable $AW_FILE_NAME with the file name argument of your custom transfer program.

For example, enter: -l$AW_LOCAL_DIR\$AW_FILE_NAME
When the program runs, the Root directory and File(s) settings are substituted for these variables and read by the custom transfer program.
Note: The flag -l used in the example above is a custom program flag. Arguments you can use as custom program arguments in Data Integrator depend upon what your custom transfer program expects.
8. Click Save.

Design tips
Keep the following concepts in mind when using the custom transfer options:

• Variables are not supported in file names when invoking a custom transfer program for the file.
• You can only edit custom transfer options in the File Format Editor (or Datastore Editor in the case of SAP R/3) window before they are exported. You cannot edit updates to file sources and targets at the data flow level when exported. After they are imported, you can adjust custom transfer option settings at the data flow level. They override file format level settings.

When designing a custom transfer program to work with Data Integrator, keep in mind that:

• Data Integrator expects the called transfer program to return 0 on success and non-zero on failure.
• Data Integrator provides trace information before and after the custom transfer program executes. The full transfer program and its arguments with masked password (if any) is written in the trace log. When "Completed Custom transfer" appears in the trace log, the custom transfer program has ended.
• If the custom transfer program finishes successfully (the return code = 0), Data Integrator checks the following:
• For an R/3 data flow, if the transport file does not exist in the local directory, it throws an error and Data Integrator stops. See the Data Integrator Supplement for SAP for information about file transfers from SAP R/3.
• For a file source, if the file or files to be read by Data Integrator do not exist in the local directory, Data Integrator writes a warning message into the trace log.


• If the custom transfer program throws an error or its execution fails (return code is not 0), then Data Integrator produces an error with return code and stdout/stderr output.
• If the custom transfer program succeeds but produces standard output, Data Integrator issues a warning, logs the first 1,000 bytes of the output produced, and continues processing.
• The custom transfer program designer must provide valid option arguments to ensure that files are transferred to and from the Data Integrator local directory (specified in Data Integrator). This might require that the remote file and directory name be specified as arguments and then sent to the Data Integrator Designer interface using Data Integrator system variables.

Web log support
Web logs are flat files generated by Web servers and are used for business intelligence. Web logs typically track details of Web site hits such as:

• Client domain names or IP addresses
• User names
• Timestamps
• Requested action (might include search string)
• Bytes transferred
• Referred address
• Cookie ID

Web logs use a common file format and an extended common file format.
Figure 6-1: Common Web Log Format

Figure 6-2: Extended Common Web Log Format

Data Integrator supports both common and extended common Web log formats as sources. The file format editor also supports the following:

• Dash as NULL indicator
• Time zone in date-time, e.g. 01/Jan/1997:13:06:51 –0600
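For illustration only (this sample record is hypothetical and not taken from the guide's figures), a line in the common Web log format typically looks like:
151.99.190.27 - - [01/Jan/1997:13:06:51 -0600] "GET /index.html HTTP/1.0" 200 1043
where the fields are the client IP address or domain name, two user identifiers (shown here as dashes, the NULL indicator), the timestamp with its time zone, the requested action, a status code, and the bytes transferred.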


Data Integrator includes several functions for processing Web log data:

• Word_ext function
• Concat_date_time function
• WL_GetKeyValue function

Word_ext function
The word_ext is a Data Integrator string function that extends the word function by returning the word identified by its position in a delimited string. This function is useful for parsing URLs or file names.

Format
word_ext(string, word_number, separator(s))

A negative word number means count from right to left

Examples
word_ext('www.bodi.com', 2, '.') returns 'bodi'.
word_ext('www.cs.wisc.edu', -2, '.') returns 'wisc'.
word_ext('www.cs.wisc.edu', 5, '.') returns NULL.
word_ext('aaa+=bbb+=ccc+zz=dd', 4, '+=') returns 'zz'. If 2 separators are specified (+=), the function looks for either one.
word_ext(',,,,,aaa,,,,bb,,,c ', 2, ',') returns 'bb'. This function skips consecutive delimiters.
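As a further illustration (a hypothetical value, applying the negative-position and consecutive-delimiter behavior shown above), word_ext('http://www.example.com/products/index.html', -1, '/') would return 'index.html', which is one way to pull the file name out of a URL.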

Concat_date_time function
The concat_date_time is a Data Integrator date function that returns a datetime from separate date and time inputs.

Format
concat_date_time(date, time)

Example
concat_date_time(MS40."date",MS40."time")
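In this example, MS40."date" and MS40."time" appear to refer to date and time columns of an input schema named MS40. In your own data flow you would substitute the schema and column names of your Web log source; for instance, with a hypothetical source schema named WebLog the mapping might read:
concat_date_time(WebLog."log_date", WebLog."log_time")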

WL_GetKeyValue function
The WL_GetKeyValue is a custom function (written in the Data Integrator Scripting Language) that returns the value of a given keyword. It is useful for parsing search strings.


Format
WL_GetKeyValue(string, keyword)

Example
A search in Google for bodi B2B is recorded in a Web log as:
GET "http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search"
WL_GetKeyValue('http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search', 'q') returns 'bodi+B2B'.

Sample Web log formats in Data Integrator
This is a file with a common Web log file format:

This is the file format editor view of this Web log:

This is a representation of a sample data flow for this Web log. Data flows are described in Chapter 7: Data Flows.


Data Flows

About this chapter
This chapter contains the following topics:
• What is a data flow?
• Data flows as steps in work flows
• Intermediate data sets in a data flow
• Passing parameters to data flows
• Creating and defining data flows
• Source and target objects
• Transforms
• Query transform overview
• Data flow execution
• Audit Data Flow Overview

What is a data flow?
Data flows extract, transform, and load data. Everything having to do with data, including reading sources, transforming data, and loading targets, occurs inside a data flow. The lines connecting objects in a data flow represent the flow of data through data transformation steps.
After you define a data flow, you can add it to a job or work flow. From inside a work flow, a data flow can send and receive information to and from other objects through input and output parameters.

Naming data flows
Data flow names can include alphanumeric characters and underscores (_). They cannot contain blank spaces.

Data flow example
Suppose you want to populate the fact table in your data warehouse with new data from two tables in your source transaction database. Your data flow consists of the following:
• Two source tables
• A join between these tables, defined in a query transform
• A target table where the new rows are placed
You indicate the flow of data through these components by connecting them in the order that data moves through them. The resulting data flow looks like the following:

Steps in a data flow
Each icon you place in the data flow diagram becomes a step in the data flow. This chapter discusses objects that you can use as steps in a data flow:
• Source and target objects
• Transforms
The connections you make between the icons determine the order in which Data Integrator completes the steps.

Data flows as steps in work flows
Data flows are closed operations, even when they are steps in a work flow. Data sets created within a data flow are not available to other steps in the work flow.

A work flow does not operate on data sets and cannot provide more data to a data flow; however, a work flow can do the following:
• Call data flows to perform data movement operations
• Define the conditions appropriate to run data flows
• Pass parameters to and from data flows

Intermediate data sets in a data flow
Each step in a data flow—up to the target definition—produces an intermediate result (for example, the results of a SQL statement containing a WHERE clause), which flows to the next step in the data flow. The intermediate result consists of a set of rows from the previous operation and the schema in which the rows are arranged. This result is called a data set. This data set may, in turn, be further "filtered" and directed into yet another data set.

Operation codes
Each row in a data set is flagged with an operation code that identifies the status of the row. The operation codes are as follows:
• NORMAL — Creates a new row in the target. All rows in a data set are flagged as NORMAL when they are extracted from a source. If a row is flagged as NORMAL when loaded into a target, it is inserted as a new row in the target.
• INSERT — Creates a new row in the target. Rows can be flagged as INSERT by transforms in the data flow to indicate that a change occurred in a data set as compared with an earlier image of the same data set. The change is recorded in the target separately from the existing data.
• DELETE — Is ignored by the target. Rows flagged as DELETE are not loaded. Rows can be flagged as DELETE only by the Map_Operation transform.
• UPDATE — Overwrites an existing row in the target. Rows can be flagged as UPDATE by transforms in the data flow to indicate that a change occurred in a data set as compared with an earlier image of the same data set. The change is recorded in the target in the same row as the existing data.

Passing parameters to data flows
Data does not flow outside a data flow, not even when you add a data flow to a work flow. You can, however, pass parameters into and out of a data flow. Parameters evaluate single values rather than sets of values. When a data flow receives parameters, the steps inside the data flow can reference those parameters as variables.
Parameters make data flow definitions more flexible. For example, a parameter can indicate the last time a fact table was updated. You can use this value in a data flow to extract only rows modified since the last update. The following figure shows the parameter last_update used in a query to determine the data set used to load the fact table.
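The figure itself is not reproduced here. As an illustrative sketch only (the table and column names are hypothetical, and the parameter is assumed to be referenced as $last_update inside the query's WHERE expression), such a query might restrict its input with a condition like:
FACT_SOURCE.LAST_MODIFIED_DATE >= $last_update
so that only rows changed since the value passed in through the parameter flow on to the fact table load.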

For more information about parameters, see "Variables and Parameters" on page 295.

Creating and defining data flows
You can create data flows using objects from:
• The object library
• The tool palette
After creating a data flow, you can change its properties. For details, see "To change properties of a data flow" on page 177.

To define a new data flow using the object library
1. In the object library, go to the Data Flows tab.
2. Select the data flow category, right-click and select New.

3. Select the new data flow.
4. Drag the data flow into the workspace for a job or a work flow.
5. Add the sources, transforms, and targets you need.

To define a new data flow using the tool palette
1. Select the data flow icon in the tool palette.
2. Click the workspace for a job or work flow to place the data flow.
You can add data flows to batch and real-time jobs. When you drag a data flow icon into a job, you are telling Data Integrator to validate these objects according to the requirements of the job type (either batch or real-time).
3. Add the sources, transforms, and targets you need.

To change properties of a data flow
1. Right-click the data flow and select Properties.
The Properties window opens for the data flow.
2. You can change the following properties of a data flow:

a. Execute only once
When you specify that a data flow should only execute once, a batch job will never re-execute that data flow after the data flow completes successfully, even if the data flow is contained in a work flow that is a recovery unit that re-executes. Business Objects recommends that you do not mark a data flow as Execute only once if a parent work flow is a recovery unit. For more information about how Data Integrator processes data flows with multiple conditions such as execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.
b. Use database links
Database links are communication paths between one database server and another. Database links allow local users to access data on a remote database, which can be on the local or a remote computer of the same or different database type. For more information, see the Data Integrator Performance Optimization Guide.
c. Degree of parallelism
Degree Of Parallelism (DOP) is a property of a data flow that defines how many times each transform within a data flow replicates to process a parallel subset of data. For more information, see the Data Integrator Performance Optimization Guide.
d. Cache type
You can cache data to improve performance of operations such as joins, groups, sorts, filtering, lookups, and table comparisons. You can select one of the following values for the Cache type option on your data flow Properties window:
• In-Memory — Choose this value if your data flow processes a small amount of data that can fit in the available memory.
• Pageable — This value is the default.
For more information, see the Data Integrator Performance Optimization Guide.
3. Click OK.

Source and target objects
A data flow directly reads and loads data using two types of objects:
Source objects — Define sources from which you read data

• Target objects: Define targets to which you write (or load) data

Source objects
Source objects represent data sources read from data flows. The Data Integrator access method for each object is shown in parentheses.
• Table (Direct or through adapter): A file formatted with columns and rows as used in relational databases.
• Template table (Direct): A template table that has been created and saved in another data flow (used in development). See "Template tables" on page 181.
• File (Direct): A delimited or fixed-width flat file.
• Document (Through adapter): A file with an application-specific format (not readable by SQL or XML parser).
• XML file (Direct): A file formatted with XML tags.
• XML message (Direct): Used as a source in real-time jobs. See "Real-time source and target objects" on page 266.
If you have the SAP licensed extension, you can also use IDocs as sources. For more information, see the Data Integrator Supplement for SAP.

Target objects
Target objects represent data targets that can be written to in data flows.
• Table (Direct or through adapter): A file formatted with columns and rows as used in relational databases.
• Template table (Direct): A table whose format is based on the output of the preceding transform (used in development).
• File (Direct): A delimited or fixed-width flat file.
• Document (Through adapter): A file with an application-specific format (not readable by SQL or XML parser).
• XML file (Direct): A file formatted with XML tags.

• XML template file (Direct): An XML file whose format is based on the preceding transform output (used in development, primarily for debugging data flows).
• XML message: See "Real-time source and target objects" on page 266.
• Outbound message: See "Real-time source and target objects" on page 266.
If you have the SAP licensed extension, you can also use IDocs as targets. For more information, see the Data Integrator Supplement for SAP.

Adding source or target objects to data flows
Fulfill the following prerequisites before using a source or target object in a data flow:
• Tables accessed directly from a database: Define a database datastore and import table metadata. See "Database datastores" on page 81.
• Template tables: Define a database datastore. See "Template tables" on page 181.
• Files: Define a file format and import the file. See "File Formats" on page 135.
• XML files and messages: Import an XML file format. See "To import a DTD or XML Schema format" on page 233.
• Objects accessed through an adapter: Define an adapter datastore and import object metadata. See "Adapter datastores" on page 111.

To add a source or target object to a data flow
1. Open the data flow in which you want to place the object.
2. If the object library is not already open, select Tools > Object Library to open it.
3. Select the appropriate object library tab:
• Formats tab for flat files, DTDs, or XML Schemas
• Datastores tab for database and adapter objects

4. Select the object you want to add as a source or target. (Expand collapsed lists by clicking the plus sign next to a container icon.)
For a new template table, select the Template Table icon from the tool palette. For a new XML template file, select the Template XML icon from the tool palette.
5. Drop the object in the workspace.
6. For objects that can be either sources or targets, when you release the cursor, a popup menu appears. Select the kind of object to make.
For new template tables and XML template files, when you release the cursor, a secondary window appears. Enter the requested information for the new template object. Names can include alphanumeric characters and underscores (_). Template tables cannot have the same name as an existing table within a datastore.
7. The source or target object appears in the workspace.
8. Click the object name in the workspace. Data Integrator opens the editor for the object. Set the options you require for the object.
Note: Ensure that any files that reference flat file, DTD, or XML Schema formats are accessible from the Job Server where the job will be run and specify the file location relative to this computer.

Template tables
During the initial design of an application, you might find it convenient to use template tables to represent database tables. With template tables, you do not have to initially create a new table in your DBMS and import the metadata into Data Integrator. Instead, Data Integrator automatically creates the table in the database with the schema defined by the data flow when you execute a job.
After creating a template table as a target in one data flow, you can use it as a source in other data flows. Though a template table can be used as a source table in multiple data flows, it can only be used as a target in one data flow.

To create a target template table
1. Use one of the following methods to open the Create Template window:
• From the tool palette:
a. Click the template table icon.

b. Click inside a data flow to place the template table in the workspace.
c. On the Create Template window, select a datastore.
• From the object library:
a. Expand a datastore.
b. Click the template table icon and drag it to the workspace.
2. On the Create Template window, enter a table name.
3. Click OK.
The table appears in the workspace as a template table icon.
4. Connect the template table to the data flow as a target (usually a Query transform).
5. In the Query transform, map the Schema In columns that you want to include in the target table.
6. From the Project menu select Save.
In the workspace, the template table's icon changes to a target table icon and the table appears in the object library under the datastore's list of tables.

Template tables are particularly useful in early application development when you are designing and testing a project. If you modify and save the data transformation operation in the data flow where the template table is a target, the schema of the template table automatically changes. Any updates to the schema are automatically made to any other instances of the template table. During the validation process, Data Integrator warns you of any errors such as those resulting from changing the schema.
After you are satisfied with the design of your data flow, save it. When the job is executed, Data Integrator uses the template table to create a new table in the database you specified when you created the template table. Once a template table is created in the database, you can convert the template table in the repository to a regular table.
Note that you must convert template tables to take advantage of some features such as bulk loading. Other features, such as exporting an object, are available for template tables. Once a template table is converted, you can no longer alter the schema.

To convert a template table into a regular table from the object library
1. Open the object library and go to the Datastores tab.
2. Click the plus sign (+) next to the datastore that contains the template table you want to convert.

A list of objects appears.
3. Click the plus sign (+) next to Template Tables.
The list of template tables appears.
4. Right-click a template table you want to convert and select Import Table.
Data Integrator converts the template table in the repository into a regular table by importing it from the database.
To update the icon in all data flows, choose View > Refresh. In the datastore object library, the table is now listed under Tables rather than Template Tables.

To convert a template table into a regular table from a data flow
1. Open the data flow containing the template table.
2. Right-click on the template table you want to convert and select Import Table.

After a template table is converted into a regular table, you can no longer change the table's schema.

Transforms
Data Integrator includes objects called transforms. Transforms operate on data sets: they manipulate input sets and produce one or more output sets. By contrast, functions operate on single values in specific columns in a data set.
Data Integrator includes many built-in transforms. These transforms are available from the object library on the Transforms tab. See the Data Integrator Reference Guide for detailed information.
• Case: Simplifies branch logic in data flows by consolidating case or decision making logic in one transform. Paths are defined in an expression table.
• Data_Transfer: Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
• Date_Generation: Generates a column filled with date values based on the start and end dates and increment that you provide.
• Effective_Date: Generates an additional "effective to" column based on the primary key's "effective date."

• Hierarchy_Flattening: Flattens hierarchical data into relational tables so that it can participate in a star schema. Hierarchy flattening can be both vertical and horizontal.
• History_Preserving: Converts rows flagged as UPDATE to UPDATE plus INSERT, so that the original values are preserved in the target. You specify in which column to look for updated data.
• Key_Generation: Generates new keys for source data, starting from a value based on existing keys in the table you specify.
• Map_CDC_Operation: Sorts input data, maps output data, and resolves before- and after-images for UPDATE rows. While commonly used to support Oracle changed-data capture, this transform supports any data stream if its input requirements are met.
• Map_Operation: Allows conversions between operation codes.
• Merge: Unifies rows from two or more sources into a single target.
• Pivot (Columns to Rows): Rotates the values in specified columns to rows. (Also see Reverse Pivot.)
• Query: Retrieves a data set that satisfies conditions that you specify. A query transform is similar to a SQL SELECT statement.
• Reverse Pivot (Rows to Columns): Rotates the values in specified rows to columns.
• Row_Generation: Generates a column filled with int values starting at zero and incrementing by one to the end value you specify.
• SQL: Performs the indicated SQL query operation.
• Table_Comparison: Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT and UPDATE.
• Validation: Ensures that the data at any stage in the data flow meets your criteria. You can filter out or replace data that fails your criteria.
• XML_Pipeline: Processes large XML inputs in small batches.

Transform editors
Transform editor layouts vary. The transform you will use most often is the Query transform, which has two panes:
• An input schema area and/or output schema area
• An options area (or parameters area) that allows you to set all the values the transform requires

Here is an example of the Query transform editor. (Figure: the Query transform editor, showing the input schema area, the output schema area, and the options area.)

Adding transforms to data flows
Use the Designer to add transforms to data flows.
1. From the work flow, click the data flow name. Alternatively, from the object library, click the Data Flow tab and double-click the data flow name.
2. Open the object library if it is not already open.
3. Go to the Transforms tab.
4. Select the transform you want to add to the data flow.
5. Drag the transform icon into the data flow workspace.
6. To connect a source to a transform, click the arrow on the right edge of the source and drag the cursor to the arrow on the left edge of the transform.

Continue connecting inputs and outputs as required for the transform.
• The input for the transform might be the output from another transform or the output from a source, or the transform may not require source data.
• You can connect the output of the transform to the input of another transform or target.
7. Click the name of the transform.
This opens the transform editor, which lets you complete the definition of the transform.
8. Enter option values.
To specify a data column as a transform option, enter the column name as it appears in the input schema or drag the column name from the input schema into the option box.
To specify dates or times as option values, use the following formats:
• yyyy.mm.dd, where yyyy represents a four-digit year, mm represents a two-digit month, and dd represents a two-digit day
• hh.mi.ss, where hh represents the two hour digits, mi the two minute digits, and ss the two second digits of a time
For a full description, see the Data Integrator Reference Guide.
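For instance, a transform option that expects a date (such as the start date of a Date_Generation transform) and one that expects a time could be given values like the following; the specific values are illustrative only:

   2002.01.31     (a date: January 31, 2002)
   14.05.30       (a time: 2:05:30 PM)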

Query transform overview
The Query transform is by far the most commonly used transform, so this section provides an overview. The Query transform can perform the following operations:
• Choose (filter) the data to extract from sources
• Join data from multiple sources
• Map columns from input to output schemas
• Perform transformations and functions on the data
• Perform data nesting and unnesting (see "Nested Data" on page 215)
• Add new columns, nested schemas, and function results to the output schema
• Assign primary keys to output columns
The inputs for a Query can include the output from another transform or the output from a source. The outputs from a Query can include input to another transform or input to a target.

Adding a Query transform to a data flow
Because it is so commonly used, the Query transform icon is included in the tool palette, providing an easier way to add a Query transform.

To add a Query transform to a data flow
1. Click the Query icon in the tool palette.
2. Click anywhere in a data flow workspace.
3. Connect the Query to inputs and outputs.
If you connect a target table to a Query with an empty output schema, Data Integrator automatically fills the Query's output schema with the columns from the target table, without mappings.

Query editor
The query editor, a graphical interface for performing query operations, contains the following areas:
• Input schema area (upper left)
• Output schema area (upper right)
• Parameters area (lower tabbed area)
The "i" icon indicates tabs containing user-defined entries.
The input and output schema areas can contain:
• Columns
• Nested schemas
• Functions (output only)
The Schema In and Schema Out lists display the currently selected schema in each area. The currently selected output schema is called the current schema and determines:
• The output elements that can be modified (added, mapped, or deleted)
• The scope of the Select through Order by tabs in the parameters area
The current schema is highlighted while all other (non-current) output schemas are gray.

To change the current schema
You can change the current schema in the following ways:
• Select a schema from the Output list.

• Right-click a schema, column, or function in the output schema area and select Make Current.
• Double-click one of the non-current (grayed-out) elements in the output schema area.

To modify output schema contents
You can modify the output schema in several ways:
• Drag and drop (or copy and paste) columns or nested schemas from the input schema area to the output schema area to create simple mappings.
• Use right-click menu options on output elements to:
  • Add new output columns and schemas
  • Use adapter functions or (with the SAP license extension) SAP R/3 functions to generate new output columns
  • Assign or reverse primary key settings on output columns. Primary key columns are flagged by a key icon.
  • Unnest or re-nest schemas.
• Use the Mapping tab to provide complex column mappings (see the illustration after this list). Drag and drop input schemas and columns into the output schema to enable the editor. Use the function wizard and the expression editor to build expressions. When the text editor is enabled, you can access these features using the buttons above the editor.
Note: You cannot add comments to a mapping clause in a Query transform. For example, the following syntax is not supported on the Mapping tab:
   table.column # comment
The job will not run and you cannot successfully export it. Use the object description or workspace annotation feature instead.

• Use the Select through Order By tabs to provide additional parameters for the current schema (similar to SQL SELECT statement clauses). You can drag and drop schemas and columns into these areas.
  • Select: Specifies whether to output only distinct rows (discarding any identical duplicate rows).
  • From: Specifies all input schemas that are used in the current schema.
  • Outer Join: Specifies an inner table and outer table for any joins (in the Where sheet) that are to be treated as outer joins.
  • Where: Specifies conditions that determine which rows are output. The syntax is like an SQL SELECT WHERE clause, for example:
    TABLE1.EMPNO = TABLE2.EMPNO AND TABLE1.EMPNO > 1000 OR TABLE2.EMPNO < 9000
    Use the buttons above the editor to build expressions.
  • Group By: Specifies how the output rows are grouped (if required).
  • Order By: Specifies how the output rows are sequenced (if required).
• Use the Search tab to locate input and output elements containing a specific word or term.
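As a small illustration of the Mapping tab and of the Group By and Order By tabs described above, the entries below show the kind of expressions you might type or drag in. The table and column names are hypothetical, and this is only a sketch:

   Mapping tab (for a hypothetical output column FULL_NAME):
      upper(CUSTOMER.LAST_NAME) || ', ' || upper(CUSTOMER.FIRST_NAME)
   Group By tab:
      CUSTOMER.REGION_ID
   Order By tab:
      CUSTOMER.REGION_ID, CUSTOMER.LAST_NAME

Appending a comment such as # region rollup to the mapping entry would make the job fail, as noted in the list above.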

Data flow execution
A data flow is a declarative specification from which Data Integrator determines the correct data to process. For example, in data flows placed in batch jobs, the transaction order is to extract, transform, then load data into a target. Data flows are similar to SQL statements. The specification declares the desired output.
Data Integrator executes a data flow each time the data flow occurs in a job. However, you can specify that a batch job execute a particular data flow only one time. In that case, Data Integrator only executes the first occurrence of the data flow; Data Integrator skips subsequent occurrences in the job. You might use this feature when developing complex batch jobs with multiple paths, such as jobs with try/catch blocks or conditionals, and you want to ensure that Data Integrator only executes a particular data flow one time. See "Creating and defining data flows" on page 176 for information on how to specify that a job execute a data flow only one time.
The following sections provide an overview of advanced features for data flows:
• "Push down operations to the database server" on page 192
• "Distributed data flow execution" on page 192
• "Load balancing" on page 193
• "Caches" on page 194

Push down operations to the database server
From the information in the data flow specification, Data Integrator produces output while optimizing performance. For example, for SQL sources and targets, Data Integrator creates database-specific SQL statements based on a job's data flow diagrams. To optimize performance, Data Integrator pushes down as many transform operations as possible to the source or target database and combines as many operations as possible into one request to the database. For example, Data Integrator tries to push down joins and function evaluations. By pushing down operations to the database, Data Integrator reduces the number of rows and operations that the Data Integrator engine must process. Data flow design influences the number of operations that Data Integrator can push to the source or target database. Before running a job, you can examine the SQL that Data Integrator generates and alter your design to produce the most efficient results. For more information, see the Data Integrator Reference Guide.
You can use the Data_Transfer transform to push down resource-intensive operations anywhere within a data flow to the database. Resource-intensive operations include joins, GROUP BY, ORDER BY, and DISTINCT. For more information, see the Data Integrator Performance Optimization Guide.

Distributed data flow execution
Data Integrator provides capabilities to distribute CPU-intensive and memory-intensive data processing work (such as join, grouping, table comparison and lookups) across multiple processes and computers. This work distribution provides the following potential benefits:
• Better memory management by taking advantage of more CPU resources and physical memory
• Better job performance and scalability by using concurrent sub data flow execution to take advantage of grid computing

You can create sub data flows so that Data Integrator does not need to process the entire data flow in memory at one time. You can also distribute the sub data flows to different job servers within a server group to use additional memory and CPU resources.
Use the following features to split a data flow into multiple sub data flows:
• Run as a separate process option on resource-intensive operations that include the following:
  • Hierarchy_Flattening transform
  • Query operations that are CPU-intensive and memory-intensive:
    • DISTINCT
    • GROUP BY
    • Join
    • ORDER BY
  • Table_Comparison transform
  • Lookup_ext function
  • Count_distinct function
If you select the Run as a separate process option for multiple operations in a data flow, Data Integrator splits the data flow into smaller sub data flows that use separate resources (memory and computer) from each other. When you specify multiple Run as a separate process options, the sub data flow processes run in parallel. For more information and usage scenarios for separate processes, see the Data Integrator Performance Optimization Guide.
• Data_Transfer transform
With this transform, Data Integrator does not need to process the entire data flow on the Job Server computer. Instead, the Data_Transfer transform can push down the processing of a resource-intensive operation to the database server. This transform splits the data flow into two sub data flows and transfers the data to a table in the database server to enable Data Integrator to push down the operation. For more information, see the Data Integrator Performance Optimization Guide.

Load balancing
You can distribute the execution of a job or a part of a job across multiple Job Servers within a Server Group to better balance resource-intensive operations. You can specify the following values on the Distribution level option when you execute a job:

• Job level: A job can execute on an available Job Server.
• Data flow level: Each data flow within a job can execute on an available Job Server.
• Sub data flow level: A resource-intensive operation (such as a sort, table comparison, or table lookup) within a data flow can execute on an available Job Server.
For more information, see the Data Integrator Performance Optimization Guide.

Caches
Data Integrator provides the option to cache data in memory to improve operations such as the following in your data flows:
• Joins: Because an inner source of a join must be read for each row of an outer source, you might want to cache a source when it is used as an inner source in a join.
• Table comparisons: Because a comparison table must be read for each row of a source, you might want to cache the comparison table.
• Lookups: Because a lookup table might exist on a remote database, you might want to cache it in memory to reduce access times.
Data Integrator provides the following types of caches that your data flow can use for all of the operations it contains:
• In-memory: Use in-memory cache when your data flow processes a small amount of data that fits in memory.
• Pageable cache: Use a pageable cache when your data flow processes a very large amount of data that does not fit in memory.
If you split your data flow into sub data flows that each run on a different Job Server, each sub data flow can use its own cache type. For more information, see the Data Integrator Performance Optimization Guide.

Audit Data Flow Overview
You can audit objects within a data flow to collect run time audit statistics. You can perform the following tasks with this auditing feature:
• Collect audit statistics about data read into a Data Integrator job, processed by various transforms, and loaded into targets.

• Define rules about the audit statistics to determine if the correct data is processed.
• Generate notification of audit failures.
• Query the audit statistics that persist in the Data Integrator repository.
For a full description of auditing data flows, see "Using Auditing" on page 362.


Work Flows

About this chapter
This chapter contains the following topics:
• What is a work flow?
• Steps in a work flow
• Order of execution in work flows
• Example of a work flow
• Creating work flows
• Conditionals
• While loops
• Try/catch blocks
• Scripts

What is a work flow?
A work flow defines the decision-making process for executing data flows. For example, elements in a work flow can determine the path of execution based on a value set by a previous job or can indicate an alternative path if something goes wrong in the primary path. Ultimately, the purpose of a work flow is to prepare for executing data flows and to set the state of the system after the data flows are complete.
Jobs (introduced in Chapter 4: Projects) are special work flows. Jobs are special because you can execute them. Almost all of the features documented for work flows also apply to jobs, with one exception: jobs do not have parameters.

Steps in a work flow
Work flow steps take the form of icons that you place in the work space to create a work flow diagram. The following objects can be elements in work flows:

• Work flows
• Data flows
• Scripts
• Conditionals
• While loops
• Try/catch blocks
Work flows can call other work flows, and you can nest calls to any depth. A work flow can also call itself.
The connections you make between the icons in the workspace determine the order in which work flows execute, unless the jobs containing those work flows execute in parallel.

Order of execution in work flows
Steps in a work flow execute in a left-to-right sequence indicated by the lines connecting the steps. Here is the diagram for a work flow that calls three data flows:
Note that Data_Flow1 has no connection from the left but is connected on the right to the left edge of Data_Flow2 and that Data_Flow2 is connected to Data_Flow3. There is a single thread of control connecting all three steps. Execution begins with Data_Flow1 and continues through the three data flows.
Connect steps in a work flow when there is a dependency between the steps. If there is no dependency, the steps need not be connected. In that case, Data Integrator can execute the independent steps in the work flow as separate processes. In the following work flow, Data Integrator executes data flows 1 through 3 in parallel:

To execute more complex work flows in parallel, define each sequence as a separate work flow, then call each of the work flows from another work flow as in the following example:
• Define work flow A
• Define work flow B
• Call work flows A and B from work flow C
You can specify that a job execute a particular work flow or data flow only one time. In that case, Data Integrator only executes the first occurrence of the work flow or data flow; Data Integrator skips subsequent occurrences in the job. You might use this feature when developing complex jobs with multiple paths, such as jobs with try/catch blocks or conditionals, and you want to ensure that Data Integrator only executes a particular work flow or data flow one time.

Example of a work flow
Suppose you want to update a fact table. You define a data flow in which the actual data transformation takes place. However, before you move data from the source, you want to check that the data connections required to build the fact table are active when data is read from them. To do this in Data Integrator, you define a try/catch block. If the connections are not active, the catch runs a script you wrote, which automatically sends mail notifying an administrator of the problem.
In addition, you want to determine when the fact table was last updated so that you only extract rows that have been added or changed since that date. You need to write a script to determine when the last update was made. You can then pass this date to the data flow as a parameter.
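A minimal sketch of such a script, assuming a datastore named Warehouse_DS and a fact table named FACT_SALES (both hypothetical names), could use the sql function to capture the last update time in a variable that is then mapped to a data flow parameter:

   # Script step that runs before the data flow
   $G_last_update = sql('Warehouse_DS', 'SELECT MAX(LAST_UPDATE) FROM FACT_SALES');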

Scripts and error detection cannot execute in the data flow. Rather, they are steps of a decision-making process that influences the data flow. This decision-making process is defined as a work flow, which looks like the following:
Data Integrator executes these steps in the order that you connect them.

Creating work flows
You can create work flows using one of two methods:
• Object library
• Tool palette
After creating a work flow, you can specify that a job only execute the work flow one time.

To create a new work flow using the object library
1. Open the object library.
2. Go to the Work Flows tab.
3. Right-click and choose New.
4. Drag the work flow into the diagram.
5. Add the data flows, work flows, conditionals, try/catch blocks, and scripts that you need.

To create a new work flow using the tool palette
1. Select the work flow icon in the tool palette.
2. Click where you want to place the work flow in the diagram.

If more than one instance of a work flow appears in a job, you can improve execution performance by running the work flow only one time.

To specify that a job executes the work flow one time
When you specify that a work flow should only execute once, a job will never re-execute that work flow after the work flow completes successfully, even if the work flow appears in the job multiple times and even if the work flow is contained in a work flow that is a recovery unit that re-executes. Business Objects recommends that you not mark a work flow as Execute only once if the work flow or a parent work flow is a recovery unit.

1. Right-click on the work flow and select Properties.
The Properties window opens for the work flow.
2. Select the Execute only once check box.
3. Click OK.
For more information about how Data Integrator processes work flows with multiple conditions like execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.

Conditionals
Conditionals are single-use objects used to implement if/then/else logic in a work flow. Conditionals and their components (if expressions, then and else diagrams) are included in the scope of the parent control flow's variables and parameters.
To define a conditional, you specify a condition and two logical branches:
• If: A Boolean expression that evaluates to TRUE or FALSE. You can use functions, variables, and standard operators to construct the expression.
• Then: Work flow elements to execute if the If expression evaluates to TRUE.
• Else: (Optional) Work flow elements to execute if the If expression evaluates to FALSE.
Define the Then and Else branches inside the definition of the conditional.

A conditional can fit in a work flow. Suppose you use a Windows command file to transfer data from a legacy system into Data Integrator. You write a script in a work flow to run the command file and return a success flag. You then define a conditional that reads the success flag to determine if the data is available for the rest of the work flow.
To implement this conditional in Data Integrator, you define two work flows, one for each branch of the conditional. If the elements in each branch are simple, you can define them in the conditional editor itself.
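As a minimal sketch, if the script stores the flag in a variable (here called $transfer_ok, a hypothetical name), the If expression of the conditional could simply test that value:

   # If expression for the conditional; 1 means the command file succeeded
   $transfer_ok = 1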

The definition of the conditional shows the two branches as follows: (Figure: the conditional editor, with the work flow executed when the If expression is TRUE in the Then box and the work flow executed when it is FALSE in the Else box.)
Both the Then and Else branches of the conditional can contain any object that you can have in a work flow including other work flows, nested conditionals, try/catch blocks, and so on.

To define a conditional
1. Define the work flows that are called by the Then and Else branches of the conditional.
Business Objects recommends that you define, test, and save each work flow as a separate object rather than constructing these work flows inside the conditional editor.
2. Open the work flow in which you want to place the conditional.
3. Click the icon for a conditional in the tool palette.
4. Click the location where you want to place the conditional in the diagram.
The conditional appears in the diagram.
5. Click the name of the conditional to open the conditional editor.
6. Click if.
7. Enter the Boolean expression that controls the conditional.
Continue building your expression. You might want to use the function wizard or smart editor.
8. After you complete the expression, click OK.
9. Add your predefined work flow to the Then box.
To add an existing work flow, open the object library to the Work Flows tab, select the desired work flow, then drag it into the Then box.

10. (Optional) Add your predefined work flow to the Else box.
If the If expression evaluates to FALSE and the Else box is blank, Data Integrator exits the conditional and continues with the work flow.
11. After you complete the conditional, choose Debug > Validate.
Data Integrator tests your conditional for syntax errors and displays any errors encountered.
12. Click the Back button to return to the work flow that calls the conditional.
The conditional is now defined.

While loops
Use a while loop to repeat a sequence of steps in a work flow as long as a condition is true.
This section discusses:
• Design considerations
• Defining a while loop
• Using a while loop with View Data

Design considerations
The while loop is a single-use object that you can use in a work flow. The while loop repeats a sequence of steps as long as a condition is true. (Figure: a while loop that tests the condition "number != 0"; while the condition is true it executes Step 1 and Step 2 and then tests the condition again; when the condition is false, the loop ends.)

Typically, the steps done during the while loop result in a change in the condition so that the condition is eventually no longer satisfied and the work flow exits from the while loop. If the condition does not change, the while loop will not end.
For example, you might want a work flow to wait until the system writes a particular file. You can use a while loop to check for the existence of the file using the file_exists function. As long as the file does not exist, you can have the work flow go into sleep mode for a particular length of time, say one minute, before checking again.
Because the system might never write the file, you must add another check to the loop, such as a counter, to ensure that the while loop eventually exits. In other words, change the while loop to check for the existence of the file and the value of the counter. As long as the file does not exist and the counter is less than a particular value, repeat the while loop. In each iteration of the loop, put the work flow in sleep mode and then increment the counter. (Figure: a while loop whose condition is "file_exists(temp.txt) = 0 and counter < 10"; while the condition is true, it calls sleep(60000) and then sets counter = counter + 1.)
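A sketch of this design, using the file_exists and sleep functions named above, might look like the following. The file path and the variable name $counter are assumptions, and $counter must be declared in the work flow that contains the loop:

   # While condition entered in the while loop editor
   file_exists('c:/temp/data_ready.txt') = 0 and $counter < 10

   # Script step placed inside the loop body
   sleep(60000);               # wait one minute (60,000 milliseconds)
   $counter = $counter + 1;    # give up after ten checks

When the file appears, or after ten iterations, the condition becomes false and the work flow continues past the while loop.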

Defining a while loop
You can define a while loop in any work flow.

To define a while loop
1. Open the work flow where you want to place the while loop.
2. Click the while loop icon on the tool palette.
3. Click the location where you want to place the while loop in the workspace diagram.
The while loop appears in the diagram.
4. Click the while loop to open the while loop editor.
5. In the While box at the top of the editor, enter the condition that must apply to initiate and repeat the steps in the while loop.
Alternatively, click to open the expression editor, which gives you more space to enter an expression and access to the function wizard. Click OK after you enter an expression in the editor.
6. Add the steps you want completed during the while loop to the workspace in the while loop editor.
You can add any objects valid in a work flow including scripts, work flows, and data flows. Connect these objects to represent the order that you want the steps completed.
Note: Although you can include the parent work flow in the while loop, recursive calls can create an infinite loop.
7. After defining the steps in the while loop, choose Debug > Validate.
Data Integrator tests your definition for syntax errors and displays any errors encountered.
8. Close the while loop editor to return to the calling work flow.

Using a while loop with View Data
When using View Data, a job stops when Data Integrator has retrieved the specified number of rows for all scannable objects. Depending on the design of your job, Data Integrator might not complete all iterations of a while loop if you run a job in view data mode:
• If the while loop contains scannable objects and there are no scannable objects outside the while loop (for example, if the while loop is the last object in a job), then the job will complete after the scannable objects in the while loop are satisfied, possibly after the first iteration of the while loop.
• If there are scannable objects after the while loop, the while loop will complete normally. Scanned objects in the while loop will show results from the last iteration.
• If there are no scannable objects following the while loop but there are scannable objects completed in parallel to the while loop, the job will complete as soon as all scannable objects are satisfied. The while loop might complete any number of iterations.

Try/catch blocks
A try/catch block is a combination of one try object and one or more catch objects that allow you to specify alternative work flows if errors occur while Data Integrator is executing a job. Try/catch blocks:
• "Catch" classes of exceptions "thrown" by Data Integrator, the DBMS, or the operating system
• Apply solutions that you provide
• Continue execution
Try and catch objects are single-use objects.
Here's the general method to implement exception handling:
• Insert a try object before the steps for which you are handling errors.
• Insert one or more catches in the work flow after the steps.
• In each catch, do the following:
  • Indicate the group of errors that you want to catch.
  • Define the work flows that a thrown exception executes.
If an exception is thrown during the execution of a try/catch block and if no catch is looking for that exception, then the exception is handled by normal error logic.

The following work flow shows a try/catch block surrounding a data flow:
In this case, if the data flow BuildTable causes any system-generated exceptions handled by the catch, then the work flow defined in the catch executes.
The action initiated by the catch can be simple or complex. Here are some examples of possible exception actions:
• Send a prepared e-mail message to a system administrator.
• Rerun a failed work flow or data flow.
• Run a scaled-down version of a failed work flow or data flow.
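As a minimal sketch of the first action, the catch work flow could contain a single script such as the one below. The message text is illustrative; a fuller version might call Data Integrator's smtp_to e-mail function, with parameters taken from the Data Integrator Reference Guide:

   # Hypothetical script placed inside a catch work flow
   print('The try/catch block around BuildTable caught an exception; notifying the administrator.');
   # A real catch script might call smtp_to() here to send the prepared e-mail,
   # or launch a scaled-down version of the failed data flow.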

To define a try/catch block
1. Define the work flow that is called by each catch you expect to define.
We recommend that you define, test, and save each work flow as a separate object rather than constructing these work flows inside the catch editor.
2. Open the work flow that includes the try/catch block.
3. Click the try icon in the tool palette.
4. Click the location where you want to place the try in the diagram.
The try icon appears in the diagram.
Note: There is no editor for a try; the try merely initiates the try/catch block.
5. Click the catch icon in the tool palette.
6. Click the location where you want to place the catch in the diagram.
The catch icon appears in the diagram.
7. Repeat steps 5 and 6 if you want to catch other exception groups in this try/catch block.
8. Connect the try and catch to the objects they enclose.
9. Click the name of the catch object to open the catch editor.
(The catch editor contains an Available exceptions list, a Chosen exceptions list, and a Catch work flow box.)
10. Select a group of exceptions from the list of Available Exceptions.
(For a complete list of available exceptions, see "Categories of available exceptions" on page 210.) Each catch supports one exception group selection.
11. Click Set.
12. Repeat steps 9 through 11 until you have chosen all of the exception groups for this catch.
13. Add your predefined work flow to the catch work flow box.
To add an existing work flow, open the object library to the Work Flows tab, select the desired work flow, and then drag it into the box.
14. After you have completed the catch, choose Debug > Validate.
Data Integrator tests your definition for syntax errors and displays any errors encountered.
15. Repeat steps 9 to 14 for each catch in the work flow.
16. Click the Back button to return to the work flow that calls the catch.
If any error in the exception group listed in the catch occurs during the execution of this try/catch block, Data Integrator executes the catch work flow.

Categories of available exceptions
Categories of available exceptions include:
• ABAP generation errors
• Database access errors

• Email errors
• Engine abort errors
• Execution errors
• File access errors
• Connection and bulk loader errors
• Parser errors
• R/3 execution errors
• Predefined transform errors
• Repository access errors
• Resolver errors
• System exception errors
• User transform errors

Scripts
Scripts are single-use objects used to call functions and assign values to variables in a work flow.
For example, you can use the SQL function in a script to determine the most recent update time for a table and then assign that value to a variable. You can then assign the variable to a parameter that passes into a data flow and identifies the rows to extract from a source.
A script can contain the following statements:
• Function calls
• If statements
• While statements
• Assignment statements
• Operators
The basic rules for the syntax of the script are as follows:
• Each line ends with a semicolon (;).
• Variable names start with a dollar sign ($).
• String values are enclosed in single quotation marks (').
• Comments start with a pound sign (#).
• Function calls always specify parameters even if the function uses no parameters.
For example, the following script statement determines today's date and assigns the value to the variable $TODAY:

$TODAY = sysdate();

To create a script
1. Open the work flow.
2. Click the script icon in the tool palette.
3. Click the location where you want to place the script in the diagram.
The script icon appears in the diagram.
4. Click the name of the script to open the script editor.
5. Enter the script statements, each followed by a semicolon.
You cannot use variables unless you declare them in the work flow that calls the script. Click the function button to include functions in your script.
6. After you complete the script, select Debug > Validate.
Data Integrator tests your script for syntax errors and displays any errors encountered.
The following figure shows the script editor displaying a script that determines the start time from the output of a custom function.

To save a script
If you want to save a script that you use often, create a custom function containing the script steps.
For more information about scripts and the Business Objects scripting language, see the Data Integrator Reference Guide.
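The following slightly longer sketch pulls the syntax rules together. The variable names and the file path are assumptions, the variables are presumed to be declared in the calling work flow, and sysdate, file_exists, and print are the functions used elsewhere in this chapter:

   # Comments start with a pound sign; each statement ends with a semicolon.
   $G_run_date = sysdate();
   $G_log_line = 'Run started on [$G_run_date]';      # strings use single quotes

   if (file_exists('c:/temp/skip_today.txt') = 1)     # if statement
   begin
      $G_log_line = 'Run skipped by operator flag';
   end

   print($G_log_line);                                # write the result to the trace log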

Debugging scripts using the print function
Data Integrator has a debugging feature that allows you to print:
• The values of variables and parameters during execution
• The execution path followed within a script
You can use the print function to write the values of parameters and variables in a work flow to the trace log. For example, this line in a script:
   print('The value of parameter $x: [$x]');
produces the following output in the trace log:
   The following output is being printed via the Print function in <Session job_name>.
   The value of parameter $x: value
For details about the print function, see the Data Integrator Reference Guide.
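The same function can also record which path a script takes. In this hedged example, the row-count variable and the messages are hypothetical:

   if ($G_row_count > 0)
   begin
      print('Branch A: [$G_row_count] rows to process');
   end
   else
   begin
      print('Branch B: nothing to process today');
   end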


Nested Data

About this chapter
This chapter contains the following topics:
• What is nested data?
• Representing hierarchical data
• Formatting XML documents
• Operations on nested data
• XML extraction and parsing for columns
If you do not plan to use nested data sources or targets, skip this chapter.

What is nested data?
Real-world data often has hierarchical relationships that are represented in a relational database with master-detail schemas using foreign keys to create the mapping. However, some data sets, such as XML documents and SAP R/3 IDocs, handle hierarchical relationships through nested data.
Data Integrator maps nested data to a separate schema implicitly related to a single row and column of the parent schema. This mechanism is called Nested Relational Data Modelling (NRDM). NRDM provides a way to view and manipulate hierarchical relationships within data flow sources, targets, and transforms.
Sales orders are often presented using nesting: the line items in a sales order are related to a single header and are represented using a nested schema. Each row of the sales order data set contains a nested line item schema.

Representing hierarchical data
You can represent the same hierarchical data in several ways. Examples include:
• Multiple rows in a single data set
• Multiple data sets related by a join
• Nested data
Using the nested data method can be more concise (no repeated information) and can scale to present a deeper level of hierarchical complexity. For example, columns inside a nested schema can also contain columns. There is a unique instance of each nested schema for each row at each level of the relationship.

Generalizing further with nested data, each row at each level can have any number of columns containing nested schemas.
In Data Integrator, you can see the structure of nested data in the input and output schemas of sources, targets, and transforms in data flows. Nested schemas appear with a schema icon paired with a plus sign, which indicates that the object contains columns. The structure of the schema shows how the data is ordered.
• Sales is the top-level schema.
• LineItems is a nested schema. The minus sign in front of the schema icon indicates that the column list is open.

• CustInfo is a nested schema with the column list closed.

Formatting XML documents
Data Integrator allows you to import and export metadata for XML documents (files or messages), which you can use as sources or targets in jobs. XML documents are hierarchical. Their valid structure is stored in separate format documents.
The format of an XML file or message (.xml) can be specified using either an XML Schema (.xsd for example) or a document type definition (.dtd). When you import a format document's metadata, it is structured into Data Integrator's internal schema for hierarchical documents which uses the nested relational data model (NRDM).
This section discusses:
• Importing XML Schemas
• Specifying source options for XML files
• Mapping optional schemas
• Using Document Type Definitions (DTDs)
• Generating DTDs and XML Schemas from an NRDM schema

Importing XML Schemas
Data Integrator supports W3C XML Schema Specification 1.0. For an XML document that contains information to place a sales order (order header, customer, and line items), the corresponding XML Schema includes the order structure and the relationship between data. See the Data Integrator Reference Guide for this XML Schema's URL.
This section describes the following topics:
• Importing XML schemas
• Importing abstract types
• Importing substitution groups

Importing XML schemas
Import the metadata for each XML Schema you use. The object library lists imported XML Schemas in the Formats tab.

When importing an XML Schema, Data Integrator reads the defined elements and attributes, then imports the following:
• Document structure
• Namespace
• Table and column names
• Data type of each column
• Nested table and column attributes
While XML Schemas make a distinction between elements and attributes, Data Integrator imports and converts them all to Data Integrator nested table and column attributes. See the Data Integrator Reference Guide for the list of Data Integrator attributes.

To import an XML Schema
1. From the object library, click the Format tab.
2. Right-click the XML Schemas icon.
3. Enter settings into the Import XML Schema Format window. When importing an XML Schema:
• Enter the name you want to use for the format in Data Integrator.

• Enter the file name of the XML Schema or its URL address.
Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the file path. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
• In the Root element name drop-down list, select the name of the primary node you want to import. Data Integrator only imports elements of the XML Schema that belong to this node or any subnodes.
• If the root element name is not unique within the XML Schema, select a name in the Namespace drop-down list to identify the imported XML Schema.
• If the XML Schema contains recursive elements (element A contains B, element B contains A), specify the number of levels it has by entering a value in the Circular level box. This value must match the number of recursive levels in the XML Schema's content. Otherwise, the job that uses this XML Schema will fail.
• You can set Data Integrator to import strings as a varchar of any size. Varchar 1024 is the default.
4. Click OK.
After you import an XML Schema, you can edit its column properties such as data type using the General tab of the Column Properties window. You can also view and edit nested table and column attributes from the Column Properties window.

To view and edit nested table and column attributes for XML Schema
1. From the object library, select the Formats tab.
2. Expand the XML Schema category.
3. Double-click an XML Schema name.
The XML Schema Format window appears in the workspace.

Find out how you can participate and help to improve our documentation. See the Data Integrator Reference Guide for more information about data types supported by XML Schema. Nested Data Formatting XML documents 9 The Type column displays the data types that Data Integrator uses when it imports the XML document metadata. Data Integrator Designer Guide 223 .This document is part of a SAP study on PDF usage.

a member of the element’s substitution group must appear in the instance document. 9 Nested Data Formatting XML documents 4. BookType. • • When an element is defined as abstract. The default is to select all complex types in the substitution group or all derived types for the abstract type. For example. 224 Data Integrator Designer Guide . and NewspaperType. the instance document must use a type derived from it (identified by the xsi:type attribute). Find out how you can participate and help to improve our documentation. an abstract element PublicationType can have a substitution group that consists of complex types such as MagazineType. Double-click a nested table or column and select Attributes to view or edit XML Schema attributes.This document is part of a SAP study on PDF usage. Importing abstract types An XML schema uses abstract types to force substitution for a particular element or type. When a type is defined as abstract. but you can choose to select a subset.

To limit the number of derived types to import for an abstract type
1. On the Import XML Schema Format window, when you enter the file name or URL address of an XML Schema that contains an abstract type, the Abstract type button is enabled.
For example, the following excerpt from an xsd defines the PublicationType element as abstract with derived types BookType and MagazineType:

  <xsd:complexType name="PublicationType" abstract="true">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="BookType">
    <xsd:complexContent>
      <xsd:extension base="PublicationType">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <xsd:complexType name="MagazineType">
    <xsd:complexContent>
      <xsd:restriction base="PublicationType">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
          <xsd:element name="Author" type="xsd:string" minOccurs="0" maxOccurs="1"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

2. To select a subset of derived types for the abstract type, click the Abstract type button and take the following actions:
a. From the drop-down list on the Abstract type box, select the name of the abstract type.
b. Select the check boxes in front of each derived type name that you want to import.
c. Click OK.
Note: When you edit your XML schema format, Data Integrator selects all derived types for the abstract type by default. In other words, the subset that you previously selected is not preserved.

Importing substitution groups
An XML schema uses substitution groups to assign elements to a special group of elements that can be substituted for a particular named element called the head element. The list of substitution groups can have hundreds or even thousands of members, but an application typically only uses a limited number of them. The default is to select all substitution groups, but you can choose to select a subset.

To limit the number of substitution groups to import
1. On the Import XML Schema Format window, when you enter the file name or URL address of an XML Schema that contains substitution groups, the Substitution Group button is enabled.
For example, the following excerpt from an xsd defines the PublicationType element with substitution groups MagazineType, BookType, AdsType, and NewspaperType:

  <xsd:element name="Publication" type="PublicationType"/>
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Publication" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Magazine" type="MagazineType" substitutionGroup="Publication"/>
  <xsd:element name="Book" type="BookType" substitutionGroup="Publication"/>
  <xsd:element name="Ads" type="AdsType" substitutionGroup="Publication"/>
  <xsd:element name="Newspaper" type="NewspaperType" substitutionGroup="Publication"/>

2. Click the Substitution Group button and take the following actions:
a. From the drop-down list on the Substitution group box, select the name of the substitution group.
b. Select the check boxes in front of each substitution group name that you want to import.
c. Click OK.
Note: When you edit your XML schema format, Data Integrator selects all elements for the substitution group by default. In other words, the subset that you previously selected is not preserved.

Specifying source options for XML files
After you import metadata for XML documents (files or messages), you create a data flow to use the XML documents as sources or targets in jobs.

Creating a data flow with a source XML file

To create a data flow with a source XML file
1. From the object library, click the Format tab.
2. Expand the XML Schema and drag the XML Schema that defines your source XML file into your data flow.
3. Place a query in the data flow and connect the XML source to the input of the query.
4. Double-click the XML source in the work space to open the XML Source File Editor.
5. Specify the name of the source XML file in the XML file text box.
To specify multiple files, enter a file name containing a wild card character (* or ?); see Reading multiple XML files at one time on page 228. To identify the source XML file, see Identifying source file names on page 229. For information about other source options, see the Data Integrator Reference Guide.

Reading multiple XML files at one time
Data Integrator can read multiple files with the same format from a single directory using a single source object.

To read multiple XML files at one time
1. Open the editor for your source XML file.
2. In XML File on the Source tab, enter a file name containing a wild card character (* or ?).

For example:
• D:\orders\1999????.xml might read files from the year 1999.
• D:\orders\*.xml reads all files with the xml extension from the specified directory.
For information about other source options, see the Data Integrator Reference Guide.

Identifying source file names
You might want to identify the source XML file for each row in your source output in the following situations:
• You specified a wildcard character to read multiple source files at one time.
• You load from a different source file on different days.

To identify the source XML file for each row in the target
1. In the XML Source File Editor, select Include file name column, which generates a column DI_FILENAME to contain the name of the source XML file.
2. In the Query editor, map the DI_FILENAME column from Schema In to Schema Out.
3. When you run the job, the target DI_FILENAME column will contain the source XML file name for each row in the target.

Mapping optional schemas
You can quickly specify default mapping for optional schemas without having to manually construct an empty nested table for each optional schema in the Query transform. Also, when you import XML schemas (either through DTDs or XSD files), Data Integrator automatically marks nested tables as optional if the corresponding option was set in the DTD or XSD file. Data Integrator retains this option when you copy and paste schemas into your Query transforms.

This feature is especially helpful when you have very large XML schemas with many nested levels in your jobs. When you make a schema column optional and do not provide mapping for it, Data Integrator instantiates the empty nested table when you run the job. While a schema element is marked as optional, you can still provide a mapping for the schema by appropriately programming the corresponding sub-query block with application logic that specifies how Data Integrator should produce the output. However, if you modify any part of the sub-query block, the resulting query block must be complete and conform to normal validation rules required for a nested query block. You must map any output schema not marked as optional to a valid nested query block.

To make a nested table "optional"
1. Right-click a nested table and select Optional to toggle it on. To toggle it off, right-click the nested table again and select Optional again.
2. You can also right-click a nested table, select Properties, go to the Attributes tab, and set the Optional Table attribute value to yes or no. Click Apply and OK to set.
Note: If the Optional Table value is something other than yes or no, this nested table cannot be marked as optional.

When you run a job with a nested table set to optional and you have nothing defined for any columns and nested tables beneath that table, Data Integrator generates special ATL and does not perform user interface validation for this nested table. Data Integrator generates a NULL in the corresponding PROJECT list slot of the ATL for any optional schema without an associated, defined sub-query block.

Example:
  CREATE NEW Query ( EMPNO int KEY,
    ENAME varchar(10),
    JOB varchar(9),
    NT1 al_nested_table ( DEPTNO int KEY,
      DNAME varchar(14),
      NT2 al_nested_table (C1 int) ) SET("Optional Table" = 'yes') )
  AS SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, NULL FROM EMP;

Note: You cannot mark top-level schemas, unnested tables, or nested tables containing function calls optional.

Using Document Type Definitions (DTDs)
The format of an XML document (file or message) can be specified by a document type definition (DTD). The DTD describes the data contained in the XML document and the relationships among the elements in the data. For an XML document that contains information to place a sales order (order header, customer, and line items), the corresponding DTD includes the order structure and the relationship between data.

Import the metadata for each DTD you use. The object library lists imported DTDs in the Formats tab. You can import metadata from either an existing XML file (with a reference to a DTD) or a DTD file. If you import the metadata from an XML file, Data Integrator automatically retrieves the DTD for that XML file.
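As an illustration, a DTD describing a sales order of this kind might contain declarations such as the following. This is a minimal, hypothetical sketch; the DTD you import defines your own elements:

  <!ELEMENT SalesOrder (OrderHeader, Customer, LineItems)>
  <!ELEMENT OrderHeader (OrderNo, OrderDate)>
  <!ELEMENT Customer (CustID, Name)>
  <!ELEMENT LineItems (Item+)>
  <!ELEMENT Item (Material, Qty)>
  <!ELEMENT OrderNo (#PCDATA)>
  <!ELEMENT OrderDate (#PCDATA)>
  <!ELEMENT CustID (#PCDATA)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Material (#PCDATA)>
  <!ELEMENT Qty (#PCDATA)>

When such a DTD is imported, the repeating Item element is represented as a nested table, just as with an imported XML Schema.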

When importing a DTD, Data Integrator reads the defined elements and attributes. Data Integrator ignores other parts of the definition, such as text and comments. This allows you to modify imported XML data and edit the data type as needed. See the Data Integrator Reference Guide for information about Data Integrator attributes that support DTDs.

To import a DTD or XML Schema format
1. From the object library, click the Format tab.
2. Right-click the DTDs icon and select New.
The Import DTD Format window opens.
3. Enter settings into the Import DTD Format window:
• In the DTD definition name box, enter the name you want to give the imported DTD format in Data Integrator.
• Enter the file that specifies the DTD you want to import.
Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the file path. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
• If importing an XML file, select XML for the File type option. If importing a DTD file, select the DTD option.
• In the Root element name box, select the name of the primary node you want to import. Data Integrator only imports elements of the DTD that belong to this node or any subnodes.
• If the DTD contains recursive elements (element A contains B, element B contains A), specify the number of levels it has by entering a value in the Circular level box. This value must match the number of recursive levels in the DTD's content. Otherwise, the job that uses this DTD will fail.
• You can set Data Integrator to import strings as a varchar of any size. Varchar 1024 is the default.
4. Click OK.

After you import a DTD, you can edit its column properties, such as data type, using the General tab of the Column Properties window. You can also view and edit DTD nested table and column attributes from the Column Properties window.

To view and edit nested table and column attributes for DTDs
1. From the object library, select the Formats tab.
2. Expand the DTDs category.
3. Double-click a DTD name.
The DTD Format window appears in the workspace.
4. Double-click a nested table or column.
The Column Properties window opens.
5. Select the Attributes tab to view or edit DTD attributes.

Generating DTDs and XML Schemas from an NRDM schema
You can right-click any schema from within a query editor in the Designer and generate a DTD or an XML Schema that corresponds to the structure of the selected schema (either NRDM or relational). This feature is useful if you want to stage data to an XML file and subsequently read it into another data flow. First generate a DTD/XML Schema. Then use it to set up an XML format, which in turn is used to set up an XML source for the staged file.

The DTD/XML Schema generated will be based on the following information:
• Columns become either elements or attributes based on whether the XML Type attribute is set to ATTRIBUTE or ELEMENT.
• If the Required attribute is set to NO, the corresponding element or attribute is marked optional.
• Nested tables become intermediate elements.
• The Native Type attribute is used to set the type of the element or attribute.
• While generating XML Schemas, the MinOccurs and MaxOccurs values will be set based on the Minimum Occurrence and Maximum Occurrence attributes of the corresponding nested table.
No other information is considered while generating the DTD or XML Schema. See the Data Integrator Reference Guide for details about how Data Integrator creates internal attributes when importing a DTD or XML Schema.

Operations on nested data
This section discusses:
• Overview of nested data and the Query transform
• FROM clause construction
• Nesting columns
• Using correlated columns in nested data

• Distinct rows and nested data
• Grouping values across nested schemas
• Unnesting nested data
• How transforms handle nested data

Overview of nested data and the Query transform
With relational data, a Query transform allows you to execute a SELECT statement. The mapping between input and output schemas defines the project list for the statement. When working with nested data, the query provides an interface to perform SELECTs at each level of the relationship that you define in the output schema.

In Data Integrator, you use the Query transform to manipulate nested data. If you want to extract only part of the nested data, you can use the XML_Pipeline transform (see the Data Integrator Reference Guide).

Without nested schemas, the Query transform assumes that the FROM clause in the SELECT statement contains the data sets that are connected as inputs to the query object. A query that includes nested data, however, includes a SELECT statement to define operations for each parent and child schema in the output, and because a SELECT statement can only include references to relational data sets, you must explicitly define the FROM clause in a query when working with nested data.

The Query Editor contains a tab for each clause of the query:
• The SELECT select_list applies to the current schema, which the Schema Out text box displays.
• The FROM tab includes top-level columns by default. You can include columns from nested schemas or remove the top-level columns in the FROM list by adding schemas to the FROM tab. Data Integrator assists by setting the top-level inputs as the default FROM clause values for the top-level output schema.
The other SELECT statement elements defined by the query work the same with nested data as they do with flat data.

The parameters you enter for the following tabs apply only to the current schema (as displayed in the Schema Out text box at the top right):
• WHERE
• GROUP BY
• ORDER BY
For information on setting the current schema and completing the parameters, see Query editor on page 189. The current schema allows you to distinguish multiple SELECT statements from each other within a single query.

FROM clause construction
When you include a schema in the FROM clause, you indicate that all of the columns in the schema, including columns containing nested schemas, are available to be included in the output. If you include more than one schema in the FROM clause, you indicate that the output will be formed from the cross product of the two schemas, constrained by the WHERE clause for the current schema.

These FROM clause descriptions and the behavior of the query are exactly the same with nested data as with relational data. However, because the SELECT statements are dependent upon each other, and because the user interface makes it easy to construct arbitrary data sets, determining the appropriate FROM clauses for multiple levels of nesting can be complex.

A FROM clause can contain:
• Any top-level schema from the input
• Any schema that is a column of a schema in the FROM clause of the parent schema
The FROM clauses form a path that can start at any level of the output. The first schema in the path must always be a top-level schema from the input.

The data that a SELECT statement from a lower schema produces differs depending on whether or not a schema is included in the FROM clause at the top level. The next two examples use the sales order data set to illustrate scenarios where FROM clause values change the data resulting from the query.

Example: FROM clause includes all top-level inputs
To include detailed customer information for all of the orders in the output, join the order schema at the top level with a customer schema. Include both input schemas at the top level in the FROM clause to produce the appropriate data.
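In the Designer this example appears as a query diagram; as a rough text sketch of the query settings, the top-level join would look something like the following. The schema names OrderStatus_In and cust are the ones used in this example, but the Cust_ID join column is an assumed name for illustration:

  FROM tab (top-level schema):  OrderStatus_In, cust
  WHERE tab:                    OrderStatus_In.Cust_ID = cust.Cust_ID
  Schema Out:                   SALES_ORDER_NUMBER, CustID, Customer name, Address, ...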

This example shows:
• The FROM clause includes the two top-level schemas OrderStatus_In and cust.
• The Schema Out pane shows customer details CustID, Customer name, and Address for each SALES_ORDER_NUMBER.

Example: Lower level FROM clause contains top-level input
Suppose you want the detailed information from one schema to appear for each row in a lower level of another schema. For example, the input includes a materials schema and a nested line-item schema, and you want the output to include detailed material information for each line item.

This example shows:
• The nested schema LineItems in Schema Out has a FROM clause that specifies only the Orders schema.
• To include the Description from the top-level Materials schema for each row in the nested LineItems schema:
  • Map Description from the top-level Materials schema in Schema In to LineItems.
  • Specify the following join constraint:
    "Order".LineItems.Item = Materials.Item
  • In the FROM clause, specify the top-level Materials schema and the nested LineItems schema.

Nesting columns
When you nest rows of one schema inside another, the data set produced in the nested schema is the result of a query against the first one using the related values from the second one. For example, if you have sales-order information in a header schema and a line-item schema, you can nest the line items under the header schema. The line items for a single row of the header schema are equal to the results of a query including the order number:

  SELECT * FROM LineItems WHERE Header.OrderNo = LineItems.OrderNo

In Data Integrator, you can use a query transform to construct a nested data set from relational data. When you indicate the columns included in the nested schema, specify the query used to define the nested data set for each row of the parent schema.

To construct a nested data set
1. Create a data flow with the sources that you want to include in the nested data set.
2. Place a query in the data flow and connect the sources to the input of the query.
3. Indicate the FROM clause, source list, and WHERE clause to describe the SELECT statement that the query executes to determine the top-level data set:
• FROM clause: Include the input sources in the list on the From tab.
• Source list: Drag the columns from the input to the output. You can also include new columns or include mapping expressions for the columns.
• WHERE clause: Include any filtering or joins required to define the data set for the top-level output.
4. Create a new schema in the output.
In the output of the query, right-click and choose New Output Schema. A new schema icon appears in the output, nested under the top-level schema.
You can also drag an entire schema from the input to the output.
5. Change the current schema to the nested schema. (For information on setting the current schema and completing the parameters, see Query editor on page 189.) The query editor changes to display the new current schema.
6. Indicate the FROM clause, source list, and WHERE clause to describe the SELECT statement that the query executes to determine the data set for the nested schema:
• FROM clause: If you created a new output schema, you need to drag schemas from the input to populate the FROM clause. If you dragged an existing schema from the input to the top-level output, that schema is automatically listed.
• Select list: Only columns that meet the requirements for the FROM clause are available, as described in FROM clause construction on page 237.
• WHERE clause: Only columns that meet the requirements for the FROM clause are available, and only if their schema is included in the FROM clause for this schema.
7. If the output requires it, nest another schema at this level. Repeat steps 4 through 6 in this current schema.
8. If the output requires it, nest another schema under the top level. Make the top-level schema the current schema.

Using correlated columns in nested data
Correlation allows you to use columns from a higher-level schema to construct a nested schema. In a nested-relational model, the columns in a nested schema are implicitly related to the columns in the parent row. To take advantage of this relationship, you can use columns from the parent schema in the construction of the nested schema. The higher-level column is a correlated column.

Including a correlated column in a nested schema can serve two purposes:
• The correlated column is a key in the parent schema. Including the key in the nested schema allows you to maintain a relationship between the two schemas after converting them from the nested data model to a relational model.
• The correlated column is an attribute in the parent schema. Including the attribute in the nested schema allows you to use the attribute to simplify correlated queries against the nested data.
To include a correlated column in a nested schema, you do not need to include the schema that includes the column in the FROM clause of the nested schema.

To use a correlated column in a nested schema
1. Create a data flow with a source that includes a parent schema with a nested schema. For example, the source could be an order header schema that has a LineItems column that contains a nested schema.
2. Connect a query to the output of the source.
3. In the query editor, copy all columns of the parent schema to the output. In addition to the top-level columns, Data Integrator creates a column called LineItems that contains a nested schema that corresponds to the LineItems nested schema in the input.
4. Change the current schema to the LineItems schema. (For information on setting the current schema and completing the parameters, see Query editor on page 189.)
5. Include a correlated column in the nested schema.
Correlated columns can include columns from the parent schema and any other schemas in the FROM clause of the parent schema. For example, drag the OrderNo column from the Header schema into the LineItems schema. Including the correlated column creates a new output column in the LineItems schema called OrderNo and maps it to the Order.OrderNo column. The data set created for LineItems includes all of the LineItems columns and the OrderNo.
If the correlated column comes from a schema other than the immediate parent, the data in the nested schema includes only the rows that match both the related values in the current row of the parent schema and the value of the correlated column.
You can always remove the correlated column from the lower-level schema in a subsequent query transform.
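After step 5, the output schema would look roughly like the following sketch. OrderNo is the correlated column described above; the Item and Qty columns are assumed names used only for illustration:

  Order (top-level schema):  OrderNo, CustID, ...
    LineItems (nested schema):  OrderNo   (correlated column, mapped to Order.OrderNo)
                                Item, Qty, ...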

Distinct rows and nested data
The Distinct rows option in Query transforms removes any duplicate rows at the top level of a join. This is particularly useful to avoid cross products in joins that produce nested output.

Grouping values across nested schemas
When you specify a Group By clause for a schema with a nested schema, the grouping operation combines the nested schemas for each group. For example, to assemble all the line items included in all the orders for each state from a set of orders, you can set the Group By clause in the top level of the data set to the state column (Order.State) and create an output schema that includes the State column (set to Order.State) and the LineItems nested schema. The result is a set of rows (one for each state) that has the State column and the LineItems nested schema that contains all the LineItems for all the orders for that state.

Unnesting nested data
Loading a data set that contains nested schemas into a relational (non-nested) target requires that the nested rows be unnested. For example, a sales order may use a nested schema to define the relationship between the order header and the order line items. To load the data into relational schemas, the multi-level data must be unnested. Unnesting a schema produces a cross product of the top-level schema (parent) and the nested schema (child).
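For example, a single order row with two nested line items unnests into one output row per line item. The values below are made up purely for illustration:

  Before unnesting (nested):
    Order:  OrderNo = 9999, CustID = C100
      LineItems:  (Item = M-01, Qty = 2)
                  (Item = M-02, Qty = 5)

  After unnesting (cross product of parent and child):
    OrderNo  CustID  Item  Qty
    9999     C100    M-01  2
    9999     C100    M-02  5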

It is also possible that you would load different columns from different nesting levels into different schemas. A sales order, for example, may be flattened so that the order number is maintained separately with each line item, and the header and line item information loaded into separate schemas.

Data Integrator allows you to unnest any number of nested schemas at any depth. No matter how many levels are involved, the result of unnesting schemas is a cross product of the parent and child schemas. When more than one level of unnesting occurs, the inner-most child is unnested first; then the result (the cross product of the parent and the inner-most child) is unnested from its parent, and so on to the top-level schema.

Unnesting all schemas (cross product of all data) might not produce the results you intend. For example, if an order includes multiple customer values such as ship-to and bill-to addresses, flattening a sales order by unnesting customer and line-item schemas produces rows of data that might not be useful for processing the order. Data for unneeded columns or schemas might be more difficult to filter out after the unnesting operation. You can use the Cut command to remove columns or schemas from the top level; to remove nested schemas or columns inside nested schemas, make the nested schema the current schema, and then cut the unneeded columns or nested columns.

To unnest nested data
1. Create the output that you want to unnest in the output schema of a query.

2. For each of the nested schemas that you want to unnest, right-click the schema name and choose Unnest.
The output of the query (the input to the next step in the data flow) includes the data in the new relationship.

How transforms handle nested data
Nested data included in the input to transforms (with the exception of a Query or XML_Pipeline transform) passes through the transform without being included in the transform's operation. Only the columns at the first level of the input data set are available for subsequent transforms.

To transform values in lower levels of nested schemas
1. Take one of the following actions to obtain the nested data:
• Use a Query transform to unnest the data. For details, see Unnesting nested data on page 243.
• Use an XML_Pipeline transform to select portions of the nested data. For details, see the Data Integrator Reference Guide.
2. Perform the transformation.
3. Nest the data again to reconstruct the nested relationships.

XML extraction and parsing for columns
In addition to extracting XML message and file data, representing it as NRDM data during transformation, and then loading it to an XML message or file, you can also use Data Integrator to extract XML data stored in a source table or flat file column, transform it as NRDM data, and then load it to a target or flat file column.

More and more database vendors allow you to store XML in one column. The field is usually a varchar, long, or clob. Data Integrator's XML handling capability also supports reading from and writing to such fields. Data Integrator provides four functions to support extracting from and loading to columns:
• extract_from_xml
• load_to_xml
• long_to_varchar
• varchar_to_long

The extract_from_xml function gets the XML content stored in a single column and builds the corresponding NRDM structure so that Data Integrator can transform it. The function load_to_xml generates XML from a given NRDM structure in Data Integrator, then loads the generated XML to a varchar column. This function takes varchar data only.

Data Integrator converts a clob data type input to varchar if you select the Import unsupported data types as VARCHAR of size option when you create a database datastore connection in the Datastore Editor. If your source uses a long data type, use the long_to_varchar function to convert data to varchar. In other words, data from long and clob columns must be converted to varchar before it can be transformed by Data Integrator. If you want a job to convert the output to a long column, use the varchar_to_long function, which takes the output of the load_to_xml function as input.

Note: Data Integrator limits the size of the XML supported with these methods to 100K due to the current limitation of its varchar data type. There are plans to lift this restriction in the future.

Sample scenarios
The following scenarios describe how to use the four Data Integrator functions to extract XML data from a source column and load it into a target column.

Scenario 1
Using the long_to_varchar and extract_from_xml functions to extract XML data from a column with data of the type long.

To extract XML data from a column into Data Integrator
First, assume you have previously performed the following steps:
1. Imported an Oracle table that contains a column named Content with the data type long, which contains XML data for a purchase order.
2. Imported the XML Schema PO.xsd, which provides the format for the XML data, into the Data Integrator repository.
3. Created a Project, a job, and a data flow for your design.
4. Opened the data flow and dropped the source table with the column named content in the data flow.
From this point:
1. Create a query with an output column of data type varchar, and make sure that its size is big enough to hold the XML data.
2. Name this output column content.
3. In the query editor, map the source table column to a new output column.
4. In the Map section of the query editor, open the Function Wizard, select the Conversion function type, then select the long_to_varchar function and configure it by entering its parameters:

  long_to_varchar(content, 4000)

The second parameter in this function (4000 in this case) is the maximum size of the XML data stored in the table column. Use this parameter with caution. If the size is not big enough to hold the maximum XML data for the column, Data Integrator will truncate the data and cause a runtime error. Conversely, do not enter a number that is too big, which would waste computer memory at runtime.
5. Create a second query that uses the function extract_from_xml to extract the XML data.
a. To invoke the function extract_from_xml, right-click the current context in the query, then choose New Function Call…. When the Function Wizard opens, select Conversion and extract_from_xml.
Note: You can only use the extract_from_xml function in a new function call. Otherwise, this function is not displayed in the function wizard.
b. Enter values for the input parameters:
• The first is the XML column name. Enter content, which is the output column in the previous query that holds the XML data.
• The second parameter is the DTD or XML Schema name. Enter the name of the purchase order schema (in this case, PO).
• The third parameter is Enable validation. Enter 1 if you want Data Integrator to validate the XML with the specified Schema. Enter 0 if you do not.
c. Click Next.
d. For the function, select a column or columns that you want to use on output.
Imagine that this purchase order schema has five top-level elements: orderDate, shipTo, billTo, comment, and items. You can select any number of the top-level columns from an XML schema, which include either scalar or NRDM column data. The return type of the column is defined in the schema. If the function fails due to an error when trying to produce the XML output, Data Integrator returns NULL for scalar columns and empty nested tables for NRDM columns.
The extract_from_xml function also adds two columns:
• AL_ERROR_NUM: returns error codes, 0 for success and a non-zero integer for failures.
• AL_ERROR_MSG: returns an error message if AL_ERROR_NUM is not 0; returns NULL if AL_ERROR_NUM is 0.
Choose one or more of these columns as the appropriate output for the extract_from_xml function.
e. Click Finish.
Data Integrator generates the function call in the current context and populates the output schema of the query with the output columns you specified.
6. With the data converted into the Data Integrator NRDM structure, you are ready to do appropriate transformation operations on it. For example, if you want to load the NRDM structure to a target XML file, create an XML file target and connect the second query to it.
Note: If you find that you want to modify the function call, right-click the function call in the second query and choose Modify Function Call.
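This scenario uses two separate queries, one for each function. As noted just below, the two conversions can also be combined in a single mapping expression; a sketch of that combined first parameter, reusing the column name content, the schema name PO, and the 4000-byte size from this scenario (the output columns are still chosen through the function wizard):

  extract_from_xml(long_to_varchar(content, 4000), 'PO', 1)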

In this example, to extract XML data from a column of data type long, we created two queries: the first query to convert the data using the long_to_varchar function and the second query to add the extract_from_xml function.

Alternatively, you can use just one query by entering the function expression long_to_varchar directly into the first parameter of the function extract_from_xml. The first parameter of the function extract_from_xml can take a column of data type varchar or an expression that returns data of type varchar. If the data type of the source column is not long but varchar, do not include the function long_to_varchar in your data flow.

Scenario 2
Using the load_to_xml function and the varchar_to_long function to convert a Data Integrator NRDM structure to scalar data of the varchar type in an XML format and load it to a column of the data type long.

In this example, you want to convert an NRDM structure for a purchase order to XML data using the function load_to_xml, and then load the data to an Oracle table column called content, which is of the long data type. Because the function load_to_xml returns a value of varchar data type, you use the function varchar_to_long to convert the value of varchar data type to a value of the data type long.

To load XML data into a column of the data type long
1. Create a query and connect a previous query or source (that has the NRDM structure of a purchase order) to it. In this query, create an output column of the data type varchar called content. Make sure the size of the column is big enough to hold the XML data.
2. From the Mapping area, open the function wizard, click the category Conversion Functions, and then select the function load_to_xml.
3. Click Next.
4. Enter values for the input parameters. The function load_to_xml has seven parameters.
5. Click Finish.
In the mapping area of the Query window, notice the function expression:

  load_to_xml(PO, 'PO', 1, '<?xml version="1.0" encoding = "UTF-8" ?>', NULL, 1, 4000)

In this example, this function converts the NRDM structure of purchase order PO to XML data and assigns the value to output column content. For more information, see the Data Integrator Reference Guide.
6. Create another query with output columns matching the columns of the target table. Assume the column is called content and it is of the data type long.
a. Open the function wizard from the mapping section of the query and select the Conversion Functions category.
b. Use the function varchar_to_long to map the input column content to the output column content. The function varchar_to_long takes only one input parameter. Enter a value for the input parameter:

  varchar_to_long(content)

7. Connect this query to a database target.
Like the example using the extract_from_xml function, in this example you used two queries: the first query to convert an NRDM structure to XML data and to assign the value to a column of varchar data type, and the second query to convert the varchar data type to long. You can use just one query if you use the two functions in one expression:

  varchar_to_long( load_to_xml(PO, 'PO', 1, '<?xml version="1.0" encoding = "UTF-8" ?>', NULL, 1, 4000) )

If the data type of the column in the target database table that stores the XML data is varchar, there is no need for varchar_to_long in the transformation.

Chapter 10: Real-time jobs

Overview
Data Integrator supports real-time data transformation. Real-time means that Data Integrator can receive requests from ERP systems and Web applications and send replies immediately after getting the requested data from a data cache or a second application. You define operations for processing on-demand messages by building real-time jobs in the Designer.

This chapter contains the following topics:
• Request-response message processing
• What is a real-time job?
• Creating real-time jobs
• Real-time source and target objects
• Testing real-time jobs
• Building blocks for real-time jobs
• Designing real-time applications

Request-response message processing
The message passed through a real-time system includes the information required to perform a business transaction. The content of the message can vary:
• It could be a sales order or an invoice processed by an ERP system destined for a data cache.
• It could be an order status request produced by a Web application that requires an answer from a data cache or back-office system.

The Data Integrator Access Server constantly listens for incoming messages. When a message is received, the Access Server routes the message to a waiting process that performs a predefined set of operations for the message type. The Access Server then receives a response for the message and replies to the originating application.

Two Data Integrator components support request-response message processing:
• Access Server: listens for messages and routes each message based on message type.
• Real-time job: performs a predefined set of operations for that message type and creates a response.

Processing might require that additional data be added to the message from a data cache or that the message data be loaded to a data cache. The Access Server returns the response to the originating application.

What is a real-time job?
The Data Integrator Designer allows you to define the processing of real-time messages using a real-time job. You create a different real-time job for each type of message your system can produce.

Real-time versus batch
Like a batch job, a real-time job extracts, transforms, and loads data. Real-time jobs "extract" data from the body of the message received and from any secondary sources used in the job. Each real-time job can extract data from a single message type. It can also extract data from other sources such as tables or files.

The same powerful transformations you can define in batch jobs are available in real-time jobs. However, you might use transforms differently in real-time jobs. For example, you might use branches and logic controls more often than you would in batch jobs. If a customer wants to know when they can pick up their order at your distribution center, you might want to create a CheckOrderStatus job using a look-up function to count order items and then a case transform to provide status in the form of strings: "No items are ready for pickup" or "X items in your order are ready for pickup" or "Your order is ready for pickup".

Also in real-time jobs, Data Integrator writes data to message targets and secondary targets in parallel. This ensures that each message receives a reply as soon as possible.

Unlike batch jobs, real-time jobs do not execute in response to a schedule or internal trigger; instead, real-time jobs execute as real-time services started through the Administrator. Real-time services then wait for messages from the Access Server. When the Access Server receives a message, it passes the message to a running real-time service designed to process this message type. The real-time service processes the message and returns a response. The real-time service continues to listen and process messages on demand until it receives an instruction to shut down.

Messages
How you design a real-time job depends on what message you want it to process. Typical messages include information required to implement a particular business operation and to produce an appropriate response.

For example, a message could be a sales order to be entered into an ERP system. The message might include the order number, customer information, and the line-item details for the order. The message processing could return confirmation that the order was submitted successfully.

In a second case, suppose a message includes information required to determine order status for a particular order. The message contents might be as simple as the sales order number. The corresponding real-time job might use the input to query the right sources and return the appropriate product information. In this case, the message contains data that can be represented as a single column in a single-row table.

In the first case, the message contains data that cannot be represented in a single table. In this sales order, the order header information can be represented by a table and the line items for the order can be represented by a second table. Data Integrator represents the header and line item data in the message in a nested relationship. When processing the message, the real-time job processes all of the rows of the nested table for each row of the top-level table. In this sales order, both of the line items are processed for the single row of header information.

Data Integrator data flows support the nesting of tables within other tables. See Chapter 9: Nested Data for details.

Real-time jobs can send only one row of data in a reply message (message target). However, you can structure message targets so that all data is contained in a single row by nesting tables within columns of a single, top-level table.

Real-time job examples
These examples provide a high-level description of how real-time jobs address typical real-time scenarios. Later sections describe the actual objects that you would use to construct the logic in the Designer.

Loading transactions into a back-office application
A real-time job can receive a transaction from a Web application and load it to a back-office application (ERP, SCM, legacy). Using a query transform, you can include values from a data cache to supplement the transaction before applying it against the back-office application (such as an ERP system).

Collecting back-office data into a data cache
You can use messages to keep the data cache current. Real-time jobs can receive messages from a back-office application and load them into a data cache or data warehouse.

Retrieving values, data cache, back-office apps
You can create real-time jobs that use values from a data cache to determine whether or not to query the back-office application (such as an ERP system) directly.

Creating real-time jobs
You can create real-time jobs using the same objects as batch jobs (data flows, work flows, conditionals, scripts, while loops, etc.). However, object usage must adhere to a valid real-time job model.

Real-time job models
In contrast to batch jobs, which typically move large amounts of data at scheduled times, a real-time job, once started as a real-time service, listens for a request. When a real-time job receives a request (typically to access a small number of records), Data Integrator processes the request, returns a reply, and continues listening. This listen-process-listen logic forms a processing loop.

A real-time job is divided into three processing components: initialization, a real-time processing loop, and clean-up.
• The initialization component (optional) can be a script, work flow, data flow, or a combination of objects. It runs only when a real-time service starts.
• The real-time processing loop is a container for the job's single process logic. You can specify any number of work flows and data flows inside it.
• The clean-up component (optional) can be a script, work flow, data flow, or a combination of objects. It runs only when a real-time service is shut down.

In a real-time processing loop, a single message source must be included in the first step and a single message target must be included in the last step. The following models support this rule:
• Single data flow model
• Multiple data flow model
• Request/Acknowledge data flow model (see the Data Integrator Supplement for SAP)

Single data flow model
With the single data flow model, you create a real-time job using a single data flow in its real-time processing loop. This single data flow must include a single message source and a single message target.

Multiple data flow model
The multiple data flow model allows you to create a real-time job using multiple data flows in its real-time processing loop. By using multiple data flows, you can ensure that data in each message is completely processed in an initial data flow before processing for the next data flows starts. For example, if the data represents 40 items, all 40 must pass through the first data flow to a staging or memory table before passing to a second data flow. This allows you to control and collect all the data in a message at any point in a real-time job for design and troubleshooting purposes.

If you use multiple data flows in a real-time processing loop:
• The first object in the loop must be a data flow. This data flow must have one and only one message source.
• The last object in the loop must be a data flow. This data flow must have a message target.

• Additional data flows cannot have message sources or targets.
• You can add any number of additional data flows to the loop, and you can add them inside any number of work flows.
• All data flows can use input and/or output memory tables to pass data sets on to the next data flow. Memory tables store data in memory while a loop runs. They improve the performance of real-time jobs with multiple data flows.

Using real-time job models

Single data flow model
When you use a single data flow within a real-time processing loop, the data flow diagram has one message source and one message target.

Multiple data flow model
When you use multiple data flows within a real-time processing loop, your data flow diagrams might look like those in the following example scenario, in which Data Integrator writes data to several targets according to your multiple data flow design.

Example scenario requirements:
Your job must do the following tasks, completing each one before moving on to the next:
• Receive requests about the status of individual orders from a web portal and record each message to a backup flat file.
• Perform a query join to find the status of the order and write to a customer database table.
• Reply to each message with the query join results.

Solution:
First, create a real-time job and add a data flow, a work flow, and another data flow to the real-time processing loop. Second, add a data flow to the work flow. Next, set up the tasks in each data flow:
• The first data flow receives the XML message (using an XML message source) and records the message to the flat file (flat file format target). Meanwhile, this same data flow writes the data into a memory table (table target).
Note: You might want to create a memory table to move data to sequential data flows. For more information, see "Memory datastores" on page 102.
• The second data flow reads the message data from the memory table (table source), performs a join with stored data (table source), and writes the results to a database table (table target) and a new memory table (table target). Notice this data flow has neither a message source nor a message target.
• The last data flow sends the reply. It reads the result of the join in the memory table (table source) and loads the reply (XML message target).
For more information about building real-time jobs, see the "Building blocks for real-time jobs" and "Designing real-time applications" sections.
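A rough text sketch of the loop described above (the object names are illustrative, not prescribed):

  First data flow:    XML message source  ->  flat file target
                                          ->  memory table 1 (target)
  Work flow, second data flow:
                      memory table 1 (source) + stored data (table source)
                                          ->  query join  ->  database table (target)
                                                          ->  memory table 2 (target)
  Last data flow:     memory table 2 (source)  ->  XML message target (reply)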

Creating a real-time job

To create a real-time job
1. In the Designer, create or open an existing project.
2. From the project area, right-click the white space and select New Realtime job from the shortcut menu.
New_RTJob1 appears in the project area.
3. In the project area, rename New_RTJob1.
Always add a prefix to job names with their job type. In this case, use the naming convention: RTJOB_JobName.
Although saved real-time jobs are grouped together under the Job tab of the object library, job names may also appear in text editors used to create adapter or Web Services calls. In these cases, a prefix saved with the job name will help you identify it.
The workspace displays the job's structure, which consists of two markers:
• RT_Process_begins
• Step_ends
These markers represent the beginning and end of a real-time processing loop.

4. If you want to create a job with a single data flow:
a. Click the data flow icon in the tool palette.
You can add data flows to either a batch or real-time job. When you place a data flow icon into a job, you are telling Data Integrator to validate the data flow according to the requirements of the job type (batch or real-time).
b. Click inside the loop.
The boundaries of a loop are indicated by begin and end markers. One message source and one message target are allowed in a real-time processing loop.
c. Connect the begin and end markers to the data flow.
d. Build the data flow including a message source and message target.
For more information about sources and targets, see "Real-time source and target objects" on page 266.
e. Add, configure, and connect initialization object(s) and clean-up object(s) as needed.
A real-time job with a single data flow might look like this:
5. If you want to create a job with multiple data flows:
a. Drop and configure a data flow. This data flow must include one message source.

b. Drop, configure, and connect the initialization and clean-up objects outside the real-time processing loop as needed.
c. After this data flow, drop other objects such as work flows, scripts, data flows, or conditionals from left to right between the first data flow and the end of the real-time processing loop.
d. To include parallel processing in a real-time job, drop data flows within job-level work flows. Do not connect these secondary-level data flows. These data flows will run in parallel when job processing begins.
e. Just before the end of the loop, drop and configure your last data flow. This data flow must include one message target.
f. Open each object and configure it. Return to the real-time job window and connect all the objects.
Note: Objects at the real-time job level in Designer diagrams must be connected. Connected objects run in sequential order.
A real-time job with multiple data flows might look like this:
6. After adding and configuring all objects, validate your job.
7. Save the job.
8. Assign test source and target files for the job and execute it in test mode. For more information see "Testing real-time jobs" on page 270.

9. Using the Administrator, configure a service and service providers for the job and run it in your test environment.

Real-time source and target objects
Real-time jobs must contain a real-time source and/or target object. Those normally available are:
• XML message: An XML message structured in a DTD or XML Schema format. Used as a source or target. Data Integrator accesses it directly or through adapters.
• Outbound message: A real-time message with an application-specific format (not readable by an XML parser). Used as a target. Data Integrator accesses it through an adapter.
If you have the SAP licensed extension, you can also use IDoc messages as real-time sources. For more information, see the Data Integrator Supplement for SAP.
Adding sources and targets to real-time jobs is similar to adding them to batch jobs, with the following additions:
• For XML messages: Import a DTD or XML Schema to define a format (see "To import a DTD or XML Schema format" on page 233). Object library location: Formats tab.
• For an outbound message: Define an adapter datastore and import object metadata (see "Adapter datastores" on page 111). Object library location: Datastores tab, under the adapter datastore.

To view an XML message source or target schema
In the workspace of a real-time job, click the name of an XML message source or XML message target to open its editor.
If the XML message source or target contains nested data, the schema displays nested tables to represent the relationships among the data.

(Schema editor callouts: Root element; Columns at the top level; Nested table; Columns nested one level; Sample file for testing)

Secondary sources and targets
Real-time jobs can also have secondary sources or targets (see "Source and target objects" on page 178). For example, suppose you are processing a message that contains a sales order from a Web application. The order contains the customer name, but when you apply the order against your ERP system, you need to supply more detailed customer information. The supplementary information might come from the ERP system itself or from a data cache containing the same information. Inside a data flow of a real-time job, you can supplement the message with the customer information to produce the complete document to send to the ERP system.

Tables and files (including XML files) as sources can provide this supplementary information.
Data Integrator reads data from secondary sources according to the way you design the data flow. Data Integrator loads data to secondary targets in parallel with a target message.
Add secondary sources and targets to data flows in real-time jobs as you would to data flows in batch jobs (see "Adding source or target objects to data flows" on page 180).

Transactional loading of tables
Target tables in real-time jobs support transactional loading, in which the data resulting from the processing of a single data flow can be loaded into multiple tables as a single transaction. No part of the transaction applies if any part fails.
Note: Target tables in batch jobs also support transactional loading. However, use caution when you consider enabling this option for a batch job because it requires the use of memory, which can reduce performance when moving large amounts of data.
You can specify the order in which tables in the transaction are included using the target table editor. This feature supports a scenario in which you have a set of tables with foreign keys that depend on one with primary keys.

(Target table editor callouts: Turn on transactional loading; Assign the order of this table in the transaction)
You can use transactional loading only when all the targets in a data flow are in the same datastore; if the targets are in different datastores, targets in each datastore load independently. If you use transactional loading, you cannot use bulk loading or pre-load commands.
Keep the following in mind when you design data flows in real-time jobs:
• If more than one supplementary source is included in a join with a real-time source, you can control which table is included in the next outer-most loop of the join using the join ranks for the tables.
• In real-time jobs, do not cache data from secondary sources unless the data is static.

• If no rows are passed to the XML target, the real-time job returns an empty response to the Access Server. For example, if a request comes in for a product number that does not exist, your job might be designed in such a way that no data passes to the reply message. You might want to provide appropriate instructions to your user (exception handling in your job) to account for this type of scenario.
• If more than one row passes to the XML target, the target reads the first row and discards the other rows. To avoid this issue, use your knowledge of Data Integrator's Nested Relational Data Model (NRDM) and structure your message source and target formats so that one "row" equals one message. With NRDM, you can structure any amount of data into a single "row" because columns in tables can contain other tables. See Chapter 9: Nested Data for more information.
• Recovery mechanisms are not supported in real-time jobs.
For more detailed information about real-time job processing see the Data Integrator Reference Guide.

Testing real-time jobs
There are several ways to test real-time jobs during development. These include:
• Executing a real-time job in test mode
• Using View Data
• Using an XML file target

Executing a real-time job in test mode
You can test real-time job designs without configuring the job as a service associated with an Access Server. In test mode, you can execute a real-time job using a sample source message from a file to determine if Data Integrator produces the expected target message.

To specify a sample XML message and target test file
1. In the XML message source and target editors, enter a file name in the XML test file box.
Enter the full path name for the source file that contains your sample data. Use paths for both test files relative to the computer that runs the Job Server for the current repository.
2. Execute the job.

Test mode is always enabled for real-time jobs. Data Integrator reads data from the source test file and loads it into the target test file.

Using View Data
To ensure that your design returns the results you expect, execute your job using View Data. With View Data, you can capture a sample of your output data to ensure your design is working. See Chapter 15: Design and Debug for more information.

Using an XML file target
You can use an "XML file target" to capture the message produced by a data flow while allowing the message to be returned to the Access Server.
Just like an XML message, you define an XML file by importing a DTD or XML Schema for the file, then dragging the format into the data flow definition. Unlike XML messages, you can include XML files as sources or targets in batch and real-time jobs.

To use a file to capture output from steps in a real-time job
1. In the Formats tab of the object library, drag the DTD or XML Schema into a data flow of a real-time job.
A menu prompts you for the function of the file.
2. Choose Make XML File Target.
The XML file target appears in the workspace.
3. In the file editor, specify the location to which Data Integrator writes data.
Enter a file name relative to the computer running the Job Server.
4. Connect the output of the step in the data flow that you want to capture to the input of the file.

Building blocks for real-time jobs
This section describes some of the most common operations that real-time jobs can perform and how to define them in the Designer:
• Supplementing message data
• Branching data flow based on a data cache value
• Calling application functions
Also read about "Embedded Data Flows" on page 283 and the Case transform in the Data Integrator Reference Guide.

Supplementing message data
The data included in messages from real-time sources might not map exactly to your requirements for processing or storing the information. If not, you can define steps in the real-time job to supplement the message information.
One technique for supplementing the data in a real-time source includes these steps:
1. Include a table or file as a source. In addition to the real-time source, include the files or tables from which you require supplementary information.
2. Use a query to extract the necessary data from the table or file.
3. Use the data in the real-time source to find the necessary supplementary data. You can include a join expression in the query to extract the specific values required from the supplementary source.

Be careful to use data in the join that is guaranteed to return a value. If no value returns from the join, the query produces no rows and the message returns to the Access Server empty. If you cannot guarantee that a value returns, consider these alternatives:
• Lookup function call — Returns a default value if no match is found (see the sketch after the procedure below)
• Outer join — Always returns a value, even if no match is found

To supplement message data
In this example, a request message includes sales order information and its reply message returns order status. The business logic uses the customer number and priority rating to determine the level of status to return. The message includes only the customer name and the order number. A real-time job is then defined to retrieve the customer number and rating from other sources before determining the order status.
(Figure callouts: Input from the Web application; The WHERE clause joins the two inputs, resulting in output for only the sales document and line items included in the input from the application.)

1. Include the real-time source in the real-time job.
2. Include the supplementary source in the real-time job.
This source could be a table or file. In this example, the supplementary information required doesn't change very often, so it is reasonable to extract the data from a data cache rather than going to an ERP system directly.
3. Join the sources.
In a query transform, construct a join on the customer name:
Message.CustName = Cust_Status.CustName
You can construct the output to include only the columns that the real-time job needs to determine order status.
4. Complete the real-time job to determine order status.
The example shown here determines order status in one of two methods based on the customer status value. Order status for the highest ranked customers is determined directly from the ERP. Order status for other customers is determined from a data cache of sales order information. Both branches return order status for each line item in the order. The data flow merges the results and constructs the response. The logic can be arranged in a single or multiple data flows. The illustration below shows a single data flow model.
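As a minimal sketch of how these pieces might be expressed: the join from step 3 goes in the query's WHERE clause, and the lookup alternative mentioned above can be written with the lookup function. The datastore, owner, table, and column names (ERP_DS, DBO, CUST_STATUS, Rating) are hypothetical, and the exact argument list for lookup is described in the Data Integrator Reference Guide:

    WHERE clause of the query:
        Message.CustName = Cust_Status.CustName

    Column mapping that returns a default rating of 'C' when no match is found:
        lookup(ERP_DS.DBO.CUST_STATUS, Rating, 'C', 'PRE_LOAD_CACHE', CustName, Message.CustName)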

(Figure callouts: Supplement the order from a data cache; Join with a customer table to determine customer priority; Determine order status, either from a data cache or ERP)
The next section describes how to design branch paths in a data flow.

Branching data flow based on a data cache value
One of the most powerful things you can do with a real-time job is to design logic that determines whether responses should be generated from a data cache or if they must be generated from data in a back-office application (ERP, SCM, CRM).
One technique for constructing this logic includes these steps:
1. Determine the rule for when to access the data cache and when to access the back-office application.
2. Compare data from the real-time source with the rule.
3. Define each path that could result from the outcome.
You might need to consider the case where the rule indicates back-office application access, but the system is not currently available.
4. Merge the results from each path into a single data set.
5. Route the single result to the real-time target.
You might need to consider error-checking and exception-handling to make sure that a value passes to the target. If the target receives an empty set, the real-time job returns an empty response (begin and end XML tags only) to the Access Server.
This example describes a section of a real-time job that processes a new sales order. The section is responsible for checking the inventory available of the ordered products—it answers the question, "is there enough inventory on hand to fill this order?"

The rule controlling access to the back-office application indicates that the inventory (Inv) must be more than a pre-determined value (IMargin) greater than the ordered quantity (Qty) to consider the data cached inventory value acceptable. Data Integrator makes a comparison for each line item in the order.

To branch a data flow based on a rule
1. Create a real-time job and drop a data flow inside it.
2. Add the XML source in the data flow.
The XML source contains the entire sales order, yet the data flow compares values for line items inside the sales order. See "To import a DTD or XML Schema format" on page 233 to define the format of the data in the XML message.
3. Determine the values you want to return from the data flow.
Because this data flow needs to be able to determine inventory values for multiple line items, the structure of the output requires the inventory information to be nested. The input is already nested under the sales order; the output can use the same convention. In addition, the output needs to include some way to indicate whether the inventory is or is not available.

4. Connect the output of the XML source to the input of a query and map the appropriate columns to the output.
You can drag all of the columns and nested tables from the input to the output, then delete any unneeded columns or nested tables from the output.
5. Add the comparison table from the data cache to the data flow as a source.
6. Construct the query so you extract the expected data from the inventory data cache table.
Without nested data, you would add a join expression in the WHERE clause of the query. Because the comparison occurs between a nested table and another top-level table, you have to define the join more carefully:
• Change context to the LineItem table
• Include the Inventory table in the FROM clause in this context (the LineItem table is already in the From list)
• Define an outer join with the Inventory table as the inner table
• Add the join expression in the WHERE clause in this context
In this example, you can assume that there will always be exactly one value in the Inventory table for each line item and can therefore leave out the outer join definition.
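At the LineItem context, the join might then look like the following sketch. The key column names (ITEM_NUM) are hypothetical; use whatever columns relate your line items to the inventory data cache table:

    FROM (at the LineItem context):  LineItem, Inventory
    WHERE (at the LineItem context): LineItem.ITEM_NUM = Inventory.ITEM_NUM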

After changing contexts, the nested table is active while any other tables in the output schema are grayed out.
(Query editor callouts: From tab list includes the Inventory table; Where tab expression applies only in this schema)
7. Include the values from the Inventory table that you need to make the comparison.
Drag the Inv and IMargin columns from the input to the LineItem table.
8. Split the output of the query based on the inventory comparison. Add two queries to the data flow:
• Query to process valid inventory values from the data cache: The WHERE clause at the nested level (LineItem) of the query ensures that the quantities specified in the incoming line item rows are appropriately accounted for by inventory values from the data cache.
• Query to retrieve inventory values from the ERP: The WHERE clause at the nested level (LineItem) of the query ensures that the quantities specified in the incoming line item rows are not accounted for by inventory values from the data cache. The inventory values in the ERP inventory table are then substituted for the data cache inventory values in the output.
There are several ways to return values from the ERP. For example, you could use a lookup function or a join on the specific table in the ERP system. This example uses a join so that the processing can be performed by the ERP system rather than Data Integrator.

As in the previous join, if you cannot guarantee that a value will be returned by the join, make sure to define an outer join so that the line item row is not lost.
At the nested-level (LineItem) context, the two queries use clauses like these:
CheckERP:
WHERE Compare2cache.LineItems.Qty >= (Compare2cache.LineItems.INV + Compare2cache.LineItems.IMARGIN)
FROM Compare2cache.LineItems, ERP_Inventory
CacheOK:
WHERE Compare2cache.LineItems.Qty < (Compare2cache.LineItems.INV + Compare2cache.LineItems.IMARGIN)
FROM Compare2cache.LineItems
The goal from this section of the data flow was an answer to the question, "is there enough inventory to fill this order?" To complete the order processing, each branch returns an inventory value that can then be compared to the order quantity. The "CacheOK" branch of this example always returns line-item rows that include enough inventory to account for the order quantity, so you can remove the inventory value from the output of these rows.
9. Show inventory levels only if they are less than the order quantity.

The "CheckERP" branch can return line item rows without enough inventory to account for the order quantity; the available inventory value can be useful if customers want to change their order quantity to match the inventory available.
Change the mapping of the Inv column in each of the branches to show available inventory values only if they are less than the order quantity:
• For data cache OK: Inv maps from 'NULL'
• For CheckERP: Inv maps from ERP_Inventory.INV - ERP_Inventory.IMARGIN
10. Merge the branches into one response.
Both branches of the data flow include the same columns and nested tables. The Merge transform combines the results of the two branches into a single data set.
11. Complete the processing of the message.
Add the XML target to the output of the Merge transform.

Calling application functions
A real-time job can use application functions to operate on data. You can include tables as input or output parameters to the function.
Application functions require input values for some parameters and some can be left unspecified. You must determine the requirements of the function to prepare the appropriate inputs.
To make up the input, you can specify the top-level table, top-level columns, and any tables nested one-level down relative to the tables listed in the FROM clause of the context calling the function. If the application function includes a structure as an input parameter, you must specify the individual columns that make up the structure.
A data flow may contain several steps that call a function, retrieve results, then shape the results into the columns and tables required for a response.

Designing real-time applications
Data Integrator provides a reliable and low-impact connection between a Web application and back-office applications such as an enterprise resource planning (ERP) system. Because each implementation of an ERP system is different and because Data Integrator includes versatile decision support logic, you have many opportunities to design a system that meets your internal and external information and resource needs.
This section discusses:
• Reducing queries requiring back-office application access
• Messages from real-time jobs to adapter instances
• Real-time service invoked by an adapter instance

Reducing queries requiring back-office application access
This section provides a collection of recommendations and considerations that can help reduce the time you spend experimenting in your development cycles.
The information you allow your customers to access through your Web application can impact the performance that your customers see on the Web. You can maximize performance through your Web application design decisions. In particular, you can structure your application to reduce the number of queries that require direct back-office (ERP, SCM, Legacy) application access.
For example, if your ERP system supports a complicated pricing structure that includes dependencies such as customer priority, product availability, or order quantity, you might not be able to depend on values from a data cache for pricing information. The alternative might be to request pricing information directly from the ERP system. ERP system access is likely to be much slower than direct database access, reducing the performance your customer experiences with your Web application.
To reduce the impact of queries requiring direct ERP system access, modify your Web application. Using the pricing example, design the application to avoid displaying price information along with standard product information and instead show pricing only after the customer has chosen a specific product and quantity. These techniques are evident in the way airline reservations systems provide pricing information—a quote for a specific flight—contrasted with other retail Web sites that show pricing for every item displayed as part of product catalogs.

Messages from real-time jobs to adapter instances
If a real-time job will send a message to an adapter instance, refer to the adapter documentation to decide if you need to create a message function call or an outbound message.
• Message function calls allow the adapter instance to collect requests and send replies.
• Outbound message objects can only send outbound messages. They cannot be used to receive messages.
Using these objects in real-time jobs is the same as in batch jobs. See "To modify output schema contents" on page 190. For information on importing message function calls and outbound messages, see "Importing metadata through an adapter datastore" on page 114.

Real-time service invoked by an adapter instance
This section uses terms consistent with Java programming. (Please see your adapter SDK documentation for more information about terms such as operation instance and information resource.)
When an operation instance (in an adapter) gets a message from an information resource, it translates it to XML (if necessary), then sends the XML message to a real-time service. The real-time service processes the message from the information resource (relayed by the adapter) and returns a response.
In the example data flow below, the Query processes a message (here represented by "Employment") received from a source (an adapter instance), and returns the response to a target (again, an adapter instance). In the real-time service, the message from the adapter is represented by a DTD or XML Schema object (stored in the Formats tab of the object library). The DTD or XML Schema represents the data schema for the information resource.

Data Integrator Designer Guide Embedded Data Flows chapter .This document is part of a SAP study on PDF usage. Find out how you can participate and help to improve our documentation.

About this chapter
Data Integrator provides an easy-to-use option to create embedded data flows. This chapter covers the following topics:
• Overview
• Example of when to use embedded data flows
• Creating embedded data flows
• Using embedded data flows
• Testing embedded data flows
• Troubleshooting embedded data flows

Overview
An embedded data flow is a data flow that is called from inside another data flow. Data passes into or out of the embedded data flow from the parent flow through a single source or target. The embedded data flow can contain any number of sources or targets, but only one input or one output can pass data to or from the parent data flow.
Use embedded data flows to:
• Simplify data flow display. Group sections of a data flow in embedded data flows to allow clearer layout and documentation.
• Reuse data flow logic. Save logical sections of a data flow so you can use the exact logic in other data flows, or provide an easy way to replicate the logic and modify it for other flows.
• Debug data flow logic. Replicate sections of a data flow as embedded data flows so you can execute them independently.
An embedded data flow is a design aid that has no effect on job execution. When Data Integrator executes the parent data flow, it expands any embedded data flows, optimizes the parent data flow, then executes it.
You can create the following types of embedded data flows:
• One input: use when you want to add an embedded data flow at the end of a data flow
• One output: use when you want to add an embedded data flow at the beginning of a data flow
• No input or output: use when you want to replicate an existing data flow

Example of when to use embedded data flows
In this example, a data flow uses a single source to load three different target systems. The Case transform sends each row from the source to different transforms that process it to get a unique target output.
You can simplify the parent data flow by using embedded data flows for the three different cases.
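For reference, the Case transform routes rows using a label and expression pair per output. The labels and column names below are hypothetical, but the expressions follow the form the transform expects:

    Label: Case_Target1    Expression: SOURCE.REGION_ID = 1
    Label: Case_Target2    Expression: SOURCE.REGION_ID = 2
    Label: Case_Target3    Expression: SOURCE.REGION_ID = 3

Each row is sent down the connection whose expression evaluates to true; each of those connections can then call one of the three embedded data flows.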

Creating embedded data flows
There are two ways to create embedded data flows:
• Select objects within a data flow, right-click, and select Make Embedded Data Flow.
• Drag a complete and fully validated data flow from the object library into an open data flow in the workspace. Then:
• Open the data flow you just added.
• Right-click one object you want to use as an input or as an output port and select Make Port for that object.
Data Integrator marks the object you select as the connection point for this embedded data flow.
Note: You can specify only one port, which means that the embedded data flow can appear only at the beginning or at the end of the parent data flow.

Using the Make Embedded Data Flow option

To create an embedded data flow
1. Select objects from an open data flow using one of the following methods:
• Click the white space and drag the rectangle around the objects
• CTRL-click each object
Ensure that the set of objects you select are:
• All connected to each other
• Connected to other objects according to the type of embedded data flow you want to create: one input, one output, or no input or output
2. Right-click and select Make Embedded Data Flow.
In the example shown in step 2, the embedded data flow is connected to the parent by one input object.

The Create Embedded Data Flow window opens.
3. Name the embedded data flow using the convention EDF_EDFName, for example EDF_ERP.
If Replace objects in original data flow is selected, the original data flow becomes a parent data flow, which has a call to the new embedded data flow.
If you deselect the Replace objects in original data flow box, Data Integrator will not make a change in the original data flow. You can use an embedded data flow created without replacement as a stand-alone data flow for troubleshooting.
4. Click OK.
The embedded data flow is represented in the new parent data flow, and Data Integrator saves the new embedded data flow object to the repository and displays it in the object library under the Data Flows tab.

5. Click the name of the embedded data flow to open it.
6. Notice that Data Integrator created a new object, EDF_ERP_Input, which is the input port that connects this embedded data flow to the parent data flow.
When you use the Make Embedded Data Flow option, Data Integrator automatically creates an input or output object based on the object that is connected to the embedded data flow when it is created. For example, if an embedded data flow has an output connection, the embedded data flow will include a target XML file object labeled EDFName_Output.
The naming conventions for each embedded data flow type are:
• One input: EDFName_Input
• One output: EDFName_Output
• No input or output: Data Integrator creates an embedded data flow without an input or output object

Creating embedded data flows from existing flows
To call an existing data flow from inside another data flow, put the data flow inside the parent data flow, then mark which source or target to use to pass data between the parent and the embedded data flows.

To create an embedded data flow out of an existing data flow
1. Drag an existing valid data flow from the object library into a data flow that is open in the workspace.
2. Consider renaming the flow using the EDF_EDFName naming convention.
The embedded data flow appears without any arrowheads (ports) in the workspace.
3. Open the embedded data flow.
4. Right-click a source or target object (file or table) and select Make Port.
Note: Ensure that you specify only one input or output port.
Different types of embedded data flow ports are indicated by directional markings on the embedded data flow icon.
(Icon markers: Input port; No port; Output port)

Using embedded data flows
When you create and configure an embedded data flow using the Make Embedded Data Flow option, Data Integrator creates a new input or output XML file and saves the schema in the repository as an XML Schema. You can reuse an embedded data flow by dragging it from the Data Flow tab of the object library into other data flows. To save mapping time, you might want to use the Update Schema option or the Match Schema option.
The following example scenario uses both options:
• Create data flow 1.
• Select objects in data flow 1, and create embedded data flow 1 so that parent data flow 1 calls embedded data flow 1.
• Create data flow 2 and data flow 3 and add embedded data flow 1 to both of them.

• Go back to data flow 1. Change the schema of the object preceding embedded data flow 1 and use the Update Schema option with embedded data flow 1. It updates the schema of embedded data flow 1 in the repository.
• Now the schemas in data flow 2 and data flow 3 that are feeding into embedded data flow 1 will be different from the schema the embedded data flow expects.
• Use the Match Schema option for embedded data flow 1 in both data flow 2 and data flow 3 to resolve the mismatches at runtime. The Match Schema option only affects settings in the current data flow.
The following sections describe the use of the Update Schema and Match Schema options in more detail.

Updating Schemas
Data Integrator provides an option to update an input schema of an embedded data flow. This option updates the schema of an embedded data flow's input object with the schema of the preceding object in the parent data flow. All occurrences of the embedded data flow update when you use this option.

To update a schema
1. Open the embedded data flow's parent data flow.
2. Right-click the embedded data flow object and select Update Schema.
For example, in the data flow shown below, Data Integrator copies the schema of Case to the input of EDF_ERP.

Matching data between parent and embedded data flow
The schema of an embedded data flow's input object can match the schema of the preceding object in the parent data flow by name or position. A match by position is the default.

To specify how schemas should be matched
1. Open the embedded data flow's parent data flow.
2. Right-click the embedded data flow object and select Match Schema > By Name or Match Schema > By Position.
The Match Schema option only affects settings for the current data flow.
Data Integrator also allows the schema of the preceding object in the parent data flow to have more or fewer columns than the embedded data flow. The embedded data flow ignores additional columns and reads missing columns as NULL.
Columns in both schemas must have identical or convertible data types. See the section on "Type conversion" in the Data Integrator Reference Guide for more information.

Deleting embedded data flow objects
You can delete embedded data flow ports, or remove entire embedded data flows.

To remove a port
Right-click the input or output object within the embedded data flow and deselect Make Port. Data Integrator removes the connection to the parent object.
Note: You cannot remove a port simply by deleting the connection in the parent flow.

To remove an embedded data flow
Select it from the open parent data flow and choose Delete from the right-click menu or edit menu.
If you delete embedded data flows from the object library, the embedded data flow icon appears with a red circle-slash flag in the parent data flow. Delete these defunct embedded data flow objects from the parent data flows.

Testing embedded data flows
You might find it easier to test embedded data flows by running them separately as regular data flows.
When you use the Make Embedded Data Flow option, an input or output XML file object is created and then (optionally) connected to the preceding or succeeding object in the parent data flow. To test the XML file without a parent data flow, click the name of the XML file to open its source or target editor to specify a file name. For more configuration information see the Data Integrator Reference Guide.

To separately test an embedded data flow
1. Specify an XML file for the input port or output port.
2. Put the embedded data flow into a job.
3. Run the job.
You can also use the following features to test embedded data flows:
• View Data to sample data passed into an embedded data flow.
• Auditing statistics about the data read from sources, transformed, and loaded into targets, and rules about the audit statistics to verify the expected data is processed.
For more information on both of these features, see Chapter 15: Design and Debug.

Troubleshooting embedded data flows
The following situations produce errors:
• Both an input port and an output port are specified in an embedded data flow.
• Trapped defunct data flows. See "To remove an embedded data flow" on page 292.
• Deleted connection to the parent data flow while the Make Port option, in the embedded data flow, remains selected. See "To remove a port" on page 292.
• Transforms with splitters (such as the Case transform) specified as the output port object, because a splitter produces multiple outputs and embedded data flows can only have one.
• Variables and parameters declared in the embedded data flow that are not also declared in the parent data flow.
• Embedding the same data flow at any level within itself. You can however have unlimited embedding levels. For example, DF1 data flow calls EDF1 embedded data flow which calls EDF2.


Variables and Parameters

About this chapter
This chapter covers creating local and global variables for Data Integrator jobs. It also introduces the use of environment variables. This chapter contains the following topics:
• Overview
• The Variables and Parameters window
• Using local variables and parameters
• Using global variables
• Local and global variable rules
• Environment variables
• Setting file names at run-time using variables

Overview
You can increase the flexibility and reusability of work flows and data flows using local and global variables when you design your jobs. Variables are symbolic placeholders for values. The data type of a variable can be any supported by Data Integrator such as an integer, decimal, date, or text string.
You can use variables in expressions to facilitate decision-making or data manipulation (using arithmetic or character substitution). For example, a variable can be used in a LOOP or IF statement to check a variable's value to decide which step to perform:
If $amount_owed > 0 print('$invoice.doc');
If you define variables in a job or work flow, Data Integrator typically uses them in a script, catch, or conditional process.
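Extending the example above, a script might first adjust the variable with arithmetic before the check; this is a small sketch that mirrors the statement shown above and uses a hypothetical variable name for the payment amount:

    $amount_owed = $amount_owed - $payment_received;
    If $amount_owed > 0 print('$invoice.doc');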

(Figure: a work flow with two local variables defined, $AA int and $BB int. A script contains: If $AA < 0 $AA = 0; $AA = $AA + $BB. A catch contains: If $BB < 0 $BB = 0; $BB = $AA + $BB. A conditional uses the If expression: $AA >= $BB.)
In Data Integrator, local variables are restricted to the object in which they are created (job or work flow). Global variables are restricted to the job in which they are created; however, they do not require parameters to be passed to work flows and data flows. You must use parameters to pass local variables to child objects (work flows and data flows).
Parameters are expressions that pass to a work flow or data flow when they are called in a job.
You create local variables, parameters, and global variables using the Variables and Parameters window in the Designer. You can set values for local or global variables in script objects. You can also set global variable values using external job, execution, or schedule properties.
You can use variables inside data flows. For example, use them in a custom function or in the WHERE clause of a query transform.
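For instance, a query transform's WHERE clause might restrict rows with a variable. Here $G_Region is a hypothetical global variable (a local value would instead be passed in as a parameter such as $p_Region), and the table and column names are also hypothetical:

    ORDERS.REGION_ID = $G_Region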

Using global variables provides you with maximum flexibility. For example, during production you can change values for default global variables at runtime from a job's schedule or SOAP call without having to open a job in the Designer. For more information about setting global variable values in SOAP calls, see the Data Integrator Management Console: Administrator Guide.
Local variable parameters can only be set at the work flow and data flow level. Global variables can only be set at the job level.
Variables can be used as file names for:
• Flat file sources and targets
• XML file sources and targets
• XML message targets (executed in the Designer in test mode)
• IDoc file sources and targets (in an SAP R/3 environment)
• IDoc message sources and targets (SAP R/3 environment)
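As a sketch of the file-name usage, a script could build a date-stamped name in a variable that a flat file target then references. The variable name and format string are hypothetical, and the exact to_char format codes are described in the Data Integrator Reference Guide:

    $G_FileName = 'orders_' || to_char(sysdate(), 'yyyy.mm.dd') || '.dat';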

The Variables and Parameters window
Data Integrator displays the variables and parameters defined for an object in the Variables and Parameters window.

To view the variables and parameters in each job, work flow, or data flow
1. In the Tools menu, select Variables.
The Variables and Parameters window opens.
2. From the object library, double-click an object, or from the project area click an object to open it in the workspace.
The Context box in the window changes to show the object you are viewing. If there is no object selected, the window does not indicate a context.
The Variables and Parameters window contains two tabs.
The Definitions tab allows you to create and view variables (name and data type) and parameters (name, data type, and parameter type) for an object type. The following table lists what type of variables and parameters you can create using the Variables and Parameters window when you select different objects:
• Job: local variables, used by a script or conditional in the job; global variables, used by any object in the job.
• Work flow: local variables, used by this work flow or passed down to other work flows or data flows using a parameter; parameters, used by parent objects to pass local variables. Work flows may also return variables or parameters to parent objects.
• Data flow: parameters, used by a WHERE clause, column mapping, or a function in the data flow. Data flows cannot return output values.
The Calls tab allows you to view the name of each parameter defined for all objects in a parent object's definition. You can also enter values for each parameter by right-clicking a parameter and clicking Properties. For the input parameter type, values in the Calls tab can be constants, variables, or another parameter. For the output or input/output parameter type, values in the Calls tab can be variables or parameters.
Values in the Calls tab must also use:
• The same data type as the variable if they are placed inside an input or input/output parameter type, and a compatible data type if they are placed inside an output parameter type
• Data Integrator scripting language rules and syntax
The following illustration shows the relationship between an open work flow called DeltaFacts, the Context box in the Variables and Parameters window, and the content in the Definition and Calls tabs.

(Figure callouts: The definition of work flow WF_DeltaFacts is open in the workspace. Parameters defined in WF_DeltaFacts: the parent work flow (not shown) passes values to or receives values from WF_DeltaFacts. Parameters defined in WF_DeltaWrapB, which is called by WF_DeltaFacts: WF_DeltaFacts can pass values to or receive values from WF_DeltaWrapB using these parameters.)

Using local variables and parameters
To pass a local variable to another object, define the local variable, then from the calling object, create a parameter and map the parameter to the local variable by entering a parameter value.
For example, to use a local variable inside a data flow, define the variable in a parent work flow and then pass the value of the variable as a parameter of the data flow.

Parameters
Parameters can be defined to:
• Pass their values into and out of work flows
• Pass their values into data flows
Each parameter is assigned a type: input, output, or input/output. The value passed by the parameter can be used by any object called by the work flow or data flow.
Note: You can also create local variables and parameters for use in custom functions. For more information see the Data Integrator Reference Guide.

Passing values into data flows
You can use a value passed as a parameter into a data flow to control the data transformed in the data flow. For example, the data flow DF_PartFlow processes daily inventory values. It can process all of the part numbers in use or a range of part numbers based on external requirements such as the range of numbers processed most recently.
If the work flow that calls DF_PartFlow records the range of numbers processed, it can pass the end value of the range, $EndRange, as a parameter to the data flow to indicate the start value of the range to process next. Data Integrator can calculate a new end value based on a stored number of parts to process each time, such as $SizeOfSet, and pass that value to the data flow as the end value. A query transform in the data flow uses the parameters passed in to filter the part numbers extracted from the source.
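A hedged sketch of how this might look: the calling work flow's script advances the range, and the data flow's query filters on parameters mapped to those variables. $StartRange, $p_StartRange, $p_EndRange, and the PART table are hypothetical names used only for illustration:

    Script in the calling work flow:
        $StartRange = $EndRange;
        $EndRange = $StartRange + $SizeOfSet;

    WHERE clause in the query inside DF_PartFlow:
        PART.PART_NUM >= $p_StartRange AND PART.PART_NUM < $p_EndRange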

The data flow could be used by multiple calls contained in one or more work flows to perform the same task on different part number ranges by specifying different parameters for the particular calls.

Defining local variables
Variables are defined in the Variables and Parameters window.

To define a local variable
1. Click the name of the job or work flow in the project area or workspace, or double-click one from the object library.
2. Choose Tools > Variables to open the Variables and Parameters window.
3. Go to the Definition tab.
4. Select Variables.
5. Right-click and choose Insert.
6. Select the new variable (for example, $NewVariable0).
7. Right-click and choose Properties.
8. Enter the name of the new variable.
The name can include any alpha or numeric character or underscores (_), but cannot contain blank spaces. Always begin the name with a dollar sign ($).
9. Select the data type for the variable.
10. Click OK.

Defining parameters
There are two steps for setting up a parameter for a work flow or data flow:
• Add the parameter definition to the flow.
• Set the value of the parameter in the flow call.

To add the parameter to the flow definition
1. Click the name of the work flow or data flow.
2. Open the Variables and Parameters window.
3. Go to the Definition tab.
4. Select Parameters.
5. Right-click and choose Insert.
6. Select the new parameter (for example, $NewArgument1).
7. Right-click and choose Properties.
8. Enter the name of the parameter using alphanumeric characters with no blank spaces.
9. Select the parameter type (input, output, or input/output).
10. Select the data type for the parameter.
The parameter must have the same data type as the variable if it is an input or input/output parameter; it must have a compatible data type if it is an output parameter type.
11. Click OK.

To set the value of the parameter in the flow call
1. Open the calling job, work flow, or data flow.
2. In the Variables and Parameters window, select the Calls tab.
The Calls tab shows all the objects that are called from the open job, work flow, or data flow.
3. Click the plus sign (+) next to the object that contains the parameter you want to set.
A list of parameters passed to that object appears.
4. Select the parameter, right-click, and choose Properties.
5. Enter the expression the parameter will pass in the Value box.

If the parameter type is input, then its value can be an expression that contains a constant (for example, 0 or 'string1'), a variable, or another parameter (for example, $startID or $parm1). If the parameter type is output or input/output, then the value is a variable or parameter.
Use the following syntax to indicate special values:
• Variable: $variable_name
• String: 'string'
6. Click OK.
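For example, typical entries in the Value box might look like the following; the names are hypothetical:

    $StartRange                  (a variable)
    3000                         (a constant)
    'NA_REGION'                  (a string)
    $StartRange + $SizeOfSet     (an expression combining variables)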

6. Right-click $NewJobGlobalVariable and select Properties from the shortcut menu.
The Global Variable Properties window opens.
7. Rename the variable and select a data type.
8. Click OK.
The Variables and Parameters window displays the renamed global variable.

Viewing global variables

Global variables, defined in a job, are visible to those objects relative to that job. A global variable defined in one job is not available for modification or viewing from another job. You can view global variables from the Variables and Parameters window (with an open job in the workspace) or from the Properties dialog of a selected job.

To view global variables in a job from the Properties dialog
1. In the object library, select the Jobs tab.
2. Right-click the job and select Properties.
3. Click the Global Variables tab.
Global variables appear on this tab.

Setting global variable values

In addition to setting a variable inside a job using an initialization script, you can set and maintain global variable values outside a job. Values for global variables can be set outside a job:
• As a job property
• As an execution or schedule property

Global variables without defined values are also allowed. They are read as NULL.

Values set outside a job are processed the same way as those set in an initialization script. However, if you set a value for the same variable both inside and outside a job, the internal value will override the external job value.

Note: You cannot pass global variables as command-line arguments for real-time jobs.

All values defined as job properties are shown in the Properties and the Execution Properties dialogs of the Designer and in the Execution Options and Schedule pages of the Administrator. By setting values outside a job, you can rely on these dialogs for viewing values set for global variables and easily edit values when testing or scheduling a job.
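For reference, setting values inside the job with an initialization script would look like the following minimal sketch (a script placed at the start of the job, before any work flows run; the variable names are illustrative):

    $G_YEAR = 2003;
    $G_MONTH = 'JANUARY';

The procedures below show the equivalent external approaches: defining the values as job properties or as execution properties.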

To set a global variable value as a job property
1. Right-click a job in the object library or project area.
2. Click Properties.
3. Click the Global Variable tab.
All global variables created in the job appear.
4. Enter values for the global variables in this job.
You can use any statement used in a script with this option. See the Data Integrator Reference Guide for syntax information and example scripts.
5. Click OK.
Data Integrator saves values in the repository as job properties. You can also view and edit these default values in the Execution Properties dialog of the Designer and in the Execution Options and Schedule pages of the Administrator. This allows you to override job property values at run-time.

To set a global variable value as an execution property
1. Execute a job from the Designer, or execute or schedule a batch job from the Administrator.
Note: For testing purposes, you can execute real-time jobs from the Designer in test mode. Make sure to set the execution properties for a real-time job.

2. View the global variables in the job and their default values (if available).


If no global variables exist in a job, the Global Variable sections in these windows do not appear.
3. Edit values for global variables as desired.
4. If you are using the Designer, click OK. If you are using the Administrator, click Execute or Schedule.

The job runs using the values you enter. Values entered as execution properties are not saved. Values entered as schedule properties are saved but can only be accessed from within the Administrator.

Automatic ranking of global variable values in a job

Using the methods described in the previous section, if you enter different values for a single global variable, Data Integrator selects the highest ranking value for use in the job. A value entered as a job property has the lowest rank. A value defined inside a job has the highest rank.

• If you set a global variable value as both a job and an execution property, the execution property value overrides the job property value and becomes the default value for the current job run. You cannot save execution property global variable values.

For example, assume that a job, JOB_Test1, has three global variables declared: $YEAR, $MONTH, and $DAY. Variable $YEAR is set as a job property with a value of 2003.

For the first job run, you set variables $MONTH and $DAY as execution properties to values 'JANUARY' and 31 respectively. Data Integrator executes a list of statements which includes default values for JOB_Test1:
$YEAR=2003;
$MONTH='JANUARY';
$DAY=31;

For the second job run, if you set variables $YEAR and $MONTH as execution properties to values 2002 and 'JANUARY' respectively, then the statement $YEAR=2002 will replace $YEAR=2003. Data Integrator executes the following list of statements:
$YEAR=2002;
$MONTH='JANUARY';

Note: In this scenario, $DAY is not defined and Data Integrator reads it as NULL. You set $DAY to 31 during the first job run; however, execution properties for global variable values are not saved.

• If you set a global variable value for both a job property and a schedule property, the schedule property value overrides the job property value and becomes the external, default value for the current job run. Data Integrator saves schedule property values in the repository. However, these values are only associated with a job schedule, not the job itself. Consequently, these values are viewed and edited from within the Administrator.

• A global variable value defined inside a job always overrides any external values. However, the override does not occur until Data Integrator attempts to apply the external values to the job being processed with the internal value. Up until that point, Data Integrator processes execution, schedule, or job property values as default values.

For example, suppose you have a job called JOB_Test2 that has three work flows, each containing a data flow. The first and third data flows have the same global variable with no value defined. The second data flow is inside a work flow that is preceded by a script in which $MONTH is defined as 'MAY'. The execution property $MONTH = 'APRIL' is set as the global variable value.

In this scenario, 'APRIL' becomes the default value for the job. 'APRIL' remains the value for the global variable until it encounters the other value for the same variable in the second work flow. Since the value in the script is inside the job, 'MAY' overrides 'APRIL' for the variable $MONTH. Data Integrator continues processing the job with this new value.
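A sketch of the script that precedes the second data flow in this scenario, in Data Integrator script syntax (the # marker introduces a comment):

    # Runs inside the job, so it overrides the external value 'APRIL'
    $MONTH = 'MAY';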

Advantages to setting values outside a job

While you can set values inside jobs, there are advantages to defining values for global variables outside a job. For example, values defined as job properties are shown in the Properties and the Execution Properties dialogs of the Designer and in the Execution Options and Schedule pages of the Administrator. By setting values outside a job, you can rely on these dialogs for viewing all global variables and their values. You can also easily edit them for testing and scheduling.

In the Administrator, you can set global variable values when creating or editing a schedule without opening the Designer. For example, use global variables as file names and start and end dates.

Local and global variable rules

When defining local or global variables, consider rules for:
• Naming
• Replicating jobs and work flows
• Importing and exporting

For information about how Data Integrator processes variables in work flows with multiple conditions like execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.

Naming
• Local and global variables must have unique names within their job context.
• Any name modification to a global variable can only be performed at the job level.

Replicating jobs and work flows
• When you replicate all objects, the local and global variables defined in that job context are also replicated.
• When you replicate a data flow or work flow, all parameters and local and global variables are also replicated. However, you must validate these local and global variables within the job context in which they were created. If you attempt to validate a data flow or work flow containing global variables without a job, Data Integrator reports an error.

Importing and exporting
• When you export a job object, you also export all local and global variables defined for that job.
• When you export a lower-level object (such as a data flow) without the parent job, the global variable is not exported. Only the call to that global variable is exported. If you use this object in another job without defining the global variable in the new job, a validation error will occur.

Environment variables

You can use system-environment variables inside Data Integrator jobs, work flows, or data flows. The get_env, set_env, and is_set_env functions provide access to underlying operating system variables that behave as the operating system allows.

You can temporarily set the value of an environment variable inside a job, work flow, or data flow. Once set, the value is visible to all objects in that job. Use the get_env, set_env, and is_set_env functions to set, retrieve, and test the values of environment variables. For more information about these functions, see the Data Integrator Reference Guide.
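A minimal script sketch of these functions, assuming an operating-system variable named BATCH_DATE and a global variable $G_BATCH_DATE (both names are illustrative; see the Data Integrator Reference Guide for the exact function signatures and return values):

    # Temporarily set the environment variable for this job
    set_env('BATCH_DATE', '2003.01.31');

    # Test whether the variable is set
    print(is_set_env('BATCH_DATE'));

    # Read the value back into a global variable
    $G_BATCH_DATE = get_env('BATCH_DATE');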

Setting file names at run-time using variables

You can set file names at run-time by specifying a variable as the file name. Variables can be used as file names for:
• The following sources and targets:
  • Flat files
  • XML files and messages
  • IDoc files and messages (in an SAP R/3 environment)
• The lookup_ext function (for a flat file used as a translate table parameter)

To use a variable in a flat file name
1. Create a local or global variable using the Variables and Parameters window.
2. Create a script to set the value of a local or global variable, or call a system environment variable.
3. Declare the variable in the file format editor or in the Function editor as a lookup_ext parameter.
• When you set a variable value for a flat file, specify both the file name and the directory name. Enter the variable in the File(s) property under Data File(s) in the File Format Editor. You cannot enter a variable in the Root directory property.
• For lookups, substitute the path and file name in the Translate table box in the lookup_ext function editor with the variable name.

The following example shows how you can set values for variables in flat file sources and targets in a script:

$FILEINPUT = 'd:/version5.0/Vfilenames/goldlog/KNA1comma.*, d:/version5.0/Vfilenames/goldlog/KNA1c?mma.in';
$FILEOUTPUT = 'd:/version5.0/Vfilenames/work/VF0015.out';

When you use variables as sources and targets, you can also use multiple file names and wild cards. Notice that the $FILEINPUT variable includes two file names (separated by a comma). The two names (KNA1comma.* and KNA1c?mma.in) also make use of the wild cards (* and ?) supported by Data Integrator. Neither multiple file names nor wild cards are supported when using variables in the lookup_ext function.

See the Data Integrator Reference Guide for more information about creating scripts.


Chapter 13: Executing Jobs

About this chapter

This chapter contains the following topics:
• Overview of Data Integrator job execution
• Preparing for job execution
• Executing jobs as immediate tasks
• Debugging execution errors
• Changing Job Server options

Overview of Data Integrator job execution

You can run Data Integrator jobs in three different ways. Depending on your needs, you can configure:

• Immediate jobs
Data Integrator initiates both batch and real-time jobs and runs them immediately from within the Data Integrator Designer. For these jobs, both the Designer and designated Job Server (where the job executes, usually many times on the same machine) must be running. You will most likely run immediate jobs only during the development cycle.

• Scheduled jobs
Batch jobs are scheduled. To schedule a job, use the Data Integrator Administrator or use a third-party scheduler. When jobs are scheduled by third-party software:
• The job initiates outside of Data Integrator.
• The job operates on a batch job (or shell script for UNIX) that has been exported from Data Integrator.
When a job is invoked by a third-party scheduler:
• The corresponding Job Server must be running.
• The Data Integrator Designer does not need to be running.

• Services
Real-time jobs are set up as services that continuously listen for requests from an Access Server and process requests on-demand as they are received. Use the Data Integrator Administrator to create a service from a real-time job.

Preparing for job execution

Follow these preparation procedures before you execute, schedule, or export a job to be executed as a scheduled task:
• Validating jobs and job components
• Ensuring that the Job Server is running
• Setting job execution options

Validating jobs and job components

You can set the Designer options (Tools > Options > Designer > General) to validate jobs started in Designer before job execution. The default is not to validate. Data Integrator also validates jobs before exporting them.

You can also explicitly validate jobs and their components as you create them by:
• Clicking the Validate All button from the toolbar (or choosing Validate > All Objects in View from the Debug menu). This command checks the syntax of the object definition for the active workspace and for all objects that are called from the active workspace view recursively.
• Clicking the Validate Current View button from the toolbar (or choosing Validate > Current View from the Debug menu). This command checks the syntax of the object definition for the active workspace.

If during validation Data Integrator discovers an error in an object definition, it opens a dialog box indicating that an error exists, then opens the Output window to display the error.

If there are errors, double-click the error in the Output window to open the editor of the object containing the error. If you are unable to read the complete error text in the window, you can access additional information by right-clicking the error listing and selecting View from the context menu.

Error messages have these levels of severity:

Severity      Description
Information   Informative message only—does not prevent the job from running. No action is required.
Warning       The error is not severe enough to stop job execution, but you might get unexpected results. For example, if the data type of a source column in a transform within a data flow does not match the data type of the target column in the transform, Data Integrator alerts you with a warning message.
Error         The error is severe enough to stop job execution. You must fix the error before the job will execute.

Ensuring that the Job Server is running

Before you execute a job (either as an immediate or scheduled task), ensure that the Job Server is associated with the repository where the client is running.

When the Designer starts, it displays the status of the Job Server for the repository to which you are connected. The status icon shows whether the Job Server is running or inactive, and the name of the active Job Server and port number display in the status bar when the cursor is over the icon.

Setting job execution options

Options for jobs include Debug and Trace. Although these are object options—they affect the function of the object—they are located in either the Property or the Execution window associated with the job.

Execution options for jobs can either be set for a single instance or as a default value:
• The right-click Execute menu sets the options for a single execution only and overrides the default settings.
• The right-click Properties menu sets the default settings.

To set execution options for every execution of the job
1. From the project area, right-click the job name and choose Properties.
2. Select options on the Properties window:
• For an introduction to object properties, see “Viewing and changing object properties” on page 53.
• For information about Debug and Trace properties, see the Data Integrator Reference Guide.
• For more information about using the Global Variable tab, see “Setting global variable values” on page 306.

Executing jobs as immediate tasks

Immediate or “on demand” tasks are initiated from the Data Integrator Designer. Both the Designer and Job Server must be running for the job to execute.

To execute a job as an immediate task
1. In the project area, select the job name.
2. Right-click and choose Execute.
Data Integrator prompts you to save any objects that have changes that have not been saved.
3. The next step depends on whether you selected the Perform complete validation before job execution check box in the Designer Options (see “Designer — General” on page 67):
• If you have not selected this check box, a window opens showing execution properties (debug and trace) for the job. Proceed to the next step.
• If you have selected this check box, Data Integrator validates the job before it runs. You must correct any serious errors before the job will run. There might also be warning messages—for example, messages indicating that date values will be converted to datetime values.

Correct them if you want (they will not prevent job execution) or click OK to continue. After the job validates, a window opens showing the execution properties (debug and trace) for the job.
4. Set the execution properties.
You can choose the Job Server that you want to process this job, datastore profiles for sources and targets if applicable, enable automatic recovery, or override the default trace properties.
Note: Setting execution properties here affects a temporary change for the current execution only.
For more information, see:
• the Data Integrator Reference Guide
• “Setting global variable values” on page 306
5. Click OK.
As Data Integrator begins execution, the execution window opens with the trace log button active. Use the buttons at the top of the log window to display the trace log, monitor log, and error log (if there are any errors). For more information about execution logs, see “Debugging execution errors” on page 324.

After the job is complete, use an RDBMS query tool to check the contents of the target table or file. See “Examining target data” on page 329.

Monitor tab

The Monitor tab lists the trace logs of all current or most recent executions of a job. The traffic-light icons in the Monitor tab have the following meanings:
• A green light indicates that the job is running. You can right-click and select Kill Job to stop a job that is still running.
• A red light indicates that the job has stopped. You can right-click and select Properties to add a description for a specific trace log. This description is saved with the log, which can be accessed later from the Log tab.
• A red cross indicates that the job encountered an error.

Log tab

You can also select the Log tab to view a job’s trace log history.

Click on a trace log to open it in the workspace.

Use the trace, monitor, and error log icons (left to right at the top of the job execution window in the workspace) to view each type of available log for the date and time that the job was run.

Debugging execution errors

The following table lists tools that can help you understand execution errors:

Tool          Definition
Trace log     Itemizes the steps executed in the job and the time execution began and ended.
Monitor log   Displays each step of each data flow in the job, the number of rows streamed through each step, and the duration of each step.
Error log     Displays the name of the object being executed when a Data Integrator error occurred and the text of the resulting error message. If the job ran against SAP data, some of the ABAP errors are also available in the Data Integrator error log.
Target data   Always examine your target data to see if your job produced the results you expected.

The following sections describe how to use these tools:
• Using Data Integrator logs

• Examining trace logs
• Examining monitor logs
• Examining error logs
• Examining target data

Using Data Integrator logs

This section describes how to use Data Integrator logs in the Designer. For information about administering logs from the Administrator, see the Data Integrator Management Console: Administrator Guide.

To access a log during job execution
If your Designer is running when job execution begins, the execution window opens automatically, displaying the trace log information.
• To open the trace log on job execution, select Tools > Options > Designer > General > Open monitor on job execution.
• To copy log content from an open log, select one or multiple lines and use the key commands [Ctrl+C].
Use the monitor and error log icons (middle and right icons at the top of the execution window) to view these logs.

The execution window stays open until you explicitly close it.

To access a log after the execution window has been closed
1. In the project area, click the Log tab.
2. Click a job name to view all trace, monitor, and error log files in the workspace. Alternatively, expand the job you are interested in to view the list of trace log files and click one.

The log indicator icons signify the following:
• The job executed successfully on this explicitly selected Job Server. The Job Server listed executed the job.
• The job encountered an error on this explicitly selected Job Server. The Job Server listed executed the job.
• The job was executed successfully by a server group. The Job Server listed executed the job.
• The job encountered an error while being executed by a server group. The Job Server listed executed the job.

3. Click the log icon for the execution of the job you are interested in. (Identify the execution from the position in sequence or datetime stamp.)
4. Use the list box to switch between log types or to view No logs or All logs.

To delete a log
You can set how long to keep logs in the Data Integrator Administrator. For more information, see the Data Integrator Management Console: Administrator Guide.
If you want to delete logs from the Designer manually:
1. In the project area, click the Log tab.
2. Right-click the log you want to delete and select Delete Log.

Examining trace logs

Use the trace logs to determine where an execution failed, whether the execution steps occur in the order you expect, and which parts of the execution are the most time consuming. For information about examining trace logs from the Administrator, see the Data Integrator Management Console: Administrator Guide.

The following figure shows an example of a trace log.

Examining monitor logs

The monitor log quantifies the activities of the components of the job. It lists the time spent in a given component of a job and the number of data rows that streamed through the component. The following screen shows an example of a monitor log.

Examining error logs

Data Integrator produces an error log for every job execution. Use the error logs to determine how an execution failed. If the execution completed without error, the error log is blank. The following screen shows an example of an error log.

Examining target data

The best measure of the success of a job is the state of the target data. Always examine your data to make sure the data movement operation produced the results you expect. Be sure that:
• Data was not converted to incompatible types or truncated.
• Data was not duplicated in the target.
• Data was not lost between updates of the target.
• Generated keys have been properly incremented.
• Updated values were handled properly.

Changing Job Server options

There are many options available in Data Integrator for troubleshooting and tuning a job. After you familiarize yourself with the more technical aspects of how Data Integrator handles data (using the Data Integrator Reference Guide) and some of its interfaces like those for adapters and SAP R/3, you might want to return to the Designer and change values for the following Job Server options:

Table 13-1: Job Server Options

Option: Adapter Data Exchange Timeout
Description: (For adapters) Defines the time a function call or outbound message will wait for the response from the adapter operation.
Default value: 10800000 (3 hours)

Option: Adapter Start Timeout
Description: (For adapters) Defines the time that the Administrator or Designer will wait for a response from the Job Server that manages adapters (start/stop/status).
Default value: 90000 (90 seconds)

Option: AL_JobServerLoadBalanceDebug
Description: Enables a Job Server to log server group information if the value is set to TRUE. Information is saved in: $LINK_DIR/log/<JobServerName>/server_eventlog.txt
Default value: FALSE

Option: AL_JobServerLoadOSPolling
Description: Sets the polling interval (in seconds) that Data Integrator uses to get status information used to calculate the load balancing index. This index is used by server groups.
Default value: 60

Option: Display DI Internal Jobs
Description: Displays Data Integrator’s internal datastore CD_DS_d0cafae2 and its related jobs in the object library. The CD_DS_d0cafae2 datastore supports two internal jobs. The first calculates usage dependencies on repository tables and the second updates server group configurations. If you change your repository password, user name, or other connection information, change the default value of this option to TRUE, close and reopen the Designer, then update the CD_DS_d0cafae2 datastore configuration to match your new repository configuration. This enables the calculate usage dependency job (CD_JOBd0cafae2) and the server group job (di_job_al_mach_info) to run without a connection error.
Default value: FALSE

Option: FTP Number of Retry
Description: Sets the number of retries for an FTP connection that initially fails.
Default value: 0

Option: FTP Retry Interval
Description: Sets the FTP connection retry interval in milliseconds.
Default value: 1000

Option: Global_DOP
Description: Sets the Degree of Parallelism for all data flows run by a given Job Server. You can also set the Degree of parallelism for individual data flows from each data flow’s Properties window. If a data flow’s Degree of parallelism value is 0, then the Job Server will use the Global_DOP value. The Job Server will use the data flow’s Degree of parallelism value if it is set to any value except zero because it overrides the Global_DOP value. For more information, see the Data Integrator Performance Optimization Guide.
Default value: 1

Option: Ignore Reduced Msg Type
Description: (For SAP R/3) Disables IDoc reduced message type processing for all message types if the value is set to TRUE.
Default value: FALSE

Option: Ignore Reduced Msg Type_foo
Description: (For SAP R/3) Disables IDoc reduced message type processing for a specific message type (such as foo) if the value is set to TRUE.
Default value: FALSE

Option: OCI Server Attach Retry
Description: The engine calls the Oracle OCIServerAttach function each time it makes a connection to Oracle. If the engine calls this function too fast (processing parallel data flows, for example), the function may fail. To correct this, increase the retry value to 5.
Default value: 3

Option: Splitter Optimization
Description: If you create a job in which a file source feeds into two queries, Data Integrator might hang. If this option is set to TRUE, the engine internally creates two source files that feed the two queries instead of a splitter that feeds the two queries.
Default value: FALSE

Option: Use Explicit Database Links
Description: Jobs with imported database links normally will show improved performance because Data Integrator uses these links to push down processing to a database. If you set this option to FALSE, all data flows will not use linked datastores. The use of linked datastores can also be disabled from any data flow properties dialog. The data flow level option takes precedence over this Job Server level option. For more information, see the Data Integrator Performance Optimization Guide.
Default value: TRUE

Option: Use Domain Name
Description: Adds a domain name to a Job Server name in the repository. This creates a fully qualified server name and allows the Designer to locate a Job Server on a different domain.
Default value: TRUE

To change option values for an individual Job Server
1. Select the Job Server you want to work with by making it your default Job Server:
a. Select Tools > Options > Designer > Environment.
b. Select a Job Server from the Default Job Server section.
c. Click OK.
2. Select Tools > Options > Job Server > General.
3. Enter the section and key you want to use from the following list of value pairs:

Section        Key
int            AdapterDataExchangeTimeout
int            AdapterStartTimeout
AL_JobServer   AL_JobServerLoadBalanceDebug
AL_JobServer   AL_JobServerLoadOSPolling
string         DisplayDIInternalJobs
AL_Engine      FTPNumberOfRetry
AL_Engine      FTPRetryInterval
AL_Engine      Global_DOP
AL_Engine      IgnoreReducedMsgType
AL_Engine      IgnoreReducedMsgType_foo
AL_Engine      OCIServerAttach_Retry

AL_Engine      SPLITTER_OPTIMIZATION
AL_Engine      UseExplicitDatabaseLinks
Repository     UseDomainName

4. Enter a value.
For example, enter the following to change the default value for the number of times a Job Server will retry to make an FTP connection if it initially fails; these settings will change the default value for the FTPNumberOfRetry option from zero to two. (An illustrative entry appears after this procedure.)
5. To save the settings and close the Options window, click OK.
6. Re-select a default Job Server by repeating step 1, as needed.
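For the FTP retry example in step 4, the Options window entries would use the section and key listed in the table above, with the new retry count as the value:

    Section: AL_Engine
    Key:     FTPNumberOfRetry
    Value:   2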

Chapter 14: Data Quality

Chapter overview

With operational systems frequently changing, data quality control becomes critical in your extract, transform and load (ETL) jobs. The Data Integrator Designer provides data quality controls that act as a firewall to identify and fix errors in your data. These features can help ensure that you have “trusted” information.

The Data Integrator Designer provides the following features that you can use to determine and improve the quality and structure of your source data:

• Use the Data Profiler to determine:
  • The quality of your source data before you extract it. The Data Profiler can identify anomalies in your source data to help you better define corrective actions in the validation transform, data cleansing, or other transforms.
  • The distribution, relationship, and structure of your source data to better design your Data Integrator jobs and data flows, as well as your target data warehouse.
  • The content of your source and target data so that you can verify that your data extraction job returns the results you expect.

• Use the View Data feature to:
  • View your source data before you execute a job to help you create higher quality job designs.
  • Compare sample data from different steps of your job to verify that your data extraction job returns the results you expect.

• Use the Validation transform to:
  • Verify that your source data meets your business rules.
  • Take appropriate actions when the data does not meet your business rules.

• Use the auditing data flow feature to:
  • Define rules that determine if a source, transform, or target object processes correct data.
  • Define the actions to take when an audit rule fails.

• Use data cleansing transforms to improve the quality of your data. For more information, see Chapter 18: Data Cleansing.

• Use Data Validation dashboards in the Metadata Reporting tool to evaluate the reliability of your target data based on the validation rules you created in your Data Integrator batch jobs. This feedback allows business users to quickly review, assess, and identify potential inconsistencies or errors in source data.

For more information about Data Validation dashboards, see the Data Integrator Management Console: Metadata Reports User’s Guide.

This chapter contains the following topics:
• Using the Data Profiler
• Using View Data to determine data quality
• Using the Validation transform
• Using Auditing

Using the Data Profiler

The Data Profiler executes on a profiler server to provide the following data profiler information that multiple users can view:

• Column analysis—The Data Profiler provides two types of column profiles:
  • Basic profiling—This information includes minimum value, maximum value, average value, minimum string length, and maximum string length.
  • Detailed profiling—Detailed column analysis includes distinct count, distinct percent, median, median string length, pattern count, and pattern percent.

• Relationship analysis—This information identifies data mismatches between any two columns for which you define a relationship, including columns that have an existing primary key and foreign key relationship. You can save two levels of data:
  • Save the data only in the columns that you select for the relationship.
  • Save the values in all columns in each row.

For the most recent list of profile information, refer to the Data Integrator Release Notes.

The topics in this section include:
• Connecting to the profiler server
• Profiler statistics
• Executing a profiler task
• Monitoring profiler tasks using the Designer
• Viewing the profiler results

Data sources that you can profile

You can execute the Data Profiler on data contained in the following sources. See the Data Integrator Release Notes for the complete list of sources that the Data Profiler supports.

• Databases, which include:
  • Attunity Connector for mainframe databases
  • DB2
  • Oracle
  • SQL Server
  • Sybase IQ
  • Teradata
• Applications, which include:
  • JDE One World
  • JDE World
  • Oracle Applications
  • PeopleSoft
  • SAP R/3
  • Siebel
• Flat files

Connecting to the profiler server

You must install and configure the profiler server before you can use the Data Profiler. For details, see the Data Integrator Management Console: Administrator Guide.

The Data Integrator Designer must connect to the profiler server to run the Data Profiler and view the profiler results. You provide this connection information on the Profiler Server Login window.

To connect to a Data Profiler Server from the Data Integrator Designer
1. Use one of the following methods to invoke the Profiler Server Login window:
• From the tool bar menu, select Tools > Profiler Server Login.
• On the bottom status bar, double-click the Profiler Server icon which is to the right of the Job Server icon.

2. In the Profiler Server Login window, enter the Data Profiler Server connection information.

Field    Description
Host     The name of the computer where the Data Profiler Server exists.
Port     Port number through which the Designer connects to the Data Profiler Server.

3. Click Test to validate the Profiler Server location.
If the host name is valid, you receive a message that indicates that the profiler server is running.
Note: When you click Test, the drop-down list in User Name displays the user names that belong to the profiler server. To add profiler users, see the Data Integrator Management Console: Administrator Guide.

4. Enter the user information in the Profiler Server Login window.

Field       Description
User Name   The user name for the Profiler Server login. You can select a user name from the drop-down list or enter a new name.
Password    The password for the Profiler Server login.

5. Click Connect.
When you successfully connect to the profiler server, the Profiler Server icon on the bottom status bar no longer has the red X on it. In addition, when you move the pointer over this icon, the status bar displays the location of the profiler server.

Profiler statistics

You can calculate and generate two types of data profiler statistics:
• Column profile
• Relationship profile

Column profile

You can generate statistics for one or more columns. The columns can all belong to one data source or to multiple data sources. If you generate statistics for multiple sources in one profile task, all sources must be in the same datastore. For details, see “Submitting column profiler tasks” on page 342.

The Data Profiler provides two types of column profiles:
• Basic profiling
• Detailed profiling

This section also includes Examples of using column profile statistics to improve data quality.

Basic profiling

By default, the Data Profiler generates the following basic profiler attributes for each column that you select.

Basic Attribute         Description
Min                     Of all values, the lowest value in this column.
Min count               Number of rows that contain this lowest value in this column.
Max                     Of all values, the highest value in this column.
Max count               Number of rows that contain this highest value in this column.
Average                 For numeric columns, the average value in this column.
Min string length       For character columns, the length of the shortest string value in this column.
Max string length       For character columns, the length of the longest string value in this column.
Average string length   For character columns, the average length of the string values in this column.
Nulls                   Number of NULL values in this column.
Nulls %                 Percentage of rows that contain a NULL value in this column.
Zeros                   Number of 0 values in this column.

Basic Attribute   Description
Zeros %           Percentage of rows that contain a 0 value in this column.
Blanks            For character columns, the number of rows that contain a blank in this column.
Blanks %          Percentage of rows that contain a blank in this column.

For more information, see the following:
• For the most recent list of profiler attributes, see the Data Integrator Release Notes.
• To generate the profiler attributes, see “Submitting column profiler tasks” on page 342.
• To view the profiler attributes, see “Viewing column profile data” on page 350.

Detailed profiling

You can generate more detailed attributes in addition to the above attributes, but detailed attribute generation consumes more time and computer resources. Therefore, Business Objects recommends that you do not select the detailed profile unless you need the following attributes:

Detailed Attribute     Description
Median                 The value that is in the middle row of the source table.
Median string length   For character columns, the value that is in the middle row of the source table.
Distincts              Number of distinct values in this column.
Distinct %             Percentage of rows that contain each distinct value in this column.
Patterns               Number of different patterns in this column.
Pattern %              Percentage of rows that contain each pattern in this column.

Examples of using column profile statistics to improve data quality

You can use the column profile attributes to assist you in different tasks, including the following:

• Obtain basic statistics, frequencies, ranges, and outliers. For example, these profile statistics might show that a column value is markedly higher than the other values in a data source. You might then decide to define a validation transform to set a flag in a different table when you load this outlier into the target table.

• Identify variations of the same content. For example, part number might be an integer data type in one data source and a varchar data type in another data source. You might then decide which data type you want to use in your target data warehouse.

• Discover data patterns and formats. For example, the profile statistics might show that phone number has several different formats. With this profile information, you might decide to define a validation transform to convert them all to use the same target format.

• Analyze the numeric range. For example, customer number might have one range of numbers in one source, and a different range in another source. Your target will need to have a data type that can accommodate the maximum range.

• Identify missing information, nulls, and blanks in the source system. For example, the profile statistics might show that nulls occur for fax number. You might then decide to define a validation transform to replace the null value with a phrase such as “Unknown” in the target table.
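As an illustration of that last example, the substitution could be expressed with a mapping or validation expression along the following lines. This is a sketch only: the table and column names are hypothetical, and nvl is used here as a function that returns the second argument when the first is NULL (see the Data Integrator Reference Guide for the exact function behavior and the Validation transform options):

    nvl(CUSTOMER.FAX_NUMBER, 'Unknown')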

Relationship profile

A relationship profile shows the percentage of non-matching values in columns of two sources. The sources can be:
• Tables
• Flat files
• A combination of a table and a flat file

The key columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).

You can choose between two levels of relationship profiles to save:

• Save key columns data only
By default, the Data Profiler saves the data only in the columns that you select for the relationship.

• Save all columns data
You can save the values in the other columns in each row, but this processing will take longer and consume more computer resources to complete.

When you view the relationship profile results, you can drill down to see the actual data that does not match (see “Viewing the profiler results”). For details, see “Submitting relationship profiler tasks” on page 346.

You can use the relationship profile to assist you in different tasks, including the following:

• Identify missing data in the source system. For example, one data source might include region, but another source might not.

• Identify redundant data across data sources. For example, duplicate names and addresses might exist between two sources or no name might exist for an address in one source.

• Validate relationships across data sources. For example, two different problem tracking systems might include a subset of common customer-reported problems, but some problems only exist in one system or the other.

Executing a profiler task

The Data Profiler allows you to calculate profiler statistics for any set of columns you choose.

Note: This optional feature is not available for columns with nested schemas, LONG or TEXT data type.

You can execute the following profiler tasks:
• Submitting column profiler tasks
• Submitting relationship profiler tasks

You cannot execute a column profile task with a relationship profile task.

Submitting column profiler tasks

To generate profile statistics for columns in one or more data sources
1. In the Object Library of the Data Integrator Designer, select a table or flat file:
• For a table, go to the Datastores tab and select a table. To select a subset of tables in the Datastores tab, hold down the Ctrl key as you select each table. If you want to profile all tables within a datastore, select the datastore name.
• For a flat file, go to the Formats tab and select a file. To select multiple files in the Formats tab, hold down the Ctrl key as you select each file.
2. After you select your data source, you can generate column profile statistics in one of the following ways:
• Right-click and select Submit Column Profile Request.
Reasons to submit profile tasks this way include:

• Some of the profile statistics can take a long time to calculate.
• The profile task runs asynchronously and you can perform other Designer tasks while the profile task executes.
• You can profile multiple sources in one profile task.

• Right-click, select View Data, click the Profile tab, and click Update. This option submits a synchronous profile task, and you must wait for the task to complete before you can perform other tasks on the Designer.
You might want to use this option if you are already on the View Data window and you notice that either:
• The profile statistics have not yet been generated, or
• The date that the profile statistics were generated is older than you want.

3. (Optional) Edit the profiler task name.
The Data Profiler generates a default name for each profiler task. You can edit the task name to create a more meaningful name, a unique name, or to remove dashes, which are allowed in column names but not in task names.
If you select a single source, the default name has the following format:
username_t_sourcename
If you select multiple sources, the default name has the following format:
username_t_firstsourcename_lastsourcename

Column            Description
username          Name of the user that Data Integrator uses to access system services.
t                 Type of profile. The value is C for column profile that obtains attributes (such as low value and high value) for each selected column. For more information, see “Column profile” on page 339.
firstsourcename   Name of first source in alphabetic order.
lastsourcename    Name of last source in alphabetic order if you select multiple sources.

4. If you select one source, the Submit Column Profile Request window lists the columns and data types. Keep the check in front of each column that you want to profile and remove the check in front of each column that you do not want to profile.

Alternatively, you can click the check box at the top in front of Name to deselect all columns and then select the check boxes.
5. If you selected multiple sources, the Submit Column Profiler Request window lists the sources on the left.
a. Select a data source to display its columns on the right side.

b. On the right side of the Submit Column Profile Request window, keep the check in front of each column that you want to profile, and remove the check in front of each column that you do not want to profile. Alternatively, you can click the check box at the top in front of Name to deselect all columns and then select the individual check box for the columns you want to profile.
c. Repeat steps a and b for each data source.
6. (Optional) Select Detailed profiling for a column.
Note: The Data Profiler consumes a large amount of resources when it generates detailed profile statistics. Choose Detailed profiling only if you want these attributes: distinct count, distinct percent, median value, median string length, pattern, pattern count. If you choose Detailed profiling, ensure that you specify a pageable cache directory that contains enough disk space for the amount of data you profile. See “Configuring Job Server runtime resources” in the Data Integrator Getting Started Guide.
If you want detailed attributes for all columns in all sources listed, click Detailed profiling and select Apply to all columns of all sources. If you want to remove Detailed profiling for all columns, click Detailed profiling and select Remove from all columns of all sources.
7. Click Submit to execute the profile task.
Note: If the table metadata changed since you imported it (for example, a new column was added), you must re-import the source table before you execute the profile task.
If you clicked the Submit Column Profile Request option to reach this Submit Column Profiler Request window, the Profiler monitor pane appears automatically when you click Submit. For details, see “Monitoring profiler tasks using the Designer” on page 349.
If you clicked Update on the Profile tab of the View Data window, the Profiler monitor window does not appear when you click Submit. Instead, a profile task is submitted synchronously and you must wait for it to complete before you can do other tasks on the Designer.
You can also monitor your profiler task by name in the Data Integrator Administrator. For details, see the Data Integrator Management Console Administrator Guide.
8. When the profiler task has completed, you can view the profile results in the View Data option. For details, see “Viewing the profiler results”.

Submitting relationship profiler tasks

A relationship profile shows the percentage of non-matching values in columns of two sources. The sources can be any of the following:
• Tables
• Flat files
• A combination of a table and a flat file

For more details, see “Data sources that you can profile” on page 336.

The columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format). The two columns do not need to be the same data type, but they must be convertible. For example, if you run a relationship profile task on an integer column and a varchar column, the Data Profiler converts the integer value to a varchar value to make the comparison.

Note: The Data Profiler consumes a large amount of resources when it generates relationship values. If you plan to use Relationship profiling, ensure that you specify a pageable cache directory that contains enough disk space for the amount of data you profile. See “Configuring Job Server runtime resources” in the Data Integrator Getting Started Guide.

To generate a relationship profile for columns in two sources
1. In the Object Library of the Data Integrator Designer, select two sources.
To select two sources in the same datastore or file format:
a. Go to the Datastore or Format tab in the Object Library.
b. Hold the Ctrl key down as you select the second table.
c. Right-click and select Submit Relationship Profile Request.
To select two sources from different datastores or files:
a. Go to the Datastore or Format tab in the Object Library.
b. Right-click on the first source and select Submit Relationship Profile Request > Relationship with.
c. Change to a different Datastore or Format in the Object Library.
d. Click on the second source.
The Submit Relationship Profile Request window appears.

Note: You cannot create a relationship profile for the same column in the same source or for columns with a LONG or TEXT data type.

2. (Optional) Edit the profiler task name.

The default name that the Data Profiler generates for multiple sources has the following format:

username_t_firstsourcename_lastsourcename

Column          | Description
username        | Name of the user that Data Integrator uses to access system services.
t               | Type of profile. The value is R for Relationship profile that obtains non matching values in the two selected columns.
firstsourcename | Name of the first selected source.
lastsourcename  | Name of the last selected source.

You can edit the task name to create a more meaningful name, a unique name, or to remove dashes, which are allowed in column names but not in task names.
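As a purely illustrative example (the user name below is hypothetical, and the table names are the ones used later in this chapter), a relationship profile submitted by a repository user named dwuser on tables ODS_CUSTOMER and ODS_SALESORDER would receive a default name similar to:

dwuser_R_ODS_CUSTOMER_ODS_SALESORDER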

3. By default, the upper pane of the Submit Relationship Profile Request window shows a line between the primary key column and foreign key column of the two sources, if the relationship exists. You can change the columns to profile.

The bottom half of the Submit Relationship Profile Request window shows that the profile task will use the equal (=) operation to compare the two columns. The Data Profiler will determine which values are not equal and calculate the percentage of non matching values.

4. To delete an existing relationship between two columns, select the line, right-click, and select Delete Selected Relation.

To delete all existing relationships between the two sources, do one of the following actions:
• Right-click in the upper pane and click Delete All Relations.
• Click Delete All Relations near the bottom of the Submit Relationship Profile Request window.

5. If a primary key and foreign key relationship does not exist between the two data sources, specify the columns that you want to profile. You can resize each data source to show all columns.

To specify or change the columns for which you want to see relationship values:
a. Move the cursor to the first column that you want to select.
b. Hold down the cursor and draw a line to the other column that you want to select.

If you deleted all relations and you want the Data Profiler to select an existing primary-key and foreign-key relationship, do one of the following actions:
• Right-click in the upper pane and click Propose Relation.
• Click Propose Relation near the bottom of the Submit Relationship Profile Request window.

6. By default, the Save key columns data only option is selected. This option indicates that the Data Profiler saves the data only in the columns that you select for the relationship, and you will not see any sample data in the other columns when you view the relationship profile.

If you want to see values in the other columns in the relationship profile, select Save all columns data.

7. Click Submit to execute the profiler task. For the profile results, see "Viewing relationship profile data" on page 354.

Note: If the table metadata changed since you imported it (for example, a new column was added), you must re-import the source table before you execute the profile task.

8. The Profiler monitor pane appears automatically when you click Submit. You can dock this profiler monitor pane in the Designer or keep it separate.

If you clicked Update on the Profile tab of the View Data window, you must click Tools > Profiler monitor on the Menu bar to view the Profiler monitor window. For details about the Profile monitor, see "Monitoring profiler tasks using the Designer".

The Profiler monitor pane displays the currently running task and all of the profiler tasks that have executed within a configured number of days. For more information about parameters, see the Data Integrator Management Console Administrator Guide. If the task failed, the Information window also displays the error message.

You can also monitor your profiler task by name in the Data Integrator Administrator. For details, see the Data Integrator Management Console Administrator Guide.

9. When the profiler task has completed, you can view the profile results in the View Data option when you right-click on a table in the Object Library. For details, see "Viewing the profiler results" on page 350.

Monitoring profiler tasks using the Designer

The Profiler monitor window appears automatically when you submit a profiler task (see "Executing a profiler task" on page 342). You can dock this profiler monitor pane in the Designer or keep it separate.

You can click on the icons in the upper-left corner of the Profiler monitor to display the following information:
• Refreshes the Profiler monitor pane to display the latest status of profiler tasks
• Sources that the selected task is profiling

The Profiler monitor shows the following columns:

Column    | Description
Name      | Name of the profiler task that was submitted from the Designer. If the profiler task is for a single source, the default name has the following format: username_t_sourcename. If the profiler task is for multiple sources, the default name has the following format: username_t_firstsourcename_lastsourcename.
Type      | The type of profiler task can be: Column, Relationship.
Status    | The status of a profiler task can be:
          | • Done — The task completed successfully.
          | • Pending — The task is on the wait queue because the maximum number of concurrent tasks has been reached or another task is profiling the same table.
          | • Running — The task is currently executing.
          | • Error — The task terminated with an error. Double-click on the value in this Status column to display the error message.
Timestamp | Date and time that the profiler task executed.
Sources   | Names of the tables for which the profiler task executes.

Viewing the profiler results

The Data Profiler calculates and saves the profiler attributes into a profiler repository that multiple users can view. This section describes:

• Viewing column profile data on the Profile tab in View Data.
• Viewing relationship profile data on the Relationship tab in View Data.

Viewing column profile data

To view the column attributes generated by the Data Profiler

1. In the Object Library, select the table for which you want to view profiler attributes.
2. Right-click and select View Data.
3. Click the Profile tab (second) to view the column profile attributes.

a. The profile grid contains the column names in the current source and profile attributes for each column. To populate the profile grid, do one of the following actions:
   • Perform the steps in "Executing a profiler task" on page 342.
   • Select names from this column, then click Update.
b. You can sort the values in each attribute column by clicking the column heading.
c. The Profile tab shows the number of physical records that the Data Profiler processed to generate the values in the profile grid.

The value n/a in the profile grid indicates an attribute does not apply to a data type.

Basic Profile attribute | Description | Character | Numeric | Datetime
Min                     | Of all values, the lowest value in this column. | Yes | Yes | Yes
Min count               | Number of rows that contain this lowest value in this column. | Yes | Yes | Yes
Max                     | Of all values, the highest value in this column. | Yes | Yes | Yes
Max count               | Number of rows that contain this highest value in this column. | Yes | Yes | Yes
Average                 | For numeric columns, the average value in this column. | Yes | Yes | Yes
Min string length       | For character columns, the length of the shortest string value in this column. | Yes | No | No
Max string length       | For character columns, the length of the longest string value in this column. | Yes | No | No
Average string length   | For character columns, the average string length of the string values in this column. | Yes | No | No
Nulls                   | Number of NULL values in this column. | Yes | Yes | Yes
Nulls %                 | Percentage of rows that contain a NULL value in this column. | Yes | Yes | Yes
Zeros                   | Number of 0 values in this column. | No | Yes | No

Basic Profile attribute | Description | Character | Numeric | Datetime
Zeros %                 | Percentage of rows that contain a 0 value in this column. | No | Yes | No
Blanks                  | For character columns, the number of rows that contain a blank in this column. | Yes | No | No
Blanks %                | Percentage of rows that contain a blank in this column. | Yes | No | No

If you selected the Detailed profiling option on the Submit Column Profile Request window, the Profile tab also displays the following detailed attribute columns.

Detailed Profile attribute | Description | Character | Numeric | Datetime
Distincts                  | Number of distinct values in this column. | Yes | Yes | Yes
Distinct %                 | Percentage of rows that contain each distinct value in this column. | Yes | Yes | Yes
Median                     | The value that is in the middle row of the source table. The Data Profiler uses the following calculation to obtain the median value: (Total number of rows / 2) + 1 | Yes | Yes | Yes
Median string length       | For character columns, the length of the string value that is in the middle row of the source table. | Yes | No | No
Patterns                   | Number of different patterns in this column. | Yes | No | No
Pattern %                  | Percentage of rows that contain each pattern in this column, and the format of each unique pattern in this column. | Yes | No | No

4. Click an attribute value to view the entire row in the source table. The bottom half of the View Data window displays the rows that contain the attribute value that you clicked. You can hide columns that you do not want to view by clicking the Show/Hide Columns icon.

For example, your target ADDRESS column might only be 45 characters, but the Profiling data for this Customer source table shows that the maximum string length is 46. Click on the value 46 to view the actual data. You can resize the width of the column to display the entire string.
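To make the median calculation concrete, here is a small worked example (the row count is invented for illustration): for a source table with 1,000 rows, the Data Profiler reports the value found in row (1,000 / 2) + 1 = 501 of the source table as the median.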

5. (Optional) Click Update if you want to update the profile attributes. Reasons to update at this point include:

• The profile attributes have not yet been generated.
• The date that the profile attributes were generated is older than you want. The Last updated value in the bottom left corner of the Profile tab is the timestamp when the profile attributes were last generated.

Note: The Update option submits a synchronous profile task, and you must wait for the task to complete before you can perform other tasks on the Designer.

The Submit Column Profile Request window appears. Select only the column names you need for this profiling operation because Update calculations impact performance. You can also click the check box at the top in front of Name to deselect all columns and then select each check box in front of each column you want to profile.

6. Click a statistic in either Distincts or Patterns to display the percentage of each distinct value or pattern value in a column. The pattern values, number of records for each pattern value, and percentages appear on the right side of the Profile tab.

For example, the following Profile tab for table CUSTOMERS shows the profile attributes for column REGION. The Distincts attribute for the REGION column shows the statistic 19, which means 19 distinct values for REGION exist.

7. Click the statistic in the Distincts column to display each of the 19 values and the percentage of rows in table CUSTOMERS that have that value for column REGION. In addition, the bars in the right-most column show the relative size of each percentage.

8. The Profiling data on the right side shows that a very large percentage of values for REGION is Null. Click on either Null under Value or 60 under Records to display the other columns in the rows that have a Null value in the REGION column.

9. Your business rules might dictate that REGION should not contain Null values in your target data warehouse. Therefore, decide what value you want to substitute for Null values when you define a validation transform. For details, see "Define validation rule based on column profile" on page 359.

Viewing relationship profile data

Relationship profile data shows the percentage of non matching values in columns of two sources. The sources can be tables, flat files, or a combination of a table and a flat file. The columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).

To view the relationship profile data generated by the Data Profiler

1. In the Object Library, select the table or file for which you want to view relationship profile data.
2. Right-click and select View Data.
3. Click the Relationship tab (third) to view the relationship profile results.

Note: The Relationship tab is visible only if you executed a relationship profile task.

For example, the following View Data Relationship tab shows the percentage (16.67) of customers that do not have a sales order. The relationship profile was defined on the CUST_ID column in table ODS_CUSTOMER and the CUST_ID column in table ODS_SALESORDER. The value in the left oval indicates that 16.67% of rows in table ODS_CUSTOMER have CUST_ID values that do not exist in table ODS_SALESORDER.

4. Click the nonzero percentage in the diagram to view the key values that are not contained within the other table.

For example, click the 16.67 percentage in the ODS_CUSTOMER oval to display the CUST_ID values that do not exist in the ODS_SALESORDER table. The non matching values KT03 and SA03 display on the right side of the Relationship tab. Each row displays a non matching CUST_ID value, the number of records with that CUST_ID value, and the percentage of total customers with this CUST_ID value.

5. Click one of the values on the right side to display the other columns in the rows that contain that value.

The bottom half of the Relationship Profile tab displays the values in the other columns of the row that has the value KT03 in the column CUST_ID.

Note: If you did not select Save all column data on the Submit Relationship Profile Request window, you cannot view the data in the other columns. See step 6 in "Submitting relationship profiler tasks" on page 346.

Using View Data to determine data quality

Use View Data to help you determine the quality of your source and target data. View Data provides the capability to:

• View sample source data before you execute a job to create higher quality job designs.
• Compare sample data from different steps of your job to verify that your data extraction job returns the results you expect.

For more information about View Data options and how to use View Data to design and debug your jobs, see "Using View Data" on page 404.

You can see the data in different ways from the three tabs on the View Data panel:

• Data tab
• Profile tab
• Relationship Profile or Column Profile tab

Data tab

The Data tab is always available and displays the data contents of sample rows. You can display a subset of columns in each row and define filters to display a subset of rows (see "View Data properties" on page 408).

For example, your business rules might dictate that all phone and fax numbers be in one format for each country. The following Data tab shows a subset of rows for the customers that are in France. For an example, see "Define validation rule based on column profile" on page 359.

Notice that the PHONE and FAX columns display values with two different formats. You can now decide which format you want to use in your target data warehouse and define a validation transform accordingly (see "Define validation rule based on column profile" on page 359).

For information about other options on the Data tab, see "Data tab" on page 414.

Profile tab

Two displays are available on the Profile tab:

• Without the Data Profiler, the Profile tab displays the following column attributes: distinct values, NULLs, minimum value, and maximum value. For more information, see "Profile tab" on page 415.

• If you configured and use the Data Profiler, the Profile tab displays the same above column attributes plus many more calculated statistics, such as average value, minimum string length, maximum string length, distinct count, distinct percent, median, median string length, pattern count, and pattern percent. For more information, see "Viewing column profile data" on page 350.

Relationship Profile or Column Profile tab

The third tab that displays depends on whether or not you configured and use the Data Profiler.

• If you do not use the Data Profiler, the Column Profile tab allows you to calculate statistical information for a single column. For more information, see "Column Profile tab" on page 417.

• If you use the Data Profiler, the Relationship tab displays the data mismatches between two columns from which you can determine the integrity of your data between two sources. For more information, see "Viewing relationship profile data" on page 354.

Using the Validation transform

The validation transform provides the ability to compare your incoming data against a set of pre-defined business rules and, if needed, take any corrective actions. The Data Profiler and View Data features can identify anomalies in the incoming data to help you better define corrective actions in the validation transform.

Analyze column profile

To obtain column profile information, follow the procedure "Submitting column profiler tasks" on page 342. For example, suppose you want to analyze the data in the Customer table in the Microsoft SQL Server Northwinds sample database.

To analyze column profile attributes

1. In the Object Library of the Designer, select the View Data right-click option on the table that you profiled.
2. Access the Profile tab on the View Data window. The Profile tab shows the following column profile attributes.

The Patterns attribute for the PHONE column shows the value 20, which means 20 different patterns exist.

3. Click the value 20 under the Patterns attribute to display the individual patterns and the percentage of rows in table CUSTOMERS that have that pattern for column PHONE.

Suppose that your business rules dictate that all phone numbers in France should have the format 99.99.99.99.99. However, the profiling data shows that two records have the format (9) 99.99.99.99.

4. To display the columns in these two records, click either the value (9) 99.99.99.99 under Pattern or click the value 2 under Records. You see that some phone numbers in France have a prefix of '(1)'.

5. To remove this '(1)' prefix when you load the customer records into your target table, define a validation rule with the Match pattern option. For details, see "Define validation rule based on column profile" below.

Define validation rule based on column profile

This section takes the Data Profiler results and defines the validation transform according to the sample business rules. For more information about the validation transform, see the Data Integrator Reference Guide.

To define a validation rule to substitute a different value for a specific pattern

1. In the validation transform editor, select the column for which you want to replace a specific pattern. For the example in "Analyze column profile" on page 358, select the PHONE column.

2. Click the Enable validation check box.

3. In the Condition area, select Match pattern and enter the specific pattern that you want to pass per your business rules. Using the phone example above, enter the following pattern:

'99.99.99.99.99'

4. In the Action on Failure area, select Send to Pass and check the box For Pass, Substitute with.

5. Either manually enter the replace_substr function in the text box or click Function to have the Define Input Parameter(s) window help you set up the replace_substr function:

replace_substr(CUSTOMERS.PHONE, '(1) ', null)

Note: This replace_substr function does not replace any number enclosed in parenthesis. It only replaces occurrences of the value (1) because the Data Profiler results show only that specific value in the source data.

6. Click Function and select the string function replace_substr.
7. Click Next.
8. In the Define Input Parameter(s) window, click the Input string dropdown list to display source tables.
9. In the Input Parameter window, double-click the source table and double-click the column name. For the phone example, double-click the Customer table and double-click the Phone column name.
10. For Search string on the Define Input Parameter(s) window, enter the value of the string you want to replace. For the phone example, enter: '(1) '
11. For Replace string on the Define Input Parameter(s) window, enter your replacement value. In our example, enter: null
12. Click Finish.
13. Repeat steps 1 through 5 to define a similar validation rule for the FAX column.

After you execute the job, use the View Data icons to verify that the string was substituted correctly.
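A quick way to sanity-check the rule before running the job is to trace the function against sample values. The phone numbers below are invented for illustration and are not taken from the Northwinds data; the behavior shown assumes the same replace_substr call defined above:

replace_substr('(1) 45.67.89.01.23', '(1) ', null)  returns  '45.67.89.01.23'
replace_substr('45.67.89.01.23', '(1) ', null)      returns  '45.67.89.01.23'

In the second case the search string does not occur, so the value passes through unchanged, which matches the intent of the business rule.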

Using Auditing

Auditing provides a way to ensure that a data flow loads correct data into the warehouse. Use auditing to perform the following tasks:

• Define audit points to collect run time statistics about the data that flows out of objects. Auditing stores these statistics in the Data Integrator repository.
• Define rules with these audit statistics to ensure that the data at the following points in a data flow is what you expect:
  • Extracted from sources
  • Processed by transforms
  • Loaded into targets
• Generate a run time notification that includes the audit rule that failed and the values of the audit statistics at the time of failure.
• Display the audit statistics after the job execution to help identify the object in the data flow that might have produced incorrect data.

Note: If you add an audit point prior to an operation that is usually pushed down to the database server, performance might degrade because pushdown operations cannot occur after an audit point. For details, see "Guidelines to choose audit points" on page 371.

This section describes the following topics:

• Auditing objects in a data flow
• Accessing the Audit window
• Defining audit points, rules, and action on failure
• Guidelines to choose audit points
• Auditing embedded data flows
• Resolving invalid audit labels
• Viewing audit results

Auditing objects in a data flow

You can collect audit statistics on the data that flows out of any Data Integrator object, such as a source, transform, or target. If a transform has multiple distinct or different outputs (such as Validation or Case), you can audit each output independently.

To use auditing, you define the following objects in the Audit window:

• Audit point — The object in a data flow where you collect audit statistics. You can audit a source, a transform, or a target. You identify the object to audit when you define an audit function on it.

• Audit function — The audit statistic that Data Integrator collects for a table, output schema, or column. For more information, see "Audit function" on page 363.

• Audit label — The unique name in the data flow that Data Integrator generates for the audit statistics collected for each audit function that you define. You use these labels to define audit rules for the data flow. For more information, see "Audit label" on page 364.

• Audit rule — A Boolean expression in which you use audit labels to verify the Data Integrator job. If you define multiple rules in a data flow, all rules must succeed or the audit fails. For more information, see "Audit rule" on page 365.

• Actions on audit failure — One or more of three ways to generate notification of an audit rule (or rules) failure: email, custom script, raise exception. For more information, see "Audit notification" on page 365.

Audit function

This section describes the data types for the audit functions and the error count statistics. The following table shows the audit functions that you can define.

Data Object            | Audit Function | Description
Table or output schema | Count          | This function collects two statistics: Good count for rows that were successfully processed, and Error count for rows that generated some type of error if you enabled error handling.
Column                 | Sum            | Sum of the numeric values in the column. Applicable data types include decimal, double, integer, and real. This function only includes the Good rows.
Column                 | Average        | Average of the numeric values in the column. Applicable data types include decimal, double, integer, and real. This function only includes the Good rows.
Column                 | Checksum       | Checksum of the values in the column.

Data types

The following table shows the default data type for each audit function and the permissible data types. You can change the data type in the Properties window for each audit function in the Data Integrator Designer.

Audit Function | Default Data Type      | Allowed Data Types
Count          | INTEGER                | INTEGER
Sum            | Type of audited column | INTEGER, DECIMAL, DOUBLE, REAL
Average        | Type of audited column | INTEGER, DECIMAL, DOUBLE, REAL
Checksum       | VARCHAR(128)           | VARCHAR(128)

Error count statistic

When you enable a Count audit function, Data Integrator collects two types of statistics:

• Good row count for rows processed without any error.
• Error row count for rows that the Data Integrator job could not process but ignores those rows to continue processing. One way that error rows can result is when you specify the Use overflow file option in the Source Editor or Target Editor.

Audit label

Data Integrator generates a unique name for each audit function that you define on an audit point. You can edit the label names. You might want to edit a label name to create a shorter meaningful name or to remove dashes, which are allowed in column names but not in label names.

Generating label names

If the audit point is on a table or output schema, Data Integrator generates the following two labels for the audit function Count:

$Count_objectname
$CountError_objectname

If the audit point is on a column, Data Integrator generates an audit label with the following format:

$auditfunction_objectname

If the audit point is in an embedded data flow, the labels have the following formats:

$Count_objectname_embeddedDFname
$CountError_objectname_embeddedDFname
$auditfunction_objectname_embeddedDFname
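For example (the object and column names below are hypothetical, chosen only to illustrate the formats above), a Count audit function on a target table named CUST_DIM would generate the labels $Count_CUST_DIM and $CountError_CUST_DIM, and a Sum audit function on its SALES_AMOUNT column would generate the label $Sum_SALES_AMOUNT.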

Editing label names

You can edit the audit label name when you create the audit function and before you create an audit rule that uses the label. If you edit the label name after you use it in an audit rule, the audit rule does not automatically use the new name. You must redefine the rule with the new name.

Audit rule

An audit rule is a Boolean expression which consists of a Left-Hand-Side (LHS), a Boolean operator, and a Right-Hand-Side (RHS).

• The LHS can be a single audit label, multiple audit labels that form an expression with one or more mathematical operators, or a function with audit labels as parameters.
• The RHS can be a single audit label, multiple audit labels that form an expression with one or more mathematical operators, a function with audit labels as parameters, or a constant.

The following Boolean expressions are examples of audit rules:

$Count_CUSTOMER = $Count_CUSTDW
$Sum_ORDER_US + $Sum_ORDER_EUROPE = $Sum_ORDER_DW
round($Avg_ORDER_TOTAL) >= 10000

Audit notification

You can choose any combination of the following actions for notification of an audit failure. If you choose all three actions, Data Integrator executes them in this order:

• Email to list — Data Integrator sends a notification of which audit rule failed to the email addresses that you list in this option. Use a comma to separate the list of email addresses. You can specify a variable for the email list. This option uses the smtp_to function to send email. Therefore, you must define the server and sender for the Simple Mail Transfer Protocol (SMTP) in the Data Integrator Server Manager.

• Script — Data Integrator executes the custom script that you create in this option.

• Raise exception — The job fails if an audit rule fails, and the error log shows which audit rule failed. The job stops at the first audit rule that fails. You can use this audit exception in a try/catch block, and you can continue the job execution in a try/catch block. This action is the default.
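The body of a custom script for the Script action is entirely up to you. As a minimal sketch (the message text is made up, and this assumes only the documented Data Integrator print function, which writes to the trace log), the script could be a single statement:

print('Audit rule failed in data flow Case_DF -- see the Auditing Details report for label values');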

If you uncheck this action and an audit rule fails, the job completes successfully and the audit does not write messages to the job log. You can view which rule failed in the Auditing Details report in the Metadata Reporting tool. For more information, see "Viewing audit results" on page 377.

Accessing the Audit window

Access the Audit window from one of the following places in the Data Integrator Designer:

• From the Data Flows tab of the object library, right-click on a data flow name and select the Auditing option.
• In the workspace, right-click on a data flow icon and select the Auditing option.
• When a data flow is open in the workspace, click the Audit icon in the toolbar.

When you first access the Audit window, the Label tab displays the sources and targets in the data flow. If your data flow contains multiple consecutive query transforms, the Audit window shows the first query.

Click the icons on the upper left corner of the Label tab to change the display.

Icon Tool tip | Description
Collapse All  | Collapses the expansion of the source, transform, and target objects.

Icon Tool tip                        | Description
Show All Objects                     | Displays all the objects within the data flow.
Show Source, Target and first query  | Default display which shows the source, target, and first query objects in the data flow. If the data flow contains multiple consecutive query transforms, only the first query displays.
Show Labelled Objects                | Displays the objects that have audit labels defined.

Defining audit points, rules, and action on failure

To define auditing in a data flow

1. Access the Audit window. Use one of the methods that section "Accessing the Audit window" on page 366 describes.

2. Define audit points. On the Label tab, right-click on an object that you want to audit and choose an audit function or Properties.

When you define an audit point, Data Integrator generates the following:
• An audit icon on the object in the data flow in the workspace
• An audit label that you use to define audit rules. For the format of this label, see "Auditing objects in a data flow" on page 362.

In addition to choosing an audit function, the Properties window allows you to edit the audit label and change the data type of the audit function.

For example, the data flow Case_DF has the following objects and you want to verify that all of the source rows are processed by the Case transform:

• Source table ODS_CUSTOMER
• Four target tables:
  • R1 contains rows where ODS_CUSTOMER.REGION_ID = 1
  • R2 contains rows where ODS_CUSTOMER.REGION_ID = 2
  • R3 contains rows where ODS_CUSTOMER.REGION_ID = 3
  • R123 contains rows where ODS_CUSTOMER.REGION_ID IN (1, 2 or 3)

a. Right-click on source table ODS_CUSTOMER and choose Count.

Data Integrator creates the audit labels $Count_ODS_CUSTOMER and $CountError_ODS_CUSTOMER, and an audit icon appears on the source object in the workspace.

b. Similarly, right-click on each of the target tables and choose Count. The Audit window shows the following audit labels.

c. If you want to remove an audit label, right-click on the label, and the audit function that you previously defined displays with a check mark in front of it. Click the function to remove the check mark and delete the associated audit label.

When you right-click on the label, you can also select Properties, and select the value (No Audit) in the Audit function drop-down list.

3. Define audit rules. On the Rule tab in the Audit window, click Add, which activates the expression editor of the Auditing Rules section.

If you want to compare audit statistics for one object against one other object, use the expression editor, which consists of three text boxes with drop-down lists:

a. Select the label of the first audit point in the first drop-down list.
b. Choose a Boolean operator from the second drop-down list. The options in the editor provide common Boolean operators. If you require a Boolean operator that is not in this list, use the Custom expression box with its function and smart editors to type in the operator.
c. Select the label for the second audit point from the third drop-down list. If you want to compare the first audit value to a constant instead of a second audit value, use the Custom expression box.

For example, to verify that the count of rows from the source table is equal to the rows in the target table, select audit labels and the Boolean operation in the expression editor as follows:

If you want to compare audit statistics for one or more objects against statistics for multiple other objects or a constant, select the Custom expression box.

a. Click the ellipsis button to open the full-size smart editor window.
b. Click the Variables tab on the left and expand the Labels node.
c. Drag the first audit label of the object to the editor pane.
d. Type a Boolean operator.
e. Drag the audit labels of the other objects to which you want to compare the audit statistics of the first object and place appropriate mathematical operators between them.
f. Click OK to close the smart editor.
g. The audit rule displays in the Custom editor. To update the rule in the top Auditing Rule box, click on the title "Auditing Rule" or on another option.
h. Click Close in the Audit window.

For example, to verify that the count of rows from the source table is equal to the sum of rows in the first three target tables, drag the audit labels, then type in the Boolean operation and plus signs in the smart editor as follows:
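For the Case_DF example, the custom rule built in the smart editor would read along the following lines (reconstructed from the Count labels defined earlier in this procedure; the screenshot in the original guide shows the equivalent expression):

$Count_ODS_CUSTOMER = $Count_R1 + $Count_R2 + $Count_R3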

4. Define the action to take if the audit fails. You can choose one or more of the following actions:

• Raise exception — The job fails if an audit rule fails and the error log shows which audit rule failed. This action is the default. If you clear this option and an audit rule fails, the job completes successfully and the audit does not write messages to the job log. You can view which rule failed in the Auditing Details report in the Metadata Reporting tool. For more information, see "Viewing audit results" on page 377.

• Email to list — Data Integrator sends a notification of which audit rule failed to the email addresses that you list in this option. Use a comma to separate the list of email addresses. You can specify a variable for the email list.

• Script — Data Integrator executes the script that you create in this option.

5. Execute the job. The Execution Properties window has the Enable auditing option checked by default. Uncheck this box if you do not want to collect audit statistics for this specific job execution.

6. Look at the audit results. If you turn on the audit trace on the Trace tab in the Execution Properties window, you can view all audit results on the Job Monitor Log. You can view passed and failed audit rules in the metadata reports. For details, see "Viewing audit results" on page 377.

Guidelines to choose audit points

The following are guidelines to choose audit points:

• When you audit the output data of an object, the Data Integrator optimizer cannot pushdown operations after the audit point. Therefore, if the performance of a query that is pushed to the database server is more important than gathering audit statistics from the source, define the first audit point on the query or later in the data flow.

For example, suppose your data flow has source, query, and target objects, and the query has a WHERE clause that is pushed to the database server that significantly reduces the amount of data that returns to Data Integrator. Define the first audit point on the query, rather than on the source, to obtain audit statistics on the query results.

• If a pushdown_sql function is after an audit point, Data Integrator cannot execute it.
• You can only audit a bulkload that uses the Oracle API method. For the other bulk loading methods, the number of rows loaded is not available to Data Integrator.
• Auditing is disabled when you run a job with the debugger.
• You cannot audit NRDM schemas or real-time jobs.
• You cannot audit within an SAP R/3 data flow, but you can audit the output of an SAP R/3 data flow.
• If you use the CHECKSUM audit function in a job that normally executes in parallel, Data Integrator disables the DOP for the whole data flow. The order of rows is important for the result of CHECKSUM, and DOP processes the rows in a different order than in the source.

Auditing embedded data flows

You can define audit labels and audit rules in an embedded data flow. This section describes the following considerations when you audit embedded data flows:

• Enabling auditing in an embedded data flow
• Audit points not visible outside of the embedded data flow

Enabling auditing in an embedded data flow

If a data flow is embedded at the beginning or at the end of the parent data flow, an audit function might exist on the output port or on the input port. If you want to collect audit statistics on an embedded data flow when you execute the parent data flow, you must enable the audit label of the embedded data flow.

To enable auditing in an embedded data flow

1. Open the parent data flow in the Data Integrator Designer workspace.
2. Click on the Audit icon in the toolbar to open the Audit window.
3. On the Label tab, expand the objects to display any audit functions defined within the embedded data flow.

The following Audit window shows an example of an embedded audit function that does not have an audit label defined in the parent data flow.

4. Right-click on the Audit function and choose Enable. You can also choose Properties to change the label name and enable the label.

5. You can define audit rules with the enabled label.

Audit points not visible outside of the embedded data flow

When you embed a data flow at the beginning of another data flow, data passes from the embedded data flow to the parent data flow through a single source. When you embed a data flow at the end of another data flow, data passes into the embedded data flow from the parent through a single target. In either case, some of the objects are not visible in the parent data flow. Because some of the objects are not visible in the parent data flow, the audit points on these objects are also not visible in the parent data flow.

For example, the following embedded data flow has an audit function defined on the source SQL transform and an audit function defined on the target table.

The following Audit window shows these two audit points.

When you embed this data flow, the target Output becomes a source for the parent data flow and the SQL transform is no longer visible. An audit point still exists for the entire embedded data flow, but the label is no longer applicable. The following Audit window for the parent data flow shows the audit function defined in the embedded data flow, but does not show an Audit Label.

If you want to audit the embedded data flow, right-click on the audit function in the Audit window and select Enable.

Resolving invalid audit labels

An audit label can become invalid in the following situations:

• If you delete the audit label in an embedded data flow that the parent data flow has enabled.
• If you delete or rename an object that had an audit point defined on it.

The following Audit window shows the invalid label that results when an embedded data flow deletes an audit label that the parent data flow had enabled.

To resolve invalid audit labels

1. Open the Audit window.
2. Expand the Invalid Labels node to display the individual labels.
3. Note any labels that you would like to define on any new objects in the data flow.
4. After you define a corresponding audit label on a new object, right-click on the invalid label and choose Delete.
5. If you want to delete all of the invalid labels at once, right-click on the Invalid Labels node and click on Delete All.

Viewing audit results

You can see the audit status in one of the following places:

• Job Monitor Log
• If the audit rule fails, the places that display audit information depend on the Action on failure option that you chose:

Action on failure | Places where you can view audit information
Raise exception   | Job Error Log, Metadata Reports
Email to list     | Email message, Metadata Reports
Script            | Wherever the custom script sends the audit messages, Metadata Reports

Job Monitor Log

If you set Audit Trace to Yes on the Trace tab in the Execution Properties window, audit messages appear in the Job Monitor Log. You can see messages for audit rules that passed and failed.

The following sample audit success messages appear in the Job Monitor Log when Audit Trace is set to Yes:

Audit Label $Count_R2 = 4. Data flow <Case_DF>.
Audit Label $CountError_R2 = 0. Data flow <Case_DF>.
Audit Label $Count_R1 = 5. Data flow <Case_DF>.
Audit Label $CountError_R1 = 0. Data flow <Case_DF>.
Audit Label $Count_R3 = 3. Data flow <Case_DF>.
Audit Label $CountError_R3 = 0. Data flow <Case_DF>.
Audit Label $Count_ODS_CUSTOMER = 12. Data flow <Case_DF>.
Audit Label $CountError_ODS_CUSTOMER = 0. Data flow <Case_DF>.
Audit Label $Count_R123 = 12. Data flow <Case_DF>.
Audit Label $CountError_R123 = 0. Data flow <Case_DF>.
Audit Rule passed ($Count_ODS_CUSTOMER = (($CountR1 + $CountR2 + $Count_R3)): LHS=12, RHS=12. Data flow <Case_DF>.
Audit Rule passed ($Count_ODS_CUSTOMER = $CountR123): LHS=12, RHS=12. Data flow <Case_DF>.

Job Error Log

When you choose the Raise exception option and an audit rule fails, the Job Error Log shows the rule that failed. The following sample message appears in the Job Error Log:

Audit rule failed <($Count_ODS_CUSTOMER = $CountR1)> for <Data flow Case_DF>.

Metadata Reports

You can look at the Audit Status column in the Data Flow Execution Statistics reports of the Metadata Report tool. This Audit Status column has the following values:

• Not Audited
• Passed — All audit rules succeeded. This value is a link to the Auditing Details report which shows the audit rules and values of the audit labels.
• Information Collected — This status occurs when you define audit labels to collect statistics but do not define audit rules. This value is a link to the Auditing Details report which shows the values of the audit labels.
• Failed — Audit rule failed. This value is a link to the Auditing Details report which shows the rule that failed and values of the audit labels.

For examples of these Metadata Reports, see the Data Integrator Management Console: Metadata Reports User's Guide.

Data Cleansing with Data Integrator Data Quality

Data Integrator Data Quality integrates the data cleansing functionality of the Business Objects Data Quality application with Data Integrator. This data cleansing functionality is initiated and viewed in the Data Integrator Designer. Data Quality Projects and datastores are imported into the Data Integrator Designer and used to call Data Quality Projects from a server. Data is passed to the Data Quality Projects, cleansed, and passed back to the Data Integrator job.

This section covers the following topics:

• Overview of Data Integrator Data Quality architecture
• Data Quality Terms and Definitions
• Creating a Data Quality datastore
• Importing Data Quality Projects
• Using the Data Quality transform
• Mapping input fields from the data flow to the project
• Creating custom projects

Overview of Data Integrator Data Quality architecture

Data Quality Projects are imported in a Data Quality datastore in Data Integrator and used as Data Quality transforms in data flows. At execution time, the data flow streams the input data to the Data Quality server, where the data is cleaned and then sent back to Data Integrator.

Data is passed to a running Data Quality workflow via a Data Integrator reader socket thread to the Data Quality socket based reader. Cleansed data is passed back to the Data Integrator job via the Data Quality socket based writer, where the cleansed data is further processed by the data flow.

The following diagram illustrates the flow of data from the Data Integrator source, the Data Integrator Query transform, the Data Integrator Data Quality transform, and the Data Quality server, via the reader and writer socket threads.

The remainder of this chapter explains how to use the Data Integrator Designer to implement data cleansing as provided by the above integration of Data Integrator and Data Quality.


Data Quality Terms and Definitions

Data Quality Datastore: A datastore that represents a connection to a Data Quality server. This datastore allows you to connect to the Data Quality server and to import Data Quality Projects (or blueprints). It contains imported projects that are available on the associated Data Quality server.

Data Integrator Data Quality Project: A reusable (first class) object that can be dropped onto data flows to provide a specific data cleansing function (as defined by the project). These objects are imported and grouped within a Data Quality datastore, which must be created first. An imported project, when used in a Data Integrator data flow, is usually called a transform.

Blueprint: A sample Data Quality Project that can be used by Data Integrator without modification.

Overview of steps to use Data Integrator Data Quality

Use the following steps to cleanse data in a Data Quality Project with Data Integrator:

1. Create a Data Quality datastore in Data Integrator.
2. Import a project from Data Quality into the datastore.
3. Call the imported Data Quality Project in a Data Integrator data flow as a transform.
4. Map the Input and Output fields.

These steps are explained in detail in the remaining sections of this chapter.

Creating a Data Quality datastore

Creating a Data Quality datastore is the first step in the data cleansing workflow.

To create a new Data Quality datastore

1. Click the Datastore tab in the Local Object Library.

2. Choose File > New > Datastore to invoke the Create New Datastore dialog box. Fill out the datastore properties window as shown in the following table:

Option          | Description
Datastore name  | Type a descriptive name for the new datastore.
Datastore type  | Choose BusinessObjects Data Quality from the drop-down list.
Server name     | Type the host name you selected when installing Data Quality. For example, if you installed to your local computer, type localhost here.
Port number     | Type the port number you use for the Data Quality server.
Repository path | Type the path to the configuration_rules folder. The default path to the configuration rules folder is: C:\dqxi\11_5\repository\configuration_rules. If you store your integrated batch projects elsewhere, provide that path here. Note: The "Repository Path" is a path on the machine where the Data Quality server runs (it might not be the same machine where the Data Integrator client is running).
Timeout         | Type a number that represents the maximum number of seconds to wait for a connection to be made between Data Integrator and Data Quality.

Note:
• This is the same process you use to set up any new datastore, but the values for the settings must coincide with the necessary Data Quality configurations.
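As a purely illustrative example (every value below is hypothetical and must be replaced with the settings of your own Data Quality installation), a completed dialog might contain:

Datastore name:  DQ_Server
Datastore type:  BusinessObjects Data Quality
Server name:     localhost
Port number:     5150
Repository path: C:\dqxi\11_5\repository\configuration_rules
Timeout:         60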

• The Data Quality server must be up and available when importing a Data Quality Project, or when a Data Integrator Data Quality job is being executed. You cannot start and stop the Data Quality server from Data Integrator.

Importing Data Quality Projects

After you have created the new Data Quality datastore, next import a project from the Data Quality server.

To import the Data Quality Project into Data Integrator

1. Double-click the datastore in the Local Objects Library. A list of XML files (Data Quality Projects and blueprints) appears in the Workspace on the right of the Designer window, as shown below.

2. Right-click the project you wish to import and choose Import, as shown below.

Figure 14-3: Importing a Data Quality Project from the Designer

Note: You can also import multiple projects at a time by holding the Shift key and selecting a range of projects.

Note that after you import the Data Quality Project, it appears as a child of the Data Quality datastore, as shown below:

Figure 14-4: Data Quality Projects shown as children in the Designer

After you import a Data Quality Project into Data Integrator, you can drag and drop it into a data flow to call it like any typical Data Integrator transform.

Find out how you can participate and help to improve our documentation. it behaves like any other Data Integrator transform. You can drill into the transform to set its properties and configure data mappings. a query.This document is part of a SAP study on PDF usage. and a file writer: Figure 14-5 :Data flow containg the Data Quality transform Data Integrator Designer Guide 385 . You can decide which fields are sent to the data quality engine and which fields bypass the Data Quality server. a data quality transform. Data Quality Using the Data Quality transform 14 Using the Data Quality transform After a Data Quality Project is imported and dropped onto a data flow. The following graphic shows a simple data flow that contains an input reader.

Drill into the Data Quality transform and click on the Properties tab to see the following view:

Figure 14-6: Data Quality transform Properties view

Enable Passthrough

Check this option if you need to define output fields that copy their data directly from the input without any cleansing performed. If there are fields that do not require cleansing, but should appear in the output, those fields should be identified as passthrough. When a field is identified as passthrough, its data will not be modified from its source to its target. After this option is enabled, you can drag fields from the upper left to the upper right window.

Note: Passthrough should not be used when the Data Quality Project changes the order of records, for example when a sorter is used or when the number of records is changed in the Data Quality Project. In these cases, Data Integrator will not match the passthrough fields with the correct original records.

Substitution File
The substitution file is used by the Data Quality Project during execution. Substitution files are located in the configuration_rules folder of your repository on the Data Quality server machine. If you do not specify a filename here, Data Integrator uses the filename Substitutions.xml, which might not exist on the server machine. See the Business Objects Data Quality Project Architect manual for more information about customizing your Data Quality Projects.

Mapping input fields from the data flow to the project
The following graphic displays the mapping view of a data quality transform that uses the AddressCleanseUSA project.
Figure 14-7: Mapping view of a data quality transform
The fields visible in the lower left window above are input fields expected by the project's socket reader. The Data Quality Project defines the name and the meaning of the input fields. The project also defines what data quality operations are performed on these fields. Note that any unmapped fields pass through the engine as an empty string.
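As a schematic illustration of this mapping (not Data Integrator code), the sketch below builds the record handed to the project's input fields, using the Address1 to ADDRESS_LINE1 mapping from the walkthrough that follows. The other column and field names, and the empty-string default for unmapped project fields, are illustrative assumptions based on the note above.

```python
# Schematic sketch of mapping data flow columns to Data Quality project input fields.
dataflow_row = {"ID": 1001, "Name": "Ann Lee", "Address1": "123 Main St"}

# Mapping from data flow columns to the project's input (socket reader) fields.
input_mapping = {"Address1": "ADDRESS_LINE1"}

# Hypothetical list of input fields the project expects; unmapped fields are sent
# with no value (the note above describes them as passing through as empty strings).
project_input_fields = ["ADDRESS_LINE1", "ADDRESS_LOCALITY1"]

dq_record = {field: "" for field in project_input_fields}
for column, field in input_mapping.items():
    dq_record[field] = dataflow_row[column]

print(dq_record)  # {'ADDRESS_LINE1': '123 Main St', 'ADDRESS_LOCALITY1': ''}
```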

To map input fields from the data flow to the Data Quality Project, drag columns from the upper left window to the lower left window. In the example above, the field Address1 is passed to the transform from the previous query. Address1 is then mapped to the Data Quality Project field ADDRESS_LINE1.
The fields visible in the lower right window, below, are the output fields from the Data Quality Project's socket writer. To map the cleansed output fields from the project back to the data flow, drag columns from the lower right window to the upper right window (shown in the picture above). Note that not all fields need to be mapped. Unmapped fields will be passed as NULL values to the Data Quality server.
You can change the name of the output columns, but not the data type; the project defines the field names. Also, you can add a description here to help document your output fields.
If passthrough is enabled, you can also map your passthrough columns with this view. To create a passthrough column, drag the field from the upper left window to the upper right window. In the example above, the fields "ID", "Name", and "Company" are defined as passthrough.
To examine the final output schema, drill into the query transform from the data flow view. The following view of the output schema is displayed:

Figure 14-8: Final output schema

Creating custom projects
You can use the Data Quality blueprints, if they suit your needs, or you can create custom Data Quality Projects in the Data Quality Project Architect.
To create a custom project
1. Open the Data Quality Project Architect.
   Note: To open the Data Quality Project Architect from Data Integrator at any time, you can also right-click on a project and choose Launch Data Quality Project Architect.
2. Choose New > Project > Integrated batch project.
3. Drag and drop the transforms or compound transforms you need onto the canvas to begin creating your project.

Figure 14-9: Data Quality Project Architect menu
This drop-down menu only appears if the Data Quality Project Architect application is installed on the same machine as Data Integrator. When the Project Architect is launched, it opens the selected project.

Using integrated batch projects
Data Integrator uses Data Quality Projects of the type "integrated batch project" for data cleansing. For more information, see your Business Objects Data Quality user documentation. In order to use integrated batch projects, you must follow the rules explained below when you define the Data Quality Project.
Adhere to the following rules when you set up an integrated batch project:
•  Number of threads. By default, all transforms are set to run on one thread. Plugins are set to zero threads, which is a requirement for using passaround. (This is set in the Common > Performance > Num_Of_Threads option.) You should not change these settings if the output order of records must match the input order of records. Setting threads to a number greater than one can result in records being output in a different order.

   For example, suppose you set the number of threads to 2 in the Match transform, and the first collection contains 1000 records and the second collection contains two records. Each thread processes a collection, and the second collection will most likely finish processing before the first. The records at this point are not in their original order. However, if order is of secondary importance to performance, and if you are able to use passthrough, then you can set the number of threads to a number greater than one.
•  If you use a Sorter transform, the Misc_Options > Input_Mode option must be set to Batch.
•  You can adjust the Transform_Performance > Buffer_Size_Kilobytes option value to increase performance.
•  Do not use the Unique_ID or Observer transform in your projects. The functionality available in these transforms can be replicated using the Data Integrator Query transform.

Data Quality blueprints for Data Integrator
Business Objects Data Quality provides Data Integrator users with Data Quality blueprints to set up your Data Quality Projects. These blueprints reside on the Data Quality server. After creating a datastore to connect with the Data Quality repository, you will have a list of blueprints to choose from. If you must edit a blueprint, you must edit it in the Data Quality Project Architect. You can access the Project Architect through the Start menu or from within Data Integrator (see "Creating custom projects" on page 389).
The names and descriptions are listed below:
address_cleanse_usa: A sample integrated batch project configured to cleanse address data in the USA.
address_data_cleanse_usa: A sample integrated batch project configured to cleanse address data in the USA, and to cleanse name, firm, title, phone, date, SSN, and email data using the English-based USA data quality rules.
consumer_match_family_name_address_usa_full: A sample integrated batch project configured to cleanse consumer data and identify matching data records based on similar family name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.

consumer_match_family_name_address_usa_pass1: A sample integrated batch project configured to cleanse consumer data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar family name and address data.
consumer_match_family_name_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar family name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
consumer_match_name_address_usa_full: A sample integrated batch project configured to cleanse consumer data and identify matching data records based on similar name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
consumer_match_name_address_usa_pass1: A sample integrated batch project configured to cleanse consumer data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar name and address data.
consumer_match_name_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar name and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
corporate_match_firm_address_usa_full: A sample integrated batch project configured to cleanse corporate data and identify matching data records based on similar firm and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
corporate_match_firm_address_usa_pass1: A sample integrated batch project configured to cleanse corporate data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar firm and address data.
corporate_match_firm_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar firm and address data. The output data includes all records from the data source, with match results fields providing information on the match process.
corporate_match_name_firm_address_usa_full: A sample integrated batch project configured to cleanse corporate data and identify matching data records based on similar name, firm, and address data. The output data includes all records from the data source, with match results fields providing information on the match process.

corporate_match_name_firm_address_usa_pass1: A sample integrated batch project configured to cleanse corporate data and generate a break group key, preparing data to be used in the pass2 project to identify matching data records based on similar name, firm, and address data.
corporate_match_name_firm_address_usa_pass2: A sample integrated batch project configured to read the data prepared in the pass1 project and identify matching data records based on similar name, firm, and address data.
data_cleanse_usa: A sample integrated batch project configured to cleanse name, firm, title, phone, date, SSN, and email data using the English-based USA data quality rules.

Mapping blueprint fields
The following table lists the extra fields in the integrated batch Reader transforms of the Data Quality blueprints. If you want to use any of these, you must map them.
Note: The maximum field length allowed to be passed from Data Integrator to Data Quality is 512 characters (1024 bytes). Fields mapped from Data Integrator to Data Quality that are larger than that will generate an error at runtime.
Address_Locality3: City, town, or suburb.
Address_Post_Office_Box_Number: Post office box number.
Address_Primary_Name1: Street name data.
Address_Primary_Postfix1: Address data that comes at the end of a street name, such as a directional.
Address_Primary_Prefix1: Address data that comes at the beginning of a street name, such as a directional.
Address_Primary_Type: Data that tells what type of street it is (street, boulevard, lane, and so on).
Firm1_Firm_Name_Match_STD1-3: Firm match standards.
Match_Apply_Blank_Penalty: A field that contains the indicator to apply blank penalties.
Match_Data_Source_ID: Specifies the source ID, which match reports use for identifying statistics about match groups.

Match_Perform_Data_Salvage: Specifies the indicator (Y/N) for performing a data salvage.
Match_Person1_Gender, Match_Person2_Gender, Match_Person3_Gender: Specifies the gender of the persons in your data record (up to three persons).
Match_Priority: Specifies the best record priority of the record.
Match_Qualification_Table: Specifies the values for use in qualification tables (driver and passenger ID, for example).
Match_Source_ID: The Data_Source_ID value can be used to fill this, but you can use something else as well. If you are familiar with Match/Consolidate, this equates to the List_ID value. Many times the same data is used for both; however, there may be times when a reader is not enough for Source_ID identification. A user may want to qualify multiple Source_IDs within a reader source. The Data_Source_ID, however, is tied to the metadata repository (MDR) statistics: Data_Source_ID is specific to the metadata repository report statistics that are generated, while Match_Source_ID is specific to statistics within match groups themselves.
Person1_GivenName2_Match_STD2, Person1_GivenName2_Match_STD3: The second and third match standards for the given (middle) name of the first person in the data record.
Person2_GivenName1_Match_STD1-3: Given name1 (first name) match standards for the second person in your data record.
Person2_GivenName2_Match_STD1-3: Match standards for the given (middle) name of the second person in the data record.
Person3_GivenName1_Match_STD1-3: Given name1 (first name) match standards for the third person in your data record.
Person3_GivenName2_Match_STD1-3: Match standards for the given (middle) name of the third person in the data record.
Person1_Honorary_Postname, Person2_Honorary_Postname, Person3_Honorary_Postname: Honorary postname for up to three persons in the data record, indicating certification, academic degree, or affiliation (CPA, for example).

UDPM1-4: Input of data that you have defined in your pattern file. For example, CN244-56.
User_Defined_01-20: Use these fields to map custom fields that you want to pass into Data Quality but that do not have a Data Quality counterpart.

Data Quality documentation
For information about using Data Quality, including setting up projects, mapping fields, creating substitution files, and so on, refer to the Data Quality documentation found in Start > Programs > BusinessObjects XI Release 2 > Data Quality 11.5 > Documentation.


Design and Debug

About this chapter
This chapter covers the following Designer features that you can use to design and debug jobs:
•  Use the View Where Used feature to determine the impact of editing a metadata object (for example, a table). See which data flows use the same object.
•  Use the View Data feature to view sample source, transform, and target data in a data flow after a job executes.
•  Use the Interactive Debugger to set breakpoints and filters between transforms within a data flow and view job data row-by-row during a job execution.
•  Use the Difference Viewer to compare the metadata for similar objects and their properties.
•  Use the auditing data flow feature to verify that correct data is processed by a source, transform, or target object. For more information, see "Using Auditing" on page 362.
This chapter contains the following topics:
•  Using View Where Used
•  Using View Data
•  Using the interactive debugger
•  Comparing Objects
•  Calculating usage dependencies

Using View Where Used
When you save a job, work flow, or data flow, Data Integrator also saves the list of objects used in them in your repository. Parent/child relationship data is preserved. For example, when the following parent data flow is saved, Data Integrator also saves pointers between it and its three children:
•  a table source
•  a query transform
•  a file target

for example. For example. will have on other data flows that are using the same table. you may need to delete a source table definition and re-import the table (or edit the table schema). To access parent/child relationship information from the object library View an object in the object library to see the number of times that it has been used. Design and Debug Using View Where Used 15 You can use this parent/child relationship data to determine what impact a table change. find all the data flows that are also using the table and update them as needed. 1.This document is part of a SAP study on PDF usage. while maintaining a data flow. The data can be accessed using the View Where Used option. From the object library You can view how many times an object is used and then view where it is used. To access the View Where Used option in the Designer you can work from the object library or the workspace. Find out how you can participate and help to improve our documentation. Data Integrator Designer Guide 399 . Before doing this.

   The Usage Count column is displayed on all object library tabs except:
   •  Projects
   •  Jobs
   •  Transforms
   Click the Usage Count column heading to sort values, for example, to find objects that are not used.
2. If the Usage count is greater than zero, right-click the object and select View Where Used.

Find out how you can participate and help to improve our documentation. For example.This document is part of a SAP study on PDF usage.: • • Source Target Data Integrator Designer Guide 401 . Design and Debug Using View Where Used 15 The Output window opens. Table DEPT is used by data flow DF1. flat files. etc. table DEPT is used as a Source. The type and name of the selected object is displayed in the first column’s heading. Other possible values for the As column are: • For XML files and messages. in data flow DF1. tables. in the following example: The As column provides additional context. The Information tab displays rows for each parent of the object you selected. The As column tells you how the selected object is used by the parent.

   •  For flat files and tables only:
      Lookup(): Translate table/file used in a lookup function
      Lookup_ext(): Translate table/file used in a lookup_ext function
      Lookup_seq(): Translate table/file used in a lookup_seq function
   •  For tables only:
      Comparison: Table used in the Table Comparison transform
      Key Generation: Table used in the Key Generation transform
3. From the Output window, double-click a parent object.
   The workspace diagram opens, highlighting the child object the parent is using.
   Once a parent is open in the workspace, you can double-click a row in the output window again.
   •  If the row represents a different parent, the workspace diagram for that object opens.

   •  If the row represents a child object in the same parent, this object is simply highlighted in the open diagram.
      This is an important option because a child object in the Output window might not match the name used in its parent. You can customize workspace object names for sources and targets. Data Integrator saves both the name used in each parent and the name used in the object library. The Information tab on the Output window displays the name used in the object library. The names of objects used in parents can only be seen by opening the parent in the workspace.

From the workspace
From an open diagram of an object in the workspace (such as a data flow), you can view where a parent or child object is used:
•  To view information for the open (parent) object, select View > Where Used, or from the tool bar, select the View Where Used button. The Output window opens with a list of jobs (parent objects) that use the open data flow.
•  To view information for a child object, right-click an object in the workspace diagram and select the View Where Used option. The Output window opens with a list of parent objects that use the selected object. For example, if you select a table, the Output window displays a list of data flows that use the table.

Limitations
•  This feature is not supported in central repositories.
•  Only parent and child pairs are shown in the Information tab of the Output window. For example, for a table, a data flow is the parent. If the table is also used by a grandparent (a work flow, for example), these are not listed in the Output window display for a table. To see the relationship between a data flow and a work flow, open the work flow in the workspace, then right-click a data flow and select the View Where Used option.
•  The Designer counts an object's usage as the number of times it is used for a unique purpose. For example, in data flow DF1, if table DEPT is used as a source twice and a target once, the object library displays its Usage count as 2. This occurrence should be rare; for example, a table is not often joined to itself in a job design.
•  Transforms are not supported. This includes custom ABAP transforms that you might create to support an SAP R/3 environment.
•  Data Integrator does not save parent/child relationships between functions.
   •  If function A calls function B, and function A is not in any data flows or scripts, the usage count in the object library will be zero for both functions. The fact that function B is used once in function A is not counted.
   •  If function A is saved in one data flow, the usage count in the object library will be 1 for both functions A and B.
You can also use the Metadata Reports tool to run a Where Used dependency report for any object. This report lists all related objects, not just parent/child pairs. For more information, see "Where Used" on page 465.

Using View Data
View Data provides a way to scan and capture a sample of the data produced by each step in a job, even when the job does not execute successfully. View imported source data, changed data from transformations, and ending data at your targets. At any point after you import a data source, you can check on the status of that data, both before and after processing your data flows.

Armed with data details, you can create higher quality job designs. You can scan and analyze imported table and file data from the object library as well as see the data for those same objects within existing jobs. Using one or more View Data panes, you can view and compare sample data from different steps. View Data information is displayed in embedded panels for easy navigation between your flows and the data.
Use View Data to look at:
•  Sources and targets. View Data allows you to see data before you execute a job. Of course, after you execute the job, you can refer back to the source data again. (See "Viewing data in the workspace" on page 406 for more information.)
•  Transforms (For more information, see "Viewing data passed by transforms" on page 435.)
•  Lines in a diagram (For more information, see "Using the interactive debugger" on page 418.)
Note: View Data is not supported for SAP R/3 IDocs. For SAP R/3 and PeopleSoft, the Table Profile tab and Column Profile tab options are not supported for hierarchies.
The topics in this section include:
•  Accessing View Data
•  Viewing data in the workspace
•  View Data properties
•  View Data tabs

Accessing View Data
There are multiple places throughout Designer where you can open a View Data pane.

Sources and targets
You can view data for sources and targets from two different locations:
•  View Data button. View Data buttons appear on source and target objects when you drag them into the workspace. Click the View data button (magnifying glass icon) to open a View Data pane for that source or target object.
•  Object library. View Data in potential source or target objects from the Datastores or Formats tabs. There are two ways to open a View Data pane from the object library:
   •  Right-click a table object and select View Data.
   •  Right-click a table and select Open or Properties. The Table Metadata, XML Format Editor, or Properties window opens. From any of these windows, you can select the View Data tab.
To view data for a table, the table must be from a supported database. To view data for a file, the file must physically exist and be available from your computer's operating system.
Transforms
To view data after transformation, see "Viewing data passed by transforms" on page 435.

Viewing data in the workspace
View Data can be accessed from the workspace when magnifying glass buttons appear over qualified objects in a data flow. This means:

•  For sources and targets, files must physically exist and be accessible, and tables must be from a supported database.
•  For transforms, see "Viewing data passed by transforms" on page 435.
To open a View Data pane in the Designer workspace, click the magnifying glass button on a data flow object. A large View Data pane appears beneath the current workspace area.
You can open two View Data panes for simultaneous viewing. Click the magnifying glass button for another object and a second pane appears below the workspace area (note that the first pane area shrinks to accommodate the presence of the second pane). When both panes are filled and you click another View Data button, a small menu appears containing window placement icons. The black area in each icon indicates the pane you want to replace with a new set of data.

Click a menu option and the data from the latest selected object replaces the data in the corresponding pane. The placement options are Replace left pane and Replace right pane.
The description or path for the selected View Data button displays at the top of the pane:
•  For sources and targets, the description is the full object name:
   •  ObjectName(Datastore.Owner) for tables
   •  FileName(File Format Name) for files
•  For View Data buttons on a line, the path consists of the object name on the left, an arrow, and the object name to the right. For example, if you select a View Data button on the line between the query named Query and the target named ALVW_JOBINFO(joes.DI_REPO), the path would indicate: Query -> ALVW_JOBINFO(Joes.DI_REPO)
You can also find the View Data pane that is associated with an object or line by:
•  Rolling your cursor over a View Data button on an object or line. The Designer highlights the View Data pane for the object.
•  Looking for grey View Data buttons on objects and lines. The Designer displays View Data buttons on open objects with grey rather than white backgrounds.

View Data properties
You can access View Data properties from tool bar buttons or the right-click menu.

View Data displays your data in the rows and columns of a data grid. The number of rows displayed is determined by a combination of several conditions:
•  Sample size: the number of rows sampled in memory. Default sample size is 1000 rows for imported source and target objects; maximum sample size is 5000 rows. Set sample size for sources and targets from Tools > Options > Designer > General > View Data sampling size. When using the interactive debugger, Data Integrator uses the Data sample rate option instead of sample size. For more information, see "Starting and stopping the interactive debugger" on page 424.
•  Filtering
•  Sorting
If your original data set is smaller, or if you use filters, the number of returned rows could be less than the default. You can see which conditions have been applied in the navigation bar.

Filtering
You can focus on different sets of rows in a local or new data sample by placing fetch conditions on columns.
To view and add filters
1. In the View Data tool bar, click the Filters button, or right-click the grid and select Filters.
   The Filters window opens.
2. Create filters.

   The Filters window has three columns. Each row in this window is considered a filter.
   a. Column: Select a name from the first column. Select {remove filter} to delete the filter.
   b. Operator: Select an operator from the second column.
   c. Value: Enter a value in the third column that uses one of the following data type formats:
      integer, double, real: standard
      date: yyyy.mm.dd
      time: hh24:mm:ss
      datetime: yyyy.mm.dd hh24:mm:ss
      varchar: 'abc'
3. In the Concatenate all filters using list box, select an operator (AND, OR) for the engine to use in concatenating filters.

4. To see how the filter affects the current set of returned rows, click Apply.
5. To save filters and close the Filters window, click OK.
   Your filters are saved for the current object, and the local sample updates to show the data filtered as specified in the Filters dialog. To use filters with a new sample, see "Using Refresh" on page 411.
To remove filters from an object, go to the View Data tool bar and click the Remove Filters button, or right-click the grid and select Remove Filters. All filters are removed for the current object.
To add a filter for a selected cell
1. Select a cell from the sample data grid.
2. In the View Data tool bar, click the Add Filter button, or right-click the cell and select Add Filter.
   The Add Filter option adds the new filter condition, <column> = <cell value>, then opens the Filters window so you can view or edit the new filter.
3. Click OK.

Sorting
You can click one or more column headings in the data grid to sort your data. An arrow appears on the heading to indicate sort order: ascending (up arrow) or descending (down arrow). To change sort order, click the column heading again. The priority of a sort is from left to right on the grid.
To remove sorting for an object, from the tool bar click the Remove Sort button, or right-click the grid and select Remove Sort. To use sorts with a new sample, see "Using Refresh" on page 411.

Using Refresh
To fetch another data sample from the database using new filter and sort settings, use the Refresh command. After you edit filtering and sorting, in the tool bar click the Refresh button, or right-click the data grid and select Refresh. To stop a refresh operation, click the Stop button. While Data Integrator is refreshing the data, all View Data controls except the Stop button are disabled.

Using Show/Hide Columns
You can limit the number of columns displayed in View Data by using the Show/Hide Columns option from:
•  The tool bar.
•  The right-click menu.
•  The arrow shortcut menu, located to the right of the Show/Hide Columns tool bar button. Select a column to display it. This option is only available if the total number of columns in the table is ten or fewer.
You can also "quick hide" a column by right-clicking the column heading and selecting Hide from the menu.
To show or hide columns
1. Click the Show/Hide Columns tool bar button, or right-click the data grid and select Show/Hide Columns.
   The Column Settings window opens.
2. Select the columns that you want to display, or click one of the following buttons: Show, Show All, Hide, or Hide All.
3. Click OK.

Opening a new window
To see more of the data sample that you are viewing in a View Data pane, open a full-sized View Data window. From any View Data pane, click the Open Window tool bar button to activate a separate, full-sized View Data window. Alternatively, you can right-click and select Open in new window from the menu.

View Data tool bar options
The following options are available on View Data panes.
Open in new window: Opens the View Data pane in a larger window. See "Opening a new window" on page 412.
Save As: Saves the data in the View Data pane.
Print: Prints View Data pane data.
Copy Cell: Copies View Data pane cell data.
Refresh data: Fetches another data sample from existing data in the View Data pane using new filter and sort settings. See "Using Refresh" on page 411.
Open Filters window: Opens the Filters window. See "Filtering" on page 409.
Add a Filter: See "To add a filter for a selected cell" on page 411.
Remove Filter: Removes all filters in the View Data pane.
Remove Sort: Removes sort settings for the object you select. See "Sorting" on page 411.
Show/hide navigation: Shows or hides the navigation bar which appears below the data table.
Show/hide columns: See "Using Show/Hide Columns" on page 412.

View Data tabs
The View Data panel for objects contains three tabs:
•  Data tab

•  Profile tab
•  Column Profile tab
Use tab options to give you a complete profile of a source or target object. The Data tab is always available. The Profile and Relationship tabs are supported with the Data Profiler (see "Viewing the profiler results" on page 350 for more information). Without the Data Profiler, the Profile and Column Profile tabs are supported for some sources and targets (see release notes for more information).

Data tab
The Data tab allows you to use the properties described in "View Data properties" on page 408. It also indicates nested schemas such as those used in XML files and messages. When a column references nested schemas, that column is shaded yellow and a small table icon appears in the column heading.
To view a nested schema
1. Double-click a cell.
   The data grid updates to show the data in the selected cell or nested table.
   Alternatively:
   a. Click a cell in a marked column.
   b. The Drill Down button (an ellipsis) appears in the cell.
   c. Click the Drill Down button.

In the Schema area, the selected cell value is marked by a special icon. Also, tables and columns in the selected path are displayed in blue, while nested schema references are displayed in grey.
In the Data area, data is shown for columns. Nested schema references are shown in angle brackets, for example <CompanyName>.
Continue to use the data grid side of the panel to navigate. For example:
•  Select a lower-level nested column and double-click a cell to update the data grid.
•  Click the Drill Up button at the top of the data grid to move up in the hierarchy.
•  See the entire path to the selected column or table displayed to the right of the Drill Up button. Use the path and the data grid to navigate through nested schemas.

Profile tab
If you use the Data Profiler, the Profile tab displays the profile attributes that you selected on the Submit Column Profile Request option. For details, see "Executing a profiler task" on page 342.
The Profile tab allows you to calculate statistical information for any set of columns you choose. This optional feature is not available for columns with nested schemas or for the LONG data type.
To use the Profile tab without the Data Profiler
1. Select one or more columns.

   Select only the column names you need for this profiling operation because Update calculations impact performance. You can also right-click to use the Select All and Deselect All menu options.
2. Click Update.
3. The statistics display in the Profile grid.
The grid contains six columns:
Column: Names of columns in the current table. Select names from this column, then click Update to populate the profile grid.
Distinct Values: The total number of distinct values in this column.
NULLs: The total number of NULL values in this column.
Min: Of all values, the minimum value in this column.
Max: Of all values, the maximum value in this column.
Last Updated: The time that this statistic was calculated.
Sort values in this grid by clicking the column headings. Note that Min and Max columns are not sortable.
In addition to updating statistics, you can click the Records button on the Profile tab to count the total number of physical records in the object you are profiling.
Data Integrator saves previously calculated values in the repository and displays them until the next update.
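To make the contents of the Profile grid concrete, the following minimal sketch computes the same four statistics over an invented in-memory sample. It is plain Python for illustration only, not Data Integrator code, and the column names and rows are assumptions.

```python
# Sketch of the Profile grid statistics (Distinct Values, NULLs, Min, Max)
# computed over an invented sample; not Data Integrator code.
from datetime import datetime

rows = [
    {"CITY": "Tulsa", "ZIP": "74008"},
    {"CITY": "Tulsa", "ZIP": None},
    {"CITY": "Derby", "ZIP": "67037"},
]

def profile(rows, column):
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "Column": column,
        "Distinct Values": len(set(non_null)),
        "NULLs": values.count(None),
        "Min": min(non_null) if non_null else None,
        "Max": max(non_null) if non_null else None,
        "Last Updated": datetime.now().isoformat(timespec="seconds"),
    }

for col in ("CITY", "ZIP"):
    print(profile(rows, col))
```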

Column Profile tab
The Column Profile tab allows you to calculate statistical information for a single column. If you use the Data Profiler, the Relationship tab displays instead of the Column Profile (see "Viewing relationship profile data" on page 354 for more information).
Note: This optional feature is not available for columns with nested schemas or the LONG data type.
To calculate value usage statistics for a column
1. Enter a number in the Top box.
   This number is used to find the most frequently used values in the column. The default is 10, which means that Data Integrator returns the top 10 most frequently used values.
2. Select a column name in the list box.
3. Click Update.
   The Column Profile grid displays statistics for the specified column. The grid contains three columns:
   Value: A "top" (most frequently used) value found in your specified column, or "Other" (remaining values that are not used as frequently).
   Total: The total number of rows in the specified column that contain this value.
   Percentage: The percentage of rows in the specified column that have this value compared to the total number of values in the column.
   Data Integrator returns a number of values up to the number specified in the Top box, plus an additional value called "Other." So, if you enter 5 in the Top box, you will get up to 6 returned values (the top 5 used values in the specified column, plus the "Other" category).
   Results are saved in the repository and displayed until you perform a new update.
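The following standalone Python sketch illustrates the calculation the Column Profile grid reports: the Top-N most frequent values plus an "Other" bucket, with totals and percentages. It is not Data Integrator code; the sample values are invented and chosen only to mirror the interpretation discussed in the next paragraph.

```python
# Sketch of the Column Profile calculation: Top-N values plus an "Other" bucket.
from collections import Counter

values = (["Item3"] * 500 + ["Item2"] * 200 + ["Item1"] * 100
          + ["Item4"] * 100 + ["ItemA"] * 60 + ["ItemB"] * 40)   # 1000 rows total
top_n = 4

counts = Counter(values)
top = counts.most_common(top_n)                       # the "top" values
other_total = len(values) - sum(c for _, c in top)    # everything else -> "Other"

for value, count in top + [("Other", other_total)]:
    pct = 100 * count / len(values)
    print(f"{value:8} Total={count:4}  Percentage={pct:.0f}%")
```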

For example, statistical results in the preceding Column Profile grid indicate that of the four most frequently used values in the Name column, 50 percent use the value Item3, 20 percent use the value Item2, and so on. For this example, the total number of rows counted during the calculation for each top value is 1000. You can also see that the four most frequently used values (the "top four") are used in 90 percent of all cases, as only 10 percent is shown in the Other category.

Using the interactive debugger
The Designer includes an interactive debugger that allows you to examine and modify data row-by-row (during a debug mode job execution) by placing filters and breakpoints on lines in a data flow diagram. The interactive debugger provides powerful options to debug a job.
Topics in this section are:
•  Before starting the interactive debugger
•  Starting and stopping the interactive debugger
•  Windows
•  Menu options and tool bar
•  Viewing data passed by transforms
•  Push-down optimizer
•  Limitations
Note: A repository upgrade is required to use this feature.

Before starting the interactive debugger
Like executing a job, you can start the interactive debugger from the Debug menu when a job is active in the workspace. Select Start debug, set properties for the execution, then click OK. The debug mode begins.

Debug mode provides the interactive debugger's windows, menus, and tool bar buttons that you can use to control the pace of the job and view data by pausing the job execution using filters and breakpoints.
While in debug mode, all other Designer features are set to read-only. To exit the debug mode and return other Designer features to read/write, click the Stop debug button on the interactive debugger toolbar.
All interactive debugger commands are listed in the Designer's Debug menu. The Designer enables the appropriate commands as you progress through an interactive debugging session.
Before you start a debugging session, you might want to set the following:
•  Filters and breakpoints
•  Interactive debugger port between the Designer and an engine

Setting filters and breakpoints
You can set any combination of filters and breakpoints in a data flow before you start the interactive debugger. The debugger uses the filters and pauses at the breakpoints you set.
If you do not set predefined filters or breakpoints:
•  The Designer will optimize the debug job execution. This often means that the first transform in each data flow of a job is pushed down to the source database. Consequently, you cannot view the data in a job between its source and the first transform unless you set a predefined breakpoint on that line. For more information, see "Push-down optimizer" on page 435.
•  You can pause a job manually by using a debug option called Pause Debug (the job pauses before it encounters the next transform).
To set a filter or breakpoint
1. In the workspace, open the job that you want to debug.
2. Open one of its data flows.
3. Right-click the line that you want to examine and select Set Filter/Breakpoint. A line is a line between two objects in a workspace diagram.

   Alternatively, you can double-click the line, or click the line and then:
   •  Press F9
   •  Click the Set Filter/Breakpoint button on the tool bar
   •  Select Debug > Set Filter/Breakpoint.
   The Breakpoint window opens. Its title bar displays the objects to which the line connects. For example, the following window represents the line between AL_ATTR (a source table) and Query (a Query transform).
4. Set and enable a filter or a breakpoint using the options in this window.

5. Click OK.
A debug filter functions as a simple Query transform with a WHERE clause. Use a filter to reduce a data set in a debug job execution. Note that complex expressions are not supported in a debug filter. Place a debug filter on a line between a source and a transform or two transforms.
A breakpoint is the location where a debug job execution pauses and returns control to you. Like a filter, you can set a breakpoint between a source and transform or two transforms. Choose to use a breakpoint with or without conditions:
•  If you use a breakpoint without a condition, the job execution pauses for the first row passed to the breakpoint.
•  If you use a breakpoint with a condition, the job execution pauses for the first row passed to the breakpoint that meets the condition. A breakpoint condition applies to the after image for UPDATE, NORMAL, and INSERT row types and to the before image for a DELETE row type.
Instead of selecting a conditional or unconditional breakpoint, you can also use the Break after 'n' row(s) option. In this case, the execution pauses when the number of rows you specify pass through the breakpoint.
If you set a filter and a breakpoint on the same line, Data Integrator applies the filter first. The breakpoint can only see the filtered rows.
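The following standalone Python sketch mimics the behavior just described: the debug filter is applied first, a conditional breakpoint only sees the filtered rows, and a Break after 'n' rows setting counts rows arriving at the breakpoint. It illustrates the semantics only; it is not Data Integrator code, and the rows, column names, and conditions are invented.

```python
# Conceptual sketch of debug filter and breakpoint interaction on one line of a data flow.
rows = [
    {"DEPT": "QA",    "SALARY": 30000},
    {"DEPT": "SALES", "SALARY": 45000},
    {"DEPT": "SALES", "SALARY": 80000},
    {"DEPT": "SALES", "SALARY": 52000},
]

def debug_filter(row):           # like a WHERE clause: only SALES rows reach the breakpoint
    return row["DEPT"] == "SALES"

def breakpoint_condition(row):   # conditional breakpoint
    return row["SALARY"] > 50000

break_after_n_rows = None        # alternative: pause after n filtered rows instead

passed = 0
for row in rows:
    if not debug_filter(row):    # the filter is applied first,
        continue                 # so the breakpoint never sees this row
    passed += 1
    if break_after_n_rows is not None:
        if passed >= break_after_n_rows:
            print("pause at breakpoint (row count reached):", row)
            break
    elif breakpoint_condition(row):
        print("pause at breakpoint (condition met):", row)
        break
```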

Data Integrator indicates the following filter and breakpoint conditions with icons on the line:
•  Breakpoint disabled
•  Breakpoint enabled
•  Filter disabled
•  Filter enabled
•  Filter and breakpoint disabled
•  Filter and breakpoint enabled
•  Filter enabled and breakpoint disabled
•  Filter disabled and breakpoint enabled
In addition to the filter and breakpoint icons that can appear on a line, the debugger highlights a line when it pauses there. A red locator box also indicates your current location in the data flow. For example, when you start the interactive debugger, the job pauses at your breakpoint. The locator box appears over the breakpoint icon, as shown in the following diagram.
A View Data button also appears over the breakpoint. You can use this button to open and close the View Data panes. For more information, see "Windows" on page 427.
As the debugger steps through your job's data flow logic, it highlights subsequent lines and displays the locator box at your current position.

Changing the interactive debugger port
The Designer uses a port to an engine to start and stop the interactive debugger. The interactive debugger port is set to 5001 by default.
To change the interactive debugger port setting
1. Select Tools > Options > Designer > Environment.
2. Enter a value in the Interactive Debugger box.
3. Click OK.

Starting and stopping the interactive debugger
A job must be active in the workspace before you can start the interactive debugger. You can select a job from the object library or from the project area to activate it in the workspace. Once a job is active, the Designer enables the Start Debug option on the Debug menu and tool bar.
To start the interactive debugger
1. In the project area, right-click a job and select Start debug.
   Alternatively, in the project area you can click a job and then:
   •  Press Ctrl+F8
   •  From the Debug menu, click Start debug
   •  Click the Start debug button on the tool bar.
   The Debug Properties window opens.

The Debug Properties window includes three parameters similar to the Execution Properties window (used when you just want to run a job). See the Data Integrator Reference Guide for a description of the Monitor sample rate, Print all trace messages, and Job Server options. You will also find more information there about the Trace and Global Variable options.
The options unique to the Debug Properties window are:
•  Data sample rate: The number of rows cached for each line when a job executes using the interactive debugger. For example, in the following data flow diagram, if the source table has 1000 rows and you set the Data sample rate to 500, then the Designer displays up

   to 500 of the last rows that pass through a selected line. The debugger displays the last row processed when it reaches a breakpoint.
•  Exit the debugger when the job is finished: Click to stop the debugger and return to normal mode after the job executes. Defaults to cleared.
2. Enter the debug properties that you want to use, or use the defaults.
3. Click OK.
The job you selected from the project area starts to run in debug mode. The Designer:
•  Displays the interactive debugger windows.
•  Adds Debugging Job <JobName> to its title bar.
•  Enables the appropriate Debug menu and tool bar options.
•  Displays the debug icon in the status bar.
•  Sets the user interface to read-only.
Note: You cannot perform any operations that affect your repository (such as dropping objects into a data flow) when you execute a job in debug mode.
When the debugger encounters a breakpoint, it pauses the job execution. You now have control of the job execution. The interactive debugger windows display information about the job execution up to this point. They also update as you manually step through the job or allow the debugger to continue the execution.
To stop a job in debug mode and exit the interactive debugger
Click the Stop Debug button on the tool bar, press Shift+F8, or from the Debug menu, click Stop debug.

Windows
When you start a job in the interactive debugger, the Designer displays three additional windows as well as the View Data panes beneath the workspace:
•  Call Stack window
•  Trace window
•  Variable window
Each window is docked in the Designer's window. The following diagram shows the default locations for these windows, with their control bars and control buttons.
The Designer saves the layout you create when you stop the interactive debugger; your layout is preserved for your next Designer session. To move a debugger window, double-click the window's control bar to release it, then click and drag its title bar to re-dock it. You can resize or hide a debugger window using its control buttons. To show or hide a debugger window manually, use the Debug menu or the tool bar. See "Menu options and tool bar" on page 433.

Call Stack window
The Call Stack window lists the objects in the path encountered so far (before either the job completes, encounters a breakpoint, or you pause it).

For example, the following Call Stack window indicates that the data you are currently viewing is in a data flow called aSimple and shows that the path taken began with a job called Simple and passed through a condition called Switch before it entered the data flow.
You can double-click an object in the Call Stack window to open it in the workspace. Similarly, if you click an object in a diagram, the Call Stack window highlights the object.

Trace window
The Trace window displays the debugger's output status and errors. When the job completes, this window displays the following:
Job <JobName> finished. Stop debugger.
When the job completes, the debugger gives you a final opportunity to examine data. When you must exit the debugger, select the Stop Debug button on the tool bar, press Shift+F8, or select Debug > Stop Debug.

Debug Variables window
The Debug Variables window displays global variables in use by the job at each breakpoint.

View Data pane
The View Data pane for lines uses the same tool bar and navigation options described for the View Data feature. For more information, see "Using View Data" on page 404.
The following View Data pane options are unique to the interactive debugger:
•  Allows you to view data that passes through lines.
•  Displays (above the View Data tool bar) the names of objects to which a line connects using the format: TableName(DatastoreName.TableOwnerName) -> QueryName.
•  Uses a property called the Data sample rate.
•  Displays data one row at a time by default.
•  Provides the All check box, which allows you to see more than one row of processed data.
•  Allows you to edit data in a cell. You might want to fix an error temporarily to continue with a debugger run; you can fix the job design later to eliminate the error permanently.
   To edit cell data:
   •  Deselect the All check box so that only one row is displayed.
   •  Double-click a cell, or right-click it and select Edit cell.
•  Allows you to flag a row that you do not want the next transform to process. To discard a row from the next step in a data flow process, select it and click Discard Row. Discarded row data appears in strike-through style in the View Data pane (for example, 100345). If you accidentally discard a row, you can undo the discard immediately afterwards: select the discarded row and click Undo Discard Row.

For example.This document is part of a SAP study on PDF usage. 15 Design and Debug Using the interactive debugger Alternatively. right-click a row and select either Discard Row or Undo Discard Row from the shortcut menu. it displays the first row processed at a pre-defined breakpoint. Find out how you can participate and help to improve our documentation. if a source in a data flow has four rows and you set the Data sample rate to 2 when you start the debugger. 430 Data Integrator Designer Guide .

If you use the Get Next Row option, then the next row at the same breakpoint is displayed. If you want to see both rows, select the All check box on the upper-right corner of this pane. The row displayed at the bottom of the table is the last row processed.
At this point, you have viewed two rows that have passed through a line. If you click Get Next Row again, only the last two rows processed are displayed, because you set the sample size to 2.

Filters and Breakpoints window
You can manage interactive debugger filters and breakpoints using the Filters/Breakpoints window. You can open this window from the Debug menu or tool bar. Lines that contain filters or breakpoints are listed in the far-left side of the Filters/Breakpoints window. To manage these, select the line(s) that you want to edit, select a command from the list, and click Execute. You can also select a single line on the left and view or edit its filters and breakpoints on the right side of this window. When you are finished using the Filters/Breakpoints window, click OK.

Menu options and tool bar
Once you start the interactive debugger, you can access appropriate options from the Designer's Debug menu and tool bar.

Table 15-2: Debug menu and tool bar options
• Execute (F8): Opens the Execution Properties window from which you can select job properties then execute a job outside the debug mode. Available when a job is active in the workspace.
• Start debug (Ctrl+F8): Opens the Debug Properties window from which you can select job properties then execute a job in debug mode (start the debugger). Available when a job is active in the workspace. Other Designer operations are set to read-only until you stop the debugger.
• Stop debug (Shift+F8): Stops a debug mode execution and exits the debugger. All Designer operations are reset to read/write.
• Pause debug (no key command): Allows you to manually pause the debugger. You can use this option instead of a breakpoint.

• Step over (F10): Allows you to manually move to the next line in a data flow by stepping over a transform in the workspace. Use this option to see the first row in a data set after it is transformed. If the transform you step over has multiple outputs, the Designer provides a popup menu from which you can select the logic branch you want to take. The workspace displays a red square on the line to indicate the path you are using.
• Get next row (F11): Allows you to stay at the current breakpoint and view the next row of data in the data set.
• Continue (Ctrl+F10): Allows you to give control of the job back to the Designer. The debugger continues until you use the Pause debug option, another breakpoint is encountered, or the job completes.
• Show Filters/Breakpoints (no key command): Shows all filters and breakpoints that exist in a job. When not selected, all filters and breakpoints are hidden from view. This option is always available in the Designer.
• Set Filter/Breakpoints... (F9): Opens a dialog from which you can set, remove, edit, enable or disable filters and breakpoints. You can also set conditions for breakpoints. Available when a data flow is active in the workspace. From the workspace, you can right-click a line and select the same option from a shortcut menu.
• Filters/Breakpoints... (Alt+F9): Opens a dialog with which you can manage multiple filters and breakpoints in a data flow. Also offers the same functionality as the Set Filters/Breakpoints window. This option is always available in the Designer.
• Call Stack (no key command): Shows or hides the Call Stack window.

• Variables (no key command): Shows or hides the Debug Variables window.
• Trace (no key command): Shows or hides the Trace window.

Viewing data passed by transforms
To view the data passed by transforms, execute the job in debug mode.

To view data passed by transforms
1. In the project area, right-click a job and click Start debug. The Debug Properties window opens.
2. Clear the Exit the debugger when the job is finished check box.
3. You can enter a value in the Data sample rate text box or leave the default value, which is 500.
4. Click OK.

To view sample data in debug mode
1. While still in debug mode after the job completes, in the project area, click the name of the data flow to view.
2. Click the View Data button displayed on a line in the data flow.
3. Navigate through the data to review it. When done, click the Stop debug button on the toolbar.

Push-down optimizer
When Data Integrator executes a job, it normally pushes down as many operations as possible to the source database to maximize performance. Because the interactive debugger requires a job execution, the following push-down rules apply:
• Query transforms: The first transform after a source object in a data flow is optimized in the interactive debugger and pushed down to the source database if both objects meet the push-down criteria and if you have not placed a breakpoint on the line before the first transform.

For example, if the first transform is pushed down, the line is disabled during the debugging session; you cannot place a breakpoint on this line and you cannot use the View Data pane. For more information about push-down criteria, see the Data Integrator Performance Optimization Guide.
• Breakpoints: Data Integrator does not push down any operations if you set a pre-defined breakpoint. Pre-defined breakpoints are breakpoints defined before you start the interactive debugger.
• Filters: If the input of a pre-defined filter is a database source, it is pushed down. Pre-defined filters are interactive debugger filters defined before you start the interactive debugger.

Limitations
• The interactive debugger can be used to examine data flows. Debug options are not available at the work flow level.
• A repository upgrade is required to use this feature.
• The debugger cannot be used with SAP R/3 data flows.
• All objects in a data flow must have a unique name. After the interactive debugger is started, if there are several outputs for a transform you can choose which path to use; if any of these objects have the same name, the result of your selection is unpredictable.

Comparing Objects
Data Integrator allows you to compare any two objects and their properties by using the Difference Viewer utility. You can compare:
• two different objects
• different versions of the same object
• an object in the local object library with its counterpart in the central object library
You can compare just the top-level objects, or you can include the object's dependents in the comparison. Objects must be of the same type; for example, you can compare a job to another job or a custom function to another custom function, but you cannot compare a job to a data flow.

To compare two different objects
1. In the local or central object library, right-click an object name.
2. From the shortcut menu, highlight Compare, and from the submenu, click one of the following options (availability depends on the object you selected):
   • Object to central: Compares the selected object to its counterpart in the central object library
   • Object with dependents to central: Compares the selected object and its dependent objects to its counterpart in the central object library
   • Object to...: Compares the selected object to another similar type of object
   • Object with dependents to...: Compares the selected object and its dependents to another similar type of object
   The cursor changes to a target icon.
3. Click on the desired object. When you move the cursor over an object that is eligible for comparison, the target cursor changes color.
The Difference Viewer window opens in the workspace. The window identifies changed items with a combination of icons, color, and background shading. Some of these properties are configurable. Depending on the object type, the panes show items such as the object's properties and the properties of and connections (links) between its child objects.

The Difference Viewer window displays the object names and their locations, along with a navigation bar, a toolbar, and a status bar.

To compare two versions of the same object
If you are working in a multiuser environment and using a central object library, you can compare two objects that have different versions or labels.
1. In the central object library, right-click an object name, and from the shortcut menu click Show History.
2. In the History window, Ctrl-click the two versions or labels you want to compare.
3. Click Show Differences or Show Differences with Dependents. The Difference Viewer window opens in the workspace.
4. Close the History window.
For more information about using the History window, see the Data Integrator Advanced Development and Migration Guide.

Overview of the Difference Viewer window
The first object you selected appears in the left pane of the window, and the second object appears on the right. Following each object name is its location. Expanding or collapsing any property set also expands or collapses the compared object's corresponding property set. You can have multiple Difference Viewer windows open at a time in the workspace. To refresh a Difference Viewer window, press F5.
The Difference Viewer window includes the following features:
• toolbar
• navigation bar
• status bar
• shortcut menu
The next section describes these features. Also, when a Difference Viewer window is active, the main Designer window contains a menu called Difference Viewer.

Toolbar
The toolbar includes the following buttons.
• Navigation buttons:
   • First Difference (Alt+Home)
   • Previous Difference (Alt+left arrow)
   • Current Difference
   • Next Difference (Alt+right arrow)
   • Last Difference (Alt+End)
• Filter buttons:
   • Enable filter(s): Click to open the Filters dialog box.
   • Hide non-executable elements: Select this option to remove from view those elements that do not affect job execution.
   • Hide identical elements: Select this option to remove from view those elements that do not have differences.
   • Disable filters: Removes all filters applied to the comparison.
• Show levels: Show Level 1 shows only the objects you selected for comparison, Show Level 2 expands to the next level, and so on. Show All Levels expands all levels of both trees.
• Find (Ctrl+F): Click to open a text search dialog box.

• Open in new window: Click to open the currently active Difference Viewer in a separate window. You must close this window before continuing in Data Integrator.

Navigation bar
The vertical navigation bar contains colored bars that represent each of the differences throughout the comparison. The colors correspond to those in the status bar for each difference. An arrow in the navigation bar indicates the difference that is currently highlighted in the panes. The purple brackets in the bar indicate the portion of the comparison that is currently in view in the panes. You can click on the navigation bar to select a difference (the cursor point will have a star on it). See the next section for more information on how to navigate through differences.

Status bar
The status bar at the bottom of the window includes a key that illustrates the color scheme and icons that identify the differences between the two objects.
• Deleted: The item does not appear in the object in the right pane.
• Changed: The differences between the items are highlighted in blue (the default) text.
• Inserted: The item has been added to the object in the right pane.
• Consolidated: This icon appears next to an item if items within it have differences. Expand the item by clicking its plus sign to view the differences.
You can change the color of these icons by right-clicking in the Difference Viewer window and clicking Configuration. See the next section, "Shortcut menu".
The status bar also includes a reference for which difference is currently selected in the comparison (for example, the currently highlighted difference is 9 of 24 total differences in the comparison), and indicates whether at least one filter is applied to the comparison.

Shortcut menu
Right-clicking in the body of the Difference Viewer window displays a shortcut menu that contains all the toolbar commands plus:
• View: Toggle to display or hide the status bar, navigation bar, or secondary toolbar (an additional toolbar that appears at the top of the window; for example, you might find this useful if you have the Difference Viewer open in a separate window).
• Layout: Use to reposition the navigation bar.
• Configuration: Click to modify viewing options for elements with differences. See "To change the color scheme" on page 441.

To change the color scheme
The status bar at the bottom of the Difference Viewer window shows the current color scheme being used to identify deleted, changed, inserted, or consolidated items in the comparison panes. You can customize this color scheme as follows.
1. Right-click in the body of the Difference Viewer window to display the shortcut toolbar.
2. Click Configuration to open the Configuration window.
3. Click a marker (Inserted, Deleted, Changed, or Consolidated) to change.
4. Click the Color sample to open the Color palette.
5. Click a Basic color or create a custom color.
6. Click OK.
7. Click another marker to change it, or click OK to close the Configuration window.

To change the background shading
Items with differences appear with a background default color of grey. You can customize this background.
1. Right-click in the body of the Difference Viewer window to display the shortcut toolbar.
2. Click Configuration to open the Configuration window.
3. Click a marker to change, or select the Apply for all markers check box.
4. Click the Background sample to open the Color palette.
5. Click a Basic color or create a custom color.
6. Click OK.
7. To apply different background colors to different markers, click the marker to configure and repeat steps 4 through 6.
8. Click OK to close the Configuration window.

Navigating through differences
The Difference Viewer window offers several options for navigating through differences.
You can navigate through the differences between the objects using the navigation buttons on the toolbar. For example, clicking the Next Difference button highlights the next item that differs in some way from the compared object. The item is marked with the appropriate icon, and only the differing text appears highlighted in the color assigned to that type of difference (for example, inserted items appear with a default color of green).
You can also use the navigation bar. Select an item in either pane that has a difference; an arrow appears next to the colored bar that corresponds to that item. You can click on these bars to jump to different places in the comparison. The purple brackets in the bar indicate the portion of the comparison that is currently in view in the panes. Use the scroll bar in either pane to adjust the bracketed view.
For text-based items such as scripts, click the magnifying glass to view the text in a set of new panes that appear below the main object panes. Use the scroll bars for these panes to navigate within them. Click the magnifying glass (or any other item) to close the text panes.

Difference Viewer menu
When a Difference Viewer window is active in the workspace, the main Designer window contains a menu called Difference Viewer. The menu contains the same commands as the toolbar.

Calculating usage dependencies
You can calculate usage dependencies from the Designer at any time. The Calculate Usage Dependency option populates the internal AL_USAGE table and ALVW_PARENT_CHILD view.
• To calculate usage dependencies from the Designer, right-click in the object library of the current repository and select Repository > Calculate Usage Dependencies.
• To calculate column mappings from the Designer, right-click the object library and select Repository > Calculate column mappings, or select Tools > Options > Designer > General > Calculate column mappings when saving data flows.
Note: If you change configuration settings for your repository, you must also change the internal datastore configuration for the calculate usage dependencies operation.
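After the calculation completes, you can inspect the populated view directly in the repository database. The query below is only an illustrative sketch: the column names shown for ALVW_PARENT_CHILD and the job name MyJob are assumptions to verify against your own repository, not definitions from this guide.

SELECT PARENT_OBJ, PARENT_OBJ_TYPE, DESCEN_OBJ, DESCEN_OBJ_TYPE
FROM ALVW_PARENT_CHILD
WHERE PARENT_OBJ = 'MyJob';   -- MyJob is a hypothetical job name

A query like this lists the child objects (work flows, data flows, and so on) recorded for the parent object, which can be useful when auditing dependencies outside the Designer.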


Exchanging metadata

About this chapter
Data Integrator allows you to import metadata from AllFusion ERwin Data Modeler (ERwin) from Computer Associates and export metadata for use with reporting tools like those available with the BusinessObjects 2000 BI Platform. This chapter discusses these topics:
• Metadata exchange
• Creating Business Objects universes
• Attributes that support metadata exchange

Metadata exchange
You can exchange metadata between Data Integrator and third-party tools using XML files and the Metadata Exchange option. You can use the Metadata Exchange option or the Business Objects Universes option to export metadata.
• Using the Metadata Exchange option, you can export metadata into an XML file. After you create the file, you must manually import it into another tool.
• Using the Business Objects Universes option, you can export metadata directly from a repository into a universe using the Create or Update data mode.
Data Integrator supports two built-in metadata exchange formats:
• CWM 1.0 XML/XMI 1.1: CWM (the Common Warehouse Metamodel) is a specification that enables easy interchange of data warehouse metadata between tools, platforms, and repositories in distributed heterogeneous environments.
• ERwin 4.x XML
Data Integrator can also use:
• MIMB (the Meta Integration® Model Bridge): MIMB is a Windows stand-alone utility that converts metadata models among design tool formats. By using MIMB with Data Integrator, you can exchange metadata with all formats that MIMB supports. If MIMB is installed, the additional formats it supports are listed in the Metadata Exchange window.

• BusinessObjects Universe Builder: Converts Data Integrator repository metadata to Business Objects universe metadata. See “Creating Business Objects universes” on page 449.
This section discusses:
• Importing metadata files into Data Integrator
• Exporting metadata files from Data Integrator

Importing metadata files into Data Integrator
You can import metadata from ERwin Data Modeler 4.x XML into a Data Integrator datastore.

To import metadata into Data Integrator using Metadata Exchange
1. From the Tools menu, select Metadata Exchange.
2. In the Metadata Exchange window, select Import metadata from file.
3. In the Metadata format box, select ERwin 4.x XML from the list of available formats.
4. Specify the Source file name (enter directly or click Browse to search).
5. Select the Target datastore name from the list of Data Integrator datastores.
6. Click OK to complete the import.

Exporting metadata files from Data Integrator
You can export Data Integrator metadata into a file that other tools can read.

To export metadata from Data Integrator using Metadata Exchange
1. From the Tools menu, select Metadata Exchange.

2. In the Metadata Exchange window, select Export Data Integrator metadata to file.
3. Select a Metadata format for the target from the list of available formats.
   If you have MIMB installed and you select an MIMB-supported format, select the Visual check box to open the MIMB application when completing the export process. Using the MIMB application provides more configuration options for structuring the metadata in the exported file. If you do not select the Visual check box, the metadata is exported without opening the MIMB application.
4. Specify the target file name (enter directly or click Browse to search).
   When you search for the file, you open a typical browse window.

   Find any of the following file formats/types:
   • DI CWM 1.0 XML/XMI 1.1 (file type XML)
   • DI ERwin 4.x XML (file type XML)
   • MIMB format, only if installed (file type All)
   After you select a file, click Open.
5. Select the Source datastore name from the list of Data Integrator datastores.
6. Click OK to complete the export.

Creating Business Objects universes
Data Integrator allows you to easily export its metadata to Business Objects universes for use with business intelligence tools. A universe is a layer of metadata used to translate physical metadata into logical metadata. For example, the physical column name deptno might become Department Number according to a given universe design.
Note: To use this export option, first install BusinessObjects Universe Builder on the same computer as BusinessObjects Designer and Data Integrator Designer. You can install Universe Builder using the installer for Data Integrator Designer or using the separate Universe Builder CD.
You can create Business Objects universes using the Tools menu or the object library.

To create universes using the Tools menu
1. Select Tools > Business Objects Universes.
2. Select either Create or Update. The Create Universe or Update Universe window opens.
3. Select the datastore(s) that contain the tables and columns to export and click OK.
Data Integrator launches the Universe Builder application and provides repository information for the selected datastores. For more information, refer to the BusinessObjects Universe Builder Guide.

To create universes using the object library
1. Select the Datastores tab.
2. Right-click a datastore and select Business Objects Universes.
3. Select either Create or Update.
Data Integrator launches the Universe Builder application and provides repository information for the selected datastores. For more information, refer to the BusinessObjects Universe Builder Guide.

Mappings between repository and universe metadata
Data Integrator metadata maps to BusinessObjects Universe metadata as follows:
• Table: Class, table
• Column: Object, column
• Owner: Schema
• Column data type (see next table): Object data type
• Primary key/foreign key relationship: Join expression
• Table description: Class description
• Table Business Description: Class description
• Table Business Name: Class name
• Column description: Object description
• Column Business description: Object description
• Column Business Name: Object name
• Column mapping: Object description
• Column source information (lineage): Object description

Data types also map:
• Date/Datetime/Time: Date
• Decimal: Number
• Int: Number
• Double/Real: Number

• Interval: Number
• Varchar: Character
• Long: Long Text

Attributes that support metadata exchange
The attributes Business_Name and Business_Description exist in Data Integrator for both tables and columns. These attributes support metadata exchanged between Data Integrator and BusinessObjects through the Universe Builder (UB) 1.1.
• A Business_Name is a logical field. Data Integrator stores it as a separate and distinct field from physical table or column names. Use this attribute to define and run jobs that extract, transform, and load physical data while the Business Name data remains intact.
• A Business_Description is a business-level description of a table or column. Data Integrator transfers this information separately and adds it to a BusinessObjects Class description.
Data Integrator includes two additional column attributes that support metadata exchanged between Data Integrator and BusinessObjects:
• Column_Usage
• Associated_Dimension
For more information see the Data Integrator Reference Guide.


Recovery Mechanisms

About this chapter
Recovery mechanisms are available in Data Integrator for batch jobs only. This chapter contains the following topics:
• Recovering from unsuccessful job execution
• Automatically recovering jobs
• Manually recovering jobs using status tables
• Processing data with problems

Recovering from unsuccessful job execution
If a Data Integrator job does not complete properly, you must fix the problems that prevented the successful execution of the job and run the job again. However, during the failed job execution, some data flows in the job may have completed and some tables may have been loaded, partially loaded, or altered. Therefore, you need to design your data movement jobs so that you can recover, that is, rerun the job and retrieve all the data without duplicate or missing data.
You can use various techniques to recover from unsuccessful job executions. This section discusses two techniques:
• Automatically recovering jobs: A Data Integrator feature that allows you to run unsuccessful jobs in recovery mode.
• Manually recovering jobs using status tables: A design technique that allows you to rerun jobs without regard to partial results in a previous run.
You might need to use a combination of these techniques depending on the relationships between data flows in your application. If you do not use these techniques, you might need to roll back changes manually from target tables if interruptions occur during job execution.

Automatically recovering jobs
With automatic recovery, Data Integrator records the result of each successfully completed step in a job. If a job fails, you can choose to run the job again in recovery mode. During recovery mode, Data Integrator retrieves the results for successfully completed steps and reruns uncompleted or failed steps under the same conditions as the original job. For recovery purposes, Data Integrator considers steps that raise exceptions as failed steps, even if the step is caught in a try/catch block.

Enabling automated recovery
To use the automatic recover feature, you must enable the feature during initial execution of a job. Data Integrator saves the results from successfully completed steps when the automatic recovery feature is enabled.

To run a job from Designer with recovery enabled
1. In the project area, select the job name.
2. Right-click and choose Execute. Data Integrator prompts you to save any changes.
3. Make sure that the Enable Recovery check box is selected on the Execution Properties window.
If this check box is not selected, Data Integrator does not record the results from the steps during the job and cannot recover the job if it fails. In that case, you must perform any recovery operations manually.

To run a job with recovery enabled from the Administrator
When you schedule or execute a job from the Administrator, select the Enable Recovery check box.

Marking recovery units
In some cases, steps in a work flow depend on each other and must be executed together. Because of the dependency, you should designate the work flow as a "recovery unit." When a work flow is a recovery unit, the entire work flow must complete successfully. If the work flow does not complete successfully, Data Integrator executes the entire work flow during recovery, including steps that executed successfully in prior work flow runs.
However, there are some exceptions to recovery unit processing. For example, when you specify that a work flow or a data flow should only execute once, a job will never re-execute that work flow or data flow after it completes successfully, even if that work flow or data flow is contained within a recovery unit work flow that re-executes. Business Objects recommends that you not mark a work flow or data flow as Execute only once when the work flow or a parent work flow is a recovery unit. For more information about how Data Integrator processes data flows and work flows with multiple conditions like execute once, parallel flows, and recovery, see the Data Integrator Reference Guide.

To specify a work flow as a recovery unit
1. In the project area, select the work flow.
2. Right-click and choose Properties.
3. Select the Recover as a unit check box, then click OK.

During recovery, Data Integrator considers this work flow a unit. If the entire work flow completed successfully, that is, without an error, during a previous execution, then Data Integrator retrieves the results from the previous execution. If any step in the work flow did not complete successfully, then the entire work flow re-executes during recovery.
On the workspace diagram, the black "x" and green arrow symbol indicate that a work flow is a recovery unit.

Running in recovery mode
If a job with automated recovery enabled fails during execution, you can re-execute the job in recovery mode. As with any job execution failure, you need to determine and remove the cause of the failure and rerun the job in recovery mode. If you need to make any changes to the job itself to correct the failure, you cannot use automatic recovery but must run the job as if it is a first run.
In recovery mode, Data Integrator retrieves the results from any steps that were previously executed successfully and executes or re-executes any other steps. Specifically, Data Integrator executes the steps or recovery units that did not complete successfully in a previous execution; this includes steps that failed and steps that threw an exception but completed successfully, such as those in a try/catch block. As in normal job execution, Data Integrator executes the steps in parallel if they are not connected in the work flow diagrams and in serial if they are connected.

To run a job in recovery mode from Designer
1. In the project area, select the (failed) job name.
2. Right-click and choose Execute. Data Integrator prompts you to save any objects that have unsaved changes.
3. Make sure that the Recover from last failed execution check box is selected.
This option is not available when a job has not yet been executed, when the previous run succeeded, or when recovery mode was disabled during the previous run.

If you clear this option, Data Integrator runs this job anew, performing all steps.
When you schedule or execute a (failed) job from the Administrator, select the Recover from last failed execution check box.

Ensuring proper execution path
The automated recovery system requires that a job in recovery mode runs again exactly as it ran previously. If the job was allowed to run under changed conditions (suppose a sysdate function returns a new date to control what data is extracted), then the new data loaded into the targets will no longer match data successfully loaded into the target during the first execution of the job.
For example, suppose a daily update job running overnight successfully loads dimension tables in a warehouse. However, while the job is running, the database log overflows and stops the job from loading fact tables. The next day, the administrator truncates the log file and runs the job again in recovery mode. The recovery job does not reload the dimension tables because the original, failed run successfully loaded them. To ensure that the fact tables are loaded with the data that corresponds properly to the data already loaded in the dimension tables, the recovery job must use the same extraction criteria that the original job used when loading the dimension tables. If the recovery job used new extraction criteria, such as basing data extraction on the current system date, the data in the fact tables would not correspond to the data previously extracted into the dimension tables. In addition, if the recovery job used new values, the job execution might follow a completely different path through conditional steps or try/catch blocks.
It is important that the recovery job run exactly as the previous run. To ensure that the recovery job follows the exact execution path that the original job followed, Data Integrator records any external inputs to the job (return values for systime and sysdate, results from scripts, and so forth) and the recovery job uses the stored values. When recovery is enabled, Data Integrator stores results from the following types of steps:
• Work flows
• Batch data flows
• Script statements
• Custom functions (stateless type only)
• SQL function
• exec function
• get_env function
• rand function
• sysdate function
• systime function
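For instance, a job that drives its extraction window from the system date typically captures the value once in a script; during a recovery run, Data Integrator reuses the recorded value instead of calling the function again. The script below is a minimal sketch in Data Integrator script syntax, and the variable name is illustrative rather than taken from this guide:

# Captured before the data flows run. The return value of sysdate() is one of
# the external inputs Data Integrator records, so a recovery run works with the
# same date as the original run.
$extract_date = sysdate();
print('Extracting data for [$extract_date]');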

Using try/catch blocks with automatic recovery
Data Integrator does not save the result of a try/catch block for reuse during recovery. If an exception is thrown inside a try/catch block, then during recovery Data Integrator executes the step that threw the exception and subsequent steps. Because the execution path through the try/catch block might be different in the recovered job, using variables set in the try/catch block could alter the results during automatic recovery.
For example, suppose you create a job that defines a variable, $i, that you set within a try/catch block. If an exception occurs, you set an alternate value for $i. Subsequent steps are based on the value of $i.
(Figure: Job execution logic, showing scripts that set $i = 10 and $i = 0, and a conditional IF $i < 1 with TRUE and FALSE branches.)
During the first job execution, the first work flow contains an error that throws an exception, which is caught. However, the job fails in the subsequent work flow.
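A minimal sketch of the scripts involved is shown below, in Data Integrator script syntax. The placement of each statement is an assumption used only to illustrate the pattern to avoid; the values are taken from the example above, but the surrounding objects are not defined in this guide.

# Script that sets the variable during normal processing:
$i = 10;
# Script in the catch block, which runs only when an exception is caught:
$i = 0;
# A conditional that follows evaluates ($i < 1) to choose the next work flow.
# During recovery the catch script may not run again, so the expression can
# evaluate differently than it did in the original execution.

Because the branch taken depends on a value set inside the try/catch block, the recovery run can follow a different path than the failed run, which is exactly the situation this guideline warns against.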

(Figure: First job execution. An error occurs while processing the first work flow; an exception is thrown and caught in the first execution.)
You fix the error and run the job in recovery mode. During the recovery execution, the first work flow no longer throws the exception. Thus the value of the variable, $i, is different, and the job selects a different subsequent work flow, producing different results.
(Figure: Recovery execution. The execution path changes because of the results from the try/catch block; no exception is thrown in the recovery execution.)
To ensure proper results with automatic recovery when a job contains a try/catch block, do not use values set inside the try/catch block in any subsequent steps.

Ensuring that data is not duplicated in targets
Define work flows to allow jobs correct recovery. A data flow might be partially completed during an incomplete run; for example, only some of the required rows could be inserted in a table. You do not want to insert duplicate rows during recovery when the data flow re-executes. You can use several methods to ensure that you do not insert duplicate rows:
• Design the data flow to completely replace the target table during each execution.

This technique can be optimal when the changes to the target table are numerous compared to the size of the table. You can use tuning techniques such as bulk loading options to improve overall performance.
• Set the auto correct load option for the target table. The auto correct load option checks the target table for existing rows before adding new rows to the table. Using the auto correct load option, however, can needlessly slow jobs executed in non-recovery mode. Consider this technique when the target table is large and the changes to the table are relatively few.
• Include a SQL command to execute before the table loads. Preload SQL commands can remove partial database updates that occur during incomplete execution of a step in a job. Typically, the preload SQL command deletes rows based on a variable that is set before the partial insertion step began. (The variable value is set in a script, which is executed successfully during the initial run.) During initial execution, no rows match the deletion criteria. During recovery, the rows inserted during the previous, partial database load would match this criteria, and the preload SQL command would delete them.
For example, suppose a table contains a column that records the time stamp of any row insertion. You can create a script with a variable that records the current time stamp before any new rows are inserted. In the target table options, add a preload SQL command that deletes any rows with a time-date stamp greater than that recorded by the variable.

Using preload SQL to allow re-executable data flows
To use preload SQL commands to remove partial database updates, tables must contain a field that allows you to tell when a row was inserted. Create a preload SQL command that deletes rows based on the value in that field. To use preload SQL commands properly, you must define variables and pass them to data flows correctly.

To use preload SQL commands to ensure proper recovery
1. Determine appropriate values that you can use to track records inserted in your tables.

For example, if each row in a table is marked with the insertion time stamp, then you can use the value from the sysdate() function to determine when a row was added to that table.
2. Create variables that can store the "tracking" values.
Variables are either job or work-flow specific. If a work flow is a recovery unit, create the "tracking" variables for that work flow at the job level; otherwise, create your tracking variables at the work flow level. Generally, you do not want tracking variables reset during recovery because when they reset, the preload SQL command will not work properly. For information about creating a variable, see "Defining local variables" on page 302.
3. Create scripts that set the variables to the appropriate values.
Scripts are unique steps in jobs or work flows. You need to create a separate script that sets the required variables before each data flow or work flow that loads a table. If a work flow is a recovery unit, create the scripts for that work flow at the job level; otherwise, create scripts that set tracking variables inside the work flow before the data flow that requires the value. For information about creating scripts, see "Scripts" on page 211.
4. Connect the scripts to the corresponding data flows or work flows.
When a work flow is a recovery unit, create scripts that set tracking variables outside the work flow, typically at the job level, and connect the script to the work flow. When a work flow is not a recovery unit, create scripts that set tracking variables inside the work flow before the data flow that requires the value, and connect the script directly to the appropriate data flow.
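As an illustration of steps 2 through 4, the script below records the load start time in a tracking variable immediately before the data flow runs. It is a sketch only; the variable and table names anticipate the PO_ITEM example used in step 6 and are not additional requirements from this guide:

# Script connected immediately before the data flow that loads PO_ITEM.
# When the work flow is a recovery unit, declare $load_time at the job level
# so that recovery does not reset it.
$load_time = sysdate();
print('PO_ITEM load starting at [$load_time]');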

5. Create parameters to pass the variable information from the job or work flow where you created the variable to the data flow that uses the tracking variable in the preload SQL command. For information about creating parameters, see "Defining parameters" on page 302.
6. Insert appropriate preload SQL commands that remove any records inserted during earlier unsuccessful runs.
The preload SQL commands reference the parameter containing the tracking variable, deleting rows that were inserted after the variable was set. For example, suppose the PO_ITEM table records the date-time stamp in the TIMESTMP column. You created a variable $load_time that records the value from the sysdate() function before the load starts, and you passed that variable to the data flow that loads the PO_ITEM table in a parameter named $load_time. Then, your preload SQL command must delete any records in the table where the value in TIMESTMP is larger than the value in $load_time:
delete from PO_ITEM where TIMESTMP > [$load_time]
For information about creating preload SQL commands, see the Data Integrator Reference Guide.

Manually recovering jobs using status tables
You can design your jobs and work flows so that you can manually recover from an unsuccessful run. You can use an execution status table to produce jobs that can be run multiple times without duplicating target rows. The table records a job's execution status. A "failure" value signals Data Integrator to take a recovery execution path.
A job designed for manual recovery must have certain characteristics:
• You can run the job repeatedly.
• The job implements special steps to recover data when a step did not complete successfully during a previous run.
To implement a work flow with a recovery execution path:
• Define a flag that indicates when the work flow is running in recovery mode.
• Store the flag value in a status table.
• Check the flag value in the status table before executing a work flow to determine which path to execute in the work flow.
• Update the flag value when the work flow executes successfully.
For example, you could design a work flow that uses the auto correct load option when a previous run does not complete successfully. This work flow would have five steps, as illustrated by the scripts and numbered steps that follow.


The first step in the work flow is a script that reads the status table and sets the recovery flag; the fifth step is a script that updates the status table after a successful run:

$StopStamp = sql('target_ds', 'SELECT stop_timestamp FROM status_table WHERE start_timestamp = (SELECT MAX(start_timestamp) FROM status_table)');
IF (($StopStamp = NULL) OR ($StopStamp = '')) $recovery_needed = 1;
ELSE $recovery_needed = 0;

$stop_date = sql('target_ds', 'UPDATE status_table SET stop_timestamp = SYSDATE WHERE start_timestamp = (SELECT MAX(start_timestamp) FROM status_table)');
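For reference, a status table that supports this pattern can be as simple as two timestamp columns; the definition below is an assumption consistent with the script above rather than a schema prescribed by this guide. The conditional in the work flow then only needs to test the flag variable (for example, $recovery_needed = 1) to choose between the recovery and non-recovery paths.

-- Hypothetical status table used by the scripts shown above.
CREATE TABLE status_table (
    start_timestamp DATE,   -- written when a run begins
    stop_timestamp  DATE    -- written only when the run completes successfully
);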

1. Retrieve the flag value, which indicates the success or failure of the previous execution, from the status table. Store this value in a variable such as $recovery_needed.
2. In a conditional, evaluate the $recovery_needed variable.
3. If recovery is required, execute the recovery data flow recover_customer. This data flow loads the data using the auto correct load option. For more information about the auto correct load option, see the Data Integrator Reference Guide.
4. If recovery is not required, execute the non-recovery data flow load_customer. This data flow loads the data without the auto correct load option.
5. Update the flag value in the status table to indicate successful execution.

Processing data with problems
Jobs might not produce the results you expect because of problems with data. In some cases, Data Integrator is unable to insert a row. In other cases, Data Integrator might insert rows with missing information. You can design your data flows to anticipate and process these types of problems. This section describes mechanisms you can use to anticipate and process data problems. In particular, this section discusses three techniques:
• Using overflow files
• Filtering missing or bad values
• Handling facts with missing dimensions

Using overflow files
A row that cannot be inserted is a common data problem. Use the overflow file to process this type of data problem. When you specify an overflow file and Data Integrator cannot load a row into a table, Data Integrator writes the row to the overflow file instead. The trace log indicates the data flow in which the load failed and the location of the file.
For any table used as a target, you can set the option to use an overflow file in the Options tab. When you specify an overflow file, give a full path name to ensure that Data Integrator creates a unique file when more than one file is created in the same job. By default, the name of the overflow file is the target table name.

When you select the overflow file option, you choose what Data Integrator writes to the file about the rows that failed to load: either the data from the row or the SQL commands required to load the row. There are many reasons for loading to fail, for example:
• Out of memory for the target
• Overflow column settings
• Duplicate key values
You can use the overflow information to identify invalid data in your source or problems introduced in the data movement. Every new run will overwrite the existing overflow file.
Note: You cannot use overflow files when loading to a BW Transfer Structure.
If you select data, you can use Data Integrator to read the data from the overflow file, cleanse it, and load it into the target table. If you select SQL commands, you can use the commands to load the target manually when the target is accessible.

Filtering missing or bad values
A missing or invalid value in the source data is another common data problem. Using queries in data flows, you can identify missing or invalid values in source data. You can also choose to include this data in the target or to disregard it.
For example, suppose you are extracting data from a source and you know that some phone numbers and customer names are missing. You can use a data flow to extract data from the source, load the data into a target, and filter the NULL values into a file for your inspection.

This data flow has five steps, as illustrated. In the diagram, the query that generates new keys uses the mapping key_generation('target_ds.owner.Customer', 'Customer_Gen_Key', 1); the query that selects rows with missing values uses:
SELECT Query.CustomerID, Query.NAME, Query.PHONE FROM Query WHERE (NAME = NULL) OR (PHONE = NULL);
and the output file contains rows such as:
10002, ,(415)366-1864
20030,Tanaka,
21101,Navarro,
17001, ,(213)433-2219
16401, ,(609)771-5123

The data flow:
1. Extracts data from the source
2. Selects the data set to load into the target and applies new keys. (It does this by using the Key_Generation function.)
3. Loads the data set into the target, using the bulk load option for best performance
4. Uses the same data set for which new keys were generated in step 2, and selects rows with missing customer names and phone numbers
5. Writes the customer IDs for the rows with missing data to a file

Now, suppose you do not want to load rows with missing customer names into your target. You can insert another query into the data flow to ensure that Data Integrator does not insert incomplete rows into the target. The new query filters the rows with missing customer names before loading any rows into the target. The missing data query still collects those rows along with the rows containing missing phone numbers. In this version of the example, the Key_Generation transform adds keys for new rows before inserting the filtered data set into the target.


The data flow now has six steps, as shown.
In the diagram, the new query that filters out rows with missing customer names uses:
SELECT * FROM source WHERE (NAME <> NULL);
and the query that selects rows with missing values still uses:
SELECT Query.CustomerID, Query.NAME, Query.PHONE FROM Query WHERE (NAME = NULL) OR (PHONE = NULL);

1. Extracts data from the source
2. Selects the data set to load into the target by filtering out rows with no customer name values
3. Generates keys for rows with customer names
4. Loads the valid data set (rows with customer names) into the target using the bulk load option for best performance
5. Uses a separate query transform to select rows from the source that have no names or phones
   Note that Data Integrator does not load rows with missing customer names into the target; however, Data Integrator does load rows with missing phone numbers.
6. Writes the customer IDs for the rows with missing data to a file.

You could add more queries into the data flow to select additional missing or invalid values for later inspection.
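For instance, a further query transform could isolate rows whose phone numbers are present but appear malformed. The condition below is a sketch that assumes ten-digit phone numbers; it is written in the same style as the queries above and is not part of the original example:

SELECT Query.CustomerID, Query.NAME, Query.PHONE
FROM Query
WHERE (PHONE <> NULL) AND (length(PHONE) < 10);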


Handling facts with missing dimensions
Another data problem occurs when Data Integrator searches a dimension table and cannot find the values required to complete a fact table. You can approach this problem in several ways:

• Leave the problem row out of the fact table. Typically, this is not a good idea because analysis done on the facts will be missing the contribution from this row.

• Note the row that generated the error, but load the row into the target table anyway. You can mark the row as having an error, or pass the row information to an error file as in the examples from "Filtering missing or bad values" on page 467.

• Fix the problem programmatically. Depending on the data missing, you can insert a new row in the dimension table, add information from a secondary source, or use some other method of providing data outside of the normal, high-performance path (see the sketch below).
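One common programmatic fix is to seed each dimension with a placeholder row so that facts referencing an unknown key can still load and be corrected later. The statement below is a sketch only; the table and column names are hypothetical and not taken from this guide:

-- Placeholder row for facts whose customer is missing from the dimension.
INSERT INTO CUSTOMER_DIM (CUSTOMER_KEY, CUSTOMER_NAME, REGION)
VALUES (-1, 'UNKNOWN', 'UNKNOWN');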


Techniques for Capturing Changed Data

About this chapter
This chapter contains the following topics:

• Understanding changed-data capture
• Using CDC with Oracle sources
• Using CDC with DB2 sources
• Using CDC with Attunity mainframe sources
• Using CDC with Microsoft SQL Server databases
• Using CDC with timestamp-based sources
• Using CDC for targets

Understanding changed-data capture
When you have a large amount of data to update regularly and only a small amount of system down time for scheduled maintenance on a data warehouse, you can update data over time, or delta load. Two commonly used delta load methods are full refresh and changed-data capture (CDC).

Full refresh
Full refresh is easy to implement and easy to manage. This method ensures that no data will be overlooked or left out due to technical or programming errors. For an environment with a manageable amount of source data, full refresh is an easy method you can use to perform a delta load to a target system.

Capturing only changes
After an initial load is complete, you can choose to extract only new or modified data and update the target system. Identifying and loading only changed data is called changed-data capture (CDC). This includes only incremental data that has changed since the last refresh cycle. Data Integrator acts as a mechanism to locate and extract only the incremental data that changed since the last refresh. Improving performance and preserving history are the most important reasons for using changed-data capture.

Performance improves because with less data to extract, transform, and load, the job typically takes less time.


If the target system has to track the history of changes so that data can be correctly analyzed over time, the changed-data capture method can provide a record of these changes. For example, if a customer moves from one sales region to another, simply updating the customer record to reflect the new region negatively affects any analysis by region over time because the purchases made by that customer before the move are attributed to the new region.
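To make the region example concrete, the sketch below (hypothetical data, not Data Integrator behavior) shows how overwriting the customer record shifts all of the customer's earlier purchases into the new region, while keeping dated history rows preserves the analysis:

```python
orders = [
    {"cust": "C1", "amount": 100, "date": "2005-06-01"},   # bought while in "East"
    {"cust": "C1", "amount": 200, "date": "2006-02-01"},   # bought after move to "West"
]

# Overwriting the customer record: every order is attributed to the new region.
customer_current = {"C1": "West"}
by_region_overwrite = {}
for o in orders:
    region = customer_current[o["cust"]]
    by_region_overwrite[region] = by_region_overwrite.get(region, 0) + o["amount"]

# Preserving history: each order joins to the region valid at its order date.
customer_history = [
    {"cust": "C1", "region": "East", "valid_from": "2005-01-01", "valid_to": "2005-12-31"},
    {"cust": "C1", "region": "West", "valid_from": "2006-01-01", "valid_to": "9999-12-31"},
]
by_region_history = {}
for o in orders:
    for h in customer_history:
        if h["cust"] == o["cust"] and h["valid_from"] <= o["date"] <= h["valid_to"]:
            by_region_history[h["region"]] = by_region_history.get(h["region"], 0) + o["amount"]

print(by_region_overwrite)   # {'West': 300}  -- East's contribution is lost
print(by_region_history)     # {'East': 100, 'West': 200}
```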

This chapter discusses both general concepts and specific procedures for performing changed-data capture in Data Integrator.

Source-based and target-based CDC
Changed-data capture can be either source-based or target-based.

Source-based CDC
Source-based changed-data capture extracts only the changed rows from the source. It is sometimes called incremental extraction. This method is preferred because it improves performance by extracting the fewest rows. Data Integrator offers access to the source-based changed data that various software vendors provide. The following table shows the data sources that Data Integrator supports.
Table 18-3: Data Sources and Changed Data Capture Products and Techniques

Oracle 9i and higher: Use Oracle’s CDC packages to create and manage CDC tables. These packages make use of a publish and subscribe model. You can create a CDC datastore for Oracle sources using Data Integrator Designer. You can also use the Designer to create CDC tables in Oracle, then import them for use in Data Integrator jobs. For more information, refer to “Using CDC with Oracle sources” on page 475.

DB2 UDB for Windows, UNIX, and Linux: Use the following products to capture changed data from DB2 sources:
• DB2 Information Integrator for Replication Edition 8.2 (DB2 II Replication Edition)
• IBM WebSphere Message Queue 5.3.1 (MQ)
• Data Integrator’s real-time IBM Event Publisher adapter
DB2 II Replication Edition publishes changes from DB2 onto WebSphere Message Queues. Use the Data Integrator Designer to create a CDC datastore for DB2 sources. Use the Data Integrator Administrator to configure an IBM Event Publisher adapter and create Data Integrator real-time jobs to capture the changed data from the MQ queues. For more information, refer to “Using CDC with DB2 sources” on page 495.

Mainframe data sources (Adabas, DB2 UDB for z/OS, IMS, SQL/MP, VSAM, flat files) accessed with Attunity Connect: For mainframe data sources that use Attunity to connect to Data Integrator, you can use Attunity Streams 4.6. For more information, refer to “Using CDC with Attunity mainframe sources” on page 505.

Microsoft SQL Server databases: Use Microsoft SQL Replication Server to capture changed data from SQL Server databases. For more information, refer to “Using CDC with Microsoft SQL Server databases” on page 513.

Other sources: Use date and time fields to compare source-based changed-data capture job runs. This technique makes use of a creation and/or modification timestamp on every row. You can compare rows using the time of the last update as a reference. This method is called timestamp-based CDC. For more information, refer to “Using CDC with timestamp-based sources” on page 522.

Target-based CDC
Target-based changed-data capture extracts all the data from the source, but loads only the changed rows into the target.


Target-based changed-data capture is useful when you want to capture history but do not have the option to use source-based changed-data capture. Data Integrator offers table comparison to support this method.
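A rough illustration of the table-comparison idea in Python (hypothetical structures, not the Table_Comparison transform itself): every source row is read, but only rows that are new or different from the target are applied.

```python
# Hypothetical current target contents, keyed by primary key.
target = {1: {"id": 1, "name": "Ann"}, 2: {"id": 2, "name": "Bob"}}

# Target-based CDC reads the full source...
source = [
    {"id": 1, "name": "Ann"},      # unchanged
    {"id": 2, "name": "Robert"},   # changed
    {"id": 3, "name": "Cai"},      # new
]

# ...but loads only the rows that differ from the target.
changes = []
for row in source:
    existing = target.get(row["id"])
    if existing is None:
        changes.append(("INSERT", row))
    elif existing != row:
        changes.append(("UPDATE", row))

print(changes)   # only the changed and new rows are applied to the target
```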

Using CDC with Oracle sources
If your environment must keep large amounts of data current, the Oracle Change Data Capture (CDC) feature is a simple solution for limiting the number of rows that Data Integrator reads on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads. This section includes the following topics:

• Overview of CDC for Oracle databases
• Setting up Oracle CDC
• Importing CDC data from Oracle
• Configuring an Oracle CDC source
• Creating a data flow with an Oracle CDC source
• Maintaining CDC tables and subscriptions

Overview of CDC for Oracle databases
With Oracle 9i or higher, Data Integrator manages the CDC environment by accessing Oracle’s CDC packages. These packages use a publish and subscribe model: Oracle publishes changed data from the original table to its CDC table. Data Integrator Designer allows you to create or import CDC tables and create subscriptions to access the data in the CDC table. Separate subscriptions allow each user to keep track of the last changed row that he or she accessed. You can also enable check-points for subscriptions so that Data Integrator reads only the latest changes in the CDC table. Oracle uses the following terms for Change Data Capture:

• Change (CDC) table: A relational table that contains changed data that results from DML operations performed on a source table.
• Change set: A group of CDC tables that are transactionally consistent. For example, SalesOrder and SalesItem tables should be in a change set to ensure that changes to an order and its line items are captured together.
• Change source: The database that contains one or more change sets.
• Publisher: The person who captures and publishes the changed data. The publisher is usually a database administrator (DBA) who creates and maintains the schema objects that make up the source database and staging database.
• Publishing mode: Specifies when and how to capture the changed data. For details, see the following table of publishing modes.
• Source database: The production database that contains the data that you extracted for your initial load. The source database contains the source tables.
• Staging database: The database where the changed data is published. Depending on the publishing mode, the staging database can be the same as, or different from, the source database.
• Subscriber: A user that can access the published data in the CDC tables.
• Subscription: Controls access to the change data from one or more source tables within a single change set. A subscription contains one or more subscriber views.
• Subscriber view: The changed data that the publisher has granted the subscriber access to use.

Oracle 10G supports the following publishing modes:

Synchronous
• How data is captured: Uses internal triggers on the source tables to store the changes in CDC tables.
• When changes are available: Real time.
• Location of captured data: CDC tables must reside in the source database.
• Considerations: Adds overhead to the source database at capture time. Available in Oracle 9i and Oracle 10G.

Asynchronous HotLog
• How data is captured: Uses redo or archive logs for the source database.
• When changes are available: Near real time.
• Location of captured data: A change set contains multiple CDC tables and must reside locally in the source database.
• Considerations: Improves performance because data is captured offline. Available in Oracle 10G only.

Asynchronous AutoLog
• How data is captured: Uses redo logs managed by log transport services that automate transfer from the source database to the staging database.
• When changes are available: Depends on the frequency of redo log switches on the source database.
• Location of captured data: A change set contains multiple CDC tables and can be remote or local to the source database.
• Considerations: Improves performance because data is captured offline. Available in Oracle 10G only.

Oracle CDC in synchronous mode
The following diagram shows how the changed data flows from Oracle CDC tables to Data Integrator in synchronous mode.

When a transaction changes a source table, internal triggers capture the changed data and store it in the corresponding CDC table.


Oracle CDC in asynchronous HotLog mode
The following diagram shows how the changed data flows from Oracle CDC tables to Data Integrator in asynchronous HotLog mode.

When a transaction changes a source table, the log writer records the changes in the online redo log files. Oracle Streams processes automatically populate the CDC tables when transactions are committed.

Oracle CDC in asynchronous AutoLog mode
The following diagram shows how the changed data flows from Oracle CDC tables to Data Integrator in asynchronous AutoLog mode.


When the log switches on the source database, Oracle archives the redo log file and copies the online redo log files to the staging database. Oracle Streams processes then populate the CDC tables from the copied log files.

Note: The Oracle archive process requires uninterrupted connectivity through Oracle Net to send the redo log files to the remote file server (RFS).

Setting up Oracle CDC
Your Oracle source database server must meet the following requirements to track changes:
• Install Oracle’s CDC packages. These packages are installed by default. However, if a CDC package needs to be re-installed, open Oracle’s Admin directory, then find and run Oracle’s SQL script initcdc.sql.
  • Synchronous CDC is available with Oracle Standard Edition and Enterprise Edition.
  • Asynchronous CDC is available with Oracle Enterprise Edition only.
• Enable Java.
• Set source table owner privileges so that CDC tables can be created, purged, and dropped as needed.
• Give datastore owners the SELECT privilege for CDC tables and the SELECT_CATALOG_ROLE and EXECUTE_CATALOG_ROLE privileges.
• For synchronous CDC, enable Oracle’s system triggers.
• For asynchronous AutoLog CDC:
  • The source database DBA must build a LogMiner data dictionary to enable the log transport services to send this data dictionary to the staging database. Oracle automatically updates the data dictionary with any source table DDL operations that occur during CDC to keep the staging tables consistent with the source tables.
  • The source database DBA must also obtain the SCN value of the data dictionary build. If you will use the Data Integrator Designer to create CDC tables, you need to specify the SCN in the wizard (see step 12 of the procedure “To invoke the New CDC table wizard in the Designer” on page 481).
  • The publisher (usually the source database DBA) must configure log transport services to copy the redo log files from the source database system to the staging database system and to automatically register the redo log files.

CDC datastores
To gain access to CDC tables, create a CDC datastore using the Designer. A CDC datastore is a read-only datastore that can only access tables. Like other datastores, you can create, edit, and access a CDC datastore from the Datastores tab of the object library.

To create a CDC datastore for Oracle
1. Create a database datastore with the Database Type option set to Oracle.
2. Select the CDC check box.
3. Select an Oracle version. The Designer only allows you to select the Oracle versions that support CDC packages.
4. Specify the name of your staging database (the change source database where the changed data is published) in Connection name.
5. Enter the User and Password for your staging database and click OK.
You can use this datastore to browse and import CDC tables.

Importing CDC data from Oracle
You must create a CDC table in Oracle for every source table you want to read from before you can import that CDC table using Data Integrator. Use one of the following ways:
• Use an Oracle utility to create CDC tables
• Use Data Integrator Designer to create CDC tables

Using existing Oracle CDC tables
When CDC tables already exist in Oracle:
1. Import an Oracle CDC table by right-clicking the CDC datastore name in the object library and selecting Open, Import by Name, or Search. If you select Open, you can browse the datastore for existing CDC tables using the Datastore Explorer.
2. When you find the table that you want to import, right-click it and select Import.

Creating CDC tables in Data Integrator
The Data Integrator Designer provides the ability to create Oracle CDC tables for all publishing modes:
• Synchronous CDC
• Asynchronous HotLog CDC
• Asynchronous AutoLog CDC

To invoke the New CDC table wizard in the Designer
1. In the object library, right-click a CDC datastore and select Open.
2. In the Datastore Explorer, right-click the white space in the External Metadata section, and select New.
The New CDC table wizard opens. This wizard allows you to add a CDC table.

Note: If the Datastore Explorer opens and no CDC tables exist in your datastore, this wizard opens automatically.

3. Select the publishing mode on the first page of the wizard.
   If your source database is Oracle 9i, you can only select the Synchronous mode; the Asynchronous modes are disabled. If your source database is Oracle 10G, the wizard selects the Asynchronous HotLog mode by default.
   If your source database uses Asynchronous AutoLog publishing mode, select Asynchronous AutoLog and provide the following source database connection information:
   • Connection name: The name of the database where the Change Source resides. Use the service name of the Oracle Net service configuration.
   • User Name: The user name for the source database DBA.
   • Password: The password for the Change Source user.

4. Specify the source table information in the second page of the wizard.
   a. Click the Search button to see a list of non-CDC external tables available in this datastore. To filter a search, enter values for a table Owner and/or Name. You can use a wild-card character (%) to perform pattern matching for Name or Owner values.
   b. Click a name in the list of returned tables to select it as the source table for the new CDC table.
   c. Click Next.
5. Specify the CDC table owner for the new CDC table. By default, the owner name of the new CDC table is the owner name of the datastore. The source table owner name is also displayed in the CDC table owner list box. If the owner name you want to use is not in the list, enter a different owner name.
6. Specify the CDC table name for the new CDC table. By default, Data Integrator generates a table name using the following convention: CDC__SourceTableName.
7. (Optional) Select Generate before-images if you want to track before- and after-images in the new CDC table. For more information about this option, see “Using before-images” on page 490.
8. Click Next. Specify which columns to include or exclude from the CDC table in one of the following ways:
   • Remove the check mark from the box next to the name of each column that you want to exclude. By default, all columns are selected.

   • Click Unselect All and place a check mark next to the name of each column that you want to include.
9. Click Next.
10. For synchronous publishing mode:
   a. Click Finish. The Designer connects to the Oracle instance, creates the CDC table on the Oracle server, and imports the table’s metadata into Data Integrator’s repository.
   b. Click OK on the information dialog. This dialog confirms that Oracle created a new CDC table, then imported it successfully into Data Integrator.
   For asynchronous (HotLog or AutoLog) publishing mode, click Next.

Note: All tables that Data Integrator imports through a CDC datastore contain a column that indicates which operation to perform for each row. For an Oracle CDC table, this column is called Operation$. In addition to this column, Oracle adds other columns when it creates a CDC table. These columns all use a dollar sign as a suffix.

11. For asynchronous HotLog publishing mode, specify the change set information in the fourth page of the wizard.
   a. If you would like to add this change table to an existing change set to keep the changes transactionally consistent with the tables in the change set, select a name from the drop-down list for Change set name. Alternatively, you can create a new change set by typing in the name.
   b. Select Stop capture on DDL if a DDL error occurs and you do not want to capture data.
   c. Select Define retention period to enable the Begin Date and End Date text boxes.
   d. Click Finish. The Designer connects to the Oracle instance, creates the CDC table on the Oracle server, and imports the table’s metadata into Data Integrator’s repository.

12. For asynchronous AutoLog publishing mode, specify the change set and change source information in the fourth page of the wizard.
   a. If you would like to add this change table to an existing change source, select a name from the drop-down list for Change source name. If you want to create a new change source, type the following information:
      • Change source name: Name of the CDC change source.
      • Source database: Name of the source database. You can obtain this name from the source database Global_Name table.
      • SCN of data dictionary build: SCN value of the data dictionary build.
      For more information about these parameters, refer to your Oracle documentation.
   b. If you would like to add this change table to an existing change set to keep the changes transactionally consistent with the tables in the change set, select a name from the drop-down list for Change set name. Alternatively, you can create a new change set by typing in the name.
   c. Select Stop capture on DDL if a DDL error occurs during data capture and you do not want to capture data.
   d. Select Define retention period to enable the Begin Date and End Date text boxes.

   e. Click Finish. The Designer connects to the Oracle staging database, creates the CDC table on the change source, and imports the table’s metadata into Data Integrator’s repository.

Viewing an imported CDC table
When Data Integrator imports a CDC table, it also adds two columns to the table’s schema: DI_SEQUENCE_NUMBER with the data type integer and DI_OPERATION_TYPE with the data type varchar(1).

To view an imported CDC table
1. Find your CDC datastore in the object library.
2. Expand the Tables folder.
3. Double-click a table name or right-click and select Open.

[Figure: An imported Oracle CDC table schema, showing the Oracle CDC control columns followed by the Oracle source columns.]

This example has eight control columns added to the original table:
• Two generated by Data Integrator
• Six Oracle control columns
Note: The Oracle control columns vary depending on the options that were selected when the CDC table was created. All Oracle control columns end with a dollar sign ($).

The DI_SEQUENCE_NUMBER column
The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row, except when it encounters a pair of before- and after-images for an UPDATE operation. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they are separated as a result of the data flow design. For information about when to consider using before-images, see “Using before-images” on page 490.

The DI_OPERATION_TYPE column
The possible values for the DI_OPERATION_TYPE column are:
• I for INSERT
• D for DELETE
• B for before-image of an UPDATE
• U for after-image of an UPDATE
When Data Integrator reads rows from Oracle, it checks the values in the Operation$ column and translates them to Data Integrator values in the DI_OPERATION_TYPE column. The translation is as follows:
• Operation$ I becomes DI_OPERATION_TYPE I
• Operation$ D becomes DI_OPERATION_TYPE D
• Operation$ UO or UU becomes DI_OPERATION_TYPE B
• Operation$ UN becomes DI_OPERATION_TYPE U
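The translation and the paired sequence numbering can be pictured with a small Python sketch (illustrative only; this is not how the engine is implemented, and the sample rows are hypothetical):

```python
# Mapping documented above: Oracle Operation$ codes to DI_OPERATION_TYPE values.
OPERATION_MAP = {"I": "I", "D": "D", "UO": "B", "UU": "B", "UN": "U"}

def annotate(cdc_rows):
    """Assign DI_SEQUENCE_NUMBER and DI_OPERATION_TYPE as the guide describes:
    the counter starts at zero and a before/after image pair shares one number."""
    out, seq = [], 0
    for row in cdc_rows:
        op = OPERATION_MAP[row["Operation$"]]
        out.append({**row, "DI_SEQUENCE_NUMBER": seq, "DI_OPERATION_TYPE": op})
        if op != "B":               # the matching after-image reuses the number
            seq += 1
    return out

sample = [
    {"Operation$": "I",  "id": 1, "salary": 50000},
    {"Operation$": "UO", "id": 1, "salary": 50000},   # before-image
    {"Operation$": "UN", "id": 1, "salary": 55000},   # after-image
    {"Operation$": "D",  "id": 1, "salary": 55000},
]
for r in annotate(sample):
    print(r["DI_SEQUENCE_NUMBER"], r["DI_OPERATION_TYPE"])
```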

Configuring an Oracle CDC source
When you drag a CDC datastore table into a data flow, it automatically becomes a source object.

To configure a CDC table
1. Drag a CDC datastore table into a data flow.
2. Click the name of this source object to open its Source Table Editor.
3. Click the CDC Options tab.
4. Specify a value for the CDC subscription name. This value is required. Select from the list or create a new subscription.

There are three CDC table options in the Source Table Editor’s CDC Options tab:
• CDC subscription name: The name that marks a set of changed data in a continuously growing Oracle CDC table. Subscriptions are created in Oracle and saved for each CDC table. A subscription name is unique to a datastore, owner, and table name. For example, you can use the same subscription name (without conflict) with different tables in the same datastore if they have different owner names.
• Enable check-point: Enables Data Integrator to restrict CDC subscription reads using check-points. Once a check-point is enabled, the next time the CDC job runs, it reads only the rows inserted into the CDC table since the last check-point. After a job completes successfully, Data Integrator moves the check-point forward to mark the last row read. Business Objects recommends that you enable check-pointing for a subscription name in a production environment. For more information, see “Using check-points” on page 490.
• Get before-image for each update row: Allows a before-image and an after-image to be associated with an UPDATE row. By default, only after-images are retrieved. If you want to read before-images (for a CDC table set to capture them), enable this option. For more information, see “Using before-images” on page 490.

Using check-points
When a job in Data Integrator runs with check-pointing enabled, Data Integrator uses the source table’s subscription name to read the most recent set of appended rows. If you do not enable check-pointing, the job reads all the rows in the table, which increases processing time.

To use check-points, on the Source Table Editor, enter a name in the CDC Subscription name box and select the Enable check-point option.

Note: To avoid data corruption problems, do not reuse data flows that use CDC datastores, because each time a source table extracts data it uses the same subscription name. This means that identical jobs, depending upon when they run, can get different results and leave check-points in different locations in the table. When you migrate CDC jobs from test to production, a best practice is to change the subscription name for the production job so that the test job, if it ever runs again, will not affect the production job’s results.

Using before-images
If you want to retrieve the before-images of UPDATE rows, prior to when the update operation is applied to the target, Data Integrator can expand the UPDATE row into two rows: one row for the before-image of the update, and one row for the after-image of the update. The before-image of an update row is the image of the row before the row is changed, and the after-image refers to the image of the row after the change is applied.

The default behavior is that a CDC reader retrieves after-images only. By not retrieving before-images, fewer rows pass through the engine, which allows the job to execute in less time.

You can use before-images to:
• Update primary keys. However, under most circumstances, when source tables are updated, their primary keys do not need to be updated.
• Calculate change logic between data in columns. For example, you can calculate the difference between an employee’s new and old salary by looking at the difference between the values in salary fields (see the sketch that follows).
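For the salary example, here is a sketch of the column-level change logic (hypothetical rows; in a real job this logic would live in the data flow, not in Python):

```python
# Hypothetical CDC output for one UPDATE: a before-image (B) and an after-image (U)
# that share the same DI_SEQUENCE_NUMBER, as described above.
rows = [
    {"DI_SEQUENCE_NUMBER": 7, "DI_OPERATION_TYPE": "B", "emp_id": 42, "salary": 50000},
    {"DI_SEQUENCE_NUMBER": 7, "DI_OPERATION_TYPE": "U", "emp_id": 42, "salary": 55000},
]

# Collate the pair on the sequence number, then compare the old and new values.
pairs = {}
for r in rows:
    pairs.setdefault(r["DI_SEQUENCE_NUMBER"], {})[r["DI_OPERATION_TYPE"]] = r

for seq, images in pairs.items():
    if "B" in images and "U" in images:
        delta = images["U"]["salary"] - images["B"]["salary"]
        print(f"employee {images['U']['emp_id']}: salary changed by {delta}")
```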

When you want to capture before-images for update rows:
• At CDC table creation time, make sure the Oracle CDC table is also set up to retrieve full before-images. If you create an Oracle CDC table using Data Integrator Designer, you can select the Generate before-images check box to do this.
• Select the Get before-images for each update row option in the CDC table’s source editor. If the underlying CDC table is not set up properly, enabling the Get before-images for each update row option has no effect.
Once you select the Get before-images for each update row option, Data Integrator processes two rows for every update. In addition to the performance impact of this data volume increase, the before- and after-image pairs may be separated or lost depending on the design of your data flow, which would cause data integrity issues. The Map_CDC_Operation transform can resolve problems, but undesirable results may still occur due to programming errors. When using functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow (for example, due to the use of the group by or order by clauses in a query), be aware of the possible impact to targets.

Creating a data flow with an Oracle CDC source
To use an Oracle CDC source, you use a Query transform to remove the Oracle control columns and the Map_CDC_Operation transform to interpret the Data Integrator control columns and take appropriate actions.

To define a data flow with an Oracle CDC table source
1. From the Designer object library, drag the Oracle CDC table, Query, and Map_CDC_Operation transforms to the data flow workspace.
   Note: A data flow can only contain one CDC source.
2. Use the procedure in “Configuring an Oracle CDC source” on page 488 to configure the CDC table.
3. Add the appropriate target table and connect the objects.

4. In the Query Editor, map only the Data Integrator control columns and the source table columns that you want in your target table.
5. For an Oracle CDC source table, the DI_OPERATION_TYPE column is automatically selected as the Row operation column.
The Map_CDC_Operation transform uses the values in the column in the Row Operation Column box to perform the appropriate operation on the source row for the target table. The operations can be INSERT, DELETE, or UPDATE. For example, if the operation is DELETE, the corresponding row is deleted from the target table. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.
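The effect of these row operations on a target can be approximated with the following Python sketch (illustrative only; see the Reference Guide for the transform’s actual behavior and options):

```python
# Hypothetical changed-data rows after the Query transform, keyed by primary key "id".
cdc_rows = [
    {"DI_OPERATION_TYPE": "I", "id": 1, "name": "Ann"},
    {"DI_OPERATION_TYPE": "U", "id": 1, "name": "Ann Lee"},
    {"DI_OPERATION_TYPE": "D", "id": 1, "name": "Ann Lee"},
]

target = {}   # stand-in for the target table, keyed by primary key

for row in cdc_rows:
    op = row["DI_OPERATION_TYPE"]
    data = {k: v for k, v in row.items() if k != "DI_OPERATION_TYPE"}
    if op == "I":
        target[data["id"]] = data              # insert the new row
    elif op == "U":
        target[data["id"]] = data              # apply the after-image as an update
    elif op == "D":
        target.pop(data["id"], None)           # delete the corresponding target row
    # Before-images ("B") carry no direct target action in this simple sketch.

print(target)   # empty: the row was inserted, updated, then deleted
```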

Maintaining CDC tables and subscriptions
This section discusses purging CDC tables and dropping CDC subscriptions and tables.

Purging CDC tables
Periodically purge CDC tables so they do not grow indefinitely. Refer to your Oracle documentation for how to purge data that is no longer being used by any subscribers.

Dropping CDC subscriptions and tables
You can drop Oracle CDC tables and their subscriptions from the Datastore Explorer window in Data Integrator Designer.

To drop Oracle CDC subscriptions or tables
1. From the object library, right-click a CDC datastore and select Open.

2. In the Datastore Explorer window, click Repository Metadata.
3. Right-click a table and select CDC maintenance.
4. Choose one of the following:
   • Drop Subscription: This option opens the list of subscriptions you created in Data Integrator for the selected table. Oracle subscriptions are associated with these subscription names. Select each subscription name to drop it from Oracle and delete it from the Data Integrator repository.
   • Drop table: This option drops the Oracle CDC table and also deletes it from the Data Integrator repository.

Limitations
The following limitations exist when using CDC with Oracle sources:
• You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
  • Table_Comparison, Key_Generation, and SQL transforms
  • All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
• You can only create one CDC source in a data flow.
• Oracle CDC captures DML statements, including INSERT, DELETE, and UPDATE. However, Oracle CDC does not support the following operations because they disable all database triggers:
  • Direct-path INSERT statements
  • The multi_table_insert statement in parallel DML mode
• If you are using check-pointing and running your job in recovery mode, the recovered job begins reading again at the start of the CDC table; check-points are ignored.

Using CDC with DB2 sources
If your environment must keep large amounts of data current, the DB2 CDC feature is a simple solution for limiting the number of rows that Data Integrator reads on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads.

Data Integrator captures changed data on an IBM DB2 database server and applies it on-demand to a target system. The data uses the following path:
• The Q Capture program on the Replication Server sends data from the DB2 log to the WebSphere MQ Queue Manager.
• Data Integrator’s IBM Event Publisher (EP) adapter listens for messages from IBM WebSphere MQ.
• The IBM EP adapter converts the IBM Event Publisher messages into Data Integrator internal messages and sends the internal messages to different real-time jobs.

The input message is forwarded to multiple real-time services if they process data from some or all of the tables in the message.

Guaranteed delivery
To achieve guaranteed delivery, the IBM EP Adapter:
• Discards table data if there is no real-time service for it.
• Stops processing messages if one message fails.
• Stops processing messages if it cannot deliver them to an enabled real-time service.

Setting up DB2
Data Integrator supports reading changed data from DB2 Universal Database (UDB) on Windows and UNIX. To do this, Data Integrator connects to a stack of IBM products: use DB2 Information Integrator (II) Replication to create a pathway between a DB2 server log and IBM WebSphere MQ. DB2 II Replication publishes changed data from a variety of IBM sources, including DB2 UDB for Windows and UNIX. MQ is a peer-to-peer product, which means that the data must transfer from one MQ system to another MQ system. You can install everything on one computer if DB2 UDB, DB2 II Replication, the Data Integrator Job Server, and its IBM EP Adapter are all on that computer. If your configuration is spread over more than one computer, for example if DB2 and DB2 II Replication are on a different computer than Data Integrator, then install MQ on both computers.

The following steps summarize the procedure for setting up the source side and make recommendations for settings. They assume that DB2, DB2 II Replication, and MQ are on one computer while a second computer has Data Integrator and MQ installed.
• Configure MQ by creating two local queues (admin and restart) on the DB2 II Replication computer and one local queue (data) on the Data Integrator computer. Also define a remote data queue on the DB2 II Replication computer. For each local queue, select persistent as the value for Default Persistence and set the value for Max Message Length to match the size of a table row. A message must be able to hold at least one row; for example, specify 4000 for Max Message Length.
• In DB2 II Replication, enter the names of the queues (that you created in MQ) that will function as the administration and restart queues. The Administrator queue receives control and status messages from the Q Capture program. The Restart queue keeps track of where to start reading the DB2 recovery log after the Q Capture program restarts.
• Select the Event publishing view from the DB2 II Replication Center Launchpad and use the wizard to create Q Capture control tables. Specify a server: the Q Capture Server is the DB2 database that contains the source data, which will publish table data out to MQ. The Replication Center creates control tables on this server. Name a new schema for these tables; the schema name identifies the queue capture program and its unique set of control tables.
• Complete the Q Capture program configuration by specifying XML publishing options: the DB2 server name, the names of the tables that you want to publish, the names of the MQ queues, and the properties of the XML messages (for example, the Max Message Length; the value that you specify here should be less than or equal to the Max Message Length you set earlier for MQ).
• Choose to send both changed and unchanged columns to Data Integrator. This ensures that when there is a change in any column, Data Integrator receives the whole row in the request message. Note: Do not select any option for XML publication that restricts the message content to changed columns only.
• Start the Q Capture program.

For more detailed documentation about setting up IBM products on the source side, see the IBM publications:
• DB2 Information Integrator Replication and Event Publishing Guide and Reference V8.2
• WebSphere MQ for Windows: Quick Beginnings Version 5.3.1

Setting up Data Integrator
Data Integrator uses real-time services to read changed data from DB2. You can create real-time services from real-time jobs. The IBM EP Adapter reads messages from MQ, converts them to a format used by Data Integrator real-time services, and filters out data that is not required in the associated real-time job.

To use Data Integrator to read and load DB2 changed data, do the following:
• Install a Job Server.
• Using the Server Manager, enable the Job Server to support adapters. Adapters are installed automatically with all Job Servers, and a Job Server must be adapter-enabled to appear as a Job Server choice in the Create Datastore Editor.
• Open the Data Integrator Administrator and, under the Management node, add the Access Server and repository that will process DB2 CDC jobs.
• Using the CDC Services node in the Administrator, configure a real-time IBM Event Publisher (EP) Adapter instance. You must configure the IBM EP Adapter before you create a DB2 CDC datastore. Data Integrator allows one MQ queue and one repository per adapter instance name.
• Using the Designer:
  • Create a CDC datastore for DB2, which you use to import metadata to create real-time jobs
  • Import metadata for DB2 tables
  • Build real-time jobs using the metadata
• Using the CDC Services node in the Administrator:
  • Enable a real-time service
  • Start the IBM EP Adapter (this also starts its real-time services)
  • Monitor the real-time services and adapter

CDC Services
This section describes procedures that are unique to the DB2 CDC feature.

Uses CDC Services to create an IBM Event Publisher adapter instance
Unlike other Data Integrator supported adapters (for which you would create an adapter instance using the Administrator’s Adapter Instances node and create an adapter datastore connection using the Designer), you create an IBM Event Publisher adapter instance using the Administrator’s CDC Services node and configure a database DB2 CDC datastore in the Designer.

Does not use messages in real-time jobs
Unlike other Data Integrator real-time jobs, which require one message source and target, Data Integrator imports and processes a source in a DB2 CDC job as a regular table. In fact, if one or more DB2 CDC tables exist in a real-time job, Data Integrator does not support message sources or targets in that job.

Uses different connections for Import/Execute commands
Unlike other adapters, which Data Integrator uses to both import metadata for jobs and process those jobs, DB2 CDC metadata is imported directly from the DB2 database. Data Integrator imports regular table metadata with a DB2 CDC datastore connection and uses the IBM EP Adapter only to process the data.

Uses CDC Services to configure adapters and real-time jobs
Unlike other Data Integrator adapters and real-time services, which are configured separately under the Adapter Instances and Real-time > Access Server > Real-time Services nodes in the Administrator, you configure and monitor an IBM EP Adapter and its services as a CDC Service. To configure a CDC service, use the Real-time > Access Server > CDC Services node. Data Integrator automatically sets adapter parameters and provides shortcuts for configuring and monitoring the associated real-time services. For more information, see the Data Integrator Management Console: Administrator Guide.

CDC datastores
DB2 II control tables use the publish/subscribe model: DB2 II reads data from the DB2 log and publishes it to MQ, which pushes it out to applications like Data Integrator’s IBM EP Adapter using messages. Data Integrator allows you to import DB2 tables and create real-time jobs for maintaining changed data.

To gain access to DB2 CDC tables, create a DB2 CDC datastore using the Designer. A CDC datastore is a read-only datastore that can only access tables. Like other datastores, you can create, edit, and access a CDC datastore from the Datastores tab of the object library. Change-data tables are only available from DB2 UDB 8.x or higher. Before you can create a DB2 CDC datastore, you must create an IBM EP Adapter instance using the Administrator.

To create a CDC datastore for DB2
1. Enter a datastore name.
2. Select Database as the Datastore Type and DB2 as the Database Type.
3. Select a Database version.
4. Enter a Data source (use the name of the Replication server).
5. Enter a database User name and Password.

6. Select the Enable CDC check box. When you check this box, the Advanced options display.
7. Enter the name of the control table schema that you created for this datastore using DB2 II.
8. In the Event Publisher Configuration section of the Advanced options, enter the name of the Job Server (that manages the adapter instance) and the adapter instance name (that will access changed data). See the Data Integrator Management Console: Administrator Guide.
9. (Optional) Enter a name for a test file. If you configure a test file name here, Data Integrator runs the real-time job in test mode when you run it from the Designer. Test mode requires the full path to a test file on the Job Server computer. Data Integrator accepts a wild card in the file name (*.txt); every file matching the file name becomes an input message.

10. Click OK.
11. If you want to create more than one configuration for this datastore, click Apply, then click Edit and follow steps 6 through 8 again for any additional configurations.
You can use this datastore to import CDC tables.

Importing CDC data from DB2
To import CDC table metadata from DB2
1. Right-click the CDC datastore name in the object library and select Open, Import by Name, or Search. If you select Open, you can browse the datastore for existing CDC tables using the Datastore Explorer.
2. When you find the table that you want to import, right-click it and select Import.

Configuring a DB2 CDC source
When Data Integrator imports a CDC table, it adds four columns. Data Integrator preserves two columns from DB2 II:
• DI_Db2_TRANS_ISN (transaction sequence number)
• DI_DB2_TRANS_TS (time stamp)
Data Integrator generates the other two columns:
• DI_SEQUENCE_NUMBER
• DI_OPERATION_TYPE

The DI_SEQUENCE_NUMBER column
The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row, except when it encounters a pair of before- and after-images. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they become separated as a result of the data flow design.

You can configure DB2 II to create before- and after-images. If Data Integrator encounters before-images, it retrieves them before applying the after-image UPDATE operation to the target. For information about when to consider using before-images, see “Using before-images” on page 490. In addition to the performance impact of this data volume increase, the before- and after-image pairs could be separated or lost depending on the design of your data flow, which would cause data integrity issues. The Map_CDC_Operation transform can resolve problems, but undesirable results can still occur due to programming errors. When using functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow, be aware of the possible impact to targets. If during the course of a data flow the before- and after-images become separated or get multiplied into many rows (for example, using GROUP BY or ORDER BY clauses in a query), you can lose row order. The Map_CDC_Operation transform allows you to restore the original ordering of image pairs by using the DI_SEQUENCE_NUMBER column as its Sequencing column. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

The DI_OPERATION_TYPE column
Data Integrator generates values in the DI_OPERATION_TYPE column. Valid values for this column are:
• I for INSERT
• D for DELETE
• B for before-image of an UPDATE
• U for after-image of an UPDATE
Data Integrator receives each row from DB2 II as a message. It checks the tags in the message and translates them to Data Integrator values for the DI_OPERATION_TYPE column.

DB2 CDC tables
When you drag a CDC datastore table into a data flow, it automatically becomes a source object. There are no additional source table options for DB2 CDC tables.

To configure a DB2 CDC table
1. Drag a CDC datastore table into a data flow.
2. If you want to set a Join Rank for this table, click the name of this source object to open its Source Table Editor.

Limitations
The following limitations exist for this feature:
• You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
  • Table_Comparison, Key_Generation, and SQL transforms
  • All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
• You can only create one CDC source in a data flow.
• Checkpoints are not supported. Subscription, before-image, and checkpoint information is not configurable in Data Integrator as it is for Oracle CDC. Checkpoints allow Data Integrator to check whether a column value has changed since the last update; if the data has not changed, Data Integrator knows not to load it. With DB2 CDC, if there are no changes in a particular column, Data Integrator loads it regardless.
  However, you can limit the amount of data in a message because DB2 II lets you allow some columns to be published while disallowing others. Use the Replication Server’s Event Publishing options in DB2 II to select the tables and columns (subscribers) to be published. You can also use DB2 II to specify that you want to track before-images. This filtering step can limit update processing time. DB2 II publishes each row of a table as a message that contains data for all columns in the row; Data Integrator only requires that you publish all rows for the columns that you actually publish. To decrease load time, limit the number of columns that you publish. For example, if you have 10 columns in a table and 5 rows of data, you will have 5 messages with 10 values each, or 50 values to update. If you limit the number of columns to 4, then you will have 5 messages with 4 values each, or 20 values to update.
• The View Data feature is not supported for DB2 CDC tables.
• The embedded data flow feature is not supported.

Using CDC with Attunity mainframe sources
If your environment must keep large amounts of data current, the mainframe CDC feature is a simple solution for limiting the number of rows that must be read on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads.

Data Integrator captures changed data on Attunity mainframe data sources and applies it to a target system. The data takes the following path from Attunity CDC to Data Integrator:
• The Attunity CDC Agent monitors the database journal for changes to specific tables. After the first request to capture changes, the CDC agent stores a context that the agent uses as a marker so that it does not recapture changes prior to it.
• The CDC Agent sends the changed data to an optional staging area. The advantages of a staging area are:
  • A single journal scan can extract changes to more than one table. Without a staging area, multiple journal scans, one for each changed table, are required to extract changes.
  • Only committed changes are extracted, which is less processing than extracting every change.
  • Less processing also occurs during recovery of a failed job because the recovery process does not need to back out the uncommitted changes.

However, a staging area requires additional storage and processing overhead.
• Attunity Connect CDC sends the changes to the CDC data sources, through which Data Integrator can access the changes using standard ODBC or JDBC.

Setting up Attunity CDC
If you currently use Attunity as the connection to Data Integrator to extract data from mainframe sources, create an Attunity CDC data source in Attunity Studio. The following steps summarize the procedure for using the Attunity Studio wizard to create a CDC data source. Refer to the Attunity CDC documentation for details.
• Specify your data source.
• Based on your data source, choose one of the following methods to capture changes and specify the location of the journal:
  • VSAM under CICS: by CICS Log stream
  • DB2 on OS/390 and z/OS platforms: by DB2 Journal
  • DB2 on OS/400: by DB400 Journal
  • DISAM on Windows: by Journal
  For a complete list of supported data sources, see the Attunity Connect CDC document.
• Select a name for your CDC agent.
• Select the tables to monitor for changes.
• Specify whether you want to capture before-images for update operations. If you do not specify this option in Attunity Studio, you will not capture before-images even if you specify the Data Integrator option Get before-image for each update row.
• Decide where to generate the CDC data source. Attunity generates the CDC data source on the same computer as the CDC agent by default. You also have the option of placing the CDC data source on the client (the same computer as Data Integrator). Obtain the host name of this computer to specify in the Data Integrator option Host location.

The Attunity Studio wizard generates the following components that you need to specify on the Data Integrator Datastore Editor when you define an Attunity CDC datastore:
• A CDC data source name, which you specify in the Data Integrator option Data source.
• A workspace for the CDC agent to manage the change capture event queue. You specify the workspace name in the Data Integrator option Attunity workspace.

For more information, refer to the CDC setup chapter in Attunity Connect: The Change Data Capture Solution.

Setting up Data Integrator
To use Data Integrator to read and load changed data from mainframe sources using Attunity, do the following procedures in the Data Integrator Designer:
• Create a CDC datastore for Attunity
• Import metadata for Attunity tables
• Configure a mainframe CDC source
• Build real-time jobs using the metadata

Creating CDC datastores
The CDC datastore option is available for all mainframe interfaces to Data Integrator. Refer to the “Mainframe interface” section of Chapter 5: Datastores for a list of mainframe data sources and an introduction to creating database datastores.

To create a CDC datastore for Attunity
1. Open the Datastore Editor.
2. Enter a name for the datastore.
3. In the Datastore type box, select Database.
4. In the Database type box, select Attunity_Connector.

5. Check the Enable CDC box to enable the CDC feature. You can enable CDC for the following data sources (for the current list of data sources, refer to the Attunity web site):
   • VSAM under CICS
   • DB2 UDB for z/OS
   • DB2 UDB for OS/400
6. In the Data source box, specify the name of the Attunity CDC data source.
   You can specify more than one data source for one datastore, but you cannot join two CDC tables. You might want to specify multiple data sources in one Attunity datastore for easier management. If you can access all of the CDC tables through one Attunity data source, it is easier to create one datastore, enter the connection information once, and import the tables.
   If you list multiple data source names for one Attunity Connector datastore, ensure that you meet the following requirements:
   • Do not specify regular Attunity data sources with CDC data sources in the same Data Integrator datastore. Data Integrator imports data from regular Attunity data sources differently than from CDC data sources.
   • All Attunity data sources must be accessible by the same user name and password.
   • All Attunity data sources must use the same workspace. When you set up access to the data sources in Attunity Studio, use the same workspace name for each data source.
7. In the Host location box, specify the name of the host on which the Attunity data source daemon exists.
8. In the Port box, specify the Attunity daemon port number. The default value is 2551.
9. Specify the Attunity server workspace name that the CDC agent uses to manage the change capture event queue for the CDC data source.
10. Complete the rest of the dialog and click OK.
Once saved, this datastore becomes a CDC datastore. You can now use the new datastore connection to import metadata tables into the current Data Integrator repository.

Importing mainframe CDC data

After you create a CDC datastore, you can use it to import CDC table metadata. In the object library, right-click the datastore name and select Open, Import by Name, or Search. Functions and templates are not available because the Attunity CDC datastore is read-only. For mainframe CDC, only the CDC tables that you selected in the procedure “Setting up Attunity CDC” on page 506 are visible when you browse external metadata.

The Data Integrator import operation adds the following columns to the original table:

Column name           Data type     Source of column
DI_SEQUENCE_NUMBER    integer       Generated by Data Integrator
DI_OPERATION_TYPE     varchar(1)    Generated by Data Integrator
Context               varchar(26)   Supplied by Attunity Streams
Timestamp             varchar(4)    Supplied by Attunity Streams
TransactionID         varchar(12)   Supplied by Attunity Streams
Operation             varchar(64)   Supplied by Attunity Streams
tableName             varchar(128)  Supplied by Attunity Streams

The DI_SEQUENCE_NUMBER column

The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row except when it encounters a pair of before- and after-images. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they become separated as a result of the data flow design.

You can configure Attunity Streams to retrieve before-images of UPDATE rows before Data Integrator applies the UPDATE operation to the target. Note that if you do not configure Attunity Streams to capture before-images in the database, Data Integrator will discard the rows. For information about when to consider using before-images, see “Using before-images” on page 490.

If during the course of a data flow the before- and after-images become separated or get multiplied into many rows (for example, using GROUP BY or ORDER BY clauses in a query), you can lose row order.

The Map_CDC_Operation transform allows you to restore the original ordering of image pairs by using the DI_SEQUENCE_NUMBER column as its Sequencing column. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

The DI_OPERATION_TYPE column

Data Integrator generates values in the DI_OPERATION_TYPE column. Valid values for this column are:
• I for INSERT
• D for DELETE
• B for before-image of an UPDATE
• U for after-image of an UPDATE

Configuring a mainframe CDC source

When you drag a CDC datastore table into a data flow, it automatically becomes a source object.

To configure a mainframe CDC table
1. Drag a CDC datastore table into a data flow. The table automatically becomes a source object.
2. Click the name of this source object to open its Source Table Editor.
3. Click the CDC Options tab.
4. Specify a value for the CDC subscription name.

The Source Table Editor’s CDC Options tab shows the following three CDC table options:

CDC subscription name — A name that Data Integrator uses to keep track of the position in the continuously growing Attunity CDC table. Attunity CDC uses the subscription name to mark the last row read so that the next Data Integrator job starts reading the CDC table from that position. You can use multiple subscription names to identify different users who read from the same imported Attunity CDC table; Attunity CDC uses the subscription name to save the position of each user. Select from the list or type a new name to create a new subscription. A subscription name must be unique within a datastore, owner, and table name. For example, you can use the same subscription name (without conflict) with different tables that have the same name in the same datastore if they have different owner names. This field is required.

Enable check-point — Enables Data Integrator to restrict CDC reads using check-points. Once a check-point is placed, the next time the CDC job runs, it reads only the rows inserted into the CDC table since the last check-point. For more information, see “Using mainframe check-points” on page 511. By default, check-points are not enabled.

Get before-image for each update row — Some databases allow two images to be associated with an UPDATE row: a before-image and an after-image. If your source can log before-images and you want to read them during change-data capture jobs, enable this option. By default, only after-images are retrieved. For more information, see “Using before-images” on page 490.

Using mainframe check-points

Attunity CDC agents read mainframe sources and load changed data either into a staging area or directly into the CDC data source. Rows of changed data append to the previous load in the CDC data source.

When you enable check-points, a CDC job in Data Integrator uses the subscription name to read the most recent set of appended rows and to mark the end of the read. If check-points are not enabled, the CDC job reads all the rows in the Attunity CDC data source and processing time increases. To use check-points, on the Source Table Editor enter the CDC Subscription name and select the Enable check-point option.

If you enable check-points and you run your CDC job in recovery mode, the recovered job begins to review the CDC data source at the last check-point.

Note: To avoid data corruption problems, do not reuse data flows that use CDC datastores, because each time a source table extracts data it uses the same subscription name. This means that identical jobs, depending upon when they run, can get different results and leave check-points in different locations in the file. When you migrate CDC jobs from test to production, a best-practice scenario is to change the subscription name for the production job. Therefore, if the test job ever runs again, it does not affect the production job’s results.

Using before-images from mainframe sources

For an introduction to before- and after-images, see “Using before-images” on page 490. When you must capture before-image update rows:
• Make sure Attunity Streams is set up to retrieve full before-images. The underlying, log-based CDC capture software must be set up properly, otherwise enabling the Get before-images for each update row option in Data Integrator has no effect.
• Select the Get before-images for each update row option in the CDC table’s source editor.

After you check the Get before-images for each update row option, Data Integrator processes two rows for every update, and the before- and after-image pairs could be separated or lost depending on the design of your data flow, which would cause data integrity issues. The Map_CDC_Operation transform can resolve problems, but undesirable results can still occur due to programming errors. When you use functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow, be aware of the possible impact to targets.

Limitations

The following limitations exist for this feature:
• You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
  • Table_Comparison, Key_Generation, and SQL transforms
  • All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
• You can only create one CDC source in a data flow.

Using CDC with Microsoft SQL Server databases

If your environment must keep large amounts of data current, the CDC feature is a simple solution to limit the number of rows that must be read on a regular basis. A source that reads only the most recent operations (INSERTS, UPDATES, DELETES) allows you to design smaller, faster delta loads.

Overview of CDC for SQL Server databases

Data Integrator captures changed data on SQL Server databases and applies it to a target system. To capture changed data, Data Integrator interacts with SQL Replication Server. Microsoft uses the following terms for the SQL Replication Server:
• Article—An article is a table, a partition, or a database object that the DBA specifies for replication. An article can be any of the following:
  • An entire table
  • Certain columns (using a vertical filter)
  • Certain rows (using a horizontal filter)
  • A stored procedure or view definition
  • The execution of a stored procedure
  • A view
  • An indexed view
  • A user-defined function

• Distributor—The Distributor is a server that stores metadata, history data, and transactions into the distribution database. Data Integrator reads the distribution database to obtain changed data.
• Publisher—The Publisher is a server that makes data available for replication to other servers.
• Publication—A publication is a collection of one or more articles from one database. A publication makes it easier to specify a logically related set of data and database objects that you want to replicate together.
• Subscriber—A subscriber is a server that receives replicated data. Subscribers subscribe to publications, not to individual articles within a publication. They subscribe only to the publications that they need, not to all of the publications available on a Publisher.

The following diagram shows how the changed data flows from MS SQL Replication Server to Data Integrator:
• An application makes changes to a database, and the Publisher within the MS SQL Replication Server captures these changes within a transaction log.
• The Log Reader Agent in the Distributor reads the Publisher’s transaction log and saves the changed data in the Distribution database.
• Data Integrator obtains changed data from the Distribution database in the MS SQL Replication Server. Data Integrator reads the data from the command table within the Distribution database, applies appropriate filters, and creates input rows for a target data warehouse table.

Data Integrator accesses the following tables within the Distribution database:
• MSarticles—contains one row for each article that a Publisher replicates.
• MSpublisher_databases—contains one row for each Publisher and Publisher database pair that the local Distributor services.
• MSpublications—contains one row for each publication that a Publisher replicates.
• MSrepl_commands—contains rows of replicated commands (changes to data).

When you enable a database for replication, Replication Server creates tables on the source database. One of these tables is Sysarticles, which contains a row for each article defined in this specific database. One of the columns in Sysarticles indicates which columns in a source table are being published.

Setting up SQL Replication Server for CDC

If your Data Integrator currently connects to SQL Server to extract data, configure the Distribution database in SQL Replication Server to capture changes on these tables. The following steps summarize the procedure to configure SQL Replication Server for your SQL Server database.
• On the Replication node of the Microsoft SQL Enterprise Manager, select the Configure publishing, subscribers, and the Distribution option. Follow the wizard to create the Distributor and Distribution database. This MS SQL wizard generates the following components that you need to specify on the Data Integrator Datastore Editor when you define an SQL Server CDC datastore:
  • MSSQL distribution server name
  • MSSQL distribution database name
  • MSSQL distribution user name
  • MSSQL distribution password
• Select the New Publications option on the Replication node of the Microsoft SQL Enterprise Manager to create new publications that specify the tables that you want to publish. Data Integrator requires the following settings in the Advanced Options:
  • Select Transactional publication on the Select Publication Type window. This type updates data at the Publisher and sends changes incrementally to the Subscriber.
  • In the Commands tab of the Table Article Properties window:

    • If you want before images for UPDATE and DELETE commands, select XCALL. Otherwise, select CALL.
    • Clear the options Create the stored procedures during initial synchronization of subscriptions and Send parameters in binary format, because Data Integrator does not use stored procedures and has its own internal format.
  • In the Snapshot tab of the Table Article Properties window:
    • Select Keep the existing table unchanged because Data Integrator treats the table as a log.
    • Clear Clustered indexes because Data Integrator treats the table as a log and reads sequentially from it.
  • Specify a publication name and description. You specify this publication name on the Data Integrator Datastore Editor when you define an MSSQL CDC datastore.
  • Select option Yes, allow anonymous subscriptions to save all transactions in the Distribution database.

For more information, refer to the Microsoft SQL Enterprise Manager online help.

Setting up Data Integrator

To use Data Integrator to read and load changed data from SQL Server databases, do the following procedures on the Data Integrator Designer:
• Create a CDC datastore for SQL Server
• Import metadata for SQL Server tables
• Configure a CDC source

Creating CDC datastores

The CDC datastore option is available for SQL Server connections to Data Integrator. Refer to “Defining a database datastore” on page 85 for an introduction to creating database datastores.

To create a CDC datastore for SQL Server
1. Open the Datastore Editor.
2. Enter a name for the datastore.
3. In the Datastore type box, select Database.
4. In the Database type box, select Microsoft SQL Server.
5. Check the Enable CDC box to enable the CDC feature.


6. Select a Database version. Change-data tables are only available from SQL Server 2000 Enterprise.
7. Enter a Database name (use the name of the Replication server).
8. Enter a database User name and Password.

9. In the CDC section, enter the following names that you created for this datastore when you configured the Distributor and Publisher in the MS SQL Replication Server:

  • MSSQL distribution server name
  • MSSQL distribution database name
  • MSSQL publication name
  • MSSQL distribution user name
  • MSSQL distribution password

10. If you want to create more than one configuration for this datastore, click Apply, then click Edit and follow step 9 again for any additional configurations.
11. Click OK.


You can now use the new datastore connection to import metadata tables into the current Data Integrator repository.

Importing SQL Server CDC data
After you create a CDC datastore, you can use it to import CDC table metadata. In the object library, right-click the datastore name and select Open, Import by Name, or Search. Only the CDC tables that you selected in the procedure “Setting up SQL Replication Server for CDC” on page 515 are visible when you browse external metadata. Data Integrator uses the MSpublications and MSarticles tables in the Distribution database of SQL Replication Server to create a list of published tables. When you import each CDC table, Data Integrator uses the Sysarticles table in the Publisher database of SQL Replication Server to display only published columns. An imported CDC table schema might look like the following:

The Data Integrator import operation adds the following columns to the original table:

Column name             Data type     Source of column
DI_SEQUENCE_NUMBER      integer       Generated by Data Integrator
DI_OPERATION_TYPE       varchar(1)    Generated by Data Integrator
MSSQL_TRAN_SEQNO        varchar(256)  Supplied by SQL Replication Server
MSSQL_TRAN_TIMESTAMP    timestamp     Supplied by SQL Replication Server

The DI_SEQUENCE_NUMBER column
The DI_SEQUENCE_NUMBER column starts with zero at the beginning of each extraction. This field increments by one each time Data Integrator reads a row except when it encounters a pair of before- and after-images. Both the before- and after-images receive the same sequence number. This sequencing column provides a way to collate image pairs if they become separated as a result of the data flow design.

You can configure SQL Replication Server to retrieve before-images of UPDATE rows before Data Integrator applies the UPDATE operation to the target. Note that if you do not configure SQL Replication Server to capture before-images in the database, only after-images are captured by default. For information about when to consider using before-images, see “Using before-images” on page 490.

If during the course of a data flow the before- and after-images become separated or get multiplied into many rows (for example, using GROUP BY or ORDER BY clauses in a query), you can lose row order. The Map_CDC_Operation transform allows you to restore the original ordering of image pairs by using the DI_SEQUENCE_NUMBER column as its Sequencing column. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

The DI_OPERATION_TYPE column
Data Integrator generates values in the DI_OPERATION_TYPE column. Valid values for this column are:

• I for INSERT
• D for DELETE
• B for before-image of an UPDATE
• U for after-image of an UPDATE

Configuring a SQL Server CDC source
When you drag a CDC datastore table into a data flow, it automatically becomes a source object.


To configure a SQL Server CDC table
1. Drag a CDC datastore table into a data flow. The table automatically becomes a source object.
2. Click the name of this source object to open its Source Table Editor.
3. Click the CDC Options tab.
4. Specify a value for the CDC subscription name.

The Source Table Editor’s CDC Options tab shows the following three CDC table options:

CDC subscription name — A name that Data Integrator uses to keep track of the position in the continuously growing SQL Server CDC table. SQL Server CDC uses the subscription name to mark the last row read so that the next Data Integrator job starts reading the CDC table from that position. You can use multiple subscription names to identify different users who read from the same imported SQL Server CDC table; SQL Server CDC uses the subscription name to save the position of each user. Select from the list or type a new name to create a new subscription. A subscription name must be unique within a datastore, owner, and table name. For example, you can use the same subscription name (without conflict) with different tables that have the same name in the same datastore if they have different owner names. This value is required.

Enable check-point — Enables Data Integrator to restrict CDC reads using check-points. Once a check-point is placed, the next time the CDC job runs, it reads only the rows inserted into the CDC table since the last check-point. For more information, see “Using mainframe check-points” on page 511. By default, check-points are not enabled.


Get before-image for each update row — Some databases allow two images to be associated with an UPDATE row: a before-image and an after-image. If your source can log before-images and you want to read them during change-data capture jobs, enable this option. By default, only after-images are retrieved. For more information, see “Using before-images” on page 490.

Using check-points
A Log Reader Agent in SQL Replication Server reads the transaction log of the Publisher and saves the changed data into the Distribution database, which Data Integrator uses as the CDC data source. Rows of changed data append to the previous load in the CDC data source.

When you enable check-points, a CDC job in Data Integrator uses the subscription name to read the most recent set of appended rows and to mark the end of the read. If check-points are not enabled, the CDC job reads all the rows in the CDC data source and processing time increases. To use check-points, on the Source Table Editor enter the CDC Subscription name and select the Enable check-point option.

If you enable check-points and you run your CDC job in recovery mode, the recovered job begins to review the CDC data source at the last check-point.

Note: To avoid data corruption problems, do not reuse data flows that use CDC datastores, because each time a source table extracts data it uses the same subscription name. This means that identical jobs, depending upon when they run, can get different results and leave check-points in different locations in the file.

Using before-images from SQL Server sources
For an introduction to before- and after-images, see “Using before-images” on page 490. When you must capture before-image update rows:

• Make sure SQL Replication Server is set up to retrieve full before-images. When you create a Publication in SQL Replication Server, specify XCALL for UPDATE commands and DELETE commands to obtain before-images.

• Select the Get before-images for each update row option in the CDC table’s source editor.


SQL Replication Server must be set up properly, otherwise enabling the Get before-images for each update row option in Data Integrator has no effect.

After you check the Get before-images for each update row option, Data Integrator processes two rows for every update. In addition to the performance impact of this data volume increase, the before- and after-image pairs could be separated or lost depending on the design of your data flow, which would cause data integrity issues.

The Map_CDC_Operation transform can resolve problems, but undesirable results can still occur due to programming errors. When you use functions and transforms that re-order, re-direct, eliminate, and multiply the number of rows in a data flow, be aware of the possible impact to targets. For detailed information about the Map_CDC_Operation transform, see the Data Integrator Reference Guide.

Limitations
The following limitations exist for this feature:

• You cannot use the following transforms and functions with a source table imported with a CDC datastore because of the existence of the Data Integrator generated columns for CDC tables. Data Integrator cannot compare or search these columns.
  • Table_Comparison, Key_Generation, and SQL transforms
  • All database functions, such as lookup, lookup_ext, key_generation, sql, and total_rows
• You can only create one CDC source in a data flow.

Using CDC with timestamp-based sources
Use Timestamp-based CDC to track changes:
• If you are using sources other than Oracle 9i, DB2 8.2, mainframes accessed through IBM II Classic Federation, or mainframes accessed through Attunity, and
• If the following conditions are true:
  • There are date and time fields in the tables being updated
  • You are updating a large table that has a small percentage of changes between extracts and an index on the date and time fields
  • You are not concerned about capturing intermediate results of each transaction between extracts (for example, if a customer changes regions twice in the same day)


Business Objects does not recommend using the Timestamp-based CDC when:

• You have a large table, a large percentage of it changes between extracts, and there is no index on the timestamps.
• You need to capture physical row deletes.
• You need to capture multiple events occurring on the same row between extracts.

This section discusses what you need to consider when using source-based, time-stamped, changed-data capture:

• Processing timestamps
• Overlaps
• Types of timestamps

In these sections, the term timestamp refers to date, time, or datetime values. The discussion in this section applies to cases where the source table has either CREATE or UPDATE timestamps for each row.

Timestamps can indicate whether a row was created or updated. Some tables have both create and update timestamps; some tables have just one. This section assumes that tables contain at least an update timestamp. For other situations, see “Types of timestamps” on page 533.

Some systems have timestamps with dates and times, some with just the dates, and some with monotonically generated increasing numbers. You can treat dates and generated numbers the same.

It is important to note that for timestamps based on real time, time zones can become important. If you keep track of timestamps using the nomenclature of the source system (that is, using the source time or source-generated number), you can treat both temporal (specific time) and logical (time relative to another time or event) timestamps the same way.

Processing timestamps
The basic technique for using timestamps to determine changes is to save the highest timestamp loaded in a given job and start the next job with that timestamp. To do this, create a status table that tracks the timestamps of rows loaded in a job. At the end of a job, UPDATE this table with the latest loaded timestamp. The next job then reads the timestamp from the status table and selects only the rows in the source for which the timestamp is later than the status table timestamp.
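For reference, a minimal sketch of such a status table in SQL, using the single Last_Timestamp column from the example that follows (the table name, column type, and seed value are illustrative, not prescribed by the product):

   -- One-row table that remembers the highest timestamp loaded so far
   CREATE TABLE status_table (Last_Timestamp datetime);

   -- Seed it once with the timestamp of the initial load
   INSERT INTO status_table (Last_Timestamp) VALUES ('1998-01-01 13:10:00');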


The following example illustrates the technique. Assume that the last load occurred at 2:00 PM on January 1, 1998. At that time, the source table had only one row (key=1) with a timestamp earlier than the previous load. Data Integrator loads this row into the target table and updates the status table with the highest timestamp loaded: 1:10 PM on January 1, 1998. After 2:00 PM Data Integrator adds more rows to the source table:

Source table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM
2    Tanaka   01/01/98 02:12 PM
3    Lani     01/01/98 02:39 PM

Target table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM

Status table
Last_Timestamp
01/01/98 01:10 PM

At 3:00 PM on January 1, 1998, the job runs again. This time the job does the following:
1. Reads the Last_Timestamp field from the status table (01/01/98 01:10 PM).
2. Selects rows from the source table whose timestamps are later than the value of Last_Timestamp. The SQL command to select these rows is:

   SELECT * FROM Source
   WHERE 'Update_Timestamp' > '01/01/98 01:10 pm'

   This operation returns the second and third rows (key=2 and key=3).
3. Loads these new rows into the target table.
4. Updates the status table with the latest timestamp in the target table (01/01/98 02:39 PM) with the following SQL statement:

   UPDATE STATUS
   SET 'Last_Timestamp' = SELECT MAX('Update_Timestamp') FROM target_table

The target shows the new data:


Source table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM
2    Tanaka   01/01/98 02:12 PM
3    Lani     01/01/98 02:39 PM

Target table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM
2    Tanaka   01/01/98 02:12 PM
3    Lani     01/01/98 02:39 PM

Status table
Last_Timestamp
01/01/98 02:39 PM

To specify these operations, a Data Integrator data flow requires the following objects (and assumes all the required metadata for the source and target tables has been imported):

• A data flow to extract the changed data from the source table and load it into the target table:

Data flow: Changed data with timestamps
The query selects rows from SOURCE_TABLE to load to TARGET_TABLE. The query includes a WHERE clause to filter rows with older timestamps.

• A work flow to perform the following:
1. Read the status table
2. Set the value of a variable to the last timestamp
3. Call the data flow with the variable passed to it as a parameter
4. Update the status table with the new timestamp

Work flow: Changed data with timestamps

The script that reads the status table and sets the variable (steps 1 and 2):
$Last_Timestamp_var = sql('target_ds', 'SELECT to_char(last_timestamp, \'YYYY.MM.DD HH24:MI:SS\') FROM status_table');

The script that updates the status table with the new timestamp (step 4):
$Last_Timestamp_var = sql('target_ds', 'UPDATE status_table SET last_timestamp = (SELECT MAX(target_table.update_timestamp) FROM target_table)');

• A job to execute the work flow

Overlaps
Unless source data is rigorously isolated during the extraction process (which typically is not practical), there is a window of time when changes can be lost between two extraction runs. This overlap period affects source-based changed-data capture because this kind of data capture relies on a static timestamp to determine changed data. For example, suppose a table has 1000 rows (ordered 1 to 1000). The job starts with timestamp 3:00 and extracts each row. While the job is executing, it updates two rows (1 and 1000) with timestamps 3:01 and 3:02, respectively. The job extracts row 200 when someone updates row 1. When the job extracts row 300, it updates row 1000. When complete, the job extracts the latest timestamp (3:02) from row 1000 but misses the update to row 1.


Here is the data in the table:

Row Number   Column A
1            ...
2            ...
3            ...
...
200          ...
...
600          ...
...
1000         ...

Here is the timeline of events (assume the job extracts 200 rows per minute):

3:00   Start job extraction at row 1
3:01   Extract row 200; update row 1 (original row 1 already extracted)
3:02   Update row 1000
3:03   Extract row 600
3:05   Extract row 1000; job done

There are three techniques for handling this situation:
• Overlap avoidance
• Overlap reconciliation
• Presampling

The following sections describe these techniques and their implementations in Data Integrator. This section continues on the assumption that there is at least an update timestamp. For other situations, see “Types of timestamps” on page 533.

Overlap avoidance

In some cases, it is possible to set up a system where there is no possibility of an overlap. You can avoid overlaps if there is a processing interval where no updates are occurring on the target system. For example, if you can guarantee that the data extraction from the source system does not last more than one hour, you can run a job at 1:00 AM every night that selects only the data updated the previous day until midnight. While this regular job does not give you up-to-the-minute updates, it guarantees that you never have an overlap and greatly simplifies timestamp management.

Overlap reconciliation

Overlap reconciliation requires a special extraction process that reapplies changes that could have occurred during the overlap period. This extraction can be executed separately from the regular extraction. For example, if the highest timestamp loaded from the previous job was 01/01/98 10:30 PM and the overlap period is one hour, overlap reconciliation reapplies the data updated between 9:30 PM and 10:30 PM on January 1, 1998.

The overlap period is usually equal to the maximum possible extraction time. If it can take up to n hours to extract the data from the source system, an overlap period of n (or n plus some small increment) hours is recommended. For example, if it takes at most two hours to run the job, an overlap period of at least two hours is recommended.

There is an advantage to creating a separate overlap data flow. A “regular” data flow can assume that all the changes are new and make assumptions to simplify logic and improve performance. For example, rows flagged as INSERT are often loaded into a fact table, but rows flagged as UPDATE rarely are. Thus, the regular data flow selects the new rows from the source, generates new keys for them, and uses the database loader to add the new facts to the target database. Because the overlap data flow is likely to apply the same rows again, it cannot blindly bulk load them or it creates duplicates; therefore, the overlap data flow must check whether the rows exist in the target and insert only the ones that are missing. This lookup affects performance, so perform it for as few rows as possible. If the data volume is sufficiently low, you can load the entire new data set using this technique of checking before loading, avoiding the need to create two different data flows.
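As an illustration of the reconciliation pass described above, the extra extraction simply pushes the lower bound of the timestamp filter back by the overlap period. A sketch in SQL, using the one-hour overlap and timestamps from the example (table and column names are illustrative):

   -- Regular extraction: rows after the last loaded timestamp
   SELECT * FROM source_table
   WHERE Update_Timestamp > '01/01/98 10:30 PM';

   -- Overlap reconciliation: also reapply rows from the one-hour overlap window
   SELECT * FROM source_table
   WHERE Update_Timestamp > '01/01/98 09:30 PM';

Remember that the reconciliation flow must check whether each row already exists in the target and insert only the missing ones, as described above.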

Presampling

Presampling eliminates the overlap by first identifying the most recent timestamp in the system, saving it, and then extracting rows up to that timestamp.

The technique is an extension of the simple timestamp processing technique described previously in “Processing timestamps” on page 523. The main difference is that the status table now contains a start and an end timestamp. The start timestamp is the latest timestamp extracted by the previous job; the end timestamp is the timestamp selected by the current job.

To return to the example: the last extraction job loaded data from the source table to the target table and updated the status table with the latest timestamp loaded:

Source table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM
2    Tanaka   01/01/98 02:12 PM
3    Lani     01/01/98 02:39 PM

Target table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM

Status table
Start_Timestamp     End_Timestamp
01/01/98 01:10 PM   NULL

Now it’s 3:00 PM on January 1, 1998, and the next job runs. It does the following:

1. Selects the most recent timestamp from the source table and inserts it into the status table as the End Timestamp. The SQL command to select one row is:

   SELECT MAX(Update_Timestamp) FROM source_table
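Expressed directly in SQL, step 1 amounts to the following (a sketch only; the guide implements these steps with sql() calls in a work flow script, and the names are illustrative):

   UPDATE status_table
   SET End_Timestamp = (SELECT MAX(Update_Timestamp) FROM source_table);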

The status table becomes:

Status table
Start_Timestamp     End_Timestamp
01/01/98 01:10 PM   01/01/98 02:39 PM

2. Selects rows from the source table whose timestamps are greater than the start timestamp but less than or equal to the end timestamp. The SQL command to select these rows is:

   SELECT * FROM source_table
   WHERE Update_Timestamp > '1/1/98 1:10pm'
   AND Update_Timestamp <= '1/1/98 2:39pm'

   This operation returns the second and third rows (key=2 and key=3).
3. Loads these new rows into the target table.
4. Updates the status table by setting the start timestamp to the previous end timestamp and setting the end timestamp to NULL.

The table values end up as follows:

Source table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM
2    Tanaka   01/01/98 02:12 PM
3    Lani     01/01/98 02:39 PM

Target table
Key  Data     Update_Timestamp
1    Alvarez  01/01/98 01:10 PM
2    Tanaka   01/01/98 02:12 PM
3    Lani     01/01/98 02:39 PM

Status table
Start_Timestamp     End_Timestamp
01/01/98 02:39 PM   NULL

To enhance the previous example to consider the overlap time requires the following changes to the work flow:

• A data flow to extract the changes since the last update and before the most recent timestamp.

Data flow: Changed data with overlap
The query selects rows from SOURCE_TABLE to load to TARGET_TABLE. The query includes a WHERE clause to filter rows between timestamps.

• A work flow to perform the following:
1. Read the source table to find the most recent timestamp.
2. Set the value of two variables to the start of the overlap time and to the end of the overlap time, respectively.
3. Call the data flow with the variables passed to it as parameters.
4. Update the start timestamp with the value from end timestamp and set the end timestamp to NULL.

Work flow: Changed data with overlap

The scripts in the work flow (the callouts 1 through 4 in the original figure mark the four steps listed above):

$End_timestamp_var = sql('target_ds', 'SELECT MAX(update_stamp) FROM source_table');
$End_timestamp_var = sql('target_ds', (('UPDATE status_table SET end_timestamp = \'' || to_char($End_timestamp_var, 'yyyy-mm-dd hh24:mi:ss')) || '\''));
$Start_timestamp_var = sql('target_ds', 'UPDATE status_table SET start_timestamp = end_stamp');
$Start_timestamp_var = sql('target_ds', 'UPDATE status_table SET end_timestamp = \'\'');

Types of timestamps

Some systems have timestamps that record only when rows are created. Others have timestamps that record only when rows are updated. (Typically, update-only systems set the update timestamp when the row is created or updated.) Finally, there are systems that keep separate timestamps that record when rows are created and when they are updated.

This section discusses these timestamps:
• Create-only timestamps
• Update-only timestamps
• Create and update timestamps

Create-only timestamps

If the source system provides only create timestamps, you have these options:
• If the table is small enough, you can process the entire table to identify the changes.
• If the table never gets updated, you can extract only the new rows.
• If the table is large and gets updated, you can combine the following two techniques:
  • Periodically (for example, daily) extract only the new rows.
  • Less frequently (for example, weekly) extract the updated rows by processing the entire table.

Update-only timestamps

Using only an update timestamp helps minimize the impact on the source systems, but it makes loading the target systems more difficult. If the system provides only an update timestamp and there is no way to tell new rows from updated rows, your job has to reconcile the new data set against the existing data. The section “Using CDC for targets” on page 545 describes how to identify changes.

Create and update timestamps

Both timestamps allow you to easily separate new data from updates to the existing data. The job extracts all the changed rows and then filters unneeded rows using their timestamps.

Accomplish these extractions in Data Integrator by adding the WHERE clause from the following SQL commands into an appropriate query transform:
• Find new rows:
  SELECT * FROM source_table
  WHERE Create_Timestamp > $Last_Timestamp
• Find updated rows:
  SELECT * FROM source_table
  WHERE Create_Timestamp <= $Last_Timestamp AND Update_Timestamp > $Last_Timestamp

From here, the new rows go through the key-generation process and are inserted into the target, and the updated rows go through the key-lookup process and are updated in the target.

For performance reasons, you might want to separate the extraction of new rows into a separate data flow to take advantage of bulk loading into the target. The updated rows cannot be loaded by bulk into the same target at the same time.

Timestamp-based CDC examples

This section discusses the following techniques for timestamp-based CDC:
• Preserving generated keys
• Preserving history

Preserving generated keys

For performance reasons, many data warehouse dimension tables use generated keys to join with the fact table. For example, customer ABC has a generated key 123 in the customer dimension table. All facts for customer ABC have 123 as the customer key. Even if the customer dimension is small, you cannot simply reload it every time a record changes: unless you assign the generated key of 123 to the customer ABC, the customer dimension table and the fact tables do not correlate.

You can preserve generated keys by:
• Using the lookup function
• Comparing tables

Using the lookup function

If history preservation is not an issue and the only goal is to generate the correct keys for the existing rows, the simplest technique is to look up the key for all rows using the lookup function in a query. If you do not find the key, generate a new one.

In the following example, the customer dimension table contains generated keys. When you run a job to update this table, the source customer rows must match the existing keys.

Source customer table
Company Name  Customer ID
ABC           001
DEF           002
GHI           003
JKL           004

Target dimension table
Gen_Key  Company Name  Customer ID
123      ABC           001
124      DEF           002
125      GHI           003

This example data flow does the following:
1. Extracts the source rows.
2. Retrieves the existing keys using a lookup function in the mapping of a new column in a query.
3. Loads the result into a file (to be able to test this stage of the data flow before adding the next steps).

Data flow: Replace generated keys
(1) Source data without generated keys. (3) Source data with generated keys when they exist.

The lookup function compares the source rows with the target. The arguments for the function are as follows:

lookup function arguments    Description
target_ds.owner.customer     Fully qualified name of the target table containing the generated keys.
GKey                         The column name in the target table containing the generated keys.
NULL                         NULL value to insert in the key column if no existing key is found.
'PRE_LOAD_CACHE'             Caching option to optimize the lookup performance.
Customer_ID                  The column in the target table containing the value to use to match rows.
Customer_ID                  The column in the source table containing the values to use to match rows.

The resulting data set contains all the rows from the source with generated keys where available:

Result data set
Gen_Key  Company Name  Customer ID
123      ABC           001
124      DEF           002
125      GHI           003
NULL     JKL           004

Adding a new generated key to the new records requires filtering out the new rows from the existing and updated rows. In the data flow, this requires the following steps:
1. A query to select the rows with NULL generated keys.
2. A Key_Generation transform to determine the appropriate key to add.
3. A target to load the new rows into the customer dimension table.
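For readers more comfortable with SQL, the key retrieval performed by the lookup function above can be pictured as an outer join from the source to the target dimension. This is a sketch only — the data flow uses the lookup function, not this SQL — and the source table name is illustrative:

   SELECT t.GKey AS Gen_Key,      -- NULL when no existing key is found
          s.Company_Name,
          s.Customer_ID
   FROM   source_customer s
   LEFT OUTER JOIN target_ds.owner.customer t
          ON t.Customer_ID = s.Customer_ID;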

The data flow expands as follows:

Data flow: Adding new generated keys
(Customer dimension table—new rows with new generated keys)

This data flow handles the new rows; however, the rows from the source whose keys were found in the target table might contain updated data. Because this example assumes that preserving history is not a requirement, Data Integrator loads all rows from the source into the target. The data flow requires new steps to handle updated rows, as follows:
1. A new line leaving the query that looked up the existing keys.
2. A query to filter the rows with existing keys from the rows with no keys.
3. A target to load the rows into the customer dimension table.

The data flow expands as follows:

Data flow: Loading all rows into the target
(Customer dimension table: all rows with existing generated keys)

Comparing tables

The drawback of the generated-keys method is that even if the row has not been changed, it generates an UPDATE and is loaded into the target. If the amount of data is large, a table-comparison transform provides a better alternative by allowing the data flow to load only changed rows.

The table-comparison transform examines all source rows and performs the following operations:
• Generates an INSERT for any new row not in the target table.
• Generates an UPDATE for any row in the target table that has changed.
• Ignores any row that is in the target table and has not changed.
• Fills in the generated key for the updated rows.

This is the data set that Data Integrator loads into the target table. You can then run the result through the key-generation transform to assign a new key for every INSERT.

The data flow that accomplishes this transformation includes the following steps:
1. A source to extract the rows from the source table(s).
2. A query to map columns from the source.
3. A table-comparison transform to generate INSERT and UPDATE rows and to fill in existing keys.
4. A key-generation transform to generate new keys.
5. A target to load the rows into the customer dimension table.

Data flow: Load only updated or new rows

Preserving history

History preserving allows the data warehouse or data mart to maintain the history of data so you can analyze it over time. Most likely, you will perform history preservation on dimension tables.

For example, if a customer moves from one sales region to another, simply updating the customer record to reflect the new region would give you misleading results in an analysis by region over time because all purchases made by a customer before the move would incorrectly be attributed to the new region.

Data Integrator provides a special transform that preserves data history to prevent this kind of situation. The History_Preserving transform ignores everything but rows flagged as UPDATE. For these rows, it compares the values of specified columns and, if the values have changed, flags the row as INSERT. This produces a second row in the target instead of overwriting the first row.

To expand on how Data Integrator would handle the example of the customer who moves between regions:
• If Region is a column marked for comparison, the History_Preserving transform generates a new row for that customer.
• A Key_Generation transform gives the new row a new generated key and loads the row into the customer dimension table.
• The original row describing the customer remains in the customer dimension table with a unique generated key.

In the following example, one customer moved from the East region to the West region, and another customer’s phone number changed.

Source Customer table
Customer        Region   Phone
Fred's Coffee   East     (212) 123-4567
Jane's Donuts   West     (650) 222-1212
Sandy's Candy   Central  (115) 231-1233

Target Customer table
GKey  Customer        Region   Phone
1     Fred's Coffee   East     (212) 123-4567
2     Jane's Donuts   East     (201) 777-1717
3     Sandy's Candy   Central  (115) 454-8000

In this example, the data flow preserves the history for the Region column but does not preserve history for the Phone column. The data flow contains the following steps:
1. A source to extract the rows from the source table(s).
2. A query to map columns from the source.

3. A table-comparison transform to generate INSERTs and UPDATEs and to fill in existing keys.
4. A History_Preserving transform to convert certain UPDATE rows to INSERT rows.
5. A key-generation transform to generate new keys for the updated rows that are now flagged as INSERT.
6. A target to load the rows into the customer dimension table.

Data flow: Preserve history in the target

The resulting dimension table is as follows:

Target Customer table
GKey  Customer        Region   Phone
1     Fred's Coffee   East     (212) 123-4567
2     Jane's Donuts   East     (201) 777-1717
3     Sandy's Candy   Central  (115) 231-1233   (updated row)
4     Jane's Donuts   West     (650) 222-1212   (new row)

Because the Region column was set as a Compare column in the History_Preserving transform, the change in the Jane's Donuts row created a new row in the customer dimension table. Because the Phone column was not used in the comparison, the change in the Sandy's Candy row did not create a new row but updated the existing one.
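To make the effect concrete, the rows loaded into the target for this example correspond to the following SQL (a sketch; Data Integrator produces these operations through the transforms, the target table name is illustrative, and the key value 4 comes from the Key_Generation transform):

   -- Phone is not a compare column: the Sandy's Candy change updates the existing row
   UPDATE target_customer
   SET Phone = '(115) 231-1233'
   WHERE GKey = 3;

   -- Region is a compare column: the Jane's Donuts change becomes a new row
   INSERT INTO target_customer (GKey, Customer, Region, Phone)
   VALUES (4, 'Jane''s Donuts', 'West', '(650) 222-1212');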

Now that there are two rows for Jane's Donuts, correlations between the dimension table and the fact table must use the highest key value.

Note that updates to non-history preserving columns update all versions of the row if the update is performed on the natural key (for example, Customer), and only update the latest version if the update is on the generated key (for example, GKey). You can control which key to use for updating by appropriately configuring the loading options in the target editor.

Valid_from date and valid_to date

To support temporal queries like “What was the customer’s billing address on May 24, 1998?” Data Integrator supports Valid from and Valid to date columns.

In history-preserving techniques, there are multiple records in the target table with the same source primary key values. A record from the source table is considered valid in the dimension table for all date values t such that the Valid from date is less than or equal to t, which is less than the Valid to date. (Valid in this sense means that the record’s generated key value is used to load the fact table during this time interval.)

When you specify the Valid from and Valid to entries, the History_Preserving transform generates an UPDATE record before it generates an INSERT statement for history-preservation reasons (it converts an UPDATE into an INSERT). The UPDATE record will set the Valid to date column on the current record (the one with the same primary key as the INSERT) to the value in the Valid from date column in the INSERT record. This UPDATE statement updates the Valid to value.

Update flag

To support slowly changing dimension techniques, Data Integrator enables you to set an update flag to mark the current record in a dimension table. The value Set value in column Column identifies the current valid record in the target table for a given source table primary key.

When you specify Column, the History_Preserving transform generates an UPDATE record before it generates an INSERT statement. This UPDATE record will set the Column value to Reset value in the target table record with the same source primary key as the INSERT statement. In the INSERT statement the Column will be set to Set value.

When you specify entries in both the groups, the History_Preserving transform generates only one extra UPDATE statement for every INSERT statement it produces. This UPDATE statement updates the Valid to value.
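A temporal query of the kind mentioned above can then be answered directly from the dimension table. A sketch in SQL, assuming illustrative table and column names (customer_dim, billing_address, valid_from, valid_to) and the valid-from ≤ t < valid-to convention described above:

   SELECT billing_address
   FROM   customer_dim
   WHERE  Customer_ID = '001'
     AND  valid_from <= '1998-05-24'
     AND  valid_to   >  '1998-05-24';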

Additional job design tips

When designing a job to implement changed-data capture (CDC), you must consider:
• Header and detail synchronization
• Capturing physical deletions

Header and detail synchronization

Typically, source systems keep track of header and detail information changes in an independent way. For example, if a line-item status changes, its “last modified date” column updates, but the same column at the order header level does not update. Conversely, a change to the default ship-to address in the order header might impact none of the existing line items.

However, your source system might not consistently update those tracking columns, or you might not have access to such information (for example, when rows are physically deleted). In these cases, you might choose to extract all header and detail information whenever any changes occur at the header level or in any individual line item.

To extract all header and detail rows when any of these elements have changed, use logic similar to this SQL statement:

   SELECT … FROM HEADER, DETAIL
   WHERE HEADER.ID = DETAIL.ID
   AND (HEADER.LAST_MODIFIED BETWEEN $G_SDATE AND $G_EDATE
   OR DETAIL.LAST_MODIFIED BETWEEN $G_SDATE AND $G_EDATE)

For some databases, this WHERE clause is not well optimized and might cause serious performance degradation. You might opt to relax that clause by removing one of the upper bounds, such as in:

   … WHERE HEADER.LAST_MODIFIED BETWEEN $G_SDATE AND $G_EDATE
   OR DETAIL.LAST_MODIFIED >= $G_SDATE …

This might retrieve a few more rows than originally intended, but it might improve the final performance of your system while not altering the result of your target database.

Capturing physical deletions

When your source system allows rows to be physically deleted, your job should include logic to update your target database correspondingly. There are several ways to do this:

• Scan a log of operations — If your system logs transactions in a readable format or if you can alter the system to generate such a log, then you can scan that log to identify the rows you need to delete.
• Perform a full refresh — Simply reload all of the data, therefore fully synchronizing the source system and the target database.
• Perform a partial refresh based on a data-driven time-window — For example, suppose that the source system only allows physical deletion of orders that have not been closed. If the first non-closed order in your source table occurred six months ago, then by refreshing the last six months of data you are guaranteed to have achieved synchronization.
• Perform a partial refresh based on a business-driven time-window — For example, suppose that the business that the job supports usually deletes orders shortly after creating them. In this case, refreshing the last month of orders is appropriate to maintain integrity.
• Check every order that could possibly be deleted — You must verify whether any non-closed order has been deleted. To be efficient, this technique requires you to keep a record of the primary keys for every object that is a candidate for deletion (see the sketch after this section).

When physical deletions of detail information in a header-detail relationship are possible (for example, removing line items from an existing order), then you must capture these physical deletions when synchronizing header and detail information.

Using CDC for targets

Source-based changed-data capture is almost always preferable to target-based capture for performance reasons. Some source systems, however, do not provide enough information to make use of the source-based changed-data capture techniques. Target-based changed-data capture allows you to use the technique when source-based change information is limited.
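As a minimal sketch of the "check every order that could possibly be deleted" technique above: assume a staging table CANDIDATE_KEYS that records the primary keys of non-closed orders previously loaded. The table and column names (CANDIDATE_KEYS, SOURCE_ORDERS, ORDER_ID) are hypothetical, not objects that Data Integrator creates for you.

-- Deletion candidates that no longer exist in the source system
SELECT c.ORDER_ID
  FROM CANDIDATE_KEYS c
 WHERE NOT EXISTS (SELECT 1
                     FROM SOURCE_ORDERS s
                    WHERE s.ORDER_ID = c.ORDER_ID);

The rows returned drive the corresponding DELETE (or status update) against the target; maintaining CANDIDATE_KEYS is the record-keeping cost mentioned in the efficiency note above.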


Monitoring jobs

About this chapter

This chapter contains the following topics:

• Administrator
• SNMP support

Administrator

The Data Integrator Administrator is your primary monitoring resource for all jobs designed in the Data Integrator Designer. For detailed information, see the Data Integrator Administrator Guide.

SNMP support

Data Integrator includes an SNMP (simple network management protocol) agent that allows you to connect to third-party applications to monitor its jobs. You can use an SNMP-supported application to monitor Data Integrator job status and receive error events. Topics in this section include:

• About the Data Integrator SNMP agent
• Job Server, SNMP agent, and NMS application architecture
• About SNMP Agent's Management Information Base (MIB)
• About an NMS application
• Configuring Data Integrator to support an NMS application
• Troubleshooting

About the Data Integrator SNMP agent

When you enable SNMP (simple network management protocol) for a Job Server, that Job Server sends information about the jobs it runs to the SNMP agent. The SNMP agent monitors and records information about the Job Server and the jobs it is running. You can configure NMS (network management software) applications to communicate with the SNMP agent. Thus, you can use your NMS application to monitor the status of Data Integrator jobs.

The SNMP agent is a license-controlled feature of the Data Integrator Job Server. When you have a Data Integrator SNMP license on a computer, you can enable SNMP for any number of Job Servers running on that computer. Like Job Servers, SNMP starts when the Data Integrator Service starts.

Job Server, SNMP agent, and NMS application architecture

You must configure one Job Server to communicate with the SNMP agent and to manage the communication for SNMP-enabled Job Servers. You must also enable at least one Job Server for SNMP; this Job Server does not need to be the same one configured with the communication port. When you enable a Job Server for SNMP, it will send events to the SNMP agent via the Job Server with the communication port.

When you configure the SNMP agent, you specify one agent port and any number of trap receiver ports. The SNMP agent uses the agent port to communicate with NMS applications using UDP (user datagram protocol). The agent listens for requests from the NMS applications and responds to requests. The agent uses the trap receiver ports to send error events (traps or notifications) to NMS applications. The Data Integrator SNMP agent sends proactive messages (traps) to NMS applications; while you use an NMS application, traps notify you about potential problems.

About SNMP Agent's Management Information Base (MIB)

The Data Integrator SNMP agent uses a management information base (MIB) to store information about SNMP-enabled Job Servers and the jobs they run. Metadata for the Data Integrator MIB is stored in two files, which are located in the LINK_DIR/bin/snmp/mibs directory:

• BOBJ-ROOT-MIB.txt
• BOBJ-DI-MIB.txt

Consult these files for more detailed descriptions and up-to-date information about the structure of objects in the Data Integrator MIB.

The Data Integrator MIB contains five scalar variables and two tables. The scalar variables list the installed version of Data Integrator, the time the agent started, and the current system time. The tables contain information about the status of Job Servers and the jobs they run. Tables include:

Data Integrator MIB Job Server table

Column    Description
jsIndex   A unique index that identifies each row in the table
jsName    Name of Job Server
jsStatus  Status of the Job Server. Possible values are: notEnabled, initializing, optimizing, ready, proceed, wait, stop, stopRunOnce, stopRecovered, stopError, notResponding, error, warning, trace

Data Integrator MIB Job table

Column     Description
jobIdDate  The part of a job's identifier that matches the date
jobIdTime  The part of a job's identifier that matches the time
jobIdN     The final integer in a job's identifier
jobRowN    A unique index that identifies an object in a job
jType      Associated object for this row. Possible values include: job — A job; wf — A work flow; df — A data flow; error — An error message; trace — A trace message
jName      The name or identifier for this object, such as the job or work flow name or the error message identifier
jStatus    The status of the object. Possible values include: notEnabled, initializing, optimizing, ready, proceed, wait, stop, stopRunOnce, stopRecovered, stopError, notResponding, error, warning, trace
jRowsIn    Depends on the type of object: Data flow — The number of input rows read; Work flow — Always zero; Job — Sum of values for all data flows in the job; Error, warning, trace — Always zero

jRowsOut    Depends on the type of object: Data flow — The number of output rows written; Work flow — Always zero; Job — Sum of values for all data flows in the job; Error, warning, trace — Number of times that the error, warning, or trace has occurred during this job
jStatusTime The time when the object's jStatus, jRowsIn, or jRowsOut last changed
jExecTime   The number of milliseconds between the beginning of the object's execution and jStatusTime
jInitTime   The number of milliseconds necessary to compile the object (job, work flow, or data flow)
jMessage    For errors, warnings, or trace messages, jMessage contains the message text. For jobs, work flows, and data flows, either empty or an information message

The Data Integrator SNMP agent receives data about jobs and Job Servers from SNMP-enabled Job Servers and maintains this data in the Data Integrator MIB for currently running jobs and recently completed jobs. The MIB is stored in memory. To provide some historical context, each time the agent starts it loads data into the Job table for each Job Server. The data is from jobs that ran just before the Job Servers were shut down.

During configuration, you set a job lifetime and a maximum table size. The SNMP agent maintains the data for completed jobs for the specified lifetime. The agent summarizes and eliminates individual data flow and work flow records for completed jobs periodically to reduce the size of the MIB. The data that remains includes:

• One Job table row with the statistics for the entire job
• For a successful job, zero additional rows
• For a failed job, additional error rows as needed

If the MIB's size reaches the maximum table size, the agent eliminates 20 percent of the completed jobs, starting with the oldest jobs.

About an NMS application

An NMS application can query the Data Integrator SNMP agent for the information stored in the Data Integrator MIB (iso.org.dod.internet.private.enterprises.businessObjects.dataIntegrator) or one of the standard SNMP MIBs:

• iso.org.dod.internet.mgmt.mib-2.system
• iso.org.dod.internet.mgmt.mib-2.snmp
• iso.org.dod.internet.snmpv2.snmpModules

The agent listens on the agent port for commands from an NMS application, which communicates commands as PDUs (protocol data units). The agent responds to SNMP GetRequest, GetNextRequest, GetBulkRequest, and SetRequest commands that specify valid object identifiers (OIDs). Because there are no writable objects in the Data Integrator MIB, the agent gracefully rejects SetRequest commands for that MIB.

Note: Status for real-time services does not appear until the real-time services have processed enough messages to reach their cycle counts. Similarly, Data Integrator does not send traps for real-time jobs until the jobs have reached their cycle count. After the jobs reach their cycle count, Data Integrator refreshes status or sends additional traps.

While you use an NMS application, traps notify you about potential problems. Specifically, the agent sends an SNMPv2-Trap PDU to the SNMP ports that you have configured. The agent sends traps when:

• Errors occur during batch or real-time jobs
• Job Servers fail
• Agent starts
• Agent has an internal error
• Agent has an orderly shut down
• Agent restarts and the previous agent was unable to send a trap caused by a job error (these traps include a historical annotation)
  Note: This can occur if the machine fails unexpectedly or is halted without an orderly shutdown.
• Agent denies a request due to an authentication failure (if configured to do so)

See "Traps" on page 560.
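For illustration, an NMS-side query might look like the following sketch, which assumes the Net-SNMP command-line tools rather than any particular NMS product. The host name and community string are placeholders, the port is whatever you configured as the agent port (161 is the documented default), and the exact node names are defined in the MIB files listed earlier; consult BOBJ-DI-MIB.txt for the authoritative names.

# Make the Data Integrator MIB modules visible to the Net-SNMP tools
export MIBDIRS=+$LINK_DIR/bin/snmp/mibs
export MIBS=ALL

# Walk the Data Integrator subtree by name; the Job Server and Job tables
# described above (jsName, jsStatus, jStatus, jRowsIn, ...) appear here
snmpwalk -v 2c -c read_v2 di-host:161 BOBJ-DI-MIB::dataIntegrator

# Equivalent numeric form: walk the private enterprises subtree
snmpwalk -v 2c -c read_v2 di-host:161 .1.3.6.1.4.1

Because the Data Integrator MIB has no writable objects, an snmpset against it is rejected by the agent, as noted above.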

Configuring Data Integrator to support an NMS application

If you have an NMS application that monitors systems across your network using SNMP, you can use that application to monitor Data Integrator jobs. To do so:

• Select one Data Integrator Job Server on each computer to support SNMP communication. Exactly one Job Server must be configured to support adapters and SNMP communication.
• Enable SNMP on each Job Server that you want to monitor.
• Configure the Data Integrator SNMP agent on the same computer for which you configured Job Servers.
• Configure your NMS application to query the Data Integrator MIB for job status using the agent port. Refer to the documentation for the NMS application. If the Data Integrator SNMP agent does not respond to the NMS application (for example, if the application gets a time-out), check the agent configuration.

Note: Supporting SNMP communication and enabling SNMP are separate configurations. When you select a Job Server to support SNMP, you must also specify the communication port that connects the Job Server to the SNMP agent. When you enable SNMP for a Job Server, you are telling Data Integrator to send events to the SNMP agent via the communication Job Server on the same machine.

SNMP configuration in Windows

To select a Job Server to support SNMP communication
1. Open the Data Integrator Server Manager (Start > Programs > BusinessObjects Data Integrator 6.1 > Server Manager).
2. In the Data Integrator Server Manager, click Edit Job Server Config.
   The Job Server Configuration Editor opens. This window lists the Job Servers currently configured. The summary lists the Job Server name, the Job Server port, whether the Job Server supports adapters and SNMP communication, the communication port, and whether SNMP is enabled.
   If a Job Server is already configured to support adapters and SNMP communication, verify that the communication port is correct. Otherwise, continue with this procedure.
3. Select a Job Server and click Edit. If you want to add a new Job Server, click Add.

4. Select the Support adapter and SNMP communication check box.
5. In the Communication port box, enter the port you want to use for communication between the Data Integrator SNMP agent and Job Servers on this computer. Enter a port number that is not used for other applications. The default value is 4001. Data Integrator uses the same port to communicate with adapters.
6. Verify that the repositories associated with this Job Server are correct.
7. Do one of the following:
   • If you want to enable SNMP for the current Job Server:
     a. Select the Enable SNMP check box.
     b. In the Job Server Configuration Editor, select OK.
     c. Click OK.
     d. In the Server Manager window, select Restart.
   • If you want to configure the SNMP agent (including enabling SNMP for Job Servers), see "To configure the SNMP agent" on page 556.

To enable SNMP on a Job Server

You can enable SNMP for a Job Server from the Server Manager or from the SNMP agent. The SNMP agent allows you to enable or disable more than one Job Server at a time.

• To use the Server Manager, continue with this procedure.
• To use the SNMP agent, see "To configure the SNMP agent" on page 556 and skip to step 2 in that procedure.

1. Open the Data Integrator Server Manager (Start > Programs > BusinessObjects Data Integrator 6.1 > Server Manager).
2. In the Data Integrator Server Manager, click Edit Job Server Config.
   The Job Server Configuration Editor opens. This window lists the Job Servers currently configured. The summary indicates whether each Job Server is SNMP-enabled.

3. Select a Job Server and click Edit. If you want to add a new Job Server, click Add.
4. Select the Enable SNMP check box.
5. In the Job Server Configuration Editor, select OK.
6. Click OK.
7. To enable SNMP on additional Job Servers, repeat steps 4 through 6.
8. Click OK.
9. In the Server Manager window, select Restart.
   The Data Integrator Service restarts, which restarts the Job Servers using the new configuration information.

To configure the SNMP agent
1. Open the Data Integrator Server Manager (Start > Programs > BusinessObjects Data Integrator 6.1 > Server Manager).
2. In the Data Integrator Server Manager, click Edit SNMP Config.
   The SNMP Configuration Editor opens.
3. Select the Enable SNMP on this machine check box to enable the Data Integrator SNMP agent.
   After you enable the SNMP agent, you can modify current or default SNMP configuration parameters. SNMP-enabled Job Servers send the SNMP agent messages about job status and job errors.
4. Select a category and set configuration parameters for your SNMP agent. Parameter categories include:
   • Job Servers for SNMP — Job Servers enabled for SNMP on this machine
   • System Variables — Parameters that affect basic agent operation
   • Access Control, v1/v2c — Parameters that affect how the agent grants NMS applications using the v1 or v2c version of SNMP access to the Data Integrator MIB and Data Integrator supported standard MIBs
   • Access Control, v3 — Parameters that affect how the agent grants NMS applications using the v3 version of SNMP access to the Data Integrator MIB and Data Integrator supported standard MIBs
   • Traps — Parameters that determine where the agent sends trap messages
   For details on each parameter category, see the next section, "SNMP configuration parameters".
5. Click OK after you enter the correct configuration parameters for the Data Integrator SNMP agent.

6. In the Data Integrator Server Manager window, click Restart.
   The Data Integrator Service restarts, which restarts the Data Integrator SNMP agent using the new configuration information.

SNMP configuration parameters

Job Servers for SNMP

Use this category to select any number of Job Servers and enable SNMP for them. When SNMP is enabled for a Job Server, the SNMP agent maintains and reports data for jobs that run on that Job Server.

The editor lists each configured Job Server in one of two columns: Not enabled for SNMP or Enabled for SNMP. To enable SNMP for a Job Server that is not enabled, select that Job Server and click Enable. To disable SNMP for a Job Server that is enabled, select that Job Server and click Disable. Similarly, to enable SNMP for all the configured Job Servers, click Enable All; to disable SNMP for all configured Job Servers, click Disable All.

You can also enable or disable SNMP for individual Job Servers using the Job Server Configuration Editor. See "To enable SNMP on a Job Server" on page 555.

System Variables

Use this category to set parameters that affect the SNMP agent's operation.

Minimum SNMP version: Select the earliest version of SNMP that NMS applications will use to query the agent: v1, v2c, or v3. The security mechanism used by v1 is not robust, and trap messages sent by the agent are not compatible with v1. Business Objects recommends that you not use v1.
Note: Some NMS applications use v1 by default. If other devices or agents that the application monitors support v2c or v3, Business Objects recommends that you reconfigure the NMS application. If not, then you must set this value to v1.

Agent port: Enter the port at which the agent listens for commands (PDUs) from NMS applications and responds to those commands. The default port is 161, the standard SNMP input port.

System name: Enter the name of the computer. This text is reported to the NMS application.

System contact: Optional. Enter text that describes the person to contact regarding this system. Network monitors might contact this person to resolve identified problems. This text is reported to the NMS application.

System location: Optional. Enter text that identifies this computer, such as physical location information.

JobTable cache lifetime (in min): Enter the maximum number of minutes a job will remain in the Data Integrator MIB after the job completes. Default lifetime is 1440 (one day). The agent summarizes jobs (that is, eliminates individual data flow and work flow records) after one-eighth of a job's lifetime. The agent eliminates jobs completely after reaching the lifetime limit.

JobTable cache max size (in KB): Enter the maximum number of kilobytes that the agent can use for the Data Integrator MIB. The default is 819 (0.8 Megabytes), which will store approximately 1000 jobs. If the MIB reaches this size, the agent reduces the MIB by 20 percent by eliminating completed jobs, starting with the oldest jobs.

Access Control, v1/v2c

Use this category to enter the information that allows NMS applications using SNMP version v1 or v2c to access the MIBs controlled by the Data Integrator SNMP agent. The editor lists community names and the type of access permitted.

If an NMS application monitoring the Data Integrator SNMP agent uses SNMP version v1 or v2c, you must set the Minimum SNMP version to either v1 or v2c under the System Variables category.

To enable access for a new community
1. Click Add.
2. Select the type of access for this community:
   Read-only: Select Read-only to permit this community to read the Data Integrator MIB only. With this setting, this community is not permitted to send SetRequest commands to any MIB or GetRequest commands to a standard SNMP MIB for trees that contain security information such as community strings, user passwords, or encryption pass phrases.
   Read-write: Select Read-write to permit this community to send SetRequest commands for all read-write variables in any MIB and GetRequest commands for variables in any MIB. Remember that selecting this option gives the community the capability of reading and then modifying variables in the standard SNMP MIBs.

3. In Community name, enter the community name permitted to send requests to this agent. Typically, the administrator of the NMS application assigns the name. The NMS application includes this name in all requests to the agent. Names are case-sensitive and must be unique.
4. Click OK.

To edit a community's name or access
1. Select the community name and click Edit.
2. Change access type and community name as desired.
3. Click OK.

To delete access for a particular community
1. Select the community name.
2. Click Delete.

Access Control, v3

Use this category to enter the information that allows NMS applications using SNMP version v3 to access the MIBs controlled by the Data Integrator SNMP agent. The editor lists user names along with properties of each user.

To enable access for a new user
1. Click Add.
2. Enter appropriate information for the user:
   Read-only: Select Read-only to permit this user to read the Data Integrator MIB only. With this setting, this user is not permitted to send SetRequest commands to any MIB or GetRequest commands to a standard SNMP MIB for a tree that contains security information such as community strings, user passwords, or encryption passphrases.
   Read-write: Select Read-write to permit this user to send SetRequest commands for all read-write variables in any MIB and GetRequest commands for variables in any MIB. Remember that selecting this option gives the user the capability of reading and then modifying variables in the standard SNMP MIB.

   User name: Enter a name of a user to which the Data Integrator SNMP agent will respond. Typically, the administrator of the NMS application assigns the name. The NMS application includes this name in all requests to the agent. Names are case-sensitive. Each name must be unique.
   Password: Enter the password for the user. The password is case-sensitive.
   Confirm password: Re-enter the password to safeguard against typing errors.
3. Click OK.

To edit a user's name or access data
1. Select the user name and click Edit.
2. Change access data, user name, and password as desired.
3. Click OK.

To delete access for a user
1. Select the user name.
2. Click Delete.

Traps

Use this category to configure where to send traps. The editor lists the receivers of the trap messages sent by the Data Integrator SNMP agent. A receiver is an NMS application identified by a machine and port.

Select the Enable traps for authentication failures check box if you want the agent to send traps when requests fail due to authentication errors, such as incorrect passwords or community names, in addition to traps about job errors.

To add a new trap receiver
1. Click Add.
2. Enter identifying information about the trap receiver:
   Machine name: Enter the name of the computer or the IP address of the computer where the agent sends trap messages. This is a computer where an NMS application is installed.

   Port: Enter the port where the NMS application listens for trap messages. The default value is 162, the standard SNMP output port.
   Community name: Enter the community name that the NMS application expects in trap messages.
3. Click OK.

To change information for a trap receiver
1. Select the trap receiver and click Edit.
2. Update the identifying information about the trap receiver.
3. Click OK.

To delete a trap receiver
1. Select the trap receiver.
2. Click Delete.

SNMP configuration on UNIX

This section lists the procedures to configure SNMP on UNIX. For more detailed descriptions about the options mentioned here, see "SNMP configuration in Windows" on page 554.

To select a Job Server to support SNMP communication
1. Run the Server Manager. Enter:
   $ cd $LINK_DIR/bin/
   $ . ./al_env.sh
   $ ./svrcfg
   Note: The second command sets the environment variables before running the Server Manager.

2. Select option 2 to configure a Job Server.

   ** Data Integrator Server Manager Utility **
   1 : Control Job Service
   2 : Configure Job Server
   3 : Configure Runtime Resources
   4 : Configure Access Server
   5 : Configure Web Server
   6 : Configure SNMP Agent
   7 : Configure SMTP
   8 : Configure HACMP (AIX only)
   x : Exit
   Enter Option: 2

3. Enter option e: Edit a JOB SERVER entry.
4. Enter the serial number of the Job Server you want to work with when you see the following question:
   Enter serial number of Job Server to edit: 1
5. Enter a number that will be used as the SNMP communication port when you see the following question:
   Enter TCP Port Number for Job Server <S1> [19111]:
6. Enter 'y' when prompted with the following question:
   Do you want to manage adapters and SNMP communication for the Job Server 'Server1' 'Y|N' [Y]?:

7. The Job Server set to manage adapters or SNMP is marked with an asterisk and noted below the list of Job Servers on the Current Job Server Information screen:

   ** Current Job Server Information **
   S#   Job Server Name   TCP Port   Enable SNMP   Repository Connection
   1*   Server1           19111      Y             repo1@orasvr1
   2    Server2           19112      N             repo2@orasvr1
   *: JobServer <S1> supports adapter and SNMP communication on port: 19110

   c: Create a new JOB SERVER entry      a: Add a REPO to job server
   e: Edit a JOB SERVER entry            y: Resync a REPO
   d: Delete a JOB SERVER entry          r: Remove a REPO from job server
   u: UPDATE a REPO                      s: Set default REPO
   q: Quit
   Enter Option: q

   To exit the Server Manager, enter q, then enter x.

To enable SNMP on a Job Server
1. Run the Server Manager. Enter:
   $ cd $LINK_DIR/bin/
   $ . ./al_env.sh
   $ ./svrcfg
2. Select option 2 to configure a Job Server.
3. Enter option e: Edit a JOB SERVER entry.
4. Enter y when prompted with the following question:
   Do you want to Enable SNMP for this JobServer 'Y|N' [N]:
   To exit the Server Manager, enter q, then enter x.

To configure an agent
1. Run the Server Manager. Enter:
   $ cd $LINK_DIR/bin/
   $ . ./al_env.sh
   $ ./svrcfg

2. Select option 6: Configure SNMP Agent.

   ** Data Integrator Server Manager Utility **
   1 : Control Job Service
   2 : Configure Job Server
   3 : Configure Runtime Resources
   4 : Configure Access Server
   5 : Configure Web Server
   6 : Configure SNMP Agent
   7 : Configure SMTP
   8 : Configure HACMP (AIX only)
   x : Exit
   Enter Option: 6

3. One of these prompts appears, based on the current configuration:
   • SNMP is Disabled for this installation of Data Integrator [ D = Keep Disabled / E = Enable ]? :
   • SNMP is Enabled for this installation of Data Integrator [ E = Keep Enabled / D = Disable ]? :
   Once you enable SNMP for Data Integrator, the SNMP configuration menu appears.

The following is a sample SNMP configuration menu screen:

   SYSTEM VARIABLES
   ----------------
   Minimum SNMP Version: v2c
   System Name: hpsrvr3
   System Contact: sysadmtr
   System Location:
   JobTable Cache Lifetime: 1440 (in min)
   JobTable Cache Max Size: 819 (in KB)
   Default Port: 4961

   Access Control, v1/v2c
   ----------------------
   Read Community string : read_v2
   Write Community string : write_v2

   Access Control, v3
   ------------------
   Read User : read_v3
   Write User : write_v3

   SNMP TRAPS
   ----------
   Authentication Traps : Enabled
   Agent is configured to send trap to: aixserver3:20162 with Community String trap_co

   1 - Modify System Variables
   2 - Access Control, v1/v2c
   3 - Access Control, v3
   4 - Modify Traps
   X - Exit to previous Menu
   Enter Option:

• To modify system variables, choose 1. Values for each variable display. Press ENTER to keep the default values or enter new values for each variable.

• To modify SNMP v1 and v2c community names, choose 2. A submenu displays. At the prompt, either enter a new value or press RETURN to keep the original value.

   Access Control, v1/v2c
   ----------------------
   Read Community string : read_v2
   R - Add READ Community String
   W - Add WRITE Community String
   D - Delete Community String
   S - Save Changes
   X - Exit without save to Previous Menu
   Enter Option:

• To modify SNMP v3 USER names, choose 3. A submenu displays. At the prompt, either enter a new value or press RETURN to keep the original value.

   Access Control, v3
   ------------------
   Read User : read_v3
   R - Add READ User
   W - Add WRITE User
   D - Delete User
   S - Save Changes
   X - Exit without save to Previous Menu
   Enter Option:

• To modify TRAP setup, choose 4. A submenu displays. At the prompt, either enter a new value or press RETURN to keep the original value.

   TRAP SETUP
   ----------
   Authentication Traps : Enabled
   Agent is configured to send trap to: aixserver3:20162 with Community String trap_co
   A - Add TRAP receiver
   D - Delete TRAP receiver
   E - Enable sending traps on Authentication failures
   F - Disable sending traps on Authentication failures
   S - Save Changes
   X - Exit without save to Previous Menu
   Enter Option:

   Use this menu to:
   • Enable/disable authentication traps.

   • Configure a trap receiver (host name and port number on which the trap receiver is listening, and community string the trap receiver will use to authenticate the traps).

   After you enter "S", all additions and deletions display and you are prompted to confirm them. A "+" indicates newly added names. A "-" indicates deleted names.

Troubleshooting

To troubleshoot the Data Integrator SNMP agent
1. Check that you are using a valid v1/v2c community name or v3 user name.
   The SNMP agent does not reply to unauthorized requests; unauthorized requests will fail due to a time-out. To determine whether the agent regards a request as unauthorized:
   a. Under the Traps category of the SNMP Configuration Editor:
      • Set a trap receiver
      • Select the Enable traps for authentication failures check box
   b. Restart the Data Integrator SNMP agent.
   c. Inspect the output of the trap receiver for authorization traps.
2. Increase the SNMP agent's time-out.
   Repeat until the agent responds or the time-out exceeds 20 seconds. If the agent does not respond and the time-out is more than 20 seconds, revert to the original time-out setting and try the next step.
3. Verify that the agent and the NMS application are using compatible versions of SNMP.
   Under System Variables in the SNMP Configuration Editor, the Minimum SNMP version must not be greater than the version of the messages that the NMS application sends. For example, if the NMS application sends messages using SNMP version v2c or v3, you can set the Minimum SNMP version to v2c. If the versions are incompatible, change the NMS application setting or the configuration of the Data Integrator SNMP agent.
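A quick way to exercise steps 2 and 3 from the command line is a single request with an explicit SNMP version, community string, and time-out. This is a sketch that assumes the Net-SNMP tools are available on the NMS host; the host name, port, and community string are placeholders for your own values.

# -t sets the time-out in seconds, -r the number of retries; sysUpTime.0 is a
# standard scalar from the mib-2 system group that the agent supports
snmpget -v 2c -c read_v2 -t 20 -r 0 di-host:161 SNMPv2-MIB::sysUpTime.0

If the request succeeds with a generous time-out but fails with the default, the agent is reachable and the NMS application's time-out simply needs to be raised. If it never answers, suspect an unauthorized community name or an SNMP version mismatch rather than a network problem.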

Note: Some NMS applications use v1 by default. If other devices or agents that the NMS application monitors support v2c or v3, Business Objects recommends that you reconfigure the NMS application. If not, then you must configure the Data Integrator SNMP agent to accept version v1 commands.

4. Check errors in the SNMP error log: "installation directory"/bin/snmp/snmpd.log. Use the Server Manager to resolve errors.
5. Check that encryption pass phrases are accessible.
   Encryption passphrases are locked to a given SNMP engine ID based on the IP address of the computer running the SNMP agent. If you change the IP address of that computer, you must re-create all your SNMP users in the Data Integrator Server Manager.
6. Contact Business Objects technical support. In your defect report, include:
   • The exact NMS command you are trying to run and the exact error message
   • Name and version of the NMS application you are running and the version of SNMP protocol that application uses
   • Copies of the following four files from the "installation directory"/bin/snmp directory:
     snmp.conf
     snmpd.conf
     snmpd.p
     snmpd.log

364 B blanks. 367 definition 363 audit rule Boolean expression examples 365 defining 369 definition 363 audit trace 371 Audit window. Data Quality 379 ASCII format files 136 attributes creating 56 deleting 56 object 53 setting values 55 table. business name 451 audit function data types 364 definition 363 enabling in embedded data flow 373 error row count statistics 363 errow row count statistics 364 good row count statistics 363.This document is part of a SAP study on PDF usage. 365 generating 364 removing 369 resolving invalid 376 audit notification defining 371 ways to generate 365 audit point defining 366. Oracle 69 business name table attribute 451 Data Integrator Designer Guide 569 . datastores 111 AL_JobServerLoadBalanceDebug 329 AL_JobServerLoadOSPolling 329 annotations adding 60 deleting 60 resizing 60 using 45 application metadata 111 arechitecture. Index Symbols $ See dollar signs . See semicolons ’ See single quotation marks audit label definition 363 editing 364. Find out how you can participate and help to improve our documentation. description 366 auditing data flow description 362 enabling for job execution 371 guidelines 371 viewing failed rule 377 viewing results 377 viewing status 378 auto correct loading 461 A about Data Integrator command 39 Access Server message processing 254 real-time jobs 256 specifying for metadata reporting 66 Adapter Data Exchange Timeout 329 Adapter SDK 112 Adapter Start Timeout 329 adapters. converting to NULL for Oracle loader 69 Bridge to Business Objects metadata 447 browsing adapter datastore 114 datastores 90 metadata 90 bulk loading.

when to use 473 target-based 475. adapter 112 file formats 139 jobs 74. using with sources 522 using with mainframes 505 using with Oracle 475–495 using with SQL Server 513. system configurations configuring Data Profiling Servers 336 connecting objects 45 contents command 39 converting data types 94 Copy command 35 copying objects 53 creating data flows 176 datastores for databases 85 datastores. 545 timestamp-based. Index C caching comparison tables 194 lookup tables 194 source tables 194 Calculate Usage Dependency 443 Calculating usage dependencies 443 Callable from SQL statement check box 99 calling reusable objects 31 calls to objects in the object library 31 catch editor 209 catch. See try/catch blocks central repositories. defined 474 timestamp-based. extractiong and parsing XML to 247 comments 51. processing 466–470 recovering 454–466 testing 329 data cache. selecting over ERP for data 275–280 data flow.This document is part of a SAP study on PDF usage. database requirements 479 changed-data capture 505 overview 472 source-based. displaying 69 century change year 69 change-data capture Oracle. 513 close command 34 closing windows 47 columns. examples 535. 75 projects 73 repository 22 while loop 206 work flows 201 current Job Server 66 current schema 189 custom functions displaying 36 object library access 49 saving scripts 212 Cut command 35 CWM 446 D data capturing changes 472 loading 179 missing or bad values 467 problems with. 211 Compact Repository command 34 concat_date_time function 166 conditionals defining 204 description 202 editor 203–205 example 203 using for manual recovery 466 configurations See also datastore configurations. overview 473 source-based. Find out how you can participate and help to improve our documentation. distributing 192 data flows accessible from tool palette 43 adding sources 180 connecting objects in 45 creating from object library 176 defining 176 defining parameters 302 description 172 designing for automatic recovery 461–463 embedded 283 executing only once 456 execution order 191 570 Data Integrator Designer Guide . 543 timestamp-based. auditing 194 data flow.

changing 89 requirements for database links 110 Sybase 26 using multiple configurations 80 date formats 152 DB2 logging in to repository 25 using with parameterized SQL 85 Debug menu 37 Debug options Interactive Debugger 418 View Data 404 View Where Used 398. viewing 90 options.This document is part of a SAP study on PDF usage. sorting 90 objects in. implementing 80 database connection. converting 94 database links and datastores 110 defined 109 requirements for datastores 110 databases changes to. Windows 336 Data Quality 378 Data Quality datastore 381 Data Quality projects 383 Data quality. implementing in datastore 80 connections to datastore. 191 Data Integrator Designer window 32 objects 30 Data options 69 Data Profiler. executing only once 178. 404 debugger 418 limitations 436 managing filters and breakpoints 432 setting filters and breakpoints 419 show/hide filters and breakpoints 434 starting and stopping 424 tool bar options 413 using the View Data pane 429 windows 427 debugging scripts 212 decimals 151 declarative specifications 191 default configuration 118 default Job Server. 100 memory 102 multiple configurations. purpose 115 object library access 49 objects in. operation codes in 174 data transformations in real-time jobs 254 data types. View Data 356 data sets. 175 passing parameters to 175 resizing in workspace 46 sources 178 steps in 173 targets 179 in work flows 173 data flows in batch jobs. 75 nested tables 216 objects 31 Data Integrator Designer Guide 571 . adapter 112 jobs 74. changing 89 persistent cache 106 properties. data generated 335 Data Profiling Server configuring. setting 66 defining conditionals 204 data flows 176 datastores for databases 85 datastores. adapter 112 description 80 exporting 119 importing metadata 94. Find out how you can participate and help to improve our documentation. overview 87 datastores adapters 111 and database links 110 browsing metadata 91 connecting to databases 85 custom 90 database changes. setting 303 parameters in 172. Index in jobs 173 object library access 49 parameter values. changing 89 default configuration 118 defining. changing 89 datastore editor.

opening 181 transform. specifying files during 157. system setting 36 object descriptions. description 328 error logs. maximum number 69 environment variables 314 ERP system reducing direct requests to 281 selecting over data cache 275–280 error log files. 304 while loops 206 Degree of parallelism. processing 466–470 572 Data Integrator Designer Guide . description 324 errors catching and correcting 208 categories 210 data. Global_DOP.This document is part of a SAP study on PDF usage. object level 57 descriptions. system level 57 ending lines in scripts 211 engine processes. importing 96 DTD See also XML messages format of XML message 232 metadata. description of 50 query 189 script 212 table. Find out how you can participate and help to improve our documentation. Index parameters 302 projects 73 reusable objects 51 try/catch blocks 209 variables 302. missing values in 470 disabling object descriptions 59 disconnecting objects 45 Display DI Internal Jobs 330 distinct rows and nested data 243 document type definition. overview 185 embedded data flow audit points not visible 374 creating by selecting objects within a data flow 286 definition 284 enabling auditing 372 embedded data flows description 283 troubleshooting 293 enabling descriptions. 58 displaying 58 editing 59 enabling system setting 36 hiding 58 resizing 58 using 45 viewing 57 design phase. See DTD dollar signs ($) in variable names 211 variables in parameters 304 domains importing automatically 68 metadata. importing 232 object library access 49 duplicate rows 243 E Edit menu 35 editing schemas 94. displaying 69 monitoring job execution 68 options 66 port 67 schema levels displayed 67 window 32 dimension tables. 158 Designer central repositories. opening 187 transform. 114 editor catch 209 conditional 203–205 file format 137 object. setting 330 Delete command Edit menu 35 Project menu 34 DELETE operation code 175 deleting annotations 60 lines in diagrams 45 objects 62 reusable objects from diagrams 63 descriptions adding 58.

metadata exchange 447 external tables. 230 new 140. setting 330 functions application 280 concat_date_time 166 contrasted with transforms 184 editing 94. variables in 156. missing dimension values 470 Field ID 159 file format file transfers using a custom program 160 file format editor editing file format 148 modes 137 navigation 138 specifying multiple files 149 work areas 137 file formats creating 139 creating from a table 147 date formats for fields 152 delimited 141 editing 137. specifying 157. 230 multi-byte characters in name 156 reading multiple 149 reading multiple XML files 228 specifying in Data Integrator 139 filtering to find missing or bad values 467 formats 49. creating 148 target. 49 FROM clause for nested tables 237–239 FTP Number of Retry 330 FTP. 158 number 151 object library access 49 overview 136 reading multiple files 149 replicating 145 sample files. Index debugging object definitions 320 messages 320 not caught by try/catch blocks 208 sample solutions 209 severity 320 exceptions See also try/catch blocks automatic recovery and 454 available 210 categories 210 implementing handling 208 sample solutions 209 try/catch blocks and 210 exchanging metadata 446 executing jobs data flows in 191 immediately 321 work flows in 200 execution enabling recovery for 455 order in data flows 191 order in work flows 199 unsuccessful. using 143 source. creating 148 variables in name 314 Web log. 114 metadata imported 95 scripts. example 167 Web logs 165 file transfers 160 Custom transfer program options for flat files 162 files delimited 136 design phase. recovering from 454–466 Exit command 34 exporting files. metadata. 158 fixed-width 136 identifying source 150.This document is part of a SAP study on PDF usage. Find out how you can participate and help to improve our documentation. using in 211 WL_GetKeyValue 166 word_ext 166 G global variables creating 304 viewing 305 Data Integrator Designer Guide 573 . 148 file names. viewing 92 F fact tables. 157. 158. 314 fixed width 141 identifying source 150. connection retry interval.

copying 325 lookup function generating keys. using for 535–539 J Job Server associating with repository 23 default 66 default.This document is part of a SAP study on PDF usage. creating 22 log files statistics 328 viewing during job execution 325 logging in DB2 25 Designer 22–27 Oracle 24 repository version 23 SQL Server 25 Sybase 26 logs. 100 History_Preserving transform 541 I icons. ignoring a specific type 330 importing metadata adapters 114 into datastore 96. See conditionals Ignore Reduced Msg Type 330 Ignore Reduced Msg Type_fooSAP R/3 reduced message type. Find out how you can participate and help to improve our documentation. importing 94. changing options for 329 options. 100 DTD 232 using Metadata Exchange 447 information messages 320 INSERT operation code 175 intermediate results. running in 457–458 resizing in workspace 46 stopping. ignoring a specific type 330 if/then/else in work flows. Index graphical user interface See Designer H Help menu 39 hiding object descriptions 59 hierarchies. enabling 455 recovery mode. setting 303 recovery mode. connecting to 81 lines connecting objects in diagrams 45 ending script 211 Linked datastores 109 linked datastores 110 loading data changed data 472 objects 179 local object library 47 local repository. 49 IDoc reduced message type. See data sets object library access 49 objects in 74 organizing complex 74 parameter values. displaying names 42. from the Designer 323 testing 318. metadata. 321–323 troubleshooting file sources feeding 2 queries 330 validation 68 K Key_Generation transform 535 L legacy systems. ignoring 330 reduced message type. under Tools > Options 69 SNMP configuration 549 job server LoadBalanceDebug option 329 LoadOSPolling option 329 jobs creating 74 debugging execution 324–329 defining 75 executing 321 monitoring 68 M mainframes 574 Data Integrator Designer Guide .

opening on job execution 68 multi-user owner renaming 128 using aliases and owner renaming 126 N naming conventions. deleting 56 attributes. setting 55 calling existing 52 connecting 45 copying 53 Data Integrator 30 defining 52 descriptions 57 editors 50 imported. 114 changes in. information imported 94 Universe Builder 447. characters displayed 67 naming 53 Data Integrator Designer Guide 575 . viewing 91. 100. creating 102. creating with 201 objects annotations 59 attributes. Index connecting to 81 using changed-data capture 505 memory datastores. 107 memory tables 102 create row ID option 104 creating 103. 449 metadata exchange exporting a file 447 importing a file 447 Microsoft SQL Server. 114 reporting 66 tables. determining 92 exchanging files with external applications 446 external tables. removing duplicate rows 243 nested tables creating 240 FROM clause 237–239 in real-time jobs 257 SELECT statement 236 unnesting data 243–246 unnesting for use in transforms 246 viewing structure 218 New command 34 NMS application. 92 functions. information imported 95 imported tables. viewing 90 in jobs 74 names. viewing 91 importing 96. creating 56 attributes. objects 76 naming objects 53 nested data. Find out how you can participate and help to improve our documentation.This document is part of a SAP study on PDF usage. logging in to repository 25 MIMB 446 Monitor tab 323 monitor. 107 script functions for 105 troubleshooting 105 update schema option 104 menu bar 33 menus Debug 37 Edit 35 Help 39 Project 34 Tools 36 Validation 38 View 35 Window 38 messages See also real-time jobs error 320 information 320 warning 320 metadata analysis categories 443 application 111. relationship to SNMP agent 548 NORMAL operation code 175 number formats 151 O object library creating reusable object in 51 deleting objects from 62 local 47 objects available in 49 opening 48 tabs in 49 work flows.

converting blanks 69 logging into repository 24 package 98 troubleshooting parallel data flows 330 using changed-data capture 475–495 output schema. limiting 67 query transforms compared to SQL SELECT statements 191 output schema. See sources real-time jobs Access Server. Data Quality 383 projects defining 73 definition 72 object library access 49 propagating schema edits 94. requirement for 254 adding to a project 263 576 Data Integrator Designer Guide . Find out how you can participate and help to improve our documentation. auto filling 188 overview 187 in real-time jobs 257 quotation marks. importing procedure from 98 palette. with database links 110 persistent cache datastore 106 Persistent cache datastores 106 ports. 535 improving. filling automatically 188 output window. See tool palette Palette command 35 parameters in data flows 172. Designer 67 preload SQL commands job recovery and 461 recovery. single 211. Index properties. 114 properties definition 30 object. displaying 36 overflow files 466–467 P package.This document is part of a SAP study on PDF usage. viewing 53 versus options 30 pushing operations to database stored procedure restrictions 99 Q query editor description 189 schema tree. 175 dates as 187 default 67 defining 303 example 175 passing automatically 67 passing to data flows 175 setting values passed in 303 syntax for values 304 times as 187 Paste command 35 PeopleSoft. 304 R reading data. importing metadata 96 performance changed-data capture and 472. listing of 175 options Designer 66 versus properties 30 Options window 66 Oracle bulk loading. viewing 53 relationships 31 relationships among 31 renaming 53 reusable 30 searching for 63 single-use 31 sorting in object library 90 OCI Server Attach Retry 330 Open command 34 opening projects 73 operation codes. using for 461–463 pre-packaged adapters 112 preserving history changed-data capture and 473 Print command 34 print function 213 Print Setup command 34 project area 41 project menu 34 Project.

272–274 testing 270 transactional loading 268 record length field 159 recovery. relationship to 48 Oracle. 461– 463 results saved 458 starting 457 try/catch block and 459 variables for 462–463 work flows. automatic for batch jobs data flows. logging in 24 storing object definitions in 31 versions 23 Reset Users window 26 reusable objects calling existing 52 creating 51 defining 51 deleting from object library 62 list of 49 reusing 31 saving 62 single definition 31 storing 31 run. choosing 275–280 compared to batch jobs 255 creating 263 description 255 examples 257–258 message types 256–257 message-response processing 254 RFC calls in 280 sources. correcting 457 overview 454 preload SQL commands. executing once 178 enabling 455 executing path during 458 failures. ignoring 330 Save All command 34 Save command 34 saving projects 74 reusable objects 62 scripts 212 scaling workspace 46 schemas changes in imported data 92 editing 94. using for 461. using for 463 recursions. in work flows 199 Refresh command 36 renaming objects 53 replicating file format templates 145 objects 53 repository creating 22 DB2. using for 461 data flows in batch jobs. executing once 201. Find out how you can participate and help to improve our documentation. using for 466 designing work flows for 463–466 status table. associating with 23 Microsoft SQL Server. logging in 25 object library. executing once 456 recovery. manual conditionals. Index branching 275–280 cached data or ERP data. automatic auto correct load option. See execution S SAP R/3 reduced message type. logging in 25 Job Server. limiting 67 script editor 212 scripting language 313 scripts adding from tool palette 43 debugging 212 elements of 211 examples 211 saving 212 syntax 211 writing 212 searching for objects 63 secondary index information for tables 93 SELECT statements Data Integrator Designer Guide 577 . 456 work flows.This document is part of a SAP study on PDF usage. 114 levels displayed 67 tree elements in editor. specifying as unit 456–457 recovery. supplementary 267.

(cont.) equivalent in Data Integrator 191; for nested tables 236
semicolons (;) in scripts 211
simple network management protocol 548
single quotation marks ('): in scripts 211; string values in parameters 304
single-use objects: description 31; list of 42
SNMP: enable for a Job Server on UNIX 563; enable for a Job Server on Windows 555
SNMP agent: configuration and architecture 549; configure on UNIX 563; configure on Windows 556; defined 548; events and NMS commands 553; real-time jobs and cycle count 553; relationship to Job Server 549; relationship to MIB 550; relationship to NMS application 548; status of jobs and Job Servers, defined 560; traps, defined 550; troubleshooting 567
SNMP agent parameters 556–561: Access Control, v1/v2c, defined 558; Access Control, v3, defined 559; Job Servers for SNMP, defined 556; System Variables, defined 557; Traps, defined 557
sources: data flows 178; editor, opening 181; files 139
Splitter Optimization 330
SQL Server: log in 25; using changed-data capture 513
statistics log files, description 328
statistics logs, description 324
Status Bar command 35
status table 463
steps: in data flows 173; in jobs 198; in work flows 198, 199
stored procedures: restriction on SQL 99; viewing metadata for 91
storing reusable objects 31
strings: comments in 211; in scripts 211
Sybase: datastore 26; log in 26
syntax: debugging object definitions 319; values in parameters 304
system configurations: creating 132; defining 131; displaying 36; exporting 133
system variables. See environment variables

T
table comparison 475
Table_Comparison transform: changed data capture, using for 539
tables: adding to data flows as sources 180; caching for comparisons 194; caching for inner joins 194; caching for lookups 194; domains, importing automatically 68; editing 94, 100; external, viewing metadata 92; importing domains 68; importing metadata 94, 114; loading in single transaction 268; memory 102; metadata imported 94; schema, determining changes in 92; template 181
target-based changed-data capture 474
targets: changed-data capture 545; data flows 178; evaluating results 324, 329; files 139; generating keys 535; overflow files for 466; preserving history 540
template tables: converting to tables 182–184; using 181
testing real-time jobs 270
Timestamp-based changed-data capture with sources: overview 523; create and update 534; create-only 534; examples 524; overlaps 527; overlaps, avoiding 529; overlaps, reconciling 529; presampling 530; processing timestamps 523; sample data flows and work flows 525; update-only 534
tool palette: defining data flows with 177; description 42; displaying 35; work flows, creating with 201
toolbar 39
Toolbar command 35
Tools menu 36
trace log files, description 327
trace logs: description 324; open on job execution 325
transactional loading 268
transforms: contrasted with functions 184; editors 185; inputs 187; and nested data 246; object library access 49; query 187; in real-time jobs 254; schema tree, limiting in editor 67
trees, PeopleSoft metadata, importing 96
tries. See try/catch blocks
true/false in work flows. See conditionals
try/catch blocks: automatic recovery restriction 459; catch editor 209; defining 209; description 208; example 209; from tool palette 43

U
Undo command 35
Universe Builder 447, 449: create or update a universe 449; mappings between repository and universe data types 450; metadata mappings between a repository and universe 450
unnesting data in nested tables 243–246
UPDATE operation code 175
UseDomainName 331
UseExplicitDatabaseLinks 331
user interface. See Designer
users, resetting 26

V
validating jobs before execution 68
Validation menu 38
variables: environment 314; as file names for lookup_ext function 314; as file names for sources or targets 67, 314; file names, rules for 313; global 304–313; global, rules for 313; in names of file formats 314; linking to parameters 67; local 300–304; local, using in 156; overview 296–298; recovery, use for 462–463; in scripts 211; system 314
Variables and Parameters window, using 298
versions, repository 23
view data 356, 409: tool bar options 413; using with while loops 208
View menu 35
View where used, Designer option 398, 404: selecting before deleting an object 62

W
warning messages 320
Web logs: Data Integrator support for 165; overview 165
Where used, Designer option 398, 404: overview 404; set sample size 404
while loops: defining 206; design considerations 205; view data 208
Window menu 38
windows: closing 47; Options 66
WL_GetKeyValue function 166
word_ext function 166
work flows: adding parameters 303; calling other work flows 199, 200, 201; conditionals in 202; connecting objects in 45; connecting steps 199; creating 201; data flows in 173; defining parameters 302; defining variables 302; description 198; designing for automatic recovery 461–463; designing for manual recovery 463–466; example 200; executing only once 200, 456; execution order 199; from tool palette 43; independent steps in 199; multiple steps in 200; object library access 49; parameter values, setting 303; purpose of 198; in R/3 data flows 302; recovering as a unit 456–457; resizing in workspace 46; scripts in 211; steps in 198; try/catch blocks in 208; variables, passing automatically 67
workspace: annotations 45; arranging windows 47; characters displayed 67; closing windows 47; description 44; descriptions in 45; scaling 46
writing data 179

X
XML data, extracting and parsing to columns 247
XML files: editing 94, 114; reading multiple files 228; as targets 271
XML messages 219: editing 94, 114; sample for testing 270; viewing schema 266
XML Schema: importing metadata 94, 100; object library access 49
XML source editor, specifying multiple files 228

Y
years, interpreting two digits 69


