SAP BusinessObjects Data Services Designer Guide

SAP BusinessObjects Data Services XI 3.2 SP1 (12.2.1)

Copyright

© 2009 SAP AG. All rights reserved. SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP Business ByDesign, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects S.A. in the United States and in other countries. Business Objects is an SAP company. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary. These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

2009-10-24

Contents

Chapter 1  Introduction
    Welcome to SAP BusinessObjects Data Services
    Welcome
    Documentation set for SAP BusinessObjects Data Services
    Accessing documentation
    SAP BusinessObjects information resources
    Overview of this guide
    About this guide
    Who should read this guide

Chapter 2  Logging into the Designer
    Creating a repository
    Associating the repository with a Job Server
    Entering repository login information
    Version restrictions
    Oracle login
    Microsoft SQL Server login
    IBM DB2 login
    Sybase ASE login
    Resetting users

Chapter 3  Designer User Interface
    Objects
    Reusable objects
    Single-use objects
    Object hierarchy
    Designer window
    Menu bar
    Project menu
    Edit menu
    View menu
    Tools menu
    Debug menu
    Validation menu
    Dictionary menu
    Window menu
    Help menu
    Toolbar
    Project area
    Tool palette
    Workspace
    Moving objects in the workspace area
    Connecting objects
    Disconnecting objects
    Describing objects
    Scaling the workspace
    Arranging workspace windows
    Closing workspace windows
    Local object library
    To open the object library
    To display the name of each tab as well as its icon
    To sort columns in the object library
    Object editors
    Working with objects
    Creating new reusable objects
    Changing object names
    Viewing and changing object properties
    Creating descriptions
    Creating annotations
    Copying objects
    Saving and deleting objects
    Searching for objects
    General and environment options
    Designer — Environment
    Designer — General
    Designer — Graphics
    Designer — Central Repository Connections
    Data — General
    Job Server — Environment
    Job Server — General

Chapter 4  Projects and Jobs
    Projects
    Objects that make up a project
    Creating a new project
    Opening existing projects
    Saving projects
    Jobs
    Creating jobs
    Naming conventions for objects in jobs

Chapter 5  Datastores
    What are datastores?
    Database datastores
    Mainframe interface
    Defining a database datastore
    Configuring data sources used in a datastore
    Changing a datastore definition
    Browsing metadata through a database datastore
    Importing metadata through a database datastore
    Memory datastores
    Persistent cache datastores
    Linked datastores
    Adapter datastores
    Defining an adapter datastore
    Browsing metadata through an adapter datastore
    Importing metadata through an adapter datastore
    Web service datastores
    Defining a web service datastore
    Browsing WSDL metadata through a web service datastore
    Importing metadata through a web service datastore
    Creating and managing multiple datastore configurations
    Definitions
    Why use multiple datastore configurations?
    Creating a new configuration
    Adding a datastore alias
    Functions to identify the configuration
    Portability solutions
    Job portability tips
    Renaming table and function owner
    Defining a system configuration

Chapter 6  File formats
    What are file formats?
    File format editor
    Creating file formats
    To specify a source or target file
    To create a new file format
    Modeling a file format on a sample file
    Replicating and renaming file formats
    To create a file format from an existing flat table schema
    To create a specific source or target file
    Editing file formats
    To edit a file format template
    To edit a source or target file
    Change multiple column properties
    File format features
    Reading multiple files at one time
    Identifying source file names
    Number formats
    Ignoring rows with specified markers
    Date formats at the field level
    Parallel process threads
    Error handling for flat-file sources
    Creating COBOL copybook file formats
    To create a new COBOL copybook file format
    To create a new COBOL copybook file format and a data file
    To create rules to identify which records represent which schemas
    To identify the field that contains the length of the schema's record
    Creating Microsoft Excel workbook file formats on UNIX platforms
    To create a Microsoft Excel workbook file format on UNIX
    File transfers
    Custom transfer system variables for flat files
    Custom transfer options for flat files
    Setting custom transfer options
    Design tips
    Web log support
    Word_ext function
    Concat_date_time function
    WL_GetKeyValue function

Chapter 7  Data Flows
    What is a data flow?
    Naming data flows
    Data flow example
    Steps in a data flow
    Data flows as steps in work flows
    Intermediate data sets in a data flow
    Operation codes
    Passing parameters to data flows
    Creating and defining data flows
    To define a new data flow using the object library
    To define a new data flow using the tool palette
    To change properties of a data flow
    Source and target objects
    Source objects
    Target objects
    Adding source or target objects to data flows
    Template tables
    Converting template tables to regular tables
    Adding columns within a data flow
    To add columns within a data flow
    Propagating columns in a data flow containing a Merge transform
    Lookup tables and the lookup_ext function
    Accessing the lookup_ext editor
    Example: Defining a simple lookup_ext function
    Example: Defining a complex lookup_ext function
    Data flow execution
    Push down operations to the database server
    Distributed data flow execution
    Load balancing
    Caches
    Audit Data Flow overview

Chapter 8  Transforms
    Transform configurations
    To create a transform configuration
    To add a user-defined field
    To add transforms to data flows
    Transform editors

Chapter 9  Query transform overview
    Query editor
    To change the current schema
    To modify output schema contents
    To add a Query transform to a data flow

Chapter 10  Data Quality transforms overview
    Data Quality editors
    Associate, Match, and User-Defined transform editors
    Ordered options editor
    To add a Data Quality transform to a data flow

Chapter 11  Work Flows
    What is a work flow?
    Steps in a work flow
    Order of execution in work flows
    Example of a work flow
    Creating work flows
    To create a new work flow using the object library
    To create a new work flow using the tool palette
    To specify that a job executes the work flow one time
    Conditionals
    To define a conditional
    While loops
    Design considerations
    Defining a while loop
    Using a while loop with View Data
    Try/catch blocks
    Defining a try/catch block
    Categories of available exceptions
    Example: Catching details of an error
    Scripts
    To create a script
    Debugging scripts using the print function

Chapter 12  Nested Data
    What is nested data?
    Representing hierarchical data
    Formatting XML documents
    Importing XML Schemas
    Specifying source options for XML files
    Mapping optional schemas
    Using Document Type Definitions (DTDs)
    Generating DTDs and XML Schemas from an NRDM schema
    Operations on nested data
    Overview of nested data and the Query transform
    FROM clause construction
    Nesting columns
    Using correlated columns in nested data
    Distinct rows and nested data
    Grouping values across nested schemas
    Unnesting nested data
    Transforming lower levels of nested data
    XML extraction and parsing for columns
    Sample Scenarios

Chapter 13  Real-time Jobs
    Request-response message processing
    What is a real-time job?
    Real-time versus batch
    Messages
    Real-time job examples
    Creating real-time jobs
    Real-time job models
    Using real-time job models
    To create a real-time job
    Real-time source and target objects
    To view an XML message source or target schema
    Secondary sources and targets
    Transactional loading of tables
    Design tips for data flows in real-time jobs
    Testing real-time jobs
    Executing a real-time job in test mode
    Using View Data
    Using an XML file target
    Building blocks for real-time jobs
    Supplementing message data
    Branching data flow based on a data cache value
    Calling application functions
    Designing real-time applications
    Reducing queries requiring back-office application access
    Messages from real-time jobs to adapter instances
    Real-time service invoked by an adapter instance

Chapter 14  Embedded Data Flows
    Overview of embedded data flows
    Example of when to use embedded data flows
    Creating embedded data flows
    Using the Make Embedded Data Flow option
    Creating embedded data flows from existing flows
    Using embedded data flows
    Separately testing an embedded data flow
    Troubleshooting embedded data flows

Chapter 15  Variables and Parameters
    Overview of variables and parameters
    The Variables and Parameters window
    To view the variables and parameters in each job, work flow, or data flow
    Using local variables and parameters
    Parameters
    Passing values into data flows
    To define a local variable
    Defining parameters
    Using global variables
    Creating global variables
    Viewing global variables
    Setting global variable values
    Local and global variable rules
    Naming
    Replicating jobs and work flows
    Importing and exporting
    Environment variables
    Setting file names at run-time using variables
    To use a variable in a flat file name
    Substitution parameters
    Overview of substitution parameters
    Using the Substitution Parameter Editor
    Associating a substitution parameter configuration with a system configuration
    Overriding a substitution parameter in the Administrator
    Executing a job with substitution parameters
    Exporting and importing substitution parameters

Chapter 16  Executing Jobs
    Overview of job execution
    Preparing for job execution
    Validating jobs and job components
    Ensuring that the Job Server is running
    Setting job execution options
    Executing jobs as immediate tasks
    To execute a job as an immediate task
    Monitor tab
    Log tab
    Debugging execution errors
    Using logs
    Examining target data
    Changing Job Server options
    To change option values for an individual Job Server
    To use mapped drive names in a path

Chapter 17  Data Assessment
    Using the Data Profiler
    Data sources that you can profile
    Connecting to the profiler server
    Profiler statistics
    Executing a profiler task
    Monitoring profiler tasks using the Designer
    Viewing the profiler results
    Using View Data to determine data quality
    Data tab
    Profile tab
    Relationship Profile or Column Profile tab
    Using the Validation transform
    Analyze column profile
    Defining validation rule based on column profile
    Using Auditing
    Auditing objects in a data flow
    Accessing the Audit window
    Defining audit points, rules, and action on failure
    Guidelines to choose audit points
    Auditing embedded data flows
    Resolving invalid audit labels
    Viewing audit results

Chapter 18  Data Quality
    Overview of data quality
    Address Cleanse
    How address cleanse works
    Prepare your input data
    Determine which transform(s) to use
    Identify the country of destination
    Set up the reference files
    Define the standardization options
    Beyond the basics
    Process Japanese addressees
    Supported countries (Global Address Cleanse)
    Address Server
    New Zealand Certification
    Data Cleanse
    What is Data Cleanse
    Parse data
    Prepare records for matching
    Data parsing overview
    Parsing dictionaries
    Dictionary entries
    Classifications
    Region-specific data
    Japanese Data
    Universal Data Cleanse
    Rank and prioritize parsing engines
    Geocoding
    Prepare records for geocoding
    Understanding your output
    Match
    Matching strategies
    Match components
    Match Wizard
    Transforms for match data flows
    Working in the Match and Associate editors
    Physical and logical sources
    Match preparation
    Match criteria
    Post-match processing
    Association matching
    Unicode matching
    Phonetic matching
    Set up for match reports

Chapter 19  Design and Debug
    Using View Where Used
    Accessing View Where Used from the object library
    Accessing View Where Used from the workspace
    Limitations
    Using View Data
    Accessing View Data
    Viewing data in the workspace
    View Data Properties
    View Data tool bar options
    View Data tabs
    Using the interactive debugger
    Before starting the interactive debugger
    Starting and stopping the interactive debugger
    Panes
    Debug menu options and tool bar
    Viewing data passed by transforms
    Push-down optimizer
    Limitations
    Comparing Objects
    To compare two different objects
    To compare two versions of the same object
    Overview of the Difference Viewer window
    Navigating through differences
    Calculating column mappings
    To automatically calculate column mappings
    To manually calculate column mappings

Chapter 20  Exchanging Metadata
    Metadata exchange
    Importing metadata files into the software
    Exporting metadata files from the software
    Creating SAP universes
    To create universes using the Tools menu
    To create universes using the object library
    Mappings between repository and universe metadata
    Attributes that support metadata exchange
    SAP BusinessObjects Accelerator
    SAP BusinessObjects Accelerator Workflow
    Modifying BWA indexes

Chapter 21  Recovery Mechanisms
    Recovering from unsuccessful job execution
    Automatically recovering jobs
    Enabling automated recovery
    Marking recovery units
    Running in recovery mode
    Ensuring proper execution path
    Using try/catch blocks with automatic recovery
    Ensuring that data is not duplicated in targets
    Using preload SQL to allow re-executable data flows
    Manually recovering jobs using status tables
    Processing data with problems

    Using overflow files
    Filtering missing or bad values
    Handling facts with missing dimensions

Chapter 22  Techniques for Capturing Changed Data
    Understanding changed-data capture
    Full refresh
    Capturing only changes
    Source-based and target-based CDC
    Using CDC with Oracle sources
    Overview of CDC for Oracle databases
    Setting up Oracle CDC
    To create a CDC datastore for Oracle
    Importing CDC data from Oracle
    Viewing an imported CDC table
    To configure an Oracle CDC source table
    To create a data flow with an Oracle CDC source
    Maintaining CDC tables and subscriptions
    Limitations
    Using CDC with Attunity mainframe sources
    Setting up Attunity CDC
    Setting up the software for CDC on mainframe sources
    Importing mainframe CDC data
    Configuring a mainframe CDC source
    Using mainframe check-points
    Limitations
    Using CDC with Microsoft SQL Server databases
    Overview of CDC for SQL Server databases
    Setting up Microsoft SQL Server for CDC
    Setting up the software for CDC on SQL Server
    Importing SQL Server CDC data
    Configuring a SQL Server CDC source
    Limitations
    Using CDC with timestamp-based sources
    Processing timestamps
    Overlaps
    Types of timestamps
    Timestamp-based CDC examples
    Additional job design tips
    Using CDC for targets

Chapter 23  Monitoring Jobs
    Administrator
    SNMP support
    About the SNMP agent
    Job Server, SNMP agent, and NMS application architecture
    About SNMP Agent's Management Information Base (MIB)
    About an NMS application
    Configuring the software to support an NMS application
    Troubleshooting

Index

Chapter 1  Introduction

Welcome to SAP BusinessObjects Data Services

Welcome

SAP BusinessObjects Data Services XI Release 3 provides data integration and data quality processes in one runtime environment, delivering enterprise performance and scalability.

The data integration processes of SAP BusinessObjects Data Services allow organizations to easily explore, extract, transform, and deliver any type of data anywhere across the enterprise.

The data quality processes of SAP BusinessObjects Data Services allow organizations to easily standardize, cleanse, and consolidate data anywhere, ensuring that end users are always working with information that's readily available, accurate, and trusted.

Documentation set for SAP BusinessObjects Data Services

You should become familiar with all the pieces of documentation that relate to your SAP BusinessObjects Data Services product. Each document and what it provides:

Documentation Map
    Information about available SAP BusinessObjects Data Services books, languages, and locations

Release Summary
    Highlights of new key features in this SAP BusinessObjects Data Services release. This document is not updated for service pack or fix pack releases.

Release Notes
    Important information you need before installing and deploying this version of SAP BusinessObjects Data Services

Getting Started Guide
    An introduction to SAP BusinessObjects Data Services

Installation Guide for Windows
    Information about and procedures for installing SAP BusinessObjects Data Services in a Windows environment

Installation Guide for UNIX
    Information about and procedures for installing SAP BusinessObjects Data Services in a UNIX environment

Advanced Development Guide
    Guidelines and options for migrating applications including information on multi-user functionality and the use of the central repository for version control

Designer Guide
    Information about how to use SAP BusinessObjects Data Services Designer

Integrator's Guide
    Information for third-party developers to access SAP BusinessObjects Data Services functionality using web services and APIs

Management Console: Administrator Guide
    Information about how to use SAP BusinessObjects Data Services Administrator

Management Console: Metadata Reports Guide
    Information about how to use SAP BusinessObjects Data Services Metadata Reports

Migration Considerations
    Release-specific product behavior changes from earlier versions of SAP BusinessObjects Data Services to the latest release. This manual also contains information about how to migrate from SAP BusinessObjects Data Quality Management to SAP BusinessObjects Data Services.

Performance Optimization Guide
    Information about how to improve the performance of SAP BusinessObjects Data Services

Reference Guide
    Detailed reference material for SAP BusinessObjects Data Services Designer

Technical Manuals
    A compiled “master” PDF of core SAP BusinessObjects Data Services books containing a searchable master table of contents and index:
    • Getting Started Guide
    • Installation Guide for Windows
    • Installation Guide for UNIX
    • Designer Guide
    • Reference Guide
    • Management Console: Metadata Reports Guide
    • Management Console: Administrator Guide
    • Performance Optimization Guide
    • Advanced Development Guide
    • Supplement for J.D. Edwards
    • Supplement for Oracle Applications
    • Supplement for PeopleSoft
    • Supplement for Siebel
    • Supplement for SAP

Tutorial
    A step-by-step introduction to using SAP BusinessObjects Data Services

In addition, you may need to refer to several Adapter Guides and Supplemental Guides:

Salesforce.com Adapter Interface
    Information about how to install, configure, and use the SAP BusinessObjects Data Services Salesforce.com Adapter Interface

Supplement for J.D. Edwards
    Information about interfaces between SAP BusinessObjects Data Services and J.D. Edwards World and J.D. Edwards OneWorld

Supplement for Oracle Applications
    Information about the interface between SAP BusinessObjects Data Services and Oracle Applications

Supplement for PeopleSoft
    Information about interfaces between SAP BusinessObjects Data Services and PeopleSoft

Supplement for Siebel
    Information about the interface between SAP BusinessObjects Data Services and Siebel

Supplement for SAP
    Information about interfaces between SAP BusinessObjects Data Services, SAP Applications, and SAP NetWeaver BW

Accessing documentation

You can access the complete documentation set for SAP BusinessObjects Data Services in several places.

Accessing documentation on Windows

After you install SAP BusinessObjects Data Services, you can access the documentation from the Start menu.
1. Choose Start > Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Documentation.
   Note: Only a subset of the documentation is available from the Start menu. The documentation set for this release is available in LINK_DIR\Doc\Books\en.
2. Click the appropriate shortcut for the document that you want to view.

Accessing documentation on UNIX

After you install SAP BusinessObjects Data Services, you can access the online documentation by going to the directory where the printable PDF files were installed.
1. Go to LINK_DIR/doc/book/en/.
2. Using Adobe Reader, open the PDF file of the document that you want to view.

education. 2. 3. Click All Products in the navigation pane on the left.1 Introduction Welcome to SAP BusinessObjects Data Services Accessing documentation from the Web You can access the complete documentation set for SAP BusinessObjects Data Services from the SAP BusinessObjects Technical Customer Assurance site. and consulting to ensure maximum business intelligence benefit to your business. You can view the PDFs online or save them to your computer. Go to http://help. SAP BusinessObjects information resources A global network of SAP BusinessObjects technology experts provides customer support.com. 1. Click SAP BusinessObjects at the top of the page. Useful addresses at a glance: Address Content 26 SAP BusinessObjects Data Services Designer Guide .sap.

downloads.Get online and timely information about SAP nity BusinessObjects Data Services.sap. SAP BusinessObjects can offer a training package to suit your learning needs and preferred learning style. All content is to and from the community.sap.com/irj/boc/blueprints Product documentation http://help.com/irj/scn/forums Blueprints http://www. Consulting. Education services can provide information about training options and modules. https://www.Introduction Welcome to SAP BusinessObjects Data Services 1 Address Content Customer Support. Blueprints for you to download and modify to fit your needs. samples. additional downloads. data flows. and online forums. as well as links to technical articles. From traditional classroom learning to targeted e-learning seminars. sample data. including tips and tricks. so feel free to join in and contact us if you have a submission. and Education Information about Technical Customer Assurservices ance programs. template tables.com/businessobjects/ SAP BusinessObjects Data Services Designer Guide 27 . Consulting http://service.com/ services can provide you with information about how SAP BusinessObjects can help maximize your business intelligence investment.sdn. and custom functions to run the data flows in your environment with only a few modifications. and https://www.sdn.sdn. Each blueprint contains the necessary SAP BusinessObjects Data Services project. SAP BusinessObjects product documentation.sap. jobs. Forums on SCN (SAP Community Network Search the SAP BusinessObjects forums on the SAP Community Network to learn from other SAP BusinessObjects Data Services users and start posting questions or share your knowledge with the community.sap.com/irj/boc/ds much more.sap. SAP BusinessObjects Data Services Commu. file formats.

front-office. In the left panel of the window. navigate to Documentation > Supported Platforms/PARs > SAP BusinessObjects Data Services > SAP BusinessObjects Data Services XI 3. transform. and back-office applications. You can also use the Designer to define logical paths for processing message-based queries and transactions from Web-based.sap.1 Introduction Overview of this guide Address Supported Platforms (formerly the Products Availability Report or PAR) Content Get information about supported platforms for SAP BusinessObjects Data Services. Click the appropriate link in the main window. About this guide The guide contains two kinds of information: • • Conceptual information that helps you understand the Data Services Designer and how it works Procedural information that explains in a step-by-step manner how to accomplish a task You will find this guide most useful: • • • While you are learning about the product While you are performing tasks in the design and early testing phase of your data-movement projects As a general source of information during any phase of your projects 28 SAP BusinessObjects Data Services Designer Guide .x. https://service. and load data from databases and applications into a data warehouse used for analytic and on-demand queries. The Data Services Designer provides a graphical user interface (GUI) development environment in which you define data application logic to extract.com/bosap-support Overview of this guide Welcome to the Designer Guide.

You understand your organization's data needs. If you are interested in using this product to design real-time processing. etc. You are familiar with SQL (Structured Query Language). data integration. business intelligence.Introduction Overview of this guide 1 Who should read this guide This and other Data Services product documentation assumes the following: • You are an application developer. HTTP. SAP BusinessObjects Data Services Designer Guide 29 . RDBMS. and SOAP protocols. or database administrator working on data extraction. You understand your source data systems. you should be familiar with: • • • DTD and XML Schema formats for XML files Publishing Web Services (WSDL. and messaging concepts.) • • • • You are familiar Data Services installation environments—Microsoft Windows or UNIX. or data quality. data warehousing. consultant.


Chapter 2: Logging into the Designer

This section describes how to log in to the Designer.

Creating a repository

You must configure a local repository to log in to the software. Typically, you create a repository during installation. However, you can create a repository at any time using the Repository Manager. When you log in to the Designer, you are actually logging in to the database you defined for the repository.

To create a local repository:
1. Define a database for the local repository using your database management system. Repositories can reside on Oracle, Microsoft SQL Server, IBM DB2, or Sybase ASE.
2. From the Start menu, choose Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Repository Manager.
3. In the Repository Manager window, enter the database connection information for the repository and select Local for the repository type.
4. Click Create. This adds the repository schema to the specified database.

Associating the repository with a Job Server

Each repository must be associated with at least one Job Server, which is the process that starts jobs. When running a job from a repository, you select one of the associated repositories. The same Job Server can run jobs stored on multiple repositories, and you can link any number of repositories to a single Job Server. In production environments, this lets you balance loads appropriately.

Typically, you define a Job Server and link it to a repository during installation. However, you can define or edit Job Servers or links between repositories and Job Servers at any time using the Server Manager.

To create a Job Server for your local repository:
• Open the Server Manager: choose Start > Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Server Manager.

Entering repository login information

To log in, enter the connection information for your repository. The required information varies with the type of database containing the repository.

Version restrictions

Your repository version must be associated with the same major release as the Designer and must be less than or equal to the version of the Designer. During login, the software alerts you if there is a mismatch between your Designer version and your repository version. Some features in the current release of the Designer might not be supported if you are not logged in to the latest version of the repository. After you log in, you can view the software and repository versions by selecting Help > About Data Services.

Oracle login

1. Choose Start > Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Designer.
2. In the Repository Login window, complete the following fields:
• Database type: Select Oracle.
• Database connection name: The TNSnames.ora entry or Net Service Name of the database.
• User name and Password: The user name and password for the repository that you defined in your Oracle database.
• Remember: Check this box if you want the Designer to store this information for the next time you log in.
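If the login fails, it can help to confirm outside the product that the same Oracle values resolve. The following minimal sketch is not part of Data Services; it assumes the python-oracledb package and a hypothetical tnsnames.ora entry named DSREPO with a hypothetical repository owner dsrepo:

    # Illustrative connectivity check only; not part of Data Services.
    # Assumes the python-oracledb package and a tnsnames.ora entry named DSREPO.
    import oracledb

    def check_repository_login(user, password, tns_alias):
        # The Designer's "Database connection name" field plays the role of tns_alias here.
        with oracledb.connect(user=user, password=password, dsn=tns_alias) as conn:
            cursor = conn.cursor()
            cursor.execute("SELECT 1 FROM dual")  # minimal round trip to the database
            cursor.fetchone()
        return "connected to " + tns_alias + " as " + user

    print(check_repository_login("dsrepo", "secret", "DSREPO"))

If this sketch connects but the Designer does not, the problem is likely the repository contents or version rather than the database credentials.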

Microsoft SQL Server login

1. Choose Start > Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Designer.
2. In the Repository Login window, complete the following fields:
• Database type: Select Microsoft_SQL_Server.
• Database server name: The database server name.
• Database name: The name of the specific database to which you are connecting.
• Windows authentication: Select to have Microsoft SQL Server validate the login account name and password using information from the Windows operating system; clear to authenticate using the existing Microsoft SQL Server login account name and password, and complete the User name and Password fields.
• User name and Password: The user name and password for the repository that you defined in your Microsoft SQL Server database.
• Remember: Check this box if you want the Designer to store this information for the next time you log in.
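The Windows authentication check box simply switches between trusted (operating-system-validated) credentials and SQL Server-validated credentials. The following sketch is illustrative only, not product code; it assumes the pyodbc package and hypothetical server and database names, and shows how the two modes differ in an ordinary ODBC connection string:

    import pyodbc  # assumed available; any ODBC client shows the same contrast

    def repo_connection_string(server, database, windows_auth, user="", password=""):
        base = "DRIVER={SQL Server};SERVER=" + server + ";DATABASE=" + database + ";"
        if windows_auth:
            # Check box selected: the Windows account is validated by the OS.
            return base + "Trusted_Connection=yes;"
        # Check box cleared: supply the SQL Server login name and password.
        return base + "UID=" + user + ";PWD=" + password + ";"

    # Hypothetical names for illustration only.
    conn = pyodbc.connect(repo_connection_string("dbserver01", "ds_repo", windows_auth=True))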

IBM DB2 login

Choose Start > Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Designer. For a DB2 repository, you must complete the following fields:
• Database type: Select DB2.
• DB2 datasource: The data source name.
• User name and Password: The user name and password for the repository that you defined in your DB2 database.
• Remember: Check this box if you want the Designer to store this information for the next time you log in.

Sybase ASE login

Choose Start > Programs > SAP BusinessObjects XI 3.2 > SAP BusinessObjects Data Services > Data Services Designer. For a Sybase ASE repository, you must complete the following fields:
• Database type: Select Sybase ASE.
• Database server name: Enter the database's server name. Note: For UNIX Job Servers, when logging in to a Sybase repository in the Designer, the case you type for the database server name must match the associated case in the SYBASE_Home\interfaces file. If the case does not match, you might receive an error because the Job Server cannot communicate with the repository.
• Database name: Enter the name of the specific database to which you are connecting.
• User name and Password: Enter the user name and password for this database.
• Remember: Check this box if you want the Designer to store this information for the next time you log in.

Resetting users

Occasionally, more than one person may attempt to log in to a single repository. If this happens, the Reset Users window appears, listing the users and the time they logged in to the repository. From this window, you have several options. You can:
• Reset Users to clear the users in the repository and set yourself as the currently logged in user.
• Continue to log in to the system regardless of who else might be connected.
• Exit to terminate the login attempt and close the session.

Note: Only use Reset Users or Continue if you know that you are the only user connected to the repository. Subsequent changes could corrupt the repository.

Chapter 3: Designer User Interface

This section provides basic information about the Designer's graphical user interface.

Objects

All "entities" you define, edit, or work with in Designer are called objects. The local object library shows objects such as source and target metadata, system functions, projects, and jobs.

Objects are hierarchical and consist of:
• Options, which control the operation of objects. For example, in a datastore, the name of the database to which you connect is an option for the datastore object.
• Properties, which document the object. For example, the name of the object and the date it was created are properties. Properties describe an object, but do not affect its operation.

The software has two types of objects: reusable and single-use. The object type affects how you define and retrieve the object.

Reusable objects

You can reuse and replicate most objects defined in the software. A reusable object has a single definition; all calls to the object refer to that definition. If you change the definition of the object in one place, you are changing the object in all other places in which it appears. After you define and save a reusable object, the software stores the definition in the local repository. You can then reuse the definition as often as necessary by creating calls to the definition. Access reusable objects through the local object library.

A data flow, for example, is a reusable object. Multiple jobs, like a weekly load job and a daily load job, can call the same data flow. If the data flow changes, both jobs use the new version of the data flow.
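The single-definition rule is easiest to see in miniature. The following Python sketch is an analogy only, not product code: two jobs each hold a call (a reference) to one data flow definition, so an edit to the definition is visible wherever it is called:

    # Analogy only: a reusable object is one definition; jobs store calls to it.
    class DataFlow:
        def __init__(self, name, steps):
            self.name = name
            self.steps = steps

    df_emp = DataFlow("DF_EmpMap", ["read source", "map columns", "load target"])

    weekly_job = {"name": "JOB_Weekly", "calls": [df_emp]}  # a call, not a copy
    daily_job = {"name": "JOB_Daily", "calls": [df_emp]}    # the same definition

    df_emp.steps.append("audit row counts")  # change the one definition

    # Both jobs now see the new step, because each stores a reference.
    assert weekly_job["calls"][0].steps == daily_job["calls"][0].steps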

The object library contains object definitions. When you drag and drop an object from the object library, you are really creating a new reference (or call) to the existing object definition.

Single-use objects

Some objects are defined only within the context of a single job or data flow, for example scripts and specific transform definitions.

Object hierarchy

Object relationships are hierarchical. The following figure shows the relationships between major object types:


Designer window

The Designer user interface consists of a single application window and several embedded supporting windows. In the software, all entities you create, modify, or work with are objects.

In addition to the Menu bar and Toolbar, there are other key areas of the application window:

Project area: Contains the current project (and the job(s) and other objects within it) available to you at a given time.
Workspace: The area of the application window in which you define, display, and modify objects.

Local object library: Provides access to local repository objects, including built-in system objects, such as transforms, and the objects you build and save, such as jobs and data flows.
Tool palette: Buttons on the tool palette enable you to add new objects to the workspace.

Menu bar

This section contains a brief description of the Designer's menus.

Project menu

The Project menu contains standard Windows options as well as software-specific options.

New: Define a new project, batch job, real-time job, work flow, data flow, transform, datastore, file format, DTD, XML Schema, or custom function.
Open: Open an existing project.
Close: Close the currently open project.
Delete: Delete the selected object.
Save: Save the object open in the workspace.
Save All: Save all changes to objects in the current Designer session.
Print: Print the active workspace.

Print Setup: Set up default printer information.
Compact Repository: Remove redundant and obsolete objects from the repository tables.
Exit: Exit Designer.

Edit menu

The Edit menu provides standard Windows commands with a few restrictions.

Undo: Undo the last operation.
Cut: Cut the selected objects or text and place them on the clipboard.
Copy: Copy the selected objects or text to the clipboard.
Paste: Paste the contents of the clipboard into the active workspace or text box.
Delete: Delete the selected objects.
Recover Last Deleted: Recover deleted objects to the workspace from which they were deleted. Only the most recently deleted objects are recovered.
Select All: Select all objects in the active workspace.
Clear All: Clear all objects in the active workspace (no undo).

View menu

A check mark next to a View menu option indicates that the option is active.

Toolbar: Display or remove the toolbar in the Designer window.
Status Bar: Display or remove the status bar in the Designer window.
Palette: Display or remove the floating tool palette.
Enabled Descriptions: View descriptions for objects with enabled descriptions.
Refresh: Redraw the display. Use this command to ensure the content of the workspace represents the most up-to-date information from the repository.

Tools menu

An icon with a different color background indicates that the tool is active.

Object Library: Open or close the object library window.
Project Area: Display or remove the project area from the Designer window.
Variables: Open or close the Variables and Parameters window.
Output: Open or close the Output window. The Output window shows errors that occur, such as during job validation or object export.
Profiler Monitor: Display the status of Profiler tasks.
Run Match Wizard: Display the Match Wizard to create a match data flow. Select a transform in a data flow to activate this menu item. The transform(s) that the Match Wizard generates will be placed downstream from the transform you selected.
Match Editor: Display the Match Editor to edit Match transform options.
Associate Editor: Display the Associate Editor to edit Associate transform options.

User-Defined Editor: Display the User-Defined Editor to edit User-Defined transform options.
Custom Functions: Display the Custom Functions window.
System Configurations: Display the System Configurations editor.
Substitution Parameter Configurations: Display the Substitution Parameter Editor to create and edit substitution parameters and configurations.
Profiler Server Login: Connect to the Profiler Server.
Export: Export individual repository objects to another repository or file. This command opens the Export editor in the workspace. You can drag objects from the object library into the editor for export. To export your whole repository, in the object library right-click and select Repository > Export to file.
Import From File: Import objects into the current repository from a file. The default file types are ATL, XML, DMT, and FMT. For more information on importing objects, see the Migration Guide. For more information on DMT and FMT files, see the Advanced Development Guide.
Metadata Exchange: Import and export metadata to third-party systems via a file.
BusinessObjects Universes: Export (create or update) metadata in BusinessObjects Universes.
Central Repositories: Create or edit connections to a central repository for managing object versions among multiple users.
Accelerator Index Designer: Model SAP NetWeaver Business Warehouse Accelerator (BWA) indexes on top of any data source and load the indexes and data to BWA.

Options: Display the Options window.
Data Services Management Console: Display the Management Console.
Assess and Monitor: Open Data Insight to profile, examine, and report on the quality of your data. This menu item is only available if you have purchased and installed Data Insight.

Related Topics
• Advanced Development Guide: Multi-user environment setup
• Advanced Development Guide: Export/Import, Exporting/importing objects in Data Services
• Advanced Development Guide: Export/Import, Importing from a file
• Reference Guide: Functions and Procedures, Custom functions
• Local object library
• Project area
• Variables and Parameters
• Using the Data Profiler
• Creating and managing multiple datastore configurations
• Connecting to the profiler server
• Metadata exchange
• Creating SAP universes
• General and environment options

Debug menu

The only options available on this menu at all times are Show Filters/Breakpoints and Filters/Breakpoints. The Execute and Start Debug options are only active when a job is selected. All other options are available as appropriate when a job is running in debug mode.

Execute: Opens the Execution Properties window, which allows you to execute the selected job.
Start Debug: Opens the Debug Properties window, which allows you to run a job in debug mode.
Show Filters/Breakpoints: Shows and hides filters and breakpoints in workspace diagrams.
Filters/Breakpoints: Opens a window you can use to manage filters and breakpoints.

Related Topics
• Using the interactive debugger
• Filters and Breakpoints window

Validation menu

The Designer displays options on this menu as appropriate when an object is open in the workspace.

Validate: Validate the objects in the current workspace view or all objects in the job before executing the application.
Show ATL: View a read-only version of the language associated with the job.
Display Optimized SQL: Display the SQL that Data Services generated for a selected data flow.

Related Topics
• Performance Optimization Guide: Maximizing Push-Down Operations, To view SQL

Dictionary menu

The Dictionary menu contains options for interacting with the dictionaries used by cleansing packages and the Data Cleanse transform.

Search: Search for existing dictionary entries.
Add New Dictionary Entry: Create a new primary dictionary entry.
Bulk Load: Import a group of dictionary changes from an external file.
View Bulk Load Conflict Logs: Display conflict logs generated by the Bulk Load feature.
Export Dictionary Changes: Export changes from a dictionary to an XML file.
Universal Data Cleanse: Dictionary-related options specific to the Universal Data Cleanse feature.
Add New Classification: Add a new dictionary classification.
Edit Classification: Edit an existing dictionary classification.
Add Custom Output: Add custom output categories and fields to a dictionary.
Create Dictionary: Create a new dictionary in the repository.
Delete Dictionary: Delete a dictionary from the repository.
Manage Connection: Update the connection information for the dictionary repository connection.

Window menu

The Window menu provides standard Windows options.

Back: Move back in the list of active workspace windows.
Forward: Move forward in the list of active workspace windows.
Cascade: Display window panels overlapping with titles showing.
Tile Horizontally, Tile Vertically: Tile window panels across the workspace (one above the other, or side by side).
Close All Windows: Close all open windows.

A list of objects open in the workspace also appears on the Window menu. The name of the currently-selected object is indicated by a check mark. Navigate to another open object by selecting its name in the list.

Help menu

The Help menu provides standard help options.

Contents: Display the Technical Manuals.
Technical Manuals: Display the Technical Manuals. You can also access the same documentation from the <LINKDIR>\Doc\Books directory.
Release Notes: Display release notes for this release.
Release Summary: Display a summary of new features for this release.
About Data Services: Display information about the software, including versions of the Designer, Job Server, and engine, and copyright information.

Toolbar

In addition to many of the standard Windows tools, the software provides application-specific tools, including:

Close all windows: Closes all open windows in the workspace.
Local Object Library: Opens and closes the local object library window.
Central Object Library: Opens and closes the central object library window.
Variables: Opens and closes the variables and parameters creation window.
Project Area: Opens and closes the project area.
Output: Opens and closes the output window.
View Enabled Descriptions: Enables the system-level setting for viewing object descriptions in the workspace.
Validate Current View: Validates the object definition open in the workspace. Objects included in the definition are also validated.
Validate All Objects in View: Validates the object definitions open in the workspace. Other objects included in the definitions are also validated.
Audit Objects in Data Flow: Opens the Audit window to define audit labels and rules for the data flow.

View Where Used: Opens the Output window, which lists parent objects (such as jobs) of the object currently open in the workspace (such as a data flow). Use this command to find other jobs that use the same data flow before you decide to make design changes. To see if an object in a data flow is reused elsewhere, right-click one and select View Where Used.
Go Back: Move back in the list of active workspace windows.
Go Forward: Move forward in the list of active workspace windows.
Management Console: Opens and closes the Management Console window.
Assess and Monitor: Opens the Data Insight application, where you can profile, examine, and report on the quality of your data.
Contents: Opens the Technical Manuals PDF for information about using the software.

Use the tools to the right of the About tool with the interactive debugger.

Related Topics
• Debug menu options and tool bar

Project area

The project area provides a hierarchical view of the objects used in each project. Tabs on the bottom of the project area support different tasks. Tabs include:
• Create, view, and manage projects. Provides a hierarchical view of all objects used in each project.
• View the status of currently executing jobs. Selecting a specific job execution displays its status, including which steps are complete and which steps are executing. These tasks can also be done using the Administrator.
• View the history of complete jobs. Logs can also be viewed with the Administrator.

To control project area location, right-click its gray border and select/deselect Allow Docking, or select Hide from the menu.
• When you select Allow Docking, you can click and drag the project area to dock at and undock from any edge within the Designer window. When you drag the project area away from a Designer window edge, it stays undocked. To quickly switch between your last docked and undocked locations, just double-click the gray border. When you deselect Allow Docking, you can click and drag the project area to any location on your screen, and it will not dock inside the Designer window.
• When you select Hide, the project area disappears from the Designer window. To unhide the project area, click its toolbar icon.

Here's an example of the Project window's Designer tab, which shows the project hierarchy:

As you drill down into objects in the Designer workspace, the project area highlights your location within the project hierarchy.

Tool palette

The tool palette is a separate window that appears by default on the right edge of the Designer workspace. You can move the tool palette anywhere on your screen or dock it on any edge of the Designer window. To show the name of each icon, hold the cursor over the icon until the tool tip for the icon appears.

The icons in the tool palette allow you to create new objects in the workspace. The icons are disabled when they are not allowed to be added to the diagram open in the workspace. When you create an object from the tool palette, you are creating a new definition of an object. If a new object is reusable, it will be automatically available in the object library after you create it. For example, if you select the data flow icon from the tool palette and define a new data flow, later you can drag that existing data flow from the object library, adding a call to the existing definition.

The tool palette contains the following icons:

Pointer: Returns the tool pointer to a selection pointer for selecting and moving objects in a diagram. Available everywhere.
Work flow (reusable): Creates a new work flow. Available in jobs and work flows.
Data flow (reusable): Creates a new data flow. Available in jobs and work flows.
ABAP data flow: Used only with the SAP application.
Query transform (single-use): Creates a template for a query. Use it to define column mappings and row selections. Available in data flows.
Template table (single-use): Creates a table for a target. Available in data flows.
Template XML (single-use): Creates an XML template. Available in data flows.
Data transport: Used only with the SAP application.
Script (single-use): Creates a new script object. Available in jobs and work flows.
Conditional (single-use): Creates a new conditional object. Available in jobs and work flows.
Try (single-use): Creates a new try object. Available in jobs and work flows.
Catch (single-use): Creates a new catch object. Available in jobs and work flows.
Annotation (single-use): Creates an annotation. Available in jobs, work flows, and data flows.

Workspace

When you open or select a job or any flow within a job hierarchy, the workspace becomes "active" with your selection. The workspace provides a place to manipulate system objects and graphically assemble data movement processes. These processes are represented by icons that you drag and drop into a workspace to create a workspace diagram. This diagram is a visual representation of an entire data movement application or some part of a data movement application.

Moving objects in the workspace area

Use standard mouse commands to move objects in the workspace. To move an object to a different place in the workspace area:
1. Click to select the object.
2. Drag the object to where you want to place it in the workspace.

Connecting objects

You specify the flow of data through jobs and work flows by connecting objects in the workspace from left to right, in the order you want the data to be moved. To connect objects:
1. Place the objects you want to connect in the workspace.
2. Click and drag from the triangle on the right edge of an object to the triangle on the left edge of the next object in the flow.

Disconnecting objects

To disconnect objects:
1. Click the connecting line.
2. Press the Delete key.

Describing objects

You can use descriptions to add comments about objects, and you can use annotations to explain a job, work flow, or data flow. For example, you can describe the incremental behavior of individual jobs with numerous annotations and label each object with a basic description, such as: This job loads current categories and expenses and produces tables for analysis. You can view object descriptions and annotations in the workspace. Together, descriptions and annotations allow you to document an SAP BusinessObjects Data Services application.

Related Topics
• Creating descriptions
• Creating annotations

Scaling the workspace

You can control the scale of the workspace. By scaling the workspace, you can change the focus of a job, work flow, or data flow. For example, you might want to increase the scale to examine a particular part of a work flow, or you might want to reduce the scale so that you can examine the entire work flow without scrolling.

To change the scale of the workspace:
1. In the drop-down list on the tool bar, select a predefined scale or enter a custom value (for example, 100%).
2. Alternatively, right-click in the workspace and select a desired scale.

Note: You can also select Scale to Fit and Scale to Whole:
• Select Scale to Fit and the Designer calculates the scale that fits the entire project in the current view area.
• Select Scale to Whole to show the entire workspace area in the current view area.

Arranging workspace windows

The Window menu allows you to arrange multiple open workspace windows in the following ways: cascade, tile horizontally, or tile vertically.

Closing workspace windows

When you drill into an object in the project area or workspace, a view of the object's definition opens in the workspace area. The view is marked by a tab at the bottom of the workspace area, and as you open more objects in the workspace, more tabs appear. (You can show/hide these tabs from the Tools > Options menu. Go to Designer > General options and select/deselect Show tabs in workspace.)

Note: These views use system resources. If you have a large number of open views, you might notice a decline in performance.

Close the views individually by clicking the close box in the top right corner of the workspace. Close all open views by selecting Window > Close All Windows or clicking the Close All Windows icon on the toolbar.

Related Topics
• General and environment options

Local object library

The local object library provides access to reusable objects. These objects include built-in system objects, such as transforms, and the objects you build and save, such as datastores, jobs, data flows, and work flows.

The local object library is a window into your local repository and eliminates the need to access the repository directly. Updates to the repository occur through normal software operation. Saving the objects you create adds them to the repository. Access saved objects through the local object library.

To control object library location, right-click its gray border and select/deselect Allow Docking, or select Hide from the menu.
• When you select Allow Docking, you can click and drag the object library to dock at and undock from any edge within the Designer window. When you drag the object library away from a Designer window edge, it stays undocked. To quickly switch between your last docked and undocked locations, just double-click the gray border. When you deselect Allow Docking, you can click and drag the object library to any location on your screen, and it will not dock inside the Designer window.
• When you select Hide, the object library disappears from the Designer window. To unhide the object library, click its toolbar icon.

Related Topics
• Advanced Development: Central versus local repository

To open the object library
• Choose Tools > Object Library, or click the object library icon in the icon bar.

The object library gives you access to the object types listed in the following table. The table shows the tab on which the object type appears in the object library and describes the context in which you can use each type of object.

Projects: Projects are sets of jobs available at a given time.
Jobs: Jobs are executable work flows. There are two job types: batch jobs and real-time jobs.

Work flows: Work flows order data flows and the operations that support data flows, defining the interdependencies between them.
Data flows: Data flows describe how to process a task.
Transforms: Transforms operate on data, producing output data sets from the sources you specify. The object library lists both built-in and custom transforms.
Datastores: Datastores represent connections to databases and applications used in your project. Under each datastore is a list of the tables, documents, and functions imported into the software.
Formats: Formats describe the structure of a flat file, XML file, or XML message.
Custom Functions: Custom Functions are functions written in the software's Scripting Language. You can use them in your jobs.

To display the name of each tab as well as its icon
1. Make the object library window wider until the names appear, or
2. Hold the cursor over the tab until the tool tip for the tab appears.

To sort columns in the object library
• Click the column heading.

For example, you can sort data flows by clicking the Data Flow column heading once. Names are listed in ascending order. To list names in descending order, click the Data Flow column heading again.

Object editors

To work with the options for an object, in the workspace click the name of the object to open its editor. The editor displays the input and output schemas for the object and a panel below them listing options set for the object. If there are many options, they are grouped in tabs in the editor.

A schema is a data structure that can contain columns, other nested schemas, and functions (the contents are called schema elements). A table is a schema containing only columns.

A common example of an editor is the editor for the query transform, as shown in the following illustration.

In an editor, you can:
• Undo or redo previous actions performed in the window (right-click and choose Undo or Redo)
• Find a string in the editor (right-click and choose Find)

• Drag-and-drop column names from the input schema into relevant option boxes
• Use colors to identify strings and comments in text boxes where you can edit expressions (keywords appear blue; strings are enclosed in quotes and appear pink; comments begin with a pound sign and appear green)

Note: You cannot add comments to a mapping clause in a Query transform. For example, the following syntax is not supported on the Mapping tab:
table.column # comment
The job will not run, and you cannot successfully export it. Use the object description or workspace annotation feature instead.

Related Topics
• Query editor

Working with objects

This section discusses common tasks you complete when working with objects in the Designer. With these tasks, you use various parts of the Designer: the toolbar, tool palette, workspace, and local object library.

Creating new reusable objects

You can create reusable objects from the object library or by using the tool palette. After you create an object, you can work with the object, editing its definition and adding calls to other objects.

To create a reusable object (in the object library)
1. Open the object library by choosing Tools > Object Library.
2. Click the tab corresponding to the object type.
3. Right-click anywhere except on existing objects and choose New.
4. Right-click the new object and select Properties. Enter options such as name and description to define the object.

To create a reusable object (using the tool palette)
1. In the tool palette, left-click the icon for the object you want to create.
2. Move the cursor to the workspace and left-click again. The object icon appears in the workspace where you have clicked.

To add an existing object (create a new call to an existing object)
1. Open the object library by choosing Tools > Object Library.
2. Click the tab corresponding to any object type.
3. Select an object.
4. Drag the object to the workspace.

Note: Objects dragged into the workspace must obey the hierarchy logic. For example, you can drag a data flow into a job, but you cannot drag a work flow into a data flow.

Related Topics
• Object hierarchy

To open an object's definition

You can open an object's definition in one of two ways:
1. From the workspace, click the object name.
2. From the project area, click the object.

The software opens the object's definition in the workspace; for a newly created object, this is a blank workspace in which you define the object. You define an object using other objects. For example, if you click the name of a batch data flow, a new workspace opens for you to assemble the sources, targets, and transforms that make up the actual flow.

Changing object names

You can change the name of an object from the workspace or the object library. You can also create a copy of an existing object.

Note: You cannot change the names of built-in objects.

1. To change the name of an object in the workspace:
a. Click to select the object in the workspace.
b. Right-click and choose Edit Name.
c. Edit the text in the name text box.
d. Click outside the text box or press Enter to save the new name.

2. To change the name of an object in the object library:
a. Select the object in the object library.
b. Right-click and choose Properties.
c. Edit the text in the first text box.
d. Click OK.

3. To copy an object:
a. Select the object in the object library.
b. Right-click and choose Replicate.
c. The software makes a copy of the top-level object (but not of objects that it calls) and gives it a new name, which you can edit.

Viewing and changing object properties

You can view (and, in some cases, change) an object's properties through its property page.

To view, change, and add object properties
1. Select the object in the object library.
2. Right-click and choose Properties. The General tab of the Properties window opens.

3. Complete the property sheets. The property sheets vary by object type, but General, Attributes, and Class Attributes are the most common and are described in the following sections.
4. When finished, click OK to save changes you made to the object properties and to close the window. Alternatively, click Apply to save changes without closing the window.

General tab

The General tab contains two main object properties: name and description. From the General tab, you can change the object name as well as enter or edit the object description. You can add object descriptions to single-use objects as well as to reusable objects. Note that you can toggle object descriptions on and off by right-clicking any object in the workspace and selecting/clearing View Enabled Descriptions.

Depending on the object, other properties may appear on the General tab. Examples include:
• Execute only once
• Recover as a unit
• Degree of parallelism
• Use database links
• Cache type

Related Topics
• Performance Optimization Guide: Using Caches
• Linked datastores
• Performance Optimization Guide: Using Parallel Execution
• Recovery Mechanisms
• Creating and defining data flows

Attributes tab

The Attributes tab allows you to assign values to the attributes of the current object.

To assign a value to an attribute, select the attribute and enter the value in the Value box at the bottom of the window. Some attribute values are set by the software and cannot be edited. When you select an attribute with a system-defined value, the Value field is unavailable.

Class Attributes tab

The Class Attributes tab shows the attributes available for the type of object selected. For example, all data flow objects have the same class attributes. To create a new attribute for a class of objects, right-click in the attribute list and select Add. The new attribute is then available for all of the objects of this class.

To delete an attribute, select it, then right-click and choose Delete. You cannot delete the class attributes predefined by Data Services.

Creating descriptions

Use descriptions to document objects. You can see descriptions on workspace diagrams; therefore, descriptions are a convenient way to add comments to workspace objects.

A description is associated with a particular object. When you import or export that repository object (for example, when migrating between development, test, and production environments), you also import or export its description.

The Designer determines when to show object descriptions based on a system-level setting and an object-level setting. Both settings must be activated to view the description for a particular object.
• The system-level setting is unique to your setup. It is disabled by default. To activate the system-level setting, select View > Enabled Descriptions, or click the View Enabled Descriptions button on the toolbar.
• The object-level setting is saved with the object in the repository. It is also disabled by default, unless you add or edit a description from the workspace. To activate the object-level setting, right-click the object and select Enable object description.

An ellipsis after the text in a description indicates that there is more text. To see all the text, resize the description by clicking and dragging it. When you move an object, its description moves as well.

To see which object is associated with which selected description, view the object's name in the status bar.

To add a description to an object
1. In the project area or object library, right-click an object and select Properties.
2. Enter your comments in the Description text box.
3. Click OK. The description for the object displays in the object library.

To display a description in the workspace
1. In the project area, select an existing object (such as a job) that contains an object to which you have added a description (such as a work flow).
2. From the View menu, select Enabled Descriptions. Alternately, you can select the View Enabled Descriptions button on the toolbar.
3. Right-click the work flow and select Enable Object Description. The description displays in the workspace under the object.

To add a description to an object from the workspace
1. From the View menu, select Enabled Descriptions.
2. In the workspace, right-click an object and select Properties.
3. In the Properties window, enter text in the Description box.
4. Click OK. The description displays automatically in the workspace (and the object's Enable Object Description option is selected).

To hide a particular object's description
1. In the workspace diagram, right-click an object. Alternately, you can select multiple objects by:
• Pressing and holding the Control key while selecting objects in the workspace diagram, then right-clicking one of the selected objects.
• Dragging a selection box around all the objects you want to select, then right-clicking one of the selected objects.
2. In the pop-up menu, deselect Enable Object Description. The description for the selected object is hidden, even if the View Enabled Descriptions option is checked, because the object-level switch overrides the system-level switch.

To edit object descriptions
1. In the workspace, double-click an object description.
2. Enter, cut, copy, or paste text into the description.
3. In the Project menu, select Save.

Note: If you attempt to edit the description of a reusable object, the software alerts you that the description will be updated for every occurrence of the object, across all jobs. You can select the Do not show me this again check box to avoid this alert. However, after deactivating the alert, you can only reactivate the alert by calling Technical Support.

Creating annotations

Annotations describe a flow, part of a flow, or a diagram in a workspace. An annotation is associated with the job, work flow, or data flow where it appears. When you import or export that job, work flow, or data flow, you import or export associated annotations.

To annotate a workspace diagram
1. Open the workspace diagram you want to annotate.

You can use annotations to describe any workspace, such as a job, work flow, data flow, catch, conditional, or while loop.

2. In the tool palette, click the annotation icon.
3. Click a location in the workspace to place the annotation. An annotation appears on the diagram. You can add, edit, and delete text directly on the annotation. In addition, you can resize and move the annotation by clicking and dragging. You can add any number of annotations to a diagram.

To delete an annotation
1. Right-click an annotation.
2. Select Delete. Alternately, you can select an annotation and press the Delete key.

Copying objects

Objects can be cut or copied and then pasted on the workspace where valid. References to global variables, local variables, parameters, and substitution parameters are copied; however, you must define each within its new context. Calls to data flows and work flows can be cut or copied and then pasted to valid objects in the workspace. Multiple objects can be copied and pasted either within the same or other data flows, work flows, or jobs.

Note: The paste operation duplicates the selected objects in a flow, but still calls the original objects. In other words, the paste operation uses the original object in another location. The replicate operation creates a new object in the object library.

To cut or copy and then paste objects:
1. In the workspace, select the objects you want to cut or copy. You can select multiple objects using Ctrl-click, Shift-click, or Ctrl+A.

2. Right-click and then select either Cut or Copy.
3. Click within the same flow or select a different flow. Right-click and select Paste. Where necessary to avoid a naming conflict, a new name is automatically generated.

Note: The objects are pasted in the selected location if you right-click and select Paste. The objects are pasted in the upper left-hand corner of the workspace if you paste using any of the following methods:
• Click the Paste icon.
• Click Edit > Paste.
• Use the Ctrl+V keyboard shortcut.
If you use a method that pastes the objects to the upper left-hand corner, subsequent pasted objects are layered on top of each other.

Saving and deleting objects

"Saving" an object in the software means storing the language that describes the object to the repository. You can save reusable objects; single-use objects are saved only as part of the definition of the reusable object that calls them.

You can choose to save changes to the reusable object currently open in the workspace. When you save the object, the object properties, the definitions of any single-use objects it calls, and any calls to other reusable objects are recorded in the repository. The content of the included reusable objects is not saved; only the call is saved. The software stores the description even if the object is not complete or contains an error (does not validate).

To save changes to a single reusable object
1. Open the project in which your object is included.
2. Choose Project > Save. This command saves all objects open in the workspace.

Repeat these steps for other individual objects you want to save.

To save all changed objects in the repository
1. Choose Project > Save All. The software lists the reusable objects that were changed since the last save operation.
2. (Optional) Deselect any listed object to avoid saving it.
3. Click OK.

Note: The software also prompts you to save all objects that have changes when you execute a job and when you exit the Designer. Saving a reusable object saves any single-use object included in it.

To delete an object definition from the repository
1. In the object library, select the object.
2. Right-click and choose Delete.
• If you attempt to delete an object that is being used, the software provides a warning message and the option of using the View Where Used feature.
• If you select Yes, the software marks all calls to the object with a red "deleted" icon to indicate that the calls are invalid. You must remove or replace these calls to produce an executable job.

Note: Built-in objects such as transforms cannot be deleted from the object library.

Related Topics
• Using View Where Used

To delete an object call
1. Open the object that contains the call you want to delete.
2. Right-click the object call and choose Delete.

If you delete a reusable object from the workspace or from the project area, only the object call is deleted. The object definition remains in the object library.

Searching for objects

From within the object library, you can search for objects defined in the repository or objects available through a datastore.

To search for an object
1. Right-click in the object library and choose Search. The software displays the Search window.
2. Enter the appropriate values for the search. Options available in the Search window are described in detail following this procedure.
3. Click Search. The objects matching your entries are listed in the window.

From the search results window, you can use the context menu to:
• Open an item
• View the attributes (Properties)
• Import external tables as repository metadata

You can also drag objects from the search results window and drop them in the desired location.

The Search window provides you with the following options:

Look in: Where to search. Choose from the repository or a specific datastore. When you designate a datastore, you can also choose to search the imported data (Internal Data) or the entire datastore (External Data).

Object type: The type of object to find. When searching the repository, choose from Tables, Files, Data flows, Work flows, Jobs, Hierarchies, IDOCs, and Domains. When searching a datastore or application, choose from the object types available through that datastore.

Name: The object name to find. If you are searching in the repository, the name is not case sensitive. If you are searching in a datastore and the name is case sensitive in that datastore, enter the name as it appears in the database or application and use double quotation marks (") around the name to preserve the case. You can designate whether the information to be located Contains the specified name or Equals the specified name using the drop-down box next to the Name field.

Description: The object description to find. Objects imported into the repository have a description from their source. By default, objects you create in the Designer have no description unless you add one. The search returns objects whose description attribute contains the value entered.

The Search window also includes an Advanced button, where you can choose to search for objects based on their attribute values. You can search by attribute values only when searching in the repository.

The Advanced button provides the following options:

Attribute: The object attribute in which to search.
Value: The attribute value to find.
Match: The type of search performed. Select Contains to search for any attribute that contains the value specified. Select Equals to search for any attribute that contains only the value specified.
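As a quick illustration of the two Match modes (illustrative only, not product code), Contains is a substring test, while Equals requires the attribute to hold only the value:

    def attribute_matches(attribute_value, search_value, mode):
        if mode == "Contains":
            return search_value in attribute_value   # value may appear anywhere
        if mode == "Equals":
            return attribute_value == search_value   # attribute holds only the value
        raise ValueError("unknown mode: " + mode)

    assert attribute_matches("weekly revenue load", "revenue", "Contains")
    assert not attribute_matches("weekly revenue load", "revenue", "Equals")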

General and environment options

To open the Options window, select Tools > Options. The window displays option groups for Designer, Data, and Job Server options. Expand the options by clicking the plus icon. As you select each option group or option, a description appears on the right.

Designer - Environment

Table 3-8: Default Administrator for Metadata Reporting
Administrator: Select the Administrator that the metadata reporting tool uses. An Administrator is defined by host name and port.

Table 3-9: Default Job Server
Current: Displays the current value of the default Job Server.
New: Allows you to specify a new value for the default Job Server from a drop-down list of Job Servers associated with this repository. Changes are effective immediately.

If a repository is associated with several Job Servers, one Job Server must be defined as the default Job Server to use at login.

Note: Job-specific options and path names specified in Designer refer to the current default Job Server. If you change the default Job Server, modify these options and path names.

Table 3-10: Designer Communication Ports
Allow Designer to set the port for Job Server communication: If checked, the Designer automatically sets an available port to receive messages from the current Job Server. The default is checked. Uncheck to specify a listening port or port range.
Specify port range: Only activated when you deselect the previous control. Allows you to specify a range of ports from which the Designer can choose a listening port. You may choose to constrain the port used for communication between Designer and Job Server when the two components are separated by a firewall.
From, To: Enter port numbers in the port text boxes. To specify a specific listening port, enter the same port number in both the From port and To port text boxes. Changes will not take effect until you restart the software.
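Conceptually, choosing a listening port from a From/To range works like the following sketch (illustrative only, not Designer internals); a fixed listening port is simply the case where From equals To:

    import socket

    def bind_port_in_range(from_port, to_port):
        for port in range(from_port, to_port + 1):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            try:
                s.bind(("", port))   # succeeds on the first free port in the range
                s.listen()
                return s
            except OSError:
                s.close()            # port in use; try the next one
        raise RuntimeError("no free port between %d and %d" % (from_port, to_port))

    # Hypothetical range for illustration; a firewall would be opened for 5000-5005.
    listener = bind_port_in_range(5000, 5005)
    print("listening on port", listener.getsockname()[1])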

Interactive Debugger: Allows you to set a communication port for the Designer to communicate with a Job Server while running in Debug mode.
Server group for local repository: If the local repository that you logged in to when you opened the Designer is associated with a server group, the name of the server group appears.

Related Topics
• Changing the interactive debugger port

Designer - General

View data sampling size (rows): Controls the sample size used to display the data in sources and targets in open data flows in the workspace. View data by clicking the magnifying glass icon on source and target objects.
Number of characters in workspace icon name: Controls the length of the object names displayed in the workspace. Object names are allowed to exceed this number, but the Designer only displays the number entered here. The default is 17 characters.
Maximum schema tree elements to auto expand: The number of elements displayed in the schema tree. Enter a number for the Input schema and the Output schema. The default is 100.
Default parameters to variables of the same name: When you declare a variable at the work-flow level, the software automatically passes the value as a parameter with the same name to a data flow called by the work flow.
Automatically import domains: Select this check box to automatically import domains when importing a table that references a domain.
Perform complete validation before job execution: If checked, the software performs a complete job validation before running a job. The default is unchecked. If you keep this default setting, you should validate your design manually before job execution.
Open monitor on job execution: Affects the behavior of the Designer when you execute a job. With this option enabled, the Designer switches the workspace to the monitor view during job execution; otherwise, the workspace remains as is. The default is on.

Automatically calculate column mappings: Calculates information about target tables and columns and the sources used to populate them. The software uses this information for metadata reports such as impact and lineage, auto documentation, or custom reports. Column mapping information is stored in the AL_COLMAP table (ALVW_MAPPING view) after you save a data flow or import objects to or export objects from a repository. If the option is selected, be sure to validate your entire job before saving it, because column mapping calculation is sensitive to errors and will skip data flows that have validation problems.
Show dialog when job is completed: Allows you to choose if you want to see an alert or just read the trace messages.
Show tabs in workspace: Allows you to decide if you want to use the tabs at the bottom of the workspace to navigate.
Exclude non-executable elements from exported XML: Excludes elements not processed during job execution from exported XML documents. For example, Designer workspace display coordinates would not be exported.

Related Topics
• Using View Data
• Management Console Metadata Reports Guide: Refresh Usage Data tab

Designer - Graphics

Choose and preview stylistic elements to customize your workspaces. Using these options, you can easily distinguish your job/work flow design workspace from your data flow design workspace.

Workspace flow type
Switch between the two workspace flow types (Job/Work Flow and Data Flow) to view default settings. Modify settings for each type using the remaining options.

Line Type
Choose a style for object connector lines.

Line Thickness
Set the connector line thickness.

Background style
Choose a plain or tiled background pattern for the selected flow type.

Color scheme
Set the background color to blue, gray, or white.

Use navigation watermark
Add a watermark graphic to the background of the flow type selected. Note that this option is only available with a plain background style.

Designer — Central Repository Connections

Central Repository Connections
Displays the central repository connections and the active central repository. To activate a central repository, right-click one of the central repository connections listed and select Activate.

Reactivate automatically
Select if you want the active central repository to be reactivated whenever you log in to the software using the current local repository.

Data — General

Century Change Year
Indicates how the software interprets the century for two-digit years. Two-digit years greater than or equal to this value are interpreted as 19##. Two-digit years less than this value are interpreted as 20##. The default value is 15. For example, if the Century Change Year is set to 15:

Two-digit year    Interpreted as
99                1999
16                1916
15                1915
14                2014

Convert blanks to nulls for Oracle bulk loader
Converts blanks to NULL values when loading data using the Oracle bulk loader utility and:
• the column is not part of the primary key
• the column is nullable
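To make the cutoff concrete, the rule can be sketched in the Data Services scripting language. This is an illustration only, not something you need to write yourself; the software applies the rule internally, and the variable names ($yy, $full_year) are hypothetical:

# Sketch of the Century Change Year rule with the default value of 15.
# $yy holds a two-digit year between 0 and 99.
$full_year = ifthenelse($yy >= 15, 1900 + $yy, 2000 + $yy);
# 99 -> 1999, 16 -> 1916, 15 -> 1915, 14 -> 2014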

Job Server — General

Use this window to reset Job Server options, or to change them with guidance from SAP Technical Customer Support.

Job Server — Environment

Maximum number of engine processes
Sets a limit on the number of engine processes that this Job Server can have running concurrently.

Related Topics
• Changing Job Server options


Chapter 4 Projects and Jobs

Project and job objects represent the top two levels of organization for the application flows you create using the Designer.

Projects

A project is a reusable object that allows you to group jobs. A project is the highest level of organization offered by the software. Opening a project makes one group of objects easily accessible in the user interface. You can use a project to group jobs that have schedules that depend on one another or that you want to monitor together.

Projects have common characteristics:
• Projects are listed in the object library.
• Only one project can be open at a time.
• Projects cannot be shared among multiple users.

Objects that make up a project

The objects in a project appear hierarchically in the project area. If a plus sign (+) appears next to an object, expand it to view the lower-level objects contained in the object. The software shows you the contents as both names in the project area hierarchy and icons in the workspace.

In the following example, the Job_KeyGen job contains two data flows, and the DF_EmpMap data flow contains multiple objects. Each item selected in the project area also displays in the workspace:

Creating a new project
1. Choose Project > New > Project.
2. Enter the name of your new project. The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
3. Click Create.
The new project appears in the project area. As you add jobs and other lower-level objects to the project, they also appear in the project area.

Opening existing projects

To open an existing project
1. Choose Project > Open.
2. Select the name of an existing project from the list.
3. Click Open.
Note: If another project was already open, the software closes that project and opens the new one.

Saving projects

To save all changes to a project
1. Choose Project > Save All.

The software lists the jobs, work flows, and data flows that you edited since the last save.
2. (optional) Deselect any listed object to avoid saving it.
3. Click OK.
Note: The software also prompts you to save all objects that have changes when you execute a job and when you exit the Designer. Saving a reusable object saves any single-use object included in it.

Jobs

A job is the only object you can execute. You can manually execute and test jobs in development. In production, you can schedule batch jobs and set up real-time jobs as services that execute a process when the software receives a message request.

A job is made up of steps you want executed together. Each step is represented by an object icon that you place in the workspace to create a job diagram. A job diagram is made up of two or more objects connected together. You can include any of the following objects in a job definition:
• Data flows
   • Sources
   • Targets
   • Transforms
• Work flows
   • Scripts
   • Conditionals
   • While Loops
   • Try/catch blocks

If a job becomes complex, organize its content into individual work flows, then create a single job that calls those work flows.

Real-time jobs use the same components as batch jobs. You can add work flows and data flows to both batch and real-time jobs. When you drag a work flow or data flow icon into a job, you are telling the software to validate these objects according to the requirements of the job type (either batch or real-time). There are some restrictions regarding the use of some software features with real-time jobs.

Related Topics
• Work Flows
• Real-time Jobs

Creating jobs

To create a job in the project area
1. In the project area, select the project name.
2. Right-click and choose New Batch Job or Real Time Job.
3. Edit the name. The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
The software opens a new workspace for you to define the job.

To create a job in the object library
1. Go to the Jobs tab.
2. Right-click Batch Jobs or Real Time Jobs and choose New.
3. A new job with a default name appears.
4. Right-click and select Properties to change the object's name and add a description. The name can include alphanumeric characters and underscores (_). It cannot contain blank spaces.
5. To add the job to the open project, drag it into the project area.

Naming conventions for objects in jobs

We recommend that you follow consistent naming conventions to facilitate object identification across all systems in your enterprise. This allows you to more easily work with metadata across all applications such as:
• Data-modeling applications
• ETL applications
• Reporting applications
• Adapter software development kits

Examples of conventions recommended for use with jobs and other objects are shown in the following table.

Prefix    Suffix        Object                    Example
DF_       n/a           Data flow                 DF_Currency
EDF_      _Input        Embedded data flow        EDF_Example_Input
EDF_      _Output       Embedded data flow        EDF_Example_Output
RTJob_    n/a           Real-time job             RTJob_OrderStatus
WF_       n/a           Work flow                 WF_SalesOrg
JOB_      n/a           Job                       JOB_SalesOrg
n/a       _DS           Datastore                 ORA_DS
DC_       n/a           Datastore configuration   DC_DB2_production
SC_       n/a           System configuration      SC_ORA_test
n/a       _Memory_DS    Memory datastore          Catalog_Memory_DS
PROC_     n/a           Stored procedure          PROC_SalesStatus

Although the Designer is a graphical user interface with icons representing objects in its windows, other interfaces might require you to identify object types by the text alone. By using a prefix or suffix, you can more easily identify your object's type.

In addition to prefixes and suffixes, you might want to provide standardized names for objects that identify a specific action across all object types. For example: DF_OrderStatus, RTJob_OrderStatus.

In addition to prefixes and suffixes, naming conventions can also include path name identifiers. For example, the stored procedure naming convention can look like either of the following:
<datastore>.<owner>.<PROC_Name>
<datastore>.<owner>.<package>.<PROC_Name>
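As an illustration of the path-name form, a stored procedure imported through a datastore could be referenced in a script by its qualified name. The names below (ORA_DS, SCOTT, PROC_SalesStatus, $order_id) are hypothetical and simply follow the conventions above:

# Hypothetical call to an imported stored procedure by its qualified name.
$status = ORA_DS.SCOTT.PROC_SalesStatus($order_id);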


Chapter 5 Datastores

This section describes different types of datastores, provides details about the Attunity Connector datastore, and provides instructions for configuring datastores.

What are datastores?

Datastores represent connection configurations between the software and databases or applications. These configurations can be direct or through adapters. Datastore configurations allow the software to access metadata from a database or application and read from or write to that database or application while the software executes a job.

SAP BusinessObjects Data Services datastores can connect to:
• Databases and mainframe file systems.
• Applications that have pre-packaged or user-written adapters.
• SAP applications and SAP NetWeaver BW, Oracle Applications, PeopleSoft, J.D. Edwards One World and J.D. Edwards World, and Siebel Applications. See the appropriate supplement guide.

Note: The software reads and writes data stored in flat files through flat file formats. The software reads and writes data stored in XML documents through DTDs and XML Schemas.

The specific information that a datastore object can access depends on the connection configuration. When your database or application changes, make corresponding changes in the datastore information in the software. The software does not automatically detect the new information.

Note: Objects deleted from a datastore connection are identified in the project area and workspace by a red "deleted" icon. This visual flag allows you to find and update data flows affected by datastore changes.

You can create multiple configurations for a datastore. This allows you to plan ahead for the different environments your datastore may be used in and limits the work involved with migrating jobs. For example, you can add a set of configurations (DEV, TEST, and PROD) to the same datastore name. These connection settings stay with the datastore during export or import. You can group any set of datastore configurations into a system configuration.

When running or scheduling a job, select a system configuration and, thus, the set of datastore configurations for your current environment.

Related Topics
• Database datastores
• Adapter datastores
• File formats
• Formatting XML documents
• Creating and managing multiple datastore configurations

Database datastores

Database datastores can represent single or multiple connections with:
• Legacy systems using Attunity Connect
• IBM DB2, HP Neoview, Informix, Microsoft SQL Server, MySQL, Netezza, Oracle, SAP BusinessObjects Data Federator, Sybase ASE, Sybase IQ, and Teradata databases (using native connections)
• Other databases (through ODBC)
• A repository, using a memory datastore or persistent cache datastore

Mainframe interface

The software provides the Attunity Connector datastore that accesses mainframe data sources through Attunity Connect. The data sources that Attunity Connect accesses are in the following list. For a complete list of sources, refer to the Attunity documentation.
• Adabas
• DB2 UDB for OS/390 and DB2 UDB for OS/400
• IMS/DB
• VSAM
• Flat files on OS/390 and flat files on OS/400

Prerequisites for an Attunity datastore

Attunity Connector accesses mainframe data using software that you must manually install on the mainframe server and the local client (Job Server) computer. The software connects to Attunity Connector using its ODBC interface. It is not necessary to purchase a separate ODBC driver manager for UNIX and Windows platforms.

Servers
Install and configure the Attunity Connect product on the server (for example, a zSeries computer).

Clients
To access mainframe data using Attunity Connector, install the Attunity Connect product. The ODBC driver is required. Attunity also offers an optional tool called Attunity Studio, which you can use for configuration and administration.

When you install a Job Server on UNIX, the installer will prompt you to provide an installation directory path for the Attunity connector software. In addition, you do not need to install a driver manager, because the software loads ODBC drivers directly on UNIX platforms.

For more information about how to install and configure these products, refer to their documentation.

Configuring an Attunity datastore

To use the Attunity Connector datastore option, upgrade your repository to SAP BusinessObjects Data Services version 6.5.1 or later.

To create an Attunity Connector datastore:
1. In the Datastores tab of the object library, right-click and select New.
2. Enter a name for the datastore.

3. In the Datastore type box, select Database.
4. In the Database type box, select Attunity Connector.
5. Type the Attunity data source name, the location of the Attunity daemon (Host location), the Attunity daemon port number, and a unique Attunity server workspace name.
6. To change any of the default options (such as Rows per Commit or Language), click the Advanced button.
7. Click OK.
You can now use the new datastore connection to import metadata tables into the current repository.

Specifying multiple data sources in one Attunity datastore

You can use the Attunity Connector datastore to access multiple Attunity data sources on the same Attunity Daemon location. If you have several types of data on the same computer, for example a DB2 database and VSAM, you might want to access both types of data using a single connection. For example, you can use a single connection to join tables (and push the join operation down to a remote server), which reduces the amount of data transmitted through your network.

To specify multiple sources in the Datastore Editor:
1. Separate data source names with semicolons in the Attunity data source box using the following format:
AttunityDataSourceName;AttunityDataSourceName
For example, if you have a DB2 data source named DSN4 and a VSAM data source named Navdemo, enter the following values into the Data source box:
DSN4;Navdemo
2. If you list multiple data source names for one Attunity Connector datastore, ensure that you meet the following requirements:
• All Attunity data sources must be accessible by the same user name and password.
• All Attunity data sources must use the same workspace. When you set up access to the data sources in Attunity Studio, use the same workspace name for each data source.

Data Services naming convention for Attunity tables

Data Services' format for accessing Attunity tables is unique to Data Services. Because a single datastore can access multiple software systems that do not share the same namespace, the name of the Attunity data source must be specified when referring to a table. With an Attunity Connector, precede the table name with the data source and owner names separated by a colon. The format is as follows:
AttunityDataSource:OwnerName.TableName

When using the Designer to create your jobs with imported Attunity tables, Data Services automatically generates the correct SQL for this format. However, when you author SQL yourself, be sure to use this format. You can author SQL in the following constructs:
• SQL function
• SQL transform
• Pushdown_sql function
• Pre-load commands in table loader
• Post-load commands in table loader

Note: For any table in Data Services, the maximum size of the owner name is 64 characters. In the case of Attunity tables, the maximum size of the Attunity data source name and actual owner name is 63 (the colon accounts for 1 character). Data Services cannot access a table with an owner name larger than 64 characters.
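For example, hand-written SQL in a script could use the software's sql function with this table format. The names here (AttunityDS, DSN4, OWNER1, CUSTOMERS, $count) are hypothetical and serve only to illustrate the data-source:owner.table pattern:

# Count rows in an Attunity table; note the colon between data source and owner.
$count = sql('AttunityDS', 'SELECT COUNT(*) FROM DSN4:OWNER1.CUSTOMERS');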

Limitations

All Data Services features are available when you use an Attunity Connector datastore, except the following:
• Bulk loading
• Imported functions (imports metadata for tables only)
• Template tables (creating tables)
• The datetime data type supports up to 2 sub-seconds only
• Data Services cannot load timestamp data into a timestamp column in a table, because Attunity truncates varchar data to 8 characters, which is not enough to correctly represent a timestamp value.

When running a job on UNIX, the job could fail with the following error:
[D000] Cannot open file /usr1/attun/navroot/def/sys System error 13: The file access permissions do not allow the specified action. (OPEN)
This error occurs because of insufficient file permissions to some of the files in the Attunity installation directory. To avoid this error, change the file permissions for all files in the Attunity directory to 777 by executing the following command from the Attunity installation directory:
$ chmod -R 777 *

Defining a database datastore

Define at least one database datastore for each database or mainframe file system with which you are exchanging data. To define a datastore, get appropriate access privileges to the database or file system that the datastore describes. For example, to allow the software to use parameterized SQL when reading or writing to DB2 databases, authorize the user (of the datastore/database) to create, execute, and drop stored procedures. If a user is not authorized to create, execute, and drop stored procedures, jobs will still run; however, they will produce a warning message and will run less efficiently.

To define a Database datastore
1. In the Datastores tab of the object library, right-click and select New.

2. Enter the name of the new datastore in the Datastore Name field. The name can contain any alphabetical or numeric characters or underscores (_). It cannot contain spaces.
3. Select the Datastore type. Choose Database. When you select a Datastore Type, the software displays other options relevant to that type.
4. Select the Database type.
Note: If you select Data Federator, you must also specify the catalog name and the schema name in the URL. If you do not, you may see all of the tables from each catalog.
a. Select ODBC Admin and then the System DSN tab.
b. Highlight Data Federator, and then click Configure.
c. In the URL option, enter the catalog name and the schema name, for example:
jdbc:leselect://localhost/<catalog name>;schema=<schema name>
5. Enter the appropriate information for the selected database type.
6. The Enable automatic data transfer check box is selected by default when you create a new datastore and you chose Database for Datastore type. This check box displays for all databases except Attunity Connector, Data Federator, Memory, and Persistent Cache.
Keep Enable automatic data transfer selected to enable transfer tables in this datastore that the Data_Transfer transform can use to push down subsequent database operations.
7. At this point, you can save the datastore or add more information to it:
• To save the datastore and close the Datastore Editor, click OK.
• To add more information, select Advanced.
To enter values for each configuration option, click the cells under each configuration name.
For the datastore as a whole, the following buttons are available:

Import unsupported data types as VARCHAR of size
The data types that the software supports are documented in the Reference Guide. If you want the software to convert a data type in your source that it would not normally support, select this option and enter the number of characters that you will allow.

Edit
Opens the Configurations for Datastore dialog. Use the tool bar on this window to add, configure, and manage multiple configurations for a datastore.

Show ATL
Opens a text window that displays how the software will code the selections you make for this datastore in its scripting language.

OK
Saves selections and closes the Datastore Editor (Create New Datastore) window.

Cancel
Cancels selections and closes the Datastore Editor window.

Apply
Saves selections.

8. Click OK.

Note: On versions of Data Integrator prior to version 11.7.0, the correct database type to use when creating a datastore on Netezza was ODBC. SAP BusinessObjects Data Services 11.7.1 provides a specific Netezza option as the Database type instead of ODBC. When using Netezza as the database with the software, we recommend that you choose the software's Netezza option as the Database type rather than ODBC.

Related Topics
• Performance Optimization Guide: Data Transfer transform for push-down operations
• Reference Guide: Datastore
• Creating and managing multiple datastore configurations
• Ways of importing metadata

Configuring data sources used in a datastore

When UNIX_ODBC_DRIVER_MANAGER_LIB is specified in DSConfig.txt, the software assumes the user wants to use a third-party ODBC driver manager and automatically disables its own ODBC driver manager. Then for every data source name mentioned in an ODBC datastore, the software loads the library named by the UNIX_ODBC_DRIVER_MANAGER_LIB property.

If UseDIUNIXODBCDriverManager is TRUE, then for every data source name mentioned in an ODBC datastore, the software searches the $LINK_DIR/bin/odbc.ini file and loads the library mentioned in the Driver property. If the option UseDIUNIXODBCDriverManager is FALSE, then the software assumes the user wants to use DataDirect as the ODBC driver manager, and for every data source name mentioned in an ODBC datastore, the software loads the DataDirect driver manager library.

When configuring the ODBC driver, you can often combine the configuration files (odbc.ini) of different ODBC drivers into one single file and point to this file in the ODBCINI environment variable. Verify this functionality by referring to your ODBC vendor's documentation.
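For illustration, such a combined odbc.ini could merge the MySQL and Informix entries that appear in the procedures below into one file; the driver paths and data source names are the same sample values used there:

[test_mysql]
Driver = /home/mysql/myodbc/lib/libmyodbc3_r.so

[test_ifmx_odbc]
Driver = /3pt/merant50/lib/ivifcl20.so

You would then point the ODBCINI environment variable at this single file (for example, ODBCINI=/home/dsuser/odbc.ini; export ODBCINI, where the path is a placeholder).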

To configure MySQL ODBC on Linux
1. Add the data source to the UNIX ODBC driver manager configuration file ($LINK_DIR/bin/odbc.ini). For example:
[test_mysql]
Driver = /home/mysql/myodbc/lib/libmyodbc3_r.so
2. Add the data source to the MyODBC driver configuration file as:
[test_mysql]
Driver   = /home/mysql/myodbc/lib/libmyodbc3_r.so
SERVER   = mysql_host
PORT     = 3306
USER     = test
Password = test
Database = test
OPTION   = 3
SOCKET   =
3. Add the following environment settings to .profile:
ODBCINI=MyODBC Install Dir/lib/odbc.ini; export ODBCINI
LD_LIBRARY_PATH=MyODBC Install Dir/lib:$LD_LIBRARY_PATH

To configure the DataDirect Informix ODBC driver on Linux
1. Add the data source to the UNIX ODBC driver manager configuration file ($LINK_DIR/bin/odbc.ini). For example:
[test_ifmx_odbc]
Driver = /3pt/merant50/lib/ivifcl20.so
2. Add the data source to the DataDirect configuration file as:
[test_ifmx_odbc]
Driver=/3pt/merant50/lib/ivifcl20.so
Description=DataDirect 5.0 Informix Wire Protocol
ApplicationUsingThreads=1
CancelDetectInterval=0
Database=test_db3
HostName=ifmxsrvr_host
LogonID=informix
Password=
Protocol=olsoctcp
ServerName=ol_ifmxservr
Service=1526
TrimBlankFromIndexName=1
3. Add the following environment settings to .profile:
ODBCINI=DataDirect Install Dir/lib/odbc.ini; export ODBCINI
LD_LIBRARY_PATH=DataDirect Install Dir/lib:$LD_LIBRARY_PATH

To configure Neoview ODBC on Linux and HP-64

To use Neoview Transporter on UNIX, you must install the following software components:
• Neoview Transporter Java Client
• Java JRE version 1.5 or newer
• Neoview JDBC Type 4 driver
• Neoview ODBC UNIX drivers
• Neoview Command Interface

1. Add the data source to the UNIX ODBC driver manager configuration file ($LINK_DIR/bin/odbc.ini). For example:
[neoview2]
Driver=/usr/lib/libhpodbc.so
2. Add the following environment setting to .profile:
For Linux: LD_LIBRARY_PATH=neoview lib dir installation:$LD_LIBRARY_PATH
For HP-64: SHLIB_PATH=neoview lib dir installation:$SHLIB_PATH
3. Copy the configured MXODSN file to $LINK_DIR/bin.
Note: To pass multibyte data to Neoview systems, the regional settings must be changed to UTF-8.

Changing a datastore definition

Like all objects, datastores are defined by both options and properties:

• Options control the operation of objects. For example, the name of the database to connect to is a datastore option.
• Properties document the object. For example, the name of the datastore and the date on which it was created are datastore properties. Properties are merely descriptive of the object and do not affect its operation.

To change datastore options
1. Go to the Datastores tab in the object library.
2. Right-click the datastore name and choose Edit.
The Datastore Editor appears (the title bar for this dialog displays Edit Datastore). You can do the following tasks:
• Change the connection information for the current datastore configuration.
• Click Advanced and change properties for the current configuration.
• Click Edit to add, edit, or delete additional configurations. The Configurations for Datastore dialog opens when you select Edit in the Datastore Editor. Once you add a new configuration to an existing datastore, you can use the fields in the grid to change connection values and properties for the new configuration.
3. Click OK. The options take effect immediately.

Related Topics
• Reference Guide: Data Services Objects, Database datastores

To change datastore properties
1. Go to the Datastores tab in the object library.
2. Right-click the datastore name and select Properties. The Properties window opens.
3. Change the datastore properties.
4. Click OK.

Related Topics
• Reference Guide: Data Services Objects, Datastore

Browsing metadata through a database datastore

The software stores metadata information for all imported objects in a datastore. You can use the software to view metadata for imported or non-imported objects and to check whether the metadata has changed for objects already imported.

To view imported objects
1. Go to the Datastores tab in the object library.
2. Click the plus sign (+) next to the datastore name to view the object types in the datastore. For example, database datastores have functions, tables, and template tables.
3. Click the plus sign (+) next to an object type to view the objects of that type imported from the datastore. For example, click the plus sign (+) next to tables to view the imported tables.

To sort the list of objects
Click the column heading to sort the objects in each grouping and the groupings in each datastore alphabetically. Click again to sort in reverse-alphabetical order.

To view datastore metadata
1. Select the Datastores tab in the object library.
2. Choose a datastore, right-click, and select Open. (Alternatively, you can double-click the datastore icon.)
The software opens the datastore explorer in the workspace. The datastore explorer lists the tables in the datastore. You can view tables in the external database or tables in the internal repository. You can also search through them.
3. Select External metadata to view tables in the external database. If you select one or more tables, you can right-click for further options.

Open (Only available if you select one table)
Opens the editor for the table metadata.

Import
Imports (or re-imports) metadata from the database into the repository.

Reconcile
Checks for differences between metadata in the database and metadata in the repository.

4. Select Repository metadata to view imported tables.

If you select one or more tables, you can right-click for further options.

Open (Only available if you select one table)
Opens the editor for the table metadata.

Reconcile
Checks for differences between metadata in the repository and metadata in the database.

Reimport
Reimports metadata from the database into the repository.

Delete
Deletes the table or tables from the repository.

Properties (Only available if you select one table)
Shows the properties of the selected table.

View Data
Opens the View Data window, which allows you to see the data currently in the table.

Related Topics
• To import by searching

To determine if a schema has changed since it was imported
1. In the browser window showing the list of external tables, select External Metadata.
2. Choose the table or tables you want to check for changes.

3. Right-click and choose Reconcile.
The Changed column displays YES to indicate that the database tables differ from the metadata imported into the software. To use the most recent metadata from the software, reimport the table. The Imported column displays YES to indicate that the table has been imported into the repository.

To browse the metadata for an external table
1. In the browser window showing the list of external tables, select the table you want to view.
2. Right-click and choose Open.
A table editor appears in the workspace and displays the schema and attributes of the table.

To view the metadata for an imported table
1. Select the table name in the list of imported tables.
2. Right-click and select Open.
A table editor appears in the workspace and displays the schema and attributes of the table.

To view secondary index information for tables
Secondary index information can help you understand the schema of an imported table.
1. From the Datastores tab in the Designer, right-click a table to open the shortcut menu.
2. From the shortcut menu, click Properties to open the Properties window.
3. In the Properties window, click the Indexes tab. The left portion of the window displays the Index list.
4. Click an index to see the contents.

Importing metadata through a database datastore

For database datastores, you can import metadata for tables and functions. After importing metadata, you can edit column names, descriptions, and data types. The edits are propagated to all objects that call these objects.

Imported table information

The software determines and stores a specific set of metadata information for tables.

Table name
The name of the table as it appears in the database.
Note: The maximum table name length supported by the software is 64 characters. If the table name exceeds 64 characters, you may not be able to import the table.

Table description
The description of the table.

Column name
The name of the column.

Column description
The description of the column.

Column data type
The data type for the column. If a column is defined as an unsupported data type, the software converts the data type to one that is supported. In some cases, if the software cannot convert the data type, it ignores the column entirely.

Column content type
The content type identifies the type of data in the field.

Primary key column
The column(s) that comprise the primary key for the table. After a table has been added to a data flow diagram, these columns are indicated in the column list by a key icon next to the column name.

Table attribute
Information the software records about the table, such as the date created and date modified, if these values are available.

Owner name
Name of the table owner.
Note: The owner name for MySQL and Netezza data sources corresponds to the name of the database or schema where the table appears.

Varchar and Column Information from SAP BusinessObjects Data Federator tables

Any decimal column imported to Data Services from an SAP BusinessObjects Data Federator data source is converted to the decimal precision and scale (28,6). Any varchar column imported to the software from an SAP BusinessObjects Data Federator data source is varchar(1024). You may change the decimal precision or scale and varchar size within the software after importing from the SAP BusinessObjects Data Federator data source.

Imported stored function and procedure information

The software can import stored procedures from DB2, MS SQL Server, Oracle, Sybase ASE, Sybase IQ, and Teradata databases. You can also import stored functions and packages from Oracle. You can use these functions and procedures in the extraction specifications you give Data Services.

Information that is imported for functions includes:
• Function parameters
• Return type
• Name, owner

Imported functions and procedures appear on the Datastores tab of the object library. Functions and procedures appear in the Function branch of each datastore tree. You can configure imported functions and procedures through the function wizard and the smart editor in a category identified by the datastore name.

Related Topics
• Reference Guide: About procedures

Ways of importing metadata

This section discusses methods you can use to import metadata.

To import by browsing
Note: Functions cannot be imported by browsing.
1. Open the object library.
2. Go to the Datastores tab.
3. Select the datastore you want to use.
4. Right-click and choose Open.

The items available to import through the datastore appear in the workspace. In some environments, the tables are organized and displayed as a tree structure. If this is true, there is a plus sign (+) to the left of the name. Click the plus sign to navigate the structure.
The workspace contains columns that indicate whether the table has already been imported into the software (Imported) and if the table schema has changed since it was imported (Changed). To verify whether the repository contains the most recent metadata for an object, right-click the object and choose Reconcile.
5. Select the items for which you want to import metadata. For example, to import a table, you must select a table rather than a folder that contains tables.
6. Right-click and choose Import.
7. In the object library, go to the Datastores tab to display the list of imported objects.

To import by name
1. Open the object library.
2. Click the Datastores tab.
3. Select the datastore you want to use.
4. Right-click and choose Import By Name.
5. In the Import By Name window, choose the type of item you want to import from the Type list. If you are importing a stored procedure, select Function.
6. To import tables:
a. Enter a table name in the Name box to specify a particular table, or select the All check box, if available, to specify all tables. If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case.
b. Enter an owner name in the Owner box to limit the specified tables to a particular owner. If you leave the owner name blank, you specify matching tables regardless of owner (that is, any table with the specified table name).
7. To import functions and procedures:
• In the Name box, enter the name of the function or stored procedure.

If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case. Otherwise, the software will convert names into all upper-case characters.
You can also enter the name of a package. An Oracle package is an encapsulated collection of related program objects (e.g., procedures, functions, variables, constants, cursors, and exceptions) stored together in the database. The software allows you to import procedures or functions created within packages and use them as top-level procedures or functions.
If you enter a package name, the software imports all stored procedures and stored functions defined within the Oracle package. You cannot import an individual function or procedure defined within a package.
• Enter an owner name in the Owner box to limit the specified functions to a particular owner. If you leave the owner name blank, you specify matching functions regardless of owner (that is, any function with the specified name).
8. If you are importing an Oracle function or stored procedure and any of the following conditions apply, clear the Callable from SQL expression check box: the stored procedure contains a DDL statement, ends the current transaction with COMMIT or ROLLBACK, or issues any ALTER SESSION or ALTER SYSTEM commands. A stored procedure cannot be pushed down to a database inside another SQL statement under these conditions.
9. Click OK.

To import by searching
Note: Functions cannot be imported by searching.
1. Open the object library.
2. Click the Datastores tab.
3. Select the name of the datastore you want to use.
4. Right-click and select Search. The Search window appears.

5. Enter the entire item name or some part of it in the Name text box. If the name is case-sensitive in the database (and not all uppercase), enter the name as it appears in the database and use double quotation marks (") around the name to preserve the case.
6. Select Contains or Equals from the drop-down list to the right, depending on whether you provide a complete or partial search value. Equals qualifies only the full search string. That is, you need to search for owner.table_name rather than simply table_name.
7. (Optional) Enter a description in the Description text box.
8. Select the object type in the Type box.
9. Select the datastore in which you want to search from the Look In box.
10. Select External from the drop-down box to the right of the Look In box. External indicates that the software searches for the item in the entire database defined by the datastore. Internal indicates that the software searches only the items that have been imported.
11. Go to the Advanced tab to search using the software's attribute values. The advanced options only apply to searches of imported items.
12. Click Search. The software lists the tables matching your search criteria.
13. To import a table from the returned list, select the table, right-click, and choose Import.

Reimporting objects

If you have already imported an object such as a datastore, function, or table, you can reimport it, which updates the object's metadata from your database (reimporting overwrites any changes you might have made to the object in the software).

To reimport objects in previous versions of the software, you opened the datastore, viewed the repository metadata, and selected the objects to reimport.

In this version of the software, you can reimport objects using the object library at various levels:
• Individual objects — Reimports the metadata for an individual object such as a table or function
• Category node level — Reimports the definitions of all objects of that type in that datastore, for example all tables in the datastore
• Datastore level — Reimports the entire datastore and all its dependent objects including tables, functions, IDOCs, and hierarchies

To reimport objects from the object library
1. In the object library, click the Datastores tab.
2. Right-click an individual object and click Reimport, or right-click a category node or datastore name and click Reimport All. You can also select multiple individual objects using Ctrl-click or Shift-click.
3. Click Yes to reimport the metadata. If you selected multiple objects to reimport (for example with Reimport All), the software requests confirmation for each object unless you check the box Don't ask me again for the remaining objects. You can skip objects to reimport by clicking No for that object. If you are unsure whether to reimport (and thereby overwrite) the object, click View Where Used to display where the object is currently being used in your jobs.

Memory datastores

The software also allows you to create a database datastore using Memory as the Database type. A datastore normally provides a connection to a database, application, or adapter. By contrast, a memory datastore contains memory table schemas saved in the repository. A memory datastore is a container for memory tables.

Memory datastores are designed to enhance processing performance of data flows executing in real-time jobs. Data (typically small amounts in a real-time job) is stored in memory to provide immediate access instead of going to the original source data.

Memory tables are schemas that allow you to cache intermediate data. Memory tables can cache data from relational database tables and hierarchical data files such as XML messages and SAP IDocs (both of which contain nested schemas).

Memory tables can be used to:
• Move data between data flows in real-time jobs. By caching intermediate data, the performance of real-time jobs with multiple data flows is far better than it would be if files or regular tables were used to store intermediate data.
• Store table data in memory for the duration of a job. By storing table data in memory, the LOOKUP_EXT function and other transforms and functions that do not require database operations can access data without having to read it from a remote database.

The lifetime of memory table data is the duration of the job. The data in memory tables cannot be shared between different real-time jobs. Support for the use of memory tables in batch jobs is not available.

Memory tables are represented in the workspace with regular table icons. Therefore, label a memory datastore to distinguish its memory tables from regular database tables in the workspace. For best performance, only use memory tables when processing small quantities of data.

Creating memory datastores

You can create memory datastores using the Datastore Editor window.

To define a memory datastore
1. From the Project menu, select New > Datastore.
2. In the Name box, enter the name of the new datastore. Be sure to use the naming convention "Memory_DS". Datastore names are appended to table names when table icons appear in the workspace. No additional attributes are required for the memory datastore.
3. In the Datastore type box, keep the default Database.
4. In the Database Type box, select Memory.
5. Click OK.

Creating memory tables

When you create a memory table, you do not have to specify the table's schema or import the table's metadata. Instead, the software creates the schema for each memory table automatically based on the preceding schema, which can be either a schema from a relational database table or hierarchical data files such as XML messages. The first time you save the job, the software defines the memory table's schema and saves the table. Subsequently, the table appears with a table icon in the workspace and in the object library under the memory datastore.

To create a memory table
1. From the tool palette, click the template table icon.
2. Click inside a data flow to place the template table. The Create Table window opens.
3. From the Create Table window, select the memory datastore.
4. Enter a table name.
5. If you want a system-generated row ID column in the table, click the Create Row ID check box.
6. Click OK. The memory table appears in the workspace as a template table icon.
7. Connect the memory table to the data flow as a target.
8. From the Project menu select Save. In the workspace, the memory table's icon changes to a target table icon, and the table appears in the object library under the memory datastore's list of tables.

Related Topics
• Create Row ID option

Using memory tables as sources and targets

After you create a memory table as a target in one data flow, you can use a memory table as a source or target in any data flow.

Related Topics
• Real-time Jobs

To use a memory table as a source or target
1. In the object library, click the Datastores tab.
2. Expand the memory datastore that contains the memory table you want to use.
3. Expand Tables. A list of tables appears.
4. Select the memory table you want to use as a source or target, and drag it into an open data flow.
5. Connect the memory table as a source or target in the data flow. If you are using a memory table as a target, open the memory table's target table editor to set table options.
6. Save the job.

Related Topics
• Memory table target options

Update Schema option

You might want to quickly update a memory target table's schema if the preceding schema changes. To do this, use the Update Schema option. Otherwise, you would have to add a new memory table to update a schema.

To update the schema of a memory target table
1. Right-click the memory target table's icon in the workspace.
2. Select Update Schema.
The schema of the preceding object is used to update the memory target table's schema. The current memory table is updated in your repository. All occurrences of the current memory table are updated with the new schema.

Memory table target options

The Delete data from table before loading option is available for memory table targets. The default is on (the box is selected). If you deselect this option, new data will append to the existing table data. To set this option, open the memory target table editor.

Create Row ID option

If Create Row ID is checked in the Create Memory Table window, the software generates an integer column called DI_Row_ID in which the first row inserted gets a value of 1, the second row inserted gets a value of 2, and so on. This new column allows you to use a LOOKUP_EXT expression as an iterator in a script.

Note: The same functionality is available for other datastore types using the SQL function.

Use the DI_Row_ID column to iterate through a table using a lookup_ext function in a script. For example:

$NumOfRows = total_rows(memory_DS..table1);
$I = 1;
$count = 0;

while ($count < $NumOfRows)
begin
   $data = lookup_ext([memory_DS..table1, 'NO_CACHE', 'MAX'],
                      [A], [O], [DI_Row_ID, '=', $I]);
   $I = $I + 1;
   if ($data != NULL)
   begin
      $count = $count + 1;
   end
end

In the preceding script, table1 is a memory table. The table's name is preceded by its datastore name (memory_DS), a dot, a blank space (where a table owner would be for a regular table), then a second dot. There are no owners for memory datastores, so tables are identified by just the datastore name and the table name as shown. Select the LOOKUP_EXT function arguments (line 7) from the function editor when you define a LOOKUP_EXT function.

The TOTAL_ROWS(DatastoreName.Owner.TableName) function returns the number of rows in a particular table in a datastore. This function can be used with any type of datastore. If used with a memory datastore, use the following syntax:
TOTAL_ROWS(DatastoreName..TableName)

The software also provides a built-in function that you can use to explicitly expunge data from a memory table. This provides finer control than the active job has over your data and memory usage. The TRUNCATE_TABLE(DatastoreName..TableName) function can only be used with memory tables.
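A minimal script sketch combining the two functions, reusing the memory_DS..table1 memory table from the example above (the variable name $rows is hypothetical):

# Report the current row count, then expunge the memory table.
$rows = total_rows(memory_DS..table1);
print('table1 holds [$rows] rows; truncating it now');
truncate_table(memory_DS..table1);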

Related Topics
• Reference Guide: Functions and Procedures, Descriptions of built-in functions

Troubleshooting memory tables
• One possible error, particularly when using memory tables, is that the software runs out of virtual memory space. The software exits if it runs out of memory while executing any operation.
• A validation and run-time error occurs if the schema of a memory table does not match the schema of the preceding object in the data flow. To correct this error, use the Update Schema option or create a new memory table to match the schema of the preceding object in the data flow.
• Two log files contain information specific to memory tables: the trace_memory_reader log and the trace_memory_loader log.

Persistent cache datastores

The software also allows you to create a database datastore using Persistent cache as the Database type. A datastore normally provides a connection to a database, application, or adapter. By contrast, a persistent cache datastore contains cache table schemas saved in the repository. A persistent cache datastore is a container for cache tables.

Persistent cache tables allow you to cache large amounts of data. Persistent cache tables can cache data from relational database tables and files.

Note: You cannot cache data from hierarchical data files such as XML messages and SAP IDocs (both of which contain nested schemas). You cannot perform incremental inserts, deletes, or updates on a persistent cache table.

Persistent cache datastores provide the following benefits for data flows that process large volumes of data:
• You can store a large amount of data in persistent cache, which the software quickly loads into memory to provide immediate access during a job. For example, you can access a lookup table or comparison table locally (instead of reading from a remote database).
• You can create cache tables that multiple data flows can share (unlike a memory table, which cannot be shared between different real-time jobs). For example, if a large lookup table used in a lookup_ext function rarely changes, you can create a cache once and subsequent jobs can use this cache instead of creating it each time.

You create a persistent cache table by loading data into the persistent cache target table using one data flow. You can then subsequently read from the cache table in another data flow. When you load data into a persistent cache table, the software always truncates and recreates the table.
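As a sketch, a later data flow or script could then read the cached data with lookup_ext, in the same form as the memory-table example earlier in this chapter. Persist_DS, PRODUCTS, PRICE, SKU, and $sku are hypothetical names, and the double-dot (no owner) form is an assumption mirroring the memory datastore syntax shown earlier:

# Look up a price in a persistent cache table, preloading the cache into memory.
$price = lookup_ext([Persist_DS..PRODUCTS, 'PRE_LOAD_CACHE', 'MAX'],
                    [PRICE], [NULL], [SKU, '=', $sku]);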

Creating persistent cache datastores

You can create persistent cache datastores using the Datastore Editor window.

To define a persistent cache datastore
1. From the Project menu, select New > Datastore.
2. In the Name box, enter the name of the new datastore. Be sure to use a naming convention such as "Persist_DS". Datastore names are appended to table names when table icons appear in the workspace. Therefore, label a persistent cache datastore to distinguish its persistent cache tables from regular database tables in the workspace.
3. In the Datastore type box, keep the default Database.
4. In the Database Type box, select Persistent cache.
5. In the Cache directory box, you can either type or browse to a directory where you want to store the persistent cache.
6. Click OK.

Creating persistent cache tables

When you create a persistent cache table, you do not have to specify the table's schema or import the table's metadata. Instead, the software creates the schema for each persistent cache table automatically based on the preceding schema. The first time you save the job, the software defines the persistent cache table's schema and saves the table. Subsequently, the table appears with a table icon in the workspace and in the object library under the persistent cache datastore. Persistent cache tables are represented in the workspace with regular table icons.

You create a persistent cache table in one of the following ways:
• As a target template table in a data flow
• As part of the Data_Transfer transform during the job execution

Related Topics
• Reference Guide: Data_Transfer

To create a persistent cache table as a target in a data flow
1. Use one of the following methods to open the Create Template window:
• From the tool palette:
a. Click the template table icon.
b. Click inside a data flow to place the template table in the workspace.
c. On the Create Template window, select the persistent cache datastore.
• From the object library:
a. Expand a persistent cache datastore.
b. Click the template table icon and drag it to the workspace.

2. On the Create Template window, enter a table name.
3. Click OK. The persistent cache table appears in the workspace as a template table icon.
4. Connect the persistent cache table to the data flow as a target (usually a Query transform).
5. In the Query transform, map the Schema In columns that you want to include in the persistent cache table.
6. Open the persistent cache table's target table editor to set table options.
7. On the Options tab of the persistent cache target table editor, you can change the following options for the persistent cache table:
• Column comparison — Specifies how the input columns are mapped to persistent cache table columns. There are two options:
   • Compare_by_position — The software disregards the column names and maps source columns to target columns by position.
   • Compare_by_name — The software maps source columns to target columns by name. This option is the default.
• Include duplicate keys — Select this check box to cache duplicate keys. This option is selected by default.
8. On the Keys tab, specify the key column or columns to use as the key in the persistent cache table.

9. From the Project menu select Save. In the workspace, the template table's icon changes to a target table icon, and the table appears in the object library under the persistent cache datastore's list of tables.

Related Topics
• Reference Guide: Target persistent cache tables

Using persistent cache tables as sources

After you create a persistent cache table as a target in one data flow, you can use the persistent cache table as a source in any data flow. You can also use it as a lookup table or comparison table.

Related Topics
• Reference Guide: Persistent cache source

Linked datastores

Various database vendors support one-way communication paths from one database server to another. Oracle calls these paths database links. In DB2, the one-way communication path from a database server to another database server is provided by an information server that allows a set of servers to get data from remote data sources. In Microsoft SQL Server, linked servers provide the one-way communication path from one database server to another. These solutions allow local users to access data on a remote database, which can be on the local or a remote computer and of the same or different database type.

For example, a local Oracle database server, called Orders, can store a database link to access information in a remote Oracle database, Customers. Users connected to Customers, however, cannot use the same link to access data in Orders. Users logged into database Customers must define a separate link, stored in the data dictionary of database Customers, to access data on Orders.

The software refers to communication paths between databases as database links. The datastores in a database link relationship are called linked datastores. The software uses linked datastores to enhance its performance by pushing down operations to a target database using a target datastore.
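For background, the external link itself is created in the database, not in the software. On an Oracle server such as Orders, a DBA might define the link with SQL like the following sketch, where the link name, credentials, and TNS alias are placeholders:

CREATE DATABASE LINK customers_link
  CONNECT TO cust_user IDENTIFIED BY cust_password
  USING 'customers_tns';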

Related Topics
• Performance Optimization Guide: Database link support for push-down operations across datastores

Relationship between database links and datastores
A database link stores information about how to connect to a remote data source, such as its host name, database name, user name, password, and database type. The same information is stored in an SAP BusinessObjects Data Services database datastore. You can associate the datastore to another datastore and then import an external database link as an option of a datastore. The datastores must connect to the databases defined in the database link.

Additional requirements are as follows:
• A local server for database links must be a target server in the software
• A remote server for database links must be a source server in the software
• An external (exists first in a database) database link establishes the relationship between any target datastore and a source datastore
• A local datastore can be related to zero or multiple datastores using a database link for each remote database
• Two datastores can be related to each other using one link only

The following diagram shows the possible relationships between database links and linked datastores:

Four database links, DBLink 1 through 4, are on database DB1 and the software reads them through datastore Ds1.

• Dblink1 relates datastore Ds1 to datastore Ds2. This relationship is called linked datastore Dblink1 (the linked datastore has the same name as the external database link).
• Dblink2 is not mapped to any datastore in the software because it relates Ds1 with Ds2, which are also related by Dblink1. Although it is not a regular case, you can create multiple external database links that connect to the same remote source. However, the software allows only one database link between a target datastore and a source datastore pair. For example, if you select DBLink1 to link target datastore DS1 with source datastore DS2, you cannot import DBLink2 to do the same.
• Dblink3 is not mapped to any datastore in the software because there is no datastore defined for the remote data source to which the external database link refers.
• Dblink4 relates Ds1 with Ds3.

Adapter datastores
Depending on the adapter implementation, adapters allow you to:
• Browse application metadata
• Import application metadata into a repository
• Move batch and real-time data between the software and applications

SAP offers an Adapter Software Development Kit (SDK) to develop your own custom adapters. Also, you can buy the software pre-packaged adapters to access application metadata and data in any application. For more information on these products, contact your SAP sales representative.

Adapters are represented in Designer by adapter datastores. Jobs provide batch and real-time data movement between the software and applications through an adapter datastore's subordinate objects:

Subordinate Objects    Use as                    For
Tables                 Source or target          Batch data movement
Documents              Source or target          Batch data movement
Functions              Function call in query    Batch data movement
Message functions      Function call in query    Real-time data movement
Outbound messages      Target only               Real-time data movement

Adapters can provide access to an application's data and metadata or just metadata. For example, if the data source is SQL-compatible, the adapter might be designed to access metadata, while the software extracts data from or loads data directly to the application.

Related Topics
• Management Console Administrator Guide: Adapters
• Source and target objects
• Real-time source and target objects

Defining an adapter datastore
You need to define at least one datastore for each adapter through which you are extracting or loading data. To define a datastore, you must have appropriate access privileges to the application that the adapter serves.

To define an adapter datastore
To create an adapter datastore, you must first install the adapter on the Job Server computer, configure the Job Server to support local adapters using the System Manager utility, and ensure that the Job Server's service is running.
1. In the Object Library, click to select the Datastores tab.
2. Right-click and select New.
The Datastore Editor dialog opens (the title bar reads, Create new Datastore).
3. Enter a unique identifying name for the datastore.
The datastore name appears in the Designer only. It can be the same as the adapter name.
4. In the Datastore type list, select Adapter.
5. Select a Job server from the list.
Adapters residing on the Job Server computer and registered with the selected Job Server appear in the Job server list.
6. Select an adapter instance from the Adapter instance name list.
7. Enter all adapter information required to complete the datastore connection.
Note: If the developer included a description for each option, the software displays it below the grid. Also, the adapter documentation should list all information required for a datastore connection.
For the datastore as a whole, the following buttons are available:

Buttons    Description
Edit       Opens the Configurations for Datastore dialog. Use the tool bar on this window to add, configure, and manage multiple configurations for a datastore.

Show ATL   Opens a text window that displays how the software will code the selections you make for this datastore in its scripting language.
OK         Saves selections and closes the Datastore Editor (Create New Datastore) window.
Cancel     Cancels selections and closes the Datastore Editor window.
Apply      Saves selections.

8. Click OK.
The datastore configuration is saved in your metadata repository and the new datastore appears in the object library.
After you complete your datastore connection, you can browse and/or import metadata from the data source through the adapter.

To change an adapter datastore's configuration
1. Right-click the datastore you want to browse and select Edit to open the Datastore Editor window.
2. Edit configuration information.
When editing an adapter datastore, enter or select a value. The software looks for the Job Server and adapter instance name you specify. If the Job Server and adapter instance both exist, and the Designer can communicate to get the adapter's properties, then it displays them accordingly. If the Designer cannot get the adapter's properties, then it retains the previous properties.
3. Click OK.

To delete an adapter datastore and associated metadata objects
1. Right-click the datastore you want to delete and select Delete.
2. Click OK in the confirmation window.
The software removes the datastore and all metadata objects contained within that datastore from the metadata repository. If these objects exist in established flows, they appear with a deleted icon.

Browsing metadata through an adapter datastore
The metadata you can browse depends on the specific adapter.

To browse application metadata
1. Right-click the datastore you want to browse and select Open.
A window opens showing source metadata.
2. Click plus signs [+] to expand objects and view subordinate objects.
3. Scroll to view metadata name and description attributes.
4. Right-click any object to check importability.

Importing metadata through an adapter datastore
The metadata you can import depends on the specific adapter. After importing metadata, you can edit it. Your edits propagate to all objects that call these objects.

To import application metadata while browsing
1. Right-click the datastore you want to browse, then select Open.

2. Find the metadata object you want to import from the browsable list.
3. Right-click the object and select Import.
4. The object is imported into one of the adapter datastore containers (documents, functions, tables, outbound messages, or message functions).

To import application metadata by name
1. Right-click the datastore from which you want metadata, then select Import by name.
The Import by name window appears containing import parameters with corresponding text boxes.
2. Click each import parameter text box and enter specific information related to the object you want to import.
3. Click OK.
Any object(s) matching your parameter constraints are imported to one of the corresponding categories specified under the datastore.

Web service datastores
Web service datastores represent a connection from Data Services to an external web service-based data source.

Defining a web service datastore
You need to define at least one datastore for each web service with which you are exchanging data. To define a datastore, you must have the appropriate access privileges to the web services that the datastore describes.

To define a web services datastore
1. In the Datastores tab of the object library, right-click and select New.
2. Enter the name of the new datastore in the Datastore name field.
The name can contain any alphabetical or numeric characters or underscores (_). It cannot contain spaces.

3. Select the Datastore type.
Choose Web Service. When you select a Datastore Type, Data Services displays other options relevant to that type.
4. Specify the Web Service URL.
The URL must accept connections and return the WSDL.
5. Click OK.
The datastore configuration is saved in your metadata repository and the new datastore appears in the object library.
After you complete your datastore connection, you can browse and/or import metadata from the web service through the datastore.

To change a web service datastore's configuration
1. Right-click the datastore you want to browse and select Edit to open the Datastore Editor window.
2. Edit configuration information.
3. Click OK.
The edited datastore configuration is saved in your metadata repository.

To delete a web service datastore and associated metadata objects
1. Right-click the datastore you want to delete and select Delete.
2. Click OK in the confirmation window.
Data Services removes the datastore and all metadata objects contained within that datastore from the metadata repository. If these objects exist in established data flows, they appear with a deleted icon.

Browsing WSDL metadata through a web service datastore
Data Services stores metadata information for all imported objects in a datastore.

You can use Data Services to view metadata for imported or non-imported objects and to check whether the metadata has changed for objects already imported.

To view imported objects
1. Go to the Datastores tab in the object library.
2. Click the plus sign (+) next to the datastore name to view the object types in the datastore.
Web service datastores have functions.
3. Click the plus sign (+) next to an object type to view the objects of that type imported from the datastore.

To sort the list of objects
Click the column heading to sort the objects in each grouping and the groupings in each datastore alphabetically. Click again to sort in reverse-alphabetical order.

To view WSDL metadata
1. Select the Datastores tab in the object library.
2. Choose a datastore, right-click, and select Open. (Alternatively, you can double-click the datastore icon.)
Data Services opens the datastore explorer in the workspace. The datastore explorer lists the web service ports and operations in the datastore. You can view ports and operations in the external web service or in the internal repository. You can also search through them.
3. Select External metadata to view web service ports and operations from the external WSDL.
If you select one or more operations, you can right-click for further options.

Command    Description
Import     Imports (or re-imports) operations from the database into the repository.

4. Select Repository metadata to view imported web service operations.
If you select one or more operations, you can right-click for further options.

Command       Description
Delete        Deletes the operation or operations from the repository.
Properties    Shows the properties of the selected web service operation.

Importing metadata through a web service datastore
For web service datastores, you can import metadata for web service operations.

To import web service operations
1. Right-click the datastore you want to browse, then select Open.
2. Find the web service operation you want to import from the browsable list.
3. Right-click the operation and select Import.
The operation is imported into the web service datastore's function container.

Creating and managing multiple datastore configurations
Creating multiple configurations for a single datastore allows you to consolidate separate datastore connections for similar sources or targets into one source or target datastore with multiple configurations. Then, you can select a set of configurations that includes the sources and targets you want by selecting a system configuration when you execute or schedule the job. The ability to create multiple datastore configurations provides greater ease-of-use for job portability scenarios, such as:
• OEM (different databases for design and distribution)
• Migration (different connections for DEV, TEST, and PROD)
• Multi-instance (databases with different versions or locales)

• Multi-user (databases for central and local repositories)

For more information about how to use multiple datastores to support these scenarios, see Portability solutions.

Related Topics
• Portability solutions

Definitions
Refer to the following terms when creating and managing multiple datastore configurations:

"Datastore configuration" — Allows you to provide multiple metadata sources or targets for datastores. Each configuration is a property of a datastore that refers to a set of configurable options (such as database connection name, database type, user name, password, and locale) and their values.

"Default datastore configuration" — The datastore configuration that the software uses for browsing and importing database objects (tables and functions) and executing jobs if no system configuration is specified. If a datastore has more than one configuration, select a default configuration, as needed. If a datastore has only one configuration, the software uses it as the default configuration.

"Current datastore configuration" — The datastore configuration that the software uses to execute a job. If you define a system configuration, the software will execute the job using the system configuration. Specify a current configuration for each system configuration. If you do not create a system configuration, or the system configuration does not specify a configuration for a datastore, the software uses the default datastore configuration as the current configuration at job execution time.

"Database objects" — The tables and functions that are imported from a datastore. Database objects usually have owners. Some database objects do not have owners. For example, database objects in an ODBC datastore connecting to an Access database do not have owners.

"Owner name" — Owner name of a database object (for example, a table) in an underlying database. Also known as database owner name or physical owner name.

"Alias" — A logical owner name. Create an alias for objects that are in different database environments if you have different owner names in those environments. You can create an alias from the datastore editor for any datastore configuration.

"Dependent objects" — Dependent objects are the jobs, work flows, data flows, and custom functions in which a database object is used. Dependent object information is generated by the where-used utility.

Why use multiple datastore configurations?
By creating multiple datastore configurations, you can decrease end-to-end development time in a multi-source, 24x7, enterprise data warehouse environment because you can easily port jobs among different database types, versions, and instances.

For example, porting can be as simple as:
1. Creating a new configuration within an existing source or target datastore.
2. Adding a datastore alias and then mapping configurations with different object owner names to it.
3. Defining a system configuration and then adding the datastore configurations required for a particular environment. Select a system configuration when you execute a job.

Creating a new configuration
You can create multiple configurations for all datastore types except memory datastores. Use the Datastore Editor to create and edit datastore configurations.

Related Topics
• Reference Guide: Descriptions of objects, Datastore

To create a new datastore configuration
1. From the Datastores tab of the object library, right-click any existing datastore and select Edit.
2. Click Advanced to view existing configuration information.
Each datastore must have at least one configuration. If only one configuration exists, it is the default configuration.
3. Click Edit to open the Configurations for Datastore window.
4. Click the Create New Configuration icon on the toolbar.
The Create New Configuration window opens.
5. In the Create New Configuration window:
a. Enter a unique, logical configuration Name.
b. Select a Database type from the drop-down menu.
c. Select a Database version from the drop-down menu.
d. In the Values for table targets and SQL transforms section, the software pre-selects the Use values from value based on the existing database type and version. The Designer automatically uses the existing SQL transform and target values for the same database type and version.
Further, if the database you want to associate with a new configuration is a later version than that associated with other existing configurations, the Designer automatically populates the Use values from with the earlier version.
However, if database type and version are not already specified in an existing configuration, or if the database version is older than your existing configuration, you can choose to use the values from another existing configuration or the default for the database type and version.
e. Select or clear the Restore values if they already exist option.
When you delete datastore configurations, the software saves all associated target values and SQL transforms. If you create a new datastore configuration with the same database type and version as the one previously deleted, the Restore values if they already exist option allows you to access and take advantage of the saved value settings.

• If you keep this option (selected as default), the software uses customized target and SQL transform values from previously deleted datastore configurations.
• If you deselect Restore values if they already exist, the software does not attempt to restore target and SQL transform values, allowing you to provide new values.
f. Click OK to save the new configuration.
If your datastore contains pre-existing data flows with SQL transforms or target objects, the software must add any new database type and version values to these transform and target objects. Under these circumstances, when you add a new datastore configuration, the software displays the Added New Values - Modified Objects window, which provides detailed information about affected data flows and modified objects. These same results also display in the Output window of the Designer.

For each datastore, the software requires that one configuration be designated as the default configuration. The software uses the default configuration to import metadata and also preserves the default configuration during export and multi-user operations. Your first datastore configuration is automatically designated as the default; however, after adding one or more additional datastore configurations, you can use the datastore editor to flag a different configuration as the default.

When you export a repository, the software preserves all configurations in all datastores including related SQL transform text and target table editor settings. If the datastore you are exporting already exists in the target repository, the software overrides configurations in the target with source configurations. The software exports system configurations separate from other job related objects.

Adding a datastore alias
From the datastore editor, you can also create multiple aliases for a datastore then map datastore configurations to each alias.

To create an alias
1. From within the datastore editor, click Advanced, then click Aliases (Click here to create).
The Create New Alias window opens.
2. Under Alias Name in Designer, use only alphanumeric characters and the underscore symbol (_) to enter an alias name.
3. Click OK.
The Create New Alias window closes and your new alias appears underneath the Aliases category.

When you define a datastore alias, the software substitutes your specified datastore configuration alias for the real owner name when you import metadata for database objects. You can also rename tables and functions after you import them. For more information, see Renaming table and function owner.

Functions to identify the configuration
The software provides six functions that are useful when working with multiple source and target datastore configurations.

Function                        Category         Description
db_type                         Miscellaneous    Returns the database type of the current datastore configuration.
db_version                      Miscellaneous    Returns the database version of the current datastore configuration.
db_database_name                Miscellaneous    Returns the database name of the current datastore configuration if the database type is MS SQL Server or Sybase ASE.
db_owner                        Miscellaneous    Returns the real owner name that corresponds to the given alias name under the current datastore configuration.
current_configuration           Miscellaneous    Returns the name of the datastore configuration that is in use at runtime.
current_system_configuration    Miscellaneous    Returns the name of the current system configuration. If no system configuration is defined, returns a NULL value.
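For example, a script can call these functions to log and capture the configuration a job run is using. The following is a minimal sketch only: the datastore name DS_Sales, the alias name ALIAS1, and the $G_* global variables are hypothetical and would have to exist in your own job.

    # Identify the configuration in use for this job run.
    # DS_Sales, ALIAS1, and the $G_* globals are hypothetical names.
    print('System configuration: ' || current_system_configuration());
    print('Datastore configuration: ' || current_configuration('DS_Sales'));

    # Capture the database type and the real owner name behind an alias.
    # The exact strings db_type returns for each database type are
    # listed in the Reference Guide.
    $G_DBType = db_type('DS_Sales');
    $G_Owner  = db_owner('DS_Sales', 'ALIAS1');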

Use the Administrator to select a system configuration as well as view the underlying datastore configuration associated with it when you:
• Execute batch jobs
• Schedule batch jobs
• View batch job history
• Create services for real-time jobs

To use multiple configurations successfully, design your jobs so that you do not need to change schemas, data types, functions, variables, and so on when you switch between datastore configurations. For example, if you have a datastore with a configuration for Oracle sources and SQL sources, make sure that the table metadata schemas match exactly. Use the same table names, alias names, number and order of columns, as well as the same column names, data types, and content types.

The software links any SQL transform and target table editor settings used in a data flow to datastore configurations. You can also use variable interpolation in SQL text with these functions to enable a SQL transform to perform successfully regardless of which configuration the Job Server uses at job execution time.
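For example, the SQL text entered in a SQL transform can carry a variable instead of a hard-coded owner name. This is a sketch only: $G_Owner is a hypothetical global variable that a preceding script would set, for instance from db_owner.

    -- SQL text entered in a SQL transform; [$G_Owner] is replaced with
    -- the variable's value at execution time, so the same transform
    -- works for every datastore configuration.
    SELECT CUST_ID, CUST_NAME
    FROM [$G_Owner].CUSTOMER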

Related Topics
• Reference Guide: Descriptions of built-in functions
• Reference Guide: SQL
• Job portability tips

Portability solutions
Set multiple source or target configurations for a single datastore if you want to quickly change connections to a different source or target database. The software provides several different solutions for porting jobs.

Related Topics
• Advanced Development Guide: Multi-user Development
• Advanced Development Guide: Multi-user Environment Setup

Migration between environments
When you must move repository metadata to another environment (for example from development to test or from test to production) which uses different source and target databases, the process typically includes the following characteristics:
• The environments use the same database type but may have unique database versions or locales.
• Database objects (tables and functions) can belong to different owners.
• Each environment has a unique database connection name, user name, password, other connection properties, and owner mapping.
• You use a typical repository migration procedure. Either you export jobs to an ATL file then import the ATL file to another repository, or you export jobs directly from one repository to another repository.

Because the software overwrites datastore configurations during export, you should add configurations for the target environment (for example, add configurations for the test environment when migrating from development to test) to the source repository (for example, add to the development repository before migrating to the test environment). The Export utility saves additional configurations in the target environment, which means that you do not have to edit datastores before running ported jobs in the target environment.

This solution offers the following advantages:
• Minimal production down time: You can start jobs as soon as you export them.
• Minimal security issues: Testers and operators in production do not need permission to modify repository objects.

Related Topics
• Advanced Development Guide: Export/Import

Loading Multiple instances
If you must load multiple instances of a data source to a target data warehouse, the task is the same as in a migration scenario except that you are using only one repository.

To load multiple instances of a data source to a target data warehouse
1. Create a datastore that connects to a particular instance.
2. Define the first datastore configuration.
This datastore configuration contains all configurable properties such as database type, database connection name, user name, password, database version, and locale information. When you define a configuration for an Adapter datastore, make sure that the relevant Job Server is running so the Designer can find all available adapter instances for the datastore.
3. Define a set of alias-to-owner mappings within the datastore configuration.
When you use an alias for a configuration, the software imports all objects using the metadata alias rather than using real owner names. This allows you to use database objects for jobs that are transparent to other database instances.
4. Use the database object owner renaming tool to rename owners of any existing database objects.
5. Import database objects and develop jobs using those objects, then run the jobs.
6. To support executing jobs under different instances, add datastore configurations for each additional instance.
7. Map owner names from the new database instance configurations to the aliases that you defined in an earlier step.

8. Run the jobs in all database instances.

Related Topics
• Renaming table and function owner

OEM deployment
If you design jobs for one database type and deploy those jobs to other database types as an OEM partner, the deployment typically has the following characteristics:
• The instances require various source database types and versions.
• Since a datastore can only access one instance at a time, you may need to trigger functions at run-time to match different instances. If this is the case, the software requires different SQL text for functions (such as lookup_ext and sql) and transforms (such as the SQL transform). The software also requires different settings for the target table (configurable in the target table editor).
• The instances may use different locales.
• Database tables across different databases belong to different owners.
• Each instance has a unique database connection name, user name, password, other connection properties, and owner mappings.
• You export jobs to ATL files for deployment.

To deploy jobs to other database types as an OEM partner
1. Develop jobs for a particular database type following the steps described in the Loading Multiple instances scenario.
To support a new instance under a new database type, the software copies target table and SQL transform database properties from the previous configuration to each additional configuration when you save it.
If you selected a bulk loader method for one or more target tables within your job's data flows, and new configurations apply to different database types, open your targets and manually set the bulk loader option (assuming you still want to use the bulk loader method with the new database type). The software does not copy bulk loader options for targets from one database type to another.

When the software saves a new configuration, it also generates a report that provides a list of targets automatically set for bulk loading. Reference this report to make manual changes as needed.
2. If the SQL text in any SQL transform is not applicable for the new database type, modify the SQL text for the new database type.
If the SQL text contains any hard-coded owner names or database names, consider replacing these names with variables to supply owner names or database names for multiple database types. This way, you will not have to modify the SQL text for each environment.
3. Because the software does not support unique SQL text for each database type or version of the sql(), lookup_ext(), and pushdown_sql() functions, use the db_type() and similar functions to get the database type and version of the current datastore configuration and provide the correct SQL text for that database type and version using the variable substitution (interpolation) technique.
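A minimal sketch of that technique follows; the datastore name DS_Target and the $G_* global variables are hypothetical, and the exact strings db_type returns should be checked against the Reference Guide.

    # Choose database-specific SQL text, then interpolate it into one
    # sql() call that works for every configuration.
    if (db_type('DS_Target') = 'Oracle')
    begin
        $G_SqlText = 'SELECT * FROM ORDERS WHERE ROWNUM <= 10';
    end
    else
    begin
        # Assumed Microsoft SQL Server syntax for the other configuration.
        $G_SqlText = 'SELECT TOP 10 * FROM ORDERS';
    end

    # [$G_SqlText] is substituted before the statement reaches the database.
    $G_Result = sql('DS_Target', '[$G_SqlText]');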

Related Topics
• Reference Guide: SQL

Multi-user development
If you are using a central repository management system, allowing multiple developers, each with their own local repository, to check in and check out jobs, the development environment typically has the following characteristics:
• It has a central repository and a number of local repositories.
• Multiple development environments get merged (via central repository operations such as check in and check out) at times. When this occurs, real owner names (used initially to import objects) must be later mapped to a set of aliases shared among all users.
• The software preserves object history (versions and labels).
• The instances share the same database type but may have different versions and locales.
• Database objects may belong to different owners.
• Each instance has a unique database connection name, user name, password, other connection properties, and owner mapping.

In the multi-user development scenario you must define aliases so that the software can properly preserve the history for all objects in the shared environment.

Porting jobs in a multi-user environment
When porting jobs in a multi-user environment, consider these points:
• Use the Renaming table and function owner feature to consolidate database object owner names into aliases.
• Renaming occurs in local repositories. To rename the database objects stored in the central repository, check out the datastore to a local repository and apply the renaming tool in the local repository.
• If the objects to be renamed have dependent objects, the software will ask you to check out the dependent objects.
• If all the dependent objects can be checked out, renaming will create a new object that has the alias and delete the original object that has the original owner name.
• If all the dependent objects cannot be checked out (data flows are checked out by another user), the software displays a message, which gives you the option to proceed or cancel the operation. If you cannot check out some of the dependent objects, the renaming tool only affects the flows that you can check out. After renaming, the original object will co-exist with the new object. The number of flows affected by the renaming process will affect the Usage and Where-Used information in the Designer for both the original object and the new object.
• The software does not delete original objects from the central repository when you check in the new objects. Checking in the new objects does not automatically check in the dependent objects that were checked out. You are responsible for checking in all the dependent objects that were checked out during the owner renaming process.
• Use caution because checking in datastores and checking them out as multi-user operations can override datastore configurations. Maintain the datastore configurations of all users by not overriding the configurations they created. Instead, add a configuration and make it your default configuration while working in your own environment.
• When your group completes the development phase, it is recommended that the last developer delete the configurations that

apply to the development environments and add the configurations that apply to the test or production environments.

Job portability tips
• The software assumes that the metadata of a table or function is the same across different database types and versions specified in different configurations in the same datastore. For instance, if you import a table when the default configuration of the datastore is Oracle, then later use the table in a job to extract from DB2, your job will run. Import metadata for a database object using the default configuration and use that same metadata with all configurations defined in the same datastore.
• The software supports options in some database types or versions that it does not support in others. For example, the software supports parallel reading on Oracle hash-partitioned tables, not on DB2 or other database hash-partitioned tables. If you import an Oracle hash-partitioned table and set your data flow to run in parallel, the software will read from each partition in parallel. However, when you run your job using sources from a DB2 environment, parallel reading will not occur.

The following features support job portability:
• Enhanced SQL transform
With the enhanced SQL transform, you can enter different SQL text for different database types/versions and use variable substitution in the SQL text to allow the software to read the correct text for its associated datastore configuration.
• Enhanced target table editor
Using enhanced target table editor options, you can configure database table targets for different database types/versions to match their datastore configurations.
• Enhanced datastore editor
Using the enhanced datastore editor, when you create a new datastore configuration you can choose to copy the database properties

(including the datastore and table target options as well as the SQL transform text) from an existing configuration or use the current values.

• When you design a job that will be run from different database types or versions, name database tables, functions, and stored procedures the same for all sources. If you create configurations for both case-insensitive databases and case-sensitive databases in the same datastore, it is recommended that you name the tables, functions, and stored procedures using all upper-case characters.
• Table schemas should match across the databases in a datastore. This means the number of columns, the column names, and column positions should be exactly the same. The column data types should be the same or compatible. For example, if you have a VARCHAR column in an Oracle source, use a VARCHAR column in the Microsoft SQL Server source too. If you have a DATE column in an Oracle source, use a DATETIME column in the Microsoft SQL Server source. Define primary and foreign keys the same way.
• Stored procedure schemas should match. When you import a stored procedure from one datastore configuration and try to use it for another datastore configuration, the software assumes that the signature of the stored procedure is exactly the same for the two databases. For example, if a stored procedure is a stored function (only Oracle supports stored functions), then you have to use it as a function with all other configurations in a datastore (in other words, all databases must be Oracle). If your stored procedure has three parameters in one database, it should have exactly three parameters in the other databases. Further, the names, positions, data types, and in/out types of the parameters must match exactly.

Related Topics
• Advanced Development Guide: Multi-user Development
• Advanced Development Guide: Multi-user Environment Setup

Renaming table and function owner
The software allows you to rename the owner of imported tables, template tables, or functions. This process is called owner renaming.

Use owner renaming to assign a single metadata alias instead of the real owner name for database objects in the datastore. Consolidating metadata under a single alias name allows you to access accurate and consistent dependency information at any time while also allowing you to more easily switch between configurations when you move jobs to different environments.

When using objects stored in a central repository, a shared alias makes it easy to track objects checked in by multiple users. If all users of local repositories use the same alias, the software can track dependencies for objects that your team checks in and out of the central repository.

When you rename an owner, the instances of a table or function in a data flow are affected, not the datastore from which they were imported.

To rename the owner of a table or function
1. From the Datastore tab of the local object library, expand a table, template table, or function category.
2. Right-click the table or function and select Rename Owner.
3. Enter a New Owner Name then click Rename.
When you enter a New Owner Name, the software uses it as a metadata alias for the table or function.
Note: If the object you are renaming already exists in the datastore, the software determines if the two objects have the same schema. If they are the same, then the software proceeds. If they are different, then the software displays a message to that effect. You may need to choose a different object name.

The software supports both case-sensitive and case-insensitive owner renaming.
• If the objects you want to rename are from a case-sensitive database, the owner renaming mechanism preserves case sensitivity.
• If the objects you want to rename are from a datastore that contains both case-sensitive and case-insensitive databases, the software will base the case-sensitivity of new owner names on the case sensitivity of the default configuration. To ensure that all objects are portable across all configurations in this scenario, enter all owner names and object names using uppercase characters.

During the owner renaming process:
• The software updates the dependent objects (jobs, work flows, and data flows that use the renamed object) to use the new owner name.
• The object library shows the entry of the object with the new owner name. Displayed Usage and Where-Used information reflect the number of updated dependent objects.
• If the software successfully updates all the dependent objects, it deletes the metadata for the object with the original owner name from the object library and the repository.

Using the Rename window in a multi-user scenario
This section provides a detailed description of Rename Owner window behavior in a multi-user scenario.

Using an alias for all objects stored in a central repository allows the software to track all objects checked in by multiple users. If all local repository users use the same alias, the software can track dependencies for objects that your team checks in and out of the central repository.

When you are checking objects in and out of a central repository, there are several behaviors possible when you select the Rename button, depending upon the check-out state of a renamed object and whether that object is associated with any dependent objects.

Case 1
Object is not checked out, and object has no dependent objects in the local or central repository.
Behavior: When you click Rename, the software renames the object owner.

Case 2
Object is checked out, and object has no dependent objects in the local or central repository.
Behavior: Same as Case 1.

Case 3
Object is not checked out, and object has one or more dependent objects (in the local repository).
Behavior: When you click Rename, the software displays a second window listing the dependent objects (that use or refer to the renamed object).
If you click Continue, the software renames the objects and modifies the dependent objects to refer to the renamed object using the new owner name. If you click Cancel, the Designer returns to the Rename Owner window.
Note: An object might still have one or more dependent objects in the central repository. However, if the object to be renamed is not checked out, the Rename Owner mechanism (by design) does not affect the dependent objects in the central repository.

Case 4
Object is checked out and has one or more dependent objects.
Behavior: This case contains some complexity.

• If you are not connected to the central repository, the status message reads:
This object is checked out from central repository X. Please select Tools | Central Repository… to activate that repository before renaming.
• If you are connected to the central repository, the Rename Owner window opens. When you click Rename, a second window opens to display the dependent objects and a status indicating their check-out state and location. If a dependent object is located in the local repository only, the status message reads:
Used only in local repository. No check out necessary.
• If the dependent object is in the central repository, and it is not checked out, the status message reads:
Not checked out
• If you have the dependent object checked out or it is checked out by another user, the status message shows the name of the checked out repository. For example:
Oracle.production.user1

The window with dependent objects looks like this:

As in Case 2, the purpose of this second window is to show the dependent objects. In addition, this window allows you to check out the necessary dependent objects from the central repository, without having to go to the Central Object Library window.

Click the Refresh List button to update the check out status in the list. This is useful when the software identifies a dependent object in the central repository but another user has it checked out. When that user checks in the dependent object, click Refresh List to update the status and verify that the dependent object is no longer checked out.

To use the Rename Owner feature to its best advantage, check out associated dependent objects from the central repository. This helps avoid having dependent objects that refer to objects with owner names that do not exist. From the central repository, select one or more objects, then right-click and select Check Out. After you check out the dependent object, the Designer updates the status. If the check out was successful, the status shows the name of the local repository.

Case 4a
You click Continue, but one or more dependent objects are not checked out from the central repository.

In this situation, the software displays another dialog box that warns you about objects not yet checked out and asks you to confirm your desire to continue. Click No to return to the previous dialog box showing the dependent objects. Click Yes to proceed with renaming the selected object and to edit its dependent objects.

The software modifies objects that are not checked out in the local repository to refer to the new owner name. It is your responsibility to maintain consistency with the objects in the central repository.

Case 4b
You click Continue, and all dependent objects are checked out from the central repository.
The software renames the owner of the selected object, and modifies all dependent objects to refer to the new owner name. Although to you, it looks as if the original object has a new owner name, in reality the software has not modified the original object; it created a new object identical to the original, but uses the new owner name. The original object with the old owner name still exists. The software then performs an "undo checkout" on the original object. It becomes your responsibility to check in the renamed object.

When the rename operation is successful, in the Datastore tab of the local object library, the software updates the table or function with the new owner name and the Output window displays the following message:

Object <Object_Name>: owner name <Old_Owner> successfully renamed to <New_Owner>, including references from dependent objects.

If the software does not successfully rename the owner, the Output window displays the following message:

Object <Object_Name>: Owner name <Old_Owner> could not be renamed to <New_Owner>.

Defining a system configuration
What is the difference between datastore configurations and system configurations?

• Datastore configurations — Each datastore configuration defines a connection to a particular database from a single datastore.
• System configurations — Each system configuration defines a set of datastore configurations that you want to use together when running a job. You can define a system configuration if your repository contains at least one datastore with multiple configurations. You can also associate substitution parameter configurations to system configurations.

When designing jobs, determine and create datastore configurations and system configurations depending on your business environment and rules. Create datastore configurations for the datastores in your repository before you create system configurations to organize and associate them.

Select a system configuration to use at run-time. In many enterprises, a job designer defines the required datastore and system configurations and then a system administrator determines which system configuration to use when scheduling or starting a job.

The software maintains system configurations separate from jobs. You cannot check in or check out system configurations in a multi-user environment. However, you can export system configurations to a separate flat file which you can later import.

Related Topics
• Creating a new configuration

To create a system configuration
1. From the Designer menu bar, select Tools > System Configurations.
The "Edit System Configurations" window displays.
2. To add a new system configuration, do one of the following:
• Click the Create New Configuration icon to add a configuration that references the default configuration of the substitution parameters and each datastore connection.
• Select an existing configuration and click the Duplicate Configuration icon to create a copy of the selected configuration. You can use the copy as a template and edit the substitution parameter or datastore configuration selections to suit your needs.

3. If desired, rename the new system configuration.
a. Select the system configuration you want to rename.
b. Click the Rename Configuration icon to enable the edit mode for the configuration name field.
c. Type a new, unique name and click outside the name field to accept your choice.
It is recommended that you follow a consistent naming convention and use the prefix SC_ in each system configuration name so that you can easily identify this file as a system configuration. This practice is particularly helpful when you export the system configuration.
4. From the list, select a substitution parameter configuration to associate with the system configuration.
5. For each datastore, select the datastore configuration you want to use when you run a job using the system configuration.
If you do not map a datastore configuration to a system configuration, the Job Server uses the default datastore configuration at run-time.
6. Click OK to save your system configuration settings.

Related Topics
• Associating a substitution parameter configuration with a system configuration

To export a system configuration
1. In the object library, select the Datastores tab and right-click a datastore.
2. Select Repository > Export System Configurations.
It is recommended that you add the SC_ prefix to each exported system configuration .atl file to easily identify that file as a system configuration.
3. Click OK.

Chapter 6 File formats

This section discusses file formats, how to use the file format editor, and how to create a file format in the software.

What are file formats?
A file format is a set of properties describing the structure of a flat file (ASCII). File formats describe the metadata structure. A file format describes a specific file. A file format template is a generic description that can be used for many data files.

A file format defines a connection to a file. Therefore, you use a file format to connect the software to source or target data when the data is stored in a file rather than a database table. The software can use data stored in files for data sources and targets. The object library stores file format templates that you use to define specific file formats as sources and targets in data flows.

When working with file formats, you must:
• Create a file format template that defines the structure for a file.
• Create a specific source or target file format in a data flow. The source or target file format is based on a template and specifies connection information such as the file name.

File format objects can describe files in:
• Delimited format — Characters such as commas or tabs separate each field
• Fixed width format — The column width is specified by the user
• SAP file format (for more information, see Supplement for SAP: Connecting to SAP Applications)

File format editor
Use the file format editor to set properties for file format templates and source and target file formats. Available properties vary by the mode of the file format editor:

• New mode — Create a new file format template
• Edit mode — Edit an existing file format template
• Source mode — Edit the file format of a particular source file
• Target mode — Edit the file format of a particular target file

The file format editor has three work areas:
• Properties-Values — Edit the values for file format properties. Expand and collapse the property groups by clicking the leading plus or minus.
• Column Attributes — Edit and define the columns or fields in the file. Field-specific formats override the default format set in the Properties-Values area.
• Data Preview — View how the settings affect sample data.

The file format editor contains "splitter" bars to allow resizing of the window and all the work areas. You can expand the file format editor to the full screen size.

The properties and appearance of the work areas vary with the format of the file.

You can navigate within the file format editor as follows:
• Switch between work areas using the Tab key.
• Navigate through fields in the Data Preview area with the Page Up, Page Down, and arrow keys.
• Open a drop-down menu in the Properties-Values area by pressing the ALT-down arrow key combination.
• When the file format type is fixed-width, you can also edit the column metadata structure in the Data Preview area.

Note: The Show ATL button displays a view-only copy of the Transformation Language file generated for your file format. You might be directed to use this by Technical Customer Assurance.

Related Topics
• Reference Guide: File format

Creating file formats
To specify a source or target file:
• Create a file format template that defines the structure for a file.
• When you drag and drop a file format into a data flow, the format represents a file that is based on the template and specifies connection information such as the file name.

Related Topics
• To create a new file format
• Modeling a file format on a sample file
• Replicating and renaming file formats
• To create a file format from an existing flat table schema
• To create a specific source or target file

To create a new file format
1. In the local object library, go to the Formats tab, right-click Flat Files, and select New.
2. In Name, enter a name that describes this file format template.
After you save this file format template, you cannot change the name.
3. In Type, specify the file type:
• Delimited — Select Delimited if the file uses a character sequence to separate columns.
• Fixed width — Select Fixed width if the file uses specified widths for each column.
4. If you want to read and load files using a third-party file-transfer program, select YES for Custom transfer program.

5. Complete the other properties to describe files that this template represents.
Look for properties available when the file format editor is in source mode or target mode.
6. For source files, specify the structure of the columns in the Column Attributes work area:
a. Enter field name.
b. Set data types.
c. Enter field lengths for Blob and VarChar data types.
d. Enter scale and precision information for Numeric and Decimal data types.
e. Enter Format field information for appropriate data types, if desired. This information overrides the default format set in the Properties-Values area for that data type.
f. Enter the Content Type. If you have added a column while creating a new format, the content type may automatically fill based on the field name. If an appropriate content type cannot be automatically filled, then it will default to blank.
Note:
• You do not need to specify columns for files used as targets. If you do specify columns and they do not match the output schema from the preceding transform, the software writes to the target file using the transform's output schema.
• For a decimal or real data type, if you only specify a source column format, and the column names and data types in the target schema do not match those in the source schema, the software cannot use the source column format specified. Instead, it defaults to the format used by the code page on the computer where the Job Server is installed.
7. Click Save & Close to save the file format template and close the file format editor.

Related Topics
• Reference Guide: Locales and Multi-byte Functionality
• File transfers
• Reference Guide: File format

Modeling a file format on a sample file

1. From the Formats tab in the local object library, create a new flat file format template or edit an existing flat file format template.
2. Under Data File(s):
• If the sample file is on your Designer computer, set Location to Local. Browse to set the Root directory and File(s) to specify the sample file.
Note: During design, you can specify a file located on the computer where the Designer runs or on the computer where the Job Server runs. Indicate the file location in the Location property. During execution, you must specify a file located on the Job Server computer that will execute the job.
• If the sample file is on the current Job Server computer, set Location to Job Server. Enter the Root directory and File(s) to specify the sample file. When you select Job Server, the Browse icon is disabled, so you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it. For example, a path on UNIX might be /usr/data/abc.txt. A path on Windows might be C:\DATA\abc.txt.
Note: In the Windows operating system, files are not case-sensitive; however, file names are case sensitive in the UNIX environment. (For example, abc.txt and aBc.txt would be two different files in the same UNIX directory.)
To reduce the risk of typing errors, you can telnet to the Job Server (UNIX or Windows) computer and find the full path name of the file you want to use. Then, copy and paste the path name from the telnet application directly into the Root directory text box in the file format editor. You cannot use the Windows Explorer to determine the exact file location on Windows.
3. If the file type is delimited, set the appropriate column delimiter for the sample file. You can choose from the drop-down list or specify Unicode delimiters by directly typing the Unicode character code in the form of /XXXX, where XXXX is a decimal Unicode character code. For example, /44 is the Unicode character for the comma (,) character.
4. Under Input/Output, set Skip row header to Yes if you want to use the first row in the file to designate field names. The file format editor will show the column names in the Data Preview area and create the metadata structure automatically.
5. Edit the metadata structure as needed.
For both delimited and fixed-width files, you can edit the metadata structure in the Column Attributes work area:
a. Right-click to insert or delete fields.
b. Rename fields.
c. Set data types.
d. Enter field lengths for the Blob and VarChar data types.
e. Enter scale and precision information for Numeric and Decimal data types.
f. Enter Format field information for appropriate data types, if desired. This format information overrides the default format set in the Properties-Values area for that data type.
g. Enter the Content Type information. If you have added a column while creating a new format, the content type may auto-fill based on the field name. If an appropriate content type cannot be automatically filled, it defaults to blank.
Note: You do not need to specify columns for files used as targets.
For fixed-width files, you can also edit the metadata structure in the Data Preview area:
a. Click to select and highlight columns.
b. Right-click to insert or delete fields.
Note: The Data Preview pane cannot display blob data.
6. Click Save & Close to save the file format template and close the file format editor.
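A note on the /XXXX delimiter notation used in step 3: XXXX is simply the decimal Unicode code point of the delimiter character. The following Python lines are an illustration written for this guide, not part of the product, and show only the code-point mapping:

# /44 names code point 44, which is the comma character.
print(chr(44))    # prints: ,
print(ord(','))   # prints: 44
# A tab delimiter would likewise be written as /9.
print(ord('\t'))  # prints: 9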

Replicating and renaming file formats

After you create one file format schema, you can quickly create another file format object with the same schema by replicating the existing file format and renaming it. To save time in creating file format objects, replicate and rename instead of configuring from scratch.

To create a file format from an existing file format
1. In the Formats tab of the object library, right-click an existing file format and choose Replicate from the menu. The File Format Editor opens, displaying the schema of the copied file format.
2. Double-click to select the Name property value (which contains the same name as the original file format object).
3. Type a new, unique name for the replicated file format.
Note: You must enter a new name for the replicated file. The software does not allow you to save the replicated file with the same name as the original (or any other existing File Format object). Also, this is your only opportunity to modify the Name property value. Once saved, you cannot modify the name again.
4. Edit other properties as desired. Look for properties available when the file format editor is in source mode or target mode.
5. To save and view your new file format schema, click Save. To terminate the replication process (even after you have changed the name and clicked Save), click Cancel or press the Esc button on your keyboard.
6. Click Save & Close.

Related Topics
• Reference Guide: File format

To create a file format from an existing flat table schema
1. From the Query editor, right-click a schema and select Create File format. The File Format editor opens populated with the schema you selected.
2. Edit the new schema as appropriate and click Save & Close. The software saves the file format in the repository. You can access it from the Formats tab of the object library.

To create a specific source or target file
1. Select a flat file format template on the Formats tab of the local object library.
2. Drag the file format template to the data flow workspace.
3. Select Make Source to define a source file format, or select Make Target to define a target file format.
4. Click the name of the file format object in the workspace to open the file format editor.
5. Enter the properties specific to the source or target file. Look for properties available when the file format editor is in source mode or target mode. Under File name(s), be sure to specify the file name and location in the File and Location properties.
Note: You can use variables as file names.
6. Connect the file format object to other objects in the data flow as appropriate.

Related Topics
• Reference Guide: File format
• Setting file names at run-time using variables

Editing file formats

You can modify existing file format templates to match changes in the format or structure of a file. You cannot change the name of a file format template.

For example, if you have a date field in a source or target file that is formatted as mm/dd/yy and the data for this field changes to the format dd-mm-yy due to changes in the program that generates the source file, you can edit the corresponding file format template and change the date format information.

For specific source or target file formats, you can edit properties that uniquely define that source or target, such as the file name and location.

Caution: If the template is used in other jobs (usage is greater than 0), changes that you make to the template are also made in the files that use the template.

To edit a file format template
1. In the object library Formats tab, double-click an existing flat file format (or right-click and choose Edit). The file format editor opens with the existing format values.
2. Edit the values as needed. Look for properties available when the file format editor is in source mode or target mode.
Caution: If the template is used in other jobs (usage is greater than 0), changes that you make to the template are also made in the files that use the template.
3. Click Save.

Related Topics
• Reference Guide: File format

To edit a source or target file
1. From the workspace, click the name of a source or target file. The file format editor opens, displaying the properties for the selected source or target file.
2. Edit the desired properties. Look for properties available when the file format editor is in source mode or target mode. To change properties that are not available in source or target mode, you must edit the file's file format template. Any changes you make to values in a source or target file editor override those on the original file format.
3. Click Save.

Related Topics
• Reference Guide: File format

Change multiple column properties

Use these steps when you are creating a new file format or editing an existing one.
1. Select the "Format" tab in the Object Library.
2. Right-click on an existing file format listed under Flat Files and choose Edit. The "File Format Editor" opens.
3. In the column attributes area (upper right pane), select the multiple columns that you want to change.
• To choose a series of columns, select the first column, press the keyboard "Shift" key, and select the last column.
• To choose non-consecutive columns, hold down the keyboard "Control" key and select the columns.
4. Right-click and choose Properties. The "Multiple Columns Properties" window opens.
5. Change the Data Type and/or the Content Type and click OK. The Data Type and Content Type of the selected columns change based on your settings.

File format features

The software offers several capabilities for processing files.

Reading multiple files at one time

The software can read multiple files with the same format from a single directory using a single source object.

To specify multiple files to read
1. Open the editor for your source file format.
2. Under Data File(s) in the file format editor, set the Location of the source files to Local or Job Server.
3. Set the root directory in Root directory.
Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the root directory. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Under File name(s), enter one of the following:
• A list of file names separated by commas, or
• A file name containing a wild card character (* or ?). For example, 1999????.txt might read files from the year 1999, and *.txt reads all files with the txt extension from the specified Root directory.
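The wildcard characters follow the usual convention: ? matches exactly one character and * matches any run of characters. The following Python sketch, with hypothetical file names, only illustrates the matching rule and is not the product's implementation:

import fnmatch

names = ['19990101.txt', '19991231.txt', '20000101.txt', 'notes.txt']

# ? matches one character; * matches any number of characters.
print(fnmatch.filter(names, '1999????.txt'))  # ['19990101.txt', '19991231.txt']
print(fnmatch.filter(names, '*.txt'))         # all four names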

Identifying source file names

You might want to identify the source file for each row in your target in the following situations:
• You specified a wildcard character to read multiple source files at one time.
• You load from different source files on different runs.

To identify the source file for each row in the target
1. Under Source Information in the file format editor, set Include file name to Yes. This option generates a column named DI_FILENAME that contains the name of the source file.
2. In the Query editor, map the DI_FILENAME column from Schema In to Schema Out.
3. When you run the job, the DI_FILENAME column for each row in the target contains the source file name.

Number formats

The dot (.) and the comma (,) are the two most common symbols used as decimal and thousand separators for numeric data types. When formatting files in the software, data types in which these symbols can be used include Decimal, Numeric, Float, and Double. You can use either symbol for the thousands indicator and either symbol for the decimal separator. For example: 2,098.65 or 2.098,65.
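Before the individual Format values are described, the following Python sketch makes the two separator conventions concrete. It is illustrative only and is not how the software parses numbers:

def to_float(text, thousands, decimal):
    # Drop the thousand separators, then map the decimal
    # separator to the dot that float() expects.
    return float(text.replace(thousands, '').replace(decimal, '.'))

print(to_float('2,098.65', thousands=',', decimal='.'))  # 2098.65
print(to_float('2.098,65', thousands='.', decimal=','))  # 2098.65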

The Format property accepts the following values:

Format: {none}
The software expects that the number contains only the decimal separator. The reading of the number data and this decimal separator is determined by the Data Services Job Server Locale Region. A dot (.) is the decimal separator when the Data Services Locale is set to a country that uses dots (for example, USA, India, and UK). A comma (,) is the decimal separator when the Locale is set to a country that uses commas (for example, Germany or France). Leading and trailing decimal signs are also supported, for example +12,000.00 or 32.32-. In this format, the software will return an error if a number contains a thousand separator. When the software writes the data, it uses only the Job Server Locale decimal separator; it does not use thousand separators.

Format: #,##0.0
The software expects that the decimal separator of a number will be a dot (.) and the thousand separator will be a comma (,). When the software loads the data to a flat file, it uses a comma (,) as the thousand separator and a dot (.) as the decimal separator.

Format: #.##0,0
The software expects that the decimal separator of a number will be a comma (,) and the thousand separator will be a dot (.). When the software loads the data to a flat file, it uses a dot (.) as the thousand separator and a comma (,) as the decimal separator.

Ignoring rows with specified markers

The file format editor provides a way to ignore rows containing a specified marker (or markers) when reading files. For example, you might want to ignore comment line markers such as # and //.

Associated with this feature, two special characters, the semicolon (;) and the backslash (\), make it possible to define multiple markers in your ignore row marker string. Use the semicolon to delimit each marker, and use the backslash to indicate special characters as markers (such as the backslash and the semicolon).

The default marker value is an empty string. When you specify the default value, no rows are ignored.

To specify markers for rows to ignore
1. Open the file format editor from the Object Library or by opening a source object in the workspace.
2. Find Ignore row marker(s) under the Format Property.
3. Click in the associated text box and enter a string to indicate one or more markers representing rows that the software should skip during file read and/or metadata creation.

The following table provides some ignore row marker(s) examples. (Each value is delimited by a semicolon unless the semicolon is preceded by a backslash.)

Marker Value(s)        Row(s) Ignored
{none} (the default)   None; no rows are ignored
abc                    Any that begin with the string abc
abc;def;hi             Any that begin with abc or def or hi
abc;\;                 Any that begin with abc or ;
abc;\\;\;              Any that begin with abc or \ or ;
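The marker string is a tiny syntax of its own: semicolons separate markers, and a backslash makes the next character literal. The following Python sketch approximates that parsing and the resulting row filtering; it was written for this guide and is not the product's parser:

def parse_markers(spec):
    # Split on ';' unless escaped; '\' makes the next character literal.
    markers, current, i = [], '', 0
    while i < len(spec):
        if spec[i] == '\\' and i + 1 < len(spec):
            current += spec[i + 1]
            i += 2
        elif spec[i] == ';':
            markers.append(current)
            current = ''
            i += 1
        else:
            current += spec[i]
            i += 1
    if current:
        markers.append(current)
    return markers

markers = parse_markers(r'abc;\\;\;')   # ['abc', '\\', ';']
rows = ['abc comment', '\\ temp row', '; note', 'real data']
kept = [r for r in rows if not any(r.startswith(m) for m in markers)]
print(kept)                             # ['real data']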

Date formats at the field level

You can specify a date format at the field level to overwrite the default date, time, or date-time formats set in the Properties-Values area. For example, when the Data Type is set to Date, you can edit the value in the corresponding Format field to a different date format such as:
• yyyy.mm.dd
• mm/dd/yy
• dd.mm.yy

Parallel process threads

Data Services can use parallel threads to read and load files to maximize performance.

To specify parallel threads to process your file format:
1. Open the file format editor in one of the following ways:
• In the Formats tab in the Object Library, right-click a file format name and click Edit.
• In the workspace, double-click the source or target object.
2. Find Parallel process threads under the "General" Property.
3. Specify the number of threads to read or load this file format. For example, if you have four CPUs on your Job Server computer, enter the number 4 in the Parallel process threads box.

Related Topics
• Reference Guide: File format

Error handling for flat-file sources

During job execution, the software processes rows from flat-file sources one at a time. You can configure the File Format Editor to identify rows in flat-file sources that contain the following types of errors:
• Data-type conversion errors. For example, a field might be defined in the File Format Editor as having a data type of integer but the data encountered is actually varchar.
• Row-format errors. For example, in the case of a fixed-width file, the software identifies a row that does not match the expected width value.

These error-handling properties apply to flat-file sources only.

Error-handling options

In the File Format Editor, the Error Handling set of properties allows you to choose whether or not to have the software perform the following actions:
• check for either of the two types of flat-file source error
• write the invalid row(s) to a specified error file
• stop processing the source file after reaching a specified number of invalid rows
• log data-type conversion or row-format warnings to the error log; if so, you can limit the number of warnings to log without stopping the job

About the error file

If enabled, the error file will include both types of errors. The format is a semicolon-delimited text file. You can have multiple input source files for the error file. The file resides on the same computer as the Job Server.

Entries in an error file have the following syntax:

source file path and name; row number in source file; Data Services error; column number where the error occurred; all columns from the invalid row

The following entry illustrates a row-format error:

d:/acl_work/in_test.txt;2;-80104: 1-3-A column delimiter was seen after column number <3> for row number <2> in file <d:/acl_work/in_test.txt>. The total number of columns defined is <3>, so a row delimiter should be seen after column number <3>. Please check the file for bad data, or redefine the input schema for the file by editing the file format in the UI.;3;defg;234;def

where 3 indicates an error occurred after the third column, and defg;234;def are the three columns of data from the invalid row.

Note: If you set the file format's Parallel process thread option to any value greater than 0 or {none}, the row number in source file value will be -1.
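Because the entries are semicolon-delimited, they are straightforward to post-process, provided the error-message text itself contains no semicolons (true of the example above). A rough Python sketch, with the entry shortened for readability:

entry = ('d:/acl_work/in_test.txt;2;-80104: A column delimiter was seen '
         'after column number <3> for row number <2>.;3;defg;234;def')

src, row, message, col, *bad_columns = entry.split(';')
print(src)           # d:/acl_work/in_test.txt
print(row, col)      # 2 3
print(bad_columns)   # ['defg', '234', 'def']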

Configuring the File Format Editor for error handling

To capture data-type conversion or row-format errors
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit. The File Format Editor opens.
3. To capture data-type conversion errors, under the Error Handling properties, click Yes for Capture data conversion errors.
4. To capture errors in row formats, click Yes for Capture row format errors.
5. Click Save or Save & Close.

To write invalid rows to an error file
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit. The File Format Editor opens.
3. Under the Error Handling properties, click Yes for either or both of the Capture data conversion errors or Capture row format errors properties.
4. For Write error rows to file, click Yes. Two more fields appear: Error file root directory and Error file name.
5. Type an Error file root directory in which to store the error file. If you type a directory path here, then enter only the file name in the Error file name property.
6. Type an Error file name. If you leave Error file root directory blank, then type a full path and file name here.
7. Click Save or Save & Close.

For added flexibility when naming the error file, you can enter a variable that is set to a particular file with full path name. Use variables to specify file names that you cannot otherwise enter, such as those that contain multibyte characters.

To limit the number of invalid rows processed before stopping the job
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit. The File Format Editor opens.
3. Under the Error Handling properties, click Yes for either or both of the Capture data conversion errors or Capture row format errors properties.
4. For Maximum errors to stop job, type a number.
Note: This property was previously known as Bad rows limit.
5. Click Save or Save & Close.

To log data-type conversion warnings in the error log
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit. The File Format Editor opens.
3. Under the Error Handling properties, for Log data conversion warnings, click Yes.
4. Click Save or Save & Close.

To log row-format warnings in the error log
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit. The File Format Editor opens.
3. Under the Error Handling properties, for Log row format warnings, click Yes.
4. Click Save or Save & Close.

To limit the number of warning messages to log

If you choose to log either data-type or row-format warnings, you can limit the total number of warnings to log without interfering with job execution.
1. In the object library, click the Formats tab.
2. Expand Flat Files, right-click a format, and click Edit. The File Format Editor opens.
3. Under the Error Handling properties, for Maximum warnings to log, type a number.
4. Click Save or Save & Close.

Creating COBOL copybook file formats

When creating a COBOL copybook format, you can:
• create just the format, then configure the source after you add the format to a data flow, or
• create the format and associate it with a data file at the same time.

This section also describes how to:
• create rules to identify which records represent which schemas using a field ID option
• identify the field that contains the length of the schema's record using a record length field option

Related Topics
• Reference Guide: Import or Edit COBOL copybook format options
• Reference Guide: COBOL copybook source options
• Reference Guide: Data Types, Conversion to or from Data Services internal data types

To create a new COBOL copybook file format
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click New. The Import COBOL copybook window opens.
2. Name the format by typing a name in the Format name field.
3. On the Format tab for File name, specify the COBOL copybook file format to import, which usually has the extension .cpy. During design, you can specify a file in one of the following ways:
• For a file located on the computer where the Designer runs, you can use the Browse button.
• For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Click OK. The COBOL Copybook schema name(s) dialog box displays.
5. If desired, select or double-click a schema name to rename it.
6. Click OK. The software adds the COBOL copybook to the object library.

When you later add the format to a data flow, you can use the options in the source editor to define the source.

Related Topics
• Reference Guide: Data Services Objects, COBOL copybook source options

To create a new COBOL copybook file format and a data file
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click New. The Import COBOL copybook window opens.
2. Name the format by typing a name in the Format name field.
3. On the Format tab for File name, specify the COBOL copybook file format to import, which usually has the extension .cpy. During design, you can specify a file in one of the following ways:
• For a file located on the computer where the Designer runs, you can use the Browse button.
• For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
4. Click the Data File tab.
5. For Directory, type or browse to the directory that contains the COBOL copybook data file to import.

If you include a directory path here, then enter only the file name in the Name field.
6. Specify the COBOL copybook data file Name. If you leave Directory blank, then type a full path and file name here. During design, you can specify a file in one of the following ways:
• For a file located on the computer where the Designer runs, you can use the Browse button.
• For a file located on the computer where the Job Server runs, you must type the path to the file. You can type an absolute path or a relative path, but the Job Server must be able to access it.
7. If the data file is not on the same computer as the Job Server, click the Data Access tab. Select FTP or Custom and enter the criteria for accessing the data file.
8. Click OK.
9. The COBOL Copybook schema name(s) dialog box displays. If desired, select or double-click a schema name to rename it.
10. Click OK.

Related Topics
• Reference Guide: Import or Edit COBOL copybook format options

To create rules to identify which records represent which schemas
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click Edit. The Edit COBOL Copybook window opens.
2. In the top pane, select a field to represent the schema.
3. Click the Field ID tab. The Field ID tab allows you to create rules for identifying which records represent which schemas.
4. On the Field ID tab, select the check box Use field <schema name.field name> as ID.
5. Click Insert below to add an editable value to the Values list.
6. Type a value for the field.
7. Continue inserting values as necessary.
8. Select additional fields and insert values as necessary.
9. Click OK.

To identify the field that contains the length of the schema's record
1. In the local object library, click the Formats tab, right-click COBOL copybooks, and click Edit. The Edit COBOL Copybook window opens.
2. Click the Record Length Field tab.
3. For the schema to edit, click in its Record Length Field column to enable a drop-down menu.
4. Select the field (one per schema) that contains the record's length. The offset value automatically changes to the default of 4; however, you can change it to any other numeric value. The offset is the value that results in the total record length when added to the value in the Record length field. For example, if the Record length field contains 96 and the offset is 4, the total record length is 100.
5. Click OK.

Creating Microsoft Excel workbook file formats on UNIX platforms

This section describes how to use a Microsoft Excel workbook as a source with a Job Server on a UNIX platform. To create Microsoft Excel workbook file formats on Windows, refer to the Data Services Reference Guide.

To access the workbook, you must create and configure an adapter instance in the Administrator. The following procedure provides an overview of the configuration process. For details about creating adapters, refer to the Data Services Management Console: Administrator Guide.

Also consider the following requirements:
• To import the workbook, it must be available on a Windows file system. You can later change the location of the actual file to use for processing in the Excel workbook file format source editor. See the Reference Guide.
• To reimport or view data in the Designer, the file must be available on Windows.
• Entries in the error log file might be represented numerically for the date and time fields. Additionally, Data Services writes the records with errors to the output (in Windows, these records are ignored).

Related Topics
• Reference Guide: Data Services Objects, Excel workbook source options

To create a Microsoft Excel workbook file format on UNIX
1. Using the Server Manager ($LINK_DIR/bin/svrcfg), ensure the UNIX Job Server can support adapters. See the Installation Guide for UNIX.
2. Ensure a repository associated with the Job Server has been added to the Administrator. To add a repository to the Administrator, see the Management Console: Administrator Guide.
3. In the Administrator, add an adapter to access Excel workbooks. See the Management Console: Administrator Guide. You can only configure one Excel adapter per Job Server. Use the following options:
• On the Installed Adapters tab, select MSExcelAdapter.
• On the Adapter Configuration tab for the Adapter instance name, type BOExcelAdapter (required and case sensitive).
You may leave all other options at their default values except when processing files larger than 1 MB. In that case, change the Additional Java Launcher Options value to -Xms64m -Xmx512m or -Xms128m -Xmx1024m (the default is -Xms64m -Xmx256m). Note that Java memory management can prevent processing very large files (or many smaller files).
4. Start the adapter.
5. In the Designer on the "Formats" tab of the object library, create the file format by importing the Excel workbook. For details, see the Reference Guide.

Related Topics
• Installation Guide for UNIX: After Installing Data Services, Using the Server Manager
• Management Console Administrator Guide: Administrator Management, Adding repositories
• Management Console Administrator Guide: Adapters, Adding and configuring adapter instances
• Reference Guide: Data Services Objects, Excel workbook format

File transfers

The software can read and load files using a third-party file transfer program for flat files. You can use third-party (custom) transfer programs to:
• Incorporate company-standard file-transfer applications as part of the software job execution
• Provide high flexibility and security for files transferred across a firewall

The custom transfer program option allows you to specify:
• A custom transfer program (invoked during job execution)
• Additional arguments, based on what is available in your program, such as:
• Connection data
• Encryption/decryption mechanisms
• Compression mechanisms

Custom transfer system variables for flat files

When you set Custom Transfer program to YES in the Property column of the file format editor, the following options are added to the column. To view them, scroll the window down.

When you set custom transfer options for external file sources and targets, some transfer information, like the name of the remote server that the file is being transferred to or from, may need to be entered literally as a transfer program argument. You can enter other information using the following system variables:

Data entered for       Is substituted for this variable if it is defined in the Arguments field
User name              $AW_USER
Password               $AW_PASSWORD
Local directory        $AW_LOCAL_DIR
File(s)                $AW_FILE_NAME

By using these variables as custom transfer program arguments, you can collect connection information entered in the software and use that data at run-time with your custom transfer program.

For example, the following custom transfer options use a Windows command file (Myftp.cmd) with five arguments. Arguments 1 through 4 are system variables:
• User and Password variables are for the external server
• The Local Directory variable is for the location where the transferred files will be stored in the software
• The File Name variable is for the names of the files to be transferred

Argument 5 provides the literal external server name.

Note: If you do not specify a standard output file (such as ftp.out in the example below), the software writes the standard output into the job's trace log.

The content of the Myftp.cmd script is as follows:

@echo off

rem The five arguments passed by the software map to these variables.
set USER=%1
set PASSWORD=%2
set LOCAL_DIR=%3
set FILE_NAME=%4
set LITERAL_HOST_NAME=%5

rem Build a temporary ftp command script, then run it against the host.
set INP_FILE=ftp.inp

echo %USER%>%INP_FILE%
echo %PASSWORD%>>%INP_FILE%
echo lcd %LOCAL_DIR%>>%INP_FILE%
echo get %FILE_NAME%>>%INP_FILE%
echo bye>>%INP_FILE%

ftp -s:%INP_FILE% %LITERAL_HOST_NAME%>ftp.out

Custom transfer options for flat files

Of the custom transfer program options, only the Program executable option is mandatory.

Entering User Name, Password, and Arguments values is optional. These options are provided for you to specify arguments that your custom transfer program can process (such as connection data). You can also use Arguments to enable or disable your program's built-in features such as encryption/decryption and compression mechanisms. For example, you might design your transfer program so that when you enter -sSecureTransportOn or -CCompressionYES, security or compression is enabled.

Note: Available arguments depend on what is included in your custom transfer program. See your custom transfer program documentation for a valid argument list.

You can use the Arguments box to enter a user name and password. However, the software also provides separate User name and Password boxes. By entering the $AW_USER and $AW_PASSWORD variables as Arguments and then using the User and Password boxes to enter literal strings, these extra boxes are useful in two ways:
• You can more easily update users and passwords in the software both when you configure the software to use a transfer program and when you later export the job. For example, when you migrate the job to another environment, you might want to change login information without scrolling through other arguments.
• You can use the mask and encryption properties of the Password box. Data entered in the Password box is masked in log files and on the screen, stored in the repository, and encrypted by Data Services.

Note: The software sends password data to the custom transfer program in clear text. If you do not allow clear passwords to be exposed as arguments in command-line executables, then set up your custom program to either:
• Pick up its password from a trusted location
• Inherit security privileges from the calling program (in this case, the software)
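The variable substitution itself is plain text replacement performed before the transfer program is invoked. A hypothetical Python sketch of that step follows; the values and the argument template are invented for illustration, and only the $AW_* names come from the table above:

values = {
    '$AW_USER': 'ftpuser',           # from the User name box
    '$AW_PASSWORD': 'secret',        # from the Password box
    '$AW_LOCAL_DIR': 'C:/data/in',   # from the local directory setting
    '$AW_FILE_NAME': 'orders.txt',   # from the File(s) setting
}

arguments = '$AW_USER $AW_PASSWORD $AW_LOCAL_DIR $AW_FILE_NAME my.remote.host'
for variable, value in values.items():
    arguments = arguments.replace(variable, value)
print(arguments)  # ftpuser secret C:/data/in orders.txt my.remote.host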

Setting custom transfer options

The custom transfer option allows you to use a third-party program to transfer flat file sources and targets. You can configure your custom transfer program in the File Format Editor window. Like other file format settings, you can override custom transfer program settings if they are changed for a source or target in a particular data flow. You can also edit the custom transfer option when exporting a file format.

To configure a custom transfer program in the file format editor
1. Select the Formats tab in the object library.
2. Right-click Flat Files in the tab and select New. The File Format Editor opens.
3. Select either the Delimited or the Fixed width file type.
Note: While the custom transfer program option is not supported by SAP application file types, you can use it as a data transport method for an SAP ABAP data flow.
4. Enter a format name.
5. Select Yes for the Custom transfer program option.
6. Expand "Custom Transfer" and enter the custom transfer program name and arguments.
7. Complete the other boxes in the file format editor window. In the Data File(s) section, specify the location of the file in the software. To specify system variables for Root directory and File(s) in the Arguments box:
• Associate the system variable $AW_LOCAL_DIR with the local directory argument of your custom transfer program.
• Associate the system variable $AW_FILE_NAME with the file name argument of your custom transfer program.
For example, enter: -l$AW_LOCAL_DIR\$AW_FILE_NAME
When the program runs, the Root directory and File(s) settings are substituted for these variables and read by the custom transfer program.
Note: The flag -l used in the example above is a custom program flag. Arguments you can use as custom program arguments in the software depend upon what your custom transfer program expects.
8. Click Save.

Related Topics
• Supplement for SAP: Custom Transfer method
• Reference Guide: File format

Design tips

Keep the following concepts in mind when using the custom transfer options:
• Variables are not supported in file names when invoking a custom transfer program for the file.
• You can only edit custom transfer options in the File Format Editor (or Datastore Editor in the case of SAP application) window before they are exported. You cannot edit updates to file sources and targets at the data flow level when exported. After they are imported, you can adjust custom transfer option settings at the data flow level. They override file format level settings.

When designing a custom transfer program to work with the software, keep in mind that:
• The software expects the called transfer program to return 0 on success and non-zero on failure.

• The software provides trace information before and after the custom transfer program executes. The full transfer program and its arguments with masked password (if any) is written in the trace log. When "Completed Custom transfer" appears in the trace log, the custom transfer program has ended.
• If the custom transfer program finishes successfully (the return code = 0), the software checks the following:
• For an ABAP data flow, if the transport file does not exist in the local directory, it throws an error and the software stops.
• For a file source, if the file or files to be read by the software do not exist in the local directory, the software writes a warning message into the trace log.
• If the custom transfer program throws an error or its execution fails (return code is not 0), then the software produces an error with return code and stdout/stderr output.
• If the custom transfer program succeeds but produces standard output, the software issues a warning, logs the first 1,000 bytes of the output produced, and continues processing.
• The custom transfer program designer must provide valid option arguments to ensure that files are transferred to and from the local directory (specified in the software). This might require that the remote file and directory name be specified as arguments and then sent to the Designer interface using system variables.

Related Topics
• Supplement for SAP: Custom Transfer method

Web log support

Web logs are flat files generated by Web servers and are used for business intelligence. Web logs typically track details of Web site hits such as:
• Client domain names or IP addresses
• User names
• Timestamps
• Requested action (might include search string)
• Bytes transferred
• Referred address
• Cookie ID

Web logs use a common file format and an extended common file format.

151.99.190.27 - - [01/Jan/1997:13:06:51 -0600] "GET /~bacuslab HTTP/1.0" 301 -4

Figure 6-1: Common Web Log Format

saturn5.cun.com - - [25/JUN/1998:11:19:58 -0500] "GET /wew/js/mouseover.html HTTP/1.0" 200 1936 "http://av.yahoo.com/bin/query?p=mouse+over+javascript+source+code&hc=0" "Mozilla/4.02 [en] (x11, U, SunOS 5.6 sun4m)"

Figure 6-2: Extended Common Web Log Format

The software supports both common and extended common Web log formats as sources. The file format editor also supports the following:
• Dash as NULL indicator
• Time zone in date-time, e.g., 01/Jan/1997:13:06:51 -0600

The software includes several functions for processing Web log data:
• Word_ext function
• Concat_date_time function
• WL_GetKeyValue function

Related Topics
• Word_ext function
• Concat_date_time function
• WL_GetKeyValue function
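For readers who want to experiment with these formats outside the product, a common-format line splits apart with one regular expression. This Python sketch only illustrates the file layout; it is not the software's reader:

import re

COMMON_LOG = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>-?\d+)'
)

line = '151.99.190.27 - - [01/Jan/1997:13:06:51 -0600] "GET /~bacuslab HTTP/1.0" 301 -4'
m = COMMON_LOG.match(line)
print(m.group('host'))       # 151.99.190.27
print(m.group('timestamp'))  # 01/Jan/1997:13:06:51 -0600
print(m.group('status'))     # 301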

Word_ext function

The word_ext is a string function that extends the word function by returning the word identified by its position in a delimited string. This function is useful for parsing URLs or file names.

Format

word_ext(string, word_number, separator(s))

A negative word number means count from right to left.

Examples

word_ext('www.bodi.com', 2, '.') returns 'bodi'.
word_ext('www.cs.wisc.edu', -2, '.') returns 'wisc'.
word_ext('www.cs.wisc.edu', 5, '.') returns NULL.
word_ext('aaa+=bbb+=ccc+zz=dd', 4, '+=') returns 'zz'. If 2 separators are specified (+=), the function looks for either one.
word_ext(',aaa,,bb,,c ', 2, ',') returns 'bb'. This function skips consecutive delimiters.

Concat_date_time function

The concat_date_time is a date function that returns a datetime from separate date and time inputs.

Format

concat_date_time(date, time)

Example

concat_date_time(MS40."date", MS40."time")
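To pin down the word_ext semantics (consecutive delimiters are skipped, a negative position counts from the right, and any one of several separator characters ends a word), here is a rough Python re-implementation written for this guide; it is not the product's code:

import re

def word_ext(text, word_number, separators):
    # Any single separator character ends a word; empty words from
    # consecutive delimiters are skipped, as in the examples above.
    words = [w for w in re.split('[' + re.escape(separators) + ']', text) if w]
    index = word_number - 1 if word_number > 0 else word_number
    try:
        return words[index]
    except IndexError:
        return None  # stands in for NULL

print(word_ext('www.bodi.com', 2, '.'))          # bodi
print(word_ext('www.cs.wisc.edu', -2, '.'))      # wisc
print(word_ext('aaa+=bbb+=ccc+zz=dd', 4, '+='))  # zz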

WL_GetKeyValue function

The WL_GetKeyValue is a custom function (written in the Scripting Language) that returns the value of a given keyword. It is useful for parsing search strings.

Format

WL_GetKeyValue(string, keyword)

Example

A search in Google for bodi B2B is recorded in a Web log as:

GET "http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search"

WL_GetKeyValue('http://www.google.com/search?hl=en&lr=&safe=off&q=bodi+B2B&btnG=Google+Search', 'q') returns 'bodi+B2B'.

Sample Web log formats

This is a file with a common Web log file format:

151.99.190.27 - - [01/Jan/1997:13:06:51 -0600] "GET /~bacuslab HTTP/1.0" 301 -4
151.99.190.27 - - [01/Jan/1997:13:06:52 -0600] "GET /~bacuslab HTTP/1.0" 200 3218
151.99.190.27 - - [01/Jan/1997:13:06:54 -0600] "GET /~bacuslab/BLI_Logo.jpg HTTP/1.0" 200 8210
151.99.190.27 - - [01/Jan/1997:13:06:54 -0600] "GET /~bacuslab/BulletA.gif HTTP/1.0" 200 1779
151.99.190.27 - - [01/Jan/1997:13:06:54 -0600] "GET /~bacuslab/Email4.jpg HTTP/1.0" 200 1151
151.99.190.27 - - [01/Jan/1997:13:06:51 -0600] "GET /~bacuslab/HomeCount.xbm HTTP/1.0" 200 890

This is the file format editor view of this Web log. [Figure not reproduced.]

This is a representation of a sample data flow for this Web log. [Figure not reproduced.]
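The keyword lookup performed by WL_GetKeyValue is also easy to reproduce outside the product for testing. A hedged Python sketch follows; it is not the shipped custom function, and note that the standard URL parser decodes '+' to a space:

from urllib.parse import urlsplit, parse_qs

def wl_get_key_value(url, keyword):
    # Return the value recorded for one keyword in the query string.
    values = parse_qs(urlsplit(url).query).get(keyword)
    return values[0] if values else None

url = ('http://www.google.com/search?hl=en&lr=&safe=off'
       '&q=bodi+B2B&btnG=Google+Search')
print(wl_get_key_value(url, 'q'))  # bodi B2B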

Related Topics
• Data Flows


Chapter 7
Data Flows

This section describes the fundamentals of data flows, including data flow objects, using lookups, data flow execution, and auditing.

What is a data flow?

Data flows extract, transform, and load data. Everything having to do with data, including reading sources, transforming data, and loading targets, occurs inside a data flow. The lines connecting objects in a data flow represent the flow of data through data transformation steps.

After you define a data flow, you can add it to a job or work flow. From inside a work flow, a data flow can send and receive information to and from other objects through input and output parameters.

Naming data flows

Data flow names can include alphanumeric characters and underscores (_). They cannot contain blank spaces.

Data flow example

Suppose you want to populate the fact table in your data warehouse with new data from two tables in your source transaction database. Your data flow consists of the following:
• Two source tables
• A join between these tables, defined in a query transform
• A target table where the new rows are placed

You indicate the flow of data through these components by connecting them in the order that data moves through them. The resulting data flow looks like the following. [Figure: the two source tables connected to a query transform, which is connected to the target table.]

Steps in a data flow

Each icon you place in the data flow diagram becomes a step in the data flow. You can use the following objects as steps in a data flow:
• source
• target
• transforms

The connections you make between the icons determine the order in which the software completes the steps.

Related Topics
• Source and target objects
• Transforms

Data flows as steps in work flows

Data flows are closed operations, even when they are steps in a work flow. Data sets created within a data flow are not available to other steps in the work flow. A work flow does not operate on data sets and cannot provide more data to a data flow; however, a work flow can do the following:
• Call data flows to perform data movement operations
• Define the conditions appropriate to run data flows
• Pass parameters to and from data flows

Intermediate data sets in a data flow

Each step in a data flow, up to the target definition, produces an intermediate result (for example, the results of a SQL statement containing a WHERE clause), which flows to the next step in the data flow. The intermediate result consists of a set of rows from the previous operation and the schema in which the rows are arranged. This result is called a data set. This data set may, in turn, be further "filtered" and directed into yet another data set.

Operation codes

Each row in a data set is flagged with an operation code that identifies the status of the row. The operation codes are as follows:

NORMAL
Creates a new row in the target. All rows in a data set are flagged as NORMAL when they are extracted from a source. If a row is flagged as NORMAL when loaded into a target, it is inserted as a new row in the target.

INSERT
Creates a new row in the target. Rows can be flagged as INSERT by transforms in the data flow to indicate that a change occurred in a data set as compared with an earlier image of the same data set. The change is recorded in the target separately from the existing data.

DELETE
Is ignored by the target. Rows flagged as DELETE are not loaded. Rows can be flagged as DELETE only by the Map_Operation transform.

UPDATE
Overwrites an existing row in the target. Rows can be flagged as UPDATE by transforms in the data flow to indicate that a change occurred in a data set as compared with an earlier image of the same data set. The change is recorded in the target in the same row as the existing data.

Passing parameters to data flows

Data does not flow outside a data flow, not even when you add a data flow to a work flow. You can, however, pass parameters into and out of a data flow. Parameters evaluate single values rather than sets of values. When a data flow receives parameters, the steps inside the data flow can reference those parameters as variables.

Parameters make data flow definitions more flexible. For example, a parameter can indicate the last time a fact table was updated. You can use this value in a data flow to extract only rows modified since the last update. The following figure shows the parameter last_update used in a query to determine the data set used to load the fact table. [Figure not reproduced.]

Related Topics
• Variables and Parameters

Creating and defining data flows

You can create data flows using objects from:
• the object library
• the tool palette

After creating a data flow, you can change its properties.

Related Topics
• To change properties of a data flow

To define a new data flow using the object library
1. In the object library, go to the Data Flows tab.
2. Select the data flow category, right-click and select New.
3. Select the new data flow.
4. Drag the data flow into the workspace for a job or a work flow.
5. Add the sources, transforms, and targets you need.

To define a new data flow using the tool palette
1. Select the data flow icon in the tool palette.
2. Click the workspace for a job or work flow to place the data flow. You can add data flows to batch and real-time jobs. When you drag a data flow icon into a job, you are telling the software to validate these objects according to the requirements of the job type (either batch or real-time).
3. Add the sources, transforms, and targets you need.

To change properties of a data flow
1. Right-click the data flow and select Properties. The Properties window opens for the data flow.
2. Change desired properties of a data flow.
3. Click OK.

This table describes the various properties you can set for the data flow:

Execute only once
When you specify that a data flow should only execute once, a batch job will never re-execute that data flow after the data flow completes successfully, except if the data flow is contained in a work flow that is a recovery unit that re-executes and has not completed successfully elsewhere outside the recovery unit. It is recommended that you do not mark a data flow as Execute only once if a parent work flow is a recovery unit.

Use database links
Database links are communication paths between one database server and another. Database links allow local users to access data on a remote database, which can be on the local or a remote computer of the same or different database type.

Degree of parallelism
Degree Of Parallelism (DOP) is a property of a data flow that defines how many times each transform within a data flow replicates to process a parallel subset of data.

Cache type
You can cache data to improve performance of operations such as joins, groups, sorts, filtering, lookups, and table comparisons. You can select one of the following values for the Cache type option on your data flow Properties window:
• In-Memory: Choose this value if your data flow processes a small amount of data that can fit in the available memory.
• Pageable: This value is the default.

Related Topics
• Performance Optimization Guide: Maximizing Push-Down Operations, Database link support for push-down operations across datastores
• Performance Optimization Guide: Using parallel Execution, Degree of parallelism
• Performance Optimization Guide: Using Caches
• Reference Guide: Data Services Objects, Data flow

Source and target objects

A data flow directly reads and loads data using two types of objects:
• Source objects: define sources from which you read data
• Target objects: define targets to which you write (or load) data

Related Topics
• Source objects
• Target objects

Source objects

Source objects represent data sources read from data flows.

• Table: A file formatted with columns and rows as used in relational databases. Software access: direct or through adapter.
• Template table: A template table that has been created and saved in another data flow (used in development). Software access: direct.
• File: A delimited or fixed-width flat file. Software access: direct.
• Document: A file with an application-specific format (not readable by SQL or XML parser). Software access: through adapter.
• XML file: A file formatted with XML tags. Software access: direct.
• XML message: Used as a source in real-time jobs. Software access: direct.

You can also use IDoc messages as real-time sources for SAP applications.

Related Topics
• Template tables
• Real-time source and target objects
• Supplement for SAP: IDoc sources in real-time jobs

Target objects

Target objects represent data targets that can be written to in data flows.

• Table: A file formatted with columns and rows as used in relational databases. Software access: direct or through adapter.
• Template table: A table whose format is based on the output of the preceding transform (used in development). Software access: direct.
• File: A delimited or fixed-width flat file. Software access: direct.
• Document: A file with an application-specific format (not readable by SQL or XML parser). Software access: through adapter.
• XML file: A file formatted with XML tags. Software access: direct.
• XML template file: An XML file whose format is based on the preceding transform output (used in development, primarily for debugging data flows). Software access: direct.
• XML message: See Real-time source and target objects.
• Outbound message: See Real-time source and target objects.

You can also use IDoc messages as real-time targets for SAP applications.

Related Topics
• Supplement for SAP: IDoc targets in real-time jobs

Adding source or target objects to data flows

Fulfill the following prerequisites before using a source or target object in a data flow:

• Tables accessed directly from a database: define a database datastore and import table metadata.
• Template tables: define a database datastore.
• Files: define a file format and import the file.
• XML files and messages: import an XML file format.
• Objects accessed through an adapter: define an adapter datastore and import object metadata.

Related Topics
• Database datastores
• Template tables
• File formats
• To import a DTD or XML Schema format
• Adapter datastores

To add a source or target object to a data flow
1. Open the data flow in which you want to place the object.
2. If the object library is not already open, select Tools > Object Library to open it.
3. Select the appropriate object library tab: Choose the Formats tab for flat files, DTDs, or XML Schemas, or choose the Datastores tab for database and adapter objects.
4. Select the object you want to add as a source or target. (Expand collapsed lists by clicking the plus sign next to a container icon.)

6. With template tables. Note: Ensure that any files that reference flat file. If you modify and save the data 204 SAP BusinessObjects Data Services Designer Guide . 5. For objects that can be either sources or targets. Though a template table can be used as a source table in multiple data flows. Names can include alphanumeric characters and underscores (_).7 Data Flows Source and target objects For a new template table. Template tables are particularly useful in early application development when you are designing and testing a project. when you release the cursor. Instead. Click the object name in the workspace The software opens the editor for the object. or XML Schema formats are accessible from the Job Server where the job will be run and specify the file location relative to this computer. Template tables During the initial design of an application. The source or target object appears in the workspace. a popup menu appears. 7. 8. select the Template Table icon from the tool palette. For new template tables and XML template files. Set the options you require for the object. select the Template XML icon from the tool palette. you might find it convenient to use template tables to represent database tables. Select the kind of object to make. the software automatically creates the table in the database with the schema defined by the data flow when you execute a job. DTD. a secondary window appears. Enter the requested information for the new template object. you do not have to initially create a new table in your DBMS and import the metadata into the software. Template tables cannot have the same name as an existing table within a datastore. For a new XML template file. you can use it as a source in other data flows. it can only be used as a target in one data flow. After creating a template table as a target in one data flow. when you release the cursor. Drop the object in the workspace.

Click OK. 4. 5. Click inside a data flow to place the template table in the workspace. Click the template table icon and drag it to the workspace. From the object library: From the object library: 2. software uses the template table to create a new table in the database you specified when you created the template table. map the Schema In columns that you want to include in the target table. The table appears in the workspace as a template table icon. Click the template table icon and drag it to the workspace. the schema of the template table automatically changes. From the Project menu select Save. Expand a datastore. 6. Use one of the following methods to open the Create Template window: • From the tool palette: • • • • • • • • • Click the template table icon. Once a SAP BusinessObjects Data Services Designer Guide 205 . On the Create Template window. During the validation process. save it. Expand a datastore. select a datastore. 3. the software warns you of any errors such as those resulting from changing the schema. Any updates to the schema are automatically made to any other instances of the template table. After you are satisfied with the design of your data flow. In the workspace. To create a target template table 1. When the job is executed. enter a table name. In the Query transform. the template table's icon changes to a target table icon and the table appears in the object library under the datastore's list of tables. On the Create Template window.Data Flows Source and target objects 7 transformation operation in the data flow where the template table is a target. Connect the template table to the data flow as a target (usually a Query transform).

Once a template table is created in the database, you can convert the template table in the repository to a regular table.

Converting template tables to regular tables
You must convert template tables to regular tables to take advantage of some features such as bulk loading. Other features, such as exporting an object, are available for template tables.
Note: Once a template table is converted, you can no longer alter the schema.

To convert a template table into a regular table from the object library
1. Open the object library and go to the Datastores tab.
2. Click the plus sign (+) next to the datastore that contains the template table you want to convert.
A list of objects appears.
3. Click the plus sign (+) next to Template Tables.
The list of template tables appears.
4. Right-click a template table you want to convert and select Import Table.
The software converts the template table in the repository into a regular table by importing it from the database.
To update the icon in all data flows, choose View > Refresh. In the datastore object library, the table is now listed under Tables rather than Template Tables.

To convert a template table into a regular table from a data flow
1. Open the data flow containing the template table.
2. Right-click on the template table you want to convert and select Import Table.
After a template table is converted into a regular table, you can no longer change the table's schema.

Adding columns within a data flow
Within a data flow, the Propagate Column From command adds an existing column from an upstream source or transform through intermediate objects to the selected endpoint. Columns are added in each object with no change to the data type or other attributes. When there is more than one possible path between the starting point and ending point, you can specify the route for the added columns.

Column propagation is a pull-through operation. The Propagate Column From command is issued from the object where the column is needed. The column is pulled from the selected upstream source or transform and added to each of the intermediate objects as well as the selected endpoint object.

For example, in the data flow below, the Employee source table contains employee name information as well as employee ID, job information, and hire dates. The Name_Cleanse transform is used to standardize the employee names. Lastly, the data is output to an XML file called Employee_Names.

After viewing the output in the Employee_Names table, you realize that the middle initial (minit column) should be included in the output. You right-click the top-level schema of the Employee_Names table and select Propagate Column From. The "Propagate Column to Employee_Names" window appears.

In the left pane of the "Propagate Column to Employee_Names" window, select the Employee source table from the list of objects. The list of output columns displayed in the right pane changes to display the columns in the schema of the selected object. Select the MINIT column as the column you want to pull through from the source, and then click Propagate.

The minit column schema is carried through the Query and Name_Cleanse transforms to the Employee_Names table. Columns are added in each object with no change to the data type or other attributes. Once a column is added to the schema of an object, the column functions in exactly the same way as if it had been created manually.

Characteristics of propagated columns are as follows:
• The Propagate Column From command can be issued from the top-level schema of either a transform or a target.
• Only columns included in top-level schemas can be propagated. Columns in nested schemas cannot be propagated.
• Columns are added in each object with no change to the data type or other attributes.
• The propagated column is added at the end of the schema list in each object.
• The output column name is auto-generated to avoid naming conflicts with existing columns. You can edit the column name, if desired.
• Multiple columns can be selected and propagated in the same operation.
• A column can be propagated more than once. Any existing columns are shown in the right pane of the "Propagate Column to" window in the "Already Exists In" field. Each additional column will have a unique name.

Note: You cannot propagate a column through a Hierarchy_Flattening transform or a Table_Comparison transform.

To add columns within a data flow
Within a data flow, the Propagate Column From command adds an existing column from an upstream source or transform through intermediate objects to a selected endpoint. Columns are added in each object with no change to the data type or other attributes. The Propagate Column From command can be issued from the top-level schema in a transform or target object.

To add columns within a data flow:
1. In the downstream object where you want to add the column (the endpoint), right-click the top-level schema and click Propagate Column From.
2. In the left pane of the "Propagate Column to" window, select the upstream object that contains the column you want to map.
The available columns in that object are displayed in the right pane along with a list of any existing mappings from that column.
3. In the right pane, select the column you wish to add and click either Propagate or Propagate and Close.
One of the following occurs:
• If there is a single possible route, the selected column is added through the intermediate transforms to the downstream object.
• If there is more than one possible path through intermediate objects, the "Choose Route to" dialog displays. This may occur when your data flow contains a Query transform with multiple input objects. Select the path you prefer and click OK.

Propagating columns in a data flow containing a Merge transform
In valid data flows that contain two or more sources which are merged using a Merge transform, the schema of the inputs into the Merge transform must be identical. All sources must have the same schema, including:
• the same number of columns
• the same column names
• like columns must have the same data type

In order to maintain a valid data flow when propagating a column through a Merge transform, you must make sure to meet this restriction.

When you propagate a column and a Merge transform falls between the starting point and ending point, a message warns you that after the propagate operation completes the data flow will be invalid, because the input schemas in the Merge transform will not be identical. If you choose to continue with the column propagation operation, you must later add columns to the input schemas in the Merge transform so that the data flow is valid.

For example, in the data flow shown below, the data from each source table is filtered and then the results are merged in the Merge transform. If you choose to propagate a column from the SALES(Pubs.DBO) source to the CountrySales target, the column would be added to the Table Filter schema but not to the FileFilter schema, resulting in differing input schemas in the Merge transform and an invalid data flow.

In order to maintain a valid data flow when propagating a column through a Merge transform, you may want to follow a multi-step process:
1. Ensure that the column you want to propagate is available in the schemas of all the objects that lead into the Merge transform on the upstream side. This ensures that all inputs to the Merge transform are identical and the data flow is valid.
2. Propagate the column on the downstream side of the Merge transform to the desired endpoint.

Lookup tables and the lookup_ext function
Lookup tables contain data that other tables reference. Typically, lookup tables can have the following kinds of columns:
• Lookup column—Use to match a row(s) based on the input values. You apply operators such as =, >, <, ~ to identify a match in a row. A lookup table can contain more than one lookup column.
• Output column—The column returned from the row that matches the lookup condition defined for the lookup column. A lookup table can contain more than one output column.
• Return policy column—Use to specify the data to return in the case where multiple rows match the lookup condition(s).

Use the lookup_ext function to retrieve data from a lookup table based on user-defined lookup conditions that match input data to the lookup table data. Not only can the lookup_ext function retrieve a value in a table or file based on the values in a different source table or file, but it also provides extended functionality that lets you do the following:
• Return multiple columns from a single lookup
• Choose from more operators, including pattern matching, to specify a lookup condition
• Specify a return policy for your lookup
• Call lookup_ext in scripts and custom functions (which also lets you reuse the lookup(s) packaged inside scripts)
• Define custom SQL using the SQL_override parameter to populate the lookup cache, which is useful for narrowing large quantities of data to only the sections relevant for your lookup(s)
• Call lookup_ext using the function wizard in the query output mapping to return multiple columns in a Query transform
• Choose a caching strategy, for example decide to cache the whole lookup table in memory or dynamically generate SQL for each input record
• Use lookup_ext with memory datastore tables or persistent cache tables
• Use pageable cache (which is not available for the lookup and lookup_seq functions)
• Use expressions in lookup tables and return the resulting values

The benefits of using persistent cache over memory tables for lookup tables are:
• Multiple data flows can use the same lookup table that exists on persistent cache.
• The software does not need to construct the lookup table each time a data flow uses it.
• Persistent cache has no memory constraints because it is stored on disk and the software quickly pages it into memory.

For a description of the related functions lookup and lookup_seq, see the Reference Guide.

Related Topics
• Reference Guide: Functions and Procedures, lookup_ext
• Performance Optimization Guide: Using Caches, Caching data

Accessing the lookup_ext editor
Lookup_ext has its own graphic editor. You can invoke the editor in two ways:
• Add a new function call inside a Query transform—Use this option if you want the lookup table to return more than one column.
• From the Mapping tab in a query or script function.

To add a new function call
1. In the Query transform "Schema out" pane, without selecting a specific output column, right-click in the pane and select New Function Call.
2. Select the "Function category" Lookup Functions and the "Function name" lookup_ext.
3. Click Next to invoke the editor.
In the Output section, you can add multiple columns to the output schema.

An advantage of using the new function call is that after you close the lookup_ext function window, you can reopen the graphical editor to make modifications (right-click the function name in the schema and select Modify Function Call).

To invoke the lookup_ext editor from the Mapping tab
1. Select the output column name.
2. On the "Mapping" tab, click Functions.
3. Select the "Function category" Lookup Functions and the "Function name" lookup_ext.
4. Click Next to invoke the editor.
You can define one output column that will populate the selected column in the output schema. When lookup_ext returns more than one output column, use variables to store the output values. In the Output section, "Variable" replaces "Output column name". With functions used in mappings, the graphical editor isn't available, but you can edit the text on the "Mapping" tab manually, or use lookup_ext as a new function call as previously described in this section.

Example: Defining a simple lookup_ext function
This procedure describes the process for defining a simple lookup_ext function using a new function call. The associated example illustrates how to use a lookup table to retrieve department names for employees.
For details on all the available options for the lookup_ext function, see the Reference Guide.
1. In a data flow, open the Query editor.
2. From the "Schema in" pane, drag the ID column to the "Schema out" pane.
3. Select the ID column in the "Schema out" pane, right-click, and click New Function Call. Click Insert Below.
4. Select the "Function category" Lookup Functions and the "Function name" lookup_ext and click Next.
The lookup_ext editor opens.
5. In the "Lookup_ext - Select Parameters" window, select a lookup table:
a. Next to the Lookup table text box, click the drop-down arrow and double-click the datastore, file format, or current schema that includes the table.
b. Select the lookup table and click OK.

In the example, the lookup table is a file format called ID_lookup.txt that is in D:\Data.
6. For the Cache spec, the default of PRE_LOAD_CACHE is useful when the number of rows in the table is small or you expect to access a high percentage of the table values.
NO_CACHE reads values from the lookup table for every row without caching values. Select DEMAND_LOAD_CACHE when the number of rows in the table is large and you expect to frequently access a low percentage of table values, or when you use the table in multiple lookups and the compare conditions are highly selective, resulting in a small subset of data.
7. To provide more resources to execute the lookup_ext function, select Run as a separate process. This option creates a separate child data flow process for the lookup_ext function when the software executes the data flow.
8. Define one or more conditions. For each, add a lookup table column name (select from the drop-down list or drag from the "Parameter" pane), select the appropriate operator, and enter an expression by typing, dragging, pasting, or using the Smart Editor (click the icon in the right column). In the example, the condition is ID_DEPT = Employees.ID_DEPT.
9. Define the output. For each output column:
a. Add a lookup table column name.
b. Optionally change the default value from NULL.
c. Specify the "Output column name" by typing, dragging, pasting, or using the Smart Editor (click the icon in the right column). In the example, the output column is ID_DEPT_NAME.
10. If multiple matches are possible, specify the ordering and set a return policy (default is MAX) to select one match. To order the output, enter the column name(s) in the "Order by" list.

Example: The following example illustrates how to use the lookup table ID_lookup.txt to retrieve department names for employees.
The Employees table is as follows:

ID               NAME        ID_DEPT
SSN111111111     Employee1   10
SSN222222222     Employee2   10
TAXID333333333   Employee3   20

The lookup table ID_lookup.txt is as follows:

ID_DEPT   ID_PATTERN    ID_RETURN                    ID_DEPT_NAME
10        ms(SSN*)      =substr(ID_Pattern,4,20)     Payroll
20        ms(TAXID*)    =substr(ID_Pattern,6,30)     Accounting

The lookup_ext editor would be configured as follows.
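In text form, the function call that this editor configuration generates on the query's "Mapping" tab would look roughly like the following sketch. This is an approximation only: the authoritative argument order and syntax are in the Reference Guide entry for lookup_ext, and the datastore or file-format qualification of the lookup table name is abbreviated here.

    lookup_ext([ID_lookup, 'PRE_LOAD_CACHE', 'MAX'],
               [ID_DEPT_NAME],
               [NULL],
               [ID_DEPT, '=', Employees.ID_DEPT])

Reading the bracketed groups in order: the lookup table with its cache specification and return policy, the output column list, the default value list, and the lookup condition (lookup column, operator, input expression).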

Related Topics
• Example: Defining a complex lookup_ext function

Example: Defining a complex lookup_ext function
This procedure describes the process for defining a complex lookup_ext function using a new function call. The associated example uses the same lookup and input tables as in the Example: Defining a simple lookup_ext function. This example illustrates how to extract and normalize employee ID numbers.

For details on all the available options for the lookup_ext function, see the Reference Guide.

Example: In this example, you want to extract and normalize employee Social Security numbers and tax identification numbers that have different prefixes. You want to remove the prefixes, thereby normalizing the numbers. You also want to identify the department from where the number came. The data flow has one source table Employees, a query configured with lookup_ext, and a target table.
1. In a data flow, open the Query editor.
2. From the "Schema in" pane, drag the ID column to the "Schema out" pane. Do the same for the Name column.
3. In the "Schema out" pane, right-click the Name column and click New Function Call. Click Insert Below.
4. Select the "Function category" Lookup Functions and the "Function name" lookup_ext and click Next.
5. In the "Lookup_ext - Select Parameters" window, select a lookup table. In the example, the lookup table is in the file format ID_lookup.txt that is in D:\Data.
6. Define one or more conditions. In the example, the condition is ID_PATTERN ~ Employees.ID.
7. Define the output. For each output column:
a. Add a lookup table column name.
b. If you want the software to interpret the column in the lookup table as an expression and return the calculated value, select the Expression check box.
c. Optionally change the default value from NULL.
d. Specify the "Output column name"(s) by typing, dragging, pasting, or using the Smart Editor (click the icon in the right column). In the example, the output columns are ID_RETURN and ID_DEPT_NAME.
Configure the lookup_ext editor as in the following graphic.

The lookup condition is ID_PATTERN ~ Employees.ID. The operator ~ means that the software will apply a pattern comparison to Employees.ID.

The software reads each row of the source table Employees, then checks the lookup table ID_lookup.txt for all rows that satisfy the lookup condition. When it encounters a pattern in ID_lookup.ID_PATTERN that matches Employees.ID, the software applies the expression in ID_lookup.ID_RETURN. In this example, Employee1 and Employee2 both have IDs that match the pattern ms(SSN*) in the lookup table. The software then applies the expression =substr(ID_PATTERN,4,20), which extracts from the matched string (Employees.ID) a substring of up to 20 characters starting from the 4th position. The results for Employee1 and Employee2 are 111111111 and 222222222, respectively.

For the output of the ID_RETURN lookup column, the software evaluates ID_RETURN as an expression because the Expression box is checked. In the lookup table, the column ID_RETURN contains the expression =substr(ID_PATTERN,4,20). ID_PATTERN in this expression refers to the lookup table column ID_PATTERN. When the lookup condition ID_PATTERN ~ Employees.ID is true, the software evaluates the expression. Here the software substitutes the placeholder ID_PATTERN with the actual Employees.ID value.

The output also includes the ID_DEPT_NAME column, which the software returns as a literal value (because the Expression box is not checked). The resulting target table is as follows:

ID               NAME        ID_RETURN   ID_DEPT_NAME
SSN111111111     Employee1   111111111   Payroll
SSN222222222     Employee2   222222222   Payroll
TAXID333333333   Employee3   333333333   Accounting

Related Topics
• Reference Guide: Functions and Procedures, lookup_ext
• Accessing the lookup_ext editor
• Example: Defining a simple lookup_ext function
• Reference Guide: Functions and Procedures, match_simple

Data flow execution
A data flow is a declarative specification from which the software determines the correct data to process. For example, in data flows placed in batch jobs, the transaction order is to extract, transform, then load data into a target. Data flows are similar to SQL statements. The specification declares the desired output.

The software executes a data flow each time the data flow occurs in a job. However, you can specify that a batch job execute a particular data flow only one time. In that case, the software only executes the first occurrence of the data flow; the software skips subsequent occurrences in the job.

You might use this feature when developing complex batch jobs with multiple paths, such as jobs with try/catch blocks or conditionals, and you want to ensure that the software only executes a particular data flow one time. See Creating and defining data flows for information on how to specify that a job execute a data flow only one time.

The following sections provide an overview of advanced features for data flows:
• Push down operations to the database server
• Distributed data flow execution
• Load balancing
• Caches

Push down operations to the database server
From the information in the data flow specification, the software produces output while optimizing performance. For example, for SQL sources and targets, the software creates database-specific SQL statements based on a job's data flow diagrams. To optimize performance, the software pushes down as many transform operations as possible to the source or target database and combines as many operations as possible into one request to the database. For example, the software tries to push down joins and function evaluations. By pushing down operations to the database, the software reduces the number of rows and operations that the engine must process.

Data flow design influences the number of operations that the software can push to the source or target database. Before running a job, you can examine the SQL that the software generates and alter your design to produce the most efficient results.

You can use the Data_Transfer transform to push down resource-intensive operations anywhere within a data flow to the database. Resource-intensive operations include joins, GROUP BY, ORDER BY, and DISTINCT.

Related Topics
• Performance Optimization Guide: Maximizing push-down operations
• Reference Guide: Data_Transfer
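As a hedged illustration of what push-down looks like (the ORDERS table and its columns are hypothetical, not from this guide), a data flow that filters a source table and aggregates it with GROUP BY can often be collapsed into a single statement that the source database executes, so that only the reduced result set reaches the engine:

    -- Hypothetical example of SQL the software might generate when it can
    -- push the filter and aggregation down to the source database.
    SELECT   CUST_ID, SUM(AMOUNT)
    FROM     ORDERS
    WHERE    ORDER_DATE >= '2009-01-01'
    GROUP BY CUST_ID

Without push-down, every row of ORDERS would have to be read into the engine and filtered and aggregated there.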

Distributed data flow execution
The software provides capabilities to distribute CPU-intensive and memory-intensive data processing work (such as join, grouping, table comparison and lookups) across multiple processes and computers. This work distribution provides the following potential benefits:
• Better memory management by taking advantage of more CPU resources and physical memory
• Better job performance and scalability by using concurrent sub data flow execution to take advantage of grid computing

You can create sub data flows so that the software does not need to process the entire data flow in memory at one time. You can also distribute the sub data flows to different job servers within a server group to use additional memory and CPU resources.

Use the following features to split a data flow into multiple sub data flows:
• Run as a separate process option on resource-intensive operations that include the following:
  • Hierarchy_Flattening transform
  • Associate transform
  • Country ID transform
  • Global Address Cleanse transform
  • Global Suggestion Lists transform
  • Match transform
  • United States Regulatory Address Cleanse transform
  • User-Defined transform
  • Query operations that are CPU-intensive and memory-intensive:
    • Join
    • GROUP BY
    • ORDER BY
    • DISTINCT
  • Table_Comparison transform
  • Lookup_ext function
  • Count_distinct function
  • Search_replace function

If you select the Run as a separate process option for multiple operations in a data flow, the software splits the data flow into smaller sub data flows that use separate resources (memory and computer) from each other. When you specify multiple Run as a separate process options, the sub data flow processes run in parallel.

• Data_Transfer transform
With this transform, the software does not need to process the entire data flow on the Job Server computer. Instead, the Data_Transfer transform can push down the processing of a resource-intensive operation to the database server. This transform splits the data flow into two sub data flows and transfers the data to a table in the database server to enable the software to push down the operation.

Related Topics
• Performance Optimization Guide: Splitting a data flow into sub data flows
• Performance Optimization Guide: Data_Transfer transform for push-down operations

Load balancing
You can distribute the execution of a job or a part of a job across multiple Job Servers within a Server Group to better balance resource-intensive operations. You can specify the following values on the Distribution level option when you execute a job:
• Job level — A job can execute on an available Job Server.
• Data flow level — Each data flow within a job can execute on an available Job Server.
• Sub data flow level — A resource-intensive operation (such as a sort, table comparison, or table lookup) within a data flow can execute on an available Job Server.

Related Topics
• Performance Optimization Guide: Using grid computing to distribute data flows execution

Caches
The software provides the option to cache data in memory to improve operations such as the following in your data flows:
• Joins — Because an inner source of a join must be read for each row of an outer source, you might want to cache a source when it is used as an inner source in a join.
• Table comparisons — Because a comparison table must be read for each row of a source, you might want to cache the comparison table.
• Lookups — Because a lookup table might exist on a remote database, you might want to cache it in memory to reduce access times.

The software provides the following types of caches that your data flow can use for all of the operations it contains:
• In-memory — Use in-memory cache when your data flow processes a small amount of data that fits in memory.
• Pageable cache — Use a pageable cache when your data flow processes a very large amount of data that does not fit in memory.

If you split your data flow into sub data flows that each run on a different Job Server, each sub data flow can use its own cache type.

Related Topics
• Performance Optimization Guide: Using Caches

Audit Data Flow overview
You can audit objects within a data flow to collect run time audit statistics. You can perform the following tasks with this auditing feature:
• Collect audit statistics about data read into a job, processed by various transforms, and loaded into targets.

• Define rules about the audit statistics to determine if the correct data is processed.
• Generate notification of audit failures.
• Query the audit statistics that persist in the repository.

For a full description of auditing data flows, see Using Auditing.

Chapter 8 Transforms

The software includes objects called transforms. Transforms operate on data sets. Transforms manipulate input sets and produce one or more output sets. By contrast, functions operate on single values in specific columns in a data set.

The software includes many built-in transforms. These transforms are available from the object library on the Transforms tab. The transforms that you can use depend on the software package that you have purchased. If a transform belongs to a package that you have not purchased, it is grayed out and cannot be used in a job. The following is a list of available transforms.

Data Integrator transforms:
• Data_Transfer — Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
• Date_Generation — Generates a column filled with date values based on the start and end dates and increment that you provide.
• Effective_Date — Generates an additional "effective to" column based on the primary key's "effective date."
• Hierarchy_Flattening — Flattens hierarchical data into relational tables so that it can participate in a star schema. Hierarchy flattening can be both vertical and horizontal.
• History_Preserving — Converts rows flagged as UPDATE to UPDATE plus INSERT, so that the original values are preserved in the target. You specify in which column to look for updated data.
• Key_Generation — Generates new keys for source data, starting from a value based on existing keys in the table you specify.

• Map_CDC_Operation — Sorts input data, maps output data, and resolves before- and after-images for UPDATE rows. While commonly used to support Oracle changed-data capture, this transform supports any data stream if its input requirements are met.
• Pivot (Columns to Rows) — Rotates the values in specified columns to rows. (Also see Reverse Pivot.)
• Reverse Pivot (Rows to Columns) — Rotates the values in specified rows to columns.
• Table_Comparison — Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT and UPDATE.
• XML_Pipeline — Processes large XML inputs in small batches.

Data Quality transforms:
• Associate — Combine the results of two or more Match transforms or two or more Associate transforms, or any combination of the two, to find matches across match sets.
• Country ID — Parses input data and then identifies the country of destination for each record.
• Data Cleanse — Identifies and parses name, title, and firm data, phone numbers, Social Security numbers, dates, and e-mail addresses. It can assign gender, add prenames, generate Match standards, and convert input sources to a standard format. It can also parse and manipulate various forms of international data, as well as operational and product data.
• Global Address Cleanse — Identifies, parses, validates, and corrects global address data, such as primary number, primary name, primary type, directional, secondary identifier, and secondary number.

• Global Suggestion List — Completes and populates addresses with minimal data, and it can offer suggestions for possible matches.
• Match — Identifies matching records based on your business rules. Also performs candidate selection, unique ID, best record, and other operations.
• USA Regulatory Address Cleanse — Identifies, parses, validates, and corrects USA address data according to the U.S. Coding Accuracy Support System (CASS).
• User-Defined — Does just about anything that you can write Python code to do. You can use the User-Defined transform to create new records and data sets, or populate a field with a specific value, just to name a few possibilities.

Platform transforms:
• Case — Simplifies branch logic in data flows by consolidating case or decision making logic in one transform. Paths are defined in an expression table.
• Map_Operation — Allows conversions between operation codes.
• Merge — Unifies rows from two or more sources into a single target.
• Query — Retrieves a data set that satisfies conditions that you specify. A query transform is similar to a SQL SELECT statement.

• Row_Generation — Generates a column filled with integer values starting at zero and incrementing by one to the end value you specify.
• SQL — Performs the indicated SQL query operation.
• Validation — Ensures that the data at any stage in the data flow meets your criteria. You can filter out or replace data that fails your criteria.

Related Topics
• Reference Guide: Transforms

Transform configurations
A transform configuration is a transform with preconfigured best practice input fields, best practice output fields, and options that can be used in multiple data flows. These are useful if you repeatedly use a transform with specific options and input and output fields.

Some transforms, such as Data Quality transforms, have read-only transform configurations that are provided when Data Services is installed. You cannot perform export or multi-user operations on read-only transform configurations.

You can also create your own transform configuration, either by replicating an existing transform configuration or creating a new one. In the Transform Configuration Editor window, you set up the default options, best practice input fields, and best practice output fields for your transform configuration. After you place an instance of the transform configuration in a data flow, you can override these preset defaults.

If you edit a transform configuration, that change is inherited by every instance of the transform configuration used in data flows, unless a user has explicitly overridden the same option value in an instance.

Related Topics
• To create a transform configuration
• To add a user-defined field

To create a transform configuration
1. In the Transforms tab of the "Local Object Library," right-click a transform and select New to create a new transform configuration, or right-click an existing transform configuration and select Replicate. If New or Replicate is not available from the menu, then the selected transform type cannot have transform configurations.
The "Transform Configuration Editor" window opens.
2. In Transform Configuration Name, enter the name of the transform configuration.
3. In the Options tab, set the option values to determine how the transform will process your data. The available options depend on the type of transform that you are creating a configuration for.
For the Associate, Match, and User-Defined transforms, options are not editable in the Options tab. You must set the options in the Associate Editor, Match Editor, or User-Defined Editor, which are accessed by clicking the Edit Options button.
If you change an option value from its default value, a green triangle appears next to the option name to indicate that you made an override.
Use the filter to display all options or just those options that are designated as best practice options.
4. To designate an option as "best practice," select the Best Practice checkbox next to the option's value. Designating an option as best practice indicates to other users who use the transform configuration which options are typically set for this type of transform.
5. Click the Verify button to check whether the selected option values are valid.
If there are any errors, they are displayed at the bottom of the window.
6. In the Input Best Practices tab, select the input fields that you want to designate as the best practice input fields for the transform configuration.
The transform configurations provided with Data Services do not specify best practice input fields, so that it doesn't appear that one input schema is preferred over other input schemas. For example, you may map the fields in your data flow that contain address data whether the address data resides in discrete fields, multiline fields, or a combination of discrete and multiline fields.

These input fields will be the only fields displayed when the Best Practice filter is selected in the Input tab of the transform editor when the transform configuration is used within a data flow.
7. In the Output Best Practices tab, select the output fields that you want to designate as the best practice output fields for the transform configuration.
These output fields will be the only fields displayed when the Best Practice filter is selected in the Output tab of the transform editor when the transform configuration is used within a data flow.
8. Click OK to save the transform configuration.
The transform configuration is displayed in the "Local Object Library" under the base transform of the same type. You can now use the transform configuration in data flows.

Related Topics
• Reference Guide: Transforms, Transform configurations

To add a user-defined field
For some transforms, such as the Associate, Match, and User-Defined transforms, you can create user-defined input fields rather than fields that are recognized by the transform. These transforms use user-defined fields because they do not have a predefined set of input fields. You can add a user-defined field either to a single instance of a transform in a data flow or to a transform configuration so that it can be used in all instances. In the User-Defined transform, you can also add user-defined output fields.
1. In the Transforms tab of the "Local Object Library," right-click an existing Associate, Match, or User-Defined transform configuration and select Edit.

The "Transform Configuration Editor" window opens.
2. In the Input Best Practices tab, click the Create button and enter the name of the input field.
3. Click OK to save the transform configuration.
When you create a user-defined field in the transform configuration, it is displayed as an available field in each instance of the transform used in a data flow. You can also create user-defined fields within each transform instance.

Related Topics
• Data Quality editors

To add transforms to data flows
You can use the Designer to add transforms to data flows.
1. Open a data flow object.
2. Open the object library if it is not already open and click the Transforms tab.
3. Select the transform or transform configuration that you want to add to the data flow.
4. Drag the transform or transform configuration icon into the data flow workspace.
If you selected a transform that has available transform configurations, a drop-down menu prompts you to select a transform configuration.
5. Draw the data flow connections.
To connect a source to a transform, click the square on the right edge of the source and drag the cursor to the arrow on the left edge of the transform.
• The input for the transform might be the output from another transform or the output from a source; or, the transform may not require source data.
Continue connecting inputs and outputs as required for the transform.

• You can connect the output of the transform to the input of another transform or target.
6. Double-click the name of the transform.
This opens the transform editor, which lets you complete the definition of the transform.
7. Enter option values.
To specify a data column as a transform option, enter the column name as it appears in the input schema or drag the column name from the input schema into the option box.

Related Topics
• To add a Query transform to a data flow
• To add a Data Quality transform to a data flow

Transform editors
Transform editor layouts vary. The transform you may use most often is the Query transform, which has two panes:
• An input schema area and/or output schema area
• An options area (or parameters area) that allows you to set all the values the transform requires

Data Quality transforms, such as Match and Data Cleanse, use a transform editor that lets you set options and map input and output fields. Data Integrator and other platform transforms each have their own graphical user interface to define the transform options.

Related Topics
• Query editor
• Data Quality editors


Chapter 9 Query transform overview

The Query transform is by far the most commonly used transform, so this section provides an overview.

The Query transform can perform the following operations:
• Choose (filter) the data to extract from sources
• Join data from multiple sources
• Map columns from input to output schemas
• Perform transformations and functions on the data
• Perform data nesting and unnesting
• Add new columns, nested schemas, and function results to the output schema
• Assign primary keys to output columns

Related Topics
• Nested Data
• Reference Guide: Transforms

Query editor
The query editor, a graphical interface for performing query operations, contains the following areas:
• Input schema area (upper left)
• Output schema area (upper right)
• Parameters area (lower tabbed area)
The i icon indicates tabs containing user-defined entries.

The input and output schema areas can contain: Columns, Nested schemas, and Functions (output only).

The Schema In and Schema Out lists display the currently selected schema in each area. The currently selected output schema is called the current schema and determines:
• The output elements that can be modified (added, mapped, or deleted)
• The scope of the Select through Order by tabs in the parameters area

The current schema is highlighted while all other (non-current) output schemas are gray.

To change the current schema
You can change the current schema in the following ways:
• Select a schema from the Output list.
• Right-click a schema, column, or function in the output schema area and select Make Current.
• Double-click one of the non-current (grayed-out) elements in the output schema area.

To modify output schema contents
You can modify the output schema in several ways:
• Drag and drop (or copy and paste) columns or nested schemas from the input schema area to the output schema area to create simple mappings.
• Use right-click menu options on output elements to:
  • Add new output columns and schemas.
  • Use new function calls to generate new output columns.
  • Assign or reverse primary key settings on output columns. Primary key columns are flagged by a key icon.
  • Unnest or re-nest schemas.
• Use the Mapping tab to provide complex column mappings. Drag and drop input schemas and columns into the output schema to enable the editor. Use the function wizard and the expression editor to build expressions. When the text editor is enabled, you can access these features using the buttons above the editor.
• Use the Select through Order By tabs to provide additional parameters for the current schema (similar to SQL SELECT statement clauses):

Select: Specifies whether to output only distinct rows (discarding any identical duplicate rows).
From: Specifies all input schemas that are used in the current schema.

Outer Join: Specifies an inner table and outer table for any joins (in the Where sheet) that are to be treated as outer joins.
Where: Specifies conditions that determine which rows are output. The syntax is like an SQL SELECT WHERE clause; for example:
TABLE1.EMPNO = TABLE2.EMPNO AND TABLE1.EMPNO > 1000 OR TABLE2.EMPNO < 9000
Group By: Specifies how the output rows are grouped (if required).
Order By: Specifies how the output rows are sequenced (if required).

• Use the Search tab to locate input and output elements containing a specific word or term.
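Because the Select through Order By tabs correspond to the clauses of a SQL SELECT statement, it can help to read a Query transform as the equivalent query. The following sketch is illustrative only: TABLE1 and TABLE2 are the hypothetical inputs from the Where example above, and the output column and ordering are invented.

    SELECT   TABLE1.EMPNO              -- output schema mappings; the Select tab controls DISTINCT
    FROM     TABLE1, TABLE2            -- From tab: input schemas used in the current schema
    WHERE    TABLE1.EMPNO = TABLE2.EMPNO
        AND  TABLE1.EMPNO > 1000
        OR   TABLE2.EMPNO < 9000       -- Where tab: the example condition above
    ORDER BY TABLE1.EMPNO              -- Order By tab: output row sequence

Outer joins declared on the Outer Join tab and any Group By settings would appear as the corresponding outer join syntax and GROUP BY clause.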

To add a Query transform to a data flow
Because it is so commonly used, the Query transform icon is included in the tool palette, providing an easier way to add a Query transform.
1. Click the Query icon in the tool palette.
2. Click anywhere in a data flow workspace.
3. Connect the Query to inputs and outputs.
Note:
• The inputs for a Query can include the output from another transform or the output from a source.
• The outputs from a Query can include input to another transform or input to a target.
• You can change the content type for the columns in your data by selecting a different type from the output content type list.
• If you connect a target table to a Query with an empty output schema, the software automatically fills the Query's output schema with the columns from the target table, without mappings.


Chapter 10 Data Quality transforms overview

Data Quality transforms are a set of transforms that help you improve the quality of your data. The transforms can parse, standardize, correct, and append information to your customer and operational data.

Data Quality transforms include the following transforms:
• Associate
• Country ID
• Data Cleanse
• Global Address Cleanse
• Global Suggestion List
• Match
• USA Regulatory Address Cleanse
• User-Defined

Related Topics
• Reference Guide: Transforms

Data Quality editors
The Data Quality editors, graphical interfaces for setting input and output fields and options, contain the following areas: input schema area (upper left), output schema area (upper right), and the parameters area (lower tabbed area).

The parameters area contains three tabs: Input, Options, and Output. Generally, it is considered best practice to complete the tabs in this order, because the parameters available in a tab may depend on parameters selected in the previous tab.

Input schema area
The input schema area displays the input fields that are output from the upstream transform in the data flow.

Output schema area
The output schema area displays the fields that the transform outputs, and which become the input fields for the downstream transform in the data flow.

Input tab
The Input tab displays the available field names that are recognized by the transform. You map these fields to input fields in the input schema area. Mapping input fields to field names that the transform recognizes tells the transform how to process that field.

Options tab
The Options tab contains business rules that determine how the transform processes your data. Each transform has a different set of available options. If you change an option value from its default value, a green triangle appears next to the option name to indicate that you made an override.

In the Associate, Match, and User-Defined transforms, you cannot edit the options directly in the Options tab. Instead you must use the Associate, Match, and User-Defined editors, which you can access from the Edit Options button.

Output tab
The Output tab displays the field names that can be output by the transform. Data Quality transforms can generate fields in addition to the input fields that the transform processes, so that you can output many fields. These mapped output fields are displayed in the output schema area.

Filter and sort
The Input, Options, and Output tabs each contain filters that determine which fields are displayed in the tabs.

Best Practice: Displays the fields or options that have been designated as a best practice for this type of transform. However, these are merely suggestions; they may not meet your needs for processing or outputting your data. The transform configurations provided with the software do not specify best practice input fields.

In Use: Displays the fields that have been mapped to an input field or output field.
All: Displays all available fields.

The Output tab has additional filter and sort capabilities that you access by clicking the column headers. You can filter each column of data to display one or more values, and also sort the fields in ascending or descending order. Icons in the column header indicate whether the column has a filter or sort applied to it. Because you can filter and sort on multiple columns, they are applied from left to right. The filter and sort menu is not available if there is only one item type in the column.

Embedded help
The embedded help is the place to look when you need more information about Data Services transforms and options. The topic changes to help you with the context you're currently in. When you select a new transform or a new option group, the topic updates to reflect that selection. You can also navigate to other topics by using hyperlinks within the open topic.

Note: To view option information for the Associate, Match, and User-Defined transforms, you will need to open their respective editors by selecting the transform in the data flow and then choosing Tools > <transform> Editor.

Related Topics
• Associate, Match, and User-Defined transform editors

Match. Match. and User-Defined transform editors The Associate. or operations. Match. Buttons — Use these to add. right-click the option group it belongs to and select the name of the option group from the menu. and User-Defined transform editors 10 Associate. remove and order option groups. and User-Defined transforms each have their own editor in which you can add option groups and edit options. 2. that are available for the transform. Option Explorer — In this area. Option Editor — In this area. 3. you specify the value of the option. you select the option groups. SAP BusinessObjects Data Services Designer Guide 245 . and in some cases even share the same option groups. 4.Data Quality transforms overview Associate. Embedded help — The embedded help displays additional information about using the current editor screen. The editor window is divided into four areas: 1. The editors for these three transforms look and act similarly. To display an option group that is hidden.

Related Topics
• Reference Guide: Transforms, Associate
• Reference Guide: Transforms, Match
• Reference Guide: Transforms, User-Defined

Ordered options editor
Some transforms allow you to choose and specify the order of multiple values for a single option. One example is the parser sequence option of the Data Cleanse transform.

To configure an ordered option:
1. Click the Add and Remove buttons to move option values between the Available and Selected values lists.
Note: To clear the Selected values list and move all option values to the Available values list, click Remove All.
2. Select a value in the Available values list, and click the up and down arrow buttons to change the position of the value in the list.
3. Click OK to save your changes to the option configuration.
The values are listed in the Designer and separated by pipe characters.

To add a Data Quality transform to a data flow
Data Quality transforms cannot be directly connected to an upstream transform that contains or generates nested tables. This is common in real-time data flows, especially those that perform matching. To connect these transforms, you must insert either a Query transform or an XML Pipeline transform between the transform with the nested table and the Data Quality transform.
1. Open a data flow object.
2. Open the object library if it is not already open.
3. Go to the Transforms tab.

4. Expand the Data Quality transform folder and select the transform or transform configuration that you want to add to the data flow.
5. Drag the transform or transform configuration icon into the data flow workspace.
If you selected a transform that has available transform configurations, a drop-down menu prompts you to select a transform configuration.
6. Draw the data flow connections.
To connect a source or a transform to another transform, click the square on the right edge of the source or upstream transform and drag the cursor to the arrow on the left edge of the Data Quality transform.
• The input for the transform might be the output from another transform or the output from a source; or, the transform may not require source data.
• You can connect the output of the transform to the input of another transform or target.
7. Double-click the name of the transform.
This opens the transform editor, which lets you complete the definition of the transform.
8. In the input schema, select the input fields that you want to map and drag them to the appropriate field in the Input tab.
This maps the input field to a field name that is recognized by the transform so that the transform knows how to process it correctly. For example, an input field that is named "Organization" would be mapped to the Firm field. When content types are defined for the input, these columns are automatically mapped to the appropriate input fields. You can change the content type for the columns in your data by selecting a different type from the output content type list.
9. For the Associate, Match, and User-Defined transforms, you can add user-defined fields to the Input tab. You can do this in two ways:
• Click the first empty row at the bottom of the table and press F2 on your keyboard. Enter the name of the field. Select the appropriate input field from the drop-down box to map the field.
• Drag the appropriate input field to the first empty row at the bottom of the table.
To rename the user-defined field, click the name, press F2 on your keyboard, and enter the new name.

10. In the Options tab, select the appropriate option values to determine how the transform will process your data.
• Make sure that you map input fields before you set option values, because in some transforms, the available options and option values depend on the mapped input fields.
• For the Associate, Match, and User-Defined transforms, options are not editable in the Options tab. You must set the options in the Associate Editor, Match Editor, and User-Defined Editor. You can access these editors either by clicking the Edit Options button in the Options tab or by right-clicking the transform in the data flow.
If you change an option value from its default value, a green triangle appears next to the option name to indicate that you made an override.
11. In the Output tab, double-click the fields that you want to output from the transform. Data Quality transforms can generate fields in addition to the input fields that the transform processes, so you can output many fields.
• Make sure that you set options before you map output fields.
The selected fields appear in the output schema. The output schema of this transform becomes the input schema of the next transform in the data flow.
12. If you want to pass data through the transform without processing it, drag fields directly from the input schema to the output schema.
13. To rename or resize an output field, double-click the output field and edit the properties in the "Column Properties" window.

Related Topics
• Reference Guide: Data Quality Fields
• Data Quality editors

Chapter 11 Work Flows

Related Topics
• What is a work flow?
• Steps in a work flow
• Order of execution in work flows
• Example of a work flow
• Creating work flows
• Conditionals
• While loops
• Try/catch blocks
• Scripts

What is a work flow?
A work flow defines the decision-making process for executing data flows. For example, elements in a work flow can determine the path of execution based on a value set by a previous job or can indicate an alternative path if something goes wrong in the primary path. Ultimately, the purpose of a work flow is to prepare for executing data flows and to set the state of the system after the data flows are complete.

Jobs (introduced in Projects) are special work flows. Jobs are special because you can execute them. Almost all of the features documented for work flows also apply to jobs, with one exception: jobs do not have parameters.

Steps in a work flow
Work flow steps take the form of icons that you place in the work space to create a work flow diagram. The following objects can be elements in work flows:
• Work flows

• Data flows
• Scripts
• Conditionals
• While loops
• Try/catch blocks

Work flows can call other work flows, and you can nest calls to any depth. A work flow can also call itself.

The connections you make between the icons in the workspace determine the order in which work flows execute, unless the jobs containing those work flows execute in parallel.

Order of execution in work flows
Steps in a work flow execute in a left-to-right sequence indicated by the lines connecting the steps. Here is the diagram for a work flow that calls three data flows:

Note that Data_Flow1 has no connection from the left but is connected on the right to the left edge of Data_Flow2 and that Data_Flow2 is connected to Data_Flow3. There is a single thread of control connecting all three steps. Execution begins with Data_Flow1 and continues through the three data flows.

Connect steps in a work flow when there is a dependency between the steps. If there is no dependency, the steps need not be connected. In that case, the software can execute the independent steps in the work flow as separate processes. In the following work flow, the software executes data flows 1 through 3 in parallel:

To execute more complex work flows in parallel, define each sequence as a separate work flow, then call each of the work flows from another work flow as in the following example:

You can specify that a job execute a particular work flow or data flow only one time. In that case, the software only executes the first occurrence of the work flow or data flow; the software skips subsequent occurrences in the job. You might use this feature when developing complex jobs with multiple paths, such as jobs with try/catch blocks or conditionals, and you want to ensure that the software only executes a particular work flow or data flow one time.

Example of a work flow
Suppose you want to update a fact table. You define a data flow in which the actual data transformation takes place. However, before you move data from the source, you want to check that the data connections required to build the fact table are active when data is read from them. To do this in the software, you define a try/catch block. If the connections are not active, the catch runs a script you wrote, which automatically sends mail notifying an administrator of the problem.

In addition, you want to determine when the fact table was last updated so that you only extract rows that have been added or changed since that date. You need to write a script to determine when the last update was made. You can then pass this date to the data flow as a parameter.
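For example, such a last-update script could be as small as the following sketch in the Data Services scripting language. The datastore name, table name, column, and variable are hypothetical, and the variable would be declared in the job or work flow:

# Look up the most recent load time and store it in a global variable
# that is then passed to the data flow as a parameter.
$G_LastUpdate = sql('Target_DS', 'SELECT MAX(LAST_UPDATE) FROM FACT_SALES');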

Scripts and error detection cannot execute in the data flow. Rather, they are steps of a decision-making process that influences the data flow. This decision-making process is defined as a work flow, which looks like the following:

The software executes these steps in the order that you connect them.

Creating work flows
You can create work flows using one of two methods:
• Object library
• Tool palette

After creating a work flow, you can specify that a job only execute the work flow one time, even if the work flow appears in the job multiple times. If more than one instance of a work flow appears in a job, you can improve execution performance by running the work flow only one time.

To create a new work flow using the object library
1. Open the object library.
2. Go to the Work Flows tab.
3. Right-click and choose New.
4. Drag the work flow into the diagram.
5. Add the data flows, work flows, conditionals, try/catch blocks, and scripts that you need.

To create a new work flow using the tool palette
1. Select the work flow icon in the tool palette.
2. Click where you want to place the work flow in the diagram.

To specify that a job executes the work flow one time
When you specify that a work flow should only execute once, a job will never re-execute that work flow after the work flow completes successfully, except if the work flow is contained in a work flow that is a recovery unit that re-executes and has not completed successfully elsewhere outside the recovery unit. It is recommended that you not mark a work flow as Execute only once if the work flow or a parent work flow is a recovery unit.
1. Right-click the work flow and select Properties. The Properties window opens for the work flow.
2. Select the Execute only once check box.
3. Click OK.

Related Topics
• Reference Guide: Work flow

Conditionals
Conditionals are single-use objects used to implement if/then/else logic in a work flow. Conditionals and their components (if expressions, then and else diagrams) are included in the scope of the parent control flow's variables and parameters.

To define a conditional, you specify a condition and two logical branches:

• If: A Boolean expression that evaluates to TRUE or FALSE. You can use functions, variables, and standard operators to construct the expression.
• Then: Work flow elements to execute if the If expression evaluates to TRUE.
• Else: (Optional) Work flow elements to execute if the If expression evaluates to FALSE.

Define the Then and Else branches inside the definition of the conditional.

A conditional can fit in a work flow. Suppose you use a Windows command file to transfer data from a legacy system into the software. You write a script in a work flow to run the command file and return a success flag. You then define a conditional that reads the success flag to determine if the data is available for the rest of the work flow.
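A minimal sketch of such a script in the Data Services scripting language follows. The paths, the "done" marker file, and the variable $G_DataReady are hypothetical, and the variable is assumed to be declared in the job or work flow; exec and file_exists are documented in the Reference Guide:

# Run the transfer command file, then record a success flag
# for the conditional to read.
exec('cmd.exe', '/C D:\legacy\transfer.bat', 8);
$G_DataReady = file_exists('D:\legacy\transfer.done');

The conditional's If expression could then simply be $G_DataReady = 1.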

To implement this conditional in the software, you define two work flows—one for each branch of the conditional. If the elements in each branch are simple, you can define them in the conditional editor itself.

Both the Then and Else branches of the conditional can contain any object that you can have in a work flow, including other work flows, nested conditionals, try/catch blocks, and so on.

To define a conditional
1. Define the work flows that are called by the Then and Else branches of the conditional. It is recommended that you define, test, and save each work flow as a separate object rather than constructing these work flows inside the conditional editor.
2. Open the work flow in which you want to place the conditional.
3. Click the icon for a conditional in the tool palette.
4. Click the location where you want to place the conditional in the diagram. The conditional appears in the diagram.
5. Click the name of the conditional to open the conditional editor.
6. Click if.
7. Enter the Boolean expression that controls the conditional. Continue building your expression. You might want to use the function wizard or smart editor.
8. After you complete the expression, click OK.
9. Add your predefined work flow to the Then box. To add an existing work flow, open the object library to the Work Flows tab, select the desired work flow, then drag it into the Then box.
10. (Optional) Add your predefined work flow to the Else box. If the If expression evaluates to FALSE and the Else box is blank, the software exits the conditional and continues with the work flow.
11. After you complete the conditional, choose Debug > Validate. The software tests your conditional for syntax errors and displays any errors encountered.

12. Click the Back button to return to the work flow that calls the conditional. The conditional is now defined.

While loops
Use a while loop to repeat a sequence of steps in a work flow as long as a condition is true.

This section discusses:
• Design considerations
• Defining a while loop
• Using a while loop with View Data

Design considerations
The while loop is a single-use object that you can use in a work flow. The while loop repeats a sequence of steps as long as a condition is true.

Typically, the steps done during the while loop result in a change in the condition so that the condition is eventually no longer satisfied and the work flow exits from the while loop. If the condition does not change, the while loop will not end.

For example, you might want a work flow to wait until the system writes a particular file. You can use a while loop to check for the existence of the file using the file_exists function. As long as the file does not exist, you can have the work flow go into sleep mode for a particular length of time, say one minute, before checking again.

Because the system might never write the file, you must add another check to the loop, such as a counter, to ensure that the while loop eventually exits. In other words, change the while loop to check for the existence of the file and the value of the counter. As long as the file does not exist and the counter is less than a particular value, repeat the while loop. In each iteration of the loop, put the work flow in sleep mode and then increment the counter, as in the sketch below.
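A minimal sketch of this pattern (the file path is hypothetical, and $G_Counter is assumed to be a variable declared in the job or work flow). The while loop's condition checks both the file and the counter:

file_exists('D:\incoming\trigger.dat') = 0 AND $G_Counter < 10

A script inside the loop then waits and increments the counter:

# Wait one minute before checking again (sleep takes milliseconds),
# then increment the counter so that the loop eventually exits.
sleep(60000);
$G_Counter = $G_Counter + 1;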

Defining a while loop
You can define a while loop in any work flow.

To define a while loop
1. Open the work flow where you want to place the while loop.
2. Click the while loop icon on the tool palette.
3. Click the location where you want to place the while loop in the workspace diagram. The while loop appears in the diagram.
4. Click the while loop to open the while loop editor.
5. In the While box at the top of the editor, enter the condition that must apply to initiate and repeat the steps in the while loop. Alternatively, click the ellipsis button to open the expression editor, which gives you more space to enter an expression and access to the function wizard. Click OK after you enter an expression in the editor.
6. Add the steps you want completed during the while loop to the workspace in the while loop editor. You can add any objects valid in a work flow, including scripts, work flows, and data flows. Note: Although you can include the parent work flow in the while loop, recursive calls can create an infinite loop.
7. Connect these objects to represent the order that you want the steps completed.
8. After defining the steps in the while loop, choose Debug > Validate. The software tests your definition for syntax errors and displays any errors encountered. Close the while loop editor to return to the calling work flow.

Using a while loop with View Data
When using View Data, a job stops when the software has retrieved the specified number of rows for all scannable objects.

Depending on the design of your job, the software might not complete all iterations of a while loop if you run a job in view data mode:
• If the while loop contains scannable objects and there are no scannable objects outside the while loop (for example, if the while loop is the last object in a job), then the job will complete after the scannable objects in the while loop are satisfied, possibly after the first iteration of the while loop.
• If there are scannable objects after the while loop, the while loop will complete normally. Scanned objects in the while loop will show results from the last iteration.
• If there are no scannable objects following the while loop but there are scannable objects completed in parallel to the while loop, the job will complete as soon as all scannable objects are satisfied. The while loop might complete any number of iterations.

Try/catch blocks
A try/catch block is a combination of one try object and one or more catch objects that allow you to specify alternative work flows if errors occur while the software is executing a job. Try/catch blocks:
• "Catch" groups of exceptions "thrown" by the software, the DBMS, or the operating system.
• Apply solutions that you provide for the exception groups or for specific errors within a group.
• Continue execution.

Try and catch objects are single-use objects.

Here's the general method to implement exception handling:
1. Insert a try object before the steps for which you are handling errors.
2. Insert a catch object in the work flow after the steps.
3. In the catch object, do the following:
• Select one or more groups of errors that you want to catch.
• Define the actions that a thrown exception executes. The actions can be a single script object, a data flow, a work flow, or a combination of these objects.
• Optional. Use catch functions inside the catch block to identify details of the error.

If an exception is thrown during the execution of a try/catch block and if no catch object is looking for that exception, then the exception is handled by normal error logic.

The following work flow shows a try/catch block surrounding a data flow:

In this case, if the data flow BuildTable causes any system-generated exceptions specified in the catch Catch_A, then the actions defined in Catch_A execute.

The action initiated by the catch object can be simple or complex. Here are some examples of possible exception actions:
• Send the error message to an online reporting database or to your support group.
• Rerun a failed work flow or data flow.
• Run a scaled-down version of a failed work flow or data flow.

Related Topics
• Defining a try/catch block
• Categories of available exceptions
• Example: Catching details of an error
• Reference Guide: Objects, Catch

Defining a try/catch block
To define a try/catch block:
1. Open the work flow that will include the try/catch block.
2. Click the try icon in the tool palette. Note: There is no editor for a try; the try merely initiates the try/catch block.
3. Click the location where you want to place the try in the diagram. The try icon appears in the diagram.
4. Click the catch icon in the tool palette.

5. Click the location where you want to place the catch object in the work space. The catch object appears in the work space.
6. Connect the try and catch objects to the objects they enclose.
7. Click the name of the catch object to open the catch editor.
8. Select one or more groups from the list of Exceptions. To select all exception groups, click the check box at the top.
9. Define the actions to take for each exception group and add the actions to the catch work flow box. The actions can be an individual script, a data flow, a work flow, or any combination of these objects.
a. It is recommended that you define, test, and save the actions as a separate object rather than constructing them inside the catch editor.
b. If you want to define actions for specific errors, use the following catch functions in a script that the work flow executes:
• error_context()
• error_message()
• error_number()
• error_timestamp()
c. To add an existing work flow to the catch work flow box, open the object library to the Work Flows tab, select the desired work flow, and drag it into the box.
10. After you have completed the catch, choose Validation > Validate > All Objects in View. The software tests your definition for syntax errors and displays any errors encountered.
11. Click the Back button to return to the work flow that calls the catch.
12. If you want to catch multiple exception groups and assign different actions to each exception group, repeat steps 4 through 11 for each catch in the work flow.

Note: In a sequence of catch blocks, if one catch block catches an exception, the subsequent catch blocks will not be executed. For example, if your work flow has the following sequence and Catch1 catches an exception, then Catch2 and CatchAll will not execute.
Try > DataFlow1 > Catch1 > Catch2 > CatchAll

If any error in the exception group listed in the catch occurs during the execution of this try/catch block, the software executes the catch work flow.

Related Topics
• Categories of available exceptions
• Example: Catching details of an error
• Reference Guide: Objects, Catch

Categories of available exceptions
Categories of available exceptions include:
• Execution errors (1001)
• Database access errors (1002)
• Database connection errors (1003)
• Flat file processing errors (1004)
• File access errors (1005)
• Repository access errors (1006)
• SAP system errors (1007)
• System resource exception (1008)
• SAP BW execution errors (1009)
• XML processing errors (1010)
• COBOL copybook errors (1011)
• Excel book errors (1012)
• Data Quality transform errors (1013)

Example: Catching details of an error
This example illustrates how to use the error functions in a catch script. Suppose you want to catch database access errors and send the error details to your support group.
1. In the catch editor, select the exception group that you want to catch. In this example, select the check box in front of Database access errors (1002).

2. In the work flow area of the catch editor, create a script object with the following script:

mail_to('support@my.com',
'Data Service error number' || error_number(),
'Error message: ' || error_message(), 20, 20);
print('DBMS Error: ' || error_message());

3. This sample catch script includes the mail_to function to do the following:
• Specify the email address of your support group.
• Send the error number that the error_number() function returns for the exception caught.
• Send the error message that the error_message() function returns for the exception caught.
4. The sample catch script includes a print command to print the error message for the database error.

Related Topics
• Reference Guide: Objects, Catch error functions
• Reference Guide: Objects, Catch scripts

Scripts
Scripts are single-use objects used to call functions and assign values to variables in a work flow. For example, you can use the SQL function in a script to determine the most recent update time for a table and then assign that value to a variable. You can then assign the variable to a parameter that passes into a data flow and identifies the rows to extract from a source.

A script can contain the following statements:
• Function calls
• If statements
• While statements
• Assignment statements
• Operators

The basic rules for the syntax of the script are as follows:
• Each line ends with a semicolon (;).

• Variable names start with a dollar sign ($).
• String values are enclosed in single quotation marks (').
• Comments start with a pound sign (#).
• Function calls always specify parameters even if the function uses no parameters.

For example, the following script statement determines today's date and assigns the value to the variable $TODAY:
$TODAY = sysdate();

You cannot use variables unless you declare them in the work flow that calls the script.

Related Topics
• Reference Guide: Data Services Scripting Language

To create a script
1. Open the work flow.
2. Click the script icon in the tool palette.
3. Click the location where you want to place the script in the diagram. The script icon appears in the diagram.
4. Click the name of the script to open the script editor.
5. Enter the script statements, each followed by a semicolon. The following example shows a script that determines the start time from the output of a custom function:

AW_StartJob ('NORMAL','DELTA', $G_STIME,$GETIME);
$GETIME =to_date(
sql('ODS_DS','SELECT to_char(MAX(LAST_UPDATE),
\'YYYY-MM-DDD HH24:MI:SS\') FROM EMPLOYEE'),
'YYYY_MMM_DDD_HH24:MI:SS');

6. Click the function button to include functions in your script.
7. After you complete the script, select Validation > Validate. The software tests your script for syntax errors and displays any errors encountered.
8. Click the ... button and then Save to name and save your script. The script is saved by default in <LINKDIR>/BusinessObjects Data Services/DataQuality/Samples.

Debugging scripts using the print function
The software has a debugging feature that allows you to print:
• The values of variables and parameters during execution
• The execution path followed within a script

You can use the print function to write the values of parameters and variables in a work flow to the trace log. For example, this line in a script:
print('The value of parameter $x: [$x]');
produces the following output in the trace log:
The following output is being printed via the Print function in <Session job_name>.
The value of parameter $x: value

Related Topics
• Reference Guide: Functions and Procedures, print

Chapter 12 Nested Data

This section discusses nested data and how to use it in the software.

What is nested data?
Real-world data often has hierarchical relationships that are represented in a relational database with master-detail schemas using foreign keys to create the mapping. However, some data sets, such as XML documents and SAP ERP IDocs, handle hierarchical relationships through nested data.

The software maps nested data to a separate schema implicitly related to a single row and column of the parent schema. This mechanism is called Nested Relational Data Modelling (NRDM). NRDM provides a way to view and manipulate hierarchical relationships within data flow sources, targets, and transforms.

Sales orders are often presented using nesting: the line items in a sales order are related to a single header and are represented using a nested schema. Each row of the sales order data set contains a nested line item schema.

Representing hierarchical data
You can represent the same hierarchical data in several ways. Examples include:
• Multiple rows in a single data set

• Multiple data sets related by a join
• Nested data

Using the nested data method can be more concise (no repeated information) and can scale to present a deeper level of hierarchical complexity. Generalizing further with nested data, each row at each level can have any number of columns containing nested schemas, and columns inside a nested schema can also contain columns. There is a unique instance of each nested schema for each row at each level of the relationship.

You can see the structure of nested data in the input and output schemas of sources, targets, and transforms in data flows. Nested schemas appear with a schema icon paired with a plus sign, which indicates that the object contains columns. The structure of the schema shows how the data is ordered:
• Sales is the top-level schema.
• LineItems is a nested schema. The minus sign in front of the schema icon indicates that the column list is open.
• CustInfo is a nested schema with the column list closed.

Formatting XML documents
The software allows you to import and export metadata for XML documents (files or messages), which you can use as sources or targets in jobs. XML documents are hierarchical. Their valid structure is stored in separate format documents.

The format of an XML file or message (.xml) can be specified using either an XML Schema (.xsd, for example) or a document type definition (.dtd).

When you import a format document's metadata, it is structured into the software's internal schema for hierarchical documents, which uses the nested relational data model (NRDM).

Related Topics
• Importing XML Schemas
• Specifying source options for XML files
• Mapping optional schemas
• Using Document Type Definitions (DTDs)

• Generating DTDs and XML Schemas from an NRDM schema

Importing XML Schemas
The software supports the W3C XML Schema Specification 1.0. For an XML document that contains information to place a sales order—order header, customer, and line items—the corresponding XML Schema includes the order structure and the relationship between data.

Importing XML schemas
Import the metadata for each XML Schema you use. The object library lists imported XML Schemas in the Formats tab.

When importing an XML Schema, the software reads the defined elements and attributes, then imports the following:
• Document structure
• Namespace
• Table and column names
• Data type of each column
• Content type of each column
• Nested table and column attributes

While XML Schemas make a distinction between elements and attributes, the software imports and converts them all to nested table and column attributes.

Related Topics
• Reference Guide: XML schema

To import an XML Schema
1. From the object library, click the Format tab.
2. Right-click the XML Schemas icon.
3. Enter the settings for the XML schemas that you import. When importing an XML Schema:
• Enter the name you want to use for the format in the software.
• Enter the file name of the XML Schema or its URL address.
Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the file path. You must type the path.

You can type an absolute path or a relative path, but the Job Server must be able to access it.
• If the root element name is not unique within the XML Schema, select a name in the Namespace drop-down list to identify the imported XML Schema.
Note: When you import an XML schema for a real-time web service job, you should use a unique target namespace for the schema. When Data Services generates the WSDL file for a real-time job with a source or target schema that has no target namespace, it adds an automatically generated target namespace to the types section of the XML schema. This can reduce performance because Data Services must suppress the namespace information from the web service request during processing, and then reattach the proper namespace information before returning the response to the client.
• In the Root element name drop-down list, select the name of the primary node you want to import. The software only imports elements of the XML Schema that belong to this node or any subnodes.
• If the XML Schema contains recursive elements (element A contains B, element B contains A), specify the number of levels it has by entering a value in the Circular level box. This value must match the number of recursive levels in the XML Schema's content. Otherwise, the job that uses this XML Schema will fail.
• You can set the software to import strings as a varchar of any size. Varchar 1024 is the default.
4. Click OK.

After you import an XML Schema, you can edit its column properties, such as data type, using the General tab of the Column Properties window. You can also view and edit nested table and column attributes from the Column Properties window.

To view and edit nested table and column attributes for XML Schema
1. From the object library, select the Formats tab.
2. Expand the XML Schema category.
3. Double-click an XML Schema name. The XML Schema Format window appears in the workspace.

The Type column displays the data types that the software uses when it imports the XML document metadata.
4. Double-click a nested table or column and select Attributes to view or edit XML Schema attributes.

Related Topics
• Reference Guide: XML schema

Importing abstract types
An XML schema uses abstract types to force substitution for a particular element or type.
• When an element is defined as abstract, a member of the element's substitution group must appear in the instance document.
• When a type is defined as abstract, the instance document must use a type derived from it (identified by the xsi:type attribute).

For example, an abstract element PublicationType can have a substitution group that consists of complex types such as MagazineType, BookType, and NewspaperType.

The default is to select all complex types in the substitution group or all derived types for the abstract type, but you can choose to select a subset.

To limit the number of derived types to import for an abstract type
1. On the Import XML Schema Format window, when you enter the file name or URL address of an XML Schema that contains an abstract type, the Abstract type button is enabled. For example, the following excerpt from an xsd defines the PublicationType element as abstract with derived types BookType and MagazineType:

<xsd:complexType name="PublicationType" abstract="true">
<xsd:sequence>
<xsd:element name="Title" type="xsd:string"/>
<xsd:element name="Author" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="Date" type="xsd:gYear"/>

</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="BookType">
<xsd:complexContent>
<xsd:extension base="PublicationType">
<xsd:sequence>
<xsd:element name="ISBN" type="xsd:string"/>
<xsd:element name="Publisher" type="xsd:string"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
<xsd:complexType name="MagazineType">
<xsd:complexContent>
<xsd:restriction base="PublicationType">
<xsd:sequence>
<xsd:element name="Title" type="xsd:string"/>
<xsd:element name="Author" type="xsd:string" minOccurs="0" maxOccurs="1"/>
<xsd:element name="Date" type="xsd:gYear"/>
</xsd:sequence>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>

2. To select a subset of derived types for an abstract type, click the Abstract type button and take the following actions:
a. From the drop-down list on the Abstract type box, select the name of the abstract type.
b. Select the check boxes in front of each derived type name that you want to import.
c. Click OK.
Note: When you edit your XML schema format, the software selects all derived types for the abstract type by default. In other words, the subset that you previously selected is not preserved.

Importing substitution groups
An XML schema uses substitution groups to assign elements to a special group of elements that can be substituted for a particular named element called the head element. The list of substitution groups can have hundreds

or even thousands of members, but an application typically only uses a limited number of them. The default is to select all substitution groups, but you can choose to select a subset.

To limit the number of substitution groups to import
1. On the Import XML Schema Format window, when you enter the file name or URL address of an XML Schema that contains substitution groups, the Substitution Group button is enabled. For example, the following excerpt from an xsd defines the PublicationType element with substitution groups MagazineType, BookType, AdsType, and NewspaperType:

<xsd:element name="Publication" type="PublicationType"/>
<xsd:element name="BookStore">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="Publication" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="Magazine" type="MagazineType" substitutionGroup="Publication"/>
<xsd:element name="Book" type="BookType" substitutionGroup="Publication"/>
<xsd:element name="Ads" type="AdsType" substitutionGroup="Publication"/>
<xsd:element name="Newspaper" type="NewspaperType" substitutionGroup="Publication"/>

2. Click the Substitution Group button and take the following actions:
a. From the drop-down list on the Substitution group box, select the name of the substitution group.
b. Select the check boxes in front of each substitution group name that you want to import.
c. Click OK.
Note: When you edit your XML schema format, the software selects all elements for the substitution group by default. In other words, the subset that you previously selected is not preserved.

Specifying source options for XML files
After you import metadata for XML documents (files or messages), you create a data flow to use the XML documents as sources or targets in jobs.

Creating a data flow with a source XML file
To create a data flow with a source XML file
1. From the object library, click the Format tab.
2. Expand the XML Schema and drag the XML Schema that defines your source XML file into your data flow.
3. Place a query in the data flow and connect the XML source to the input of the query.
4. Double-click the XML source in the work space to open the XML Source File Editor.
5. You must specify the name of the source XML file in the XML file text box.

Related Topics
• Reading multiple XML files at one time
• Identifying source file names
• Reference Guide: XML file source

Reading multiple XML files at one time
The software can read multiple files with the same format from a single directory using a single source object.

To read multiple XML files at one time
1. Open the editor for your source XML file.
2. In XML File on the Source tab, enter a file name containing a wild card character (* or ?). For example:
D:\orders\1999????.xml might read files from the year 1999.

D:\orders\*.xml reads all files with the xml extension from the specified directory.

Related Topics
• Reference Guide: XML file source

Identifying source file names
You might want to identify the source XML file for each row in your source output in the following situations:
1. You specified a wildcard character to read multiple source files at one time.
2. You load from a different source file on different days.

To identify the source XML file for each row in the target
1. In the XML Source File Editor, select Include file name column, which generates a column DI_FILENAME to contain the name of the source XML file.
2. In the Query editor, map the DI_FILENAME column from Schema In to Schema Out.
When you run the job, the target DI_FILENAME column will contain the source XML file name for each row in the target.

Mapping optional schemas
You can quickly specify default mapping for optional schemas without having to manually construct an empty nested table for each optional schema in the Query transform. This feature is especially helpful when you have very large XML schemas with many nested levels in your jobs.

Also, when you import XML schemas (either through DTDs or XSD files), the software automatically marks nested tables as optional if the corresponding option was set in the DTD or XSD file. The software retains this option when you copy and paste schemas into your Query transforms.

When you make a schema column optional and do not provide mapping for it, the software instantiates the empty nested table when you run the job.

While a schema element is marked as optional, you can still provide a mapping for the schema by appropriately programming the corresponding sub-query block with application logic that specifies how the software should produce the output. However, if you modify any part of the sub-query block, the resulting query block must be complete and conform to the normal validation rules required for a nested query block. You must map any output schema not marked as optional to a valid nested query block. The software generates a NULL in the corresponding PROJECT list slot of the ATL for any optional schema without an associated, defined sub-query block.

To make a nested table "optional"
1. Right-click a nested table and select Optional to toggle it on. To toggle it off, right-click the nested table again and select Optional again. You can also right-click a nested table and select Properties, then go to the Attributes tab and set the Optional Table attribute value to yes or no. Click Apply and OK to set.
Note: If the Optional Table value is something other than yes or no, this nested table cannot be marked as optional.
2. When you run a job with a nested table set to optional and you have nothing defined for any columns and nested tables beneath that table, the software generates special ATL and does not perform user interface validation for this nested table.
Example:

CREATE NEW Query ( EMPNO int KEY,
ENAME varchar(10),
JOB varchar (9),
NT1 al_nested_table ( DEPTNO int KEY,
DNAME varchar (14),
NT2 al_nested_table (C1 int) ) SET("Optional Table" = 'yes') )
AS SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB,
NULL FROM EMP, DEPT;

Note: You cannot mark top-level schemas, unnested tables, or nested tables containing function calls optional.

Using Document Type Definitions (DTDs)
The format of an XML document (file or message) can be specified by a document type definition (DTD). The DTD describes the data contained in the XML document and the relationships among the elements in the data.

For an XML document that contains information to place a sales order—order header, customer, and line items—the corresponding DTD includes the order structure and the relationship between data.

Import the metadata for each DTD you use. The object library lists imported DTDs in the Formats tab.

You can import metadata from either an existing XML file (with a reference to a DTD) or a DTD file. If you import the metadata from an XML file, the software automatically retrieves the DTD for that XML file.
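For instance, a minimal DTD for the sales order mentioned above might look like the following sketch. The element names are illustrative only and are not taken from this guide:

<!ELEMENT Order (OrderHeader, CustInfo, LineItems+)>
<!ELEMENT OrderHeader (OrderNo, OrderDate)>
<!ELEMENT CustInfo (CustID, Name, Address)>
<!ELEMENT LineItems (Item, ItemQty, ItemPrice)>
<!ELEMENT OrderNo (#PCDATA)>
<!-- ...plus one #PCDATA declaration per remaining leaf element -->

Here the + occurrence indicator on LineItems expresses the one-to-many relationship between the order header and its line items, which the software imports as a nested table.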

When importing a DTD, the software reads the defined elements and attributes. The software ignores other parts of the definition, such as text and comments. This allows you to modify imported XML data and edit the data type as needed.

Related Topics
• Reference Guide: DTD

To import a DTD or XML Schema format
1. From the object library, click the Format tab.
2. Right-click the DTDs icon and select New.
3. Enter settings into the Import DTD Format window:
• In the DTD definition name box, enter the name you want to give the imported DTD format in the software.
• Enter the file that specifies the DTD you want to import.
Note: If your Job Server is on a different computer than the Designer, you cannot use Browse to specify the file path. You must type the path. You can type an absolute path or a relative path, but the Job Server must be able to access it.
• If importing an XML file, select XML for the File type option. If importing a DTD file, select the DTD option.
• In the Root element name box, select the name of the primary node you want to import. The software only imports elements of the DTD that belong to this node or any subnodes.
• If the DTD contains recursive elements (element A contains B, element B contains A), specify the number of levels it has by entering a value in the Circular level box. This value must match the number of recursive levels in the DTD's content. Otherwise, the job that uses this DTD will fail.
• You can set the software to import strings as a varchar of any size. Varchar 1024 is the default.
4. Click OK.

After you import a DTD, you can edit its column properties, such as data type, using the General tab of the Column Properties window. You can also view and edit DTD nested table and column attributes from the Column Properties window.

To view and edit nested table and column attributes for DTDs
1. From the object library, select the Formats tab.
2. Expand the DTDs category.
3. Double-click a DTD name. The DTD Format window appears in the workspace.
4. Double-click a nested table or column. The Column Properties window opens.
5. Select the Attributes tab to view or edit DTD attributes.

Generating DTDs and XML Schemas from an NRDM schema
You can right-click any schema from within a query editor in the Designer and generate a DTD or an XML Schema that corresponds to the structure of the selected schema (either NRDM or relational).

This feature is useful if you want to stage data to an XML file and subsequently read it into another data flow:
1. Generate a DTD/XML Schema.
2. Use the DTD/XML Schema to set up an XML format.
3. Use the XML format to set up an XML source for the staged file.

The DTD/XML Schema generated will be based on the following information:
• Columns become either elements or attributes based on whether the XML Type attribute is set to ATTRIBUTE or ELEMENT.
• If the Required attribute is set to NO, the corresponding element or attribute is marked optional.
• Nested tables become intermediate elements.

• The Native Type attribute is used to set the type of the element or attribute.
• While generating XML Schemas, the MinOccurs and MaxOccurs values will be set based on the Minimum Occurrence and Maximum Occurrence attributes of the corresponding nested table.

No other information is considered while generating the DTD or XML Schema.

Related Topics
• Reference Guide: DTD
• Reference Guide: XML schema

Operations on nested data
This section discusses the operations that you can perform on nested data.

Overview of nested data and the Query transform
With relational data, a Query transform allows you to execute a SELECT statement. The mapping between input and output schemas defines the project list for the statement. When working with nested data, the query provides an interface to perform SELECTs at each level of the relationship that you define in the output schema.

You use the Query transform to manipulate nested data. If you want to extract only part of the nested data, you can use the XML_Pipeline transform.

Without nested schemas, the Query transform assumes that the FROM clause in the SELECT statement contains the data sets that are connected as inputs to the query object. When working with nested data, you must explicitly define the FROM clause in a query, because a SELECT statement can only include references to relational data sets; a query that includes nested data includes a SELECT statement to define operations for each parent and child schema in the output. The software assists by setting the top-level inputs as the default FROM clause values for the top-level output schema.

The other SELECT statement elements defined by the query work the same with nested data as they do with flat data. However, because the SELECT statements are dependent upon each other—and because the user interface makes it easy to construct arbitrary data sets—determining the appropriate FROM clauses for multiple levels of nesting can be complex.

The Query Editor contains a tab for each clause of the query:
• SELECT applies to the current schema, which the Schema Out text box displays. The current schema allows you to distinguish multiple SELECT statements from each other within a single query.
• FROM includes top-level columns by default. You can include columns from nested schemas or remove the top-level columns in the FROM list by adding schemas to the FROM tab.

The parameters you enter for the following tabs apply only to the current schema (displayed in the Schema Out text box at the top right):
• WHERE
• GROUP BY
• ORDER BY

Related Topics
• Query editor
• Reference Guide: XML_Pipeline

FROM clause construction
When you include a schema in the FROM clause, you indicate that all of the columns in the schema—including columns containing nested schemas—are available to be included in the output. If you include more than one schema in the FROM clause, you indicate that the output will be formed from the cross product of the two schemas, constrained by the WHERE clause for the current schema. These FROM clause descriptions and the behavior of the query are exactly the same with nested data as with relational data.

A FROM clause can contain:
• Any top-level schema from the input

• Any schema that is a column of a schema in the FROM clause of the parent schema

The FROM clauses form a path that can start at any level of the output. The first schema in the path must always be a top-level schema from the input.

The data that a SELECT statement from a lower schema produces differs depending on whether or not a schema is included in the FROM clause at the top level.

The next two examples use the sales order data set to illustrate scenarios where FROM clause values change the data resulting from the query.

Example: FROM clause includes all top-level inputs
To include detailed customer information for all of the orders in the output, join the order schema at the top level with a customer schema. Include both input schemas at the top level in the FROM clause to produce the appropriate data.
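In relational terms, the top-level output of this example corresponds roughly to the following sketch, assuming the two schemas join on a CustID column (the join condition is an assumption; the actual condition comes from your WHERE tab settings):

SELECT ... FROM OrderStatus_In, Cust
WHERE OrderStatus_In.CustID = Cust.CustID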

This example shows:
• The FROM clause includes the two top-level schemas OrderStatus_In and cust.
• The Schema Out pane shows customer details CustID, Customer name, and Address for each SALES_ORDER_NUMBER.

Example: Lower level FROM clause contains top-level input
Suppose you want the detailed information from one schema to appear for each row in a lower level of another schema. For example, the input includes a materials schema and a nested line-item schema, and you want the output to include detailed material information for each line item.

This example shows:
• The nested schema LineItems in Schema Out has a FROM clause that specifies only the Orders schema.

To include the Description from the top-level Materials schema for each row in the nested LineItems schema:
• Map Description from the top-level Materials Schema In to LineItems.
• Specify the following join constraint:
"Order".LineItems.Item = Materials.Item

• In the FROM clause, specify the top-level Materials schema and the nested LineItems schema.

Nesting columns
When you nest rows of one schema inside another, the data set produced in the nested schema is the result of a query against the first one using the related values from the second one. For example, if you have sales-order information in a header schema and a line-item schema, you can nest the line items under the header schema. The line items for a single row of the header schema are equal to the results of a query including the order number:
SELECT * FROM LineItems WHERE Header.OrderNo = LineItems.OrderNo

You can use a query transform to construct a nested data set from relational data. When you indicate the columns included in the nested schema, you specify the query used to define the nested data set for each row of the parent schema.

To construct a nested data set
1. Create a data flow with the sources that you want to include in the nested data set.
2. Place a query in the data flow and connect the sources to the input of the query.
3. Indicate the FROM clause, source list, and WHERE clause to describe the SELECT statement that the query executes to determine the top-level data set.
• FROM clause: Include the input sources in the list on the From tab.
• Source list: Drag the columns from the input to the output. You can also include new columns or include mapping expressions for the columns.
• WHERE clause: Include any filtering or joins required to define the data set for the top-level output.

Related Topics • Query editor • FROM clause construction Using correlated columns in nested data Correlation allows you to use columns from a higher-level schema to construct a nested schema. In a nested-relational model. WHERE clause: Only columns are available that meet the requirements for the FROM clause. A new schema icon appears in the output. • FROM clause: If you created a new output schema. You can also drag an entire schema from the input to the output. If the output requires it. 7. If the output requires it.12 Nested Data Operations on nested data 4. you need to drag schemas from the input to populate the FROM clause. • • Select list: Only columns are available that meet the requirements for the FROM clause. The query editor changes to display the new current schema. that schema is automatically listed. source list. If you dragged an existing schema from the input to the top-level output. and WHERE clause to describe the SELECT statement that the query executes to determine the top-level data set. Create a new schema in the output. In the output of the query. nested under the top-level schema. nest another schema under the top level. Change the current schema to the nested schema. right-click and choose New Output Schema. 5. if that schema is included in the FROM clause for this schema. 8. Indicate the FROM clause. nest another schema at this level. 6. Repeat steps 4 through 6 in this current schema. Make the top-level schema the current schema. the columns in a nested 290 SAP BusinessObjects Data Services Designer Guide .

schema are implicitly related to the columns in the parent row. To take advantage of this relationship, you can use columns from the parent schema in the construction of the nested schema. The higher-level column is a correlated column.

Including a correlated column in a nested schema can serve two purposes:
• The correlated column is a key in the parent schema. Including the key in the nested schema allows you to maintain a relationship between the two schemas after converting them from the nested data model to a relational model.
• The correlated column is an attribute in the parent schema. Including the attribute in the nested schema allows you to use the attribute to simplify correlated queries against the nested data.

To include a correlated column in a nested schema, you do not need to include the schema that includes the column in the FROM clause of the nested schema.

To use a correlated column in a nested schema
1. Create a data flow with a source that includes a parent schema with a nested schema. For example, the source could be an order header schema that has a LineItems column that contains a nested schema.
2. Connect a query to the output of the source.
3. In the query editor, copy all columns of the parent schema to the output. In addition to the top-level columns, the software creates a column called LineItems that contains a nested schema that corresponds to the LineItems nested schema in the input.
4. Change the current schema to the LineItems schema. (For information on setting the current schema and completing the parameters, see Query editor.)
5. Include a correlated column in the nested schema. Correlated columns can include columns from the parent schema and any other schemas in the FROM clause of the parent schema. For example, drag the OrderNo column from the Header schema into the LineItems schema. Including the correlated column creates a new output

column in the LineItems schema called OrderNo and maps it to the Order.OrderNo column. The data set created for LineItems includes all of the LineItems columns and the OrderNo.

If the correlated column comes from a schema other than the immediate parent, the data in the nested schema includes only the rows that match both the related values in the current row of the parent schema and the value of the correlated column.

You can always remove the correlated column from the lower-level schema in a subsequent query transform.

Distinct rows and nested data
The Distinct rows option in Query transforms removes any duplicate rows at the top level of a join. This is particularly useful to avoid cross products in joins that produce nested output.

Grouping values across nested schemas
When you specify a Group By clause for a schema with a nested schema, the grouping operation combines the nested schemas for each group. For example, to assemble all the line items included in all the orders for each state from a set of orders, you can set the Group By clause in the top level of the data set to the state column (Order.State) and create an output schema that includes the State column (set to Order.State) and the LineItems nested schema.

The result is a set of rows (one for each state) that has the State column and the LineItems nested schema that contains all the LineItems for all the orders for that state.

Unnesting nested data
Loading a data set that contains nested schemas into a relational (non-nested) target requires that the nested rows be unnested. For example, a sales order may use a nested schema to define the relationship between the order header and the order line items. To load the data into relational schemas, the multi-level data must be unnested. Unnesting a schema produces a cross product of the top-level schema (parent) and the nested schema (child).

It is also possible that you would load different columns from different nesting levels into different schemas. A sales order, for example, may be flattened so that the order number is maintained separately with each line item, and the header and line-item information is loaded into separate schemas.

The software allows you to unnest any number of nested schemas at any depth. No matter how many levels are involved, the result of unnesting schemas is a cross product of the parent and child schemas. When more than one level of unnesting occurs, the inner-most child is unnested first, then the result—the cross product of the parent and the inner-most child—is then unnested from its parent, and so on to the top-level schema.

Unnesting all schemas (cross product of all data) might not produce the results you intend. For example, if an order includes multiple customer values such as ship-to and bill-to addresses, flattening a sales order by unnesting customer and line-item schemas produces rows of data that might not be useful for processing the order.

To unnest nested data
1. Create the output that you want to unnest in the output schema of a query. Data for unneeded columns or schemas might be more difficult to filter out after the unnesting operation. You can use the Cut command to remove columns or schemas from the top level. To remove nested schemas or columns inside nested schemas, make the nested schema the current schema, and then cut the unneeded columns or nested columns.
2. For each of the nested schemas that you want to unnest, right-click the schema name and choose Unnest. The output of the query (the input to the next step in the data flow) includes the data in the new relationship, as the following diagram shows.

Transforming lower levels of nested data
Nested data included in the input to transforms (with the exception of a Query or XML_Pipeline transform) passes through the transform without being included in the transform's operation. Only the columns at the first level of the input data set are available for subsequent transforms.

To transform values in lower levels of nested schemas
1. Take one of the following actions to obtain the nested data:
• Use a Query transform to unnest the data.
• Use an XML_Pipeline transform to select portions of the nested data.
2. Perform the transformation.
3. Nest the data again to reconstruct the nested relationships.

Related Topics
• Unnesting nested data
• Reference Guide: XML_Pipeline

XML extraction and parsing for columns
In addition to extracting XML message and file data, representing it as NRDM data during transformation, then loading it to an XML message or file, you can also use the software to extract XML data stored in a source table or flat file column, transform it as NRDM data, then load it to a target or flat file column.

More and more database vendors allow you to store XML in one column. The field is usually a varchar, long, or clob. The software's XML handling capability also supports reading from and writing to such fields. The software provides four functions to support extracting from and loading to columns:
• extract_from_xml
• load_to_xml
• long_to_varchar
• varchar_to_long

The extract_from_xml function gets the XML content stored in a single column and builds the corresponding NRDM structure so that the software can transform it. This function takes varchar data only.

To enable extracting and parsing for columns, data from long and clob columns must be converted to varchar before it can be transformed by the software:
• The software converts a clob data type input to varchar if you select the Import unsupported data types as VARCHAR of size option when you create a database datastore connection in the Datastore Editor.
• If your source uses a long data type, use the long_to_varchar function to convert data to varchar.

Note: The software limits the size of the XML supported with these methods to 100K due to the current limitation of its varchar data type. There are plans to lift this restriction in the future.

The function load_to_xml generates XML from a given NRDM structure in the software, then loads the generated XML to a varchar column. If you want a job to convert the output to a long column, use the varchar_to_long function, which takes the output of the load_to_xml function as input.
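The four functions compose into two pipelines, summarized in this schematic (the arrows are illustrative, not function syntax):

Extracting:  long column  --long_to_varchar-->  varchar  --extract_from_xml-->  NRDM structure
Loading:     NRDM structure  --load_to_xml-->  varchar  --varchar_to_long-->  long column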

Sample Scenarios
The following scenarios describe how to use the four functions to extract XML data from a source column and load it into a target column.

Scenario 1
Using the long_to_varchar and extract_from_xml functions to extract XML data from a column with data of the type long.

To extract XML data from a column into the software
1. First, assume you have previously performed the following steps:
a. Imported an Oracle table that contains a column named Content with the data type long, which contains XML data for a purchase order.
b. Imported the XML Schema PO.xsd, which provides the format for the XML data, into the repository.
c. Created a Project, a job, and a data flow for your design.
d. Opened the data flow and dropped the source table with the column named content in the data flow.
2. From this point:
a. Create a query with an output column of data type varchar, and make sure that its size is big enough to hold the XML data.
b. Name this output column content.
c. In the Map section of the query editor, map the source table column to a new output column.
d. In the query editor, open the Function Wizard, select the Conversion function type, then select the long_to_varchar function and configure it by entering its parameters.
long_to_varchar(content, 4000)
The second parameter in this function (4000 in this case) is the maximum size of the XML data stored in the table column. Use this parameter with caution. If the size is not big enough to hold the maximum XML data for the column, the software will truncate the data and cause a runtime error. Conversely, do not enter a number that is too big, which would waste computer memory at runtime.

e. Create a second query that uses the function extract_from_xml to extract the XML data.
To invoke the function extract_from_xml, right-click the current context in the query and choose New Function Call. When the Function Wizard opens, select Conversion and extract_from_xml.
Note: You can only use the extract_from_xml function in a new function call. Otherwise, this function is not displayed in the function wizard.
f. Enter values for the input parameters.
• The first is the XML column name. Enter content, which is the output column in the previous query that holds the XML data.
• The second parameter is the DTD or XML Schema name. Enter the name of the purchase order schema (in this case PO).
• The third parameter is Enable validation. Enter 1 if you want the software to validate the XML with the specified Schema. Enter 0 if you do not.
g. Click Next.
h. For the function, select a column or columns that you want to use on output.
Imagine that this purchase order schema has five top-level elements: orderDate, shipTo, billTo, comment, and items. You can select any number of the top-level columns from an XML schema, which include either scalar or NRDM column data. The return type of the column is defined in the schema. If the function fails due to an error when trying to produce the XML output, the software returns NULL for scalar columns and empty nested tables for NRDM columns. The extract_from_xml function also adds two columns:
• AL_ERROR_NUM — returns error codes: 0 for success and a non-zero integer for failures
• AL_ERROR_MSG — returns an error message if AL_ERROR_NUM is not 0. Returns NULL if AL_ERROR_NUM is 0

Choose one or more of these columns as the appropriate output for the extract_from_xml function.
i. Click Finish.
The software generates the function call in the current context and populates the output schema of the query with the output columns you specified. With the data converted into the NRDM structure, you are ready to do appropriate transformation operations on it. For example, if you want to load the NRDM structure to a target XML file, create an XML file target and connect the second query to it.
Note: If you find that you want to modify the function call, right-click the function call in the second query and choose Modify Function Call.
In this example, to extract XML data from a column of data type long, we created two queries: the first query to convert the data using the long_to_varchar function and the second query to add the extract_from_xml function.
Alternatively, you can use just one query by entering the function expression long_to_varchar directly into the first parameter of the function extract_from_xml. The first parameter of the function extract_from_xml can take a column of data type varchar or an expression that returns data of type varchar. If the data type of the source column is not long but varchar, do not include the function long_to_varchar in your data flow.
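As a sketch of this one-query alternative, the extract_from_xml function call would take the conversion expression as its first input parameter instead of a plain column name (parameter order as described in step f; the 4000 size is carried over from the earlier long_to_varchar example):

First parameter (XML column name):   long_to_varchar(content, 4000)
Second parameter (schema name):      PO
Third parameter (enable validation): 0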

1. 3.0" encoding = "UTF-8" ?>'. Create a query and connect a previous query or source (that has the NRDM structure of a purchase order) to it. in this example. Use the function varchar_to_long to map the input column content to the output column content. Connect this query to a database target. Enter a value for the input parameter. b. create an output column of the data type varchar called content. '<?xml version="1. Make sure the size of the column is big enough to hold the XML data. Open the function wizard from the mapping section of the query and select the Conversion Functions category c. Click Finish. You used the first query to convert an NRDM structure to SAP BusinessObjects Data Services Designer Guide 301 . varchar_to_long(content) 7. In this query. 2.Nested Data XML extraction and parsing for columns 12 function varchar_to_long to convert the value of varchar data type to a value of the data type long. In the mapping area of the Query window. Create another query with output columns matching the columns of the target table. 1. To load XML data into a column of the data type long 1. 4000) In this example. Assume the column is called content and it is of the data type long. Enter values for the input parameters. this function converts the NRDM structure of purchase order PO to XML data and assigns the value to output column content. d. 6. click the category Conversion Functions. The function load_to_xml has seven parameters. From the Mapping area open the function wizard. and then select the function load_to_xml. NULL. 4. 'PO'. a. 5. Like the example using the extract_from_xml function. you used two queries. notice the function expression: load_to_xml(PO. Click Next. The function varchar_to_long takes only one input parameter.

You used the first query to convert an NRDM structure to XML data and to assign the value to a column of varchar data type. You used the second query to convert the varchar data type to long.

You can use just one query if you use the two functions in one expression:
varchar_to_long( load_to_xml(PO, 'PO', 1, '<?xml version="1.0" encoding = "UTF-8" ?>', NULL, 1, 4000) )

If the data type of the column in the target database table that stores the XML data is varchar, there is no need for varchar_to_long in the transformation.

Related Topics
• Reference Guide: Functions and Procedures

Chapter 13 Real-time Jobs

Request-response message processing
The software supports real-time data transformation. Real-time means that the software can receive requests from ERP systems and Web applications and send replies immediately after getting the requested data from a data cache or a second application. You define operations for processing on-demand messages by building real-time jobs in the Designer.

The message passed through a real-time system includes the information required to perform a business transaction. The content of the message can vary:
• It could be a sales order or an invoice processed by an ERP system destined for a data cache.
• It could be an order status request produced by a Web application that requires an answer from a data cache or back-office system.

Two components support request-response message processing:
• Access Server — Listens for messages and routes each message based on message type.
• Real-time job — Performs a predefined set of operations for that message type and creates a response.

Processing might require that additional data be added to the message from a data cache or that the message data be loaded to a data cache.

The Access Server constantly listens for incoming messages. When a message is received, the Access Server routes the message to a waiting process that performs a predefined set of operations for the message type. The Access Server then receives a response for the message and returns it to the originating application.

What is a real-time job?
The Designer allows you to define the processing of real-time messages using a real-time job. You create a different real-time job for each type of message your system can produce.

Real-time versus batch
Like a batch job, a real-time job extracts, transforms, and loads data. Real-time jobs "extract" data from the body of the message received and from any secondary sources used in the job. Each real-time job can extract data from a single message type. It can also extract data from other sources such as tables or files.

The same powerful transformations you can define in batch jobs are available in real-time jobs. However, you might use transforms differently in real-time jobs. For example, you might use branches and logic controls more often than you would in batch jobs. If a customer wants to know when they can pick up their order at your distribution center, you might want to create a CheckOrderStatus job using a look-up function to count order items and then a case transform to provide status in the form of strings: "No items are ready for pickup", "X items in your order are ready for pickup", or "Your order is ready for pickup".

Also in real-time jobs, the software writes data to message targets and secondary targets in parallel. This ensures that each message receives a reply as soon as possible.

Unlike batch jobs, real-time jobs do not execute in response to a schedule or internal trigger; instead, real-time jobs execute as real-time services started through the Administrator. Real-time services then wait for messages from the Access Server. When the Access Server receives a message, it passes the message to a running real-time service designed to process this message type. The real-time service processes the message and returns a response. The real-time service continues to listen and process messages on demand until it receives an instruction to shut down.

Messages
How you design a real-time job depends on what message you want it to process. Typical messages include information required to implement a particular business operation and to produce an appropriate response.

For example, a message could be a sales order to be entered into an ERP system. The message might include the order number, customer information, and the line-item details for the order. The message processing could return confirmation that the order was submitted successfully.

In a second case, suppose a message includes information required to determine order status for a particular order. The message contents might be as simple as the sales order number. The corresponding real-time job might use the input to query the right sources and return the appropriate product information. In this case, the message contains data that can be represented as a single column in a single-row table.
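For illustration, such a request message might be as small as this; the element names here are invented for the sketch, since an actual message follows whatever DTD or XML Schema you import:

<OrderStatusRequest>
   <SalesOrderNo>1234</SalesOrderNo>
</OrderStatusRequest>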

In another case, the message contains data that cannot be represented in a single table. In this sales order, the order header information can be represented by a table and the line items for the order can be represented by a second table. The software represents the header and line item data in the message in a nested relationship. When processing the message, the real-time job processes all of the rows of the nested table for each row of the top-level table. In this sales order, both of the line items are processed for the single row of header information.

The software data flows support the nesting of tables within other tables.

Real-time jobs can send only one row of data in a reply message (message target). However, you can structure message targets so that all data is contained in a single row by nesting tables within columns of a single, top-level table.

Related Topics
• Nested Data

Real-time job examples
These examples provide a high-level description of how real-time jobs address typical real-time scenarios. Later sections describe the actual objects that you would use to construct the logic in the Designer.

Loading transactions into a back-office application
A real-time job can receive a transaction from a Web application and load it to a back-office application (ERP, SCM, legacy). Using a query transform, you can include values from a data cache to supplement the transaction before applying it against the back-office application (such as an ERP system).

Collecting back-office data into a data cache
You can use messages to keep the data cache current. Real-time jobs can receive messages from a back-office application and load them into a data cache or data warehouse.

Retrieving values, data cache, back-office applications
You can create real-time jobs that use values from a data cache to determine whether or not to query the back-office application (such as an ERP system) directly.

Creating real-time jobs
You can create real-time jobs using the same objects as batch jobs (data flows, work flows, conditionals, scripts, while loops, and so on). However, object usage must adhere to a valid real-time job model.

Real-time job models

Single data flow model
With the single data flow model, you create a real-time job using a single data flow in its real-time processing loop. This single data flow must include a single message source and a single message target.

Multiple data flow model
The multiple data flow model allows you to create a real-time job using multiple data flows in its real-time processing loop. By using multiple data flows, you can ensure that data in each message is completely processed in an initial data flow before processing for the next data flows starts. For example, if the data represents 40 items, all 40 must pass through the first data flow to a staging or memory table before passing to a second data flow. This allows you to control and collect all the data in a message at any point in a real-time job for design and troubleshooting purposes.

If you use multiple data flows in a real-time processing loop:
• The first object in the loop must be a data flow. This data flow must have one and only one message source.
• The last object in the loop must be a data flow. This data flow must have a message target.
• Additional data flows cannot have message sources or targets.
• You can add any number of additional data flows to the loop, and you can add them inside any number of work flows.
• All data flows can use input and/or output memory tables to pass data sets on to the next data flow. Memory tables store data in memory while a loop runs. They improve the performance of real-time jobs with multiple data flows.

Using real-time job models

Single data flow model
When you use a single data flow within a real-time processing loop your data flow diagram might look like this:

Notice that the data flow has one message source and one message target.

Multiple data flow model
When you use multiple data flows within a real-time processing loop your data flow diagrams might look like those in the following example scenario, in which Data Services writes data to several targets according to your multiple data flow design.

Example scenario requirements:
Your job must do the following tasks, completing each one before moving on to the next:
• Receive requests about the status of individual orders from a web portal and record each message to a backup flat file
• Perform a query join to find the status of the order and write to a customer database table
• Reply to each message with the query join results

Solution:
First, create a real-time job and add a data flow, a work flow, and another data flow to the real-time processing loop. Second, add a data flow to the work flow. Next, set up the tasks in each data flow:
• The first data flow receives the XML message (using an XML message source) and records the message to the flat file (flat file format target). Meanwhile, this same data flow writes the data into a memory table (table target).

Note: You might want to create a memory table to move data to sequential data flows. For more information, see Memory datastores.
• The second data flow reads the message data from the memory table (table source), performs a join with stored data (table source), and writes the results to a database table (table target) and a new memory table (table target). Notice this data flow has neither a message source nor a message target.
• The last data flow sends the reply. It reads the result of the join in the memory table (table source) and loads the reply (XML message target).

Related Topics
• Designing real-time applications

To create a real-time job
1. In the Designer, create or open an existing project.
2. From the project area, right-click the white space and select New Real-time job from the shortcut menu.
New_RTJob1 appears in the project area. The workspace displays the job's structure, which consists of two markers:
• RT_Process_begins
• Step_ends

These markers represent the beginning and end of a real-time processing loop.
3. In the project area, rename New_RTJob1.
Always add a prefix to job names with their job type. In this case, use the naming convention: RTJOB_JobName.
Although saved real-time jobs are grouped together under the Job tab of the object library, job names may also appear in text editors used to create adapter or Web Services calls. In these cases, a prefix saved with the job name will help you identify it.
4. If you want to create a job with a single data flow:
a. Click the data flow icon in the tool palette.
You can add data flows to either a batch or real-time job. When you place a data flow icon into a job, you are telling Data Services to validate the data flow according to the requirements of the job type (batch or real-time).
b. Click inside the loop.
The boundaries of a loop are indicated by begin and end markers. One message source and one message target are allowed in a real-time processing loop.
c. Connect the begin and end markers to the data flow.

d. Build the data flow including a message source and message target.
e. Add, configure, and connect initialization object(s) and clean-up object(s) as needed.
A real-time job with a single data flow might look like this:
5. If you want to create a job with multiple data flows:
a. Drop and configure a data flow. This data flow must include one message source.
b. After this data flow, drop other objects such as work flows, data flows, scripts, or conditionals from left to right between the first data flow and the end of the real-time processing loop.
Note: Objects at the real-time job level in Designer diagrams must be connected. Connected objects run in sequential order.
To include parallel processing in a real-time job, drop data flows within job-level work flows. Do not connect these secondary-level data flows. These data flows will run in parallel when job processing begins.
c. Just before the end of the loop, drop and configure your last data flow. This data flow must include one message target.
d. Return to the real-time job window and connect all the objects.
e. Open each object and configure it.
f. Add, configure, and connect the initialization and clean-up objects outside the real-time processing loop as needed.
A real-time job with multiple data flows might look like this:

6. After adding and configuring all objects, validate your job.
7. Save the job.
8. Assign test source and target files for the job and execute it in test mode.
9. Using the Administrator, configure a service and service providers for the job and run it in your test environment.

Real-time source and target objects
Real-time jobs must contain a real-time source and/or target object. Those normally available are:

Object: XML message
Description: An XML message structured in a DTD or XML Schema format
Used as a: Source or target
Software access: Directly or through adapters

Object: Outbound message
Description: A real-time message with an application-specific format (not readable by an XML parser)
Used as a: Target
Software access: Through an adapter

You can also use IDoc messages as real-time sources for SAP applications. For more information, see the Supplement for SAP.

Adding sources and targets to real-time jobs is similar to adding them to batch jobs, with the following additions:
• For XML messages, the prerequisite is to import a DTD or XML Schema to define a format; the object library location is the Formats tab.
• For outbound messages, the prerequisite is to define an adapter datastore and import object metadata; the object library location is the Datastores tab, under the adapter datastore.

Related Topics
• To import a DTD or XML Schema format
• Adapter datastores

To view an XML message source or target schema
In the workspace of a real-time job, click the name of an XML message source or XML message target to open its editor.
If the XML message source or target contains nested data, the schema displays nested tables to represent the relationships among the data.

Secondary sources and targets
Real-time jobs can also have secondary sources or targets (see Source and target objects). For example, suppose you are processing a message that contains a sales order from a Web application. The order contains the customer name, but when you apply the order against your ERP system, you need to supply more detailed customer information.

Inside a data flow of a real-time job, you can supplement the message with the customer information to produce the complete document to send to the ERP system. The supplementary information might come from the ERP system itself or from a data cache containing the same information.

Tables and files (including XML files) as sources can provide this supplementary information. The software reads data from secondary sources according to the way you design the data flow. The software loads data to secondary targets in parallel with a target message.

Add secondary sources and targets to data flows in real-time jobs as you would to data flows in batch jobs (see Adding source or target objects to data flows).

Transactional loading of tables
Target tables in real-time jobs support transactional loading, in which the data resulting from the processing of a single data flow can be loaded into multiple tables as a single transaction. No part of the transaction applies if any part fails.

Note: Target tables in batch jobs also support transactional loading. However, use caution when you consider enabling this option for a batch job because it requires the use of memory, which can reduce performance when moving large amounts of data.

You can specify the order in which tables in the transaction are included using the target table editor. This feature supports a scenario in which you have a set of tables with foreign keys that depend on one with primary keys.

You can use transactional loading only when all the targets in a data flow are in the same datastore. If the data flow loads tables in more than one datastore, targets in each datastore load independently. While multiple targets in one datastore may be included in one transaction, the targets in another datastore must be included in another transaction.

You can specify the same transaction order or distinct transaction orders for all targets to be included in the same transaction. If you specify the same transaction order for all targets in the same datastore, the tables are still included in the same transaction but are loaded together. Loading is committed after all tables in the transaction finish loading.

If you specify distinct transaction orders for all targets in the same datastore, the transaction orders indicate the loading orders of the tables. The table with the smallest transaction order is loaded first, and so on, until the table with the largest transaction order is loaded last. No two tables are loaded at the same time. Loading is committed when the last table finishes loading.

Design tips for data flows in real-time jobs
Keep in mind the following when you are designing data flows:
• If you include a table in a join with a real-time source, the software includes the data set from the real-time source as the outer loop of the join. If more than one supplementary source is included in the join, you can control which table is included in the next outer-most loop of the join using the join ranks for the tables.
• In real-time jobs, do not cache data from secondary sources unless the data is static. The data will be read when the real-time job starts and will not be updated while the job is running.
• If no rows are passed to the XML target, the real-time job returns an empty response to the Access Server. For example, if a request comes in for a product number that does not exist, your job might be designed in such a way that no data passes to the reply message. You might want to provide appropriate instructions to your user (exception handling in your job) to account for this type of scenario.

• If more than one row passes to the XML target, the target reads the first row and discards the other rows. To avoid this issue, use your knowledge of the software's Nested Relational Data Model (NRDM) and structure your message source and target formats so that one "row" equals one message. With NRDM, you can structure any amount of data into a single "row" because columns in tables can contain other tables.
• Recovery mechanisms are not supported in real-time jobs.

Related Topics
• Reference Guide: Data Services Objects, Real-time job
• Nested Data

Testing real-time jobs

Executing a real-time job in test mode
You can test real-time job designs without configuring the job as a service associated with an Access Server. In test mode, you can execute a real-time job using a sample source message from a file to determine if the software produces the expected target message. Test mode is always enabled for real-time jobs.

To specify a sample XML message and target test file
1. In the XML message source and target editors, enter a file name in the XML test file box.
Enter the full path name for the source file that contains your sample data. Use paths for both test files relative to the computer that runs the Job Server for the current repository.
2. Execute the job.
The software reads data from the source test file and loads it into the target test file.

Using View Data
To ensure that your design returns the results you expect, execute your job using View Data. With View Data, you can capture a sample of your output data to ensure your design is working.

Related Topics
• Design and Debug

Using an XML file target
You can use an "XML file target" to capture the message produced by a data flow while allowing the message to be returned to the Access Server.

Just like an XML message, you define an XML file by importing a DTD or XML Schema for the file, then dragging the format into the data flow definition. Unlike XML messages, you can include XML files as sources or targets in batch and real-time jobs.

To use a file to capture output from steps in a real-time job
1. In the Formats tab of the object library, drag the DTD or XML Schema into a data flow of a real-time job.
A menu prompts you for the function of the file.
2. Choose Make XML File Target.
The XML file target appears in the workspace.
3. In the file editor, specify the location to which the software writes data.
Enter a file name relative to the computer running the Job Server.

4. Connect the output of the step in the data flow that you want to capture to the input of the file.

Building blocks for real-time jobs

Supplementing message data
The data included in messages from real-time sources might not map exactly to your requirements for processing or storing the information. If not, you can define steps in the real-time job to supplement the message information.

One technique for supplementing the data in a real-time source includes these steps:
1. Include a table or file as a source.
In addition to the real-time source, include the files or tables from which you require supplementary information.
2. Use a query to extract the necessary data from the table or file.
3. Use the data in the real-time source to find the necessary supplementary data.
You can include a join expression in the query to extract the specific values required from the supplementary source.

The Where clause joins the two inputs, resulting in output for only the sales document and line items included in the input from the application.

Be careful to use data in the join that is guaranteed to return a value. If no value returns from the join, the query produces no rows and the message returns to the Access Server empty. If you cannot guarantee that a value returns, consider these alternatives:
• Lookup function call — Returns a default value if no match is found
• Outer join — Always returns a value, even if no match is found

To supplement message data
In this example, a request message includes sales order information and its reply message returns order status. The business logic uses the customer number and priority rating to determine the level of status to return. The message includes only the customer name and the order number. A real-time job is then defined to retrieve the customer number and rating from other sources before determining the order status.

1. Include the real-time source in the real-time job.
2. Include the supplementary source in the real-time job.
This source could be a table or file. In this example, the supplementary information required doesn't change very often, so it is reasonable to extract the data from a data cache rather than going to an ERP system directly.
3. Join the sources.
In a query transform, construct a join on the customer name:
Message.CustName = Cust_Status.CustName
You can construct the output to include only the columns that the real-time job needs to determine order status.
4. Complete the real-time job to determine order status.
The example shown here determines order status in one of two methods based on the customer status value. Order status for the highest ranked customers is determined directly from the ERP. Order status for other customers is determined from a data cache of sales order information.
The logic can be arranged in a single or multiple data flows. The illustration below shows a single data flow model. Both branches return order status for each line item in the order. The data flow merges the results and constructs the response. The next section describes how to design branch paths in a data flow.
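If the join in step 3 could not be guaranteed to find a match, the lookup alternative mentioned earlier could supply a default value instead, so a row always passes to the target. A minimal sketch of such a call; the datastore, table, and column names (DS.OWNER.CUST_STATUS, CUST_NO, CUST_NAME) are hypothetical, and you should confirm the exact lookup signature and cache options in the Reference Guide:

lookup(DS.OWNER.CUST_STATUS, CUST_NO, 'none', 'PRE_LOAD_CACHE', CUST_NAME, Message.CustName)

Here 'none' is the default returned when no matching row exists in the supplementary table.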

Branching data flow based on a data cache value
One of the most powerful things you can do with a real-time job is to design logic that determines whether responses should be generated from a data cache or if they must be generated from data in a back-office application (ERP, SCM, CRM).

One technique for constructing this logic includes these steps:
1. Determine the rule for when to access the data cache and when to access the back-office application.
2. Compare data from the real-time source with the rule.
3. Define each path that could result from the outcome.
You might need to consider the case where the rule indicates back-office application access, but the system is not currently available.
4. Merge the results from each path into a single data set.
5. Route the single result to the real-time target.
You might need to consider error-checking and exception-handling to make sure that a value passes to the target.

If the target receives an empty set, the real-time job returns an empty response (begin and end XML tags only) to the Access Server.

To branch a data flow based on a rule
1. Create a real-time job and drop a data flow inside it.
2. Add the XML source in the data flow.
See To import a DTD or XML Schema format to define the format of the data in the XML message.
3. Determine the values you want to return from the data flow.
This example describes a section of a real-time job that processes a new sales order. The section is responsible for checking the inventory available of the ordered products—it answers the question, "is there enough inventory on hand to fill this order?"
The rule controlling access to the back-office application indicates that the inventory (Inv) must be more than a pre-determined value (IMargin) greater than the ordered quantity (Qty) to consider the data cached inventory value acceptable. The software makes a comparison for each line item in the order.
The XML source contains the entire sales order, yet the data flow compares values for line items inside the sales order. The XML target that ultimately returns a response to the Access Server requires a single row at the top-most level. Because this data flow needs to be able to determine inventory values for multiple line items, the structure of the

output requires the inventory information to be nested. In addition, the output needs to include some way to indicate whether the inventory is or is not available. The input is already nested under the sales order; the output can use the same convention.
4. Add the comparison table from the data cache to the data flow as a source.
5. Connect the output of the XML source to the input of a query and map the appropriate columns to the output.
You can drag all of the columns and nested tables from the input to the output, then delete any unneeded columns or nested tables from the output.
6. Construct the query so you extract the expected data from the inventory data cache table.
Without nested data, you would add a join expression in the WHERE clause of the query. Because the comparison occurs between a nested

table and another top-level table, you have to define the join more carefully:
• Change context to the LineItem table
• Include the Inventory table in the FROM clause in this context (the LineItem table is already in the From list)
• Define an outer join with the Inventory table as the inner table
• Add the join expression in the WHERE clause in this context
In this example, you can assume that there will always be exactly one value in the Inventory table for each line item and can therefore leave out the outer join definition.
7. Include the values from the Inventory table that you need to make the comparison. Drag the Inv and IMargin columns from the input to the LineItem table.
8. Split the output of the query based on the inventory comparison. Add two queries to the data flow:
• Query to process valid inventory values from the data cache
The WHERE clause at the nested level (LineItem) of the query ensures that the quantities specified in the incoming line item rows are appropriately accounted for by inventory values from the data cache.

• Query to retrieve inventory values from the ERP
The WHERE clause at the nested level (LineItem) of the query ensures that the quantities specified in the incoming line item rows are not accounted for by inventory values from the data cache. The inventory values in the ERP inventory table are then substituted for the data cache inventory values in the output.
There are several ways to return values from the ERP. For example, you could use a lookup function or a join on the specific table in the ERP system. This example uses a join so that the processing can be performed by the ERP system rather than the software. As in the previous join, if you cannot guarantee that a value will be returned by the join, make sure to define an outer join so that the line item row is not lost.
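A plausible form of the two nested-level WHERE clauses, given the rule stated earlier (the cached value is acceptable only when Inv exceeds Qty by more than IMargin); the exact column paths depend on how Inv and IMargin were mapped into the LineItem schema in step 7:

CacheOK branch:   LineItem.Qty + LineItem.IMargin < LineItem.Inv
CheckERP branch:  LineItem.Qty + LineItem.IMargin >= LineItem.Inv

Because the two conditions are complementary, every incoming line item flows down exactly one branch, which is what allows the Merge transform in the following steps to reassemble a complete response.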

The goal from this section of the data flow was an answer to the question, "is there enough inventory to fill this order?" Each branch returns an inventory value that can then be compared to the order quantity to answer the question. To complete the order processing, the available inventory value can be useful if customers want to change their order quantity to match the inventory available.
9. Change the mapping of the Inv column in each of the branches to show available inventory values only if they are less than the order quantity.
The "CacheOK" branch of this example always returns line-item rows that include enough inventory to account for the order quantity, so you can remove the inventory value from the output of these rows. The "CheckERP" branch can return line item rows without enough inventory to account for the order quantity; show inventory levels only if less than the order quantity.
• For data cache OK: Inv maps from 'NULL'
• For CheckERP: Inv maps from ERP_Inventory.INV - ERP_Inventory.IMARGIN
10. Merge the branches into one response.
The Merge transform combines the results of the two branches into a single data set. Both branches of the data flow include the same column and nested tables.
11. Complete the processing of the message.
Add the XML target to the output of the Merge transform.

Calling application functions
A real-time job can use application functions to operate on data. You can include tables as input or output parameters to the function.

Application functions require input values for some parameters and some can be left unspecified. You must determine the requirements of the function to prepare the appropriate inputs.

To make up the input, you can specify the top-level table, top-level columns, and any tables nested one level down relative to the tables listed in the FROM clause of the context calling the function. If the application function includes a structure as an input parameter, you must specify the individual columns that make up the structure.

A data flow may contain several steps that call a function, retrieve results, then shape the results into the columns and tables required for a response.

Designing real-time applications
The software provides a reliable and low-impact connection between a Web application and back-office applications such as an enterprise resource planning (ERP) system. Because each implementation of an ERP system is different and because the software includes versatile decision support logic, you have many opportunities to design a system that meets your internal and external information and resource needs.

Reducing queries requiring back-office application access
This section provides a collection of recommendations and considerations that can help reduce the time you spend experimenting in your development cycles.

The information you allow your customers to access through your Web application can impact the performance that your customers see on the Web. You can maximize performance through your Web application design decisions. In particular, you can structure your application to reduce the number of queries that require direct back-office (ERP, SCM, Legacy) application access.

For example, if your ERP system supports a complicated pricing structure that includes dependencies such as customer priority, product availability, or order quantity, you might not be able to depend on values from a data cache for pricing information. The alternative might be to request pricing information directly from the ERP system. ERP system access is likely to be much slower than direct database access, reducing the performance your customer experiences with your Web application.

To reduce the impact of queries requiring direct ERP system access, modify your Web application. Using the pricing example, design the application to avoid displaying price information along with standard product information and instead show pricing only after the customer has chosen a specific product and quantity. These techniques are evident in the way airline reservations systems provide pricing information—a quote for a specific flight—contrasted with other retail Web sites that show pricing for every item displayed as part of product catalogs.

Messages from real-time jobs to adapter instances
If a real-time job will send a message to an adapter instance, refer to the adapter documentation to decide if you need to create a message function call or an outbound message.
• Message function calls allow the adapter instance to collect requests and send replies.

• Outbound message objects can only send outbound messages. They cannot be used to receive messages.

Using these objects in real-time jobs is the same as in batch jobs. For information on importing message function calls and outbound messages, see Importing metadata through an adapter datastore.

Real-time service invoked by an adapter instance
This section uses terms consistent with Java programming. (Please see your adapter SDK documentation for more information about terms such as operation instance and information resource.)

When an operation instance (in an adapter) gets a message from an information resource, it translates it to XML (if necessary), then sends the XML message to a real-time service. In the real-time service, the message from the adapter is represented by a DTD or XML Schema object (stored in the Formats tab of the object library). The DTD or XML Schema represents the data schema for the information resource.

The real-time service processes the message from the information resource (relayed by the adapter) and returns a response. In the example data flow below, the Query processes a message (here represented by "Employment") received from a source (an adapter instance), and returns the response to a target (again, an adapter instance). See To modify output schema contents.

Chapter 14 Embedded Data Flows

The software provides an easy-to-use option to create embedded data flows.

Overview of embedded data flows
An embedded data flow is a data flow that is called from inside another data flow. Data passes into or out of the embedded data flow from the parent flow through a single source or target. The embedded data flow can contain any number of sources or targets, but only one input or one output can pass data to or from the parent data flow.

You can create the following types of embedded data flows:
• One input: add an embedded data flow at the end of a data flow
• One output: add an embedded data flow at the beginning of a data flow
• No input or output: replicate an existing data flow

An embedded data flow is a design aid that has no effect on job execution. When the software executes the parent data flow, it expands any embedded data flows, optimizes the parent data flow, then executes it.

Use embedded data flows to:
• Simplify data flow display. Group sections of a data flow in embedded data flows to allow clearer layout and documentation.
• Reuse data flow logic. Save logical sections of a data flow so you can use the exact logic in other data flows, or provide an easy way to replicate the logic and modify it for other flows.
• Debug data flow logic. Replicate sections of a data flow as embedded data flows so you can execute them independently.

Example of when to use embedded data flows
In this example, a data flow uses a single source to load three different target systems. The Case transform sends each row from the source to different transforms that process it to get a unique target output. You can simplify the parent data flow by using embedded data flows for the three different cases.

Creating embedded data flows
There are two ways to create embedded data flows:

• Select objects within a data flow, right-click, and select Make Embedded Data Flow.
• Drag a complete and fully validated data flow from the object library into an open data flow in the workspace. Then:
• Open the data flow you just added.
• Right-click one object you want to use as an input or as an output port and select Make Port for that object.
The software marks the object you select as the connection point for this embedded data flow.
Note: You can specify only one port, which means that the embedded data flow can appear only at the beginning or at the end of the parent data flow.

Using the Make Embedded Data Flow option

To create an embedded data flow
1. Select objects from an open data flow using one of the following methods:
• Click the white space and drag the rectangle around the objects
• CTRL-click each object
Ensure that the set of objects you select are:
• All connected to each other
• Connected to other objects according to the type of embedded data flow you want to create, such as one input, one output, or no input or output
2. Right-click and select Make Embedded Data Flow.

The Create Embedded Data Flow window opens.
3. Name the embedded data flow using the convention EDF_EDFName, for example EDF_ERP.
If Replace objects in original data flow is selected, the original data flow becomes a parent data flow, which has a call to the new embedded data flow.
If you deselect the Replace objects in original data flow box, the software will not make a change in the original data flow. You can use an embedded data flow created without replacement as a stand-alone data flow for troubleshooting.
4. Click OK.
The software saves the new embedded data flow object to the repository and displays it in the object library under the Data Flows tab.

The embedded data flow appears in the new parent data flow, with the embedded data flow connected to the parent by one input object.
5. Click the name of the embedded data flow to open it.
6. Notice that the software created a new object, EDF_ERP_Input, which is the input port that connects this embedded data flow to the parent data flow.
When you use the Make Embedded Data Flow option, the software automatically creates an input or output object based on the object that is connected to the embedded data flow when it is created. For example, if an embedded data flow has an output connection, the embedded data flow will include a target XML file object labeled EDFName_Output.
The naming conventions for each embedded data flow type are:
• One input: EDFName_Input

• One output: EDFName_Output
• No input or output: the software creates an embedded data flow without an input or output object

Creating embedded data flows from existing flows
To call an existing data flow from inside another data flow, put the data flow inside the parent data flow, then mark which source or target to use to pass data between the parent and the embedded data flows.

To create an embedded data flow out of an existing data flow
1. Drag an existing valid data flow from the object library into a data flow that is open in the workspace.
2. Consider renaming the flow using the EDF_EDFName naming convention.
The embedded data flow appears without any arrowheads (ports) in the workspace.
3. Open the embedded data flow.
4. Right-click a source or target object (file or table) and select Make Port.
Note: Ensure that you specify only one input or output port.
Different types of embedded data flow ports are indicated by directional markings on the embedded data flow icon.

Using embedded data flows
When you create and configure an embedded data flow using the Make Embedded Data Flow option, the software creates a new input or output XML file and saves the schema in the repository as an XML Schema. You can reuse an embedded data flow by dragging it from the Data Flow tab of the object library into other data flows. To save mapping time, you might want to use the Update Schema option or the Match Schema option.

The following example scenario uses both options:
• Create data flow 1.
• Select objects in data flow 1, and create embedded data flow 1 so that parent data flow 1 calls embedded data flow 1.
• Create data flow 2 and data flow 3 and add embedded data flow 1 to both of them.
• Go back to data flow 1. Change the schema of the object preceding embedded data flow 1 and use the Update Schema option with embedded data flow 1. It updates the schema of embedded data flow 1 in the repository.
• Now the schemas in data flow 2 and data flow 3 that are feeding into embedded data flow 1 will be different from the schema the embedded data flow expects. Use the Match Schema option for embedded data flow 1 in both data flow 2 and data flow 3 to resolve the mismatches at runtime. The Match Schema option only affects settings in the current data flow.

The following sections describe the use of the Update Schema and Match Schema options in more detail.

Updating Schemas
The software provides an option to update an input schema of an embedded data flow. This option updates the schema of an embedded data flow's input object with the schema of the preceding object in the parent data flow. All occurrences of the embedded data flow update when you use this option.

To update a schema
1. Open the embedded data flow's parent data flow.
2. Right-click the embedded data flow object and select Update Schema.
For example, in the data flow shown below, the software copies the schema of Case to the input of EDF_ERP.

Matching data between parent and embedded data flow
The schema of an embedded data flow's input object can match the schema of the preceding object in the parent data flow by name or position. A match by position is the default. The Match Schema option only affects settings for the current data flow.

To specify how schemas should be matched
1. Open the embedded data flow's parent data flow.
2. Right-click the embedded data flow object and select Match Schema > By Name or Match Schema > By Position.

Data Services also allows the schema of the preceding object in the parent data flow to have more or fewer columns than the embedded data flow. The embedded data flow ignores additional columns and reads missing columns as NULL. Columns in both schemas must have identical or convertible data types. See the section on "Type conversion" in the Reference Guide for more information.

Deleting embedded data flow objects
You can delete embedded data flow ports, or remove entire embedded data flows.

To remove a port
Right-click the input or output object within the embedded data flow and deselect Make Port. Data Services removes the connection to the parent object.
Note: You cannot remove a port simply by deleting the connection in the parent flow.

To remove an embedded data flow
Select it from the open parent data flow and choose Delete from the right-click menu or edit menu.
If you delete embedded data flows from the object library, the embedded data flow icon appears with a red circle-slash flag in the parent data flow. Delete these defunct embedded data flow objects from the parent data flows.

Separately testing an embedded data flow
Embedded data flows can be tested by running them separately as regular data flows.
1. Specify an XML file for the input port or output port.

transformed.Embedded Data Flows Creating embedded data flows 14 When you use the Make Embedded Data Flow option. Auditing statistics about the data read from sources. 2. For example. Deleted connection to the parent data flow while the Make Port option. click the name of the XML file to open its source or target editor to specify a file name. in the embedded data flow. DF1 data flow calls EDF1 embedded data flow which calls EDF2. and rules about the audit statistics to verify the expected data is processed. Embedding the same data flow at any level within itself. Put the embedded data flow into a job. and embedded data flows can only have one. You can however have unlimited embedding levels. Transforms with splitters (such as the Case transform) specified as the output port object because a splitter produces multiple outputs. and loaded into targets. Trapped defunct data flows. an input or output XML file object is created and then (optional) connected to the preceding or succeeding object in the parent data flow. Variables and parameters declared in the embedded data flow that are not also declared in the parent data flow. remains selected. 3. To test the XML file without a parent data flow. You can also use the following features to test embedded data flows: • • View Data to sample data passed into an embedded data flow. Related Topics • Reference Guide: XML file • Design and Debug Troubleshooting embedded data flows The following situations produce errors: • • • • Both an input port and output port are specified in an embedded data flow. • • SAP BusinessObjects Data Services Designer Guide 345 . Run the job.

Related Topics
• To remove an embedded data flow
• To remove a port

Chapter 15 Variables and Parameters

This section contains information about the following:
• Adding and defining local and global variables for jobs
• Using environment variables
• Using substitution parameters and configurations

Overview of variables and parameters

You can increase the flexibility and reusability of work flows and data flows by using local and global variables when you design your jobs. Variables are symbolic placeholders for values. The data type of a variable can be any supported by the software such as an integer, decimal, date, or text string.

You can use variables in expressions to facilitate decision-making or data manipulation (using arithmetic or character substitution). For example, a variable can be used in a LOOP or IF statement to check a variable's value to decide which step to perform:

    If $amount_owed > 0 print('$invoice.doc');

If you define variables in a job or work flow, the software typically uses them in a script, catch, or conditional process.

You can use variables inside data flows. For example, use them in a custom function or in the WHERE clause of a query transform.

In the software, local variables are restricted to the object in which they are created (job or work flow). You must use parameters to pass local variables to child objects (work flows and data flows).

Global variables are restricted to the job in which they are created; however, they do not require parameters to be passed to work flows and data flows.

Note: If you have work flows that are running in parallel, the global variables are not assigned.

Parameters are expressions that pass to a work flow or data flow when they are called in a job.

You can set values for local or global variables in script objects. You can also set global variable values using external job, execution, or schedule properties.

Using global variables provides you with maximum flexibility. For example, during production you can change values for default global variables at runtime from a job's schedule or "SOAP" call without having to open a job in the Designer.

Variables can be used as file names for:
• Flat file sources and targets
• XML file sources and targets
• XML message targets (executed in the Designer in test mode)
• IDoc file sources and targets (in an SAP application environment)
• IDoc message sources and targets (SAP application environment)

Related Topics
• Administrator Guide: Support for Web Services

The Variables and Parameters window

The software displays the variables and parameters defined for an object in the "Variables and Parameters" window.

To view the variables and parameters in each job, work flow, or data flow
1. In the Tools menu, select Variables.
The "Variables and Parameters" window opens.
2. From the object library, double-click an object, or from the project area click an object to open it in the workspace.
The Context box in the window changes to show the object you are viewing. If there is no object selected, the window does not indicate a context.

The Variables and Parameters window contains two tabs.

The Definitions tab allows you to create and view variables (name and data type) and parameters (name, data type, and parameter type) for an object type. Global variables can only be set at the job level. Local variables and parameters can only be set at the work flow and data flow level.

The following table lists what type of variables and parameters you can create using the Variables and Parameters window when you select different objects.

Job — Local variables: used by a script or conditional in the job. Global variables: used by any object in the job.
Work flow — Local variables: used by this work flow or passed down to other work flows or data flows using a parameter. Parameters: used by parent objects to pass local variables. Work flows may also return variables or parameters to parent objects.
Data flow — Parameters: used by a WHERE clause, column mapping, or a function in the data flow. Data flows cannot return output values.

The Calls tab allows you to view the name of each parameter defined for all objects in a parent object's definition. You can also enter values for each parameter.
For the input parameter type, values in the Calls tab can be constants, variables, or another parameter.
For the output or input/output parameter type, values in the Calls tab can be variables or parameters.
Values in the Calls tab must also use:
• The same data type as the variable if they are placed inside an input or input/output parameter type, and a compatible data type if they are placed inside an output parameter type.
• Scripting language rules and syntax

Using local variables and parameters

To pass a local variable to another object, define the local variable, then from the calling object, create a parameter and map the parameter to the local variable by entering a parameter value. For example, to use a local variable inside a data flow, define the variable in a parent work flow and then pass the value of the variable as a parameter of the data flow.

The following illustration shows the relationship between an open work flow called DeltaFacts, the Context box in the Variables and Parameters window, and the content in the Definition and Calls tabs.

Parameters

Parameters can be defined to:
• Pass their values into and out of work flows
• Pass their values into data flows

Each parameter is assigned a type: input, output, or input/output. The value passed by the parameter can be used by any object called by the work flow or data flow.

Note: You can also create local variables and parameters for use in custom functions.

Related Topics
• Reference Guide: Custom functions

Passing values into data flows

You can use a value passed as a parameter into a data flow to control the data transformed in the data flow. For example, the data flow DF_PartFlow processes daily inventory values. It can process all of the part numbers in use or a range of part numbers based on external requirements, such as the range of numbers processed most recently.
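A script in the calling work flow might maintain such a range by computing new boundary values before each call. The following is only a sketch: it assumes a hypothetical variable $StartRange alongside the $EndRange and $SizeOfSet values described below.

    # Advance the part-number window for the next run.
    $StartRange = $EndRange + 1;
    $EndRange = $StartRange + $SizeOfSet - 1;

A query transform inside the data flow could then filter rows with a WHERE clause such as PART_NO >= $StartRange AND PART_NO <= $EndRange.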

If the work flow that calls DF_PartFlow records the range of numbers processed, it can pass the end value of the range $EndRange as a parameter to the data flow to indicate the start value of the range to process next. The software can calculate a new end value based on a stored number of parts to process each time, such as $SizeOfSet, and pass that value to the data flow as the end value. A query transform in the data flow uses the parameters passed in to filter the part numbers extracted from the source.
The data flow could be used by multiple calls contained in one or more work flows to perform the same task on different part number ranges by specifying different parameters for the particular calls.

To define a local variable
1. Click the name of the job or work flow in the project area or workspace, or double-click one from the object library.
2. Click Tools > Variables.
The "Variables and Parameters" window appears.
3. From the Definitions tab, select Variables.
4. Right-click and select Insert.
A new variable appears (for example, $NewVariable0). A focus box appears around the name cell and the cursor shape changes to an arrow with a yellow pencil.
5. To edit the name of the new variable, click the name cell.
The name can include alphanumeric characters or underscores (_), but cannot contain blank spaces. Always begin the name with a dollar sign ($).

6. Click the data type cell for the new variable and select the appropriate data type from the drop-down list.
7. Close the "Variables and Parameters" window.

Defining parameters

There are two steps for setting up a parameter for a work flow or data flow:
• Add the parameter definition to the flow.
• Set the value of the parameter in the flow call.

To add the parameter to the flow definition
1. Click the name of the work flow or data flow.
2. Click Tools > Variables.
The "Variables and Parameters" window appears.
3. Go to the Definition tab.
4. Select Parameters.
5. Right-click and select Insert.
A new parameter appears (for example, $NewParameter0). A focus box appears and the cursor shape changes to an arrow with a yellow pencil.
6. To edit the name of the new parameter, click the name cell.
The name can include alphanumeric characters or underscores (_), but cannot contain blank spaces. Always begin the name with a dollar sign ($).
7. Click the data type cell for the new parameter and select the appropriate data type from the drop-down list.
If the parameter is an input or input/output parameter, it must have the same data type as the variable; if the parameter is an output parameter type, it must have a compatible data type.
8. Click the parameter type cell and select the parameter type (input, output, or input/output).
9. Close the "Variables and Parameters" window.

To set the value of the parameter in the flow call
1. Open the calling job, work flow, or data flow.
2. Click Tools > Variables to open the "Variables and Parameters" window.
3. Select the Calls tab.
The Calls tab shows all the objects that are called from the open job, work flow, or data flow.
4. Click the Argument Value cell.
A focus box appears and the cursor shape changes to an arrow with a yellow pencil.
5. Enter the expression the parameter will pass in the cell.
If the parameter type is input, then its value can be an expression that contains a constant (for example, 0 or 'string1'), a variable, or another parameter (for example, $startID or $parm1).
If the parameter type is output or input/output, then the value must be a variable or parameter. The value cannot be a constant because, by definition, the value of an output or input/output parameter can be modified by any object within the flow.
To indicate special values, use the following syntax:

Value type    Special syntax
Variable      $variable_name
String        'string'

Using global variables

Global variables are global within a job. Setting parameters is not necessary when you use global variables. However, once you use a name for a global variable in a job, that name becomes reserved for the job. Global variables are exclusive within the context of the job in which they are created.

Creating global variables
Define variables in the Variables and Parameters window.

To create a global variable
1. Click the name of a job in the project area or double-click a job from the object library.
2. Click Tools > Variables.
The "Variables and Parameters" window appears.
3. From the Definitions tab, select Global Variables.
4. Right-click Global Variables and select Insert.
A new global variable appears (for example, $NewJobGlobalVariable0). A focus box appears and the cursor shape changes to an arrow with a yellow pencil.
5. To edit the name of the new variable, click the name cell.
The name can include alphanumeric characters or underscores (_), but cannot contain blank spaces. Always begin the name with a dollar sign ($).
6. Click the data type cell for the new variable and select the appropriate data type from the drop-down list.
7. Close the "Variables and Parameters" window.

Viewing global variables
Global variables, defined in a job, are visible to those objects relative to that job. A global variable defined in one job is not available for modification or viewing from another job. You can view global variables from the Variables and Parameters window (with an open job in the workspace) or from the Properties dialog of a selected job.


To view global variables in a job from the Properties dialog
1. In the object library, select the Jobs tab.
2. Right-click the job whose global variables you want to view and select Properties.
3. Click the Global Variable tab.
Global variables appear on this tab.

Setting global variable values
In addition to setting a variable inside a job using an initialization script, you can set and maintain global variable values outside a job. Values set outside a job are processed the same way as those set in an initialization script. However, if you set a value for the same variable both inside and outside a job, the internal value will override the external job value.
Values for global variables can be set outside a job:
• As a job property
• As an execution or schedule property

Global variables without defined values are also allowed. They are read as NULL. All values defined as job properties are shown in the Properties and the Execution Properties dialogs of the Designer and in the Execution Options and Schedule pages of the Administrator. By setting values outside a job, you can rely on these dialogs for viewing values set for global variables and easily edit values when testing or scheduling a job.
Note:

You cannot pass global variables as command line arguments for real-time jobs.

To set a global variable value as a job property
1. Right-click a job in the object library or project area.
2. Click Properties.
3. Click the Global Variable tab.
All global variables created in the job appear.
4. Enter values for the global variables in this job.
You can use any statement used in a script with this option.
5. Click OK.
The software saves values in the repository as job properties.
You can also view and edit these default values in the Execution Properties dialog of the Designer and in the Execution Options and Schedule pages of the Administrator. This allows you to override job property values at run-time.
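For example, because a job property value can be any legal scripting-language statement, a date and a region default might be entered as follows (the variable names and values here are purely illustrative):

    $G_Load_Date = sysdate();
    $G_Region = 'EMEA';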
Related Topics

• Reference Guide: Data Services Scripting Language

To set a global variable value as an execution property
1. Execute a job from the Designer, or execute or schedule a batch job from the Administrator.
Note:

For testing purposes, you can execute real-time jobs from the Designer in test mode. Make sure to set the execution properties for a real-time job.
2. View the global variables in the job and their default values (if available).
3. Edit values for global variables as desired.
4. If you are using the Designer, click OK. If you are using the Administrator, click Execute or Schedule.
The job runs using the values you enter. Values entered as execution properties are not saved. Values entered as schedule properties are saved but can only be accessed from within the Administrator.


Automatic ranking of global variable values in a job
Using the methods described in the previous section, if you enter different values for a single global variable, the software selects the highest ranking value for use in the job. A value entered as a job property has the lowest rank. A value defined inside a job has the highest rank.
• If you set a global variable value as both a job and an execution property, the execution property value overrides the job property value and becomes the default value for the current job run. You cannot save execution property global variable values.
For example, assume that a job, JOB_Test1, has three global variables declared: $YEAR, $MONTH, and $DAY. Variable $YEAR is set as a job property with a value of 2003. For the first job run, you set variables $MONTH and $DAY as execution properties to values 'JANUARY' and 31 respectively. The software executes a list of statements which includes default values for JOB_Test1:

    $YEAR=2003;
    $MONTH='JANUARY';
    $DAY=31;

For the second job run, if you set variables $YEAR and $MONTH as execution properties to values 2002 and 'JANUARY' respectively, then the statement $YEAR=2002 will replace $YEAR=2003. The software executes the following list of statements:

    $YEAR=2002;
    $MONTH='JANUARY';
Note:

In this scenario, $DAY is not defined and the software reads it as NULL. You set $DAY to 31 during the first job run; however, execution properties for global variable values are not saved.
• If you set a global variable value for both a job property and a schedule property, the schedule property value overrides the job property value and becomes the external, default value for the current job run.
The software saves schedule property values in the repository. However, these values are only associated with a job schedule, not the job itself.

Consequently, these values are viewed and edited from within the Administrator.
• A global variable value defined inside a job always overrides any external values. However, the override does not occur until the software attempts to apply the external values to the job being processed with the internal value. Up until that point, the software processes execution, schedule, or job property values as default values.
For example, suppose you have a job called JOB_Test2 that has three work flows, each containing a data flow. The second data flow is inside a work flow that is preceded by a script in which $MONTH is defined as 'MAY'. The first and third data flows have the same global variable with no value defined. The execution property $MONTH = 'APRIL' is the global variable value.
In this scenario, 'APRIL' becomes the default value for the job. 'APRIL' remains the value for the global variable until it encounters the other value for the same variable in the second work flow. Since the value in the script is inside the job, 'MAY' overrides 'APRIL' for the variable $MONTH. The software continues processing the job with this new value.


Advantages to setting values outside a job
While you can set values inside jobs, there are advantages to defining values for global variables outside a job. For example, values defined as job properties are shown in the Properties and the Execution Properties dialogs of the Designer and in the Execution Options and Schedule pages of the Administrator. By setting values outside a job, you can rely on these dialogs for viewing all global variables and their values. You can also easily edit them for testing and scheduling. In the Administrator, you can set global variable values when creating or editing a schedule without opening the Designer. For example, use global variables as file names and start and end dates.


Local and global variable rules
When defining local or global variables, consider rules for:
• Naming
• Replicating jobs and work flows
• Importing and exporting

Naming
• Local and global variables must have unique names within their job context.
• Any name modification to a global variable can only be performed at the job level.

Replicating jobs and work flows
• When you replicate all objects, the local and global variables defined in that job context are also replicated.
• When you replicate a data flow or work flow, all parameters and local and global variables are also replicated. However, you must validate these local and global variables within the job context in which they were created. If you attempt to validate a data flow or work flow containing global variables without a job, Data Services reports an error.

Importing and exporting
• When you export a job object, you also export all local and global variables defined for that job.
• When you export a lower-level object (such as a data flow) without the parent job, the global variable is not exported. Only the call to that global variable is exported. If you use this object in another job without defining the global variable in the new job, a validation error will occur.


Environment variables
You can use system-environment variables inside jobs, work flows, or data flows. The get_env, set_env, and is_set_env functions provide access to underlying operating system variables that behave as the operating system allows. You can temporarily set the value of an environment variable inside a job, work flow or data flow. Once set, the value is visible to all objects in that job. Use the get_env, set_env, and is_set_env functions to set, retrieve, and test the values of environment variables.
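For example, a script can test for an environment variable and fall back to a default when it is not set. This is only a sketch: the variable name and paths are hypothetical, and the exact return value of is_set_env should be verified against the Reference Guide.

    # Temporarily set an environment variable for this job.
    set_env('DS_DATA_DIR', 'D:/data/staging');
    # Read it back, falling back to a default path if it is not set.
    $G_Data_Dir = ifthenelse(is_set_env('DS_DATA_DIR') <> 0, get_env('DS_DATA_DIR'), 'D:/data/default');
    print('Data directory: [$G_Data_Dir]');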
Related Topics

• Reference Guide: Functions and Procedures

Setting file names at run-time using variables
You can set file names at runtime by specifying a variable as the file name. Variables can be used as file names for:
• The following sources and targets:
  • Flat files
  • XML files and messages
  • IDoc files and messages (in an SAP environment)
• The lookup_ext function (for a flat file used as a lookup table parameter)

To use a variable in a flat file name
1. Create a local or global variable using the Variables and Parameters window.
2. Create a script to set the value of a local or global variable, or call a system environment variable.
3. Declare the variable in the file format editor or in the Function editor as a lookup_ext parameter.


When you set a variable value for a flat file, specify both the file name and the directory name. Enter the variable in the File(s) property under Data File(s) in the File Format Editor. You cannot enter a variable in the Root directory property. For lookups, substitute the path and file name in the Lookup table box in the lookup_ext function editor with the variable name.

The following example shows how you can set values for variables in flat file sources and targets in a script.
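A minimal script along those lines might look like this (the directory paths are hypothetical; the KNA1 file names are the ones discussed below):

    # Assign file names to variables used by a flat file source and target.
    # $FILEINPUT holds two file names, using the * and ? wild cards.
    $FILEINPUT = 'D:/data/KNA1comma.*, D:/data/KNA1c?mma.in';
    $FILEOUTPUT = 'D:/data/kna1_output.dat';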

When you use variables as sources and targets, you can also use multiple file names and wild cards. Neither is supported when using variables in the lookup_ext function. The example above shows how to use multiple file names and wild cards: the $FILEINPUT variable includes two file names (separated by a comma), and the two names (KNA1comma.* and KNA1c?mma.in) also make use of the wild cards (* and ?) supported by the software.


Related Topics

• Reference Guide: lookup_ext • Reference Guide: Data Services Scripting Language

Substitution parameters
Overview of substitution parameters
Substitution parameters are useful when you want to export and run a job containing constant values in a specific environment. For example, if you create a job that references a unique directory on your local computer and you export that job to another computer, the job will look for the unique directory in the new environment. If that directory doesn’t exist, the job won’t run. Instead, by using a substitution parameter, you can easily assign a value for the original, constant value in order to run the job in the new environment. After creating a substitution parameter value for the directory in your environment, you can run the job in a different environment and all the objects that reference the original directory will automatically use the value. This means that you only need to change the constant value (the original directory name) in one place (the substitution parameter) and its value will automatically propagate to all objects in the job when it runs in the new environment. You can configure a group of substitution parameters for a particular run-time environment by associating their constant values under a substitution parameter configuration.

Substitution parameters versus global variables
Substitution parameters differ from global variables in that they apply at the repository level. Global variables apply only to the job in which they are defined. You would use a global variable when you do not know the value prior to execution and it needs to be calculated in the job. You would use a substitution parameter for constants that do not change during execution. A substitution parameter defined in a given local repository is available to all the jobs in that repository. Therefore, using a substitution parameter means you do not need to define a global variable in each job to parameterize a constant value.
The following table describes the main differences between global variables and substitution parameters.

Global variables                         Substitution parameters
Defined at the job level                 Defined at the repository level
Cannot be shared across jobs             Available to all jobs in a repository
Data-type specific                       No data type (all strings)
Value can change during job execution    Fixed value set prior to execution of job (constants)

However, you can use substitution parameters in all places where global variables are supported, for example:
• Query transform WHERE clauses
• Mappings
• SQL transform SQL statement identifiers
• Flat-file options
• User-defined transforms
• Address cleanse transform options
• Matching thresholds

Using substitution parameters
You can use substitution parameters in expressions, SQL statements, option fields, and constant strings. For example, many options and expression editors include a drop-down menu that displays a list of all the available substitution parameters. The software installs some default substitution parameters that are used by some Data Quality transforms. For example, the USA Regulatory Address Cleanse transform uses the following built-in substitution parameters: • $$RefFilesAddressCleanse defines the location of the address cleanse directories.


• $$ReportsAddressCleanse (set to Yes or No) enables data collection for creating reports with address cleanse statistics. This substitution parameter provides one location where you can enable or disable that option for all jobs in the repository.

Other examples of where you can use substitution parameters include:
• In a script, for example:
Print('Data read in : [$$FilePath]'); or Print('[$$FilePath]');
• In a file format, for example with [$$FilePath]/file.txt as the file name

Using the Substitution Parameter Editor
Open the Substitution Parameter Editor from the Designer by selecting Tools > Substitution Parameter Configurations. Use the Substitution Parameter Editor to do the following tasks:
• Add and define a substitution parameter by adding a new row in the editor.
• For each substitution parameter, use right-click menus and keyboard shortcuts to Cut, Copy, Paste, Delete, and Insert parameters.
• Change the order of substitution parameters by dragging rows or using the Cut, Copy, Paste, and Insert commands.
• Add a substitution parameter configuration by clicking the Create New Substitution Parameter Configuration icon in the toolbar.
• Duplicate an existing substitution parameter configuration by clicking the Create Duplicate Substitution Parameter Configuration icon.
• Rename a substitution parameter configuration by clicking the Rename Substitution Parameter Configuration icon.
• Delete a substitution parameter configuration by clicking the Delete Substitution Parameter Configuration icon.
• Reorder the display of configurations by clicking the Sort Configuration Names in Ascending Order and Sort Configuration Names in Descending Order icons.
• Move the default configuration so it displays next to the list of substitution parameters by clicking the Move Default Configuration To Front icon.
• Change the default configuration.


Related Topics

• Adding and defining substitution parameters

Naming substitution parameters
When you name and define substitution parameters, use the following rules:
• The name prefix is two dollar signs $$ (global variables are prefixed with one dollar sign). When adding new substitution parameters in the Substitution Parameter Editor, the editor automatically adds the prefix.
• When typing names in the Substitution Parameter Editor, do not use punctuation (including quotes or brackets) except underscores. The following characters are not allowed:
,: / ' \ " = < > + | - * % ; \t [ ] ( ) \r \n $ ] +
• You can type names directly into fields, column mappings, transform options, and so on. However, you must enclose them in square brackets, for example [$$SamplesInstall].
• Names can include any alpha or numeric character or underscores but cannot contain spaces.
• Names are not case sensitive.
• The maximum length of a name is 64 characters.
• Names must be unique within the repository.

Adding and defining substitution parameters
1. In the Designer, open the Substitution Parameter Editor by selecting Tools > Substitution Parameter Configurations.
2. The first column lists the substitution parameters available in the repository. To create a new one, double-click in a blank cell (a pencil icon will appear in the left) and type a name. The software automatically adds a double dollar-sign prefix ($$) to the name when you navigate away from the cell.
3. The second column identifies the name of the first configuration, by default Configuration1 (you can change configuration names by double-clicking in the cell and retyping the name). Double-click in the blank cell next to the substitution parameter name and type the constant value that the parameter represents in that configuration. The software applies that value when you run the job.


4. To add another configuration to define a second value for the substitution parameter, click the Create New Substitution Parameter Configuration icon on the toolbar.
5. Type a unique name for the new substitution parameter configuration.
6. Enter the value the substitution parameter will use for that configuration.
You can now select from one of the two substitution parameter configurations you just created. To change the default configuration that will apply when you run jobs, select it from the drop-down list box at the bottom of the window. You can also export these substitution parameter configurations for use in other environments.
Example:

In the following example, the substitution parameter $$NetworkDir has the value D:/Data/Staging in the configuration named Windows_Subst_Param_Conf and the value /usr/data/staging in the UNIX_Subst_Param_Conf configuration. Notice that each configuration can contain multiple substitution parameters.
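Laid out in the Substitution Parameter Editor grid, that example would look roughly like the following (one parameter shown; a configuration can define many):

    Substitution parameter    Windows_Subst_Param_Conf    UNIX_Subst_Param_Conf
    $$NetworkDir              D:/Data/Staging             /usr/data/staging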


Related Topics

• Naming substitution parameters • Exporting and importing substitution parameters

Associating a substitution parameter configuration with a system configuration
A system configuration groups together a set of datastore configurations and a substitution parameter configuration. A substitution parameter configuration can be associated with one or more system configurations. For example, you might create one system configuration for your local system and a different system configuration for another system. Depending on your environment, both system configurations might point to the same substitution parameter configuration or each system configuration might require a different substitution parameter configuration.
At job execution time, you can set the system configuration and the job will execute with the values for the associated substitution parameter configuration.
To associate a substitution parameter configuration with a new or existing system configuration:
1. In the Designer, open the System Configuration Editor by selecting Tools > System Configurations.
2. Optionally create a new system configuration.
3. Under the desired system configuration name, select a substitution parameter configuration to associate with the system configuration.
4. Click OK.
Example:

The following example shows two system configurations, Americas and Europe. In this case, there are substitution parameter configurations for each region (Europe_Subst_Parm_Conf and Americas_Subst_Parm_Conf). Each substitution parameter configuration defines where the data source files are located for that region, for example D:/Data/Americas and D:/Data/Europe. Select the appropriate substitution parameter configuration and datastore configurations for each system configuration.


Related Topics

• Defining a system configuration

Overriding a substitution parameter in the Administrator
In the Administrator, you can override the substitution parameters, or select a system configuration to specify a substitution parameter configuration, on four pages:
• Execute Batch Job
• Schedule Batch Job
• Export Execution Command
• Real-Time Service Configuration
For example, the Execute Batch Job page displays the name of the selected system configuration, the substitution parameter configuration, and the name of each substitution parameter and its value.
To override a substitution parameter:
1. Select the appropriate system configuration.

2. Under Substitution Parameters, click Add Overridden Parameter, which displays the available substitution parameters.
3. From the drop-down list, select the substitution parameter to override.
4. In the second column, type the override value. Enter the value as a string without quotes (in contrast with Global Variables).
5. Execute the job.
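For example, to point one run at a different directory, an override entry might look like the following (the parameter and path are illustrative; note the value has no surrounding quotes):

    Substitution parameter    Override value
    $$NetworkDir              E:/Data/Production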

Executing a job with substitution parameters
To see the details of how substitution parameters are being used in the job during execution in the Designer trace log:
1. Right-click the job name and click Properties.
2. Click the Trace tab.
3. For the Trace Assemblers option, set the value to Yes.
4. Click OK.

When you execute a job from the Designer, the Execution Properties window displays. You have the following options:
• On the Execution Options tab, from the System configuration drop-down menu, optionally select the system configuration with which you want to run the job. If you do not select a system configuration, the software applies the default substitution parameter configuration as defined in the Substitution Parameter Editor.
You can click Browse to view the "Select System Configuration" window in order to see the substitution parameter configuration associated with each system configuration. The "Select System Configuration" window is read-only. If you want to change a system configuration, click Tools > System Configurations.
• You can override the value of specific substitution parameters at run time. Click the Substitution Parameter tab, select a substitution parameter from the Name column, and enter a value by double-clicking in the Value cell.
To override substitution parameter values when you start a job via a Web service, see the Integrator's Guide.


Related Topics
• Associating a substitution parameter configuration with a system configuration
• Overriding a substitution parameter in the Administrator

Exporting and importing substitution parameters

Substitution parameters are stored in a local repository along with their configured values. The software does not include substitution parameters as part of a regular export. You can, however, export substitution parameters and configurations to other repositories by exporting them to a file and then importing the file to another repository.

Exporting substitution parameters
1. Right-click in the local object library and select Repository > Export Substitution Parameter Configurations.
2. Select the check box in the Export column for the substitution parameter configurations to export.
3. Save the file.
The software saves it as a text file with an .atl extension.

Importing substitution parameters
The substitution parameters must have first been exported to an ATL file. Be aware of the following behaviors when importing substitution parameters:
• The software adds any new substitution parameters and configurations to the destination local repository.
• If the repository has a substitution parameter with the same name as in the exported file, importing will overwrite the parameter's value. Similarly, if the repository has a substitution parameter configuration with the same name as the exported configuration, importing will overwrite all the parameter values for that configuration.
1. In the Designer, right-click in the object library and select Repository > Import from file.
2. Browse to the file to import.
3. Click OK.

Related Topics
• Exporting substitution parameters


Chapter 16 Executing Jobs

This section contains an overview of the software job execution, steps to execute jobs, debug errors, and change Job Server options.

Overview of job execution

You can run jobs in three different ways. Depending on your needs, you can configure:
• Immediate jobs
The software initiates both batch and real-time jobs and runs them immediately from within the Designer. For these jobs, both the Designer and designated Job Server (where the job executes, usually many times on the same machine) must be running. You will most likely run immediate jobs only during the development cycle.
• Scheduled jobs
Batch jobs are scheduled. To schedule a job, use the Administrator or use a third-party scheduler.
When jobs are scheduled by third-party software:
• The job initiates outside of the software.
• The job operates on a batch job (or shell script for UNIX) that has been exported from the software.
When a job is invoked by a third-party scheduler, the corresponding Job Server must be running; the Designer does not need to be running.
• Services
Real-time jobs are set up as services that continuously listen for requests from an Access Server and process requests on-demand as they are received. Use the Administrator to create a service from a real-time job.

Preparing for job execution

Validating jobs and job components

You can also explicitly validate jobs and their components as you create them by:
• Clicking the Validate All button from the toolbar (or choosing Validate > All Objects in View from the Debug menu). This command checks the syntax of the object definition for the active workspace and for all objects that are called from the active workspace view recursively.
• Clicking the Validate Current View button from the toolbar (or choosing Validate > Current View from the Debug menu). This command checks the syntax of the object definition for the active workspace.
You can set the Designer options (Tools > Options > Designer > General) to validate jobs started in Designer before job execution. The default is not to validate. The software also validates jobs before exporting them.
If during validation the software discovers an error in an object definition, it opens a dialog box indicating that an error exists, then opens the Output window to display the error. If there are errors, double-click the error in the Output window to open the editor of the object containing the error. If you are unable to read the complete error text in the window, you can access additional information by right-clicking the error listing and selecting View from the context menu.
Error messages have these levels of severity:

Severity       Description
Information    Informative message only; does not prevent the job from running. No action is required.
Warning        The error is not severe enough to stop job execution, but you might get unexpected results. For example, if the data type of a source column in a transform within a data flow does not match the data type of the target column in the transform, the software alerts you with a warning message.
Error          The error is severe enough to stop job execution. You must fix the error before the job will execute.

Ensuring that the Job Server is running

Before you execute a job (either as an immediate or scheduled task), ensure that the Job Server is associated with the repository where the client is running. When the Designer starts, it displays the status of the Job Server for the repository to which you are connected. The status icon indicates one of the following:
• Job Server is running
• Job Server is inactive
The name of the active Job Server and port number appears in the status bar when the cursor is over the icon.

Setting job execution options

Options for jobs include Debug and Trace. Although these are object options (they affect the function of the object), they are located in either the Property or the Execution window associated with the job.
Execution options for jobs can either be set for a single instance or as a default value:
• The right-click Execute menu sets the options for a single execution only and overrides the default settings.
• The right-click Properties menu sets the default settings.

To set execution options for every execution of the job
1. In the project area, right-click the job name and choose Properties.
2. Select options on the Properties window.

Related Topics
• Viewing and changing object properties
• Reference Guide: Parameters
• Reference Guide: Trace properties
• Setting global variable values

Executing jobs as immediate tasks

Immediate or "on demand" tasks are initiated from the Designer. Both the Designer and Job Server must be running for the job to execute.

To execute a job as an immediate task
1. In the project area, select the job name.
2. Right-click and choose Execute.
The software prompts you to save any objects that have changes that have not been saved.

3. The next step depends on whether you selected the Perform complete validation before job execution check box in the Designer Options:
• If you have not selected this check box, a window opens showing execution properties (debug and trace) for the job. Proceed to the next step.
• If you have selected this check box, the software validates the job before it runs. You must correct any serious errors before the job will run. There might also be warning messages, for example, messages indicating that date values will be converted to datetime values. Correct them if you want (they will not prevent job execution) or click OK to continue. After the job validates, a window opens showing the execution properties (debug and trace) for the job.
4. Set the execution properties.
You can choose the Job Server that you want to process this job, datastore profiles for sources and targets if applicable, enable automatic recovery, override the default trace properties, or select global variables at runtime.
Note: Setting execution properties here affects a temporary change for the current execution only.
5. Click OK.
As the software begins execution, the execution window opens with the trace log button active. Use the buttons at the top of the log window to display the trace log, monitor log, and error log (if there are any errors).
After the job is complete, use an RDBMS query tool to check the contents of the target table or file.

Related Topics
• Designer - General
• Reference Guide: Parameters
• Reference Guide: Trace properties
• Setting global variable values
• Debugging execution errors
• Examining target data

Monitor tab

The Monitor tab lists the trace logs of all current or most recent executions of a job.
The traffic-light icons in the Monitor tab have the following meanings:
• A green light indicates that the job is running.
You can right-click and select Kill Job to stop a job that is still running.
• A red light indicates that the job has stopped.
You can right-click and select Properties to add a description for a specific trace log. This description is saved with the log, which can be accessed later from the Log tab.
• A red cross indicates that the job encountered an error.

Log tab

You can also select the Log tab to view a job's trace log history. Click a trace log to open it in the workspace. Use the trace, monitor, and error log icons (left to right at the top of the job execution window in the workspace) to view each type of available log for the date and time that the job was run.

Debugging execution errors

The following table lists tools that can help you understand execution errors:

Tool          Definition
Trace log     Itemizes the steps executed in the job and the time execution began and ended.
Monitor log   Displays each step of each data flow in the job, the number of rows streamed through each step, and the duration of each step.
Error log     Displays the name of the object being executed when an error occurred and the text of the resulting error message. If the job ran against SAP data, some of the ABAP errors are also available in the error log.
Target data   Always examine your target data to see if your job produced the results you expected.

Related Topics
• Using logs
• Examining trace logs
• Examining monitor logs
• Examining error logs
• Examining target data

Using logs

This section describes how to use logs in the Designer.
• To open the trace log on job execution, select Tools > Options > Designer > General > Open monitor on job execution.
• To copy log content from an open log, select one or multiple lines and use the key commands [Ctrl+C].

To access a log during job execution
If your Designer is running when job execution begins, the execution window opens automatically, displaying the trace log information. Use the monitor and error log icons (middle and right icons at the top of the execution window) to view these logs.

The execution window stays open until you close it.

To access a log after the execution window has been closed
1. In the project area, click the Log tab.
2. Click a job name to view all trace, monitor, and error log files in the workspace. Or expand the job you are interested in to view the list of trace log files and click one.
Log indicators signify the following:
• The job executed successfully on this explicitly selected Job Server.
• The job was executed successfully by a server group. The Job Server listed executed the job.
• The job encountered an error on this explicitly selected Job Server.
• The job encountered an error while being executed by a server group. The Job Server listed executed the job.
3. Click the log icon for the execution of the job you are interested in. (Identify the execution from the position in sequence or datetime stamp.)
4. Use the list box to switch between log types or to view No logs or All logs.

To delete a log
You can set how long to keep logs in the Administrator. If you want to delete logs from the Designer manually:

1. In the project area, click the Log tab.
2. Right-click the log you want to delete and select Delete Log.

Related Topics
• Management Console Administrator Guide: Setting the log retention period

Examining trace logs

Use the trace logs to determine where an execution failed, whether the execution steps occur in the order you expect, and which parts of the execution are the most time consuming. The following figure shows an example of a trace log.

Examining monitor logs

The monitor log quantifies the activities of the components of the job. It lists the time spent in a given component of a job and the number of data rows that streamed through the component. The following screen shows an example of a monitor log.

Examining error logs

The software produces an error log for every job execution. Use the error logs to determine how an execution failed. If the execution completed without error, the error log is blank. The following screen shows an example of an error log.

Examining target data

The best measure of the success of a job is the state of the target data. Always examine your data to make sure the data movement operation produced the results you expect. Be sure that:
• Data was not converted to incompatible types or truncated.
• Data was not duplicated in the target.
• Data was not lost between updates of the target.
• Generated keys have been properly incremented.
• Updated values were handled properly.

Changing Job Server options

Familiarize yourself with the more technical aspects of how the software handles data (using the Reference Guide) and some of its interfaces like those for adapters and SAP applications. There are many options available in the software for troubleshooting and tuning a job. The following entries list each option, what it does, and its default value.

Option: Adapter Data Exchange Time-out
(For adapters) Defines the time a function call or outbound message will wait for the response from the adapter operation.
Default value: 10800000 (3 hours)

Option: Adapter Start Time-out
(For adapters) Defines the time that the Administrator or Designer will wait for a response from the Job Server that manages adapters (start/stop/status).
Default value: 90000 (90 seconds)

Option: AL_JobServerLoadBalanceDebug
Enables a Job Server to log server group information if the value is set to TRUE. Information is saved in: $LINK_DIR/log/<JobServerName>/server_eventlog.txt
Default value: FALSE

Option: AL_JobServerLoadOSPolling
Sets the polling interval (in seconds) that the software uses to get status information used to calculate the load balancing index. This index is used by server groups.
Default value: 60

Option: Display DI Internal Jobs
Displays the software's internal datastore CD_DS_d0cafae2 and its related jobs in the object library. The CD_DS_d0cafae2 datastore supports two internal jobs. The first calculates usage dependencies on repository tables and the second updates server group configurations.
If you change your repository password, user name, or other connection information, change the default value of this option to TRUE, close and reopen the Designer, then update the CD_DS_d0cafae2 datastore configuration to match your new repository configuration. This enables the calculate usage dependency job (CD_JOBd0cafae2) and the server group job (di_job_al_mach_info) to run without a connection error.
Default value: FALSE

Option: FTP Number of Retry
Sets the number of retries for an FTP connection that initially fails.
Default value: 0

Option: FTP Retry Interval
Sets the FTP connection retry interval in milliseconds.
Default value: 1000

Option: Global_DOP
Sets the Degree of Parallelism for all data flows run by a given Job Server. You can also set the Degree of parallelism for individual data flows from each data flow's Properties window. If a data flow's Degree of parallelism value is 0, then the Job Server will use the Global_DOP value. The Job Server will use the data flow's Degree of parallelism value if it is set to any value except zero, because it overrides the Global_DOP value.
Default value: 1

Option: Ignore Reduced Msg Type
(For SAP applications) Disables IDoc reduced message type processing for all message types if the value is set to TRUE.
Default value: FALSE

Option: Ignore Reduced Msg Type_foo
(For SAP applications) Disables IDoc reduced message type processing for a specific message type (such as foo) if the value is set to TRUE.
Default value: FALSE

Option: OCI Server Attach Retry
The engine calls the Oracle OCIServerAttach function each time it makes a connection to Oracle. If the engine calls this function too fast (processing parallel data flows for example), the function may fail. To correct this, increase the retry value to 5.
Default value: 3

Option: Splitter Optimization
The software might hang if you create a job in which a file source feeds into two queries. If this option is set to TRUE, the engine internally creates two source files that feed the two queries instead of a splitter that feeds the two queries.
Default value: FALSE

Option: Use Explicit Database Links
Jobs with imported database links normally will show improved performance because the software uses these links to push down processing to a database. If you set this option to FALSE, all data flows will not use linked datastores.
The use of linked datastores can also be disabled from any data flow properties dialog. The data flow level option takes precedence over this Job Server level option.
Default value: TRUE

Option: Use Domain Name
Adds a domain name to a Job Server name in the repository. This creates a fully qualified server name and allows the Designer to locate a Job Server on a different domain.
Default value: TRUE

Related Topics
• Performance Optimization Guide: Using parallel Execution, Degree of parallelism
• Performance Optimization Guide: Maximizing Push-Down Operations, Database link support for push-down operations across datastores

To change option values for an individual Job Server
1. Select the Job Server you want to work with by making it your default Job Server.
a. Select Tools > Options > Designer > Environment.
b. Select a Job Server from the Default Job Server section.
c. Click OK.
2. Select Tools > Options > Job Server > General.

3. Enter the section and key you want to use from the following list of value pairs (Section / Key):

   int / AdapterDataExchangeTimeout
   int / AdapterStartTimeout
   AL_JobServer / AL_JobServerLoadBalanceDebug
   AL_JobServer / AL_JobServerLoadOSPolling
   string / DisplayDIInternalJobs
   AL_Engine / FTPNumberOfRetry
   AL_Engine / FTPRetryInterval
   AL_Engine / Global_DOP
   AL_Engine / IgnoreReducedMsgType

   AL_Engine / IgnoreReducedMsgType_foo
   AL_Engine / OCIServerAttach_Retry
   AL_Engine / SPLITTER_OPTIMIZATION
   AL_Engine / UseExplicitDatabaseLinks
   Repository / UseDomainName

4. Enter a value.
   For example, enter the following to change the default value for the number of times a Job Server will retry to make an FTP connection if it initially fails (a configuration sketch follows this procedure):

   Section: AL_Engine
   Key: FTPNumberOfRetry
   Value: 2

   These settings change the default value for the FTPNumberOfRetry option from zero to two.
5. To save the settings and close the Options window, click OK.
6. Re-select a default Job Server by repeating step 1, as needed.
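Conceptually, each Section/Key/Value triple behaves like an entry in an INI-style configuration file on the Job Server machine (typically DSConfig.txt; treat the exact file name and location as an assumption for your release, and prefer the Options window over hand-editing the file). Under that assumption, the FTP example above corresponds to a sketch like this:

    [AL_Engine]
    ; Illustrative sketch only; set these values through Tools > Options in practice.
    FTPNumberOfRetry=2
    FTPRetryInterval=1000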

To use mapped drive names in a path

The software supports only UNC (Universal Naming Convention) paths to directories. If you set up a path to a mapped drive, the software will convert that mapped drive to its UNC equivalent. To make sure that your mapped drive is not converted back to the UNC path, you need to add your drive names in the "Options" window in the Designer.

1. Choose Tools > Options.
2. In the "Options" window, expand Job Server and then select General.
3. In the Section edit box, enter MappedNetworkDrives.
4. In the Key edit box, enter LocalDrive1 to map to a local drive or RemoteDrive1 to map to a remote drive.
5. In the Value edit box, enter a drive letter, such as M:\ for a local drive or \\<machine_name>\<share_name> for a remote drive.
6. Click OK to close the window.

If you want to add another mapped drive, you need to close the "Options" window and re-enter the settings. Be sure that each entry in the Key edit box is a unique name.
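For example, the following two entries, one local and one remote, keep a mapped local drive and a mapped remote drive from being rewritten to UNC form. The drive letter and share name are the placeholders from the procedure above; substitute your own values:

    Section: MappedNetworkDrives   Key: LocalDrive1    Value: M:\
    Section: MappedNetworkDrives   Key: RemoteDrive1   Value: \\<machine_name>\<share_name>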


Chapter 17 Data Assessment

With operational systems frequently changing, data quality control becomes critical in your extract, transform and load (ETL) jobs. The Designer provides data quality controls that act as a firewall to identify and fix errors in your data. These features can help ensure that you have "trusted" information.

The Designer provides the following features that you can use to determine and improve the quality and structure of your source data:

• Use the Data Profiler to determine:
  • The quality of your source data before you extract it. The Data Profiler can identify anomalies in your source data to help you better define corrective actions in the validation transform, data quality, or other transforms.
  • The distribution, relationship, and structure of your source data to better design your jobs and data flows, as well as your target data warehouse.
  • The content of your source and target data so that you can verify that your data extraction job returns the results you expect.
• Use the View Data feature to:
  • View your source data before you execute a job to help you create higher quality job designs.
  • Compare sample data from different steps of your job to verify that your data extraction job returns the results you expect.
• Use the Validation transform to:
  • Verify that your source data meets your business rules.
  • Take appropriate actions when the data does not meet your business rules.
• Use the auditing data flow feature to:
  • Define rules that determine if a source, transform, or target object processes correct data.
  • Define the actions to take when an audit rule fails.
• Use data quality transforms to improve the quality of your data.
• Use Data Validation dashboards in the Metadata Reporting tool to evaluate the reliability of your target data based on the validation rules you created

in your batch jobs. This feedback allows business users to quickly review, assess, and identify potential inconsistencies or errors in source data.

Related Topics
• Using the Data Profiler
• Using View Data to determine data quality
• Using the Validation transform
• Using Auditing
• Overview of data quality
• Management Console—Metadata Reports Guide: Data Validation Dashboard Reports

Using the Data Profiler

The Data Profiler executes on a profiler server to provide the following data profiler information that multiple users can view:

• Column analysis: The Data Profiler provides two types of column profiles:
  • Basic profiling: This information includes minimum value, maximum value, average value, minimum string length, and maximum string length.
  • Detailed profiling: Detailed column analysis includes distinct count, distinct percent, median, median string length, pattern count, and pattern percent.
• Relationship analysis: This information identifies data mismatches between any two columns for which you define a relationship, including columns that have an existing primary key and foreign key relationship. You can save two levels of data:
  • Save the data only in the columns that you select for the relationship.
  • Save the values in all columns in each row.

Data sources that you can profile

You can execute the Data Profiler on data contained in the following sources. See the Release Notes for the complete list of sources that the Data Profiler supports.

• Databases, which include:
  • Attunity Connector for mainframe databases
  • DB2
  • Oracle
  • SQL Server
  • Sybase IQ
  • Teradata
• Applications, which include:
  • JDE One World
  • JDE World
  • Oracle Applications
  • PeopleSoft
  • SAP Applications
  • Siebel
• Flat files

Connecting to the profiler server

You must install and configure the profiler server before you can use the Data Profiler. The Designer must connect to the profiler server to run the Data Profiler and view the profiler results. You provide this connection information on the Profiler Server Login window.

1. Use one of the following methods to invoke the Profiler Server Login window:
   • From the tool bar menu, select Tools > Profiler Server Login.
   • On the bottom status bar, double-click the Profiler Server icon, which is to the right of the Job Server icon.
2. In the Profiler Server Login window, enter the Data Profiler Server connection information:
   • Host: The name of the computer where the Data Profiler Server exists.
   • Port: Port number through which the Designer connects to the Data Profiler Server.
3. Click Test to validate the Profiler Server location.
   If the host name is valid, you receive a message that indicates that the profiler server is running.
   Note: When you click Test, the drop-down list in User Name displays the user names that belong to the profiler server.
4. Enter the user information in the Profiler Server Login window:
   • User Name: The user name for the Profiler Server login. You can select a user name from the drop-down list or enter a new name.
   • Password: The password for the Profiler Server login.
5. Click Connect.
   When you successfully connect to the profiler server, the Profiler Server icon on the bottom status bar no longer has the red X on it. In addition, when you move the pointer over this icon, the status bar displays the location of the profiler server.

Related Topics
• Management Console—Administrator Guide: Profile Server Management
• Management Console—Administrator Guide: Defining profiler users

Profiler statistics

Column profile

You can generate statistics for one or more columns. The columns can all belong to one data source or to multiple data sources. If you generate statistics for multiple sources in one profile task, all sources must be in the same datastore.

Basic profiling

By default, the Data Profiler generates the following basic profiler attributes for each column that you select:

• Min: Of all values, the lowest value in this column.
• Min count: Number of rows that contain this lowest value in this column.
• Max: Of all values, the highest value in this column.
• Max count: Number of rows that contain this highest value in this column.
• Average: For numeric columns, the average value in this column.
• Min string length: For character columns, the length of the shortest string value in this column.
• Max string length: For character columns, the length of the longest string value in this column.
• Average string length: For character columns, the average length of the string values in this column.

• Nulls: Number of NULL values in this column.
• Nulls %: Percentage of rows that contain a NULL value in this column.
• Zeros: Number of 0 values in this column.
• Zeros %: Percentage of rows that contain a 0 value in this column.
• Blanks: For character columns, the number of rows that contain a blank in this column.
• Blanks %: Percentage of rows that contain a blank in this column.

Detailed profiling

You can generate more detailed attributes in addition to the attributes above, but detailed attribute generation consumes more time and computer resources. Therefore, it is recommended that you do not select the detailed profile unless you need the following attributes:

• Median: The value that is in the middle row of the source table.
• Median string length: For character columns, the length of the string value that is in the middle row of the source table.
• Distincts: Number of distinct values in this column.
• Distinct %: Percentage of rows that contain each distinct value in this column.
• Patterns: Number of different patterns in this column (illustrated after this list).
• Pattern %: Percentage of rows that contain each pattern in this column.
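A pattern in this context is the character layout of a value rather than the value itself. For instance, the phone-number analysis later in this chapter (see Analyze column profile) counts the following two layouts as two distinct patterns, even though many different phone numbers share each layout:

    99.99.99.99
    (9) 99.99.99.99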

Examples of using column profile statistics to improve data quality

You can use the column profile attributes to assist you in different tasks, including the following (a small sketch follows this list):

• Obtain basic statistics, frequencies, ranges, and outliers. For example, these profile statistics might show that a column value is markedly higher than the other values in a data source. You might then decide to define a validation transform to set a flag in a different table when you load this outlier into the target table.
• Identify variations of the same content. For example, part number might be an integer data type in one data source and a varchar data type in another data source. You might then decide which data type you want to use in your target data warehouse.
• Discover data patterns and formats. For example, the profile statistics might show that phone number has several different formats. With this profile information, you might decide to define a validation transform to convert them all to use the same target format.
• Analyze the numeric range. For example, customer number might have one range of numbers in one source, and a different range in another source. Your target will need to have a data type that can accommodate the maximum range.
• Identify missing information, nulls, and blanks in the source system. For example, the profile statistics might show that nulls occur for fax number. You might then decide to define a validation transform to replace the null value with a phrase such as "Unknown" in the target table.
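As a minimal sketch of the last task, the substitution itself can be expressed with the built-in nvl() function, which returns its second argument when the first is NULL. The CUSTOMERS.FAX column name is only an illustrative assumption:

    nvl(CUSTOMERS.FAX, 'Unknown')

In practice you would place such an expression in a Validation transform's substitute action rather than write it by hand, as the procedure in Defining validation rule based on column profile shows for replace_substr().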

Related Topics
• To view the column attributes generated by the Data Profiler
• Submitting column profiler tasks

Relationship profile

A relationship profile shows the percentage of non matching values in columns of two sources. The sources can be:

• Tables
• Flat files
• A combination of a table and a flat file

The key columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).

You can choose between two levels of relationship profiles to save:

• Save key columns data only: By default, the Data Profiler saves the data only in the columns that you select for the relationship.
  Note: The Save key columns data only level is not available when using Oracle datastores.
• Save all columns data: You can save the values in the other columns in each row, but this processing will take longer and consume more computer resources to complete.

When you view the relationship profile results, you can drill down to see the actual data that does not match.

You can use the relationship profile to assist you in different tasks, including the following:

• Identify missing data in the source system. For example, one data source might include region, but another source might not.
• Identify redundant data across data sources. For example, duplicate names and addresses might exist between two sources, or no name might exist for an address in one source.
• Validate relationships across data sources. For example, two different problem tracking systems might include a subset of common customer-reported problems, but some problems only exist in one system or the other.

Related Topics
• Submitting relationship profiler tasks
• Viewing the profiler results

Executing a profiler task

The Data Profiler allows you to calculate profiler statistics for any set of columns you choose.

Note: This optional feature is not available for columns with nested schemas, LONG, or TEXT data types.

You cannot execute a column profile task with a relationship profile task.

Submitting column profiler tasks

1. In the Object Library of the Designer, select either a table or flat file.
   For a table, go to the Datastores tab and select a table. If you want to profile all tables within a datastore, select the datastore name. To select a subset of tables in the Datastores tab, hold down the Ctrl key as you select each table.
   For a flat file, go to the Formats tab and select a file.
2. After you select your data source, you can generate column profile statistics in one of the following ways:
   • Right-click and select Submit Column Profile Request.
     Some of the profile statistics can take a long time to calculate. Select this method so the profile task runs asynchronously and you can perform other Designer tasks while the profile task executes. This method also allows you to profile multiple sources in one profile task.
   • Right-click, select View Data, click the Profile tab, and click Update.
     This option submits a synchronous profile task, and you must wait for the task to complete before you can perform other tasks on the Designer. You might want to use this option if you are already on the View Data window and you notice that either the profile statistics have not yet been generated or the date that the profile statistics were generated is older than you want.

3. (Optional) Edit the profiler task name.
   The Data Profiler generates a default name for each profiler task. You can edit the task name to create a more meaningful name or a unique name, or to remove dashes, which are allowed in column names but not in task names.
   If you select a single source, the default name has the following format:
   username_t_sourcename
   If you select multiple sources, the default name has the following format:
   username_t_firstsourcename_lastsourcename
   where:
   • username: Name of the user that the software uses to access system services.
   • t: Type of profile. The value is C for a column profile, which obtains attributes (such as low value and high value) for each selected column.
   • firstsourcename: Name of the first source in alphabetic order.
   • lastsourcename: Name of the last source in alphabetic order, if you select multiple sources.
4. If you select a single source, the Submit Column Profile Request window lists the columns and data types.
   Keep the check in front of each column that you want to profile and remove the check in front of each column that you do not want to profile. Alternatively, you can click the check box at the top in front of Name to deselect all columns and then select the check boxes.
5. If you selected multiple sources, the Submit Column Profiler Request window lists the sources on the left.
   a. Select a data source to display its columns on the right side.

   b. On the right side of the Submit Column Profile Request window, keep the check in front of each column that you want to profile, and remove the check in front of each column that you do not want to profile.
      Alternatively, you can click the check box at the top in front of Name to deselect all columns and then select the individual check box for the columns you want to profile.
   c. Repeat steps a and b for each data source.
6. (Optional) Select Detailed profiling for a column.
   Note: The Data Profiler consumes a large amount of resources when it generates detailed profile statistics. Choose Detailed profiling only if you want these attributes: distinct count, distinct percent, median value, median string length, pattern, pattern count. If you choose Detailed profiling, ensure that you specify a pageable cache directory that contains enough disk space for the amount of data you profile.
   If you want detailed attributes for all columns in all sources listed, click Detailed profiling and select Apply to all columns of all sources. If you want to remove Detailed profiling for all columns, click Detailed profiling and select Remove from all columns of all sources.
7. Click Submit to execute the profile task.
   Note: If the table metadata changed since you imported it (for example, a new column was added), you must re-import the source table before you execute the profile task.
   If you clicked the Submit Column Profile Request option to reach this Submit Column Profiler Request window, the Profiler monitor pane appears automatically when you click Submit.
   If you clicked Update on the Profile tab of the View Data window, the Profiler monitor window does not appear when you click Submit. Instead, the profile task is submitted synchronously and you must wait for it to complete before you can do other tasks on the Designer.
   You can also monitor your profiler task by name in the Administrator.
8. When the profiler task has completed, you can view the profile results in the View Data option.

Related Topics
• Column profile
• Monitoring profiler tasks using the Designer
• Viewing the profiler results
• Getting Started Guide: Configuring Job Server run-time resources
• Management Console Administrator Guide: Monitoring profiler tasks using the Administrator

Submitting relationship profiler tasks

A relationship profile shows the percentage of non matching values in columns of two sources. The sources can be any of the following:

• Tables
• Flat files
• A combination of a table and a flat file

The columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).

The two columns do not need to be the same data type, but they must be convertible. For example, if you run a relationship profile task on an integer column and a varchar column, the Data Profiler converts the integer value to a varchar value to make the comparison.

Note: The Data Profiler consumes a large amount of resources when it generates relationship values. If you plan to use Relationship profiling, ensure that you specify a pageable cache directory that contains enough disk space for the amount of data you profile.

Related Topics
• Data sources that you can profile
• Getting Started Guide: Configuring Job Server run-time resources

To generate a relationship profile for columns in two sources

1. In the Object Library of the Designer, select two sources.

   To select two sources in the same datastore or file format:
   a. Go to the Datastore or Format tab in the Object Library.
   b. Hold the Ctrl key down as you select the second table.
   c. Right-click and select Submit Relationship Profile Request.
   To select two sources from different datastores or files:
   a. Go to the Datastore or Format tab in the Object Library.
   b. Right-click on the first source and select Submit > Relationship Profile Request > Relationship with.
   c. Change to a different Datastore or Format in the Object Library.
   d. Click on the second source.
   Note: You cannot create a relationship profile for the same column in the same source or for columns with a LONG or TEXT data type.
   The Submit Relationship Profile Request window appears.
2. (Optional) Edit the profiler task name.
   You can edit the task name to create a more meaningful name or a unique name, or to remove dashes, which are allowed in column names but not in task names. The default name that the Data Profiler generates for multiple sources has the following format:
   username_t_firstsourcename_lastsourcename
   where:
   • username: Name of the user that the software uses to access system services.
   • t: Type of profile. The value is R for a Relationship profile, which obtains non matching values in the two selected columns.
   • firstsourcename: Name of the first selected source.
   • lastsourcename: Name of the last selected source.

3. By default, the upper pane of the Submit Relationship Profile Request window shows a line between the primary key column and foreign key column of the two sources, if the relationship exists. You can change the columns to profile.
   The bottom half of the Submit Relationship Profile Request window shows that the profile task will use the equal (=) operation to compare the two columns. The Data Profiler will determine which values are not equal and calculate the percentage of non matching values.
4. To delete an existing relationship between two columns, select the line, right-click, and select Delete Selected Relation.
   To delete all existing relationships between the two sources, do one of the following actions:
   • Right-click in the upper pane and click Delete All Relations.
   • Click Delete All Relations near the bottom of the Submit Relationship Profile Request window.
5. If a primary key and foreign key relationship does not exist between the two data sources, specify the columns to profile. You can resize each data source to show all columns.
   To specify or change the columns for which you want to see relationship values:
   a. Move the cursor to the first column to select. Hold down the cursor and draw a line to the other column that you want to select.
   b. If you deleted all relations and you want the Data Profiler to select an existing primary-key and foreign-key relationship, either right-click in the upper pane and click Propose Relation, or click Propose Relation near the bottom of the Submit Relationship Profile Request window.
6. By default, the Save key columns data only option is selected. This option indicates that the Data Profiler saves the data only in the columns that you select for the relationship, and you will not see any sample data in the other columns when you view the relationship profile.
   If you want to see values in the other columns in the relationship profile, select Save all columns data.
7. Click Submit to execute the profiler task.

   Note: If the table metadata changed since you imported it (for example, a new column was added), you must re-import the source table before you execute the profile task.
8. The Profiler monitor pane appears automatically when you click Submit.
   You can dock this profiler monitor pane in the Designer or keep it separate. You can also monitor your profiler task by name in the Administrator.
9. When the profiler task has completed, you can view the profile results in the View Data option when you right-click on a table in the Object Library.
   If the task failed, the Information window also displays the error message.

Related Topics
• To view the relationship profile data generated by the Data Profiler
• Monitoring profiler tasks using the Designer
• Management Console Administrator Guide: Monitoring profiler tasks using the Administrator
• Viewing the profiler results

Monitoring profiler tasks using the Designer

The Profiler monitor window appears automatically when you submit a profiler task. You can also open it from the Menu bar to view the Profiler monitor window, and you can dock this profiler monitor pane in the Designer or keep it separate.

The Profiler monitor pane displays the currently running task and all of the profiler tasks that have executed within a configured number of days.

You can click the icons in the upper-left corner of the Profiler monitor to display the following information:

• Refreshes the Profiler monitor pane to display the latest status of profiler tasks.
• Sources that the selected task is profiling.

The Profiler monitor shows the following columns:

• Name: Name of the profiler task that was submitted from the Designer.
  If the profiler task is for a single source, the default name has the following format:
  username_t_sourcename
  If the profiler task is for multiple sources, the default name has the following format:
  username_t_firstsourcename_lastsourcename
• Type: The type of profiler task can be Column or Relationship.
• Status: The status of a profiler task can be:
  • Done: The task completed successfully.
  • Pending: The task is on the wait queue because the maximum number of concurrent tasks has been reached or another task is profiling the same table.
  • Running: The task is currently executing.
  • Error: The task terminated with an error. Double-click on the value in this Status column to display the error message.
• Timestamp: Date and time that the profiler task executed.
• Sources: Names of the tables for which the profiler task executes.

Related Topics
• Executing a profiler task
• Administrator Guide: Configuring profiler task parameters

Viewing the profiler results

The Data Profiler calculates and saves the profiler attributes into a profiler repository that multiple users can view.

Related Topics
• To view the column attributes generated by the Data Profiler
• To view the relationship profile data generated by the Data Profiler

To view the column attributes generated by the Data Profiler

1. In the Object Library, select the table for which you want to view profiler attributes.
2. Right-click and select View Data.
3. Click the Profile tab (second) to view the column profile attributes.
   a. The Profile tab shows the number of physical records that the Data Profiler processed to generate the values in the profile grid.
   b. The profile grid contains the column names in the current source and profile attributes for each column. To populate the profile grid, execute a profiler task or select names from this column and click Update.
   c. You can sort the values in each attribute column by clicking the column heading. The value n/a in the profile grid indicates that an attribute does not apply to a data type.

   Basic profile attributes and the data types they apply to:

   • Min: Of all values, the lowest value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)

   • Min count: Number of rows that contain this lowest value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Max: Of all values, the highest value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Max count: Number of rows that contain this highest value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Average: For numeric columns, the average value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Min string length: For character columns, the length of the shortest string value in this column. (Character: Yes; Numeric: No; Datetime: No)
   • Max string length: For character columns, the length of the longest string value in this column. (Character: Yes; Numeric: No; Datetime: No)
   • Average string length: For character columns, the average length of the string values in this column. (Character: Yes; Numeric: No; Datetime: No)

   • Nulls: Number of NULL values in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Nulls %: Percentage of rows that contain a NULL value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Zeros: Number of 0 values in this column. (Character: No; Numeric: Yes; Datetime: No)
   • Zeros %: Percentage of rows that contain a 0 value in this column. (Character: No; Numeric: Yes; Datetime: No)
   • Blanks: For character columns, the number of rows that contain a blank in this column. (Character: Yes; Numeric: No; Datetime: No)
   • Blanks %: Percentage of rows that contain a blank in this column. (Character: Yes; Numeric: No; Datetime: No)

   d. If you selected the Detailed profiling option on the Submit Column Profile Request window, the Profile tab also displays the following detailed attribute columns:

   • Distincts: Number of distinct values in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Distinct %: Percentage of rows that contain each distinct value in this column. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Median: The value that is in the middle row of the source table. The Data Profiler uses the following calculation to obtain the median value: (Total number of rows / 2) + 1. For example, in a 100-row table the median is taken from row (100 / 2) + 1 = 51. (Character: Yes; Numeric: Yes; Datetime: Yes)
   • Median string length: For character columns, the length of the string value that is in the middle row of the source table. (Character: Yes; Numeric: No; Datetime: No)
   • Pattern %: Percentage of rows that contain each pattern in this column, together with the format of each unique pattern in this column. (Character: Yes; Numeric: No; Datetime: No)

   • Patterns: Number of different patterns in this column. (Character: Yes; Numeric: No; Datetime: No)

4. Click an attribute value to view the entire row in the source table.
   The bottom half of the View Data window displays the rows that contain the attribute value that you clicked. You can hide columns that you do not want to view by clicking the Show/Hide Columns icon.
   For example, your target ADDRESS column might only be 45 characters, but the Profiling data for this Customer source table shows that the maximum string length is 46. Click the value 46 to view the actual data. You can resize the width of the column to display the entire string.
5. (Optional) Click Update if you want to update the profile attributes.
   Reasons to update at this point include:
   • The profile attributes have not yet been generated.
   • The date that the profile attributes were generated is older than you want. The Last updated value in the bottom left corner of the Profile tab is the timestamp when the profile attributes were last generated.
   Note: The Update option submits a synchronous profile task, and you must wait for the task to complete before you can perform other tasks on the Designer.
   The Submit Column Profile Request window appears. Select only the column names you need for this profiling operation, because Update calculations impact performance. You can also click the check box at the top in front of Name to deselect all columns and then select each check box in front of each column you want to profile.
6. Click a statistic in either Distincts or Patterns to display the percentage of each distinct value or pattern value in a column.
   The pattern values, number of records for each pattern value, and percentages appear on the right side of the Profile tab.

   For example, the following Profile tab for table CUSTOMERS shows the profile attributes for column REGION. The Distincts attribute for the REGION column shows the statistic 19, which means that 19 distinct values for REGION exist.
7. Click the statistic in the Distincts column to display each of the 19 values and the percentage of rows in table CUSTOMERS that have that value for column REGION. In addition, the bars in the right-most column show the relative size of each percentage.
8. The Profiling data on the right side shows that a very large percentage of values for REGION is Null. Click either Null under Value or 60 under Records to display the other columns in the rows that have a Null value in the REGION column.
9. Your business rules might dictate that REGION should not contain Null values in your target data warehouse. Therefore, decide what value you want to substitute for Null values when you define a validation transform.

Related Topics
• Executing a profiler task
• Defining validation rule based on column profile

To view the relationship profile data generated by the Data Profiler

Relationship profile data shows the percentage of non matching values in columns of two sources. The sources can be tables, flat files, or a combination of a table and a flat file. The columns can have a primary key and foreign key relationship defined or they can be unrelated (as when one comes from a datastore and the other from a file format).

1. In the Object Library, select the table or file for which you want to view relationship profile data.
2. Right-click and select View Data.
3. Click the Relationship tab (third) to view the relationship profile results.
   Note: The Relationship tab is visible only if you executed a relationship profile task.
4. Click the nonzero percentage in the diagram to view the key values that are not contained within the other table.
   For example, the following View Data Relationship tab shows the percentage (16.67) of customers that do not have a sales order. The relationship profile was defined on the CUST_ID column in table ODS_CUSTOMER and the CUST_ID column in table ODS_SALESORDER. The value in the left oval indicates that 16.67% of rows in table ODS_CUSTOMER have CUST_ID values that do not exist in table ODS_SALESORDER.

   Click the 16.67 percentage in the ODS_CUSTOMER oval to display the CUST_ID values that do not exist in the ODS_SALESORDER table. The non matching values KT03 and SA03 display on the right side of the Relationship tab. Each row displays a non matching CUST_ID value, the number of records with that CUST_ID value, and the percentage of total customers with this CUST_ID value.
5. Click one of the values on the right side to display the other columns in the rows that contain that value.
   The bottom half of the Relationship Profile tab displays the values in the other columns of the row that has the value KT03 in the column CUST_ID.
   Note: If you did not select Save all columns data on the Submit Relationship Profile Request window, you cannot view the data in the other columns.

Related Topics
• Submitting relationship profiler tasks

Using View Data to determine data quality

Use View Data to help you determine the quality of your source and target data. View Data provides the capability to:

• View sample source data before you execute a job to create higher quality job designs.
• Compare sample data from different steps of your job to verify that your data extraction job returns the results you expect.

Related Topics
• Defining validation rule based on column profile
• Using View Data

Data tab

The Data tab is always available and displays the data contents of sample rows. You can display a subset of columns in each row and define filters to display a subset of rows (see View Data Properties).

For example, your business rules might dictate that all phone and fax numbers be in one format for each country. The following Data tab shows a subset of rows for the customers that are in France.

Notice that the PHONE and FAX columns display values with two different formats. You can now decide which format you want to use in your target data warehouse and define a validation transform accordingly (see Defining validation rule based on column profile).

For information about other options on the Data tab, see Data tab.

Profile tab

Two displays are available on the Profile tab:

• Without the Data Profiler, the Profile tab displays the following column attributes: distinct values, NULLs, minimum value, and maximum value.
• If you configured and use the Data Profiler, the Profile tab displays the same column attributes as above plus many more calculated statistics, such as average value, minimum string length, maximum string length, distinct count, distinct percent, median, median string length, pattern count, and pattern percent.

Related Topics
• Profile tab
• To view the column attributes generated by the Data Profiler

Relationship Profile or Column Profile tab

The third tab that displays depends on whether or not you configured and use the Data Profiler:

• If you do not use the Data Profiler, the Column Profile tab allows you to calculate statistical information for a single column.
• If you use the Data Profiler, the Relationship tab displays the data mismatches between two columns, from which you can determine the integrity of your data between two sources.

Related Topics
• Column Profile tab
• To view the relationship profile data generated by the Data Profiler

Using the Validation transform

The Data Profiler and View Data features can identify anomalies in the incoming data to help you better define corrective actions in the Validation transform. The Validation transform provides the ability to compare your incoming data against a set of pre-defined business rules and, if needed, take any corrective actions.

Analyze column profile

You can obtain column profile information by submitting column profiler tasks. For example, suppose you want to analyze the data in the Customer table in the Microsoft SQL Server Northwinds sample database.

Related Topics
• Submitting column profiler tasks

99. However. select the View Data right-click option on the table that you profiled.Data Assessment Using the Validation transform 17 To analyze column profile attributes 1.99. Click the value 20 under the Patterns attribute to display the individual patterns and the percentage of rows in table CUSTOMERS that have that pattern for column PHONE. 3. Access the Profile tab on the View Data window. The Profile tab shows the following column profile attributes: The Patterns attribute for the PHONE column shows the value 20 which means 20 different patterns exist. In the Object Library of the Designer. the profiling data shows that two records have the format (9) 99.99. 2. SAP BusinessObjects Data Services Designer Guide 425 .99. Suppose that your business rules dictate that all phone numbers in France should have the format 99.99.99.

4. To display the columns in these two records, click either the value (9) 99.99.99.99 under Pattern or the value 2 under Records.
   You see that some phone numbers in France have a prefix of '(1)'. To remove this '(1)' prefix when you load the customer records into your target table, define a validation rule with the Match pattern option.

Related Topics
• Defining validation rule based on column profile

Defining validation rule based on column profile

This section takes the Data Profiler results and defines the Validation transform according to the sample business rules.

Related Topics
• Reference Guide: Validation

To define a validation rule to substitute a different value for a specific pattern

1. In the Validation transform editor, select the column for which you want to replace a specific pattern. For the example in Analyze column profile, select the PHONE column.
2. Click the Enable validation check box.
3. In the Condition area, select Match pattern and enter the specific pattern that you want to pass per your business rules. Using the phone example from the Analyze column profile topic, enter the following pattern:
   '99.99.99.99'
4. In the Action on Failure area, select Send to Pass and check the box For Pass, Substitute with.
5. Either manually enter the replace_substr function in the text box or click Function to have the Define Input Parameter(s) window help you set up the replace_substr function:
   replace_substr(CUSTOMERS.PHONE, '(1) ', null)
   Note: This replace_substr function does not replace any number enclosed in parentheses. It only replaces occurrences of the value (1) because the Data Profiler results show only that specific value in the source data.

6. Click Function and select the string function replace_substr.
7. Click Next.
8. In the Define Input Parameter(s) window, click the Input string drop-down list to display source tables. In our example, double-click the source table and then double-click the column name. For the phone example, double-click the Customer table and double-click the Phone column name.
9. For Search string on the Define Input Parameter(s) window, enter the value of the string you want to replace. For the phone example, enter:
   '(1) '
10. For Replace string on the Define Input Parameter(s) window, enter your replacement value. For the phone example, enter:
   null
11. Click Finish.
12. Repeat steps 1 through 5 to define a similar validation rule for the FAX column.
13. After you execute the job, use the View Data icons to verify that the string was substituted correctly (an illustration follows this procedure).
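To illustrate what the verification in the last step should show, here is the effect of the function on one sample value. The phone number itself is made up for this sketch, not taken from the Northwinds data:

    replace_substr('(1) 42.16.99.12', '(1) ', null)   returns   '42.16.99.12'

Values that never contained the '(1) ' prefix pass through unchanged.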

Related Topics
• Analyze column profile

Using Auditing

Auditing provides a way to ensure that a data flow loads correct data into the warehouse. Use auditing to perform the following tasks:

• Define audit points to collect run time statistics about the data that flows out of objects. Auditing stores these statistics in the repository.
• Define rules with these audit statistics to ensure that the data at the following points in a data flow is what you expect:
  • Extracted from sources
  • Processed by transforms
  • Loaded into targets
• Generate a run time notification that includes the audit rule that failed and the values of the audit statistics at the time of failure.
• Display the audit statistics after the job execution to help identify the object in the data flow that might have produced incorrect data.

Note: If you add an audit point prior to an operation that is usually pushed down to the database server, performance might degrade because pushdown operations cannot occur after an audit point.

Auditing objects in a data flow

You can collect audit statistics on the data that flows out of any object, such as a source, transform, or target. If a transform has multiple distinct or different outputs (such as Validation or Case), you can audit each output independently.

To use auditing, you define the following objects in the Audit window:

• Audit point: The object in a data flow where you collect audit statistics. You can audit a source, a transform, or a target. You identify the object to audit when you define an audit function on it.

• Audit function: The audit statistic that the software collects for a table, output schema, or column. The audit functions that you can define are:
  • Count (for a table or output schema): This function collects two statistics: a Good count for rows that were successfully processed, and an Error count for rows that generated some type of error if you enabled error handling.
  • Sum (for a column): Sum of the numeric values in the column. Applicable data types include decimal, double, integer, and real. This function only includes the Good rows.
  • Average (for a column): Average of the numeric values in the column. Applicable data types include decimal, double, integer, and real. This function only includes the Good rows.
  • Checksum (for a column): Checksum of the values in the column.

• Audit label: The unique name in the data flow that the software generates for the audit statistics collected for each audit function that you define. You use these labels to define audit rules for the data flow.
• Audit rule: A Boolean expression in which you use audit labels to verify the job. If you define multiple rules in a data flow, all rules must succeed or the audit fails.
• Actions on audit failure: One or more of three ways to generate notification of an audit rule (or rules) failure: email, custom script, raise exception.

Audit function

This section describes the data types for the audit functions and the error count statistics.

Data types

The following list shows the default data type for each audit function and the permissible data types. You can change the data type in the Properties window for each audit function in the Designer.

• Count. Default data type: INTEGER. Allowed data types: INTEGER.
• Sum. Default data type: type of audited column. Allowed data types: INTEGER, DECIMAL, DOUBLE, REAL.
• Average. Default data type: type of audited column. Allowed data types: INTEGER, DECIMAL, DOUBLE, REAL.

• Checksum. Default data type: VARCHAR(128). Allowed data types: VARCHAR(128).

Error count statistic

When you enable a Count audit function, the software collects two types of statistics:

• Good row count for rows processed without any error.
• Error row count for rows that the job could not process but ignores so that processing can continue. One way that error rows can result is when you specify the Use overflow file option in the Source Editor or Target Editor.

Audit label

The software generates a unique name for each audit function that you define on an audit point. You can edit the label names. You might want to edit a label name to create a shorter meaningful name or to remove dashes, which are allowed in column names but not in label names.

Generating label names

If the audit point is on a table or output schema, the software generates the following two labels for the audit function Count:

$Count_objectname
$CountError_objectname

If the audit point is on a column, the software generates an audit label with the following format:

$auditfunction_objectname
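For example, a Count function defined on a table named ODS_CUSTOMER produces the pair of labels below; this is the same pair that reappears in the worked example later in this section. A Sum function defined on a column would follow the single-label format instead, such as the purely hypothetical $Sum_ORDER_AMOUNT:

    $Count_ODS_CUSTOMER
    $CountError_ODS_CUSTOMER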

If you choose all three actions. or a function with audit labels as parameters. • The LHS can be a single audit label. the audit rule does not automatically use the new name. or a constant. multiple audit labels that form an expression with one or more mathematical operators. The RHS can be a single audit label. a function with audit labels as parameters. If you edit the label name after you use it in an audit rule. a Boolean operator. • The following Boolean expressions are examples of audit rules: $Count_CUSTOMER = $Count_CUSTDW $Sum_ORDER_US + $Sum_ORDER_EUROPE = $Sum_ORDER_DW round($Avg_ORDER_TOTAL) >= 10000 Audit notification You can choose any combination of the following actions for notification of an audit failure. multiple audit labels that form an expression with one or more mathematical operators.17 Data Assessment Using Auditing If the audit point is in an embedded data flow. the labels have the following formats: $Count_objectname_embeddedDFname $CountError_objectname_embeddedDFname $auditfunction_objectname_embeddedDFname Editing label names You can edit the audit label name when you create the audit function and before you create an audit rule that uses the label. Audit rule An audit rule is a Boolean expression which consists of a Left-Hand-Side (LHS). the software executes them in this order: 432 SAP BusinessObjects Data Services Designer Guide . You must redefine the rule with the new name. and a Right-Hand-Side (RHS).

1. Email to list: The software sends a notification of which audit rule failed to the email addresses that you list in this option. Use a comma to separate the list of email addresses. You can specify a variable for the email list.
   This option uses the smtp_to function to send email. Therefore, you must define the server and sender for the Simple Mail Transfer Protocol (SMTP) in the Server Manager.
2. Script: The software executes the custom script that you create in this option (a sketch follows this list).
3. Raise exception: The job fails if an audit rule fails, and the error log shows which audit rule failed. You can use this audit exception in a try/catch block, and you can continue the job execution in a try/catch block.
   This action is the default. If you clear this action and an audit rule fails, the job completes successfully and the audit does not write messages to the job log. You can view which rule failed in the Auditing Details report in the Metadata Reporting tool. For more information, see Viewing audit results.
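As a minimal sketch of the Script action, the following custom script logs the failure and then fails the job explicitly. print() and raise_exception() are built-in Data Services script functions; the message text, the data flow name, and the choice to raise an exception from the script are assumptions made for this example:

    # Log the failure to the trace log, then stop the job.
    print('Audit rule failed for data flow Case_DF');
    raise_exception('Audit failure: source and target row counts do not reconcile');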

Accessing the Audit window

Access the Audit window from one of the following places in the Designer:

• From the Data Flows tab of the object library, right-click on a data flow name and select the Auditing option.
• In the workspace, right-click on a data flow icon and select the Auditing option.
• When a data flow is open in the workspace, click the Audit icon in the toolbar.

When you first access the Audit window, the Label tab displays the sources and targets in the data flow. If your data flow contains multiple consecutive query transforms, the Audit window shows the first query.

Click the icons on the upper left corner of the Label tab to change the display:

• Collapse All: Collapses the expansion of the source, target, and first-level query objects in the data flow.
• Show All Objects: Displays all the objects within the data flow.
• Show Source, Target and first-level Query: Default display, which shows the source, target, and first-level query objects in the data flow. If the data flow contains multiple consecutive query transforms, only the first-level query displays.
• Show Labelled Objects: Displays the objects that have audit labels defined.

Defining audit points, rules, and action on failure

1. Access the Audit window.
   Use one of the methods described in section Accessing the Audit window.
2. Define audit points.
   On the Label tab, right-click on an object that you want to audit and choose an audit function or Properties.
   When you define an audit point, the software generates the following:
   • An audit icon on the object in the data flow in the workspace

   • An audit label that you use to define audit rules (for the format of this label, see Auditing objects in a data flow)
   In addition to choosing an audit function, the Properties window allows you to edit the audit label and change the data type of the audit function.
   For example, the data flow Case_DF has the following objects, and you want to verify that all of the source rows are processed by the Case transform:
   • Source table ODS_CUSTOMER
   • Four target tables:
     R1 contains rows where ODS_CUSTOMER.REGION_ID = 1
     R2 contains rows where ODS_CUSTOMER.REGION_ID = 2
     R3 contains rows where ODS_CUSTOMER.REGION_ID = 3
     R123 contains rows where ODS_CUSTOMER.REGION_ID IN (1, 2 or 3)
   a. Right-click on source table ODS_CUSTOMER and choose Count.
      The software creates the audit labels $Count_ODS_CUSTOMER and $CountError_ODS_CUSTOMER, and an audit icon appears on the source object in the workspace.

   b. Similarly, right-click on each of the target tables and choose Count. The Audit window shows the resulting audit labels.
   c. If you want to remove an audit label, right-click on the label; the audit function that you previously defined displays with a check mark in front of it. Click the function to remove the check mark and delete the associated audit label.
      When you right-click on the label, you can also select Properties, and select the value (No Audit) in the Audit function drop-down list.
3. Define audit rules.
   On the Rule tab in the Audit window, click Add, which activates the expression editor of the Auditing Rules section.
   If you want to compare audit statistics for one object against one other object, use the expression editor, which consists of three text boxes with drop-down lists:
   a. Select the label of the first audit point in the first drop-down list.
   b. Choose a Boolean operator from the second drop-down list. The options in the editor provide common Boolean operators. If you require a Boolean operator that is not in this list, use the Custom expression box with its function and smart editors to type in the operator.

   c. Select the label for the second audit point from the third drop-down list. If you want to compare the first audit value to a constant instead of a second audit value, use the Custom expression box.
   For example, to verify that the count of rows from the source table is equal to the rows in the target table, select the audit labels and the Boolean operation in the expression editor.
   If you want to compare audit statistics for one or more objects against statistics for multiple other objects or a constant, select the Custom expression box:
   a. Click the ellipsis button to open the full-size smart editor window.
   b. Click the Variables tab on the left and expand the Labels node.
   c. Drag the first audit label of the object to the editor pane.
   d. Type a Boolean operator.
   e. Drag the audit labels of the other objects to which you want to compare the audit statistics of the first object, and place appropriate mathematical operators between them.
   f. Click OK to close the smart editor.
   g. The audit rule displays in the Custom editor. To update the rule in the top Auditing Rule box, click on the title "Auditing Rule" or on another option.

   For example, to verify that the count of rows from the source table is equal to the sum of rows in the first three target tables, drag the audit labels and type in the Boolean operation and plus signs in the smart editor as follows:
   $Count_ODS_CUSTOMER = $Count_R1 + $Count_R2 + $Count_R3
   Click Close in the Audit window.
4. Define the action to take if the audit fails.
   You can choose one or more of the following actions:
   • Raise exception: The job fails if an audit rule fails, and the error log shows which audit rule failed. This action is the default.
     If you clear this option and an audit rule fails, the job completes successfully and the audit does not write messages to the job log. You can view which rule failed in the Auditing Details report in the Metadata Reporting tool. For more information, see Viewing audit results.
   • Email to list: The software sends a notification of which audit rule failed to the email addresses that you list in this option. Use a comma to separate the list of email addresses. You can specify a variable for the email list.
   • Script: The software executes the script that you create in this option.
5. Execute the job.
   The Execution Properties window has the Enable auditing option checked by default. Clear this box if you do not want to collect audit statistics for this specific job execution.
6. Look at the audit results.
   You can view passed and failed audit rules in the metadata reports. If you turn on the audit trace on the Trace tab in the Execution Properties window, you can view all audit results on the Job Monitor Log. For details, see Viewing audit results.

Guidelines to choose audit points

The following are guidelines to choose audit points:

Guidelines to choose audit points
The following are guidelines to choose audit points:
• When you audit the output data of an object, the optimizer cannot push down operations after the audit point. Therefore, if the performance of a query that is pushed to the database server is more important than gathering audit statistics from the source, define the first audit point on the query or later in the data flow. For example, suppose your data flow has source, query, and target objects, and the query has a WHERE clause that is pushed to the database server and significantly reduces the amount of data that returns to the software. Define the first audit point on the query, rather than on the source, to obtain audit statistics on the query results.
• If a pushdown_sql function is after an audit point, the software cannot execute it.
• You can only audit a bulkload that uses the Oracle API method. For the other bulk loading methods, the number of rows loaded is not available to the software.
• Auditing is disabled when you run a job with the debugger.
• You cannot audit NRDM schemas or real-time jobs.
• You cannot audit within an ABAP Dataflow, but you can audit the output of an ABAP Dataflow.
• If you use the CHECKSUM audit function in a job that normally executes in parallel, the software disables the DOP for the whole data flow. The order of rows is important for the result of CHECKSUM, and DOP processes the rows in a different order than in the source.

Auditing embedded data flows
You can define audit labels and audit rules in an embedded data flow. This section describes the following considerations when you audit embedded data flows:
• Enabling auditing in an embedded data flow
• Audit points not visible outside of the embedded data flow

Enabling auditing in an embedded data flow
If you want to collect audit statistics on an embedded data flow when you execute the parent data flow, you must enable the audit label of the embedded data flow.

To enable auditing in an embedded data flow
1. Open the parent data flow in the Designer workspace.
2. Click the Audit icon in the toolbar to open the Audit window.
3. On the Label tab, expand the objects to display any audit functions defined within the embedded data flow. If a data flow is embedded at the beginning or at the end of the parent data flow, an audit function might exist on the output port or on the input port. The following Audit window shows an example of an embedded audit function that does not have an audit label defined in the parent data flow.
4. Right-click the audit function and choose Enable. You can also choose Properties to change the label name and enable the label.
5. You can define audit rules with the enabled label.

Audit points not visible outside of the embedded data flow
When you embed a data flow at the beginning of another data flow, data passes from the embedded data flow to the parent data flow through a single source. When you embed a data flow at the end of another data flow, data passes into the embedded data flow from the parent through a single target. In either case, some of the objects are not visible in the parent data flow.

Because some of the objects are not visible in the parent data flow, the audit points on these objects are also not visible in the parent data flow. For example, the following embedded data flow has an audit function defined on the source SQL transform and an audit function defined on the target table. The following Audit window shows these two audit points. When you embed this data flow, the target Output becomes a source for the parent data flow and the SQL transform is no longer visible.

An audit point still exists for the entire embedded data flow, but the label is no longer applicable. The following Audit window for the parent data flow shows the audit function defined in the embedded data flow, but does not show an audit label. If you want to audit the embedded data flow, right-click on the audit function in the Audit window and select Enable.

Resolving invalid audit labels
An audit label can become invalid in the following situations:
• If you delete the audit label in an embedded data flow that the parent data flow has enabled.
• If you delete or rename an object that had an audit point defined on it.
The following Audit window shows the invalid label that results when an embedded data flow deletes an audit label that the parent data flow had enabled.

To resolve invalid audit labels
1. Open the Audit window.
2. Expand the Invalid Labels node to display the individual labels.
3. Note any labels that you would like to define on any new objects in the data flow.
4. After you define a corresponding audit label on a new object, right-click on the invalid label and choose Delete.
5. If you want to delete all of the invalid labels at once, right-click on the Invalid Labels node and click Delete All.

Viewing audit results
You can see the audit status in one of the following places:
• Job Monitor Log
• If the audit rule fails, the places that display audit information depend on the Action on failure option that you chose:

Action on failure: Places where you can view audit information
• Raise exception: Job Error Log, Metadata Reports
• Email to list: Email message, Metadata Reports
• Script: Wherever the custom script sends the audit messages, Metadata Reports

Related Topics
• Job Monitor Log
• Job Error Log
• Metadata Reports

Job Monitor Log
If you set Audit Trace to Yes on the Trace tab in the Execution Properties window, audit messages appear in the Job Monitor Log. You can see messages for audit rules that passed and failed. The following sample audit success messages appear in the Job Monitor Log when Audit Trace is set to Yes:
Audit Label $Count_R1 = 5. Data flow <Case_DF>.
Audit Label $CountError_R1 = 0. Data flow <Case_DF>.
Audit Label $Count_R2 = 4. Data flow <Case_DF>.
Audit Label $CountError_R2 = 0. Data flow <Case_DF>.
Audit Label $Count_R3 = 3. Data flow <Case_DF>.
Audit Label $CountError_R3 = 0. Data flow <Case_DF>.
Audit Label $Count_R123 = 12. Data flow <Case_DF>.
Audit Label $CountError_R123 = 0. Data flow <Case_DF>.
Audit Label $Count_ODS_CUSTOMER = 12. Data flow <Case_DF>.
Audit Label $CountError_ODS_CUSTOMER = 0. Data flow <Case_DF>.
Audit Rule passed ($Count_ODS_CUSTOMER = ($Count_R1 + $Count_R2 + $Count_R3)): LHS=12, RHS=12. Data flow <Case_DF>.

Audit Rule passed ($Count_ODS_CUSTOMER = $Count_R123): LHS=12, RHS=12. Data flow <Case_DF>.

Job Error Log
When you choose the Raise exception option and an audit rule fails, the Job Error Log shows the rule that failed. The following sample message appears in the Job Error Log:
Audit rule failed <($Count_ODS_CUSTOMER = $Count_R1)> for <Data flow Case_DF>.

Metadata Reports
You can look at the Audit Status column in the Data Flow Execution Statistics reports of the Metadata Report tool. The Audit Status column has the following values:
• Not Audited
• Passed — All audit rules succeeded. This value is a link to the Auditing Details report, which shows the audit rules and the values of the audit labels.
• Information Collected — This status occurs when you define audit labels to collect statistics but do not define audit rules. This value is a link to the Auditing Details report, which shows the values of the audit labels.
• Failed — An audit rule failed. This value is a link to the Auditing Details report, which shows the rule that failed and the values of the audit labels.

Related Topics
• Management Console—Metadata Reports Guide: Operational Dashboard Reports


Chapter 18 Data Quality

Overview of data quality
Data quality is a term that refers to the set of transforms that work together to improve the quality of your data by cleansing, enhancing, matching, and consolidating data elements. Data quality is primarily accomplished in the software using the following transforms:
• Address Cleanse. Parses, standardizes, corrects, and enhances address data.
• Data Cleanse. Parses, standardizes, corrects, and enhances customer and operational data.
• Geocoding. Identifies and appends geographic information, such as latitude and longitude, to address data.
• Match. Identifies duplicate records at multiple levels within a single pass for individuals, households, or corporations within multiple tables or databases, and consolidates them into a single source.

Related Topics
• Address Cleanse
• Data Cleanse
• Geocoding
• Matching strategies

Address Cleanse
This section describes how to prepare your data for address cleansing, how to set up address cleansing, and how to understand your output after processing.

Related Topics
• How address cleanse works
• Prepare your input data
• Determine which transform(s) to use
• Identify the country of destination
• Set up the reference files
• Define the standardization options
• Beyond the basics
• Supported countries (Global Address Cleanse)

Reports
The USA Regulatory Address Cleanse transform creates the USPS Form 3553 (required for CASS) and the NCOALink Summary Report. The Global Address Cleanse transform creates reports about your data, including the Canadian SERP—Statement of Address Accuracy Report, the Australia Post AMAS report, and the New Zealand SOA Report.

How address cleanse works
Address cleanse provides a corrected, complete, and standardized form of your original address data. With the USA Regulatory Address Cleanse transform, and for some countries with the Global Address Cleanse transform, address cleanse can also correct or add postal codes.

What happens during address cleanse?
The USA Regulatory Address Cleanse transform and the Global Address Cleanse transform cleanse your data in the following ways:
• Verify that the locality, region, and postal codes agree with one another. If your data has just a locality and region, the transform usually can add the postal code, and vice versa (depending on the country).
• Standardize the way the address line looks. For example, they can add or remove punctuation and abbreviate or spell out the primary type (depending on what you want).
• Identify undeliverable addresses, such as vacant lots and condemned buildings (USA records only).
• Assign diagnostic codes to indicate why addresses were not assigned or how they were corrected. (These codes are included in the Reference Guide.)

Related Topics
• The transforms
• Input and output data and field classes
• Prepare your input data
• Determine which transform(s) to use
• Define the standardization options

The transforms
The following list describes the address cleanse transforms and their purpose.

Global Address Cleanse (and engines): Cleanses your address data from any of the supported countries (not for U.S. certification). You must set up the Global Address Cleanse transform in conjunction with one or more of the Global Address Cleanse engines (Australia, Canada, EMEA, Global Address, Japan, or USA). With this transform you can create Canada Post's Software Evaluation and Recognition Program (SERP)—Statement of Address Accuracy Report, Australia Post's Address Matching Processing Summary (AMAS) report, and the New Zealand Statement of Accuracy (SOA) report.

USA Regulatory Address Cleanse: Cleanses your U.S. address data (within the Latin1 code page). The transform includes the following add-on options: DPV, eLOT, EWS, GeoCensus, LACSLink, NCOALink, RDI, SuiteLink, and Z4Change. With this transform you can create a USPS Form 3553.

Global Suggestion Lists: Offers suggestions for possible address matches for your global address data (not for certification). This transform is usually used for real-time processing and does not standardize addresses. If you want to standardize your address data, use a Global Address Cleanse transform after the Global Suggestion Lists transform in the data flow. Also, use a Country ID transform before this transform in the data flow.

Country ID: Identifies the country of destination for the record and outputs an ISO code. Use this transform before the Global Suggestion Lists transform in your data flow. (It is not necessary to place the Country ID transform before the Global Address Cleanse or the USA Regulatory Address Cleanse transforms.)

Input and output data and field classes
Input data
The address cleanse transforms accept discrete, multiline, and hybrid address line formats.

Output data
When you set up the USA Regulatory Address Cleanse transform or the Global Address Cleanse transform, you can include output fields that contain specific information. Each generated field belongs to a generated field address class (Delivery, Dual, or Official) and a generated field class (Parsed, Corrected, or Best):

Delivery
• Parsed: Contains the parsed input with some standardization applied. The fields subjected to standardization are locality, region, and postcode.
• Corrected: Contains the assigned data after directory lookups; blank if the address is not assigned.
• Best: Contains the parsed data when the address is unassigned, or the corrected data for an assigned address.

Dual
• Parsed, Corrected, and Best: Contain the DUAL address details that were available on input.

Official
• Parsed: Contains the parsed input with some standardization applied.
• Corrected: Contains the information from directories defined by the Postal Service when an address is assigned; blank if the address is not assigned.
• Best: Contains the parsed input when an address is unassigned, or the information from directories defined by the Postal Service when an address is assigned.

Both the USA Regulatory Address Cleanse transform and the Global Address Cleanse transform accept input data in the same way.

Caution: The USA Regulatory Address Cleanse transform does not accept Unicode data. If an input record has characters outside the Latin1 code page (character value is greater than 255), the USA Regulatory Address Cleanse transform will not process that data. Instead, the input record is sent to the corresponding standardized output field without any processing. No other output fields (component fields, for example) will be populated for that record. If your Unicode database has valid U.S. addresses from the Latin1 character set, this transform processes as usual.

Prepare your input data
Before you start address cleansing, you must decide which kind of address line format you will input.

Accepted address line formats
The following lists show the accepted address line formats: multiline, hybrid, and discrete.

Note: For all multiline and hybrid formats listed, you are not required to use all the multiline fields for a selected format (for example, Multiline1-12). However, you must start with Multiline1 and proceed consecutively. You cannot skip numbers, for example, from Multiline1 to Multiline3.

Multiline and multiline hybrid formats
• Example 1: Multiline1 through Multiline8, Country (optional)
• Example 2: Multiline1 through Multiline7, Lastline, Country (optional)
• Example 3: Multiline1 through Multiline4, Locality3, Locality2, Locality1, Region1, Postcode (Global) or Postcode1 (USA Regulatory), Country (optional)
• Example 4: Multiline1 through Multiline5, Locality2, Locality1, Region1, Postcode (Global) or Postcode1 (USA Regulatory), Country (optional)
• Example 5: Multiline1 through Multiline6, Locality1, Region1, Postcode (Global) or Postcode1 (USA Regulatory), Country (optional)

Discrete line formats
• Example 1: Address_Line, Lastline, Country (optional)
• Example 2: Address_Line, Locality3 (Global), Locality2, Locality1, Region1, Postcode (Global) or Postcode1 (USA Regulatory), Country (optional)
• Example 3: Address_Line, Locality2, Locality1, Region1, Postcode (Global) or Postcode1 (USA Regulatory), Country (optional)
• Example 4: Address_Line, Locality1, Region1, Postcode (Global) or Postcode1 (USA Regulatory), Country (optional)

Determine which transform(s) to use
You can choose from a variety of address cleanse transforms based on what you want to do with your data. There are transforms for cleansing global and/or U.S. address data, cleansing based on USPS regulations, using business rules to cleanse data, and cleansing global address data transactionally.

Related Topics
• Cleanse global address data
• Cleanse U.S. data only
• Cleanse U.S. data and global data
• Cleanse address data using multiple business rules
• Cleanse your address data transactionally

S. use the Global Address Cleanse transform in your project with one or more of the following engines: • • • • • • Australia Canada Japan EMEA Global Address USA Tip: Cleansing U. With this transform. Address Matching Approval System. certification and Australia for AMAS.S. data and global data • Reference Guide: Transforms. use the USA Regulatory Address Cleanse transform for the best results. certification).S. data with the USA Regulatory Address Cleanse transform is usually faster than with the Global Address Cleanse transform and USA engine. Software Evaluation and Recognition Program. Start with a sample transform configuration The software includes a variety of Global Address Cleanse sample transform configurations (which include at least one engine) that you can copy to use for a project. This scenario is usually true even if you end up needing both transforms. Related Topics • Supported countries (Global Address Cleanse) • Cleanse U. address data. Transform configurations Cleanse U. You can also use the Global Address Cleanse transform with the Canada and/or USA engine in a real time data flow to create suggestion lists for those countries. and with DPV and SAP BusinessObjects Data Services Designer Guide 455 .Data Quality Address Cleanse 18 Cleanse global address data To cleanse your address data for any of the software-supported countries (including Canada for SERP. data only To cleanse U.S.

With this transform, and with DPV and LACSLink enabled, you can CASS certify a mailing and produce a USPS Form 3553. However, you do not need to CASS certify your data to use this transform (for example, you can run in a non-certification mode or run transactionally and create U.S. suggestion lists). You can also use this transform for DPV, eLOT, EWS, GeoCensus, LACSLink, NCOALink, RDI, and Z4Change processing (which are add-on options).

Tip: Even if you are not processing U.S. data for USPS certification, you may find that cleansing U.S. data with the USA Regulatory Address Cleanse transform is faster than with the Global Address Cleanse transform and USA engine.

Start with a sample transform configuration
The software includes a variety of USA Regulatory Address Cleanse sample transform configurations that can help you set up a project.

Related Topics
• Reference Guide: Transforms, Transform configurations

Cleanse U.S. data and global data
What should you do when you have U.S. addresses that need to be certified and also addresses from other countries in your database? In this situation, you should use both the Global Address Cleanse transform and the USA Regulatory Address Cleanse transform in your data flow.

Cleanse address data using multiple business rules
When you have two addresses intended for different purposes (for example, a billing address and a shipping address), you should use two of the same address cleanse transforms in a data flow.

One or two engines?
When you use two Global Address Cleanse transforms for data from the same country, they can share an engine. You do not need to have two engines of the same kind. If you use one engine or two, it does not affect the overall processing time of the data flow. However, depending on your business rules, you may need to use two separate engines, even if the data is from the same country.

You may also have to define the settings in the engine differently for a billing address or for a shipping address. For example, in the Options group of the EMEA engine, the Output Address Language option can convert the data used in each record to the official country language, or it can preserve the language used in each record. You may want to convert the data for the shipping address but preserve the data for the billing address.

Cleanse your address data transactionally
The Global Suggestion Lists transform, best used in transactional projects, is a way to complete and populate addresses with minimal data, or it can offer suggestions for possible matches. This easy address-entry system is ideal in call center environments or any transactional environment where cleansing is necessary at the point of entry. It's also a beneficial research tool when you need to manage bad addresses from a previous batch process.

For example, the Marshall Islands and the Federated States of Micronesia were recently removed from the USA Address directory. Therefore, if you previously used the USA engine, you'll now have to use the Global Address engine. The Global Suggestion Lists transform can help identify that these countries are no longer in the USA Address directory.

Place the Global Suggestion Lists transform after the Country ID transform and before a Global Address Cleanse transform that uses an EMEA, Australia, Canada, and/or USA engine.

Integrating functionality
Global Suggestion Lists functionality is designed to be integrated into your own custom applications via the Web Service. If you are a programmer looking for details about how to integrate this functionality, see "Integrate Global Suggestion Lists" in the Integrator's Guide.

Start with a sample transform configuration
Data Quality includes a Global Suggestion Lists sample transform that can help you when setting up a project.

Identify the country of destination
The Global Address Cleanse transform includes Country ID processing. In the Country ID Options option group of the Global Address Cleanse transform, you can define the country of destination or define whether you want to run Country ID processing. Therefore, you do not need to place a Country ID transform before the Global Address Cleanse transform in your data flow.

Constant country
If all of your data is from one country, such as Australia, you do not need to run Country ID processing or input a discrete country field. You can tell the Global Address Cleanse transform the country, and it will assume all records are from this country (which may save processing time).

Assign default
You'll want to run Country ID processing if you are using two or more of the engines and your input addresses contain country data (such as the two-character ISO code or a country name), or if you are using only one engine and your input source contains many addresses that cannot be processed by that engine. Addresses that cannot be processed are not sent to the engine. The transform will use the country you specify in this option group as a default.

Related Topics
• To set a constant country
• Set a default country

Set up the reference files
The USA Regulatory Address Cleanse transform and the Global Address Cleanse transform and engines rely on directories (reference files) in order to cleanse your data.

Directories
To correct addresses and assign codes, the address cleanse transforms rely on databases called postal directories. The process is similar to the way that you use the telephone directory. A telephone directory is a large table in which you look up something you know (a person's name) and read off something you don't know (the phone number). In the process of looking up the name in the phone book, you may discover that the name is spelled a little differently from the way you thought. Similarly, the address cleanse transform looks up street and city names in the postal directories, and it corrects misspelled street and city names and other errors.

Sometimes it doesn't work out. We've all had the experience of looking up someone and being unable to find their listing. Maybe you find several people with a similar name, but you don't have enough information to tell which listing was the person you wanted to contact. This type of problem can prevent the address cleanse transforms from fully correcting and assigning an address.

Besides the basic address directories, there are many specialized directories that the USA Regulatory Address Cleanse transform uses:
• Delivery Point Validation (DPV)
• Early Warning System (EWS)
• eLOT
• GeoCensus
• LACSLink
• NCOALink
• RDI
• SuiteLink
• Z4Change
These features help extend address cleansing beyond the basic parsing and standardizing.

Define directory file locations
You must tell the transform or engine where your directory (reference) files are located in the Reference Files option group. Your system administrator should have already installed these files to the appropriate locations based on your company's needs.

Caution: Incompatible or out-of-date directories can render the software unusable. The system administrator must install weekly, monthly, or bimonthly directory updates for the USA Regulatory Address Cleanse transform; monthly directory updates for the Australia and Canada engines; and quarterly directory updates for the EMEA, Global Address, and Japan engines to ensure that they are compatible with the current software.

Substitution files
If you start with a sample transform, the Reference Files options are filled in with a substitution variable (such as $$RefFilesAddressCleanse) by default. These substitution variables point to the reference data folder of the software directory by default. You can change that location by editing the substitution file associated with the data flow. This change is made for every data flow that uses that substitution file.
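For example, an entry in the substitution file might map the variable to a concrete directory. The path below is illustrative only and depends on where your administrator installed the reference data:

    $$RefFilesAddressCleanse = D:\DataServices\DataQuality\reference_data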

For example. USA Regulatory Address Cleanse transform If you use the USA Regulatory Address Cleanse transform.Data Quality Address Cleanse 18 For example. Related Topics • Reference Guide: Transforms. the following address was standardized for capitalization. Input Output Multiline1 = route 1 box 44a Multiline2 = stodard wisc Address_Line = RR 1 BOX 44A Locality1 = STODDARD Region1 = WI Postcode1 = 54658 Global Address Cleanse transform In the Global Address Cleanse transform. punctuation. you set the standardization options on the Options tab in the Standardization Options section. SAP BusinessObjects Data Services Designer Guide 461 . You can standardize addresses for all countries and/or for individual countries (depending on your data). USA Regulatory Address Cleanse transform options (Standardization Options) Beyond the basics The software offers many additional address cleanse features. and another set of Global standardization options that standardize all other addresses. These features help extend address cleansing beyond the basic parsing and standardizing. you can have one set of French standardization options that standardize addresses within France only. and postal phrase (route to RR). you set the standardization options in the Standardization Options option group.

Delivery Point Validation (DPV)
DPV is a technology developed to assist you in validating the accuracy of your address information with the USA Regulatory Address Cleanse transform. With DPV, you can identify addresses that are undeliverable as addressed and whether an address is a Commercial Mail Receiving Agency (CMRA). DPV is available for U.S. data in the USA Regulatory Address Cleanse transform only. DPV processing is required for CASS certification.

DPV can be useful in the following areas:
• Mailing: DPV helps to screen out undeliverable-as-addressed (UAA) mail and helps to reduce mailing costs.
• Information quality: DPV's ability to verify an address down to the individual house, suite, or apartment, rather than block face, increases the data's level of accuracy.
• Increased assignment rate: Using DPV tiebreak mode to resolve a tie when other tie-breaking methods are not conclusive may increase assignment rates.
• Preventing mail-order fraud: DPV can assist merchants by verifying valid delivery addresses and Commercial Mail Receiving Agencies (CMRA). This can eliminate shipping of merchandise to individuals who place fraudulent orders.

Related Topics
• About DPV processing
• To enable DPV
• DPV security
• To retrieve the DPV unlock code
• To notify the USPS of DPV locking addresses

About DPV processing
Performance
Due to the additional time required to perform DPV processing, you may see a change in processing time. Processing time may vary with the DPV feature based on operating system, system configuration, and other variables that may be unique to your operating environment. You can decrease the time required for DPV processing by loading DPV directories into system memory before processing. You can do this by choosing the appropriate options in the Transform Performance option group of the USA Regulatory Address Cleanse transform.

For the amount of disk space required to cache the directories, see the Supported Platforms document available in the SAP BusinessObjects Support > Documentation > Supported Platforms/PARs section of the SAP Service Marketplace: http://service.sap.com/bosap-support.

Memory usage
Performing address correction processing with DPV requires approximately 50 MB of memory. The USPS requires the software to buffer 35 MB of data to read the DPV directories. Make sure that your computer has enough memory available before performing DPV processing.

Directory updates
DPV directories are shipped monthly with the U.S. directories in accordance with USPS guidelines. The directories expire in 105 days and must be of the same month as the Address directory.

This list shows the names of the DPV directories:
• dpva.dir
• dpvb.dir
• dpvc.dir
• dpvd.dir
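These files are referenced from the Reference Files group of the transform. With the default substitution variable in place, the four entries might look like the following; the layout is illustrative, not prescriptive:

    dpva.dir = $$RefFilesAddressCleanse\dpva.dir
    dpvb.dir = $$RefFilesAddressCleanse\dpvb.dir
    dpvc.dir = $$RefFilesAddressCleanse\dpvc.dir
    dpvd.dir = $$RefFilesAddressCleanse\dpvd.dir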

To enable DPV
DPV processing is enabled by default in the USA Regulatory Address Cleanse transform because it is required for CASS certification. These steps tell you how to change the setting.
1. Open the USA Regulatory Address Cleanse transform.
2. On the Options tab, expand Reference Files, and then set the locations for the DPV directories: dpva.dir, dpvb.dir, dpvc.dir, and dpvd.dir. Do not rename any of the files; DPV will not run if the file names are changed.
3. Expand Assignment Options, and then for the Enable DPV option select Yes.
Note: DPV can run only when the locations for all the DPV directories have been entered and none of the DPV directory files have been manually renamed.

DPV security
The USPS has instituted processes that monitor the use of DPV. Each company that purchases the DPV functionality is required to sign a legal agreement stating that it will not attempt to misuse the DPV product. If a user abuses the DPV product, the USPS has the right to prohibit the user from using DPV in the future. The USPS has added security to prevent DPV abuse by including false positive addresses within the DPV directories. Depending on what type of user you are and your license key, the software's behavior varies when it encounters a false positive address. The following table explains the behavior for each user type:

User type: Software behavior (read about)
• Non-NCOALink users of DPV (for example, you use DPV for CASS certification or non-CASS purposes), and NCOALink end users without stop processing alternative (see Note below): DPV processing is locked (disabled). See: DPV locking.
• NCOALink end users with stop processing alternative enabled (see Note below), and NCOALink full or limited service providers: DPV false positive logs are generated and DPV processing continues. See: DPV false positive logs.

Note: If you are a non service provider who uses DPV or LACSLink, and you want to bypass any future DPV or LACSLink directory locks, you can utilize the Alternate Stop Processing functionality. You must obtain the proper permissions from the USPS and provide proof of permission to SAP customer support; customer support will then provide a key code that disables the DPV or LACSLink directory locking. There is no extra charge for this key code.

DPV locking
If you use DPV for CASS certification or for non-CASS purposes, or if you are licensed as an NCOALink end user (without stop processing alternative enabled), the software takes the following actions when it detects a false positive address during processing:
• marks the record as a false positive
• generates a DPV lock code
• notes the false positive address record (lock record) and lock code in the error log
• generates a "US Regulatory Locking Report" containing the false positive address record (lock record) and lock code (the option to generate report data must be enabled in the transform)
• discontinues DPV processing, but continues other processing in the current data flow
When the software discontinues DPV processing, it is known as DPV locking. The transform continues to process the current data flow without DPV processing. Additionally, all subsequent job runs with DPV enabled error out and do not complete. In order to re-enable DPV functionality, you must obtain a DPV unlock code from SAP BusinessObjects Enterprise Support.

DPV false positive logs
For NCOALink service providers and end users with stop processing alternative enabled, the software takes the following actions when it detects a false positive address during processing:
• marks the record as a false positive
• generates a DPV log file containing the false positive address
• notes the path to the DPV log files in the error log
• generates a "US Regulatory Locking Report" containing the path to the DPV log files (the option to generate report data must be enabled in the transform)
• continues DPV processing without interruption
Additionally, you are required to notify the USPS that a false positive address was detected.

The software names DPV false positive logs dpvl####.log, where #### is a number between 0001 and 9999. For example, the first log file generated is dpvl0001.log, the next one is dpvl0002.log, and so on. During a job run, if the software encounters only one false positive record, one log will be generated. However, if it encounters more than one false positive record and the records are processed on different threads, then the software will generate one log for each thread that processes a false positive record.

Note: When you have set the data flow degree of parallelism to greater than 1, the software generates one log per thread.

The software stores DPV log files in the directory specified for the USPS Log Path in the USA Regulatory Address Cleanse transform. Before releasing the mailing list that contains the false positive address, you are required to send the DPV log files containing the false positive addresses to the USPS.

Related Topics
• To notify the USPS about LACSLink locking addresses
• To retrieve the DPV unlock code

To retrieve the DPV unlock code
1. Go to the SAP Service Market Place (SMP) at http://service.sap.com/message and log a message using the component BOJ-EIM-DS.
2. Attach the dpvx.txt file to your message and the log file named dpvl####.log. The dpvx.txt file is located in the DPV directory referenced in the job. The log file is located in the directory specified for the USPS Log Path option in the USA Regulatory Address Cleanse transform.
Note: If your files cannot be attached to the original message, include the unlock information in the message instead.
3. SAP Support sends you an unlock file named dpvw.txt. Replace the existing dpvw.txt file with the new file.
4. Open your database and remove the record that is causing the lock.
Note: Keep in mind that you can only use the unlock code one time. If the software detects another false positive, you will need to retrieve a new DPV unlock code.

Related Topics
• Delivery Point Validation (DPV)
• DPV security

To notify the USPS of DPV locking addresses
Follow these steps only if you have received an alert that DPV false positive addresses are present in your address list and you are either an NCOALink service provider or an NCOALink end user with Stop Processing Alternative enabled.
1. Send an email to the USPS at dsf2stop@usps.gov. Include the following:
• DPV False Positive as the subject line
• attach the dpvl####.log file or files from the job, where #### is a number between 0001 and 9999
The log file is located in the directory specified for the USPS Log Path option in the USA Regulatory Address Cleanse transform.
Note: When you have set the data flow degree of parallelism to greater than 1, the software generates one log per thread. During a job run, if the software encounters only one false positive record, one log will be generated. However, if it encounters more than one false positive record and the records are processed on different threads, then the software will generate one log for each thread that processes a false positive record.
2. Remove the record that caused the lock from the database.
3. After the USPS has released the list that contained the locked or false positive record, the corresponding log files should be deleted.

Related Topics
• DPV security

Early Warning System (EWS)
EWS helps reduce the amount of misdirected mail caused when valid delivery points are created between national directory updates. EWS is available for U.S. records in the USA Regulatory Address Cleanse transform only.

Start with a sample transform configuration
If you want to use the USA Regulatory Address Cleanse transform with the EWS option turned on, it is best to start with the sample transform configuration for EWS processing, named USARegulatoryEWS_AddressCleanse.

What is EWS?
The EWS feature is the solution to the problem of misdirected mail caused by valid delivery points that appear between national directory updates. For example, suppose that 300 Main Street is a valid address and that 300 Main Avenue does not exist. A mail piece addressed to 300 Main Avenue is assigned to 300 Main Street on the assumption that the sender is mistaken about the correct suffix.

Now consider that construction is completed on a house at 300 Main Avenue. The new owner signs up for utilities and mail, but it may take a couple of months before the delivery point is listed in the national directory. All the mail intended for the new house at 300 Main Avenue will be mis-directed to 300 Main Street until the delivery point is added to the national directory. The EWS feature solves this problem by using an additional directory which informs CASS users of the existence of 300 Main Avenue long before it appears in the national directory. When using EWS processing, the previously mis-directed address now defaults to a 5-digit assignment.

EWS directories
The EWS directory contains four months of rolling data. Each week, the USPS adds new data and drops a week's worth of old data. The USPS then publishes the latest EWS data. Each Friday, SAP BusinessObjects converts the data to our format (EWyymmdd.zip) and posts it on the Technical Customer Assurance site at https://service.sap.com/bosap-downloads-usps.
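For example, following this naming pattern, the file posted on Friday, October 23, 2009 would be named EW091023.zip (the date is chosen only for illustration).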

Example of an EWS Match
If the USA Regulatory Address Cleanse transform cannot make an exact match within the Address directory, but there is an entry in the EWS directory, the record is considered an EWS match. The transform returns a value of "T" (the address is located in the EWS directory and is an EWS match) within the EWS_Match output field.

Input record:
Address: 472 37TH ST
Lastline: PENNSAUKEN NJ 08110

Because the directory record has a pre-directional and the input record does not have a pre-directional, this example is considered an inexact match in the Address directory.

Address directory listing: ZIP 08110; pre-directional N; street name 37TH; suffix ST; primary range 472 through 499; ZIP+4 3108 (even side) and 3417 (odd side).

EWS directory listing: ZIP 08110; street name 37TH; suffix ST; no pre-directional and no post-directional.

Output record:
Address: 472 37TH ST
Lastline: PENNSAUKEN NJ 08110
EWS match: T
Fault code: E439 (exact match made in EWS directory)

To enable EWS
EWS is already enabled when you use the software's EWS sample transform, named USARegulatoryEWS_AddressCleanse. These steps show how to manually set EWS.

1. Open the USARegulatoryEWS_AddressCleanse transform.
2. On the Options tab, expand Assignment Options.
3. For the Enable EWS option, select Enable.

Related Topics
• Early Warning System (EWS)

eLOT (USA Regulatory Address Cleanse)
Enhanced Line of Travel (eLOT) is available for U.S. records in the USA Regulatory Address Cleanse transform only. The original line of travel (LOT) narrowed the mail carrier's delivery route to the block face level (ZIP+4 level) by discerning whether an address resided on the odd or even side of a street or thoroughfare. Enhanced Line of Travel (eLOT) takes line of travel one step further: it narrows the mail carrier's delivery route walk sequence to the house (delivery point) level. This allows you to sort your mailings to a more precise level.

Related Topics
• To enable eLOT
• Set up the reference files

To enable eLOT
1. Open the USA Regulatory Address Cleanse transform.
2. On the Options tab, expand Assignment Options, and then for the Enable eLOT option select Yes.
3. In the Reference Files option, set the path for your eLOT directory. If you installed the eLOT directory to the default location, the path for the default location is shown in the Reference Files option.

GeoCensus (USA Regulatory Address Cleanse)
The GeoCensus option of the USA Regulatory Address Cleanse transform offers geographic and census coding for enhanced sales and marketing analysis. It is available for U.S. records only.

Get the most from the GeoCensus data
You can combine the GeoCensus option with the functionality of mapping applications to view your geo-enhanced information on maps, charts, and graphs. In this way you can view sales, marketing, and demographic data visually, and GeoCensus helps your organization build its sales and marketing strategies. Here are some of the ways you can use the GeoCensus data, with or without mapping products:
• Market analysis: You can use mapping applications to analyze market penetration. Companies striving to gain a clearer understanding of their markets employ market analysis. You will understand both where your customers are and the penetration you have achieved in your chosen markets.
• Predictive modeling and target marketing: You can more accurately target your customers for direct response campaigns using geographic selections. Predictive modeling or other analytical techniques allow you to identify the characteristics of your "ideal" customer. This method incorporates demographic information used to enrich your customer database. The result is a more finely targeted marketing program. From this analysis, it is possible to identify the best prospects for mailing or telemarketing programs.
• Media planning: For better support of your advertising decisions, you may want to employ media planning. Coupling a visual display of key markets with a view of media outlets can help your organization make more strategic use of your advertising dollars.
• Territory management: GeoCensus data can provide a more accurate market picture for your organization. It can help you distribute territories and sales quotas more equitably.
• Direct sales: Using GeoCensus data with market analysis tools and a mapping application, you can track sales leads gathered from marketing activities.

Sample transform configuration
To process with the GeoCensus feature in the USA Regulatory Address Cleanse transform, it is best to start with the sample transform configuration created for GeoCensus, USARegulatoryGeo_AddressCleanse. Find the sample configuration under USA_Regulatory_Address_Cleanse in the Object Library.

GeoCensus Mode options
To activate GeoCensus in the transform, you need to choose a GeoCensus Mode. The options are Address, Centroid, Both, or None:
• Address: Processes Address-level GeoCensus only.
• Centroid: Processes Centroid-level GeoCensus only.
• Both: Attempts to make an Address-level GeoCensus assignment first. If no assignment is made, it attempts to make a Centroid-level GeoCensus assignment.
• None: Turns off GeoCensus processing.

GeoCensus output fields
You must include at least one of the following generated output fields in the USA Regulatory Address Cleanse transform if you plan to use the GeoCensus option:
• AGeo_CountyCode
• AGeo_Latitude
• AGeo_Longitude
• AGeo_MCDCode
• AGeo_PlaceCode
• AGeo_SectionCode
• AGeo_StateCode
• CGeo_BSACode
• CGeo_Latitude
• CGeo_Longitude
• CGeo_Metrocode
• CGeo_SectionCode
Find descriptions of these fields in the Reference Guide.
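For instance, with the GeoCensus Mode set to Address, an assigned record might populate the address-level coordinate fields with values of this shape (the numbers below are fabricated for illustration, not real assignments):

    AGeo_Latitude = 43.662513
    AGeo_Longitude = -91.219245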

Related Topics
• How GeoCensus works
• GeoCensus directories
• To enable GeoCensus coding

How GeoCensus works
By using GeoCensus, the USA Regulatory Address Cleanse transform can append latitude, longitude, and Census codes, such as Census Tract/Block and Metropolitan Statistical Area (MSA), to your records, based on ZIP+4 codes. MSA is an aggregation of counties into Metropolitan Statistical Areas assigned by the US Office of Management and Budget. You can apply the GeoCensus codes during address standardization and ZIP+4 assignment for simple, "one-pass" processing.

The transform cannot, by itself, append demographic data to your records. The transform lays the foundation by giving you census coordinates via output fields. To append demographic information, you need a demographic database from another vendor. When you obtain one, we suggest that you use the matching process to match your records to the demographic database and transfer the demographic information into your records. (You would use the MSA and Census block/tract information as match criteria, then use the Best Record transform to post income and other information.)

Likewise, the transform does not draw maps. However, you can use the latitude and longitude assigned by the transform as input to third-party mapping applications. Those applications enable you to plot the locations of your customers and filter your database to cover a particular geographic area.

GeoCensus directories
The path and file names for the following directories must be defined in the Reference Files option group of the USA Regulatory Address Cleanse transform before you can begin GeoCensus processing. If you use the sample transform configuration for GeoCensus, these locations will contain the substitution variable path.

Directory name: Description
• ageo1-10: Address-level GeoCensus directories, required if you choose Address for the GeoCensus Mode under the Assignment Options group.
• cgeo2.dir: Centroid-level GeoCensus directory, required if you choose Centroid for the GeoCensus Mode under the Assignment Options group.

To enable GeoCensus coding
If you use a copy of the GeoCensus transform configuration file in your data flow, GeoCensus is already enabled. However, if you are starting from a USA Regulatory Address Cleanse transform, make sure you define the directory location and define the Geo Mode option.
1. Open the USA Regulatory Address Cleanse transform.
2. On the Options tab, expand the Reference Files group.
3. Set the locations for the cgeo.dir and ageo1-10.dir directories, based on the geo mode you choose.
4. Expand Assignment Options, and for the Geo Mode option select either Address, Centroid, or Both. If you select None, the transform will not perform GeoCensus processing.

Related Topics
• GeoCensus (USA Regulatory Address Cleanse)

LACSLink (USA Regulatory Address Cleanse)
Locatable Address Conversion System (LACSLink) is a USPS product that is available for U.S. records with the USA Regulatory Address Cleanse transform only. LACSLink processing is required for CASS certification.

LACSLink updates addresses when the physical address does not move but the address has changed, for example, when the municipality changes rural route addresses to street-name addresses. Rural route conversions make it easier for police, fire, ambulance, and postal personnel to locate a rural address.

LACSLink also converts addresses when streets are renamed or post office boxes renumbered. To obtain the new addresses, you must already have the old address data. LACSLink technology ensures that the data remains private and secure, and at the same time gives you easy access to the data.

Related Topics
• How LACSLink works
• To control memory usage for LACSLink processing
• To disable LACSLink
• LACSLink security

How LACSLink works
LACSLink provides a new address when one is available. LACSLink is an integrated part of address processing; it is not an extra step. The transform does not process all of your addresses with LACSLink when it is enabled. Here are the conditions under which your data is passed into LACSLink processing:
• The address is found in the address directory, and it is flagged as a LACS-convertible record within the address directory.
• The address is found in the address directory, and, even though a rural route or highway contract default assignment was made, the record wasn't flagged as LACS convertible.
• The address is not found in the address directory, but the record contains enough information to be sent into LACSLink.
LACSLink follows these steps when processing an address:
1. The USA Regulatory Address Cleanse transform standardizes the input address.
2. The transform looks for a matching address in the LACSLink data.
3. If a match is found, the transform outputs the LACSLink-converted address and other LACSLink information.
For example, the following table shows an address that was found in the address directory as a LACS-convertible address.

Original address: RR2 BOX 204, DU BOIS PA 15801
After LACSLink conversion: 463 SHOWERS RD, DU BOIS PA 15801-6675

Sample transform configuration
LACSLink processing is enabled by default in the sample transform configuration because it is required for CASS certification. The sample transform configuration is USARegulatory_AddressCleanse and is found under the USA_Regulatory_Address_Cleanse group in the Object Library.

Related Topics
• To control memory usage for LACSLink processing
• LACSLink (USA Regulatory Address Cleanse)

To control memory usage for LACSLink processing
The transform performance improves considerably if you cache the LACSLink directories. For the amount of disk space required to cache the directories, see the Supported Platforms document available in the SAP BusinessObjects Support > Documentation > Supported Platforms/PARs section of the SAP Service Marketplace: http://service.sap.com/bosap-support. If you do not have adequate system memory to load the LACSLink directories and the Insufficient_Cache_Memory_Action is set to Error, a verification error message is displayed at run-time and the transform terminates. If the Continue option is chosen, the transform attempts to continue LACSLink processing without caching.

Follow these steps to load the LACSLink directories into your system memory:
1. Open the Options tab of your USA Regulatory Address Cleanse transform configuration in your data flow.
2. Open the Transform Performance option group.
3. Select Yes for the Cache LACSLink Directories option.

Related Topics
• LACSLink (USA Regulatory Address Cleanse)

To disable LACSLink
LACSLink is enabled by default in the USA Regulatory Address Cleanse transform configuration because it is required for CASS certification. Therefore, you must disable CASS certification in order to disable LACSLink.
1. In your USA Regulatory Address Cleanse transform configuration, open the Options tab.
2. Open the Non Certified Options group.
3. Select Yes for the Disable Certification option.
4. Open the Assignment Options group.
5. Select No for the Enable LACSLink option.

Related Topics
• LACSLink (USA Regulatory Address Cleanse)

LACSLink security
The USPS has instituted processes that monitor the use of LACSLink. Each company that purchases the LACSLink functionality is required to sign a legal agreement stating that it will not attempt to misuse the LACSLink product. If a user abuses the LACSLink product, the USPS has the right to prohibit the user from using LACSLink in the future. The USPS has added security to prevent LACSLink abuse by including false positive addresses within the LACSLink directories. Depending on what type of user you are and your license key, the software's behavior varies when it encounters a false positive address. The following table explains the behavior for each user type:

User type: Software behavior (read about)
• Non-NCOALink users of LACSLink (for example, you use LACSLink for CASS certification or non-CASS purposes), and NCOALink end users without stop processing alternative (see Note below): LACSLink processing is locked (disabled). See: LACSLink locking.
• NCOALink end users with stop processing alternative enabled (see Note below), and NCOALink full or limited service providers: LACSLink false positive logs are generated and LACSLink processing continues. See: LACSLink false positive logs.

Note: If you are a non service provider who uses DPV or LACSLink, and you want to bypass any future DPV or LACSLink directory locks, you can utilize the Alternate Stop Processing functionality. You must obtain the proper permissions from the USPS and provide proof of permission to SAP customer support; customer support will then provide a key code that disables the DPV or LACSLink directory locking. There is no extra charge for this key code.

LACSLink locking
If you use LACSLink for CASS certification or for non-CASS purposes, or if you are licensed as an NCOALink end user (without stop processing alternative enabled), the software takes the following actions when it detects a false positive address during processing:
• marks the record as a false positive
• generates a LACSLink lock code
• notes the false positive address record (lock record) and lock code in the error log
• generates a "US Regulatory Locking Report" containing the false positive address record (lock record) and lock code (the option to generate report data must be enabled in the transform)
• discontinues LACSLink processing, but continues other processing in the current data flow

When the software discontinues LACSLink processing, it is known as LACSLink locking. The transform continues to process the current data flow without LACSLink processing. However, all subsequent job runs with LACSLink enabled error out and do not complete. In order to re-enable LACSLink functionality, you must obtain a LACSLink unlock code from SAP BusinessObjects Enterprise Support.

LACSLink false positive logs

For NCOALink service providers and end users with stop processing alternative enabled, the software takes the following actions when it detects a false positive address during processing:
• marks the record as a false positive
• generates a LACSLink log file containing the false positive address
• notes the path to the LACSLink log files in the error log
• generates a "US Regulatory Locking Report" containing the path to the LACSLink log files
• continues LACSLink processing without interruption (however, before releasing the mailing list that contains the false positive address, you are required to notify the USPS that a false positive address was detected; additionally, you are required to send the LACSLink log files containing the false positive addresses to the USPS)

The software stores LACSLink log files in the directory specified for the "USPS Log Path" in the USA Regulatory Address Cleanse transform. The software names LACSLink false positive logs lacsl###.log, where ### is a number between 001 and 999. For example, the first log file generated is lacsl001.log, the next one is lacsl002.log, and so on.

Note: When you have set the data flow degree of parallelism to greater than 1, the software generates one log per thread. During a job run, if the software encounters only one false positive record, one log will be generated. However, if it encounters more than one false positive record and the records are processed on different threads, then the software will generate one log for each thread that processes a false positive record.

Related Topics
• To retrieve the LACSLink unlock code
• To notify the USPS about LACSLink locking addresses
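Collecting these logs is easier if you lean on the naming scheme just described. The following is a minimal Python sketch (not part of the product) that lists the lacsl###.log files in a hypothetical USPS Log Path directory; adjust the placeholder path to your own configuration. It requires Python 3.8 or later.

    import re
    from pathlib import Path

    # Placeholder path: use the directory configured as "USPS Log Path" in
    # the USA Regulatory Address Cleanse transform.
    log_dir = Path(r"C:\dataservices\usps_logs")

    # False positive logs are named lacsl###.log, where ### is 001-999.
    pattern = re.compile(r"^lacsl(\d{3})\.log$", re.IGNORECASE)

    logs = sorted(
        (int(match.group(1)), path)
        for path in log_dir.iterdir()
        if (match := pattern.match(path.name))
    )
    for number, path in logs:
        print(f"false positive log {number:03d}: {path}")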

To retrieve the LACSLink unlock code

1. Go to the SAP Service Market Place (SMP) at http://service.sap.com/message and log a message using the component BOJ-EIM-DS.
2. Attach the lacsx.txt file to your message and the log file named lacsl###.log, where ### is a number between 001 and 999. The lacsx.txt file is located in the LACSLink directory referenced in the job. The log file is located in the directory specified for the USPS Log Path option in the USA Regulatory Address Cleanse transform.
3. SAP Support sends you an unlock file named lacsw.txt. Replace the existing lacsw.txt file with the new file.
4. Open your database and remove the record causing the lock.

Note: Keep in mind that you can only use the unlock code one time. If the software detects another false positive, you will need to retrieve a new LACSLink unlock code. Be sure to remove the record that is causing the lock from the database.

Related Topics
• LACSLink (USA Regulatory Address Cleanse)
• LACSLink security

To notify the USPS about LACSLink locking addresses

Follow these steps only if you have received an alert that LACSLink false positive addresses are present in your address list and you are either an NCOALink service provider or an NCOALink end user with Stop Processing Alternative enabled.

1. Send an email to the USPS at dsf2stop@usps.gov. Include the following:
• LACSLink False Positive as the subject line
• the lacsl###.log file or files from the job, attached to your message. The log file is located in the directory specified for the USPS Log Path option in the USA Regulatory Address Cleanse transform.

Note: If your files cannot be attached to the original message, include the unlock information in the message instead.

Note: When you have set the data flow degree of parallelism to greater than 1, the software generates one log per thread. During a job run, if the software encounters only one false positive record, one log will be generated. However, if it encounters more than one false positive record and the records are processed on different threads, then the software will generate one log for each thread that processes a false positive record.

2. After the USPS has released the list that contained the locked or false positive record, delete the corresponding log files.

Related Topics
• LACSLink security

NCOALink (USA Regulatory Address Cleanse)

Move Update is the process of ensuring that your mailing list contains up-to-date address information. The USPS requires that your mailing list has been through move update processing in order for it to qualify for the discounted rates available for First-Class presorted mailings. You can meet this requirement through the NCOALink process.

The NCOALink process consists of the following components:
• The USPS NCOALink product, which is a data-only product containing change of address data filed by postal customers when they move. You must be certified by the USPS as an NCOALink licensee to receive and use the NCOALink product data sets.
• Mover ID, which is a USPS-certified licensed option available in the USA Regulatory Address Cleanse transform. Mover ID checks your mailing list against the change of address data provided in the USPS NCOALink product and determines which addresses have changed.

Mover ID and NCOALink data is available only for U.S. records.

How Mover ID works with NCOALink

When processing addresses, Mover ID follows these steps:
1. Mover ID requires parsed, standardized address data as input; the USA Regulatory Address Cleanse transform standardizes the input addresses.
2. Mover ID searches the NCOALink database for records that match your parsed, standardized records.

3. If a match is found, Data Services receives the move information, including the new address, if one is available.
4. Data Services looks up move records that come back from the NCOALink database to assign postal and other codes.
5. Depending on your field class selection, the Data Services output file contains:
• only the original address (CORRECT)
• only the move-updated address, if one exists (MOVE-UPDATED)
• the move-updated address if one exists, otherwise the original address (BEST)
Based on the Apply Move to Standardized Fields option, standardized components can contain either original or move-updated addresses.
6. Data Services produces the reports and log files required for USPS compliance.
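The three field classes in step 5 amount to a simple selection rule. The following minimal sketch illustrates that rule only; the function name, and the use of None to represent a missing move, are illustrative conventions, not part of the transform.

    def select_output(field_class, original, move_updated=None):
        """Illustrates the CORRECT / MOVE-UPDATED / BEST selection rule.

        move_updated is None when NCOALink found no move for the record.
        """
        if field_class == "CORRECT":
            return original                  # always the original address
        if field_class == "MOVE-UPDATED":
            return move_updated              # blank when no move exists
        if field_class == "BEST":
            return move_updated or original  # the move if available
        raise ValueError(f"unknown field class: {field_class}")

    # A record with a forwardable move:
    print(select_output("BEST", "RR2 BOX 204", "463 SHOWERS RD"))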

Support for NCOALink provider levels

Mover ID supports three NCOALink provider levels defined by the USPS. Options vary by provider level and are activated by your software product activation keycode. The following table shows the provider levels and support:

Provider level | Provide service to third parties | COA data (months) | Data received from USPS | Support for ANKLink
Full Service Provider (FSP) | Yes; third party services must be at least 51% of all processing | 48 | weekly | no (no benefit)
Limited Service Provider (LSP) | Yes; LSPs can both provide services to third parties and use the product internally | 18 | weekly | yes
End User | No | 18 | monthly | yes

Note: If you are an NCOALink end user, you may complete a Stop Processing Alternative application and enter into an agreement with the USPS. After you are approved by the USPS, you may purchase the software's stop processing alternative functionality, which allows DPV and LACSLink processing to continue after a false positive address record is detected.

About ANKLink

NCOALink limited service providers and end users receive change of address data for the preceding 18 months. The ANKLink option enhances that information by providing additional data about moves that occurred in the previous months 19 through 48.

The ANKLink additional 30 months of data indicates only that a move occurred and the date of the move; the new address is not provided. The ANKLink data helps you make informed choices regarding a contact. If the data indicates that the contact has moved, you can choose to suppress that contact from the list or try to acquire the new address from an NCOALink full service provider.

If you choose to purchase ANKLink to extend NCOALink information, the DVD you receive from the USPS will contain both the NCOALink 18-month full COA information and the additional 30 months of ANKLink information indicating that a move has occurred.

ANKLink support is enabled by a keycode. If an ANKLink match exists, it is noted in the ANKLINK_RETURN_CODE output field and in the NCOALink Processing Summary report. The configuration options in the USA Regulatory Address Cleanse transform are the same whether you use NCOALink alone or with ANKLink.

Note: If you are an NCOALink full service provider, you receive complete access to the full 48 months of move data (including the new addresses). The ANKLink option will not provide you any additional data.

Related Topics
• Support for NCOALink provider levels

Getting started with Mover ID and the NCOALink product

Before you begin Mover ID processing you need to do the following tasks:
• Complete the USPS certification process to become an NCOALink service provider or end user. For full information and the required forms for each provider type, see the End User Certification Procedures and Service Provider Certification Procedures documents available from http://ribbs.usps.gov/ in the NCOALink section.
• Understand the available Mover ID output strategies and performance optimization options, then configure your job.

Note: If you have questions on the NCOALink documents or on NCOALink certification procedures, contact the USPS National Customer Support Center at 800-589-5766 or go to http://ribbs.usps.gov/.

Related Topics
• Completing NCOALink certification
• Output file strategies
• Improving Mover ID performance
• To enable NCOALink processing

Completing NCOALink certification

You must complete the USPS certification procedure for NCOALink in order to purchase the NCOALink product from the USPS.

During certification you must process files from the USPS to prove that you adhere to the requirements of your license agreement. NCOALink certification has two stages. Stage I is an optional test which includes answers that allow you to troubleshoot and prepare for the Stage II test. The Stage II test does not contain answers and is sent to the USPS for evaluation of the accuracy of your software configuration. The software provides a blueprint to help you set up and run the certification tests.

The following description of the USPS certification procedure contains the software-specific details you need:

1. Complete the USPS NCOALink application and other required forms and return the information to the USPS. After you satisfy the initial application and other requirements, the USPS gives you an authorization code to purchase the Mover ID option.
2. Purchase the Mover ID option. At this point, you also pay the USPS for the NCOALink product. SAP BusinessObjects needs the following information:
• your USPS authorization code (see step 1)
• your provider level (full service provider, limited service provider, or end user)
• your decision whether or not you want to purchase the ANKLink option (for limited service provider or end user only)
After installing the software you are ready to request the NCOALink certification test from the USPS.
3. Submit the Software Product Information form to the USPS and request an NCOALink certification test. The USPS sends you test files to use with the blueprint.
4. After successfully completing the certification tests, the USPS sends you the NCOALink license agreement.

Related Topics
• To set up the certification jobs
• To complete the Step 3 Software Product Information form

To complete the Step 3 Software Product Information form

Use the information in the following table as you complete the Step 3 Software Product Information form for the NCOALink certification process.

Company Name & License Number Company's NCOALink Product Name Mover ID for NCOALink Platform or Operating System NCOALink Software Vendor NCOALink Software Product Name NCOALink Software Product Version Is Software Hardware Dependent? Address Matching ZIP+4 Product Name Your specific information SAP BusinessObjects Americas Mover ID Contact your sales representative No ACE Address Matching ZIP+4 Product VerContact your sales representative sion Open or Closed System DPV® Product Name DPV Product Version LACSLink® Product Name LACSLink Product Version Integrated or Standalone Closed ACE Contact your sales representative ACE Contact your sales representative Integrated Check the box if you purchased the ANKLink option from SAP BusinessObjects. ANKLink Enhancement HASH—FLAT—BOTH SAP BusinessObjects Data Services Designer Guide 487 . Indicate your preference. The license number is the authorization code provided in your USPS approval letter.Data Quality Address Cleanse 18 Field in the Step 3 form You enter this Your specific information. The software provides access to both file formats.

Service Level Option | Check the appropriate box.

Note: The platform ID is the four-character identification number assigned by the USPS.

Related Topics
• Completing NCOALink certification

To set up the certification jobs

The software includes a blueprint to help you with certification testing. Additionally, the blueprint can be used to process a test file provided by the USPS during an audit.

Importing us_ncoalink_stage_certification.atl into the repository adds a new project named "DataQualityCertifications" as well as two jobs, two data flows, and four flat file formats. The naming convention of the objects includes the string "NCOALinkStageI" or "NCOALinkStageII" to indicate the associated certification test.

To set up the jobs required for Stage I or Stage II certification:
1. Open the Designer.
2. Click Tools > Import from file and navigate to $LINK_DIR\BusinessObjects\BusinessObjects Data Services\DataQuality\certification, where $LINK_DIR is the software installation directory.
3. Select the ATL file us_ncoalink_stage_certification.atl and click Open.
4. Click OK at the message warning that you are about to import the ATL.
5. Open the substitution parameter editor (Tools > Substitution Parameter Configurations) and enter values for your company's information in the configuration you will use for the certification jobs.
6. Open the DataQualityCertifications project.
7. Expand the Job_DqBatchUSAReg_NCOALinkStageI job and then the DF_DqBatchUSAReg_NCOALinkStageI data flow.

8. Click the DqUsaNCOALinkStageI_in file to open the Source File Editor. In the "Data File(s)" property group make the following changes:
a. In the Root Directory option, type the path or browse to the directory containing the input file. If you type the path, do not type a backslash (\) or forward slash (/) at the end of the file path.
b. In the File name(s) option, change StageI.in to the name of the Stage file provided by the USPS.
9. Click the DqUsaNCOALinkStageI_out file to open the Target File Editor. In the "Data File(s)" property group make the following changes:
a. In the Root Directory option, type the path or browse to the directory containing the output file. If you type the path, do not type a backslash (\) or forward slash (/) at the end of the file path.
b. (Optional) In the File name(s) option, change StageI.out to fit your company's file naming convention.
10. Click the USARegulatoryNCOALink_AddressCleanse transform to open the Transform Editor and click the Options tab.
11. As necessary, in the Reference Files group, enter the correct path location to any files that reside in a directory other than the directory specified by the $$RefFilesAddressCleanse configuration parameter in the substitution parameter configuration.
12. In the USPS License Information group, do the following:
a. Ensure that the provider level specified in the substitution parameter configuration by the $$USPSProviderLevel is accurate, or specify the appropriate level (Full Service Provider, Limited Service Provider, or End User) in the Provider Level option.
b. Enter a meaningful number in the List ID option.
c. Enter the current date in the List Received Date and List Return Date options.
d. If you are a full service provider or limited service provider, complete the options in the NCOALink > PAF Details group and the NCOALink > Service Provider Options group.
13. Repeat steps 7 through 12 and make similar changes to the Stage II objects found in the DF_DqBatchUSAReg_NCOALinkStageII data flow.

Related Topics
• Reference Guide: USA Regulatory Address Cleanse transform

To run the NCOALink certification jobs

Before you run the NCOALink certification jobs, ensure you have installed the DPV and LACSLink files to the locations you specified during configuration and that the NCOALink DVDs provided by the USPS are available.

Running the Stage I job is optional; the results do not need to be sent to the USPS. However, running the Stage I job can help you ensure that you have configured the software correctly and are prepared to successfully execute the Stage II job.

1. Use the NCOALink DVD Verification utility to install the NCOALink directories provided by the USPS.
2. Download the current version of the USPS daily delete file.
3. Download the Stage I file from http://ribbs.usps.gov/ and unzip it to the location you specified when setting up the certification job.
4. Ensure the input file name in the transform matches the name of the Stage I file from the USPS.
5. Execute the Stage I job and compare the test data with the expected results provided by the USPS in the Stage I input file. As necessary, make modifications to your configuration until you are satisfied with the results of your Stage I test.
6. Download the Stage II file from the location specified by the USPS and unzip it to the location you specified when setting up the certification job. Ensure the input file name in the transform matches the name of the Stage II file from the USPS. As necessary, update the daily delete file to the most current version available from the USPS.
7. Execute the Stage II job. Follow the specific instructions in the "NCOALink Certification/Audit Instructions" document that the USPS should have provided to you.

Note: If you have any questions about your output, contact the USPS National Customer Support Center at 800-589-5766.

Send the following results to the USPS for verification:

• Stage II output file
• NCOALink Processing Summary Report
• CASS 3553 Report
• All log files generated in the $$CertificationLog path:
  • Customer Service Log
  • PAF (Service Providers only)
  • Broker/Agent/List Administrator log (Service Providers only)

Related Topics
• To set up the certification jobs
• Management Console Metadata Administrator Guide: Exporting NCOALink certification logs
• To install NCOALink directories with the GUI
• To install NCOALink directories from the command line
• To install the NCOALink daily delete file

About NCOALink directories

After you have completed the certification requirements and purchased the NCOALink product from the USPS, the USPS sends you the latest NCOALink directories monthly (if you're an end user) or weekly (if you're a limited or full service provider). The NCOALink directories are not provided by SAP BusinessObjects.

Note: The NCOALink directories expire within 45 days. The USPS requires that you use the most recent NCOALink directories available for your Mover ID jobs.

If you are a service provider, then each day you run a Mover ID job, you must also download the daily delete file and install it in the same directory where your NCOALink directories are located.

The software provides a utility that installs (transfers and unpacks) the compressed files from the NCOALink DVD onto your system. The utility is available with a GUI (graphical user interface) or from the command line.

Related Topics
• About the NCOALink daily delete file

To install NCOALink directories with the GUI

Prerequisites: Ensure your system meets the following minimum requirements:
• At least 60 GB of available disk space
• DVD drive
• Sufficient RAM (see Improving Mover ID performance)

1. Insert the USPS DVD containing the NCOALink directories into your DVD drive.
2. Run the DVD installer, located at LINK_DIR\bin\ncoadvdver.exe (Windows) or $LINK_DIR/bin/ncoadvdver (UNIX), where LINK_DIR is the path to your software installation directory.

For further installation details, see the online help available within the DVD installation program (choose Help > Contents).

Related Topics
• Improving Mover ID performance
• About NCOALink directories
• To install NCOALink directories from the command line

To install NCOALink directories from the command line

Prerequisites: Ensure your system meets the following minimum requirements:
• At least 60 GB of available disk space
• DVD drive
• Sufficient RAM (see Improving Mover ID performance)

1. Run the DVD installer, located at LINK_DIR\bin\ncoadvdver.exe (Windows) or $LINK_DIR/bin/ncoadvdver (UNIX), where LINK_DIR is the path to your Data Services installation directory.
2. To automate the installation process, use the ncoadvdver command with the following command line options:

Windows | UNIX | Description
/c | -c | Run selected processes in console mode (do not use the GUI).
/p:t | -p:t | Perform transfer. When using this option, you must also specify the DVD location with /d or -d and the transfer location with /t or -t.
/p:u | -p:u | Perform unpack. When using this option, you must also specify the DVD location with /d or -d and the transfer location with /t or -t.
/p:v | -p:v | Perform verification. When using this option, you must also specify the transfer location with /t or -t.
/d | -d | Specify the DVD location.
/t | -t | Specify the transfer location.
/a | -a | Answer all warning messages with Yes.
/nos | -nos | Do not stop on error (return failure code as exit status).

Note: You can combine p options. For example, if you want to transfer, unpack, and verify all in the same process, enter /p:tuv or -p:tuv. After performing the p option specified, the program closes.

Example: Your command line may look something like this:

Windows
ncoadvdver /p:tuv /d D:\ /t C:\pw\dirs\ncoa

UNIX
ncoadvdver [-c] [-a] [-nos] [-p:(t|u|v)] [-d <path>] [-t <filename>]

Related Topics
• About NCOALink directories
• To install NCOALink directories with the GUI

About the NCOALink daily delete file

If you are a service provider, then every day before you perform NCOALink processing, you must download the daily delete file and install it in the same directory where your NCOALink directories are located.

The daily delete file contains records that are pending deletion from the NCOALink data. For example, if Jane Doe filed a change of address with the USPS and then didn't move, Jane's record would be in the daily delete file. Because the change of address is stored in the NCOALink directories, and they are updated only weekly or monthly, the daily delete file is needed in the interim, until the NCOALink directories are updated again.

Note: If you are an end user, you only need the daily delete file for processing Stage I or II files. It is not required for normal Mover ID processing.

Important points to know about the daily delete file:
• The software will fail verification if an NCOALink certification stage test is being performed and the daily delete file is not installed.
• Mover ID supports only the ASCII version of the daily delete file.
• Do not rename the daily delete file. It must be named dailydel.dat.
• The software will issue a verification warning if the daily delete file is more than three days old.

Related Topics
• About NCOALink directories
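Because the software warns when the daily delete file is more than three days old, some sites check the file's age before starting NCOALink jobs. The following is a minimal sketch of such a check; the file path is a placeholder for your own NCOALink directory location.

    import os
    import time

    # Placeholder path: the directory where your NCOALink directories
    # (and dailydel.dat) are installed.
    daily_delete = r"C:\pw\dirs\ncoa\dailydel.dat"

    age_days = (time.time() - os.path.getmtime(daily_delete)) / 86400.0
    if age_days > 3:
        # Mirrors the verification warning described above.
        print(f"dailydel.dat is {age_days:.1f} days old; download the current file")
    else:
        print("dailydel.dat is current")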

To install the NCOALink daily delete file

To download and install the daily delete file:
1. Go to the USPS RIBBS site at http://ribbs.usps.gov/.
2. Click NCOALink on the left side of the page.
3. Click Click here to be redirected to the NCOALink page.
4. Download the dailydel.dat file link and save it to the same location where your NCOALink directories are stored.

Related Topics
• About the NCOALink daily delete file

Output file strategies

You can configure your output file to meet your needs. Depending on your field class selection, components in your output file contain:
• only the original data (CORRECT)
• only the move-updated data, if a move exists; if a move does not exist, the fields are blank (MOVE-UPDATED)
• move-updated data if it exists, or the original data if a move does not exist (BEST)

By default the output option Apply Move to Standardized Fields is set to Yes, and the software updates standardized fields to contain details about the updated address available through NCOALink. If you want to retain the old addresses in the standardized components and append the new ones to the output file, you must change the Apply Move to Standardized Fields option to No. Then you can use output fields such as NCOALINK_RETURN_CODE to determine whether a move occurred.

One way to set up your output file is to replicate the input file format, then append extra fields for move data. In the output records not affected by a move, most of the appended fields will be blank. Alternatively, you can create a second output file specifically for move records. Two approaches are possible:
• Output each record once, placing move records in the second output file and all other records in the main output file.
• Output move records twice: once to the main output file, and a second time to the second output file.

Both of these approaches require that you use an output filter to determine whether a record is a move.

Related Topics
• How Mover ID works with NCOALink
• To enable NCOALink processing
• Reference Guide: USA Regulatory Address Cleanse transform

Improving Mover ID performance

Many factors affect performance when processing NCOALink data. Generally the most critical factor is the volume of disk access that occurs. Often the most effective way to reduce disk access is to have sufficient memory available to cache data. Other critical factors that affect performance include hard drive speed, seek time, and the sustained transfer rate. When the time spent on disk access is minimized, the performance of the CPU becomes significant.

Operating systems and processors

The computation involved in most of the software and Mover ID processing is very well suited to the microprocessors found in most computers, such as those made by Intel and AMD. RISC-style processors like those found in most UNIX systems are generally substantially slower for this type of computation. In fact, a common PC can often run a single job through the software and NCOALink about twice as fast as a common UNIX system. Most UNIX systems have multiple processors and are at their best processing several jobs at once. If you're looking for a cost-effective way of processing single jobs, a Windows server or a fast workstation can produce excellent results.

You should be able to increase the degree of parallelism (DOP) in the data flow properties to maximize the processor or core usage on your system. Increasing the DOP depends on the complexity of the data flow.

Memory

NCOALink processing uses many gigabytes of data. The exact amount depends on your service provider level, the data format, and the specific release of the data from the USPS.

In general, you should obtain as much memory as possible. You may want to go as far as caching the entire NCOALink data set. You should be able to cache the entire NCOALink data set using 20 GB of RAM, with enough memory left for the operating system. If you wish to enable Windows extended memory, ensure that the /pae switch is set in the boot.ini file and that the policy to lock pages in memory is enabled in the Windows group policy console.

Data storage

If at all possible, the hard drive you use for NCOALink data should be fully dedicated to that process, at least while your job is running. Other processes competing for the use of the same physical disk drive can greatly reduce your NCOALink performance. In general, the most significant hard drive feature is the average seek time. To achieve even higher transfer rates you may want to explore the possibility of using a RAID system.

Data format

The software supports both hash and flat file versions of NCOALink data. If you have ample memory to cache the entire hash file data set, that format may provide the best performance. The flat file data is significantly smaller, which means a larger share can be cached in a given amount of RAM. However, accessing the flat file data involves binary searches, which are slightly more time consuming than the direct access used with the hash file format.

Postcode order caching

If your input file is already in postcode order, you may want to enable postcode order caching. Thus, if performance is critical, and especially if you are a Full Service Provider and you frequently run very large jobs with millions of records, this option is worth considering. The postcode order caching option tells the software to cache the portion of the NCOALink data required for similar postcodes. That memory is later reused to cache another portion of the data. When the software accesses NCOALink data directly rather than from a cache, performance is bound by disk access; for those cached addresses you can achieve nearly CPU-bound performance, which may be

in the millions of records per hour. Essentially, enabling this feature puts a finite limit on the amount of disk access that the software may engage in for a single NCOALink job.

Postcode order caching isn't always appropriate. The data that the software caches for each 2-digit ZIP Code may be over 500 megabytes, and if there aren't many addresses with a given 2-digit ZIP Code, the time saved by having that data cached may be less than the time it takes to cache it. When your data is concentrated geographically or your data file is large, you will see the greatest improvements. On the other hand, if your list contains only a few thousand records destined for addresses distributed throughout the country, turning on postcode order caching may actually decrease performance significantly.

Memory usage

The optimal amount of memory depends on a great many factors. The "Auto" option usually does a good job of deciding how much memory to use, but in some cases manually adjusting the amount can be worthwhile.

Performance tips

Many factors can increase or decrease processing speed. Some are within your control and others may be inherent to your business. Consider the following factors:
• Postcode order caching—This feature is only beneficial for large jobs. For a small job, turning on postcode order caching can decrease performance dramatically. The minimum size to get that benefit will depend on your hardware and data.
• Cache size—Using too little memory for NCOALink caching can cause unnecessary random file access and time-consuming hard drive seeks. Using far too much memory can cause large files to be read from the disk into the cache even when only a tiny fraction of the data will ever be used. The amount of cache that works best in your environment may require some testing to see what works best for your configuration and typical job size.
• Directory location—It's best to have NCOALink directories on a local solid state drive or a virtual RAM drive. Using a local solid state drive or virtual RAM drive eliminates all I/O for NCOALink while processing your job. If you have the directories on a hard drive, it's best to use a defragmented

local hard drive. The hard drive should not be accessed for anything other than the NCOALink data while you are running your job.
• Input format—Ideally you should provide the USA Regulatory Address Cleanse transform with discrete fields for the addressee's first, middle, and last name, as well as for the pre-name and post-name. If your input has only a name line, the transform will have to take time to parse it before checking NCOALink data.
• Match rate—The more records you process that have forwardable moves, the slower your processing will be. Retrieving and decoding the new addresses takes time, so updating a mailing list regularly will improve the processing speed on that list.
• File size—Larger files process relatively faster than smaller files. There is overhead when processing any job, but if a job includes millions of records, a few seconds of overhead becomes insignificant.

To enable NCOALink processing

Prerequisites: Access to the following files:
• NCOALink directories
• Current version of the USPS daily delete file
• DPV data files
• LACSLink data files

If you use a copy of the sample transform configuration, USARegulatoryNCOALink_AddressCleanse, NCOALink, DPV, and LACSLink are already enabled.

1. Open the USA Regulatory Address Cleanse transform and click the Options tab.
2. Set values for the options as appropriate for your situation. For more information about the USA Regulatory Address Cleanse transform fields, see the Reference Guide. The table below shows fields that are required only for specific provider levels.

Option group or subgroup | Option name | End user without Stop Processing Alternative | End user with Stop Processing Alternative | Full or limited service provider
USPS License Information | Licensee Name | yes | yes | yes
USPS License Information | List Owner NAICS Code | yes | yes | yes
USPS License Information | List ID | no | no | yes
USPS License Information | Customer Company Name | no | yes | yes
USPS License Information | Customer Company Address | no | yes | yes
USPS License Information | Customer Company Locality | no | yes | yes
USPS License Information | Customer Company Region | no | yes | yes
USPS License Information | Customer Company Postcode1 | no | yes | yes
USPS License Information | Customer Company Postcode2 | no | yes | yes
USPS License Information | Customer Company Phone | no | no | no
USPS License Information | List Processing Frequency | yes | yes | yes
USPS License Information | List Received Date | no | no | yes
USPS License Information | List Return Date | no | no | yes
NCOALink | Provider Level | yes | yes | yes
NCOALink | PAF Details subgroup | no | no | yes (all options are required, except Customer Parent Company Name and Customer Alternate Company Name)
NCOALink | Service Provider Options subgroup | no | no | yes (all options are required, except Buyer Company Name and Postcode for Mail Entry)

Tip: If you are a service provider and need to provide contact details for multiple brokers, expand the NCOALink group, right-click Contact Details and click Duplicate Option. An additional group of contact detail fields will be added below the original group.

Related Topics
• Reference Guide: USA Regulatory Address Cleanse transform
• About NCOALink directories
• About the NCOALink daily delete file
• Output file strategies

NCOALink security requirements

Because of the sensitivity and confidentiality of change-of-address data, the USPS imposes strict security procedures on software vendors and users. Requirements include the following:
• NCOALink licensees must ensure that their data has a minimum of 100 unique records. The software does not verify that you have 100 unique records. It is up to you to verify that each is unique (in other words, you can't input 100 copies of the same record).
• The system where the software and NCOALink directories are installed must be password-protected.
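Because the software does not verify the 100-unique-record minimum, it can be worth checking an input list yourself before processing. The following minimal sketch assumes a simple one-record-per-line text file; adapt the parsing to your actual input format.

    def count_unique_records(path):
        """Counts distinct non-empty lines in a one-record-per-line file."""
        with open(path, encoding="utf-8") as handle:
            unique = {line.rstrip("\n") for line in handle if line.strip()}
        return len(unique)

    # The USPS requires at least 100 unique records per list.
    if count_unique_records("mailing_list.txt") < 100:
        print("List does not meet the USPS 100-unique-record minimum")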

Mover ID and NCOALink log files

The software automatically generates the USPS-required log files and names them according to USPS requirements. The software generates these log files to the repository, where you can export them by using the Management Console. The software creates one log file per client; each log file is then appended with information about every NCOALink job processed that month for that specific client. At the beginning of each month, the software starts new log files. The USPS requires that you save these log files for five years.

The software produces the following move-related log files:
• Customer service log (CSL)
• PAF customer information log
• Broker/Agent/List Administrator log

The following table shows the log files required for each provider level:

Log file | Required for end users | Required for limited or full service providers | Description
Customer service log | yes | yes | Contains one record per list that you process. Each record details the results of change-of-address processing.
PAF customer information log | no | yes | Contains the information that you provided for the PAF. The log file lists each unique PAF entry; if a list is processed with the same PAF information, the information appears just once in the log file. When contact information for the list administrator has changed, information for both the list administrator and the corresponding broker are written to the PAF log file.
Broker/Agent/List Administrator log | no | yes | Contains all of the contact information that you entered for the broker or list administrator. The log file lists information for each broker or list administrator just once. The USPS requires this log file from service providers; the software produces it for every job if you're a certified service provider, even in jobs that do not involve a broker or list administrator.

Related Topics
• Management Console Metadata Reports Guide: NCOALink Processing Summary Report
• Management Console Metadata Administrator Guide: Exporting NCOALink certification logs

Log file names

The software follows the USPS file-naming scheme for the following log files:
• Customer service log
• PAF customer information log
• Broker/Agent/List Administrator log

The table below describes the naming scheme. For example, P1234C07.DAT is a PAF log file generated in December 2007 for a licensee with the ID 1234.

Characters | Meaning
Character 1 | Log type: B = Broker log, C = Customer service log, P = PAF log
Characters 2-5 | Platform ID, exactly four characters long
Character 6 | Month: 1 = January through 9 = September, A = October, B = November, C = December
Characters 7-8 | Year, two characters (for example, 07 for 2007)
Extension | .DAT
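The scheme can be decoded mechanically. The following is a minimal sketch of such a decoder; it assumes the scheme exactly as tabled above and, following the example, treats two-digit years as falling in the 2000s.

    MONTHS = {**{str(i): i for i in range(1, 10)}, "A": 10, "B": 11, "C": 12}
    LOG_TYPES = {"B": "Broker log", "C": "Customer service log", "P": "PAF log"}

    def decode_log_name(name):
        """Decodes a USPS log file name such as P1234C07.DAT."""
        base, extension = name.upper().rsplit(".", 1)
        if extension != "DAT" or len(base) != 8:
            raise ValueError(f"not a USPS log file name: {name}")
        return {
            "log_type": LOG_TYPES[base[0]],  # character 1
            "platform_id": base[1:5],        # characters 2-5
            "month": MONTHS[base[5]],        # character 6
            "year": 2000 + int(base[6:8]),   # characters 7-8 (two-digit year)
        }

    print(decode_log_name("P1234C07.DAT"))
    # {'log_type': 'PAF log', 'platform_id': '1234', 'month': 12, 'year': 2007}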

Related Topics
• Mover ID and NCOALink log files

RDI (USA Regulatory Address Cleanse)

The Residential Delivery Indicator (RDI) feature is available in the USA Regulatory Address Cleanse transform. RDI determines whether a given address is for a residence. You can use RDI if you are processing your data for CASS certification or if you are processing in a non-certified mode. In addition, RDI does not require that you use DPV processing.

Parcel shippers can find RDI information to be very valuable. Why? Some delivery services charge higher rates to deliver to residential addresses. The USPS, on the other hand, does not add surcharges for residential deliveries. According to the USPS, 91 percent of U.S. addresses are residential. The USPS is motivated to encourage the use of RDI by parcel mailers. When you can recognize an address as a residence, you have increased incentive to ship the parcel with the USPS instead of with a competitor that applies a residential surcharge.

Start with a sample transform

If you want to use the RDI feature with the USA Regulatory Address Cleanse transform, it is best to start with the sample transform configuration, USARegulatoryRDI_AddressCleanse. It is located under USA_Regulatory_Address_Cleanse in the Object Library.

RDI directory files

RDI directories can be purchased from the USPS only. You must install them according to USPS instructions. RDI requires the following directories:

File | Description
rts.hs11 | For 11-digit ZIP Code lookups (ZIP+4 plus DPBC). This file is used when an address contains an 11-digit ZIP Code. Determination is based on the delivery point.
rts.hs9 | For 9-digit ZIP Code lookups (ZIP+4). This file is based on a ZIP+4. This is possible only when the addresses for that ZIP+4 are for all residences or for no residences.

To enable RDI

If you use a copy of the "RDI transform configuration" file in your data flow, RDI is already enabled. However, if you are starting from a USA Regulatory Address Cleanse transform, make sure you enable RDI and set the location for the following RDI directories: rts.hs11 and rts.hs9.

1. Open the USA Regulatory Address Cleanse transform.
2. On the Options tab, expand Reference Files, and then in the RDI Path option type the location for the RDI reference files.
3. Expand Assignment Options, and for the Enable RDI option select Yes.

Related Topics
• RDI (USA Regulatory Address Cleanse)
• RDI directory files

Suggestion lists

The suggestion lists feature is available for real-time data flows only and is available in both the Global Address Cleanse transform (for USA and Canadian records) and the USA Regulatory Address Cleanse transform (not for CASS certification). For other countries, there is a Global Suggestion List transform configuration that you can set up in your data flow.

Overview

Normally, when an address cleanse transform looks up an address in the postal directories, it finds exactly one matching record. When the input data is good, the transform should be able to determine exactly one matching record—one combination of city, state, and postal code—in the City and Postcode directories. Then, during the lookup in the Address directory, the transform should find exactly one record that matches the address.

Many times, the transform can do this even when the input data is not complete. In many instances, all the transform needs to assign an address is the right postal code, house number, and some of the street name. Sometimes, because of incomplete information, there may be two or more records (or suggestions) in the postal directories that could possibly be the correct record. Suggestion lists provide you with a list of "matching" addresses, so that you can choose which is the best address.

Example:

Input record:
Multiline1 = 1000 vin
Multiline2 = 54603

Output record:
Primaddr = 1000 Vine Street
City = La Crosse
State = WI
Postcodefull = 54601-3474

Ignore suggestion lists

When a suggestion list is generated, you can ignore it, for example, if you think that the primary address is correct and you want to make an assignment. Use a value of 0 in the appropriate Suggestion_Reply field. A suggestion selection of 0 means to ignore the suggestion. The suggestion list is ignored, and the process is complete.

Unresolved suggestions

If the transform produces a suggestion list that you cannot resolve, you may have to accept that the address cannot be assigned, or can be assigned only at the postal code level (a five-digit match, with USA data, for example), but you have already been given all the data available to you.

Integrating suggestion list functionality

Suggestion list functionality is designed to be integrated into your own custom applications via the Web Service. The suggestion selection is a numeric value that may range from 1 to the number of suggestions available; the transform retrieves your suggestion selection through one of the Suggestion_Reply input fields.
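When integrating suggestion lists through the Web Service, your application ultimately reduces each suggestion list to a single numeric reply. The following minimal sketch illustrates the reply semantics just described; the function and data shapes are illustrative and are not a Data Services API.

    def apply_suggestion_reply(reply, suggestions):
        """Resolves a suggestion list using the reply semantics described above.

        reply == 0 ignores the list; 1..len(suggestions) selects one (1-based).
        """
        if reply == 0:
            return None  # accept that the address may stay unassigned
        if 1 <= reply <= len(suggestions):
            return suggestions[reply - 1]
        raise ValueError(f"reply must be between 0 and {len(suggestions)}")

    suggestions = ["600-699 Losey Blvd North", "600-699 Losey Blvd South"]
    print(apply_suggestion_reply(2, suggestions))  # 600-699 Losey Blvd South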

The transform presents its suggestions, you choose one, and then the transform tries again to assign the address.

Start with a sample transform

If you want to use the suggestion lists feature, it is best to start with one of the sample transforms that is configured for it. Suggestion list sample transforms are available for both the Global Address Cleanse transform and the USA Regulatory Address Cleanse transform (USARegulatorySuggestions_AddressCleanse).

Related Topics
• Choose the best suggestion
• Cleanse your address data transactionally

Choose the best suggestion

Sometimes it's impossible to decide which suggestion is the correct one. The transform may find several directory records that are near matches to the input data, but not quite close enough. When the transform gets close to a match, it assembles a list of the near matches—the suggestions. But if you choose one, the transform can go ahead with the rest of the assignment process.

Example: Incomplete last line

Note: The addresses included in the examples are for informational purposes only.

Given the incomplete last line below, the transform cannot reliably choose one of the four cities.

Input record:
Multiline1 = 1000 vine
Multiline2 = lac wi

Possible matches in the City/ZCF directories:
La Crosse, WI 54601
Lac du Flambeau, WI 54538
Lac Courte Oreilles Indian Reservation, WI 54806
Lac du Flambeau Reservation, WI 54806

Example: Missing directional or suffix

The same can happen with address lines. A common problem is a missing directional. In the example below, there is an equal chance that the directional could be North or South. The transform has no basis for guessing one way or the other.

Input record:
Multiline1 = 615 losey blvd
Multiline2 = 54601

Possible matches in the ZIP+4 directory:
600-699 Losey Blvd North
600-699 Losey Blvd South

A missing suffix would cause the same problem.

Input record:
Multiline1 = 121 dorn
Multiline2 = 54601

Possible matches in the ZIP+4 directory:
100-199 Dorn Place
100-199 Dorn Street

Example: Breaking ties

A badly misspelled street name could also cause a "tie."

Input record:
Multiline1 = 4101 mar
Multiline2 = minneapolis mn

Possible matches in the ZIP+4 directory:
3900-4199 Marschall (ZIP 55379)
4000-4199 Maryland (ZIP 55427)

This is not a guessing game

If you don't know, don't guess. Consider this address, which needs a directional:

Input record:
Multiline1 = 5231 penn ave
Multiline2 = minneapolis mn

Possible matches in the ZIP+4 directory:
5200-5299 Penn Ave North (ZIP 55430)
5200-5299 Penn Ave South (ZIP 55419)

If you were to guess the directional, and guess wrong, then your mail to this customer would go through the wrong post office, about 10 miles away. It might never be delivered; at a minimum, it's going to be badly delayed.

You'll need more information

When the transform produces a suggestion list, you need some basis for selecting one of the possible matches. Perhaps you can come up with some additional or better data. For example, perhaps you are using the transform to capture address data while the customer is still on the phone. Or you might be taking data from a consumer coupon, a little smudged, but if the transform gives you a clue about what information is needed, perhaps you could figure out the address.

SuiteLink (USA Regulatory Address Cleanse)

The USPS SuiteLink product contains suite-specific address information for locations identified as highrise defaults. SuiteLink is a USPS product that is available for U.S. records with the USA Regulatory Address Cleanse transform only. The USA Regulatory Address Cleanse transform uses firm information to make SuiteLink lookups and matches. When the SuiteLink option is enabled, the software provides improved business addressing information by appending the suite number as secondary address information to business addresses determined to be highrise default records.

SuiteLink processing is optional for CASS certification. However, the USPS requires that NCOALink full service providers offer SuiteLink processing as an option to their customers.

To enable SuiteLink

1. Open the USA Regulatory Address Cleanse transform.
2. On the Options tab, expand Assignment Options, and then for the Enable Suitelink option select Yes.
3. In the Reference Files option, set the path for your "Suitelink Path" directory to reflect the location of your SuiteLink directory files.
4. (Optional) In the Transform Performance option group, set the Cache SuiteLink Directories option to Yes. Depending on your circumstances, caching the SuiteLink directories may improve performance.

When processing with SuiteLink functionality, ensure that firm data is mapped into the transform via the input mappings (either FIRM or MULTILINE field mappings).

Z4Change (USA Regulatory Address Cleanse)

The Z4Change option is based on a USPS directory of the same name. The Z4Change option is available in the USA Regulatory Address Cleanse transform only.

Use Z4Change to save time

Using the Z4Change option can save a lot of processing time, compared with running all records through the normal ZIP+4 assignment process.

Z4Change is most cost-effective for databases that are large and fairly stable—for example, databases of regular customers, subscribers, and so on. In our tests, based on files in which five percent of records were affected by a ZIP+4 change, total batch processing time was one third the normal processing time. When you are using the transform interactively—that is, processing one address at a time—there is less benefit from using Z4Change.

USPS rules

Z4Change is to be used only for updating a database that has previously been put through a full validation process. The USPS requires that the mailing list be put through a complete assignment process every three years.

Start with a sample transform

If you want to use the Z4Change feature in the USA Regulatory Address Cleanse transform, it is best to start with the sample transform, USARegulatoryZ4Change_AddressCleanse.

Process Japanese addresses

The Japan engine used with the Global Address Cleanse transform parses Japanese addresses. The primary purpose of this transform and engine is to parse and normalize Japanese addresses for data matching and cleansing applications.

Note: The Japan engine only supports kanji and katakana data. The engine does not support Latin data.

A significant portion of the address parsing capability relies on the Japanese address database. The software has data from the Ministry of Public Management, Home Affairs, Posts and Telecommunications (MPT) and additional data sources. The enhanced address database consists of a regularly updated government database that includes regional postal codes mapped to localities.

Related Topics
• Standard Japanese address format

• Special Japanese address formats
• Sample Japanese address

Standard Japanese address format

A typical Japanese address includes the following components:

Address component | Japanese | English | Output field(s)
Postal code | 〒654-0153 | 654-0153 | Postcode_Full
Prefecture | 兵庫県 | Hyogo-ken | Region1
City | 神戸市 | Kobe-shi | Locality1_Name and Locality1_Description
Ward | 須磨区 | Suma-ku | Locality2_Name and Locality2_Description
District | 南落合 | Minami Ochiai | Locality3_Name
Block number | 1丁目 | 1 chome | Primary_Name_Full1
Sub-block number | 25番地 | 25 banchi | Primary_Name_Full2
House number | 2号 | 2 go | Primary_Number_Full

An address may also include building name, floor number, and room number.

Postal code

Japanese postal codes are in the nnn-nnnn format. The first three digits represent the area. The last four digits represent a location in the area. The possible locations are district, sub-district, block, sub-block, building, floor, and company. Postal codes must be written with Arabic numbers. The post office symbol 〒 is optional.

Before 1998, the postal code consisted of 3 or 5 digits. Some older databases may still reflect the old system.

Prefecture

Prefectures are regions. Japan has forty-seven prefectures. You may omit the prefecture for some well known cities.

City

Japanese city names have the suffix 市 (-shi). In some parts of the Tokyo and Osaka regions, people omit the city name. In some island villages, they use the island name with a suffix 島 (-shima) in place of the city name.

Ward

A city is divided into wards. The ward name has the suffix 区 (-ku). The ward component is omitted for small cities, island villages, and rural areas that don't have wards.

District

A ward is divided into districts. When there is no ward, the small city, island village, or rural area is divided into districts. The district name may have the suffix 町 (-cho/-machi), but it is sometimes omitted. 町 has two possible pronunciations, but only one is correct for a particular district.

In very small villages, people use the village name with suffix 村 (-mura) in place of the district. In some rural areas, they use the county name with suffix 郡 (-gun) in place of the city name. When a village or district is on an island with the same name, the island name is often omitted.
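As an aside on the postal code format described at the start of this section: the modern nnn-nnnn pattern, with its optional post office symbol, is easy to validate. The following sketch is illustrative only and is not part of the Japan engine.

    import re

    # nnn-nnnn with an optional leading post office symbol.
    JP_POSTCODE = re.compile(r"^〒?\d{3}-\d{4}$")

    for candidate in ["〒654-0153", "654-0153", "6540153", "654-153"]:
        print(candidate, bool(JP_POSTCODE.match(candidate)))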

Sub-district

Primarily in rural areas, a district may be divided into sub-districts, marked by the prefix 字 (aza-). A sub-district may be further divided into sub-districts that are marked by the prefix 小字 (koaza-), meaning small aza; koaza may be abbreviated to aza. A sub-district may also be marked by the prefix 大字 (oaza-), which means large aza; oaza may also be abbreviated to aza.

Here are the possible combinations:
• oaza
• aza
• oaza and aza
• aza and koaza
• oaza and koaza

Note: The characters 大字 (oaza-), 字 (aza-), and 小字 (koaza-) are frequently omitted.

Sub-district parcel

A sub-district aza may be divided into numbered sub-district parcels, which are marked by the suffix 部 (-bu), meaning piece. The character 部 is frequently omitted.

Parcels can be numbered in several ways:
• Arabic numbers (1, 2, 3, 4, and so on)
  石川県七尾市松百町8部3番地1号
• Katakana letters in iroha order (イ, ロ, ハ, ニ, and so on)
  石川県小松市里川町ナ部23番地
• Kanji numbers, which is very rare (甲, 乙, 丙, 丁, and so on)
  愛媛県北条市上難波甲部 311 番地

Sub-division

A rural district or sub-district (oaza/aza/koaza) is sometimes divided into sub-divisions, marked by the suffix 地割 (-chiwari), which means division of land. The optional prefix is 第 (dai-).

The following address examples show sub-divisions:
岩手県久慈市旭町10地割1番地
岩手県久慈市旭町第10地割1番地

Block number

A district is divided into blocks. The block number includes the suffix 丁目 (-chome). Districts usually have between 1 and 5 blocks, but they can have more. The block number may be written with a Kanji number. Japanese addresses do not include a street name.

東京都渋谷区道玄坂2丁目25番地12号
東京都渋谷区道玄坂二丁目25番地12号

Sub-block number

A block is divided into sub-blocks. The sub-block name includes the suffix 番地 (-banchi), which means numbered land. The suffix 番地 (-banchi) may be abbreviated to just 番 (-ban).

House number

Each house has a unique house number. The house number includes the suffix 号 (-go), which means number.

Block, sub-block, and house number variations

Block, sub-block, and house number data may vary. Possible variations include the following:

Dashes

The suffix markers 丁目 (chome), 番地 (banchi), and 号 (go) may be replaced with dashes.
東京都文京区湯島2丁目18番地12号
東京都文京区湯島2-18-12

Sometimes block, sub-block, and house number are combined or omitted.
東京都文京区湯島2丁目18番12号
東京都文京区湯島2丁目18番地12
東京都文京区湯島2丁目18-12

No block number

Sometimes the block number is omitted.

東京都文京区湯島2丁目18番地12号
東京都文京区湯島2-18-12
Sometimes block, sub-block, and house number are combined or omitted.
東京都文京区湯島2丁目18番12号
東京都文京区湯島2丁目18番地12
東京都文京区湯島2丁目18-12

No block number
Sometimes the block number is omitted. For example, this ward of Tokyo has numbered districts, and no block numbers are included. 二番町 means district number 2.
東京都 千代田区 二番町 9番地 6号

Building names
Names of apartments or buildings are often included after the house number. When a building name includes the name of the district, the district name is often omitted. When a building is well known, the block, sub-block, and house number are often omitted. When a building name is long, it may be abbreviated or written using its acronym with English letters.
The following are the common suffixes:

Suffix | Romanized | Translation
ビルディング | birudingu | building
ビルヂング | birudingu | building
ビル | biru | building

Suffix | Romanized | Translation
センター | senta- | center
プラザ | puraza | plaza
パーク | pa-ku | park
タワー | tawa- | tower
会館 | kaikan | hall
棟 | tou | building (unit)
庁舎 | chousha | government office building
マンション | manshon | condominium
団地 | danchi | apartment complex
アパート | apa-to | apartment
荘 | sou | villa
住宅 | juutaku | housing
社宅 | shataku | company housing

Suffix | Romanized | Translation
官舎 | kansha | official residence

Building numbers
Room numbers, apartment numbers, and so on, follow the building name. Building numbers may include the suffix 号室 (-goshitsu). Floor numbers above ground level may include the suffix 階 (-kai) or the letter F. Floor numbers below ground level may include the suffix 地下n階 (chika n kai) or the letters BnF (where n represents the floor number). An apartment complex may include multiple buildings called Building A, Building B, and so on, marked by the suffix 棟 (-tou). The following address examples include building numbers.
• Third floor above ground
東京都千代田区二番町9番地6号 バウエプタ3 F
• Second floor below ground
東京都渋谷区道玄坂 2-25-12 シティバンク地下 2 階
• Building A Room 301
兵庫県神戸市須磨区南落合 1-25-10 須磨パークヒルズ A 棟 301 号室
• Building A Room 301
兵庫県神戸市須磨区南落合 1-25-10 須磨パークヒルズ A-301

Special Japanese address formats

Hokkaido regional format
The Hokkaido region has two special address formats:
• super-block
• numbered sub-districts

Super-block
A special super-block format exists only in the Hokkaido prefecture. A super-block, marked by the suffix 条 (-joh), is one level larger than the block. The super-block number or the block number may contain a directional 北 (north), 南 (south), 東 (east), or 西 (west). The following address example shows a super-block 4 Joh.
北海道札幌市西区二十四軒 4 条4丁目13番地7号

Numbered sub-districts
Another Hokkaido regional format is the numbered sub-district. A sub-district name may be marked with the suffix 線 (-sen), meaning number, instead of the suffix 字 (-aza). When a sub-district has a 線 suffix, the block may have the suffix 号 (-go), and the house number has no suffix. The following is an address that contains first the sub-district 4 sen and then a numbered block 5 go.
北海道旭川市西神楽4線5号3番地11

Accepted spelling
Names of cities, districts, and so on can have multiple accepted spellings because there are multiple accepted ways to write certain sounds in Japanese.

Accepted numbering
When the block, sub-block, house number, or district contains a number, the number may be written in Arabic or Kanji. For example, 二番町 means district number 2, and in the following example it is for Niban-cho.
東京都千代田区二番町九番地六号

P.O. Box addresses
P.O. Box addresses contain the postal code, Locality1, prefecture, the name of the post office, the box marker, and the box number.
Note: The Global Address Cleanse transform recognizes P.O. box addresses that are located in the Large Organization Postal Code (LOPC) database only.

The address may be in one of the following formats:
• Prefecture, Locality1, post office name, box marker (私書箱), and P.O. box number.
• Postal code, prefecture, Locality1, post office name, box marker (私書箱), and P.O. box number.
The following address example shows a P.O. Box address for the Osaka Post Office, Box marker #1:
大阪府大阪市大阪支店私書箱1号

Large Organization Postal Code (LOPC) format
The Postal Service may assign a unique postal code to a large organization, such as the customer service department of a major corporation. An organization may have up to two unique postal codes depending on the volume of mail it receives. The address may be in one of the following formats:
• Address, company name
• Postal code, address, company name
The following is an example of an address in a LOPC address format.
100-8798 東京都千代田区霞が関 1 丁目 3-2 郵便ビル 5F 株式会社郵便輸送

Sample Japanese address
This address has been processed by the Global Address Cleanse transform and the Japan engine.
Input
0018521 北海道札幌市北区北十条西1丁目12番地3号創生ビル1階101号室

Address-line fields | Output value
Primary_Name1 | 1
Primary_Type1 | 丁目
Primary_Name2 | 12
Primary_Type2 | 番地
Primary_Number | 3
Primary_Number_Description | 号
Building_Name1 | 創生ビル
Floor_Number | 1
Floor_Description | 階
Unit_Number | 101
Unit_Description | 号室
Primary_Address | 1丁目12番地3号
Secondary_Address | 創生ビル1階101号室

Address-line fields | Output value
Primary_Secondary_Address | 1丁目12番地3号創生ビル1階101号室

Last line fields | Output value
Country |
ISO_Country_Code_3Digit | 392
ISO_Country_Code_2Char | JP
Postcode1 | 001
Postcode2 | 8521
Postcode_Full | 001-8521
Region1 | 北海道
Region1_Description |
Locality1_Name | 札幌市
Locality1_Description |

Last line fields | Output value
Locality2_Name | 北区
Locality2_Description |
Locality3_Name | 北十条西
Lastline | 001-8521

Non-parsed fields | Output value
Status_Code | S0000
Assignment_Type | BN
Address_Type | S

Supported countries (Global Address Cleanse)
There are several countries supported by the Global Address Cleanse transform. The level of correction varies by country and by the engine that you use. Complete coverage of all addresses in a country is not guaranteed.
For the EMEA and Global Address engines, country support depends on which sets of postal directories you have purchased. For Japan, the assignment level is based on data provided by the Ministry of Public Management, Home Affairs, Posts and Telecommunications (MPT).

During Country ID processing, the transform can identify many countries. However, the Global Address Cleanse transform's engines may not provide address correction for all of those countries.

Related Topics
• Process U.S. territories with the USA engine
• Reference Guide: Country ISO codes and assignment engines

Process U.S. territories with the USA engine
When you use the USA engine to process addresses from American Samoa, Guam, Northern Mariana Islands, Palau, Puerto Rico, and the U.S. Virgin Islands, the output region is AS, GU, MP, PW, PR, or VI, respectively. The output country, however, is the United States (US).
If you do not want the output country to be the United States when processing addresses with the USA engine, set the "Use Postal Country Name" option to No. These steps show you how to set the Use Postal Country Name option in the Global Address Cleanse transform.
1. Open the Global Address Cleanse transform.
2. On the Options tab, expand Standardization Options > Country > Options.
3. For the Use Postal Country Name option, select No.

Related Topics
• Supported countries (Global Address Cleanse)

Set a default country
Note: Run Country ID processing only if you are:
• Using two or more of the engines and your input addresses contain country data (such as the two-character ISO code or a country name).
• Using only one engine, but your input data contains addresses from multiple countries.
• Using an engine that processes multiple countries (such as the EMEA or Global Address engine).

1. Open the Global Address Cleanse transform.
2. On the Options tab, expand Country ID Options, and then for the Country ID Mode option select Assign.
This value directs the transform to use Country ID to assign the country. If Country ID cannot assign the country, it will use the value in Country Name.
3. For the Country Name option, select the country that you want to use as a default country. The transform will use this country only when Country ID cannot assign a country. If you do not want a default country, select None.
4. For the Script Code option, select the type of script code that represents your data. The LATN option provides script code for most types of data. However, if you are processing Japanese data, select KANA.

Related Topics
• Identify the country of destination
• To set a constant country

To set a constant country
1. Open the Global Address Cleanse transform.
2. On the Options tab, expand Country ID Options, and then for the Country ID Mode option select Constant.
This value tells the transform to take the country information from the Country Name and Script Code options (instead of running "Country ID" processing).
3. For the Country Name option, select the country that represents all your input data.
4. For the Script Code option, select the type of script code that represents your data. The LATN option provides script code for most types of data. However, if you are processing Japanese data, select KANA.

Related Topics
• Identify the country of destination
• Set a default country

Address Server
The Address Server checks and corrects addresses. The Address Server must be started before you can begin processing your addresses with the Global Address Cleanse transform's EMEA engine and Global Suggestion Lists' Multi Country engine.

AddressServerConfig.txt
The AddressServerConfig.txt file includes configuration parameters and settings that determine how the Address Server runs. On Windows, the file is located in <DataServicesInstallLocation>\Business Objects\BusinessObjects Data Services\bin\address_server. On UNIX, the file is located in <DataServicesInstallLocation>/Business Objects/BusinessObjects Data Services/bin/address_server. This file includes the following parameters and default settings:

ADDRESS_SERVER_CNTL_PORT (default: 40010)
The Address Server control port number.

ADDRESS_SERVER_APP_PORT (default: 40011)
The Address Server application port number.

ADDRESS_SERVER_THREAD_NUMBER (default: 3)
The number of server threads used by the Address Server.
Note:
• Windows users: Change this setting only when directed by Customer Assurance.
• UNIX users: The number of threads may be increased or decreased to no less than 2. Only increase the number of threads to more than 5 when directed by Customer Assurance.
• Linux users: If you have purchased all of the Global Address data or a large amount of the data, set the number of threads to 2. If the number of threads is too high, the system returns "file not found" errors.

REFERENCE_DATA_DIRECTORY (default: ..\..\DataQuality\reference_data on Windows; ../../DataQuality/reference_data on UNIX)
Location of the Address Server reference data.
Note: If you store your address reference data in a different location, update this path to include the new location.

GAC_DATA_DIRECTORY (default: ..\..\DataQuality\gac on Windows; ../../DataQuality/gac on UNIX)
Location of other non-reference Global Address Cleanse support files.

LOG_FILE_DIRECTORY (default: ..\log on Windows; ../log on UNIX)
Location of the log file (AddressServer.log).

CLIENT_TIMEOUT (default: 90)
Number of seconds before a client connection times out.

AddressServerConfig.db file
The Address Server creates an associated file (AddressServerConfig.db) the first time you start the server. This file is used each time you start the server.
Note: Delete the AddressServerConfig.db file prior to restarting the server each time you make a change to the AddressServerConfig.txt file. The Address Server will ignore your changes if you do not delete the file.
Note: Before installing each software update, make a copy of the AddressServerConfig.txt file. This prevents overwriting your current configuration file and losing your current settings.
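The two notes above suggest a simple pre-restart routine. The following Python sketch is illustrative only: the install path is a placeholder, and only the documented rule (back up AddressServerConfig.txt and delete AddressServerConfig.db before restarting) comes from this guide.

```python
import os
import shutil

# Placeholder install location; substitute your actual <DataServicesInstallLocation>.
BIN_DIR = r'C:\Business Objects\BusinessObjects Data Services\bin'

def prepare_address_server_restart(bin_dir: str = BIN_DIR) -> None:
    """Back up AddressServerConfig.txt and delete AddressServerConfig.db so the
    Address Server picks up configuration changes on restart."""
    cfg = os.path.join(bin_dir, 'AddressServerConfig.txt')
    db = os.path.join(bin_dir, 'AddressServerConfig.db')
    # Keep a copy of the current settings (also recommended before software updates).
    shutil.copy2(cfg, cfg + '.bak')
    # The server ignores changes to the .txt file unless the .db file is removed.
    if os.path.exists(db):
        os.remove(db)
```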

Update reference files after installing an update
If you store your data and reference files in a location other than the default, you need to copy the reference files from the default location every time you install a software update. To copy the reference files to the non-default location:
1. Stop any data flows that use the EMEA engine or Global Suggestion Lists and stop the Address Server.
2. Install the software update.
3. Copy all directory files from the disk to the location where you keep your directories.
4. Copy the following files from <DataServicesInstallLocation>\Business Objects Data Services\DataQuality\reference_data\ (Windows) or <DataServicesInstallLocation>/Business Objects Data Services/DataQuality/reference_data/ (UNIX) to the location where you store your data directories (a sketch of this copy step follows the Related Topics list below):
• AddressServerGlobal.db
• ga_directory_db.xml
• ga_directory_db_emea.xml
• ga_country.dir
• ga_region_gen.dir
• MultiLineKeywords.db
5. Restart the Address Server.

Related Topics
• Installation Guide for Windows: Copy directory files to a non-default location
• Installation Guide for Windows: Start the Address Server
• Installation Guide for Windows: Stop the Address Server
• Installation Guide for UNIX: Copy directory files to a non-default location
• Installation Guide for UNIX: Start and stop the Address Server
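As promised in step 4 above, here is a minimal Python sketch of the copy. The file list is taken from the procedure; the paths and function shape are placeholders to adapt to your installation.

```python
import shutil
from pathlib import Path

# Placeholder locations; substitute your actual install and data directories.
DEFAULT_REF = Path(r'C:\Business Objects Data Services\DataQuality\reference_data')
CUSTOM_REF = Path(r'D:\dq_data\reference_data')

FILES = [
    'AddressServerGlobal.db',
    'ga_directory_db.xml',
    'ga_directory_db_emea.xml',
    'ga_country.dir',
    'ga_region_gen.dir',
    'MultiLineKeywords.db',
]

for name in FILES:
    # Overwrite the copies in the non-default location after each software update.
    shutil.copy2(DEFAULT_REF / name, CUSTOM_REF / name)
```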

New Zealand Certification
New Zealand Certification enables you to process New Zealand addresses and qualify for mailing discounts with the New Zealand Post.

To enable New Zealand Certification
You need to purchase the New Zealand directory data and obtain a customer number from the New Zealand Post before you can use the New Zealand Certification option. To process New Zealand addresses that qualify for mailing discounts:
1. In the Global Address Cleanse Transform, enable Report and Analysis > Generate Report Data.
2. In the Global Address Cleanse Transform, set Engines > Global Address to Yes.
3. In the Global Address Transform, set Country Options > Disable Certification to No.
Note: The software does not produce the New Zealand Statement of Accuracy (SOA) report when this option is set to Yes.
4. In the Global Address Transform, complete all applicable options in the Global Address > Report Options > New Zealand subgroup.
After you run the job and produce the New Zealand Statement of Accuracy (SOA) report, you need to rename the New Zealand Statement of Accuracy (SOA) report and New Zealand Statement of Accuracy (SOA) Production Log before submitting your mailing. For more information on the required naming format, see New Zealand SOA Report and SOA production log file.

Related Topics
• Management Console Metadata Reports Guide: Data Quality Reports, New Zealand Statement of Accuracy (SOA) report
• Reference Guide: Transforms, Global Address Cleanse transform options (Report options for New Zealand)

New Zealand SOA Report and SOA production log file

New Zealand Statement of Accuracy (SOA) Report
The New Zealand Statement of Accuracy (SOA) report includes statistical information about address cleansing for New Zealand.

New Zealand Statement of Accuracy (SOA) Production Log
The New Zealand Statement of Accuracy (SOA) production log contains identical information as the SOA report in a pipe-delimited ASCII text file (with a header record). The software creates the SOA production log by extracting data from the Sendrightaddraccuracy table within the repository. The software appends a new record to the Sendrightaddraccuracy table each time a file is processed with the DISABLE_CERTIFICATION option set to No. If the DISABLE_CERTIFICATION option is set to Yes, the software does not produce the SOA report and an entry will not be appended to the Sendrightaddraccuracy table. Mailers must retain the production log file for at least 2 years.
The default location of the SOA production log is <DataServicesInstallLocation>\Business Objects\BusinessObjects Data Services\DataQuality\certifications\CertificationLogs.

Mailing requirements
The SOA report and production log are only required when you submit the data processed for a mailing and want to receive postage discounts. Submit the SOA production log at least once a month. Submit an SOA report for each file that is processed for mailing discounts.

File naming format
The SOA production log and SOA report must have a file name in the following format:
Production Log - [SOA% (9999)]_[SOA Expiry Date (YYYYMMDD)]_[SOA ID].txt
SOA Report - [SOA% (9999)]_[SOA Expiry Date (YYYYMMDD)]_[SOA ID].pdf
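The naming format can be produced mechanically. In the following Python sketch (illustrative, not part of the product), the SOA percentage is written as four digits with an implied decimal point and the expiry date as YYYYMMDD; the extension distinguishes the production log (.txt) from the report (.pdf).

```python
from datetime import date

def soa_file_name(soa_percent: float, expiry: date, soa_id: str, ext: str) -> str:
    """Build a file name in the [SOA% (9999)]_[SOA Expiry Date (YYYYMMDD)]_[SOA ID] format."""
    # 94.3% is written as 0943: four digits, no decimal point.
    percent_part = f'{round(soa_percent * 10):04d}'
    return f'{percent_part}_{expiry:%Y%m%d}_{soa_id}.{ext}'

# Matches the example that follows: an SOA of 94.3% expiring 15 Oct 2008.
assert soa_file_name(94.3, date(2008, 10, 15), 'AGD07_12345678', 'txt') == \
    '0943_20081015_AGD07_12345678.txt'
```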

Example: For an SOA with:
SOA % = 94.3%
SOA expiry date = 15 Oct 2008
SOA ID = AGD07_12345678
The file names will be:
Production Log - 0943_20081015_AGD07_12345678.txt
SOA Report - 0943_20081015_AGD07_12345678.pdf

Related Topics
• Management Console—Metadata Reports Guide: Sample New Zealand SOA report
• Reference Guide: New Zealand SOA report options
• Management Console—Administrator Guide: Exporting New Zealand SOA certification logs

The New Zealand Certification blueprint
Do the following to edit the blueprint, run the job for New Zealand Certification, and generate the SOA production log file:
1. Import nz_sendright_certification.atl, located in the DataQuality\certifications folder in the location where you installed the software. The default location is <DataServicesInstallLocation>\Business Objects\BusinessObjects Data Services\DataQuality\certifications. The import adds the following objects to the repository:
• The project DataQualityCertifications
• The job Job_DqBatchNewZealand_SOAProductionLog
• The dataflow DF_DqBatchNewZealand_SOAProductionLog
• The datastore DataQualityCertifications
• The file format DqNewZealandSOAProductionLog
2. Edit the datastore DataQualityCertifications. Follow the steps listed in Editing the datastore.
3. Optional: By default, the software places the SOA Production Log in <DataServicesInstallLocation>\Business Objects\BusinessObjects Data Services\DataQuality\certifications\CertificationLogs. If the default location is acceptable, ignore this step. If you want to output the production log file to a different location, edit the substitution parameter configuration. From the Designer, access Tools > Substitution Parameter Configurations and change the path location in Configuration1 for the substitution parameter $$CertificationLogPath to the location of your choice.
4. Run the job Job_DqBatchNewZealand_SOAProductionLog. The job produces an SOA Production Log called SOAPerc_SOAExpDate_SOAId.txt in the default location or the location you specified in the substitution parameter configuration.
5. Rename the SOAPerc_SOAExpDate_SOAId.txt file using data in the last record in the log file and the file naming format described in New Zealand SOA Report and SOA production log file.

Related Topics
• New Zealand SOA Report and SOA production log file
• Management Console Metadata Reports Guide: Data Quality Reports, New Zealand Statement of Accuracy (SOA) report

Editing the datastore
After you download the blueprint .zip file to the appropriate folder, unzip it, and import the .atl file in the software, you must edit the DataQualityCertifications datastore. To edit the datastore:
1. Select the Datastores tab of the Local Object Library, right-click DataQualityCertifications and select Edit.
2. Click Advanced to expand the Edit Datastore DataQualityCertifications window.
Note: Skip step 3 if you have Microsoft SQL Server 2000 or 2005 as a datastore database type.

3. If you are using a version of Oracle other than Oracle 9i, perform the following substeps:
a. In the toolbar, click Create New Configuration.
b. Enter your information, including the Oracle database version that you are using, and then click OK.
c. Click Close on the Added New Values - Modified Objects window.
d. In the new column that appears to the right of the previous columns, select Yes for the Default configuration.
e. Enter your information for the Database connection name, User name, and Password options.
f. In DBO, enter your schema name.
g. In Code Page, select cp1252 and then click OK.
4. Click Edit.
5. Find the column for your database type and change Default configuration to Yes.
6. At the Edit Datastore DataQualityCertifications window, enter your repository connection information in place of the CHANGE_THIS values. (You may have to change three or four options, depending on your repository type.)
7. Expand the Aliases group and enter your owner name in place of the CHANGE_THIS value. If you are using Microsoft SQL Server, set this value to DBO. Click OK.
If the window closes without any error message, then the database is successfully connected.

Data Cleanse
Parsing and standardizing with Data Cleanse is a common step in many projects.

What is Data Cleanse
The Data Cleanse transform identifies and isolates specific parts of mixed data, and then standardizes the data based on information stored in the parsing dictionary, business rules defined in the rule file, and expressions defined in the pattern file.
Dictionaries and rules included with the software allow you to process name and firm data from a variety of countries including the United States, Japan, and Germany. Additionally, by using Universal Data Cleanse functionality you can parse and manipulate operational and product data using dictionaries and rules you create to meet your specific needs.
You can use Data Cleanse to assign gender and prenames and generate Match standards.

Related Topics
• Universal Data Cleanse

Parse data
The Data Cleanse transform can identify and isolate a wide variety of data—even when the data is floating in different fields.
The Data Cleanse transform parses up to six names per record, two per input field. For all six names found, it parses components such as prename, given names, family name, and postname. Then it sends the data to individual fields. The Data Cleanse transform also parses up to six job titles per record.
The Data Cleanse transform parses up to six firm names per record, one per input field.

Standardize data
The Data Cleanse transform can standardize data to make its format more consistent. Data characteristics that it can standardize include case, punctuation, and abbreviations.

Assign gender and prenames
The Data Cleanse transform can assign a gender to each name: strong male, strong female, weak male, weak female, and ambiguous. For dual names, Data Cleanse offers four additional gender descriptions: female multi-name, male multi-name, mixed multi-name, and ambiguous multi-name. The intelligence behind gender assignment lies partly in the application and partly in the parsing dictionary. When the Data Cleanse transform assigns a strong gender, it can also assign a prename such as Mr., Ms., or Mrs.

Related Topics
• Reference Guide: Transforms, Data Cleanse, Data Cleanse options, Gender Standardization options

Email address When Data Cleanse parses input data that it determines is an email address. the match standard is generated using the token alias where available and the parsed value for tokens that do not have an alias. SAP BusinessObjects Data Services Designer Guide 541 . Fields Data Cleanse uses Data Cleanse outputs the individual components of a parsed email address—that is. when none of the tokens in the firm name have an alias. complete domain name. the email user name. and host name. Below is an example of a simple email address. Data parsing overview This section provides an overview of how Data Cleanse deals with various types of supported data. Verify that an email address is properly formatted. fifth domain. Data Cleanse then assigns the data to specific fields.Data Quality Data Cleanse 18 a firm name. second domain. then the alias output will be empty. Flag the address as belonging to an internet service provider (ISP). either in a field by itself or combined in a field with other data. top domain. and so on) by their relationships to each other.com By identifying the various data components (user name. However. Break down the domain name down into sub-elements. fourth domain. What Data Cleanse does Data Cleanse can take the following actions: • • • • Parse an email address. if at least one token has an alias associated with it. host. it places the components of that data into specific fields for output. third domain. joex@sap.

whether an email server is active at that address.co.of fice.18 Data Quality Data Cleanse What Data Cleanse does not verify Several aspects of an email address are not verified by Data Cleanse.co.uk uk co city 542 SAP BusinessObjects Data Services Designer Guide .co. with the input data. Data Cleanse does not verify: • • • • whether the domain name (the portion to the right of the @ sign) is registered. expat@london. whether the personal name in the record can be reached at this email address.uk expat london. whether the user name (the portion to the left of the @ sign) is registered on that email server (if any).city. Data Cleanse follows the Domain Name System (DNS) in determining the correct output field. Data Cleanse outputs each element in the following fields: Output field Email Email_User Email_Do main_All Email_Do main_Top Email_Do main_Second Email_Do main_Third Output value expat@london. For example.office.home.office. Email components The output field where Data Cleanse places the data depends on the position of the data in the record.home.uk.city.home.city.

Data Quality Data Cleanse 18 Output field Email_Do main_Fourth Email_Do main_Fifth Email_Do main_Host Output value office home london Related Topics • Reference Guide: Transforms. the area. Data Cleanse output fields Social Security number Data Cleanse parses U. group. the entire SSN.S. 3. Identifies a potential SSN by looking for the following patterns: Pattern Digits per grouping nnnnnnnnn 9 consecutive digits nnn nn nnnn Delimited by n. and 4 (for area. Fields used Data Cleanse outputs the individual components of a parsed Social Security number—that is.a. Social Security numbers (SSNs) that are either by themselves or on an input line surrounded by other text. and the serial. the group. spaces and serial) SAP BusinessObjects Data Services Designer Guide 543 . How Data Cleanse parses Social Security numbers Data Cleanse parses Social Security numbers in two steps: 1. 2.

Related Topics
• Reference Guide: Transforms, Data Cleanse output fields

Social Security number
Data Cleanse parses U.S. Social Security numbers (SSNs) that are either by themselves or on an input line surrounded by other text.

Fields used
Data Cleanse outputs the individual components of a parsed Social Security number—that is, the entire SSN, the area, the group, and the serial.

How Data Cleanse parses Social Security numbers
Data Cleanse parses Social Security numbers in two steps:
1. Identifies a potential SSN by looking for the following patterns:

Pattern | Digits per grouping | Delimited by
nnnnnnnnn | 9 consecutive digits | n/a
nnn nn nnnn | 3, 2, and 4 (for area, group, and serial) | spaces
nnn-nn-nnnn | 3, 2, and 4 (for area, group, and serial) | all supported delimiters

2. Performs a validity check on the first five digits only. The possible outcomes of this validity check are:

Outcome | Description
Pass | Data Cleanse successfully parses the data—and the Social Security number is output to a SSN output field.
Fail | Data Cleanse does not parse the data because it is not a valid SSN as defined by the U.S. government. The data is output as Extra, unparsed data.

Check validity
When performing a validity check, Data Cleanse doesn't verify that a particular 9-digit Social Security number has been issued, or that it is the correct number for any named person. Instead, it validates only the first 5 digits (area and group). Data Cleanse doesn't validate the last 4 digits (serial)—except to confirm that they are digits.

SSA data
Data Cleanse's validation of the first 5 digits is driven by a table from the Social Security Administration (http://www.ssa.gov/employer/highgroup.txt). That table is updated monthly as the SSA opens new groups. The rules and data that guide this check are available at http://www.ssa.gov/history/ssn/geocard.html.

Outputs valid SSNs
Data Cleanse outputs only Social Security numbers that pass its validation. If an apparent SSN fails validation, Data Cleanse does not pass on the number as a parsed, but invalid, Social Security number.
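To make the two-step process concrete, here is an illustrative Python sketch. The regular expression covers the documented patterns; the validity check is a placeholder, since the real check is driven by the SSA high-group table described above.

```python
import re

# Candidate patterns: 9 consecutive digits, or 3-2-4 groups split by a space or delimiter.
SSN_CANDIDATE = re.compile(r'\b(\d{3})[-. ]?(\d{2})[-. ]?(\d{4})\b')

def find_ssn_candidates(text: str) -> list[tuple[str, str, str]]:
    """Return (area, group, serial) tuples for potential SSNs found in text."""
    return SSN_CANDIDATE.findall(text)

def area_group_is_valid(area: str, group: str) -> bool:
    """Placeholder for the validity check on the first five digits; the actual
    check uses the monthly Social Security Administration high-group data."""
    return area != '000' and group != '00'

for area, group, serial in find_ssn_candidates('ID 123-45-6789 on file'):
    print(area, group, serial, area_group_is_valid(area, group))
```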

Update your SSN file
SAP provides the Social Security number (SSN) file (drlssn.dat) for Data Cleanse customers interested in parsing recently issued and existing U.S. Social Security numbers. The SSN file is updated monthly with the latest SSN information from the U.S. government. SAP converts the data to a format that Data Cleanse can use and posts the data by the 5th of every month. You can obtain the most current SSN file from the Technical Customer Assurance site at http://service.sap.com/bosap-support. This site provides you with the opportunity to download the latest drlssn.dat file used to parse U.S. Social Security numbers within Data Cleanse.

Date
Data Cleanse recognizes dates in a variety of formats and breaks those dates into components. Data Cleanse can parse up to six dates from your defined record. That is, Data Cleanse identifies up to six dates in the input, breaks those dates into components, and makes dates available as output in either the original format or a user-selected standard format.

Phone number
Data Cleanse can parse both North American Numbering Plan (NANP) and international phone numbers. When Data Cleanse parses a phone number, it outputs the individual components of the number into the appropriate fields.
Phone numbering systems differ around the world. Data Cleanse recognizes phone numbers by their pattern and (for non-NANP numbers) by their country code.
Data Cleanse searches for North American phone numbers by commonly used patterns such as (234) 567-8901, 234-567-8901, and 2345678901. Data Cleanse gives you the option for some reformatting on output (such as your choice of delimiters).
Data Cleanse searches for European and Pacific-Rim numbers by pattern. The patterns used are stored in drlphint.dat. They require that the country code appear at the beginning of the number. Data Cleanse doesn't offer any options for reformatting international phone numbers. Also, Data Cleanse doesn't cross-compare to the address to see whether the country and city codes in the phone number match the address.
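A hedged illustration of the NANP pattern search described above. The regular expression covers only the three example formats; the transform's actual pattern set is broader.

```python
import re

# Matches (234) 567-8901, 234-567-8901, and 2345678901.
NANP = re.compile(r'\(?(\d{3})\)?[- ]?(\d{3})-?(\d{4})')

def parse_nanp(text: str):
    """Return (area code, exchange, line) if a NANP-style number is found."""
    m = NANP.search(text)
    return m.groups() if m else None

print(parse_nanp('Call (234) 567-8901 after 5'))  # ('234', '567', '8901')
print(parse_nanp('2345678901'))                   # ('234', '567', '8901')
```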

User-defined pattern
Data Cleanse can parse data that's outside the range of name, title, firm, email, phone, and date. Data Cleanse is able to parse such data through its user-defined pattern matching (UDPM) feature, which uses regular expressions. You can set up data patterns to suit your data (such as part numbers), and Data Cleanse can parse your data according to those user-defined patterns. The pattern label is created in the drludpm.dat file when the pattern is defined.
Data Cleanse can parse a wide variety of data such as:
• account numbers
• part numbers
• purchase orders
• invoice numbers
• VINs (vehicle identification numbers)
• driver license numbers
In other words, Data Cleanse can parse any kind of numeric or alphanumeric data for which you can define a pattern. Data Cleanse's UDPM feature makes possible the parsing and extraction of virtually any kind of data that conforms to a pattern—any type of data pattern that can be expressed using regular expressions.

Related Topics
• Reference Guide: User-defined pattern matching (UDPM)
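To make the idea concrete, here is what a user-defined pattern might look like when expressed as an ordinary regular expression in Python. The part-number pattern is invented for illustration; actual UDPM patterns and labels are defined in the drludpm.dat file, whose syntax is documented in the Reference Guide.

```python
import re

# Hypothetical part-number pattern: two letters, a dash, then five digits (e.g. AB-12345).
PART_NUMBER = re.compile(r'\b[A-Z]{2}-\d{5}\b')

def extract_part_numbers(text: str) -> list[str]:
    """Pull every token matching the illustrative part-number pattern from free text."""
    return PART_NUMBER.findall(text)

print(extract_part_numbers('Reorder AB-12345 and CD-67890 next week'))
# ['AB-12345', 'CD-67890']
```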

Name and title
Data Cleanse can parse name and title data. A person's name can consist of the following parts: prename, given names, family names, postname, and so on.
Data Cleanse can accept up to two names and titles as discrete components. Data Cleanse also accepts name and title data together with other data or alone in a field. The name line or multiline field may contain one or two names per field.

Firm
Data Cleanse can parse firm data. Data Cleanse accepts firm names alone in a field or together with other data.
An exception to how Data Cleanse recombines contiguous word pieces is made for words that end with an "S", such as Applebee's or Macy's. An input string of "Macy's" is broken into three individual tokens: MACY, ', S. Because the last token is an "S", Data Cleanse first combines the tokens and looks up the term including the apostrophe (MACY'S). If the term is not found, Data Cleanse looks up the term without the apostrophe (MACYS). If that is not successful, Data Cleanse automatically keeps the tokens together (MACY'S) and adds the FIRM_MISCELLANEOUS classification to the term. Since words ending with "S" are automatically kept together, it is not necessary to add all possessive firm names to the dictionary.

Street address
Data Cleanse does not identify and parse individual address components. To parse data that contains address information, process it using a Global Address Cleanse or U.S. Regulatory Address Cleanse transform prior to Data Cleanse. In the event address data is processed by the Data Cleanse transform, it is usually output to the "Extra" fields.

Related Topics
• How address cleanse works

Parsing dictionaries

What is a parsing dictionary
A parsing dictionary contains entries for words and phrases. Each entry tells how the word or the phrase might be used. The parser uses the dictionary information and the rule file to identify and parse the data.
A person_firm parsing dictionary identifies and parses name, title, and firm data. A custom parsing dictionary identifies and parses operational data and is created using Universal Data Cleanse. You can edit existing entries or add new entries to any dictionary in order to meet your specific needs.
Each dictionary entry contains the following information:

Information type | Description
Classifications | The dictionary shows in what types of situations the word might be used. For example, the dictionary indicates that the word Engineering can be used in a firm name (such as Smith Engineering, Inc.) or job title (such as VP of Engineering). Smith is assigned the Firm_Name and Name_Strong_Family_Name classifications, because Smith can be found in both firm and personal names.
Gender | The dictionary contains gender data. For example, it indicates that Anne is a feminine name and Mr. is a masculine prename.
Aliases | The dictionary contains aliases (match standards). For example, Patrick and Patricia are aliases for Pat.
Standards | The dictionary entries are used to standardize capitalization or other output formatting on data parsed by Data Cleanse.

Related Topics
• Dictionary entries

Improve parsing results
Data Cleanse uses a parsing dictionary to guide parsing, gender assignment, and standardization. You can create a custom dictionary or modify the existing dictionary to improve parsing results for your specific data.

Local names
Data Cleanse provides cleansing packages for many regions. Each cleansing package includes a person_firm parsing dictionary that identifies and parses name, title, and firm data common to that region. However, these dictionaries do not include every possible unique name or firm. If Data Cleanse doesn't recognize a specific name, you can add it to the dictionary you are using. For example, you would add the name Jinco Xandru as two entries: Jinco classified as a given name (including the appropriate gender) and Xandru classified as a family name.

Industry-specific jargon
The person_firm parsing dictionary is useful across many industries. You can tailor the dictionary to better suit your own industry by adding special titles, prenames or postnames, or other jargon words. For example, if you process data for the real estate industry, you might add industry-specific postnames such as CRS, ABR, and GRI.

Specific phrases
Some words can be used in both firm names and job titles. As a result, Data Cleanse may incorrectly recognize some job titles as firm names. To improve parsing, you can add these job title phrases to the dictionary.

Correct specific parsing behavior
You can customize the parsing dictionary to correct specific parsing behavior that you have seen in your output.

Firm names containing personal names
Often a firm name is made up of personal names. As a result, Data Cleanse may incorrectly parse the firm as a personal name. For example, the catalog retailer J. Crew may be parsed as a personal name rather than as a firm. To improve parsing, you can add multiple-word firm names to the dictionary. For example, to parse J. Crew as a firm rather than as a personal name, you could add J and Crew to the dictionary with the Firm_Name classification, and J Crew with the Firm_Name_Alone classification.

Improve casing results
If you use mixed case, the general rule is to capitalize the first letter of the word and put the rest of the word in lowercase. However, there are exceptions to that rule, such as McDonald, SAP, GmbH, Ph.D., and so on. To handle mixed-case exceptions, Data Cleanse consults secondary information standards in the dictionary.

How Data Cleanse capitalizes in mixed case
The dictionary contains the correct casing of a word and also indicates when that casing should be used. The capitalization of entries in the parsing dictionary is applied to words for mixed-case exceptions. For example, the dictionary has an entry for the word MS:

Dictionary entry | Usage
MS | HONORARY POSTNAME

The word MS is cased differently depending upon how it is used: M.S. as an abbreviation for the honorary postname Master of Science, or Ms. as a prename. So, the secondary information for the entry in the dictionary indicates that M.S. should be used as the standard only for honorary postname data.


Improved mixed-case results
Most Data Cleanse users find that the default capitalization of words in the dictionary is sufficient for producing good mixed-case results. However, it is impossible for the default dictionary to contain every mixed-case exception. If Data Cleanse does not case a word as you want, you can create a custom standard in the dictionary. For example, TechTel is not in the default dictionary, so Data Cleanse capitalizes only the first letter of the word. However, if you add the word TechTel to your dictionary with a standard for firm name use, you can achieve the desired mixed-case results:
Input | Standard | Output
TECHTEL, INC. | (none) | Techtel Inc.
TECHTEL, INC. | TechTel | TechTel Inc.

Related Topics

• To edit existing entries

Define paths for reference files
By default, the Cleansing Package installer installs Data Cleanse rule files (reference files) to the LINK_DIR\DataQuality\datacleanse folder. You can either define the reference file location in the substitution parameter configuration or specify the location directly within the transform options for each job. Additionally, for each job, in the Data Cleanse transform options, specify the regional person_firm parsing dictionary or the custom (Universal Data Cleanse) dictionary you want to use.
Related Topics

• Overview of substitution parameters


Dictionary entries
Each regional person_firm dictionary contains thousands of name, title, and firm entries. You can tailor a dictionary to better suit your data. For example:
• You can customize the dictionary to correct specific parsing behavior. For example, given the name Mary Jones, CRNA, the word CRNA is parsed as a job title. In reality, CRNA is a postname (Certified Registered Nurse Anesthetist). To correct this, you can add CRNA to the parsing dictionary as a postname.
• You can tailor the dictionary to better suit your data by adding regional or ethnic names, special titles, or industry jargon. For example, if you process data for the real estate industry, you can add postnames such as CRS (Certified Residential Specialist) and ABR (Accredited Buyer Representative).
• If a specific title or firm name is parsed incorrectly, you can add an entry for the entire phrase. For example, if a firm name like Hewlett Packard is identified as a personal name, you can add it as a firm name also.
• If a word is not capitalized as you want, you can enter that word into the dictionary with your own capitalization. For example, if you want the word TECHTEL to be capitalized as TechTel, you could add the word TECHTEL to the dictionary with TechTel as the firm standard.
• You can create a dictionary to parse operational data, such as part numbers.

Related Topics

• Universal Data Cleanse

What is in a dictionary entry
Dictionary entries contain a number of elements that tell Data Cleanse how to process the word or phrase.
Element | Description
Primary | The word or phrase that you want Data Cleanse to parse.
Classifications | Indicators of the types of situations that apply to this word. For example, Hewlett is assigned the Firm_Name and Name_Weak_Family_Name classifications, because it can be used in both firm and personal names.
Gender | Defines the gender that applies to the word.
Secondary information | Assists Data Cleanse in determining how to process the word when it is used in different ways. Secondary information is not required. Each dictionary entry is limited to a maximum of 24 secondary types.
Standards | Standards define how Data Cleanse will standardize the output data for the word.
Aliases | Aliases are alternate forms that could potentially be matched to the word. For example, Robert is a personal name alias for Bob. Alias data is output in the Match_Std fields.
Note: Data Cleanse does not apply the capitalization option setting to the alias data.

Multi-byte data in Data Cleanse dictionaries

Multi-byte characters are supported in several parts of Data Cleanse dictionaries:
• Primary dictionary entry
• Secondary standards
• Secondary aliases
• Pattern-based classification regular expressions (Code points are supported. Unicode characters are not supported.)

Multi-byte characters are not supported for other dictionary components:
• Dictionary names and descriptions
• Classification names


Custom output categories and fields

Related Topics

• Classifications

Search for entries
1. From the Designer menu bar, choose Dictionary > Search.
2. Use the "Search Dictionary" window to query the parsing dictionary to see whether there is already an entry for the word.

Query a single word
1. Type the word in the Look for box. Do not include any punctuation.
2. Press Enter. If the word is in the dictionary, the Search Dictionary tool displays the dictionary entry.

Query a title phrase
To look up a multiple-word title, you must query the "lookup" form of the title—the same form as the parser would look up. For example, to find the lookup form of "Chief Financial Officer":
1. Type the unmodified input form of the name. For example, Chief Financial Officer.
2. Type Chief. View the results and repeat this step for each word and possible abbreviated words in the phrase. You would next query "Financial", "Officer", and then "Fncl".
3. Remove all punctuation and capitalize all words, for example, "CHIEF FNCL OFFICER".
Note:

If a line contains consecutive words that are marked as phrase words, the parser retrieves the standard for each word, removes any punctuation, and looks up the phrase.


Query a multiple-word name
To look up a multiple-word firm name, you must query the "lookup" form of the firm name—the same form as the parser would look up. For example, to find the lookup form of "International Business Machines":
1. Type the unmodified input form of the firm name. For example, International Business Machines.
2. Remove firm terminator words such as Corp, Inc, Ltd, Co, and so on. For example, International Business Machines.
3. Query each remaining word for firm standards. If an appropriate standard does not exist, use the original word, for example, Intl Business Machines.
4. Remove all punctuation and capitalize all words, for example, INTL BUSINESS MACHINES.
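The lookup form can be computed mechanically from these steps. The following Python sketch is illustrative only; the terminator list and standards table are tiny stand-ins for the dictionary's actual contents.

```python
import string

TERMINATORS = {'CORP', 'INC', 'LTD', 'CO'}       # firm terminator words
FIRM_STANDARDS = {'INTERNATIONAL': 'INTL'}       # stand-in for dictionary standards

def firm_lookup_form(name: str) -> str:
    """Build the dictionary lookup form of a multiple-word firm name."""
    # Strip punctuation and capitalize up front; the order of operations
    # does not change the result for this sketch.
    words = [w.strip(string.punctuation).upper() for w in name.split()]
    words = [w for w in words if w and w not in TERMINATORS]   # drop Corp, Inc, ...
    words = [FIRM_STANDARDS.get(w, w) for w in words]          # apply firm standards
    return ' '.join(words)

print(firm_lookup_form('International Business Machines Corp.'))
# INTL BUSINESS MACHINES
```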

Query a firm name that looks like a personal name
Some firms are named after people, for example, Hewlett Packard or Avery Dennison Corp. To look up this type of firm name, you must query the "lookup" form of the firm name—the same form of the name the parser would look up. For example, to find the lookup form of Avery Dennison Corp.:
1. Type the unmodified input form of the firm name, for example, Avery Dennison Corp.
2. Remove all firm terminator words, such as Corp, Inc, Ltd, Co, and so on, for example, Avery Dennison.
3. Remove all punctuation and capitalize all words, for example, AVERY DENNISON.
Note:

If all of the words in a line are identified as both Firm_Name and Name words, the parser removes all punctuation and then checks whether the name is listed as a firm name. If so, the line is parsed as a firm name. If not, a combination of rules and transform logic determines the result. You can influence the parsing result by ensuring that all words are entered correctly in the dictionary.


To add new entries
If Data Cleanse does not identify a word or phrase in your data, you can add that word to your parsing dictionary.
1. From the Dictionary menu, choose Add New Dictionary Entry.
2. Select the dictionary to which you want to add the word or phrase.
3. Specify the primary word or phrase to add to the dictionary.
4. Select classifications to associate with the primary entry.
5. Specify any secondary information for the primary entry.
6. Do one of the following:
• To add the new entry and clear your selections from the dialog, click Add & Reset.
• To add the new entry and retain your selections in the dialog, click Add & Keep settings. This choice is useful if you plan to add additional words that fit into the same classifications.

Related Topics

• What is in a dictionary entry

To edit existing entries
If Data Cleanse identifies a word or phrase in your data, but does not handle it in the way you want, you can edit its dictionary entry.
1. From the Dictionary menu, choose Search.
2. Locate the entry in your dictionary. Type the word you want to edit in the Look for box and click Search.
3. Click Edit.
4. Select classifications and specify secondary information to associate with the primary entry.
5. Click OK.
Related Topics

• Search for entries
• What is in a dictionary entry


To delete existing entries
If you don't want Data Cleanse to process a certain word or phrase in your data, you can delete the parsing dictionary entry for that word or phrase.
1. Locate the entry in your dictionary.
2. Select the dictionary entry and click Remove.
Related Topics

• Search for entries

Bulk load dictionary entries
Bulk loading is a process that allows you to import and export dictionary changes in large batches. You can use the bulk load function to import dictionary changes, clone dictionaries, and merge dictionaries.

To import dictionary changes
Importing dictionary changes from an XML file is useful for situations such as migrating customized dictionary entries from a test server to a production server.
Note:

Data Cleanse includes a schema file (dcdict.xsd) that defines the schema required to create valid XML files for use with the bulk load feature. By default, this file is installed to the LINK_DIR\DataQuality\datacleanse folder. 1. Choose Bulk Load from the Dictionary menu. 2. Select Use file as source and enter the path and name of the XML file that contains the dictionary changes. 3. Select Use existing dictionary as target and choose the dictionary where you want to apply the changes. 4. If the dictionary contains Japanese data, select Normalize Entries. Normalization ensures that the Japanese word breaker is able to break words properly and eliminates the need for the dictionary to contain both fullwidth and halfwidth versions of the same word.


Note:

Normalizing data increases job runtime and is only required for dictionaries containing Japanese data.
5. Select the type of conflict resolution that you want to use.
6. Click OK.

To clone a dictionary
Cloning a dictionary is useful in situations where you need to use a common existing dictionary as a base for new, more specifically customized dictionaries, or if you want to create a backup of your existing dictionary before making changes. You can clone an existing dictionary by creating a new dictionary and copying the data from the existing dictionary.
1. Choose Bulk Load from the Dictionary menu.
2. Select Use dictionary as source and choose the dictionary that you want to clone.
3. Select Use new dictionary as target and enter a name for the new, cloned dictionary.
4. If the dictionary contains Japanese data, select Normalize Entries. Normalization ensures that the Japanese word breaker is able to break words properly and eliminates the need for the dictionary to contain both fullwidth and halfwidth versions of the same word.
Note:

Normalizing data increases job runtime and is only required for dictionaries containing Japanese data.
5. Select the type of conflict resolution that you want to use.
6. Click OK.

To merge dictionaries
To combine the entries contained in different dictionaries into a single dictionary, you merge the dictionaries by bulk loading the contents of one existing dictionary into the other dictionary.
1. Choose Bulk Load from the Dictionary menu.
2. Select Use dictionary as source and choose the source dictionary.
3. Select Use existing dictionary as target and choose the target dictionary into which you want to merge the additional content.
4. Select the type of conflict resolution that you want to use.


5. Click OK.
The contents from the first dictionary selected are merged into the second dictionary.
If you want to merge dictionaries into a completely new one without changing either of the original dictionaries, clone one of the original dictionaries and merge the other dictionary into the new clone.
Restriction:

You cannot merge dictionaries when the source is a default dictionary or clone of a default dictionary installed with a Data Cleanse cleansing package.

To resolve conflicts
Conflict resolution allows you to manage how entries are handled when the dictionary entry from the source conflicts with an entry that already exists in the target during the bulk load process.
1. Select the elements you want to add from the source in the categories in the left column. If you want to take all information from the source entry, click Select All Source Elements.
2. Select the elements you want to retain from the target in the categories in the right column. If you want to take all information from the target entry, click Select All Target Elements.
3. Select a gender for the resulting entry.
4. Verify that the resulting entry contains the information that you want.
5. Click OK.
The resulting entry is written to the target.

View conflict logs
The conflict logs table displays all Bulk Load logs that exist for the selected dictionary. The conflict logs table provides the following information:
Column | Description
ID | The unique ID used to identify the bulk load operation.
User Name | The user name used to connect to the Data Cleanse repository.
Start Date | The date the bulk load operation was initiated.


Column | Description
Source | The dictionary or bulk load XML file that was used as the source for the operation.
Inserts | The number of inserts performed on the target dictionary during the bulk load process.
Deletes | The number of deletes performed on the target dictionary during the bulk load process.
Conflicts | The number of conflicts that were successfully resolved during the bulk load process.
Errors | The number of errors encountered during the bulk load process.
Start Time | The time the bulk load operation was initiated.
End Time | The time the bulk load operation finished or was manually stopped.

To view the information in a specific log, select that log in the conflict logs table. The log data table provides the following information for the selected log:
Column | Description
Primary Entry | The primary entry where the conflict occurred. A primary entry may appear more than once if it was involved in more than one type of conflict.


Column | Description
Type | The type of conflict that occurred. This column may have several values.
• Primary: Indicates the deletion or addition of a primary entry. For a deletion, the old value displays with no new value. For an addition, the old value is empty and the new value contains the new primary entry.
• Gender: Indicates when there is a difference in the gender for an entry. The old value and new value are displayed.
• Classification: Indicates the addition or deletion of a classification. If more than one addition or deletion occurred for the entry, each operation is displayed in a separate row.
• _Alias: Indicates the addition or deletion of an alias. If more than one addition or deletion occurred for the entry, each operation is displayed in a separate row.
• _Standard: Indicates the update, addition, or deletion of a standard.
Old Value | The value that existed in the target dictionary prior to the bulk load operation.
New Value | The value that exists in the target dictionary following the bulk load operation.

View bulk load errors

In addition to bulk load conflicts, the software maintains a log of all errors reported during each bulk load operation. The View Bulk Load errors window displays the unique ID assigned to the bulk load operation and the source database or bulk load XML file used. To view any errors reported for a bulk load operation, select the operation in the conflict logs table, and click View Bulk Load Errors. The error log table displays the following information:


Primary Entry: The value of the primary entry when the error occurred.
Type: The type of operation that caused the error.
Error: A description that explains the error.

Export a conflict log

To save a copy of a bulk load conflict log to a CSV file, specify the location and filename to write, and click Export.

To export dictionary changes
When you export dictionary changes, Data Cleanse creates an XML file containing the changes that have been made to the dictionary since its creation. You can use the new XML file to apply the same set of changes to a different pre-existing dictionary by using the bulk load feature.
1. Choose Export Dictionary Changes from the Dictionary menu.
2. Choose the dictionary containing the changes you want to export.
3. Enter the path and filename for the XML file where the changes will be written.
4. Click OK.

Classifications
Classifications tell Data Cleanse what types of situations apply to the word or phrase. For example, Hewlett is assigned the Firm_Name and Name_Weak_Family_Name classifications, because it can be used in both firm and personal names. Data Cleanse includes a variety of default classifications which you can assign as you add or edit primary dictionary entries. However, some classifications, such as gender and script classifications, are only used in a rule file. Data Cleanse automatically identifies and assigns these classifications as each token is processed.
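To make the rule-file side of this concrete, the following fragment is an illustrative sketch only (the rule name and comments are invented for this example and are not shipped defaults). It uses the documented & syntax to require that a single token carry both a dictionary-based classification and an automatically assigned one:

# Captures a given name whose dictionary gender is ambiguous
ambiguous_given =
    #given name token carrying both classifications
    NAME_STRONG_GIVEN_NAME & AMBIGUOUS_GENDER;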


Address_Box: A mailing box indicator. For example, BOX.
Alpha_Only: A word that contains only alphabetic characters. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files.
Alpha_Numeric: A word that contains both alphabetic and numeric characters. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files.
Ambiguous_Gender: Data Cleanse automatically identifies and assigns the Ambiguous_Gender classification based on the gender that is assigned in the dictionary entry. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only given names with ambiguous gender would include NAME_STRONG_GIVEN_NAME & AMBIGUOUS_GENDER.
Connector: A word, character, or symbol between other words. For example, and, &.
Directional: A part of an address that gives directional information for delivery, such as N, S, NE.
Dual_Name_Indicator: Indicates a husband and wife (夫妻). Only available in the Japanese person firm dictionary and Japanese person firm rule file.


Firm_Designator: A word that indicates that a firm is to follow.
Firm_Initiator: A word that is likely to be the first word when used as a part of a firm name.
Firm_Location: A location within a firm (usually used for internal mail delivery). For example, Department, Mailstop, Room, Building.
Firm_Miscellaneous: A word used in firm names.
Firm_Name: A word that is used in firm names that may be parsed incorrectly. For example, Hewlett Packard could be incorrectly parsed as a personal name, so Hewlett and Packard are listed as Firm_Name words.
Firm_Name_Alone: A firm name that can stand on its own. For example, Hewlett Packard.
Firm_Terminator: A word that is likely to be the last word when used as a part of a firm name. For example, Inc, Corp, Ltd.
HC_RR_Address: Highway Contract Rural Routes.
Honorary_Postname: A postname that signifies certification, academic degree, or affiliation. For example, CPA, PhD, or USNR.


Initial: A single letter, such as C or J.
Kana_Script: Automatically identified and assigned to tokens written in kana script. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only strong family names written in kana script would include NAME_STRONG_FAMILY_NAME & KANA_SCRIPT.
Kanji_Script: Automatically identified and assigned to tokens written in kanji script. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only strong family names written in kanji script would include NAME_STRONG_FAMILY_NAME & KANJI_SCRIPT.
Latin_Script: Automatically identified and assigned to tokens written in Latin script. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only strong family names written in Latin script would include NAME_STRONG_FAMILY_NAME & LATIN_SCRIPT.
Locality: A geographical word such as North, Western, Minnesota, or NY.
Lookup_Any: A classification that Data Cleanse assigns to every token. The Lookup_Any classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files.
Lookup_Not_Found: A classification assigned to a word when that word does not exist as a primary entry in the dictionary. When the word matches a pattern classification as part of the rule evaluation process, it is also assigned the pattern classification.


Maiden_Name_Indicator: Indicates a maiden name (旧姓). Only available in the Japanese person firm dictionary and Japanese person firm rule file.
Maturity_Postname: A maturity postname such as Jr or Sr.
Military_Address: A part of a domestic military address, such as psc.
Military_Last: A part of an overseas military address preceding the two-character "state" abbreviation, such as APO, FPO.
Military_Region: A part of an overseas military address indicating the state, such as AE, AP, AA.
Name_Ambiguous: A name that may be used as either a given name or a family name.
Name_Designator: A name designator such as Attn or c/o.
Name_Special: A word that may appear in a name line, such as Family, Resident, Occupant.
Name_Strong_Family_Name: A name that is most likely a family name. For example, McMichaels.
Name_Strong_Given_Name: A name that is most likely a given name. For example, Michael.


Name_Weak_Family_Name: A name that is used as both a given name and a family name, but is more frequently used as a family name. For example, Hunter.
Name_Weak_Given_Name: A name that is used as both a given name and a family name, but is more frequently used as a given name. For example, Corey.
Number: A number word. For example, One, First, or 1st.
Phrase_Word: A word that is part of a phrase. For example, the dictionary contains an entry for the phrase VP Mktg. Each separate word in the phrase, VP and Mktg, is marked as a Phrase_Word.
Postcode1: A ZIP Code or other postal code.
Postcode2: A ZIP+4 Code or other extended postal code.
Post_Office: The name of a SCF, ASF, BMC, or ADC.
Pre_Family_Name: A family-name prefix, such as Van in Van Allen or O in O'Connor.
Pre_Given_Name: A given-name prefix, such as de los in de los Angeles.
Prename: A part of a prename that cannot stand on its own, such as First or Brigadier.


Prename_Alone: A prename that can stand on its own. For example, Mr., Ms., Dr., Senor, Senora, or Capt.
Private_Address: A private mailbox. For example, pmb.
Region: A state, province, territory, or region. The word may be an abbreviation. For example, NY is the abbreviation for the state of New York in the United States.
Secondary_Address: A secondary address, such as an apartment, building, or suite.
Strong_Female: Automatically identified and assigned based on the gender that is assigned in the dictionary entry. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only female given names would include NAME_STRONG_GIVEN_NAME & [STRONG_FEMALE | WEAK_FEMALE].
Strong_Male: Automatically identified and assigned based on the gender that is assigned in the dictionary entry. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only male given names would include NAME_STRONG_GIVEN_NAME & [STRONG_MALE | WEAK_MALE].
Street_Type: A street-level suffix, such as Street, Ave, or Rd.
Title: A word used in a job title, such as Software or Engineer.
Title_Alone: A word that can stand as a single title, such as Accountant or Attorney.
Title_Initiator: A word used at the beginning of a title, such as Vice or Associate.
Title_Terminator: A word used at the end of a title, such as Engineer or Manager.
Weak_Female: Automatically identified and assigned based on the gender that is assigned in the dictionary entry. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only female given names would include NAME_STRONG_GIVEN_NAME & [STRONG_FEMALE | WEAK_FEMALE].
Weak_Male: Automatically identified and assigned based on the gender that is assigned in the dictionary entry. This classification cannot be assigned to a primary entry in the dictionary. It is only used in rule files. For example, a rule that captures only male given names would include NAME_STRONG_GIVEN_NAME & [STRONG_MALE | WEAK_MALE].

Add an existing classification

If a word in your data is found in a situation that Data Cleanse does not correctly identify, you can add a classification to the dictionary entry. For example, if Vanilla is used in your data as a part of a firm name, you can add the Firm_Name classification to its entry.

Related Topics
• To edit existing entries

Pattern-based classifications compared to UDPM

Pattern-based classifications and user-defined pattern matching (UDPM) both use regular expressions to customize how Data Cleanse parses your data. The primary difference between the two is that UDPM applies the pattern to the entire input line, while pattern-based classifications are only applied to individual data elements within the input line as a part of the rule evaluation process.

To add firm standards

You can add a single common standard to the group of firm names displayed in the Firm Name output column.
1. Open the "View data" window by using one of the following methods:
   • From the data flow, click View Data (magnifying glass icon) on the lower right corner of the target.
   • From the object library, right-click the target table and click View Data.
2. Click the Data Cleanse Dictionary tab.
3. Ensure that the dictionary you want to modify is selected.
4. Select one or more entries in the firm name column.
5. Click Add Firm Standards.
   The "Add Firm Standards" dialog appears.
6. Enter the standardized text in the Standard box.
7. To test the dictionary for any existing standards for the selected firm names that differ from the specified standard, click Test for Conflicts. Any firm names that already exist in the dictionary with a different firm standard are displayed in the Existing Standards table.
8. Click OK to accept the new standard and close the window, or click Cancel to return to "View Data" without adding any standards.
   Note: Accepting the new standard will overwrite any existing standards for the selected firm names.

Related Topics
• Dictionary entries

Region-specific data

Cleansing packages

Data Cleanse offers person and firm cleansing packages for a variety of regions. Each cleansing package includes region-specific reference data and parsing rules designed to enhance the ability of Data Cleanse to appropriately parse the data according to the cultural standards of the region. Because cleansing packages are based on the standard Data Cleanse transform, you can use the sample transforms in your projects in the same way you would use the base Data Cleanse transform and gain the advantage of the enhanced reference data and parsing rules.

The table below illustrates how name parsing may vary by culture:

Culture | Name | Given_Name1 | Given_Name2 | Family_Name1
Spanish | Juan C. Sánchez | Juan | C. | Sánchez
Portuguese | João A. Lopes | João | A. | Lopes
French | Jean Christophe Rousseau | Jean | Christophe | Rousseau
German | Hans Joachim Müller | Hans | Joachim | Müller
American | James Andrew Smith | James | Andrew | Smith

Customize prenames per country

When the input name does not include a prename, Data Cleanse generates the English prenames Mr. and Ms. To modify these terms, add a Query transform following the Data Cleanse transform and use the search_replace function to replace the terms with region-appropriate prenames.

Modify the international phone file

Data Cleanse includes phone number patterns for many countries by default. However, if you find that you need parsing for a country that is not included, you can modify the international phone file (drlphint.dat) to enable Data Cleanse to detect phone number patterns that follow a different format. New phone number patterns can be added to the international phone file using regular expressions.

Related Topics
• Reference Guide: Data Cleanse appendix, Regular expressions

Use personal identification numbers instead of Social Security numbers

With a default installation, Data Cleanse can identify U.S. Social Security numbers and separate them into discrete components. If your data includes personal identification numbers different from U.S. SSNs, you can use user-defined pattern matching to identify the numbers.

Related Topics
• Reference Guide: Data Cleanse appendix, Regular expressions

User-defined pattern matching (UDPM)

Number formats to be identified by user-defined pattern matching can be set up by using regular expressions.
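As a purely hypothetical sketch (this pattern does not ship with the product, and the identifier format is invented for illustration), a pattern for an identifier consisting of two letters, six digits, and a trailing letter, with optional spaces between the groups, could be written in the same regular expression style used elsewhere in this chapter:

[A-Za-z][A-Za-z][ ]?[0-9][0-9][0-9][0-9][0-9][0-9][ ]?[A-Za-z]

See the Reference Guide for the options where such patterns are registered.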

Japanese Data

About Japanese data

Data Cleanse can identify and parse Japanese data or mixed data that contains both Japanese and Latin characters. Due to its structure, Japanese data cannot be accurately tokenized and parsed using the same algorithm as other data. To ensure that Data Cleanse parses the data correctly, you must use the Japanese engine.

When the Data Cleanse Japanese engine is used, Data Cleanse first identifies the script in each input field as kanji, kana, or Latin and assigns it to the appropriate script classification. Input fields containing data classified as Latin script are processed using the regular Data Cleanse methodology. Input fields containing data classified as kana or kanji script are then processed using a special Japanese lexer and parser. All kana and kanji input is broken by the Japanese word breaker. Data Cleanse uses a word breaker to break an input string into individual tokens and then attempts to recombine contiguous tokens into words. Each word is looked up in the parsing dictionary and assigned one or more classifications based on its dictionary entry. The input is then parsed by the parser and rule file.

Note: Only data in Latin script is parsed based on the value set for the Break on Whitespace Only transform option.

Classifications for Japanese data

Several Data Cleanse classifications are designed to improve the parsing results when processing Japanese data. As each token is processed, Data Cleanse identifies and assigns these classifications as applicable. The Maiden_Name_Indicator and Dual_Name_Indicator classifications allow Data Cleanse to identify and correctly parse data that contain the Japanese indicators for a woman's maiden family name (旧姓) and for a married couple (夫妻).

Script classifications are pattern classifications used only in rule files. Script classifications are:
• Latin_Script
• Kanji_Script
• Kana_Script

A rule that captures only strong family names written in kana script would include NAME_STRONG_FAMILY_NAME & KANA_SCRIPT.

Related Topics
• Parse Japanese names

Parse Japanese names

Using the Japanese cleansing package, Data Cleanse is able to identify and parse a variety of name patterns that are common in the Japanese culture. Among other patterns, it can handle official and pronunciation names, maiden name indicators, and dual name indicators.

Official and Pronunciation Names

Japanese names often consist of an official name, usually written in kanji, and a pronunciation name, referred to as a furigana and usually written in hiragana or katakana. When the input contains an official name and a pronunciation name, the official name is output to the Given_Name1 and Family_Name1 fields and the pronunciation name is output to the Given_Name2 and Family_Name2 fields.

The following table demonstrates how a Japanese name is parsed:

Output field | Official name (佐藤和夫) | Official and pronunciation names (佐藤和夫 (さとうかずお))
Given_Name1 | 和夫 | 和夫
Family_Name1 | 佐藤 | 佐藤
Given_Name2 | (not output) | かずお
Family_Name2 | (not output) | さとう

Maiden Name Indicator

The Maiden_Name_Indicator classification is used to parse names that contain a maiden name indicator before a woman's maiden family name. The maiden name indicator is not output to a discrete field.

The following table shows how the name 佐藤 旧姓斉藤愛 is parsed and output:

Name component | Value | Output field
family name | 佐藤 | Family_Name1
maiden name indicator | 旧姓 | (not output)
maiden name | 斉藤 | Family_Name2
given name | 愛 | Given_Name1

Dual Name Indicator

The Dual_Name_Indicator classification is used to parse names that contain a dual name indicator (夫妻). The dual name indicator signifies that the names are those of a married couple. The dual name indicator is not output to a discrete field.

The following table shows how the name 佐藤和夫・愛夫妻 is parsed and output:

Name component | Value | Output field
shared family name | 佐藤 | Person1_Family_Name1 and Person2_Family_Name1
first given name (may be either husband or wife) | 和夫 | Person1_Given_Name1
second given name (may be either husband or wife) | 愛 | Person2_Given_Name1
dual name indicator | 夫妻 | (not output)

Normalize dictionary entries

To process Japanese data properly, the dictionary entries must be normalized. Normalization ensures that the Japanese word breaker is able to break words properly and also eliminates the need for the dictionary to contain both fullwidth and halfwidth versions of the same word. It is not necessary to normalize entries in dictionaries that do not contain Japanese entries.

The normalization process makes the following changes:
• combines discrete characters that are able to be combined
• converts halfwidth katakana characters to their fullwidth forms
• converts fullwidth Latin characters to their halfwidth forms

Although normalization occurs on an entry-by-entry basis, the processing is done at the dictionary level. Thus normalization must be enabled on a per dictionary basis. Note: Normalizing data is only required for dictionaries containing Japanese data.

The dictionary included with the Japanese cleansing package (PERSON_FIRM_JP) is normalized. If you create a custom dictionary or if you add your own entries using a bulk load file, ensure that you select the Normalize Entries box for the target dictionary. If you did not select the option to normalize entries at the time you created an existing dictionary, use Dictionary > Bulk Load to import the dictionary from an XML file. In either case, ensure you select the Normalize Entries box for the target dictionary.

Text width in output fields

Many Japanese characters are represented in both fullwidth and halfwidth forms. Latin characters can be encoded in either a proportional or fullwidth form. In either case, the fullwidth form requires more space than the halfwidth or proportional form. Thus some output fields contain halfwidth characters and other fields contain fullwidth characters. To standardize your data, you can use the Output Text Width Conversion option to set the character width for all output fields to either fullwidth or halfwidth. The normal width value reflects the normalized character width based on script type. For example, all fullwidth Latin characters are standardized to their halfwidth forms and all halfwidth katakana characters are standardized to their fullwidth forms. NORMAL_WIDTH does not require special processing and thus is the most efficient setting.

Note: Since the output width is based on the normalized width for the character type, the output data may be larger than the input data. You may need to increase the column width in the target table. For template tables, selecting the Use NVARCHAR for VARCHAR columns in supported databases box changes the VARCHAR column type to NVARCHAR and allows for increased data size.
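As an illustration of these width conversions (example strings only): the fullwidth Latin string ＳＡＰ is standardized to the halfwidth form SAP, and the halfwidth katakana string ｻﾄｳ is standardized to the fullwidth form サトウ. Because a fullwidth character typically occupies more storage than its halfwidth equivalent, this is also why target column widths may need adjusting.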

Process Japanese data

In order to process Japanese data, you must install the Japanese cleansing package on your server and client systems and then use the Repository Manager to create a cleansing package repository. By default, the Cleansing Package installer installs the dc_rules_person_firm_jp.dat file to the LINK_DIR\DataQuality\datacleanse directory.

To process data containing kanji or kana script, ensure the following options in the Data Cleanse transform are set correctly:
• Engines > Japanese must be set to Yes.
• Dictionary > Parsing Dictionary must specify the PERSON_FIRM_JP dictionary or a custom dictionary you have created.
• Reference Files > Rule File must specify the path to the dc_rules_person_firm_jp.dat file or the path to a custom rule file you have created.

Set values for other transform options, including the output text width conversion, as appropriate for your needs.

The Japanese_DataCleanse transform configuration contains default settings required for processing Japanese person and firm data. You can also modify the configuration to process operational data using Universal Data Cleanse.

Related Topics
• Installation Guide for Windows: After Installation, Cleansing package requirements and preparation
• Installation Guide for Windows: After Installation, To create or upgrade repositories
• Reference Guide: Transforms, Data Quality transforms, Data Cleanse, Data Cleanse Options
• To import dictionary changes

Universal Data Cleanse

One way to process operational data is to add custom classifications and output categories to your dictionary. By using custom outputs and other Data Cleanse features, you can configure the transform to use a complete custom parser solution designed specifically to handle your data. Operational data is anything that you want to parse that falls outside of the standard person and firm parsing provided by Data Cleanse. Examples might include information about a product line or a text string from an online ordering system.

How Universal Data Cleanse works

Universal Data Cleanse follows the same basic process as any other data parsing:
1. Word breaking—breaks the input line down into smaller, more usable pieces. By default, Data Cleanse breaks the input line on white space, punctuation, and alphanumeric transitions. You can choose to have Data Cleanse break the input line only on white space.
2. Tokenization—assigns specific meanings to each of the pieces. Data Cleanse looks up each individual input word in the dictionary. A list of tokens is created using the classifications associated with each word in the dictionary.
3. Gathering—recombines words that belong together, such as words that look up together in the dictionary. Data Cleanse does not attempt to combine contiguous words that have been broken for a custom parser, except for words that have been assigned the Phrase_Word classification.

4. Rule matching—matches the token classifications against defined rules. Data Cleanse does not match the patterns of specific words against the rules; it matches the pattern of the types, or classifications, of the words.
5. Action item assignment—outputs parsed data based upon matched rules.

Related Topics
• Reference Guide: Data Cleanse, data parsing details

Operational data in Data Cleanse

Setting up a complete Universal Data Cleanse data flow is a multi-step process. You can divide the process into a few basic tasks:
1. Analyze your data.
2. Create a new dictionary specific to your data.
3. Create new output categories and associated output fields.
4. Add new classifications to your parsing dictionary.
5. Add or modify dictionary entries for your data.
6. Write custom rules that tell Data Cleanse how to manipulate the data.
7. Configure the Data Cleanse transform.
8. Connect the Data Cleanse transform and perform any post-processing.

Although the exact order for some of these tasks is not important, dividing the work in this way allows you to keep the tasks conceptually organized. There are two examples that follow this general process: a simple example based on pizza order information and a more complex example based on iPod product information.

Related Topics
• Basic operational data example
• Complex operational data example

Analyze your data

The most important factor in creating a good operational data flow is how well you understand your data. Understanding how Data Cleanse works with your data is important for successfully creating effective parsing rules. If you try to create a solution without first fully analyzing and understanding your data, the data flow may end up being inefficient at best, but more likely will not even work the way that you want.

When you analyze your data, consider several factors:
• The types of terms in your data.
• The parts of your data that are relevant to your data flow.
• The forms of output that you need.

The best way to analyze your data is to use real data from the sources that you expect to process. Start with a small sample, and expand your sample until it includes everything that you might expect to find in your data.

You can simplify your data analysis tasks by using an application such as SAP BusinessObjects Data Insight XI. With Data Insight XI, you can obtain metrics such as word frequency in your data. For more information about using Data Insight to analyze your data, see the SAP BusinessObjects Data Insight XI User's Guide.

Create a custom dictionary

For most operational data flows, it is recommended that you create a custom dictionary. The default Data Cleanse dictionary consists of name and firm data that is not required for parsing operational data. If you use a completely custom dictionary, Data Cleanse does not have to consider all the default name and firm data, and parsing is faster and more accurate.

If you have multiple unrelated data flows that require operational data parsing, it is not necessary for all of them to share a single custom dictionary. Creating a separate dictionary for each unrelated data flow simplifies dictionary entry and rule creation, as well as provides performance benefits when you run the data flows.

To create a dictionary
1. Click Dictionary > Universal Data Cleanse > Create Dictionary.
2. Enter a name for the dictionary.
3. Enter a description for the dictionary that will appear in other dictionary windows.
4. If the dictionary will contain Japanese data, select Normalize Entries.
   Normalization ensures that the Japanese word breaker is able to break words properly and eliminates the need for the dictionary to contain both fullwidth and halfwidth versions of the same word.
   Note: Normalizing data increases job runtime and is only required for dictionaries containing Japanese data.
5. Click OK.

The new dictionary contains no primary entries, but includes all of Data Cleanse's default classifications. Before using the dictionary, you need to add entries and, if necessary, additional classifications. You can also create a bulk load XML file to import into your dictionary.

Related Topics
• To add new entries
• Bulk load dictionary entries

To delete a dictionary
1. Choose Dictionary > Universal Data Cleanse > Delete Dictionary.
2. Choose the dictionary you want to delete.
3. Click OK.

Create new output categories and fields

Based on your analyzed data, create new output categories and fields where Data Cleanse can place parsed and standardized data. For data sets where the input data comes in only a few orders, using a single output category is sufficient. However, if your data can come in any order, you can use multiple output categories to reduce the number of rules needed.

To add a new output category and fields
1. Choose Dictionary > Universal Data Cleanse > Add Custom Output.
2. Select the dictionary that you want to add the output to.
3. Specify a name for the new output category and click Add Category.
   The new output category, including its default output fields, is displayed in the Custom outputs list.
4. If needed, add output fields.
5. Click Close.

To add a new output field
1. Choose Dictionary > Universal Data Cleanse > Add Custom Output.
2. Select the dictionary that you want to add the output field to.
3. Specify a name for the new output field.
4. Select the output category in the Add to category list and click Add Field.
   The new output field is displayed under its output category in the Custom outputs list.
5. Click Save and Close.

Add new classifications

Add classifications to your dictionary for each type of term identified in your sample data. Data Cleanse can use two types of classifications: dictionary-based and pattern-based. Both are used in the rule file to define the pattern that a rule identifies.

Dictionary-based classifications add meaning to terms when assigned to primary dictionary entries. For example, the primary entry Blue might be assigned the classification Color, meaning Data Cleanse will identify blue as a color term.

Pattern-based classifications are defined using regular expressions and are not assigned to specific terms in the dictionary. Instead, Data Cleanse looks for patterns in the input data that match the specified regular expression.

To create a new classification

If the existing classifications do not apply to the situations where a word is found in your data, you can add a new classification to the dictionary. To create a new classification:
1. Choose Dictionary > Universal Data Cleanse > Add New Classification.
2. Select the dictionary for the new classification.
3. Type a name for the new classification.
4. If you want to use a pattern to define the classification, select Use pattern to define and type a regular expression.
   Note: Ensure that your expression accounts for the way that the Data Cleanse transform tokenizes input and then parses expressions.

5. Optionally, type sample data in the Data example box and then click Test.
   • If the data example matches, then the data example is highlighted and no result appears in the "Test results" area.
   • If the data sample does not match, then No Match appears in the "Test results" area.
6. Click Add Classification to add the classification to the dictionary.
7. Click Save and Close.

To edit a classification

If you have added custom classifications to a dictionary, you can change the definition for custom pattern-based classifications or remove any custom classification. The default classifications included with Data Cleanse cannot be modified or removed from a dictionary.
1. Choose Dictionary > Universal Data Cleanse > Edit Classification.
2. Select the dictionary containing the classification.
3. Select the classification that you want to change and click Edit.
4. Change the classification's definition and click Save and Close.

To remove a classification
1. Choose Dictionary > Universal Data Cleanse > Edit Classification.
2. Select the dictionary containing the classification.
3. Select the classification that you want to remove and click Delete.
4. Click Close.

Related Topics
• Reference Guide: Data Quality appendix, Data Cleanse reference, Regular expressions
• Reference Guide: Data Quality appendix, Data Cleanse reference, Data parsing details, Tokenize input

Add or modify dictionary entries

Add to your dictionary all terms that you want to parse using the custom classifications you defined. You can also define secondary standards for the terms to output them in a specific format.
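For example (an illustrative entry, not a shipped default): if you add the term Mktg to your dictionary with an appropriate classification and define Marketing as its standard, Data Cleanse can output the parsed term in the standardized form Marketing rather than as it appeared in the input.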

Tip: Start small. Add only a small number of terms to your dictionary to begin. When there are fewer terms in the dictionary, testing your custom rules is easier. After you are sure that your rules work the way that you want, you can then add the rest of the terms.

Related Topics
• To add new entries
• To edit existing entries

Write custom rules

Create custom rules to identify your data. The combination of custom rules and custom output categories effectively creates a complete custom parser, tailored to your specific data needs.

If you have added operational data to your name_firm dictionary, you can add new rules to your name_firm rule file. If you have created a custom dictionary, keep your operational data rules in a separate rule file.

Tip: The software includes a template rule file (dc_rules_template.dat) that you can use as a base for creating your own custom rule files. By default, this template is installed to the LINK_DIR\DataQuality\datacleanse folder.

Related Topics
• Reference Guide: Data Quality Appendix, Data Cleanse reference, Rule file organization

Configure the Data Cleanse transform

Configure options on the Data Cleanse transform to fit your operational data flow. You need to set several options:
• The Parsing_Dictionary option
• The Rule_File option
• The Parser_Sequence_Multiline options
• Any output fields

Related Topics
• Reference Guide: Transforms, Data Cleanse

Complete the data flow

Connect the Data Cleanse transform to the rest of your data flow, and add any transforms you might need for post-processing the output from Data Cleanse. You can use additional transforms or functions to perform various types of post-processing tasks:
• Create a single combined output field from individual output fields
• Case data differently when there is no standard available in the dictionary used—for example, on pattern-based matches
• Convert text to something else
• Add necessary punctuation
• Add or remove data from output based on certain conditions

Related Topics
• To add a Data Quality transform to a data flow

Basic operational data example

This example demonstrates how to set up Data Cleanse in a basic custom parsing solution that processes incoming pizza orders using a single output category.

Analyze your data

Here are a few lines of sample pizza data:

large mushrooms sausage pepperoni stuffed crust
medium deluxe thin crust
small vegetarian handtossed
personal hamburger bacon cheese pan
large cheese thin crust
large mushrooms, sausage, pepperoni, stuffed crust
handtossed sausage, large

Start by separating the sample data into individual terms and identifying the type of each term. This sample data contains three basic types of terms: size-related, topping-related, and crust-related.

SAP BusinessObjects Data Services Designer Guide 587 . To create a new custom dictionary for the sample pizza data. the engine automatically creates six parent component output fields. an output field named the same as the category. Add additional fields to each category as needed. Create a new custom output category and fields for the sample data. TOPPING. For a data set that consists of pizza order information. and six match standard (alias) fields are added. For each of the parent components. Data Cleanse should output the parsed data in appropriate categories and fields. This could be an output category named PIZZA and fields named SIZE. For each output category. choose Dictionary > Universal Data Cleanse > Create Dictionary. and CRUST. SCORE.Data Quality Data Cleanse 18 Size terms large medium small personal Topping terms mushrooms sausage pepperoni vegetarian hamburger bacon cheese deluxe Crust terms stuffed crust thin pan Handtossed Create a custom dictionary Create a new dictionary to contain the information for the data flow. RULE_LABEL. a dictionary named PIZZA is appropriate. Create new output categories and fields Because the data consists of pizza order information.

To create a new output category and fields for the parsed pizza data, choose Dictionary > Universal Data Cleanse > Add Custom Output.

Add new classifications

Add a classification to the dictionary for each type of term you identified when analyzing the data. In this case, add classifications for CRUST, SIZE, and TOPPING. To create new classifications for the different types of pizza terms, choose Dictionary > Universal Data Cleanse > Add New Classification.

Add or modify dictionary entries

Add the terms from the sample data to the PIZZA dictionary, using the CRUST, TOPPING, and SIZE classifications where appropriate. You can also add any standards for outputting the terms in a specific format. To create new dictionary entries for needed pizza-related terms, choose Add New Dictionary Entry from the Dictionary menu.

Write custom rules

Because this data flow will only be parsing pizza order data, create a new rules file. Use an appropriate name, such as pizzarules.dat, to avoid confusion. The directory path for Data Cleanse rules files is typically LINK_DIR\DataQuality\datacleanse. Add custom rules to the file to allow Data Cleanse to recognize multiple orders of input data. Optionally, when the input data contains multiple toppings for the same pizza, you can use the multiple token match format (ON *) to separate the toppings with commas in the output.

DataCleanse Rule File v3.0
# DO NOT EDIT, MODIFY OR REMOVE THE ABOVE LINE!!!!!

############################################
#                                          #
#               PIZZA RULES                #
#                                          #
############################################

############################################
#
# Identifies size followed by toppings and then crusts.
# Example: large pepperoni sausage thin
# Multiple toppings are separated with a comma using
# the ON* syntax
#
size_topping_crust =
    #size
    SIZE +
    #topping
    TOPPING* +
    #crust
    CRUST;

    action = PIZZA;
    PIZZA = 1 : SIZE : 1;
    PIZZA = 1 : TOPPING : 2 : ON * ", ";
    PIZZA = 1 : CRUST : 3;
    format = PIZZA : PIZZA : 1 + " " + 2 + " " + 3;
    end_action

############################################
#
# Identifies crusts followed by toppings and then size.
# Example: thin crust vegetarian large
# Multiple toppings are separated with a comma using
# the ON* syntax
#
crust_topping_size =
    #crust
    CRUST* +
    #topping
    TOPPING* +
    #size
    SIZE;

    action = PIZZA;
    PIZZA = 1 : CRUST : 1;
    PIZZA = 1 : TOPPING : 2 : ON * ", ";
    PIZZA = 1 : SIZE : 3;
    format = PIZZA : PIZZA : 3 + " " + 2 + " " + 1;
    end_action

Configure the Data Cleanse transform

Use the Designer to configure the Data Cleanse transform for the new pizza-related dictionary, rules file, parser order, and custom output fields. Start by setting the options that are applicable to the entire transform, and then add custom output fields.

The following option values are applicable to the entire transform:

Dictionary > Parsing_Dictionary: PIZZA
Reference_Files > Rule_File: [$$RefFilesDataCleanse]\pizzarules.dat
Options > Parser_Configuration > Parser_Sequence_Multiline1: PIZZA

Create composite output by generating the following output field and mapping it to your output schema:

Parent_Component: PIZZA1
Generated_Field_Name: PIZZA
Generated_Field_Class: STANDARDIZED
Type: varchar(255)
Content_Type: NONE

Create any additional output fields required. For example, to create an output field for CRUST only, use the same settings, but replace Generated_Field_Name with CRUST.
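As a quick trace of the rules above (expected behavior, assuming the sample terms were added to the PIZZA dictionary as described): the input large mushrooms sausage pepperoni stuffed crust matches the size_topping_crust rule. SIZE captures large, TOPPING captures the three toppings and joins them with commas through the ON * action, and CRUST captures stuffed crust, so the composite PIZZA1 field would contain approximately: large mushrooms, sausage, pepperoni stuffed crust.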

Complete the data flow

Connect the Data Cleanse transform to the rest of your data flow. This example does not require additional post-processing.

Related Topics
• To add a Data Quality transform to a data flow

Complex operational data example

This example demonstrates how to set up Data Cleanse in a more complex custom parsing solution that processes iPod data using multiple output categories.

Analyze your data

Here are a few lines of sample iPod data:

iPod nano (2GB) white
iPod photo (40Gb)(view pictures on the go!)
ipod U2 special edition 20GB 4G scroll wheel
2nd generation 1 GB ipod
Fifth Generation iPod (Late 2006) (30gb)
1st generation ipod shuffle 512 MB
iPod shuffle (512 Meg)
iPod mini (2G) Green

Start by separating the sample data into terms or phrases and identifying the type of each term or phrase. This sample data contains seven basic types of terms and phrases: size, color, product, generation, model, model description, and click wheel. Additionally, there are some noise words that do not apply to any of these types of terms.

size: 2GB, 40Gb, 20GB, 30gb, 512 MB, 1 GB, 512 Meg
color: white, Green
product: ipod, iPod
generation: 4G, Fifth Generation, 2nd generation, 1st generation, 2G

model: nano, photo, shuffle, mini, U2
model description: special edition, Late 2006
click wheel: scroll wheel
extra (noise words): view pictures on the go!

Consider how Data Cleanse will parse the data. By default, Data Cleanse breaks an input line on white space, punctuation, and alphanumeric transitions. Consider the following points about the data:
• 4G, 20GB, 30gb, 1st, 2nd, and U2 each break into two tokens at the alphanumeric transition. For example, 1st breaks into "1" and "st" tokens. We can use the phrase_word classification to ensure that terms such as 4G, 1st, 2nd, and U2 look up together.
• size is written inconsistently with respect to white space. 20GB and 30gb do not contain a space, but 1 GB, 512 MB, and 512 Meg do. You could set the transform option to break words on white space only; the input terms that do not contain white space, such as 20GB, 4G, 1st, and U2, would remain together. However, if you set the transform option to break on white space only, your data would not parse consistently.

For this example, let's keep the default word breaking method so that all the size terms will parse correctly. Let's also create a pattern-based classification to capture the size terms GB, MB, and Meg. We will not need to enter these terms in the dictionary; instead we will write a regular expression to capture all the variations.
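As a concrete illustration of the default word breaking (expected tokens only): the input line ipod U2 special edition 20GB would break into the tokens ipod, U, 2, special, edition, 20, and GB, because U2 and 20GB split at their alphanumeric transitions. This is why the phrase-word and pattern-based techniques described below are needed to reassemble and classify those pieces.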

Create a custom dictionary

For a data set that consists of iPod product information, a dictionary named IPOD is appropriate. To create a new custom dictionary for the sample iPod data, choose Dictionary > Universal Data Cleanse > Create Dictionary.

Create new output categories and fields

Because the data consists of iPod product information, Data Cleanse should output the parsed data in appropriate categories and fields. For this sample data, define seven custom output categories: PRODUCT, MODEL, MODEL_DESCRIP, GENERATION, STORAGE, WHEEL, and COLOR.

For each output category, the engine automatically creates six parent component output fields. For each of the parent components, an output field named the same as the category, SCORE, RULE_LABEL, and six match standard (alias) fields are added. Add additional fields to each category as needed.

Output category: Additional output fields
PRODUCT: (none)
MODEL: MOD_PART1
MODEL_DESCRIP: DESCRIPTOR1, DESCRIPTOR2
GENERATION: GEN_NUMBER, GEN_DESIGNATOR
STORAGE: STORAGE_SIZE, SIZE_DESIG
WHEEL: TYPE, DESIGNATOR
COLOR: (none)

To create new output categories and fields for the parsed iPod data, choose Dictionary > Universal Data Cleanse > Add Custom Output.

Tip: You are not required to associate classification names with specific output categories or output fields by giving them the same names, but doing so may make writing rules for your data less confusing.

Add new classifications

Add a classification to the dictionary for each type of term you identified when analyzing the data. To create new classifications for the different types of iPod terms, choose Dictionary > Universal Data Cleanse > Add New Classification.

Note: The PHRASE_WORD classification is available in all dictionaries. You do not need to create it. You can use it to gather together terms that Data Cleanse parses into separate tokens, such as U2, 1st, and 2nd, but that you want to output together.

For the SIZE_DESIG_PATTERN classification, define a regular expression to capture data that matches the size terms. For example:

([G|g][ ]?[B|b])|([M|m][ ]?[B|b])|([M|m][E|e][G|g])|([G|g][I|i][G|g])

When Data Cleanse parses the data, if the data matches the pattern it will be assigned the SIZE_DESIG_PATTERN classification.

Tip: After typing the regular expression, type a data example such as GB and click Test. If the example matches, then the data example is highlighted and no result appears in the "Test results" area.
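Reading the pattern above: each bracketed class matches a single character in either case, and [ ]? allows an optional space, so the expression matches variants such as GB, gb, M B, Meg, and gig. (As a side note, a class written as [G|g] also admits the literal | character, because | is not an alternation operator inside a character class; the simpler class [Gg] would match the same size terms.)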

For the iPod example data, add the following classifications to the IPOD dictionary:

PRODUCT_NAME: iPod
STORAGE_SIZE: 1, 2, 4, 20, 30, 40, 512
SIZE_DESIG_PATTERN: GB, Gb, gb, MB, Meg (define a pattern using a regular expression)
MODEL_PART1: nano, photo, shuffle, mini, U 2
MODEL_DESCRIP1: Late, special
MODEL_DESCRIP2: 2006, edition
COLOR: white, green
WHEEL_TYPE: scroll
WHEEL_DESIG: wheel
GENERATION_NUMBER: 1, 2, fifth
GENERATION_DESIG: st, nd, gen, generation, G
PHRASE_WORD: U, 2, st, nd, G (Phrase_Word is a default classification; you do not need to add it)

Related Topics
• To create a new classification
• Reference Guide: Data Quality Appendix, Data Cleanse reference, Regular expressions

Add or modify dictionary entries

Add the terms from the sample data to the IPOD dictionary, using the new classifications where appropriate. Some terms may need more than one classification. To create new dictionary entries for needed iPod-related terms, choose Add New Dictionary Entry from the Dictionary menu.

You can also add standards to define how Data Cleanse standardizes the output data. For example, to standardize the term iPod beginning with a lower case "i", type iPod as the standard when IPOD is used as a PRODUCT, in the secondary information section of the IPOD dictionary entry.

Phrase words

To ensure that terms such as 4G, 1st, 2nd, U 2, and U2 look up together and output as phrases, use the phrase_word classification. Create entries to classify each token as a PHRASE_WORD. The terms may also be assigned to other classifications. Then create an entry that joins the individual entries together into a phrase and standardizes the output.

For example, to create a phrase for U2:
1. Add or edit the dictionary entries so that U and 2 are classified as PHRASE_WORD.
2. Add a dictionary entry U 2 (with a space between the U and the 2) and select MODEL_PART1 as the classification.
3. In the "Secondary information" area, select MOD_PART1 from the "When used as" list.
4. Type U2 (no space) in the "Standard" field.

Data Cleanse gathers the U and 2 tokens as phrase words contained in the primary entry, and then standardizes the output as U2.

Write custom rules

Because this data flow will only be parsing iPod data, create a new rules file. Use an appropriate name, such as ipodrules.dat, to avoid confusion. Add custom rules to the file to allow Data Cleanse to recognize different parts of the input data.

end_action ####################################################### # # Model descriptor--phrase describing the iPod model # Example: Late 2006 # modeldescrip = SAP BusinessObjects Data Services Designer Guide 597 . MODIFY OR REMOVE THE ABOVE LINE!!!!! ####################################################### # # # IPOD RULES # # # ####################################################### ####################################################### # # Product name # Example: iPod # product = #product name PRODUCT_NAME. action = PRODUCT. end_action ####################################################### # # Model name # Example: shuffle # U2 # model = #model name MODEL_PART1. action = MODEL. MODEL = 1 : MOD_PART1 : 1. PRODUCT = 1 : PRODUCT : 1.Data Quality Data Cleanse 18 # DO NOT EDIT.

598 SAP BusinessObjects Data Services Designer Guide . STORAGE = 1 : SIZE_DESIG : 2. action = STORAGE. format = STORAGE : STORAGE : 1 + " " + 2. GB SIZE_DESIG_PATTERN. format = MODEL_DESCRIP : MODEL_DESCRIP : 1 + " " + 2. for example. end_action ####################################################### # # Storage capacity # Example: 2 GB # 512 Meg # storage = #size (amount) STORAGE_SIZE + #unit. action = MODEL_DESCRIP. MODEL_DESCRIP = 1 : DESCRIPTOR2 : 2. STORAGE = 1 : STORAGE_SIZE : 1. end_action ####################################################### # # Generation # Example: 4G # 2nd Generation # gen_number_designator = #generation number GENERATION_NUMBER + #generation designator GENERATION_DESIG.18 Data Quality Data Cleanse #first word of the phrase MODEL_DESCRIP1 + #second word of the phrase MODEL_DESCRIP2. MODEL_DESCRIP = 1 : DESCRIPTOR1 : 1.

parser order. COLOR = 1 : COLOR : 1. end_action ####################################################### # # User controls such as button or click wheel # Example: scroll wheel # wheel = #control type WHEEL_TYPE + #descriptor WHEEL_DESIG. GENERATION = 1 : GEN_NUMBER : 1. SAP BusinessObjects Data Services Designer Guide 599 . WHEEL = 1 : TYPE : 1. format = GENERATION : GENERATION : 1 + " " + 2.Data Quality Data Cleanse 18 action = GENERATION. end_action ####################################################### # # Color # Example: green # color = #color COLOR. format = WHEEL : WHEEL : 1 + " " + 2. GENERATION = 1 : GEN_DESIGNATOR : 2. end_action ####################################################### Configure the Data Cleanse transform Use the Designer to configure the Data Cleanse transform for the new iPod-related dictionary. rules file. and custom output fields. action = COLOR. WHEEL = 1 : DESIGNATOR : 2. action = WHEEL.

Generated field nent name PRODUCT1 MODEL1 MODEL_DE SCRIP1 MODEL_DE SCRIP1 PRODUCT MOD_PART1 DESCRIPTOR1 Generated field class STANDARDIZED STANDARDIZED STANDARDIZED DESCRIPTOR2 STANDARDIZED STANDARDIZED STANDARDIZED GENERATION1 GEN_NUMBER GENERATION1 STORAGE1 STORAGE1 WHEEL1 GEN_DESIGNA TOR STORAGE_SIZE STANDARDIZED SIZE_DESIG TYPE STANDARDIZED STANDARDIZED 600 SAP BusinessObjects Data Services Designer Guide . The following option values are applicable to the entire transform: Option Dictionary > Parsing Dictionary Reference Files > Rule File Value IPOD [$$RefFilesDataCleanse]\ipodrules.18 Data Quality Data Cleanse Start by setting the options that are applicable to the entire transform.dat Options > Parser ConfiguraPRODUCT | MODEL | MODEL_DESCRIP | GENERATION | tion > Parser Sequence MultiSTORAGE | WHEEL | COLOR line1 Create your output schema by mapping the following output fields from the Output tab: Parent compo. and then add custom output fields.
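As a quick sanity check of this configuration (expected behavior under the rules and dictionary entries described above, not verified output): the sample input iPod nano (2GB) white would be expected to parse into PRODUCT1 = iPod, MODEL1 = nano, STORAGE1 = 2 GB (the tokens 2 and GB recombined by the storage rule), and COLOR1 = white.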

Rank and prioritize parsing engines

The Data Cleanse transform can be configured to use only specific parsers or a specific parser order when dealing with multiline input. Carefully selecting which parsers to use in what order can be beneficial. Turning off parsers that you do not need significantly improves parsing speed and reduces the chances that your data will be parsed incorrectly.

You can change the parser order for a specific multiline input by modifying the corresponding parser sequence option in the Parser_Configuration options group of the Data Cleanse transform. For example, to change the order of parsers for the Multiline1 input field, modify the Parser_Sequence_Multiline1 option.

Tip: Data Cleanse parser prioritization options can be modified with the "Ordered Options" window.

Note: Selected parsers that are no longer valid for any reason, such as when the dictionary was changed, are highlighted in red.

Related Topics
• Ordered options editor

Geocoding

Including latitude and longitude information in your data may help your organization to target certain population sizes and other regional geographical data. You can use accurate address data from the Global Address Cleanse transform and then use the Geocoder transform to obtain the geographic coordinates. This geocoding section describes how to prepare your data for the Geocoder transform and how to understand your output.

If you want to assign Latitude and Longitude data to your U.S. records, see the GeoCensus section in the U.S. Regulatory Address Cleanse transform.

Related Topics
• GeoCensus (USA Regulatory Address Cleanse)

Prepare records for geocoding

To obtain the most accurate information through the Geocoder transform, place the Global Address Cleanse transform before the Geocoder transform. Use the output from the Global Address Cleanse transform and be sure to set these Global Address Cleanse options to the following values. In the Standardization options section, set both Primary Type Style and Directional Style to Short, and set Assign Locality to Convert. Also, output the two-character ISO country code and the region symbol from the Global Address Cleanse transform. Use these output fields as input fields in the Geocoder transform to assign more accurate coordinates.

Understanding your output

On output, you will have latitude and longitude data. Latitude and longitude are denoted on output by decimal degrees, for example, 12.12345. Latitude (0-90 degrees north or south of the equator) shows a negative sign in front of the output number when the location is south of the equator. Longitude (0-180 degrees east or west of the Greenwich Meridian in London, England) shows a negative sign in front of the output number when the location is within 180 degrees west of Greenwich.
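For example, a location in São Paulo, Brazil, which is south of the equator and west of Greenwich, would carry a negative sign on both values, such as a latitude of -23.55000 and a longitude of -46.63000 (illustrative coordinates).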

You can understand the accuracy of the assignment based on the Assignment_Level output field. The return code of PRI means that you have the finest depth of assignment available: to the primary address range, or house number. The most general output level is either P1 (Postcode level) or L1 (Locality level), depending on the option you chose in the Best Assignment Threshold option.

Match

Matching strategies

Here are a few examples of strategies to help you think about how you want to approach the setup of your matching data flow:
• Simple match. Use this strategy when your matching business rules consist of a single match criteria for identifying relationships in consumer, business, or product data.
• Consumer Householding. Use this strategy when your matching business rules consist of multiple levels of consumer relationships, such as residential matches, family matches, and individual matches.
• Corporate Householding. Use this strategy when your matching business rules consist of multiple levels of corporate relationships, such as corporate matches, subsidiary matches, and contact matches.
• Multinational consumer match. Use this match strategy when your data consists of multiple countries and your matching business rules are different for different countries.
• Identify a person multiple ways. Use this strategy when your matching business rules consist of multiple match criteria for identifying relationships, and you want to find the overlap between all of those definitions. See Association matching for more information.

Think about the answers to these questions before deciding on a match strategy:

• What does my data consist of? (Customer data, international data, and so on)
• What fields do I want to compare? (Last name, firm, and so on)
• What are the relative strengths and weaknesses of the data in those fields?
  Tip: You will get better results if you cleanse your data before matching. Also, data profiling can help you answer this question.
• What end result do I want when the match job is complete? (One record per family, per firm, and so on)

Match components

The basic components of matching are:
• Match sets
• Match levels
• Match criteria

Match sets

A match set is represented by a Match transform on your workspace. Each match set can have its own break groups, match criteria, and prioritization. A match set has three purposes:
• To allow only select data into a given set of match criteria for possible comparison (for example, exclude blank SSNs, international addresses, and so on)
• To allow for related match scenarios to be stacked to create a multi-level match set
• To allow for multiple match sets to be considered for association in an Associate match set

Match sets let you control how the Match transform matches certain records. For example, you could choose to match U.S. records differently than records containing international data: segregate the records, and match on them independently.

Match levels

A match level is an indicator of what type of matching will occur, such as on individual, family, resident, firm, and so on. A match level refers not to a specific criteria, but to the broad category of matching. You can have as many match levels as you want. However, the Match wizard restricts you to three levels during setup (more can be added later). You can define each match level in a match set in a way that is increasingly more strict. Multi-level matching feeds only the records that match from match level to match level (for example, family, individual, resident) for comparison.

Match component: Description

Family: The purpose of the family match type is to determine whether two people should be considered members of the same family, as reflected by their record data. The Match transform compares the last name and the address data. A match means that the two records represent members of the same family. The result of the match is one record per family.

Individual: The purpose of the individual match type is to determine whether two records are for the same person, as reflected by their record data. The Match transform compares the first name, last name, and address data. A match means that the two records represent the same person. The result of the match is one record per individual.

Resident: The purpose of the resident match type is to determine whether two records should be considered members of the same residence, as reflected by their record data. The Match transform compares the address data. A match means that the two records represent members of the same household. The result of the match is one record per residence. Contrast this match type with the family match type, which also compares last-name data.

Firm: The purpose of the firm match type is to determine whether two records reflect the same firm, as reflected by their record data. This match type involves comparisons of firm and address data. A match means that the two records represent the same firm. The result of the match is one record per firm.

Firm-Individual: The purpose of the firm-individual match type is to determine whether two records are for the same person at the same firm, as reflected by their record data. With this match type, we compare the first name, last name, firm name, and address data. A match means that the two records reflect the same person at the same firm. The result of the match is one record per individual per firm.

Match criteria

Match criteria refers to the field you want to match on:
• Family level match criteria may include family (last) name and address, or family (last) name and telephone number.
• Individual level match criteria may include full name and address, full name and SSN, or full name and e-mail address.
• Firm level match criteria may include firm name and address, firm name and Standard Industrial Classification (SIC) Code, or firm name and Data Universal Numbering System (DUNS) number.

You can use criteria options to specify business rules for matching on each of these fields. They allow you to control how close to exact the data needs to be for that data to be considered a match. For example, you may require first names to be at least 85% similar, but also allow a first name initial to match a spelled-out first name, and allow a first name to match a middle name.
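To make the "85% similar" idea concrete, here is a minimal sketch of per-field similarity scoring using Python's standard-library difflib. The threshold and helper names are illustrative only; the Match transform uses its own comparison algorithms and criteria options, not this code.

    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        """Return a 0-100 similarity score for two field values."""
        return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def first_names_match(name1: str, name2: str, threshold: float = 85.0) -> bool:
        # Allow an initial ("J") to match a spelled-out name ("John"),
        # mirroring the criteria option described above.
        if len(name1) == 1 or len(name2) == 1:
            return name1[:1].lower() == name2[:1].lower()
        return similarity(name1, name2) >= threshold

    print(first_names_match("Jon", "John"))  # True: 85.7% similar
    print(first_names_match("J", "John"))    # True: initial matches spelled-out name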

Match Wizard

Use the Match Wizard

Select match strategy

The Match wizard begins by prompting you to choose a match strategy, based on your business rule requirements. The path through the Match wizard depends on the strategy you select here. Use these descriptions to help you decide which strategy is best for you:
• Simple match. Use this strategy when your matching business rules consist of a single match criteria for identifying relationships in consumer, business, or product data.
• Consumer Householding. Use this strategy when your matching business rules consist of multiple levels of consumer relationships, such as residential matches, family matches, and individual matches.
• Corporate Householding. Use this strategy when your matching business rules consist of multiple levels of corporate relationships, such as corporate matches, subsidiary matches, and contact matches.
• Multinational consumer match. Use this match strategy when your data consists of multiple countries and your matching business rules are different for different countries.
• Identify a person multiple ways. Use this strategy when your matching business rules consist of multiple match criteria for identifying relationships, and you want to find the overlap between all of those definitions.

Note: The multinational consumer match strategy sets up a data flow that expects Latin1 data. If you want to use Unicode matching, you must edit your data flow after it has been created.

Source statistics

If you want to generate source statistics for reports, make sure a field that houses the physical source value exists in all of the data sources.

To generate source statistics for your match reports, select the Generate statistics for your sources checkbox, and then select a field that contains your physical source value.

Related Topics
• Unicode matching
• Association matching

Define match criteria

Criteria represent the data that you want to use to help determine matches. In this window, you will define these criteria for each match set that you are using.

Match sets compare data to find similar records, working independently within each break group that you designate (later in the Match wizard). The records in one break group are not compared against those in any other break group.

For example, when working on student or snowbird data, an individual may use the same name but have multiple valid addresses:

Data1              Data2               Data3            Data4
R. Carson          1239 Whistle Lane   Columbus, Ohio   555-23-4333
Robert T. Carson   52 Sunbird Suites   Tampa, Florida   555-23-4333

Select a combination of fields that best shows which information overlaps:
• To find the data that matches all the fields, use a single match set with multiple fields.
• To find the data that matches only in a specific combination of fields, such as the family name and the SSN, use multiple match sets with two fields.

1. Enter the number of ways you have to identify an individual. This produces the corresponding number of match sets (transforms) in the data flow.
2. Select a match set in the Match sets list, and enter a more descriptive name if necessary. The default match set name appears in the Name field.

For each match set, choose the criteria you want to match on. The default criteria is already selected, and the default criteria selection is a good place to start when choosing criteria. If you do not want to use the default criteria, click to remove the check mark from the box. If you want to use criteria other than those offered, click Custom and then select the desired criteria. Later, you will assign fields from upstream transforms to these criteria.

Define match levels

Match levels allow matching processes to be defined at distinct levels that are logically related. Match levels refer to the broad category of matching, not the specific rules of matching. Multi-level matching can contain up to 3 levels within a single match set, defined in a way that is increasingly more strict. Multi-level matching feeds only the records that match from match level to match level (that is, family, resident, individual) for comparison. You can add criteria for each level to help make finer or more precise matches. For instance, a residence-level match would match on only address elements, a family-level match would match on only Last Name, and then the individual-level match would match on First Name.

To define match levels:
1. Click the top level match, and enter a name for the level, if you don't want to keep the default name.
2. Select any additional criteria for this level.
3. Continue until you have populated all the levels that you require.

Select countries

Select the countries whose postal standards may be required to effectively compare the incoming data. The left panel shows a list of all available countries. The right panel shows the countries you already selected.
1. Select the country name in the All Countries list.
2. Click Add to move it into the Selected Countries list.
3. Repeat steps 1 and 2 for each country that you want to include. You can also select multiple countries and add them all by clicking the Add button.

The countries that you select are remembered for the next Match wizard session.

Group countries into tracks

Create tracks to group countries into logical combinations based on your business rules (for example, Asia, Europe, South America). Each track creates up to six match sets (Match transforms), depending on the number you specify in the Define Sets window of the Match Wizard.
1. Select the number of tracks that you want to create. The Tracks list reflects the number of tracks you choose and assigns a track number for each.
2. To create each track, select a track title, such as Track1.
3. Select the countries that you want in that track.
4. Click Add to move the selected countries to the selected track.
   Use the COUNTRY UNKNOWN (__) listing for data where the country of origin has not been identified. Use the COUNTRY OTHER (--) listing for data whose country of origin has been identified, but the country does not exist in the list of selected countries.
5. The Next button is only enabled when all tracks have an entry and all countries are assigned to a track.

The wizard sets up each match set with the Match engine option set to Latin1. If you want to take advantage of native-script intelligence (part of the Unicode matching feature), remember to manually change the Match engine option for each language after you complete the Match wizard.

Select criteria fields

Select and deselect criteria fields for each match set and match level you create in your data flow. These selections determine which fields are compared for each record. Some criteria may be selected by default, based on the data input. If there is only one field of the appropriate content type, you will not be able to change the field for that criteria within the Match Wizard. To enable the Next button, you must select at least one non-match-standard field.

1. For each of the criteria fields you want to include, select an available field from the drop-down list, which contains fields from upstream source(s). The available fields are limited to the appropriate content types for that criteria. If no fields of the appropriate type are available, all upstream fields display in the menu.
2. Optional: Deselect any criteria fields you do not want to include.

Create break keys

Use break keys to create manageable groups of data to compare. The match set compares the data in the records within each break group only, not across the groups. Making the correct selections can save valuable processing time by preventing widely divergent data from being compared.

Break keys are especially important when you deal with large amounts of data, because the size of the break groups can affect processing time. Even if your data is not extensive, break groups will help to speed up processing. Create break keys that group similar data that would most likely contain matches. Keep in mind that records in one break group will not be compared against records in any other break group.

For example, when you match to find duplicate addresses, base the break key on the postcode, city, or state to create groups with the most likely matches. When you match to find duplicate individuals, base the break key on the postcode and a portion of the name as the most likely point of match.

To create break keys:
1. In the How many fields column, select the number of fields to include in the break key.
2. For each break key, select the following:
   • the field(s) in the break key
   • the starting point for each field
   • the number of positions to read from each field
3. After you define the break keys, do one of the following:
   • Click Finish. This completes the match transform.
   • If you are performing multi-national matching, click Next to go to the Matching Criteria page.

Match wizard

The Match wizard can quickly set up match data flows, without requiring you to manually create each individual transform it takes to complete the task.

What the Match wizard does

The Match wizard:
• Builds all the necessary transforms to perform the match strategy you choose.
• Applies default values to your match criteria based on the strategy you choose.
• Places the resulting transforms on the workspace, connected to the upstream transform you choose.
• Detects the appropriate upstream fields and maps to them automatically.

What the Match wizard does not do

The Match wizard provides you with a basic match setup that in some cases will require customization to meet your business rules. The Match wizard:
• Does not alter any data that flows through it. To correct non-standard entries or missing data, place one of the address cleansing transforms and a Data Cleanse transform upstream from the matching process.
• Does not connect the generated match transforms to any downstream transform, such as a Loader. You are responsible for connecting these transforms.
• Does not allow you to set rule-based or weighted scoring values for matching. The Match wizard incorporates a "best practices" standard that sets these values for you. You may want to edit option values to conform to your business rules.

Related Topics
• Combination method

Before you begin

Prepare a data flow for the Match wizard

To maximize its usefulness, be sure to include the following in your data flow before you launch the Match wizard:
• Include a Reader in your data flow. You may want to match on a particular input field that our data cleansing transforms do not handle.
• Include one of the address cleansing transforms and the Data Cleanse transform. The Match wizard works best if the data you're matching has already been cleansed and parsed into discrete fields upstream in the data flow.
• If you want to match on any address fields, be sure that you pass them through the Data Cleanse transform. Otherwise, they will not be available to the Match transform (and Match Wizard). This rule is also true if you have the Data Cleanse transform before an address cleanse transform.

After setup

Although the Match wizard does a lot of the work, there are some things that you must do to have a runnable match job. There are also some things you want to do to refine your matching process.

Connect to downstream transforms

When the Match wizard is complete, it places the generated transforms on the workspace, connected to the upstream transform you selected to start the Match wizard. For your job to run, you must connect each port from the last transform to a downstream transform. To do this, click a port and drag to connect to the desired object.

View and edit the new match transform

To see what is incorporated in the transform(s) that the Match Wizard produces, right-click the transform and choose Match Editor.

View and edit Associate transforms

To see what is incorporated in the Associate transform(s) that the Match Wizard produces, right-click the transform and choose Associate Editor.

Multinational matching

For the Multinational consumer match strategy, the wizard builds as many Match transforms as you specify in the Define Sets window of the wizard for each track you create. By default, the Match wizard sets the Match engine option in the Match transform of each match set to Latin1. You may want to change this value to reflect the type of data coming into the match set.

Caution: If you delete any tracks from the workspace after the wizard builds them, you must open the Case transform and delete any unwanted rules.

Related Topics
• Unicode matching

Transforms for match data flows

The Match and Associate transforms are the primary transforms involved in setting up matching in a data flow. These transforms perform the basic matching functions. There are also other transforms that can be used for specific purposes to optimize matching.

Transform: Usage

Case: Routes data to a particular Match transform (match set). A common usage for this transform is to send USA-specific and international-specific data to different transforms. You can also use this transform to route blank records around a Match transform.

Merge: Performs the following functions:
• Brings together data from Match transforms for Association matching.
• Brings together matching records and blank records after being split by a Case transform.

Query: Creates fields, performs functions to help prepare data for matching, orders data, and so on.

Example: Any time you need to bypass records from a particular match process (usually in Associative data flows, and any time you want to have records with blank data bypass a match process), you will use the Case, Query, and Merge transforms.
• The Case transform has two routes: one route sends all records that meet the criteria to the Match transform, and one that sends all other records to the bypass match route.
• The Query transform adds the fields that the Match transform generates and that you output. (The output schema in the Match transform and the output schema in the Query transform must be identical for them to be merged.) The contents of the newly added fields in the Query transform may be populated with an empty string.
• The Merge transform merges the two routes into a single route.

To remove matching from the Match transform

You may want to place a transform that employs some of the functionality of a Match transform, but that doesn't do matching at all, in your data flow. For example, you may want to do candidate selection or prioritization in a data flow, or at a location in a data flow, that doesn't do matching at all.
1. Right-click the Match transform in the object library, and choose New.
2. In the Format name field, enter a meaningful name for your transform. It's helpful to indicate which type of function this transform will be performing.
3. Click OK.
4. Drag and drop your new Match transform configuration onto the workspace and connect it to your data flow.
5. Right-click the new transform, and choose Match Editor.
6. Deselect the Perform matching option in the upper left corner of the Match editor.

Now you can add any available operation to this transform, but it does not include the actual matching features.

Working in the Match and Associate editors

Editors

The Match and Associate transform editors allow you to set up your input and output schemas. You can access these editors by double-clicking the appropriate transform icon on your workspace.

The Match and Associate editors allow you to configure your transform's options. You can access these editors by right-clicking the appropriate transform and choosing Match Editor (or Associate Editor).

Order of setup

Remember: The order in which you set up your Match transform is important!

First, it is best to map your input fields. If you don't, and you add an operation in the Match editor, you may not see a particular field you want to use for that operation.

Second, you should configure your options in the Match editor before you map your output fields. Adding operations to the Match transform (such as Unique ID and Group Statistics) can provide you with useful Match transform-generated fields that you may want to use later in the data flow or add to your database.

Example:
1. Map your input fields.
2. Configure the options for the transform.
3. Map your output fields.

Physical and logical sources

Tracking your input data sources and other sources, whether based on an input source or based on some data element in the rows being read, throughout the data flow is essential for producing informative match reports. Depending on what you are tracking, you must create the appropriate fields in your data flow to ensure that the software generates the statistics you want, if you don't already have them in your database.
• Physical source: The filename or value attributed to the source of the input data.
• Logical source: A group of records spanning multiple input sources, or a subset of records from a single input source.

Physical input sources

You track your input data source by assigning that physical source a value in a field. Then you will use this field in the transforms where report statistics are generated.

To assign this value, add a Query transform after the source and add a column with a constant containing the name you want to assign to this source.

Note: If your source is a flat file, you can use the Include file name option to automatically generate a column containing the file name.
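Outside the Designer, the effect of that constant column is easy to picture. The sketch below tags each record with a physical source value the way the Query transform's constant mapping would; the field name PHYSICAL_SOURCE and the source names are illustrative, not prescribed by the product.

    def tag_physical_source(records: list[dict], source_name: str) -> list[dict]:
        """Mimic a Query transform that maps a constant into a new column."""
        return [{**record, "PHYSICAL_SOURCE": source_name} for record in records]

    house_file = tag_physical_source(
        [{"NAME": "Maria Ramirez"}, {"NAME": "Rita Smith"}], "HOUSE_DB")
    update_file = tag_physical_source(
        [{"NAME": "R. Carson"}], "UPDATE_FILE")

    # Downstream operations (Group Statistics, for example) can now report per source.
    for record in house_file + update_file:
        print(record["PHYSICAL_SOURCE"], record["NAME"])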

Logical input sources

If you want to count source statistics in the Match transform (for the Match Source Statistics Summary report, for example), you must create a field using a Query transform or a User-Defined transform, if you don't already have one in your input data sources.

This field tracks the various sources within a Reader for reporting purposes, and is used in the Group Statistics operation of the Match transform to generate the source statistics. It is also used in compare tables, so that you can specify which sources to compare.

Using sources

A source is the grouping of records on the basis of some data characteristic that you can identify. A source might be all records from one input file, or all records that contain a particular value in a particular field. Sources are abstract and arbitrary; there is no physical boundary line between sources. Source membership can cut across input files or database records, as well as distinguish among records within a file or database, based on how you define the source.

Typically, a match user expects some characteristic or combination of characteristics to be significant, either for selecting the best matching record, or for deciding which records to include or exclude from a mailing list, for example. Sources enable you to attach those characteristics to a record, by virtue of that record's membership in its particular source.

If you are willing to treat all your input records as normal, eligible records with equal priority, then you do not need to include sources in your job.

Before getting to the details about how to set up and use sources, here are some of the many reasons you might want to include sources in your job:
• To give one set of records priority over others. For example, you might want to give the records of your house database or a suppression source priority over the records from an update file.
• To identify a set of records that match suppression sources, such as the DMA.

• To set up a set of records that should not be counted toward multi-source status. For example, some mailers use a seed source of potential buyers who report back to the mailer when they receive a mail piece so that the mailer can measure delivery. These are special-type records.
• To save processing time, by canceling the comparison within a set of records that you know contains no matching records. In this case, you must know that there are no matching records within the source, but there may be matches among sources. To save processing time, you could set up sources and cancel comparing within each source.
• To get separate report statistics for a set of records within a source, or to get report statistics for groups of sources.
• To protect a source from having its data overwritten by a best record or unique ID operation. You can choose to protect data based on membership in a source.

Source types

SAP BusinessObjects Data Services lets you identify each source as one of three different types: Normal, Suppress, or Special. The software can process your records differently depending on their source type.

Source: Description

Normal: A Normal source is a group of records considered to be good, eligible records.

Suppress: A Suppress source contains records that would often disqualify a record from use. For example, if you're using Match to refine a mailing source, a suppress source can help remove records from the mailing. Examples:
• DMA Mail Preference File
• American Correctional Association prisons/jails sources
• No pandering or non-responder sources
• Credit card or bad-check suppression sources

Special: A Special source is treated like a Normal source, with one exception. A Special source is not counted when determining whether a match group is single-source or multi-source. A Special source can contribute records, but it's not counted toward multi-source status. For example, some companies use a source of seed names. These are names of people who report when they receive advertising mail, so that the mailer can measure mail delivery. Appearance on the seed source is not counted toward multi-source status.

The reason for identifying the source type is to set that identity for each of the records that are members of the source. Source type plays an important role in controlling the priority (order) of records in a break group, how the software processes matching records (the members of match groups), and how the software produces output (that is, whether it includes or excludes a record from its output).

To manually define input sources

Once you have mapped in an input field that contains the source values, you can create your sources in the Match Editor.
1. In the Match Editor, select Transform Options in the explorer pane on the left, click the Add button, and select Input Sources. The new Input Sources operation appears under Transform Options in the explorer pane. Select it to view Input Source options.
2. In the Value field drop-down list, choose the field that contains the input source value.

3. In the Define sources table, create a source name, type a source value that exists in the Value field for that source, and choose a source type.
4. Choose a value for the Default source name option. This name will be used for any record whose source field value is blank.

Be sure to click the Apply button to save any changes you have made, before you move to another operation in the Match Editor.

To automatically define input sources

To avoid manually defining your input sources, you can choose to do it automatically by choosing the Auto generate sources option in the Input Sources operation. Auto generating sources will create a source for each unique value in the Value field. Any records that do not have a value field defined will be assigned to the default source name.
1. In the Match Editor, select Transform Options in the explorer pane on the left, click the Add button, and select Input Sources. The new Input Sources operation appears under Transform Options in the explorer pane. Select it to view Input Source options.
2. In the Value field drop-down list, choose the field that contains the input source value.
3. Choose a value for the Default source name option. This name will be used for any record whose source field value is blank.
4. Select the Auto generate sources option.
5. Choose a value for the Default type option. The default type will be assigned to any source that does not already have the type defined in the Type field.
6. Select a field from the drop-down list in the Type field option.

Source groups

The source group capability adds a higher level of source management. For example, suppose you rented several files from two brokers. You define five sources to be used in ranking the records. In addition, you would like to see your job's statistics broken down by broker as well as by file. Besides the five sources, you can define groups of sources for each broker.

Source groups primarily affect reports. However, you can also use source groups to select multi-source records based on the number of source groups in which a name occurs. Remember that you cannot use source groups in the same way you use sources. For example, you cannot give one source group priority over another.

To create source groups

You must have input sources in an Input Sources operation defined to be able to add this operation or define your source groups.
1. Select a Match transform in your data flow, and choose Tools > Match Editor.
2. In the Match Editor, select Transform Options in the explorer pane on the left, click the Add button, and select Source Groups. The new Source Groups operation appears under the Input Sources operation in the explorer pane. Select it to view Source Group options.
3. Confirm that the input sources you need are in the Sources column on the right.
4. Double-click the first row in the Source Groups column on the left, enter a name for your first source group, and press Enter.
5. Select a source in the Sources column and click the Add button.
6. Choose a value for the Undefined action option. This option specifies the action to take if an input source does not appear in a source group.
7. If you chose Default as the undefined action in the previous step, you must choose a value in the Default source group option. This option is populated with source groups you have already defined. If an input source is not assigned to a source group, it will be assigned to this default source group.
8. If you want, select a field in the Source group field option drop-down list that contains the value for your source groups.

Match preparation

Prepare data for matching

Data correction and standardization

Accurate matches depend on good data coming into the Match transform. For batch matching, we always recommend that you include one of the address cleansing transforms and a Data Cleanse transform in your data flow before you attempt matching.

Filter out empty records

You should filter out empty records before matching. This should help performance. Use a Case transform to route records to a different path, or a Query transform to filter or block records.

Noise words

You can perform a search and replace on words that are meaningless to the matching process. For matching on firm data, words such as Inc., Corp., and Ltd. can be removed. You can use the search and replace function in the Query transform to accomplish this. (A sketch of this kind of word-level cleanup appears at the end of this section.)

Break groups

Break groups organize records into collections that are potential matches, thus reducing the number of comparisons that the Match transform must perform. Include a Break Group operation in your Match transform to improve performance.

Match standards

You may want to include variations of name or firm data in the matching process to help ensure a match. For example, a variation of Bill might be William. When making comparisons, you may want to use the original data and one or more variations, depending on the type of data. You can add anywhere from one to five variations or match standards.

For example, if the first names are compared but don't match, the variations are then compared. If the variations match, the two records still have a chance of matching, rather than failing because the original first names were not considered a match.

Custom Match Standards

You can match on custom Data Cleanse output fields and associated aliases. Map the custom output fields from Data Cleanse, and the custom fields will appear in the Match Editor's Criteria Fields tab.
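As a sketch of the noise-word removal described under "Noise words" above: in a data flow this is done with the Query transform's search and replace function, but the logic is just a word-level filter. The noise-word list here is illustrative; build yours from your own firm data.

    import re

    # Illustrative noise words for firm-name matching; extend for your data.
    NOISE_WORDS = {"INC", "CORP", "LTD", "LLC", "CO"}

    def strip_noise_words(firm_name: str) -> str:
        """Remove punctuation and noise words that are meaningless to matching."""
        words = re.sub(r"[^\w\s]", " ", firm_name).upper().split()
        return " ".join(w for w in words if w not in NOISE_WORDS)

    print(strip_noise_words("Acme Widgets, Inc."))    # ACME WIDGETS
    print(strip_noise_words("Acme Widget Corp Ltd"))  # ACME WIDGET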

Fields to include for matching

To take advantage of the wide range of features in the Match transform, you will need to map a number of input fields, other than the ones that you want to use as match criteria.

Example: Here are some of the other fields that you might want to include. The names of the fields are not important, as long as you remember which field contains the appropriate data.

Field contents: Contains...

Logical source: A value that specifies which logical source a record originated from (for example, a source object, or a group of candidate-selected records). This field is used in the Match transform options, compare tables, and the Associate transform.

Physical source: A value that specifies which physical source a record originated from. This field is used in the Candidate Selection operation and also the Associate transform.

Break keys: A field that contains the break key value for creating break groups. Including a field that already contains the break key value could help improve the performance of break group creation, because it will save the Match transform from doing the parsing of multiple fields to create the break key. Set up your break groups in the Match Editor with the Break Group operation. Note: Do not map an input field to the BREAK_KEY field.

Criteria fields: The fields that contain the data you want to match on.

Count flags: A Yes or No value to specify whether a logical source should be counted in a Group Statistics operation. This field is used in the Group Statistics operation.

Record priority: A value that is used to signify a record as having priority over another when ordering records. This field is used in Group Prioritization operations.

Apply blank penalty: A Yes or No value to specify whether Match should apply a blank penalty to a record. This field is used in Group Prioritization operations.

Starting unique ID: A starting ID value that will then increment by 1 every time a unique ID value is assigned. This field is used in the Unique ID operation.

This is not a complete list. Depending on the features you want to use, you may want to include many other fields that will be used in the Match transform.

Control record comparisons

Controlling the number of record comparisons in the matching process is important for a couple of reasons:
• Speed. By controlling the actual number of comparisons, you can save processing time.
• Match quality. By grouping together only those records that have a potential to match, you are assured of better results in your matching process.

Controlling the number of comparisons is primarily done in the Group Forming section of the Match editor with the following operations:
• Break group: Break up your records into smaller groups of records that are more likely to match.
• Candidate selection: Select only match candidates from a database table. This is primarily used for real-time jobs.

You can also use compare tables to include or exclude records for comparison by logical source.

Related Topics
• Break groups
• Candidate selection
• Compare tables

Break groups

When you create break groups, you place records into groups that are likely to match. For example, a common scenario is to create break groups based on a postcode. This ensures that records from different postcodes will never be compared, because the chances of finding a matching record with a different postcode are very small.

Break keys

You form break groups by creating a break key: a field that consists of parts of other fields, or a single field, which is then used to group together records based on similar data.

Here is an example of a typical break key created by combining the five digits of the Postcode1 field and the first three characters of the Address_Primary_Name field:

Field (Start pos:length)       Data in field
Postcode1 (1:5)                10101
Address_Primary_Name (1:3)     Main
Generated break key            10101Mai

All records that match the generated break key in this example are placed in the same break group and compared against one another. (A sketch of this construction appears at the end of this section.)

Sorting of records in the break group

Records are sorted on the break key field. You can add a Group Prioritization operation after the Break Groups operation to specify which records you want to be the drivers.

Remember: Order is important! If you are creating break groups using records from a Suppress-type source, be sure that the suppression records are the drivers in the break group.
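The break-key construction above is simple enough to sketch outside the product. The following is a minimal illustration, assuming input records with Postcode1 and Address_Primary_Name fields; the Match transform's own Break Group operation is what actually does this in a data flow.

    from collections import defaultdict

    def break_key(record: dict) -> str:
        """Postcode1 (start 1, length 5) + Address_Primary_Name (start 1, length 3)."""
        return record["Postcode1"][:5] + record["Address_Primary_Name"][:3]

    records = [
        {"Postcode1": "10101", "Address_Primary_Name": "Main", "Name": "M. Ramirez"},
        {"Postcode1": "10101", "Address_Primary_Name": "Main", "Name": "Maria Ramirez"},
        {"Postcode1": "55343", "Address_Primary_Name": "Bren", "Name": "R. Smith"},
    ]

    break_groups: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        break_groups[break_key(record)].append(record)

    # Only records within the same group ("10101Mai", "55343Bre") are compared.
    for key, group in break_groups.items():
        print(key, [r["Name"] for r in group])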

Break group anatomy

Break groups consist of driver and passenger records. The driver record is the first record in the break group, and all other records are passengers. The driver record is the record that drives the comparison process in matching. The driver is compared to all of the passengers first.

Phonetic break keys

You can also use the Soundex and Double Metaphone functions to create fields containing phonetic codes, which can then be used to form break groups for matching.

Related Topics
• Phonetic matching
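To illustrate the phonetic idea above, here is a sketch of a standard Soundex implementation used to build a phonetic break key, so that "Smith" and "Smythe" land in the same break group. This is for illustration only; in a data flow you would call the software's own Soundex or Double Metaphone functions rather than hand-coding the algorithm.

    def soundex(name: str) -> str:
        """Standard four-character Soundex code (letter + three digits)."""
        codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
                 **dict.fromkeys("DT", "3"), "L": "4",
                 **dict.fromkeys("MN", "5"), "R": "6"}
        name = "".join(c for c in name.upper() if c.isalpha())
        if not name:
            return ""
        out, prev = name[0], codes.get(name[0], "")
        for c in name[1:]:
            code = codes.get(c, "")
            if code and code != prev:
                out += code
            if c not in "HW":  # H and W do not separate letters with the same code
                prev = code
        return (out + "000")[:4]

    # Both names produce S530, so they would share a phonetic break group.
    print(soundex("Smith"), soundex("Smythe"))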

For example. Select a field in the field column that you want to use as a break key. 6.000 or so postcodes.Data Quality Match 18 2. When the records are appended. they are not logically grouped in any way. Order your rows by selecting a row and clicking the Move Up and Move Down buttons. Your break key is now created. add a row by clicking the Add button.3) takes the last 3 characters of the string. 4. suppose you have a new source of records that you want to compare against your data warehouse in a batch job. whether the string has length of 10 or a length of 5. For example. use the Candidate Selection operaton (Group forming option group) in the Match transform to append records from a relational database to an existing data collection before matching. Add more rows and fields as necessary. SAP BusinessObjects Data Services Designer Guide 629 . 3. not the specified length of the field. regional file—with a large. They are simply appended to the end of the data collection on a record-by-record basis until the collection reaches the specified size. Postcode is a common break key to use. Field(-3. Choose the start position and length (number of characters) you want used in your break key. Ordering your rows ensures that the fields are used in the right order in the break key. in the Break key table. here is a simplified illustration: Suppose your job is comparing a new source database—a smaller. You can use negative integers to signify that you want to start at the end of the actual string length. national database that includes 15 records in each of 43. From this warehouse. Candidate selection To speed processing in a match job. Further assume that you want to form break groups based only on the postcode. 5. This helps narrow down the number of comparisons the Match transform has to make. For example. you can select records that match the break keys of the new source.

• Without candidate selection, the Match transform reads all of the records of both databases: 1,500 regional + 750,000 national = 751,500 records read.
• With candidate selection, only those records that would be included in a break group are read: 1,500 regional + about 600 national (40 x 15) = 2,100 records read.

Datastores and candidate selection

To use candidate selection, you must connect to a valid datastore. You can connect to any SQL-based or persistent cache datastore. There are advantages to using one over the other, depending on whether your secondary source is static (it isn't updated often) or dynamic (the source is updated often).

Persistent cache datastores

Persistent cache is like any other datastore from which you can load your candidate set. If the secondary source from which you do candidate selection is fairly static (that is, it will not change often), then you might want to consider building a persistent cache, rather than using your secondary source directly, to use as your secondary table. You may improve performance.

You may also encounter performance gains by using a flat file (a more easily searchable format than an RDBMS) for your persistent cache.

If the secondary source is not an RDBMS, such as a flat file, you cannot use it as a datastore. In this case, you can create a persistent cache out of that flat file source and then use that for candidate selection.

Note: A persistent cache used in candidate selection must be created by a data flow in double-byte mode. To do this, you will need to change the locale setting in the Data Services Locale Selector (set the code page to utf-8). Run the job to generate the persistent cache, and then you can change the code page back to its original setting if you want.

Cache size

Performance gains using persistent cache also depend on the size of the secondary source data. As the size of the data loaded in the persistent cache increases, the performance gains may decrease. Also note that if the original secondary source table is properly indexed and optimized for speed, then there may be no benefit in creating a persistent cache (or even a pre-load cache) out of it.

Related Topics
• Persistent cache datastores

Auto-generation vs. custom SQL

There are cases where the Match transform can generate SQL for you, and there are times where you must create your own SQL. This is determined by the options you set and how your secondary table (the table you are selecting match candidates from) is set up. Use this table to help you determine whether you can use auto-generated SQL or if you must create your own.

Note: In the following scenarios, "input data" refers to break key fields coming from a transform upstream from the Match transform (such as a Query transform), or break key fields coming from the Break Group operation within the Match transform itself.

Scenario: Auto-generate or Custom?
• You have a single break key field in your input data, and you have the same field in your secondary table: Auto-generate
• You have multiple break key fields in your input data, and you have the same fields in your secondary table: Auto-generate
• You have a single break key field in your input data, and you have multiple break key fields in your secondary table: Custom
• You have multiple break key fields in your input data, but you have a different format or number of fields in your secondary table: Custom
• You want to select from multiple input sources: Custom

Break keys and candidate selection

We recommend that you create a break key column in your secondary table (the table that contains the records you want to compare with the input data in your data flow) that matches the break key you create your break groups with in the Match transform. This can help improve the performance of candidate selection. Also, each of these columns should be indexed.

We also recommend that you create and populate the database you are selecting from with a single break key field, rather than pulling substrings from database fields to create your break key. This makes setup of the Candidate Selection operation much easier. A sketch of the selection logic appears below.
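The mechanics of candidate selection are easy to simulate: take the break keys of the incoming collection and select only matching rows from the secondary table. A minimal sketch using Python's built-in sqlite3 follows; the table and column names (TblCustomer, BreakKey) are illustrative, and the real operation is configured in the Match transform, not hand-coded.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TblCustomer (BreakKey TEXT, Name TEXT)")
    conn.executemany("INSERT INTO TblCustomer VALUES (?, ?)",
                     [("10101Mai", "M. Ramirez"), ("55343Bre", "R. Smith"),
                      ("10101Mai", "Maria Ramirez")])

    def select_candidates(conn, break_keys):
        """Append only secondary-table rows whose break key matches the input data."""
        placeholders = ", ".join("?" for _ in break_keys)
        sql = f"SELECT BreakKey, Name FROM TblCustomer WHERE BreakKey IN ({placeholders})"
        return conn.execute(sql, list(break_keys)).fetchall()

    # Only rows sharing the incoming break key are read, not the whole table.
    print(select_candidates(conn, {"10101Mai"}))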

Note: Records extracted by candidate selection are appended to the end of an existing break group (if you are using break groups). So, if you do not reorder the records using a Group Prioritization operation after the Candidate Selection operation, records from the original source will always be the driver records in the break groups. If you are using candidate selection on a Suppress source, you will need to reorder the records so that the records from the Suppress source are the drivers.

To set up candidate selection

If you are using candidate selection for a real-time job, be sure to deselect the Split records into break groups option in the Break Group operation of the Match transform.

To speed processing in a real-time match job, use the Candidate Selection operation (Group forming option group) in the Match transform to append records from a relational database to an existing data collection before matching. When the records are appended, they are not logically grouped in any way. They are simply appended to the end of the data collection on a record-by-record basis until the collection reaches the specified size.
1. In the Candidate Selection operation, select a valid datastore from the Datastore drop-down list.
2. In the Cache type drop-down list, choose from the following values:
   • No_Cache
   • Pre-load Cache: Captures data at a point in time. The data doesn't change until the job restarts. Use this option for static data.
3. Depending on how your input data and secondary table are structured, do one of the following:
   • Select Auto-generate SQL. Then select the Use break column from database option, and choose a column from the Break key field drop-down list.
     Note: If you choose the Auto-generate SQL option, we recommend that you have a break key column in your secondary table and select the Use break column from database option. If you don't, the SQL that is created could be incorrect.

   • Select Create custom SQL, and either click the Launch SQL Editor button or type your SQL in the SQL edit box.
4. In the Column mapping table, add as many rows as you want. Each row is a field that will be added to the collection.
   a. Choose a field in the Mapped name column.
   b. Choose a column from your secondary table (or from a custom query) in the Column name option that contains the same type of data as specified in the Mapped name column.
   If you have already defined your break keys in the Break Group option group, the fields used to create the break key are posted here, with the Break Group column set to YES.
5. If you want to track your records from the input source, select Use constant source value.
6. Enter a value that represents your source in the Physical source value option, and then choose a field that holds this value in the Physical source field drop-down list.

Writing custom SQL

Use placeholders

To avoid complicated SQL statements, you should use placeholders (which are replaced with real input data) in your WHERE clause.

For example, let's say the customer database contains a field called MatchKey, and the record that goes through the cleansing process in Data Services gets a field generated called MATCH_KEY. This field has a placeholder of [MATCHKEY]. The records that are selected from the customer database and appended to the existing data collection are those that contain the same value in MatchKey as in the transaction's MATCH_KEY. For this example, let's say the actual value is a 10-digit phone number.

The following is an example of what your SQL would look like with an actual phone number instead of the [MATCHKEY] placeholder.

SELECT ContactGivenName1, ContactGivenName2, ContactFamilyName, Address1, Address2,
       City, Region, Postcode, Country, AddrStreet, AddrStreetNumber, AddrUnitNumber
FROM TblCustomer

WHERE MatchKey = '123-555-9876';

Caution: You must make sure that the SQL statement is optimized for best performance and will generate valid results. The Candidate Selection operation does not do this for you.

Replace actual values with placeholders

After testing the SQL with actual values, you must replace the actual values with placeholders ([MATCHKEY], for example). Your SQL should now look similar to the following:

SELECT ContactGivenName1, ContactGivenName2, ContactFamilyName, Address1, Address2,
       City, Region, Postcode, Country, AddrStreet, AddrStreetNumber, AddrUnitNumber
FROM TblCustomer
WHERE MatchKey = [MATCHKEY];

Note: Placeholders cannot be used for list values, for example in an IN clause:

WHERE status IN ([status])

If [status] is a list of values, this SQL statement will fail.

Compare tables

Compare tables are sets of rules that define which records to compare; they are sort of an additional way to create break groups. You use your logical source values to determine which records are compared or are not compared.

By using compare tables, you can compare records within sources, compare records across sources, or a combination of both.

To set up a compare table

Be sure to include a field that contains a logical source value before you add a Compare table operation to the Match transform (in the Match level option group).

Here is an example of how to set up your compare table. Suppose you have two IDs (A and B), and you only want to compare across sources, not within the sources.
1. Set the Default action option to No_Match, and type None in the Default logical source value option. This tells the Match transform to not compare everything, but to follow the comparison rules set by the table entries.
   Note: Use care when choosing logical source names. Typing "None" in the Default logical source value option will not work if you have a source ID called "None."
2. In the Compare actions table, add a row, and then set the Driver value to A, and set the Passenger value to B.
3. Set Action to Compare.
   Note: Account for all logical source values. The example values entered above assume that A will always be the driver ID. If you expect that a driver record has a value other than A, set up a table entry to account for that value and the passenger ID value. Sometimes data in collections can be ordered (or not ordered, as the case may be) differently than your compare table is expecting. This can cause the matching process to miss duplicate records. Remember that the driver record is the first record read in a collection.

If you leave the Driver value or Passenger value options blank in the compare table, it will mean that you want to compare all sources. So a Driver value of A and a blank Passenger value with an action of Compare will make a record from A compare against all other passenger records. A sketch of this logic appears below.
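The comparison rules are easy to express as a small decision function. The following is a minimal sketch of the A/B setup above, including the reverse (B, A) row discussed next; the rule format and blank-as-wildcard behavior mirror the description here, not an actual product API.

    # (driver, passenger, action); "" acts as a wildcard, as described above.
    COMPARE_TABLE = [("A", "B", "Compare"), ("B", "A", "Compare")]
    DEFAULT_ACTION = "No_Match"

    def action_for(driver_source: str, passenger_source: str) -> str:
        """Return the compare-table action for a driver/passenger source pair."""
        for driver, passenger, action in COMPARE_TABLE:
            if (driver in ("", driver_source)) and (passenger in ("", passenger_source)):
                return action
        return DEFAULT_ACTION

    print(action_for("A", "B"))  # Compare: across sources
    print(action_for("B", "A"))  # Compare: reverse row catches reordered collections
    print(action_for("A", "A"))  # No_Match: never compare within a source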

In the example, the way you set up your Compare actions table row means that you are expecting that the driver record should have a driver value of A. But if the driver record comes in with a value of B, and the passenger comes in with a value of A, it won't be compared. To account for situations where a driver record might have a value of B and the passenger a value of A, include another row in the table that does the opposite. This will make sure that any record with a value of A or B is compared, no matter which is the Driver or Passenger.

Note: In general, if you use a suppress source, you should compare within the other sources. This ensures that all of the matches of those sources are suppressed when any are found to duplicate a record on the suppress source, regardless of which record is the driver record.

Order and prioritize records

You may have data sources, such as your own data warehouse, that you might trust more than records from another source, such as a rented source, for example. You may also prefer newer records over older records, or more complete records over those with blank fields. Whatever your preference, the way to express this preference in the matching process is by using priorities.

There are other times where you might want to ensure that your records move to a given operation, such as matching or best record, in a particular order. For example, you might want your match groups to be ordered so that the first record in is the newest record of the group. In this case, you would want to order your records based on a date field.

Whatever the reason, there are two ways to order your records, either before or after the comparison process:
• Sorting records in break groups or match groups using a value in a field
• Using penalty scores. These can be defined per field, per record, or based on input source membership.

Match editor

You can define your priorities and order your records in the Group Prioritization operation, available in Group Forming and in the Post-match processing operations of each match level in the Match editor.

Types of priorities

There are a couple of different types of priorities to consider:

Priority: Brief description
• Record priority: Prefers records from one input source over another.
• Blank penalty: Assigns a lower priority to records in which a particular field is blank.

Pre-match ordering

When you create break groups, you can set up your Group Forming > Group Prioritization operation to order (or sort) on a field, besides ordering on the break key. This will ensure that the highest priority record is the first record (driver) in the break group. You will also want Suppress-type input sources to be the driver records in a break group.

Post-match ordering

After the Match transform has created all of the match groups, and if order is important, you can use a Group Prioritization operation before a Group Statistics, Best Record, or Unique ID operation to ensure that the master record is the first in the match group.

Tip: If you are not using a blank penalty, order may not be as important to you, and you may not want to include a Group Prioritization operation before your post-match operations. However, you may get better performance out of a Best Record operation by prioritizing records and then setting the Post only once per destination option to Yes.

Blank penalty

Given two records, you may prefer to keep the record that contains the most complete data. You can use blank penalty to penalize records that contain blank fields. Incorporating a blank penalty is appropriate if you feel that a blank field shouldn't disqualify one record from matching another, and you want to keep the more complete record. For example, suppose you are willing to accept a record as a match even if the Prename, Given_Name1, Given_Name2, Primary_Postfix and/or Secondary Number fields are blank. Even though you accept these records into your match groups, you can assign them a lower priority for each blank field.

To order records by sorting on a field

Be sure you have mapped the input fields into the Match transform that you want to order on, or they won't show up in the field drop-down list. Use this method of ordering your records if you do not consider completeness of data important.
1. Enter a Prioritization name, and select the Priority Order tab.
2. In the Priority fields table, choose a field from the drop-down list in the Input Fields column.
3. In the Field Order column, choose Ascending or Descending to specify the type of ordering. For example, if you are comparing a Normal source to a Suppress source and you are using a source ID field to order your records, you will want to ensure that records from the Suppress source are first in the break group.
4. Repeat step 2 for each row you added.
5. Order your rows in the Priority fields table by using the Move Up and Move Down buttons. The first row will be the primary order, and the rest will be secondary orders.

Penalty scoring system

The blank penalty is a penalty-scoring system. For each blank field, you can assess a penalty of any non-negative integer. You can assess the same penalty for each blank field, or assess a higher penalty for fields you consider more important.

For example, if you were targeting a mailing to college students, who primarily live in apartments or dormitories, you might assess a higher penalty for a blank Given_Name1 or apartment number.

Field | Blank penalty
Prename | 5
Given_Name1 | 20
Given_Name2 | 5
Primary Postfix | 5
Secondary Number | 20

As a result, the records below would be ranked in the order shown (assume they are from the same source, so record priority is not a factor). Even though the first record has blank prename, Given_Name2, and street postfix fields, we want it as the master record because it does contain the data we consider more important: Given_Name1 and Secondary Number.

Prename (5) | Given Name1 (20) | Given Name2 (5) | Family Name | Prim Range | Prim Name | Prim Postfix (5) | Sec Number (20) | Blank field penalty
(blank) | Maria | (blank) | Ramirez | 100 | Main | (blank) | 6 | 5 + 5 + 5 = 15
Ms. | Maria | A | Ramirez | 100 | Main | St | (blank) | 20
Ms. | (blank) | (blank) | Ramirez | 100 | Main | St | 6 | 20 + 5 = 25

Blank penalty interacts with record priority

The record priority and blank penalty scores are added together and considered as one score.

Is source membership more important, even if some fields are blank? Or is it more important to have as complete a record as possible, even if it is not from the house database? Most want their house records to have priority, and would not want blank fields to override that priority. For example, suppose you want records from your house database to have high priority, but you also want records with blank fields to have low priority. To make this happen, set a high penalty for membership in a rented source, and lower penalties for blank fields:

Source | Record priority (penalty points)
House Source | 100
Rented Source A | 200
Rented Source B | 300
Rented Source C | 400

Field | Blank penalty
Given_Name1 | 20
Given_Name2 | 5
Primary Postfix | 5
Secondary Number | 20

With this scoring system, a record from the house source always receives priority over a record from a rented source, even if the house record has blank fields. For example, suppose the records below were in the same match group. Even though the house record contains five blank fields, it receives only 155 penalty points (100 + 5 + 20 + 5 + 5 + 20), while the record from source A receives 200 penalty points. The house record, therefore, has the lower penalty and the higher priority.

Source | Given Name1 | Given Name2 | Family Name | Prim Range | Prim Name | Sec Num | Postcode | Rec priority | Blank penalty | Total
House Source | (blank) | (blank) | Smith | 100 | Bren | (blank) | 55343 | 100 | 55 | 155
Source A | Rita | A | Smith | 100 | Bren | 12A | 55343 | 200 | 0 | 200
Source B | Rita | (blank) | Smith | 100 | Bren | 12 | 55343 | 300 | 10 | 310

You can manipulate the scores to set priority exactly as you'd like. In the example above, suppose you prefer a rented record containing first-name data over a house record without first-name data. You could set the first-name blank penalty score to 500 so that a blank first-name field would weigh more heavily than any source membership.
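The arithmetic behind these totals is simple addition. The following Python fragment is an illustrative sketch only (the record structure and helper function are hypothetical), using the penalty values from the example above:

# Illustrative sketch of the penalty-score arithmetic shown above.
blank_penalties = {'Prename': 5, 'Given_Name1': 20, 'Given_Name2': 5,
                   'Primary_Postfix': 5, 'Secondary_Number': 20}

def total_penalty(record, record_priority):
    # Total score = record priority + a penalty for every blank field.
    penalty = record_priority
    for field, field_penalty in blank_penalties.items():
        if not record.get(field, '').strip():     # field is blank
            penalty += field_penalty
    return penalty

# A house record (priority 100) with all five fields blank scores
# 100 + 55 = 155, which still outranks a complete rented record from
# Source A (priority 200), because the lower total wins priority.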

To define priority and penalty using field values

Be sure to map in any input fields that carry priority or blank penalty values. This task tells Match which fields hold your record priority and blank penalty values for your records, and whether to apply these per record.

1. Add a Group Prioritization operation to the Group Forming or Post Match Processing section in the Match Editor.
2. Enter a Prioritization name (if necessary) and select the Record Completeness tab.
3. Select the Order records based on completeness of data option.
4. Select the Define priority and penalty fields option.
   • Define only field penalties: This option allows you to select a default record priority and blank penalties per field to generate your priority score.
   • Define priority and penalty based on input source: This allows you to define priority and blank penalty based on membership in an input source.
5. Choose a field that contains the record priority value from the Record priority field option.
6. In the Default record priority option, enter a default record priority to use if a record priority field is blank or if you do not specify a record priority field.
7. Choose a Default apply blank penalty value (Yes or No). This determines whether the Match transform will apply blank penalty to a record if you didn't choose an apply blank penalty field or if the field is blank for a particular record.
8. In the Apply blank penalty field option, choose a field that contains the Y or N indicator for whether to apply a blank penalty to a record.
9. In the Blank penalty score table, choose a field from the Input Field column to which you want to assign blank penalty values.
10. In the Blank Penalty column, type a blank penalty value to attribute to any record containing a blank in the field you indicated in the Input Field column.

To define penalty values by field

This task lets you define your default priority score for every record and blank penalties per field to generate your penalty score.

1. Add a Group Prioritization operation to the Group Forming or Post Match Processing section in the Match Editor.
2. Enter a Prioritization name (if necessary) and select the Record Completeness tab.
3. Select the Order records based on completeness of data option.
4. Select the Define only field penalties option.

5. In the Default record priority option, enter a default record priority that will be used in the penalty score for every record.
6. Choose a Default apply blank penalty value (Yes or No). This determines whether the Match transform will apply blank penalty to a record if you didn't choose an apply blank penalty field or if the field is blank for a particular record.
7. In the Blank penalty score table, choose a field from the Input Field column to which you want to assign blank penalty values.
8. In the Blank Penalty column, type a blank penalty value to attribute to any record containing a blank in the field you indicated in the Input Field column.

Prioritize records based on source membership

However you prefer to prioritize your sources (by sorting a break group or by using penalty scores), you will want to ensure that your suppress-type source records are the drivers in the break group and comparison process.

For example, suppose you are a charitable foundation mailing a solicitation to your current donors and to names from two rented sources. If a name appears on your house source and a rented source, you prefer to use the name from your house source.

For one of the rented sources, Source B, suppose also that you can negotiate a rebate for any records you do not use. You want to use as few records as possible from Source B so that you can get the largest possible rebate. Therefore, you want records from Source B to have the lowest preference, or priority, from among the three sources.

Source | Priority
House source | Highest
Rented source A | Medium
Rented source B | Lowest

Suppress-type sources and record completeness

In cases where you want to use penalty scores, you will want your Suppress-type sources to have a low priority score. This makes it likely that normal records that match a suppress record will be subordinate matches in a match group, and will therefore be suppressed.

Within each match group, any record with a lower priority than a suppression source record is considered a suppress match, and thus suppressed, as well.

For example, suppose you are running your files against the DMA Mail Preference File (a list of people who do not want to receive advertising mailings). You would identify the DMA source as a suppression source and assign a priority of zero.

Source | Priority
DMA (Suppression source) | 0
House source | 100
Rented source A | 200
Rented source B | 300

Suppose Match found four matching records among the input records. Based on their priority, Match would rank the records as shown.

Matching record (name fields only) | Source | Priority
Maria Ramirez | House | 100
Ms. Maria Ramirez | Source B | 300
Maria A Ramirez | Source A | 200
A Ramirez | DMA | 0

The following match group would be established. As a result, the record from the suppression file (the DMA source) would be the master record, and the others would be subordinate suppress matches.

Source | Priority
DMA | 0 (Master record)
House | 100
Source A | 200
Source B | 300

To define penalties based on source membership

In this task, you can attribute priority scores and blank penalties to an input source, and thus apply these scores to any record belonging to that source. Just be sure you have your input sources defined before you attempt to complete this task. Remember that the lower the score, the higher the priority. For example, you would want to assign a very low score (such as 0) to a suppress-type source.

1. Add a Group Prioritization operation to the Group Forming or Post Match Processing section in the Match Editor.
2. Enter a Prioritization name (if necessary) and select the Record Completeness tab.
3. Select the Order records based on completeness of data option.
4. Select the Define priority and penalty based on input source option.
5. In the Source Attributes table, select a source from the drop-down list.
6. Type a value in the Priority column to assign a record priority to that source.
7. In the Apply Blank Penalty column, choose a Yes or No value to determine whether to use blank penalty on records from that source.
8. In the Default record priority option, enter a default record priority that will be used in the penalty score for every record that is not a member of a source.
9. Choose a Default apply blank penalty value (Yes or No). This determines whether to apply blank penalties to a record that is not a member of a source.
10. In the Blank penalty score table, choose a field from the Input Field column to which you want to assign blank penalty values.

11. In the Blank Penalty column, type a blank penalty value to attribute to any record containing a blank in the field you indicated in the Input Field column.

Data Salvage

Data salvaging temporarily copies data from a passenger record to the driver record after comparing the two records. The data that's copied is data that is found in the passenger record but is missing or incomplete in the driver record. Data salvaging prevents blank matching or initials matching from matching records that you may not want to match.

For example, we have the following match group. If you did not enable data salvaging, the records in the first table would all belong to the same match group, because the driver record, which contains a blank Name field, matches both of the other records.

Record | Name | Address | Postcode
1 (driver) | (blank) | 123 Main St. | 54601
2 | John Smith | 123 Main St. | 54601
3 | Jack Hill | 123 Main St. | 54601

If you enabled data salvaging, the software would temporarily copy John Smith from the second record into the driver record. The result: Record #1 matches Record #2, but Record #1 does not match Record #3 (because John Smith doesn't match Jack Hill).

Record | Name | Address | Postcode
1 (driver) | John Smith (copied from record below) | 123 Main St. | 54601
2 | John Smith | 123 Main St. | 54601
3 | Jack Hill | 123 Main St. | 54601
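A minimal sketch of this salvage behavior, assuming hypothetical dict-based records (this is not the transform's internal code):

# Illustrative sketch: temporarily copy data from a passenger record
# into blank fields of the driver record before comparing.
def salvage(driver, passenger, fields=('name',)):
    borrowed = dict(driver)                      # temporary copy; source data unchanged
    for field in fields:
        if not borrowed.get(field, '').strip():  # only fill fields that are blank
            borrowed[field] = passenger.get(field, '')
    return borrowed

driver   = {'name': '',           'address': '123 Main St.', 'postcode': '54601'}
record_2 = {'name': 'John Smith', 'address': '123 Main St.', 'postcode': '54601'}
record_3 = {'name': 'Jack Hill',  'address': '123 Main St.', 'postcode': '54601'}

driver = salvage(driver, record_2)   # driver name is now "John Smith" (temporarily)
# Record #1 now matches record #2 but no longer matches record #3,
# because "John Smith" does not match "Jack Hill".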

The following example shows how this is used for a suppression source. Assume that the suppression source is a list of no-pandering addresses. In that case, you would set the suppression source to have the highest priority, and you would not enable data salvaging. That way, the software suppresses all records that match the suppression source records. For example, a suppress record of 123 Main St would match 123 Main St #2 and 123 Main St Apt C; both of these would be suppressed.

Data salvaging and initials

When a driver record's name field contains an initial, instead of a full name, the software may temporarily borrow the full name if it finds one in the corresponding field of a matching record. This is one form of data salvaging. By retaining more information for the driver record, data salvaging helps improve the quality of your matching results.

Note: Initials salvaging only occurs with the given name and family name fields.

For illustration, assume that the following three records represent potentially matching records (for example, the software has grouped these as members of a break group, based on address and ZIP Code data).

Record | First name | Last name | Address | Notes
357 | J | L | 123 Main | Driver
391 | Juanita | Lopez | 123 Main |
839 | Joanne | London | 123 Main | Lowest ranking record

The first match comparison will be between the driver record (357) and the next highest ranking record (391). These two records will be called a match. Juanita and Lopez are temporarily copied to the name fields of record 357.

The next comparison will be between record 357 and the next lower ranking record (839). With data salvaging, the driver record's name data is now Juanita Lopez (as "borrowed" from the first comparison). Therefore, record 839 will probably be considered not to match record 357.

Initials and suppress-type records

However, if the driver record is a suppress-type record, you may prefer to turn off data salvaging, to retain your best chance of identifying all the records that match the initialized suppression data. For example, if you want to suppress names with the initials JL (as in the case above), you would want to find all matches to JL regardless of the order in which the records are encountered in the break group.

If you have turned off data salvaging for the records of this suppression source, here is what happens during those same two match comparisons:

Record | First name | Last name | Address | Notes
357 | J | L | 123 Main | Driver
391 | Juanita | Lopez | 123 Main |
839 | Joanne | London | 123 Main | Lowest ranking record

The first match comparison will be between the driver record (357) and the next highest ranking record (391). These two records will be called a match, since the driver record's JL and Juanita Lopez will be called a match. The next comparison will be between the driver record (357) and the next lower ranking record (839). This time these two records will also be called a match, since the driver record's JL will match Joanne London.

Since both records 391 and 839 matched the suppress-type driver record, they are both designated as suppress matches, and, therefore, neither will be included in your output.

To control data salvaging using a field

You can use a field to control whether data salvage is enabled. If the field's value is Y for a record, data salvaging is enabled. Be sure to map the field into the Match transform that you want to use beforehand.

1. Open the Match Editor for a Match transform.
2. In the Transform Options window, click the Data Salvage tab.
3. Select the Enable data salvage option, and choose a default value for those records.

The default value will be used in the cases where the field you choose is not populated for a particular record.

4. Select the Specify data salvage by field option, and choose a field from the drop-down menu.

To control data salvaging by source

You can use membership in an input source to control whether data salvage is enabled or disabled for a particular record. Be sure to create your input sources beforehand.

1. Open the Match Editor for a Match transform.
2. In the Transform Options window, click the Data Salvage tab.
3. Select the Enable data salvage option, and choose a default value for those records. The default value will be used if a record's input source is not specified in the following steps.
4. Select the Specify data salvage by source option.
5. In the table, choose a Source and then a Perform Data Salvage value for each source you want to use.

Match criteria

Overview of match criteria

Use match criteria in each match level to determine the threshold scores for matching and to define how to treat various types of data, such as numeric, blank, name data, and so on (your business rules). You can do all of this in the Criteria option group of the Match Editor.

To the Match transform, match criteria represent the fields you want to compare. For example, if you wanted to match on the first ten characters of a given name and the first fifteen characters of the family name, you must create two criteria that specify these requirements.

Criteria provide a way to let the Match transform know what kind of data is in the input field and, therefore, what types of operations to perform on that data.

Pre-defined vs. custom criteria

There are two types of criteria:

• Pre-defined criteria are available for fields that are typically used for matching, such as name, address, and other data. By assigning a criteria to a field, the Match transform is able to identify what type of data is in the field, and allow it to perform internal operations to optimize the data for matching, without altering the actual input data. Data Cleanse custom (user-defined, non party-data) output fields are available as pre-defined criteria. Map the custom output fields from Data Cleanse and the custom fields appear in the Match Editor's Criteria Fields tab.
• Any other types of data (such as part numbers or other proprietary data), for which a pre-defined criteria does not exist, should be designated as a custom criteria. Certain functions can be performed on custom keys, such as abbreviation, substring, and numeric matching, but the Match transform cannot perform some cross-field comparisons such as some name matching functions.

Match criteria pre-comparison options

The majority of your data standardization should take place in the address cleansing and Data Cleanse transforms. However, the Match transform can perform some preprocessing per criteria (and for matching purposes only; your actual data is not affected) to provide more accurate matches. The options to control this standardization are located in the Options and Multi Field Comparisons tabs of the Match editor. They include:

• Convert diacritical characters
• Convert text to numbers
• Convert to uppercase
• Remove punctuation
• Locale

For more information about these options, see the Match transform section of the Data Services Reference Guide.
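To make the pre-comparison idea concrete, here is an illustrative Python sketch of three of these standardizations (convert diacritical characters, convert to uppercase, remove punctuation). It is not the transform's implementation; as in the Match transform, the normalized value would be used for comparison only, leaving the input data unchanged.

# Illustrative sketch of pre-comparison standardization.
import string
import unicodedata

def pre_compare(value):
    # Convert diacritical characters (e.g. "Münster" -> "Munster")
    value = unicodedata.normalize('NFKD', value)
    value = value.encode('ascii', 'ignore').decode('ascii')
    # Convert to uppercase
    value = value.upper()
    # Remove punctuation
    return value.translate(str.maketrans('', '', string.punctuation))

pre_compare('Müller-Smith & Co.')    # 'MULLERSMITH  CO'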

To add and order a match criteria

You can add as many criteria as you want to each match level in your Match transform.

1. Select the appropriate match level or Match Criteria option group in the Option Explorer of the Match Editor, and right-click.
2. Choose Criteria.
3. Enter a name for your criteria in the Criteria name box. You can keep the default name for pre-defined criteria, but you should enter a meaningful criteria name if you chose a Custom criteria.
4. On the Criteria Fields tab, in the Available criteria list, choose the criteria that best represents the data that you want to match on. If you don't find what you are looking for, choose the Custom criteria.
5. In the Criteria field mapping table, choose an input field mapped name that contains the data you want to match on for this criteria.
6. Click the Options tab.
7. Configure the Pre-comparison options and Comparison rules. Be sure to set the Match score and No match score, because these are required.
8. If you want to enable multiple field (cross-field) comparison, click the Multiple Fields Comparisons tab, and select the Compare multiple fields option.
   a. Choose the type of multiple field comparison to perform:
      • All selected fields in other records: Compare each field to all fields selected in the table in all records.
      • The same field in other records: Compare each field only to the same field in all records.
   b. In the Additional fields to compare table, choose input fields that contain the data you want to include in the multiple field comparison for this criteria.
      Tip: You can use custom match criteria field names for multiple field comparison by typing in the Custom name column.
   Note: If you enable multiple field comparison, any appropriate match standard fields are removed from the Criteria field mapping table on the Criteria Fields tab. If you want to include them in the match process, add them in the Additional fields to compare table.

9. Configure the Pre-comparison options for multiple field comparison.
10. To order your criteria in the Options Explorer of the Match Editor (or the Match Table), select a criteria and click the Move Up or Move Down buttons as necessary.

Matching methods

There are a number of ways to set up and order your criteria to get the matching results you want. Each of these ways has advantages and disadvantages, so consider them carefully.

Match method | Description
Rule-based | Allows you to control which criteria determines a match. This method is easy to set up.
Weighted-scoring | Allows you to assign importance, or weight, to any criteria. However, weighted-scoring evaluates every rule before determining a match, which might cause an increase in processing time.
Combination method | Same relative advantages and disadvantages as the other two methods.

Related Topics
• Similarity score
• Rule-based method
• Weighted-scoring method
• Combination method

Similarity score

The similarity score is the percentage that your data is alike. This score is calculated internally by the application when records are compared. Whether the application considers the records a match depends on the Match and No match scores you define in the Criteria option group (as well as other factors, but for now let's focus on these scores).

Example: This is an example of how similarity scores are determined. Here are some things to note:

• The comparison table below is intended to serve as an example.
• Only the first comparison is considered a match, because the similarity score met or exceeded the match score. The last comparison is considered a no-match because the similarity score was less than the no-match score.

Comparison | No match | Match | Similarity score | Matching?
Smith > Smith | 72 | 95 | 100% | Yes
Smith > Smitt | 72 | 95 | 80% | Depends on other criteria
Smith > Smythe | 72 | 95 | 72% | No
Smith > Jones | 72 | 95 | 20% | No

Rule-based method

With rule-based matching, you rely only on your match and no-match scores to determine matches within a criteria. When a single criteria cannot determine a match, the process moves to the next criteria.

smith@sap. By setting the Match score and No match score options for the E-mail criteria with no gap.com 74 79 101 80 80 91 By entering a value of 101 in the match score for every criteria except the last. because two fields cannot be more than 100 percent alike. the application determines that the records do not match. you should have the criteria that is most likely to make the match or no-match decisions first in your order of criteria. That is. Criteria Given Name1 Family Name E-mail Record A Record B No match 82 Match Similarity score 100 Mary Mary 101 Smith msmith@sap. if any criteria fails to meet the specified match score. A match score of 101 ensures that the criteria does not cause the records to be a match. the application gives all of the criteria the same amount of importance (or weight). although they can determine a no match. Remember: Order is important! For performance reasons. This can help reduce the number of criteria comparisons. Weighted-scoring method Weighted scoring method In a rule-based matching method. any comparison that reaches the last criteria must either be a match or a no match.Data Quality Match 18 Example: This example shows how to set up this method in the Match transform. SAP BusinessObjects Data Services Designer Guide 655 . the Given Name1 and Family Name criteria never determine a match.com Smitt mary.

Weighted-scoring method

In a rule-based matching method, the application gives all of the criteria the same amount of importance (or weight). That is, if any criteria fails to meet the specified match score, the application determines that the records do not match. This is not how the matching process works in the weighted scoring method.

When you use the weighted scoring method, you are relying on the total contribution score for determining matches, as opposed to using match and no-match scores on their own.

Contribution values

Contribution values are your way of assigning weight to individual criteria. The higher the value, the more weight that criteria carries in determining matches. In general, criteria that might carry more weight than others include account numbers, Social Security numbers, customer numbers, Postcode1, and addresses.

Note: All contribution values for all criteria that have them must total 100. You do not need to have a contribution value for all of your criteria.

You can define a criteria's contribution value in the Contribution to weighted score option in the Criteria option group.

Contribution and total contribution score

The Match transform generates the contribution score for each criteria by multiplying the contribution value you assign with the similarity score (the percentage alike). These individual contribution scores are then added to get the total contribution score.

Weighted match score

In the weighted scoring method, matches are determined only by comparing the total contribution score with the weighted match score. If the total contribution score is equal to or greater than the weighted match score, the records are considered a match. If the total weighted score is less than the weighted match score, the records are considered a no-match.

You can set the weighted match score in the Weighted match score option of the Level option group.

Example: The following table is an example of how to set up weighted scoring. Notice the various types of scores that we have discussed. Also notice the following:

• When setting up weighted scoring, the No match score option must be set to -1, and the Match score option must be set to 101. These values ensure that neither a match nor a no-match can be found by using these scores.
• We have assigned a contribution value to the E-mail criteria that gives it the most importance.

Criteria | Record A | Record B | No match | Match | Similarity score | Contribution value | Contribution score (similarity X contribution value)
First Name | Mary | Mary | -1 | 101 | 100 | 25 | 25
Last Name | Smith | Smitt | -1 | 101 | 80 | 25 | 20
E-mail | ms@sap.com | msmith@sap.com | -1 | 101 | 84 | 50 | 42
Total contribution score: 87

If the weighted match score is 87, then any comparison whose total contribution score is 87 or greater is considered a match. In this example, the comparison is a match because the total contribution score is 87.
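The contribution arithmetic can be sketched as follows (illustrative Python only, with the values from the example above):

# Illustrative sketch of the weighted scoring arithmetic:
# contribution score = similarity score x (contribution value / 100)
comparisons = [          # (criteria, similarity, contribution value)
    ('First Name', 100, 25),
    ('Last Name',   80, 25),
    ('E-mail',      84, 50),     # contribution values must total 100
]

total = sum(sim * contribution / 100.0 for _, sim, contribution in comparisons)
# 25 + 20 + 42 = 87

weighted_match_score = 87
is_match = total >= weighted_match_score    # True: 87 >= 87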

Combination method

This method combines the rule-based and weighted scoring methods of matching.

Criteria | Record A | Record B | No match | Match | Sim score | Contribution value | Contribution score (actual similarity X contribution value)
First Name | Mary | Mary | 59 | 101 | 100 | 25 | 25
Last Name | Smith | Hope | 59 | 101 | 22 | N/A (No Match) | N/A
E-mail | ms@sap.com | msmith@sap.com | 49 | 101 | N/A | N/A | N/A
Total contribution score: N/A

Matching business rules

An important part of the matching process is determining how you want to handle various forms of and differences in your data. For example, if every field in a record matched another record's fields, except that one field was blank and the other record's field was not, would you want these records to be considered matches? Figuring out what you want to do in these situations is part of defining your business rules. Match criteria are where you define most of your business rules, while some name-based options are set in the Match Level option group.

Matching on strings, abbreviations, and initials

Initials and acronyms

Use the Initials adjustment score option to allow matching initials to whole words. For example, "International Health Providers" can be matched to "IHP".

Abbreviations

Use the Abbreviation adjustment score option to allow matching whole words to abbreviations. For example, "International Health Providers" can be matched to "Intl Health Providers".

String data

Use the Substring adjustment score option to allow matching longer strings to shorter strings. For example, the string "Mayfield Painting and Sand Blasting" can match "Mayfield painting".

Extended abbreviation matching

Extended abbreviation matching offers functionality that handles situations not covered by the Initials adjustment score, Substring adjustment score, and Abbreviation adjustment score options. For example, you might encounter the following situations:

• Suppose you have localities in your data such as La Crosse and New York. However, you also have these same localities listed as LaCrosse and NewYork (without spaces). Under normal matching, you cannot designate these (La Crosse/LaCrosse and New York/NewYork) as matching 100%; the spaces prevent this. (These would normally be 94 and 93 percent matching.)
• Suppose you have Metropolitan Life and MetLife (an abbreviation and combination of Metropolitan Life) in your data. The Abbreviation adjustment score option cannot detect the combination of the two words.

If you are concerned about either of these cases in your data, you should use the Ext abbreviation adjustment score option.

How the adjustment score works

The score you set in the Ext abbreviation adjustment score option tunes your similarity score to consider these types of abbreviations and combinations in your data. The adjustment score adds a penalty for the non-matched part of the words. The higher the number, the greater the penalty. A score of 100 means no penalty and a score of 0 means maximum penalty.

Example:

String 1 | String 2 | Sim score when Adj score is 0 | Sim score when Adj score is 50 | Sim score when Adj score is 100 | Notes
MetLife | Metropolitan Life | 58 | 79 | 100 |
MetLife | Met Life | 93 | 96 | 100 |
MetLife | MetropolitanLife | 60 | 60 | 60 | This score is due to string comparison. Extended Abbreviation scoring was not needed or used because both strings being compared are each one word.

Name matching

Part of creating your business rules is to define how you want names handled in the matching process. The Match transform gives you many ways to ensure that variations on names or multiple names, for example, are taken into consideration.

Note: Unlike other business rules, these options are set up in the match level option group, because they affect all appropriate name-based match criteria.

Two names, two persons

With the Number of names that must match option, you can control how matching is performed on match keys with more than one name (for example, comparing "John and Mary Smith" to "Dave and Mary Smith"). Choose whether only one name needs to match for the records to be identified as a match, or whether the Match transform should disregard any persons other than the first name it parses.

With this method you can require either one or both persons to match for the record to match.

Two names, one person

With the Compare Given_Name1 to Given_Name2 option, you can also compare a record's Given_Name1 data (first name) with the second record's Given_Name2 data (middle name). With this option, the Match transform can correctly identify matching records such as the two partially shown below. Typically, these record pairs represent sons or daughters named for their parents, but known by their middle name.

Record # | First name | Middle name | Last name | Address
170 | Leo | Thomas | Smith | 225 Pushbutton Dr
198 | Tom | (blank) | Smith | 225 Pushbutton Dr

Hyphenated family names

With the Match on hyphenated family name option, you can control how matching is performed if a Family_Name (last name) field contains a hyphenated family name (for example, comparing "Smith-Jones" to "Jones"). You can also specify how this data must match.

Numeric data matching

Use the Numeric words match exactly option to choose whether data with a mixture of numbers and letters should match exactly. This option applies most often to address data and custom data, such as a part number. The numeric matching process is as follows:

1. The string is first broken into words. The word breaking is performed on all punctuation and spacing, and then the words are assigned a numeric attribute. A numeric word is any word that contains at least one number from 0 to 9. For example, 4L is considered a numeric word, whereas FourL is not.
2. Numeric matching is performed according to the option setting that you choose (as described below).

Option values and how they work

Option value | Description

Same_Position | This value specifies that numeric words must match exactly; however, this option differs from the Any_Position value in that the position of the word is important. For example, 608-782-5000 will match 608-782-5000, but it will not match 782-608-5000.

Any_Position | With this value, numeric words must match exactly; however, the position of the word is not important. For example:
• Street address comparison: "4932 Main St # 101" and "# 101 4932 Main St" are considered a match.
• Street address comparison: "4932 Main St # 101" and "# 102 4932 Main St" are not considered a match.
• Part description: "ACCU 1.4L 29BAR" and "ACCU 29BAR 1.4L" are considered a match.

Any_Position_Consider_Punctuation | This value performs word breaking on all punctuation and spaces except on the decimal separator (period or comma), so that decimal numbers are not broken. For example, the string 123.456 is considered a single numeric word as opposed to two numeric words. The position of the numeric word is not important; however, decimal separators do impact the matching process. For example:
• Part description: "ACCU 1.4L 29BAR" and "ACCU 29BAR 1.4L" are considered a match.
• Part description: "ACCU 1.4L" and "ACCU 1,4L" are not considered a match because there is a decimal indicator between the 1 and the 4 in both cases.
• Financial data: "25.435" and "25,435" are not considered a match.

Any_Position_Ignore_Punctuation | This value is similar to the Any_Position_Consider_Punctuation value, except that decimal separators do not impact the matching process. For example:
• Part description: "ACCU 29BAR 1.4L" and "ACCU 29BAR 1.5L 29BAR" are not considered a match.
• Part description: "ACCU 1.4L" and "ACCU 1,4L" are also considered a match even though there is a decimal indicator between the 1 and the 4.

Blank field matching

In your business rules, you can control how the Match transform treats field comparisons when one or both of the fields compared are blank. For example, the first name field is blank in the second record shown below. Would you want the Match transform to consider these records matches or no matches? What if the first name field were blank in both records?

Record #1 | John Doe | 204 Main St | La Crosse | WI | 54601
Record #2 | _____ Doe | 204 Main St | La Crosse | WI | 54601

There are some options in the Match transform that allow you to control the way these are compared. They are:

• Both fields blank operation
• Both fields blank score
• One field blank operation
• One field blank score

Blank field operations

The "operation" options have the following value choices:

Option | Description
Eval | If you choose Eval, the Match transform scores the comparison using the score you enter at the One field blank score or Both fields blank score option.
Ignore | If you choose Ignore, the score for this field rule does not contribute to the overall weighted score for the record comparison. In other words, the two records shown above could still be considered duplicates, despite the blank field.

Blank field scores

The "Score" options control how the Match transform scores field comparisons when the field is blank in one or both records. You can enter any value from 0 to 100.

To help you decide what score to enter, determine if you want the Match transform to consider a blank field 0 percent similar to a populated field or another blank field, 100 percent similar, or somewhere in between. Your answer probably depends on what field you're comparing. Giving a blank field a high score might be appropriate if you're matching on a first or middle name or a company name, for example.

Example: Here are some examples that may help you understand how your settings of these blank matching options can affect the overall scoring of records.

One field blank operation for Given_Name1 field set to Ignore

Note that when you set the blank options to Ignore, the Match transform redistributes the contribution allotted for this field to the other criteria and recalculates the contributions for the other fields.

Fields compared | Record A | Record B | % alike | Contribution | Score (per field)
Postcode | 54601 | 54601 | 100 | 20 (or 22) | 22
Address | 100 Water St | 100 Water St | 100 | 40 (or 44) | 44
Family_Name | Hamilton | Hammilton | 94 | 30 (or 33) | 31
Given_Name1 | Mary | (blank) | — | 10 (or 0) | —
Weighted score: 97

One field blank operation for Given_Name1 field set to Eval, One field blank score set to 0

Fields compared | Record A | Record B | % alike | Contribution | Score (per field)
Postcode | 54601 | 54601 | 100 | 20 | 20
Address | 100 Water St | 100 Water St | 100 | 40 | 40
Family_Name | Hamilton | Hammilton | 94 | 30 | 28
Given_Name1 | Mary | (blank) | 0 | 10 | 0
Weighted score: 88

One field blank operation for Given_Name1 field set to Eval, One field blank score set to 100

Fields compared | Record A | Record B | % alike | Contribution | Score (per field)
Postcode | 54601 | 54601 | 100 | 20 | 20
Address | 100 Water St | 100 Water St | 100 | 40 | 40
Family_Name | Hamilton | Hammilton | 94 | 30 | 28
Given_Name1 | Mary | (blank) | 100 | 10 | 10
Weighted score: 98
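The three outcomes above can be reproduced with the following illustrative Python sketch. It is not the engine's exact algorithm (per-field rounding is assumed here to mirror the tables), but it shows how Ignore redistributes the blank field's contribution while Eval scores it:

# Illustrative sketch of blank field scoring for a weighted comparison.
def weighted_score(fields, blank_operation='Eval', blank_score=0):
    scored = []
    for name, similarity, contribution, is_blank in fields:
        if is_blank:
            if blank_operation == 'Ignore':
                continue                      # field does not contribute at all
            similarity = blank_score          # Eval: score the blank comparison
        scored.append((similarity, contribution))
    # With Ignore, the lost contribution is redistributed to the other fields.
    scale = 100.0 / sum(c for _, c in scored)
    return sum(round(sim * c * scale / 100) for sim, c in scored)

fields = [('Postcode', 100, 20, False), ('Address', 100, 40, False),
          ('Family_Name', 94, 30, False), ('Given_Name1', 0, 10, True)]

weighted_score(fields, 'Ignore')        # 97
weighted_score(fields, 'Eval', 0)       # 88
weighted_score(fields, 'Eval', 100)     # 98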

Multiple field (cross-field) comparison

In most cases, you use a single field for comparison. For example, Field1 in the first record is compared with Field1 in the second record. However, there are situations where comparing multiple fields can be useful. For example, suppose you want to match telephone numbers in the Phone field against numbers found in fields used for Fax, Mobile, and Home. Multiple field comparison makes this possible.

When you enable multiple field comparison in the Multiple Field Comparison tab of a match criteria in the Match Editor, you can choose to match selected fields against either all of the selected fields in each record, or against only the same field in each record.

Note: By default, Match performs multiple field comparison on fields where match standards are used. For example, Person1_Given_Name1 is automatically compared to Person1_Given_Name_Match_Std1-6. Multiple field comparison does not need to be explicitly enabled, and no additional configuration is required to perform multiple field comparison against match standard fields.

Comparing selected fields to all selected fields in other records

When you compare each selected field to all selected fields in other records, all fields that are defined in that match criteria are compared against each other.

Remember: "Selected" fields include the criteria field and the other fields you define in the Additional fields to compare table.

• If one or more field comparisons meets the settings for Match score, the two rows being compared are considered matches.
• If one or more field comparisons exceeds the No match score, the rule will be considered to pass, and any other defined criteria/weighted scoring will be evaluated to determine if the two rows are considered matches.

Example: Example of comparing selected fields to all selected fields in other records

Your input data contains two firm fields.

Row ID | Firm1 | Firm2
1 | Firstlogic | Postalsoft
2 | SAP BusinessObjects | Firstlogic

With the Match score set to 100 and No match score set to 99, the records are found to be matches. Here is a summary of the comparison process and the results.

• First, Row 1 Firm1 (Firstlogic) is compared to Row 2 Firm1 (SAP BusinessObjects). Normally, the rows would fail this comparison, but with multi-field comparison activated, a No Match decision is not made yet.
• Next, Row 1 Firm2 is compared to Row 2 Firm2, and so on, until all other comparisons are made between all fields in all rows. Because Row 1 Firm1 (Firstlogic) and Row 2 Firm2 (Firstlogic) are 100% similar, the two records are considered matches.

Comparing selected fields to the same fields in other records

When you compare each selected field to the same field in other records, each field defined in the Multiple Field Comparison tab of a match criteria is compared only to the same field in other records. This sets up, within this criteria, what is essentially an OR condition for passing the criteria. Each field is used to determine a match: If Field_1, Field_2, or Field_3 passes the match criteria, consider the records a match. The No Match score for one field does not automatically fail the criteria when you use multi-field comparison.

Remember: "Selected" fields include the criteria field and the other fields you define in the Additional fields to compare table.
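The OR condition can be sketched as follows (illustrative Python only; the similarity function is a stand-in for the internal similarity score):

# Illustrative sketch of same-field multiple field comparison: the
# criteria passes if any one field matches the same field in the
# other record (an OR condition).
def similarity(a, b):
    return 100 if a == b else 0    # stand-in for the internal similarity score

def same_field_match(row_a, row_b, fields, match_score=100):
    for field in fields:
        if similarity(row_a[field], row_b[field]) >= match_score:
            return True            # one passing field is enough
    return False                   # a failing field does not fail the criteria

row_1 = {'phone': '608-555-1234', 'fax': '608-555-0000', 'cell': '608-555-4321'}
row_2 = {'phone': '608-555-4321', 'fax': '608-555-0000', 'cell': '608-555-1111'}

same_field_match(row_1, row_2, ['phone', 'fax', 'cell'])   # True: the fax fields agree
# Note: row_1's cell and row_2's phone are never compared in this mode.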

Example: Example of comparing selected fields to the same field in other records

Your input data contains a phone, fax, and cell phone field. If any one of these input fields' data is the same between the rows, the records are found to be matches.

Row ID | Phone | Fax | Cell
1 | 608-555-1234 | 608-555-0000 | 608-555-4321
2 | 608-555-4321 | 608-555-0000 | 608-555-1111

With a Match score of 100 and a No match score of 99, the phone and the cell phone numbers, if defined individually, would both fail the match criteria. However, because all three fields are defined in one criteria and the selected fields are being compared to the same fields in the other record, the fact that the fax number is 100% similar calls these records a match.

Note: In the example above, Row 1's cell phone and Row 2's phone would not be considered a match with the selection of the same field in other records option, because that option only compares within the same field. If this cross-comparison is needed, select the all selected fields in other records option instead.

Post-match processing

Best record

A key component in most data consolidation efforts is salvaging data from matching records—that is, members of match groups—and posting that data to a best record, or to all matching records. You can perform these functions by adding a Best Record post-match operation.

Operations happen within match groups

The functions you perform with the Best Record operation involve manipulating or moving data contained in the master records and subordinate records of match groups. Match groups are groups of records that the Match transform has found to be matching, based on the criteria you have created.

A master record is the first record in the match group. You can control which record this is by using a Group Prioritization operation before the Best Record operation. Subordinate records are all of the remaining records in a match group.

To help illustrate this use of master and subordinate records, consider the following match group:

Record | Name | Phone | Date | Group rank
#1 | John Smith | (blank) | (blank) | Master
#2 | John Smyth | 788-8700 | 11 Apr 2001 | Subordinate
#3 | John E. Smith | 788-1234 | 12 Oct 1999 | Subordinate
#4 | J. Smith | 788-3271 | 22 Feb 1997 | Subordinate

Because this is a match group, all of the records are considered matching. As you can see, each record is slightly different: some records have blank fields, some have a newer date, and all have different phone numbers.

A common operation that you can perform in this match group is to move updated data to all of the records in a match group. The most recent phone number would be a good example here. Another example might be to salvage useful data from matching records before discarding them. For example, when you run a drivers license file against your house file, you might pick up gender or date-of-birth data to add to your house record.

You can choose to move data to the master record, to all the subordinate members of the match group, or to all members of the match group.

Post higher priority records first

The operations you set up in the Best Record option group should always start with the highest priority member of the match group (the master) and work their way down to the last subordinate, one at a time. This ensures that data can be salvaged from the higher-priority record to the lower-priority record. So, be sure that your records are prioritized correctly, by adding a Group Prioritization post-match operation before your Best Record operation.

Best record strategies

We provide you with strategies that help you set up some more common best record operations quickly and easily. Best record strategies act as a criteria for taking action on other fields. If the criteria is not met, no action is taken.

Example: In our example of updating a phone field with the most recent data, we can use the Date strategy with the Newest priority to update the master record with the latest phone number in the match group. This latter part (updating the master record with the latest phone number) is the action. You can also update all of the records in the match group (master and all subordinates) or only the subordinates.

Restriction: The date strategy does not parse the date, because it does not know how the data is formatted. Be sure your data is pre-formatted as YYYYMMDD, so that string comparisons work correctly. You can also do this by setting up a custom strategy, using Python code to parse the date and use a date compare.

Custom best record strategies and Python

In the pre-defined strategies for the Best Record strategies, the Match transform auto-generates the Python code that it uses for processing. Included in this code are variables that are necessary to manage the processing. If none of these strategies fit your needs, you can create a custom best record strategy, using your own Python code.

Common variables

The common variables you see in the generated Python code are:

Variable | Description
SRC | Signifies the source field.
DST | Signifies the destination field.
RET | Specifies the return value, indicating whether the strategy passed or failed (must be either "T" or "F").

NEWDST and NEWGRP variables

Use the NEWDST and NEWGRP variables to allow the posting of data in your best-record action to be independent of the strategy fields. If you do not include these variables, the strategy field data must also be updated.

Variable | Description
NEWDST | New destination indicator. This string variable will have a value of "T" when the destination record is new or different than the last time the strategy was evaluated, and a value of "F" when the destination record has not changed since last time. The NEWDST variable is only useful if you are posting to multiple destinations, such as ALL or SUBS in the Posting destination option.
NEWGRP | New group indicator. This string variable will have a value of "T" when the match group is different than the last time the strategy was evaluated, and a value of "F" when the match group has not changed since last time.

NEWDST example

The following Python code was generated from a NON_BLANK strategy with options set this way:

Option | Setting
Best record strategy | NON_BLANK
Strategy priority | Priority option not available for the NON_BLANK strategy.
Strategy field | NORTH_AMERICAN_PHONE1_NORTH_AMERICAN_PHONE_STANDARDIZED
Posting destination | ALL
Post only once per destination | YES

Here is what the Python code looks like:

# Setup local temp variable to store updated compare condition
dct = locals()

# Store source and destination values to temporary variables
# Reset the temporary variable when the destination changes
if (dct.has_key('BEST_RECORD_TEMP') and NEWDST.GetBuffer() == u'F'):
    DESTINATION = dct['BEST_RECORD_TEMP']
else:
    DESTINATION = DST.GetField(u'NORTH_AMERICAN_PHONE1_NORTH_AMERICAN_PHONE_STANDARDIZED')

SOURCE = SRC.GetField(u'NORTH_AMERICAN_PHONE1_NORTH_AMERICAN_PHONE_STANDARDIZED')

if len(SOURCE.strip()) > 0 and len(DESTINATION.strip()) == 0:
    RET.SetBuffer(u'T')
    dct['BEST_RECORD_TEMP'] = SOURCE
else:
    RET.SetBuffer(u'F')
    dct['BEST_RECORD_TEMP'] = DESTINATION

# Delete temporary variables
del SOURCE
del DESTINATION

Example: NEWDST and NEWGRP

Suppose you have two match groups, each with three records.

Match group | Records
Match group 1 | Record A, Record B, Record C
Match group 2 | Record D, Record E, Record F

Each new destination or match group is flagged with a "T".

Comparison | NEWGRP (T or F) | NEWDST (T or F)
Record A > Record B | T (New match group) | T (New destination "A")
A > C | F | F
B > A | F | T (New destination "B")
B > C | F | F
C > A | F | T (New destination "C")
C > B | F | F
D > E | T (New match group) | T (New destination "D")
D > F | F | F
E > D | F | T (New destination "E")

E > F | F | F
F > D | F | T (New destination "F")
F > E | F | F

To create a pre-defined best record strategy

This procedure allows you to quickly generate the criteria for your best record action. The available strategies reflect common use cases. Be sure to add a Best Record post-match operation to the appropriate match level in the Match Editor. Also, remember to map any pertinent input fields to make them available for this operation.

1. Enter a name for this Best Record operation.
2. Select a strategy from the Best record strategy option.
3. Select a priority from the Strategy priority option. The selection of values depends on the strategy you chose in the previous step.
4. Select a field from the Strategy field drop-down menu. The field you select here is the one that acts as a criteria for determining whether a best record action is taken.

Example: The strategy field you choose must contain data that matches the strategy you are creating. For example, if you are using a newest date strategy, be sure that the field you choose contains date data.

To create a custom best record strategy

1. Add a best record operation to your Match transform.
2. Enter a name for your best record operation.

3. In the Best record strategy option, choose Custom.
4. Choose a field from the Strategy field drop-down list.
5. Click the View/Edit Python button to create your custom Python code to reflect your custom strategy. The Python Editor window appears.

Best record actions

Best record actions are the functions you perform on data if a criteria of a strategy is met.

Example: Suppose you want to update phone numbers of the master record. You would only want to do this if there is a subordinate record in the match group that has a newer date, which signifies a potentially new phone number for that person. The action you set up would tell the Match transform to update the phone number field in the master record (action) if a newer date in the date field is found (strategy).

Sources and destinations

When working with the best record operation, it is important to know the differences between sources and destinations in a best record action. The source is the field from which you take data and the destination is where you post the data. A source or destination can be either a master or subordinate record in a match group.

Example: In our phone number example, the subordinate record has the newer date, so we take data from the phone field (the source) and post it to the master record (the destination).

Posting once or many times per destination

In the Best Record options, you can choose to post to a destination once or many times per action by setting the Post only once per destination option.

You may want your best record action to stop after the first time it posts data to the destination record, or you may want it to continue with the other match group records as well. Your choice depends on the nature of the data you're posting and the records you're posting to. The two examples that follow illustrate each case.

If you post only once to each destination record, then once data is posted for a particular record, the Match transform moves on to either perform the next best record action (if more than one is defined) or to the next record. If you don't limit the action in this way, all actions are performed each time the strategy returns True.

Regardless of this setting, the Match transform always works through the match group members in priority order. When posting to record #1 in the figure below, without limiting the posting to only once, here is what happens:

Match group (the action posts to Record #1): Record #1 (master), Record #2 (subordinate), Record #3 (subordinate), Record #4 (subordinate)

First, the action is attempted using, as a source, that record from among the other match group records that has the highest priority (record #2). Next, the action is attempted with the next highest priority record (record #3) as the source. Finally, the action is attempted with the lowest priority record (record #4) as the source.

The results: In the case above, record #4 was the last source for the action, and therefore could be a source of data for the output record.

However, if you set your best record action to post only once per destination record, here is what happens:

Match group (the action posts to Record #1): Record #1 (master), Record #2 (subordinate), Record #3 (subordinate), Record #4 (subordinate)

First, the action is attempted using, as a source, that record from among the other match group records that has the highest priority (record #2). If this attempt is successful, the Match transform considers this best record action to be complete and moves to the next best record action (if there is one), or to the next output record. If this attempt is not successful, the Match transform moves to the match group member with the next highest priority and attempts the posting operation.

In this case, record #2 was the source last used for the best record action, and so is the source of posted data in the output record.
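The two behaviors can be sketched as follows (illustrative Python only; the strategy and action callables are hypothetical stand-ins for your configured strategy and posting action):

# Illustrative sketch of posting in priority order, with the
# "Post only once per destination" option set to Yes or No.
def post_to_destination(destination, sources, strategy, action, post_once=True):
    for source in sources:                 # sources in priority order (#2, #3, #4)
        if strategy(source, destination):  # e.g. source has a newer date
            action(source, destination)    # post data from source to destination
            if post_once:
                return                     # stop after the first successful post
    # With post_once=False, every passing source posts in turn, so the
    # lowest-priority passing source (#4) supplies the final value.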

You can now access the Python editor to create custom Python code for your custom action.

Destination protection
The Best Record and Unique ID operations in the Match transform offer you the power to modify existing records in your data. There may be times when you would like to protect data in particular records, or data in records from particular input sources, from being overwritten. The Destination Protection tab in these Match transform operations allows you to protect data from being modified.

To protect destination records through fields
1. In the Destination Protection tab, select Enable destination protection.
2. Select a value in the Default destination protection option drop-down list. This value determines whether a destination is protected if the destination protection field does not have a valid value.
3. Select the Specify destination protection by field option, and choose a field from the Destination protection field drop-down list (or Unique ID protected field). The field you choose must have a Y or N value to specify the action. Any record that has a value of Y in the destination protection field will be protected from being modified.

To protect destination records based on input source membership
You must add an Input Source operation and define input sources before you can complete this task.
1. In the Destination Protection tab, select Enable destination protection.
2. Select a value in the Default destination protection option drop-down list. This value determines whether a destination (input source) is protected if you do not specifically define the source in the table below.
3. Select the Specify destination protection by source option.
4. Select an input source from the first row of the Source name column, and then choose a value from the Destination protected (or Unique ID protected) column.

Repeat for every input source you want to set protection for. Remember that if you do not specify for every source, the default value will be used.

Unique ID
A unique ID refers to a field within your data which contains a unique value that is associated with a record or group of records. You could use a unique ID, for example, in your company's internal database that receives updates at some predetermined interval, such as each week, month, or quarter. Unique ID applies to a data record in the same way that a national identification number might apply to a person; for example, a Social Security number (SSN) in the United States, or a National Insurance number (NINO) in the United Kingdom. It creates and tracks data relationships from run to run. With the Unique ID operation, you can set your own starting ID for new key generation, or have it dynamically assigned based on existing data. The Unique ID post-match processing operation also lets you begin where the highest unique ID from the previous run ended.

Unique ID works on match groups
Unique ID doesn't necessarily assign IDs to individual records. It can assign the same ID to every record in a match group (groups of records found to be matches). If you are assigning IDs directly to a break group, use the Group number field option to indicate which records belong together. Additionally, make sure that the records are sorted by group number so that records with the same group number value appear together, as the sketch below illustrates. If you are assigning IDs to records that belong to a match group resulting from the matching process, the Group number field is not required and should not be used.

Note: If you are assigning IDs directly to a break group and the Group number field is not specified, Match treats the entire data collection as one match group.
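The consecutive-ordering requirement behaves like Python's itertools.groupby, which also only groups values that appear next to each other. This sketch only illustrates the requirement; it is not how the Match transform is implemented.

from itertools import groupby

records = [{'grp': 1, 'name': 'A'}, {'grp': 2, 'name': 'B'}, {'grp': 1, 'name': 'C'}]

# Unsorted input splits group 1 into two separate groups.
print([k for k, _ in groupby(records, key=lambda r: r['grp'])])  # [1, 2, 1]

records.sort(key=lambda r: r['grp'])
print([k for k, _ in groupby(records, key=lambda r: r['grp'])])  # [1, 2]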

Unique ID processing options
The Unique ID post-match processing operation combines the update source information with the master database information to form one source of match group information. The operation can then assign, combine, split, and delete unique IDs as needed. You can accomplish this by using the Processing operation option. The operations are described below.

Assign
Assigns a new ID to unique records that don't have an ID, or to all members of a group that don't have an ID. In addition, the assign operation copies an existing ID if a member of a match group already has an ID.
• Records in a match group where one record had an input unique ID will share the value with other records in the match group which had no input value.
• Records in a match group where two or more records had different unique ID input values will each keep their input value. The first value encountered will be shared. Order affects this; if you have a priority field that can be sequenced using ascending order, place a Prioritization post-match operation prior to the Unique ID operation.
• If all of the records in a match group do not have an input unique ID value, then the next available ID will be assigned to each record in the match group.
If the GROUP_NUMBER input field is used, then records with the same group number must appear consecutively in the data collection. If the GROUP_NUMBER field is not specified, Unique ID assumes that the entire collection is one group, and each record is assigned a value.
Note: Use the GROUP_NUMBER input field only when processing a break group that may contain smaller match groups.
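The Assign rules above can be sketched in a few lines of Python. This is an illustration under assumptions (records are dicts in priority order, and an empty string means no input ID), not the transform's implementation.

def assign_unique_ids(match_groups, next_id):
    for group in match_groups:  # records are already in priority order
        shared = next((r['uid'] for r in group if r['uid']), None)
        if shared is None:         # no member brought an ID: use the next available one
            shared = str(next_id)
            next_id += 1
        for record in group:
            if not record['uid']:  # records with their own input ID keep it
                record['uid'] = shared
    return next_id

groups = [[{'name': 'A1', 'uid': '477'}, {'name': 'A2', 'uid': ''}],
          [{'name': 'B1', 'uid': ''}, {'name': 'B2', 'uid': ''}]]
print(assign_unique_ids(groups, 478))  # 479; A2 shares 477, B1 and B2 both get 478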

AssignCombine
Performs both an Assign and a Combine operation.
• Records in a match group where one or more records had an input unique ID, with the same or different values, will share the first value encountered with all other records in the match group. Order affects this; if you have a priority field that can be sequenced using ascending order, place a Prioritization post-match operation prior to the Unique ID operation.
• Records that did not have an input unique ID value and are not found to match another record containing an input unique ID value will have the next available ID assigned to them. These are "add" records that could be unique records or could be matches, but not to another record that had previously been assigned a unique ID value.
If the GROUP_NUMBER input field is used, then records with the same group number must appear consecutively in the data collection. If the GROUP_NUMBER field is not specified, Unique ID assumes that the entire collection is one group. Each record is assigned a value.
Note: Use the GROUP_NUMBER input field only when processing a break group that may contain smaller match groups.

Combine
Ensures that records in the same match group have the same unique ID. For example, if a household has two members that share a common unique ID, and a third person moves in with a different unique ID, then the Combine operation could be used to assign the same ID to all three members. In other words, this operation could be used to assign all the members of a household the same unique ID.
The Combine operation does not assign a unique ID to any record that does not already have a unique ID. It only combines the unique ID of records in a match group that already have a unique ID. The first record in a match group that has a unique ID is the record with the highest priority. All other records in the match group are given this record's ID (assuming the record is not protected).
If the GROUP_NUMBER input field is used, then records with the same group number must appear consecutively in the data collection. If the GROUP_NUMBER field is not specified, Unique ID assumes that the entire collection is one group.
Note: Use the GROUP_NUMBER input field only when processing a break group that may contain smaller match groups.

Delete
Deletes unique IDs from records that no longer need them, provided that they are not protected from being deleted. If you are using a file and are recycling IDs, this ID is added to the file.
When performing a delete, records with the same unique ID should be grouped together. When Match detects that a group of records with the same unique ID is about to be deleted:
• If any of the records are protected, all records in the group are assumed to be protected.
• If recycling is enabled, the unique ID will be recycled only once, even though a group of records had the same ID.

Split
Changes a split group's unique records, so that the records that do not belong to the same match group will have a different ID. The record with the group's highest priority will keep its unique ID. The rest will be assigned new unique IDs. For this operation, you must group your records by unique ID, rather than by match group number.
For example:
• Records in a match group where two or more records had different unique ID input values, or blank values, will each retain their input value, filled or blank depending on the record.
• Records that came in with the same input unique ID value that no longer are found as matches have the first record output with the input value. Subsequent records are assigned new unique ID values.
• Records that did not have an input unique ID value and did not match any record with an input unique ID value will have a blank unique ID on output.

Unique ID protection
The output for the unique ID depends on whether an input field in that record has a value that indicates that the ID is protected.
• If the protected unique ID field is not mapped as an input field, Match assumes that none of the records are protected.
• If the protected unique ID field is mapped as an input field, a value other than N means that the record's input data will be retained in the output unique ID field. There are two valid values allowed in this field: Y and N. Any other value is converted to Y. A value of Y means that the unique ID is protected and the ID posted on output will be the same as the input ID. A value of N means that the unique ID is not protected and the ID posted on output may be different from the input ID.
These rules for protected fields apply to all unique ID processing operations.
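As a compact restatement of the protection rules, here is a hedged Python sketch; the function name and arguments are illustrative only.

def output_unique_id(input_id, protected_flag, candidate_id):
    # Any mapped value other than 'N' is converted to 'Y' (protected),
    # so the input ID survives; 'N' or an unmapped field (None) lets the
    # operation post a different ID.
    if protected_flag is not None and protected_flag != 'N':
        return input_id
    return candidate_id

print(output_unique_id('477', 'Y', '501'))   # 477: protected
print(output_unique_id('477', 'X', '501'))   # 477: any other value is treated as Y
print(output_unique_id('477', 'N', '501'))   # 501: not protected
print(output_unique_id('477', None, '501'))  # 501: protection field not mapped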

Unique ID limitations
Because some options in the unique ID operation are based on reading a file or referring to a field value, there may be implications when you are running a multi-server or real-time server environment and sharing a unique ID file.
• If you are reading from or writing to a file, the unique ID file must be on a shared file system.
• Recycled IDs are used in first-in, first-out order. When Match recycles an ID, it does not check whether the ID is already present in the file. You must ensure that a particular unique ID value is not recycled more than once.

To assign unique IDs using a file
1. In the Unique ID option group, select the Value from file option.
2. Set the file name and path in the File option.
This file must be an XML file and must adhere to the following structure:

<UniqueIdSession>
<CurrentUniqueId>477</CurrentUniqueId>
</UniqueIdSession>

Note: The value of 477 is an example of a starting value. However, the value must be 1 or greater.

To assign a unique ID using a constant
Similar to using a file, you can assign a starting unique ID by defining that value.
1. Select the Constant value option.
2. Set the Starting value option to the desired ID value.

Assign unique IDs using a field
The Field option allows you to send the starting unique ID through a field in your data source or from a User-Defined transform. The starting unique ID is passed to the Match transform before the first new unique ID is requested. If no unique ID is received, the starting number will default to 1.

Caution: Use caution when using the Field option. The field that you use must contain the unique ID value you want to begin the sequential numbering with. This means that each record you process must contain this field, and each record must have the same value in this field. For example, suppose the value you use is 100,000. During processing, the first record or match group will have an ID of 100,001. The second record or match group receives an ID of 100,002, and so on. The value in the first record that makes it to the Match transform contains the value where the incrementing begins. There is no way to predict which record will make it to the Match transform first (due to sorting, for example); therefore, you cannot be sure which value the incrementing will begin at.

To assign unique IDs using a field
1. Select the Field option.
2. In the Starting unique ID field option, select the field that contains the starting unique ID value.

To assign unique IDs using GUID
You can use Globally Unique Identifiers (GUID) as unique IDs.
• Select the GUID option.
Note: GUID is also known as the Universal Unique Identifier (UUID). The UUID variation used for unique ID is a time-based 36-character string with the format:
TimeLow-TimeMid-TimeHighAndVersion-ClockSeqAndReservedClockSeqLow-Node
For more information about UUID, see the Request for Comments (RFC) document.
Related Topics
• UUID RFC: http://www.ietf.org/rfc/rfc4122.txt
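Python's standard uuid module generates exactly this kind of time-based (version 1) identifier, which is a convenient way to see what GUID values look like:

import uuid

guid = uuid.uuid1()    # version 1 = the time-based UUID variant
print(guid)            # a 36-character string such as 1b4e28ba-2fa1-11d2-883f-b9a761bde3fb
print(len(str(guid)))  # 36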

To recycle unique IDs
If unique IDs are dropped during the Delete processing option, you can write those IDs back to a file to be used later.
1. In the Unique ID option group, set the Processing operation option to Delete.
2. Select the Value from file option.
3. Set the Recycle unique IDs option to Yes.
4. Set the file name and path in the File option.
This is the same file that you might use for assigning a beginning ID number.

Use your own recycled unique IDs
If you have some IDs of your own that you would like to recycle and use in a data flow, you can enter them in the file you want to use for recycling IDs and posting a starting value for your IDs. Enter these IDs in an XML tag of <R></R>. For example:

<UniqueIdSession>
<CurrentUniqueId>477</CurrentUniqueId>
<R>214</R>
<R>378</R>
</UniqueIdSession>
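Because this is plain XML, the file can also be maintained with standard tools. The following hedged Python sketch appends recycled IDs to an existing session file; the file name is an assumption for the example.

import xml.etree.ElementTree as ET

def add_recycled_ids(path, ids):
    # Append one <R> element per recycled ID to the <UniqueIdSession> root.
    tree = ET.parse(path)
    root = tree.getroot()
    for value in ids:
        ET.SubElement(root, 'R').text = str(value)
    tree.write(path, encoding='utf-8', xml_declaration=True)

add_recycled_ids('unique_id_session.xml', [214, 378])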

Destination protection
The Best Record and Unique ID operations in the Match transform offer you the power to modify existing records in your data. There may be times when you would like to protect data in particular records, or data in records from particular input sources, from being overwritten. The Destination Protection tab in these Match transform operations allows you to protect data from being modified.

To protect destination records through fields
1. In the Destination Protection tab, select Enable destination protection.
2. Select a value in the Default destination protection option drop-down list. This value determines whether a destination is protected if the destination protection field does not have a valid value.
3. Select the Specify destination protection by field option, and choose a field from the Destination protection field drop-down list (or Unique ID protected field). The field you choose must have a Y or N value to specify the action. Any record that has a value of Y in the destination protection field will be protected from being modified.

To protect destination records based on input source membership
You must add an Input Source operation and define input sources before you can complete this task.
1. In the Destination Protection tab, select Enable destination protection.
2. Select a value in the Default destination protection option drop-down list. This value determines whether a destination (input source) is protected if you do not specifically define the source in the table below.
3. Select the Specify destination protection by source option.
4. Select an input source from the first row of the Source name column, and then choose a value from the Destination protected (or Unique ID protected) column. Repeat for every input source you want to set protection for. Remember that if you do not specify for every source, the default value will be used.

Group statistics
The Group Statistics post-match operation should be added after any match level and any post-match operation for which you need statistics about your match groups or your input sources. This operation can also count statistics from logical input sources that you have already identified with values in a field (pre-defined), or from logical sources that you specify in the Input Sources operation. This operation also allows you to exclude certain logical sources based on your criteria.
Note: If you choose to count input source statistics in the Group Statistics operation, Match will also count basic statistics about your match groups.

Group statistics fields
When you include a Group Statistics operation in your Match transform, the following fields are generated by default:
• GROUP_COUNT
• GROUP_ORDER
• GROUP_RANK
• GROUP_TYPE
In addition, if you choose to generate source statistics, the following fields are also generated and available for output:
• SOURCE_COUNT
• SOURCE_ID
• SOURCE_ID_COUNT
• SOURCE_TYPE_ID
Related Topics
• Reference Guide: Data Quality fields appendix, Match transform output fields

To generate only basic statistics
This task will generate statistics about your match groups, such as how many records are in each match group, which records are masters or subordinates, and so on.
1. Add a Group Statistics operation to each match level you want, by selecting Post Match Processing in a match level, clicking the Add button, and selecting Group Statistics.
2. Select Generate only basic statistics.
3. Click the Apply button to save your changes.

To generate statistics for all input sources
Before you start this task, be sure that you have defined your input sources in the Input Sources operation. Use this procedure if you are interested in generating statistics for all of your sources in the job.
1. Add a Group Statistics operation to the appropriate match level.
2. Select the Generate source statistics from input sources option.

This will generate statistics for all of the input sources you defined in the Input Sources operation.

To count statistics for input sources generated by values in a field
For this task, you do not need to define input sources with the Input Sources operation. You can specify input sources for Match using values in a field. Using this task, you can generate statistics for all input sources identified through values in a field, or you can generate statistics for a sub-set of input sources.
1. Add a Group Statistics operation to the appropriate match level.
2. Select the Generate source statistics from source values option.
3. Select a field from the Logical source field drop-down list that contains the values for your logical sources.
4. Enter a value in the Default logical source value field. This value is used if the logical source field is empty.
5. Select one of the following:
• Count all sources: Select to count all sources. If you select this option, you can click the Apply button to save your changes. This task is complete.
• Choose sources to count: Select to define a sub-set of input sources to count. If you select this option, you can proceed to step 6 in the task.
6. Choose the appropriate value in the Default count flag option. Choose Yes to count any source not specified in the Manually define logical source count flags table. If you do not specify any sources in the Manually define logical source count flags table, you are, in effect, counting all sources.
7. Select Auto-generate sources to count sources based on a value in a field specified in the Predefined count flag field option. If you select this option, you are telling the Match transform to count all sources based on the (Yes or No) value in this field.
8. In the Manually define logical source count flags table, add as many rows as you need to include all of the sources you want to count.

9. Add a source value and count flag to each row, to tell the Match transform which sources to count.
Note: The Predefined count flag field is the first thing the Match transform looks at when determining whether to count sources.
Tip: If you have a lot of sources, but you only want to count two, you could speed up your setup time by setting the Default count flag option to No, and setting up the Manually define logical source count flags table to count those two sources. Using the same method, you can set up Group Statistics to count everything and not count only a couple of sources.

Output flag selection
By adding an Output Flag Selection operation to each match level (Post Match Processing) you want, you can flag specific record types for evaluation or routing downstream in your data flow. Adding this operation generates the Select_Record output field for you to include in your output schema. This output field is populated with a Y or N depending on the type of record you select in the operation. Your results will appear in the Match Input Source Output Select report. In that report, you can determine which records came from which source or source group, and how many of each type of record were output per source or source group.

Record types:
Unique: Records that are not members of any match group. No matching records were found. These can be from sources with a normal or special source type.
Single source masters: Highest ranking member of a match group whose members all came from the same source. Can be from normal or special sources.
Single source subordinates: A record that came from a normal or special source and is a subordinate member of a match group.

Multiple source masters: Highest ranking member of a match group whose members came from two or more sources.
Multiple source subordinates: A subordinate record of a match group that came from a normal or special source, whose members came from two or more sources.
Suppression matches: Subordinate member of a match group that includes a higher-priority record that came from a suppress-type source. Can be from normal or special sources.
Suppression uniques: Records that came from a suppress source for which no matching records were found.
Suppression masters: A record that came from a suppress source and is the highest ranking member of a match group.
Suppression subordinates: A record that came from a suppress-type source and is a subordinate member of a match group.

To flag source record types for possible output
1. In the Match editor, for each match level you want, add an Output Flag Select operation.
2. Select the types of records for which you want to populate the Select_Record field with Y.
The Select_Record output field can then be output from Match for use downstream in the data flow. This is most helpful if you later want to split off suppression matches or suppression masters from your data (by using a Case transform, for example).

Association matching
Association matching combines the matching results of two or more match sets (transforms) to find matches that could not be found within a single match set.

You can set up association matching in the Associate transform. This transform acts as another match set in your data flow, from which you can derive statistics. This match set has two purposes. First, it provides access to any of the generated data from all match levels of all match sets. Second, it provides the overlapped results of multiple criteria, such as name and address, with name and SSN, as a single ID. This is commonly referred to as association matching.

Group numbers
The Associate transform accepts a group number field, generated by the Match transforms, for each match result that will be combined. The transform can then output a new associated group number. The Associate transform can operate either on all the input records or on one data collection at a time. The latter is needed for real-time support.

Example: Association example
Say you work at a technical college and you want to send information to all of the students prior to the start of a new school year. You know that many of the students have a temporary local address and a permanent home address. In this example, you can match on name, address, and postal code in one match set, and match on name and Social Security number (SSN), which is available to the technical college on every student, in another match set. Then, the Associate transform combines the two match sets to build associated match groups. This lets you identify people who may have multiple addresses, thereby maximizing your one-to-one marketing and mailing efforts.
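Conceptually, building associated match groups from two sets of group numbers is a connected-components problem: records joined by either group number end up together. The following union-find sketch illustrates the idea on the student example; the field names addr_group and ssn_group are hypothetical, and this is not the Associate transform's actual algorithm.

def associate(records, keys=('addr_group', 'ssn_group')):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for i, rec in enumerate(records):
        for key in keys:
            if rec.get(key) is not None:
                union(('rec', i), (key, rec[key]))

    roots, assoc = {}, []
    for i in range(len(records)):
        assoc.append(roots.setdefault(find(('rec', i)), len(roots) + 1))
    return assoc

students = [
    {'name': 'R. Carson (local)', 'addr_group': 1, 'ssn_group': 10},
    {'name': 'R. Carson (home)', 'addr_group': None, 'ssn_group': 10},
    {'name': 'M. Ruiz', 'addr_group': 2, 'ssn_group': None},
]
print(associate(students))  # [1, 1, 2]: both Carson records share one associated group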

Unicode matching
Unicode matching lets you match Unicode data. You can process any non-Latin1 Unicode data, with special processing for Chinese, Japanese, Korean and Taiwanese (or CJKT) data.

Chinese, Japanese, Korean, and Taiwanese matching
Regardless of the country-specific language, the matching process for CJKT data is the same. For example, the Match transform:
• Considers half-width and full-width characters to be equal.
• Considers native script numerals and Arabic numerals to be equal. It can interpret numbers that are written in native script. This can be controlled with the Convert text to numbers option in the Criteria options group.
• Includes variations for popular, personal, and firm name characters in the referential data.
• Considers firm words, such as Corporation or Limited, to be equal to their variations (Corp. or Ltd.) during the matching comparison process. To find the abbreviations, the transform uses native script variations of the English alphabets during firm name matching.
• Ignores commonly used optional markers for province, city, district, and so on, in address data comparison.
• Intelligently handles variations in a building marker.

Japanese-specific matching capabilities
With Japanese data, the Match transform considers:
• Block data markers, such as chome and banchi, to be equal to those used with hyphenated data.
• Words with or without Okurigana to be equal in address data.
• Variations of no marker, ga marker, and so on, to be equal.
• Variations of a hyphen or dashed line to be equal.

Unicode match limitations
The Unicode match functionality does not:
• Perform conversions of simplified and traditional Chinese data.
• Match between non-phonetic scripts like kanji, simplified Chinese, and so on.
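The half-width/full-width equivalence in the first CJKT bullet above can be demonstrated with Unicode compatibility normalization, which folds width variants to one canonical form. This illustrates the concept only; it is not the Match engine's implementation.

import unicodedata

def fold_width(text):
    return unicodedata.normalize('NFKC', text)

print(fold_width('ＳＡＰ') == 'SAP')      # True: full-width Latin folds to ASCII
print(fold_width('ｶﾀｶﾅ') == 'カタカナ')  # True: half-width katakana folds to full-width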

Route records based on country ID before matching
Before sending Unicode data into the matching process, you must first, as best you can, separate out the data by country to separate Match transforms. This can be done by using a Case transform to route country data based on the country ID.
Tip: The Match wizard can do this for you when you use the multi-national strategy.

Inter-script matching
Inter-script matching allows you to process data that may contain more than one script by converting the scripts to Latin1. For example, one record has Latin1 data and another has katakana data, or one has Latin data and another has Cyrillic data. Select Yes to enable Inter-script matching. If you prefer to process the data without converting it to Latin1, leave the Inter-script Matching option set to No.
Here are two examples of names matched using inter-script matching:

Name / Can be matched to...
Viktor Ivanov / Виктор Иванов
Takeda Noburu / タケダ ノブル

Locale
The Locale option specifies the locale setting for the criteria field. Setting this option is recommended if you plan to use the Text to Numbers feature, to specify the locale of the data for locale-specific text-to-number conversion for the purpose of matching.
Here are four examples of text-to-number conversion:

Language / Text / Numbers
French / quatre mille cinq cents soixante-sept / 4567
German / dreitausendzwei / 3002
Italian / cento / 100
Spanish / ciento veintisiete / 127

For more information on these matching options, see the Match Transform section of the Reference Guide.

To set up Unicode matching
1. Use a Case transform to route your data to a Match transform that handles that type of data.
2. Open the AddressJapan_MatchBatch Match transform configuration, and save it with a different name.
3. Set the Match engine option in the Match transform options to a value that reflects the type of data being processed.
4. Set up your criteria and other desired operations. For more information on Match Criteria options, see the Match Transform section of the Reference Guide.

Example:
• When possible, use criteria for parsed components for address, firm, and name data, such as Primary_Name or Person1_Family_Name1.
• If you have parsed address, firm, or name data that does not have a corresponding criteria, use the Address_Data1-5, Firm_Data1-3, and Name_Data1-3 criteria.
• For all other data that does not have a corresponding criteria, use the Custom criteria.

Phonetic matching
You can use the Double Metaphone or Soundex functions to populate a field and use it for creating break groups or use it as a criteria in matching.
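Both functions reduce a spelling to a phonetic key so that variant spellings compare as equal. As an illustration, here is a compact, simplified Soundex in Python; it is a stand-in for the idea, not the Double Metaphone or Soundex implementation that the software uses.

def soundex(name):
    # Keep the first letter, code the rest, drop vowels, collapse
    # adjacent duplicate codes, and pad the key to four characters.
    codes = {**dict.fromkeys('BFPV', '1'), **dict.fromkeys('CGJKQSXZ', '2'),
             **dict.fromkeys('DT', '3'), 'L': '4', 'M': '5', 'N': '5', 'R': '6'}
    name = name.upper()
    key = name[0]
    last = codes.get(name[0], '')
    for ch in name[1:]:
        code = codes.get(ch, '')
        if code and code != last:
            key += code
        if ch not in 'HW':  # H and W do not reset the duplicate check
            last = code
    return (key + '000')[:4]

print(soundex('Smith'), soundex('Smythe'))  # S530 S530: the variants share one key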

Match criteria
There are instances where using phonetic data as a criteria can produce more matches than matching on other criteria, such as name or firm data. Matching on name field data produces different results than matching on phonetic data. For example:

Name / Comparison score
Smith, Smythe / 72% similar

Name / Phonetic key (primary) / Comparison score
Smith / SMO / 100% similar
Smythe / SMO

Criteria options
If you intend to match on phonetic data, set up the criteria options this way:

Option / Value
Compare algorithm / Field
Check for transposed characters / No
Initials adjustment score / 0
Substring adjustment score / 0
Abbreviation adjustment score / 0

Match scores
If you are matching only on the phonetic criteria, set your match score options like this:

Option / Value
Match score / 100
No match score / 99

If you are matching on multiple criteria, including a phonetic criteria, place the phonetic criteria first in the order of criteria and set your match score options like this:

Option / Value
Match score / 101
No match score / 99

Blank fields
Remember that when you use break groups, records that have no value are not in the same group as records that have a value (unless you set up matching on blank fields). For example, consider the following two input records:

Mr Johnson 100 Main St La Crosse WI 54601
Scott Johnson 100 Main St La Crosse WI 54601

After these records are processed by the Data Cleanse transform, the first record will have an empty first name field and, therefore, an empty phonetic field. This means that there cannot be a match if you are creating break groups. If you are not creating break groups, there cannot be a match if you are not blank matching.

Length of data
The length you assign to a phonetic function output is important. For example:

First name (last name) / Output
Scott (Johnson) / SKT
S (Johnson) / S

Suppose these two records represent the same person. In this example, if you break on more than one character, these records will be in different break groups, and therefore will not be compared, as the sketch below shows.
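A one-line break key function makes the effect of key length visible; this is a hedged illustration of the behavior, not the break-group implementation.

def break_key(phonetic, length):
    # Break-group key taken from the first `length` characters of the phonetic field.
    return (phonetic or '')[:length]

print(break_key('S', 1), break_key('SKT', 1))  # S S: same break group, records compared
print(break_key('S', 2), break_key('SKT', 2))  # S SK: different break groups, never compared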

Set up for match reports
We offer many match reports to help you analyze your match results. For more information about these individual reports, see the Data Services Management Console: Metadata Reports Guide.

Include Group Statistics in your Match transform
If you are generating the Match Source Statistics Summary report, you must have a Group Statistics operation included in your Match and Associate transform(s). If you want to track your input source statistics, you may want to include an Input Sources operation in the Match transform to define your sources and, in a Group Statistics operation, select to generate statistics for your input sources.
Note: You can also generate input source statistics in the Group Statistics operation by defining input sources using field values. You do not necessarily need to include an Input Sources operation in the Match transform.

Turn on report data generation in transforms
In order to generate the data you want to see in match reports other than the Match Source Statistics report, you must set the Generate report statistics option to Yes in the Match and Associate transform(s). By turning on report data generation, you can get information about break groups, which criteria were instrumental in creating a match, and so on.
Note: Be aware that turning on the report option can have an impact on your processing performance. It's best to turn off reports after you have thoroughly tested your data flow.

Define names for match sets, levels, and operations
To get the most accurate data in your reports, make sure that you have used unique names in the Match and Associate transforms for your match sets, levels, and each of your pre- and post-match operations, such as Group Prioritization and Group Statistics. This will help you better understand which of these elements is producing the data you are looking at.

Insert appropriate output fields
There are three output fields you may want to create in the Match transform, if you want that data posted in the Match Duplicate Sample report. They are:
• Match_Type
• Group_Number
• Match_Score

Chapter 19 Design and Debug

This section covers the following Designer features that you can use to design and debug jobs:
• Use the View Where Used feature to determine the impact of editing a metadata object (for example, a table). See which data flows use the same object.
• Use the View Data feature to view sample source, transform, and target data in a data flow after a job executes.
• Use the Interactive Debugger to set breakpoints and filters between transforms within a data flow and view job data row-by-row during a job execution.
• Use the Difference Viewer to compare the metadata for similar objects and their properties.
• Use the auditing data flow feature to verify that correct data is processed by a source, transform, or target object.
Related Topics
• Using View Where Used
• Using View Data
• Using the interactive debugger
• Comparing Objects
• Using Auditing

Using View Where Used
When you save a job, work flow, or data flow, the software also saves the list of objects used in them in your repository. Parent/child relationship data is preserved. For example, when the following parent data flow is saved, the software also saves pointers between it and its three children:
• a table source
• a query transform
• a file target

You can use this parent/child relationship data to determine what impact a table change, for example, will have on other data flows that are using the same table. For example, while maintaining a data flow, you may need to delete a source table definition and re-import the table (or edit the table schema). Before doing this, find all the data flows that are also using the table and update them as needed. The data can be accessed using the View Where Used option.

To access the View Where Used option in the Designer, you can work from the object library or the workspace.

Accessing View Where Used from the object library
You can view how many times an object is used and then view where it is used.

To access parent/child relationship information from the object library
1. View an object in the object library to see the number of times that it has been used.
The Usage column is displayed on all object library tabs except:
• Projects
• Jobs
• Transforms
Click the Usage column heading to sort values, for example, to find objects that are not used.
2. If the Usage is greater than zero, right-click the object (tables, flat files, etc.) and select View Where Used.
The Output window opens. The Information tab displays rows for each parent of the object you selected. The type and name of the selected object is displayed in the first column's heading. The As column provides additional context: it tells you how the selected object is used by the parent. For example, in the following example, table DEPT is used by data flow DF1; in data flow DF1, table DEPT is used as a Source.
Other possible values for the As column are:
• For XML files and messages, the values can be Source or Target.
• For flat files and tables only:
Lookup(): Lookup table/file used in a lookup function
Lookup_ext(): Lookup table/file used in a lookup_ext function

Lookup_seq(): Lookup table/file used in a lookup_seq function
• For tables only:
Comparison: Table used in the Table Comparison transform
Key Generation: Table used in the Key Generation transform
3. From the Output window, double-click a parent object.
The workspace diagram opens, highlighting the child object the parent is using.

Once a parent is open in the workspace, you can double-click a row in the output window again.
• If the row represents a child object in the same parent, this object is simply highlighted in the open diagram. This is an important option because a child object in the Output window might not match the name used in its parent. You can customize workspace object names for sources and targets. The software saves both the name used in each parent and the name used in the object library. The Information tab on the Output window displays the name used in the object library. The names of objects used in parents can only be seen by opening the parent in the workspace.
• If the row represents a different parent, the workspace diagram for that object opens.

Accessing View Where Used from the workspace
From an open diagram of an object in the workspace (such as a data flow), you can view where a parent or child object is used:
• To view information for the open (parent) object, select View > Where Used, or from the tool bar, select the View Where Used button.

In this example, the Output window opens with a list of jobs (parent objects) that use the open data flow.
• To view information for a child object, right-click an object in the workspace diagram and select the View Where Used option. The Output window opens with a list of parent objects that use the selected object. For example, if you select a table, the Output window displays a list of data flows that use the table.

Limitations
• This feature is not supported in central repositories.
• Transforms are not supported. This includes custom ABAP transforms that you might create to support an SAP applications environment.
• Only parent and child pairs are shown in the Information tab of the Output window. For example, for a table, a data flow is the parent. If the table is also used by a grandparent (a work flow, for example), these are not listed in the Output window display for a table. To see the relationship between a data flow and a work flow, open the work flow in the workspace, then right-click a data flow and select the View Where Used option.
• The software does not save parent/child relationships between functions.
• If function A calls function B, and function A is not in any data flows or scripts, the Usage in the object library will be zero for both functions. The fact that function B is used once in function A is not counted.
• If function A is saved in one data flow, the usage in the object library will be 1 for both functions A and B.
• The Designer counts an object's usage as the number of times it is used for a unique purpose. For example, in data flow DF1, if table DEPT is used as a source twice and a target once, the object library displays its Usage as 2. This occurrence should be rare (for example, a table is not often joined to itself in a job design).

Using View Data
View Data provides a way to scan and capture a sample of the data produced by each step in a job, even when the job does not execute successfully. Use View Data to check the data while designing and testing jobs to ensure that your design returns the results you expect. View imported source data, changed data from transformations, and ending data at your targets. At any point after you import a data source, you can check on the status of that data—before and after processing your data flows. Using one or more View Data panes, you can view and compare sample data from different steps. View Data information is displayed in embedded panels for easy navigation between your flows and the data. Armed with data details, you can create higher quality job designs.

Use View Data to look at:
• Sources and targets
View Data allows you to see data before you execute a job. You can scan and analyze imported table and file data from the object library, as well as see the data for those same objects within existing jobs. Of course, after you execute the job, you can refer back to the source data again.
• Transforms
• Lines in a diagram

Note:
• View Data displays blob data as <blob>.
• View Data is not supported for SAP IDocs. For SAP and PeopleSoft, the Table Profile tab and Column Profile tab options are not supported for hierarchies.
Related Topics
• Viewing data passed by transforms
• Using the interactive debugger

Accessing View Data
To view data for sources and targets
You can view data for sources and targets from two different locations:
1. View Data button
View Data buttons appear on source and target objects when you drag them into the workspace. Click the View Data button (magnifying glass icon) to open a View Data pane for that source or target object.
2. Object library
View Data in potential source or target objects from the Datastores or Formats tabs. Open a View Data pane from the object library in one of the following ways:
• Right-click a table object and select View Data.
• Right-click a table and select Open or Properties. The Table Metadata, XML Format Editor, or Properties window opens. From any of these windows, you can select the View Data tab.
To view data for a table, the table must be from a supported database. To view data for a file, the file must physically exist and be available from your computer's operating system.

Related Topics
• Viewing data in the workspace

Viewing data in the workspace
View Data can be accessed from the workspace when magnifying glass buttons appear over qualified objects in a data flow. This means that, for sources and targets, files must physically exist and be accessible from the Designer, and tables must be from a supported database.
To open a View Data pane in the Designer workspace, click the magnifying glass button on a data flow object. A large View Data pane appears beneath the current workspace area. Click the magnifying glass button for another object and a second pane appears below the workspace area (note that the first pane area shrinks to accommodate the presence of the second pane).

You can open two View Data panes for simultaneous viewing. When both panes are filled and you click another View Data button, a small menu appears containing window placement icons. The black area in each icon indicates the pane you want to replace with a new set of data. Click a menu option and the data from the latest selected object replaces the data in the corresponding pane.

The description or path for the selected View Data button displays at the top of the pane.
• For sources and targets, the description is the full object name:

• ObjectName ( Datastore.Owner ) for tables
• FileName ( File Format Name ) for files
For View Data buttons on a line, the path consists of the object name on the left, an arrow, and the object name to the right. For example, if you select a View Data button on the line between the query named Query and the target named ALVW_JOBINFO(joes.DI_REPO), the path would indicate: Query -> ALVW_JOBINFO(joes.DI_REPO)

You can also find the View Data pane that is associated with an object or line by:
• Rolling your cursor over a View Data button on an object or line. The Designer highlights the View Data pane for the object.
• Looking for grey View Data buttons on objects and lines. The Designer displays View Data buttons on open objects with grey rather than white backgrounds.
Related Topics
• Viewing data passed by transforms

View Data Properties
You can access View Data properties from tool bar buttons or the right-click menu. View Data displays your data in the rows and columns of a data grid. The number of rows displayed is determined by a combination of several conditions:
• Sample size: the number of rows sampled in memory. Default sample size is 1000 rows for imported source and target objects. Maximum sample size is 5000 rows. Set sample size for sources and targets from Tools > Options > Designer > General > View Data sampling size.

• Filtering
• Sorting
If your original data set is smaller, or if you use filters, the number of returned rows could be less than the default. When using the interactive debugger, the software uses the Data sample rate option instead of sample size.
Related Topics
• Filtering
• Sorting
• Starting and stopping the interactive debugger

Filtering
You can focus on different sets of rows in a local or new data sample by placing fetch conditions on columns. You can see which conditions have been applied in the navigation bar.

To view and add filters
1. In the View Data tool bar, click the Filters button, or right-click the grid and select Filters.
The Filters window opens.
2. Create filters. The Filters window has three columns:
a. Column—Select a name from the first column. Select {remove filter} to delete the filter.
b. Operator—Select an operator from the second column.
c. Value—Enter a value in the third column that uses one of the following data type formats:

Data Type / Format
integer, double, real / standard
date / yyyy.mm.dd
time / hh24:mm:ss
datetime / yyyy.mm.dd hh24:mm:ss
varchar / 'abc'

3. In the Concatenate all filters using list box, select an operator (AND, OR) for the engine to use in concatenating filters. Each row in this window is considered a filter.
4. To see how the filter affects the current set of returned rows, click Apply.
5. To save filters and close the Filters window, click OK.
Your filters are saved for the current object, and the local sample updates to show the data filtered as specified in the Filters dialog. To use filters with a new sample, see Using Refresh.
Related Topics
• Using Refresh
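The AND/OR concatenation behaves like combining row predicates. The following hedged Python sketch mirrors the Filters window (each row of the window is one condition, joined by a single AND/OR choice); it is an illustration, not the engine's filter implementation.

def make_filter(conditions, concat='AND'):
    # Each condition is (column, operator, value); a data row is a dict.
    ops = {'=': lambda a, b: a == b, '<>': lambda a, b: a != b,
           '<': lambda a, b: a < b, '>': lambda a, b: a > b}
    combine = all if concat == 'AND' else any
    return lambda row: combine(ops[op](row[col], val) for col, op, val in conditions)

rows = [{'REGION': 'WI', 'QTY': 5}, {'REGION': 'MN', 'QTY': 12}]
keep = make_filter([('REGION', '=', 'WI'), ('QTY', '>', 3)], concat='AND')
print([r for r in rows if keep(r)])  # [{'REGION': 'WI', 'QTY': 5}]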

To add a filter for a selected cell
1. Select a cell from the sample data grid.
2. In the View Data tool bar, click the Add Filter button, or right-click the cell and select Add Filter.
The Add Filter option adds the new filter condition, <column> = <cell value>, then opens the Filters window so you can view or edit the new filter. When you are finished, click OK.
To remove filters from an object, go to the View Data tool bar and click the Remove Filters button, or right-click the grid and select Remove Filters. All filters are removed for the current object.
Related Topics
• Using Refresh

Sorting
You can click one or more column headings in the data grid to sort your data. An arrow appears on the heading to indicate sort order: ascending (up arrow) or descending (down arrow). To change sort order, click the column heading again. The priority of a sort is from left to right on the grid.
To remove sorting for an object, from the tool bar click the Remove Sort button, or right-click the grid and select Remove Sort.

Using Refresh
To fetch another data sample from the database using new filter and sort settings, use the Refresh command. After you edit filtering and sorting, click the Refresh button in the tool bar, or right-click the data grid and select Refresh.

To stop a refresh operation, click the Stop button. While the software is refreshing the data, all View Data controls except the Stop button are disabled.

Opening a new window
To see more of the data sample that you are viewing in a View Data pane, open a full-sized View Data window. From any View Data pane, click the Open Window tool bar button to activate a separate, full-sized View Data window. Alternatively, you can right-click and select Open in new window from the menu.

Using Show/Hide Columns
You can limit the number of columns displayed in View Data by using the Show/Hide Columns option from:
• The tool bar.
• The right-click menu.
• The arrow shortcut menu, located to the right of the Show/Hide Columns tool bar button. Select a column to display it. This option is only available if the total number of columns in the table is ten or fewer.

To show or hide columns
1. Click the Show/Hide Columns tool bar button, or right-click the data grid and select Show/Hide Columns.
The Column Settings window opens.
2. Select the columns that you want to display, or click one of the following buttons: Show, Show All, Hide, or Hide All.
3. Click OK.
You can also "quick hide" a column by right-clicking the column heading and selecting Hide from the menu.

View Data tool bar options
The following options are available on View Data panes.
Open in new window: Opens the View Data pane in a larger window. See Opening a new window.
Save As: Saves the data in the View Data pane.
Print: Prints View Data pane data.
Copy Cell: Copies View Data pane cell data.
Refresh data: Fetches another data sample from existing data in the View Data pane using new filter and sort settings. See Using Refresh.
Open Filters window: Opens the Filters window. See Filtering.

Add a Filter: See To add a filter for a selected cell.
Remove Filter: Removes all filters in the View Data pane.
Remove Sort: Removes sort settings for the object you select. See Sorting.
Show/hide navigation: Shows or hides the navigation bar, which appears below the data table.
Show/hide columns: See Using Show/Hide Columns.

View Data tabs
The View Data panel for objects contains three tabs:
• Data tab
• Profile tab
• Column Profile tab
Use tab options to give you a complete profile of a source or target object. The Data tab is always available. The Profile and Relationship tabs are supported with the Data Profiler. Without the Data Profiler, the Profile and Column Profile tabs are supported for some sources and targets (see the Release Notes for more information).

Related Topics
• Viewing the profiler results

Data tab
The Data tab allows you to use the properties of View Data. It also indicates nested schemas, such as those used in XML files and messages. When a column references nested schemas, that column is shaded yellow and a small table icon appears in the column heading. In the Data area, data is shown for columns. Nested schema references are shown in angle brackets, for example <CompanyName>.
Related Topics
• View Data Properties

To view a nested schema
1. Double-click a cell.
The data grid updates to show the data in the selected cell or nested table. In the Schema area, tables and columns in the selected path are displayed in blue, while nested schema references are displayed in grey. Use the path and the data grid to navigate through nested schemas.
2. Continue to use the data grid side of the panel to navigate. For example:
• Select a lower-level nested column and double-click a cell to update the data grid.
• Click the Drill Up button at the top of the data grid to move up in the hierarchy.
• See the entire path to the selected column or table displayed to the right of the Drill Up button. Also, the selected cell value is marked by a special icon.

Profile tab
If you use the Data Profiler, the Profile tab displays the profile attributes that you selected on the Submit Column Profile Request option.
Without the Data Profiler, the Profile tab allows you to calculate statistical information for any set of columns you choose. This optional feature is not available for columns with nested schemas or for the LONG data type.
Related Topics
• Executing a profiler task

To use the Profile tab without the Data Profiler
1. Select one or more columns.
Select only the column names you need for this profiling operation, because Update calculations impact performance. You can also right-click to use the Select All and Deselect All menu options.
2. Click Update.
The statistics appear in the Profile grid. The grid contains six columns:
Column: Names of columns in the current table. Select names from this column, then click Update to populate the profile grid.
Distinct Values: The total number of distinct values in this column.
NULLs: The total number of NULL values in this column.
Min: Of all values, the minimum value in this column.
Max: Of all values, the maximum value in this column.
Last Updated: The time that this statistic was calculated.
Sort values in this grid by clicking the column headings. Note that the Min and Max columns are not sortable.
In addition to updating statistics, you can click the Records button on the Profile tab to count the total number of physical records in the object you are profiling. The software saves previously calculated values in the repository and displays them until the next update.
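The grid's statistics are straightforward to restate in code. A minimal Python sketch of the same calculation, treating None as NULL:

def profile_column(values):
    # Distinct count, NULL count, and min/max over the non-NULL values,
    # matching the columns of the Profile grid.
    non_null = [v for v in values if v is not None]
    return {'Distinct Values': len(set(non_null)),
            'NULLs': values.count(None),
            'Min': min(non_null) if non_null else None,
            'Max': max(non_null) if non_null else None}

print(profile_column(['WI', 'MN', None, 'WI']))
# {'Distinct Values': 2, 'NULLs': 1, 'Min': 'MN', 'Max': 'WI'}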

Column Profile tab
The Column Profile tab allows you to calculate statistical information for a single column. If you use the Data Profiler, the Relationship tab displays instead of the Column Profile tab.
Note: This optional feature is not available for columns with nested schemas or the LONG data type.
Related Topics
• To view the relationship profile data generated by the Data Profiler

To calculate value usage statistics for a column
1. Select a column name in the list box.
2. Enter a number in the Top box.
This number is used to find the most frequently used values in the column. The default is 10, which means that the software returns the top 10 most frequently used values.
3. Click Update.
The Column Profile grid displays statistics for the specified column. The grid contains three columns:
Value: A "top" (most frequently used) value found in your specified column, or "Other" (remaining values that are not used as frequently).

Total: The total number of rows in the specified column that contain this value.
Percentage: The percentage of rows in the specified column that have this value, compared to the total number of values in the column.
The software returns a number of values up to the number specified in the Top box, plus an additional value called "Other." So, if you enter 5 in the Top box, you will get up to 6 returned values (the top 5 used values in the specified column, plus the "Other" category). Results are saved in the repository and displayed until you perform a new update.
For example, statistical results in the preceding table indicate that of the four most frequently used values in the Name column, 50 percent use the value Item3, 20 percent use the value Item2, and so on. You can also see that the four most frequently used values (the "top four") are used in 90 percent of all cases, as only 10 percent is shown in the Other category. For this example, the total number of rows counted during the calculation for each top value is 1000.
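The top-N-plus-Other calculation can be reproduced with Python's collections.Counter; the sample data below is constructed to match the percentages in the example (1000 rows, top four values covering 90 percent):

from collections import Counter

def column_profile(values, top=10):
    counts = Counter(values)
    total = len(values)
    rows = [(value, n, 100.0 * n / total) for value, n in counts.most_common(top)]
    other = total - sum(n for _, n, _ in rows)
    if other:
        rows.append(('Other', other, 100.0 * other / total))
    return rows

sample = (['Item3'] * 500 + ['Item2'] * 200 + ['Item1'] * 120 +
          ['Item4'] * 80 + ['misc%d' % i for i in range(100)])
for value, count, pct in column_profile(sample, top=4):
    print(value, count, '%.0f%%' % pct)
# Item3 500 50%, Item2 200 20%, Item1 120 12%, Item4 80 8%, Other 100 10%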

Using the interactive debugger

The Designer includes an interactive debugger that allows you to examine and modify data row-by-row (during a debug mode job execution) by placing filters and breakpoints on lines in a data flow diagram. The interactive debugger provides powerful options to debug a job.

Note: A repository upgrade is required to use this feature.

Before starting the interactive debugger

Like executing a job, you can start the interactive debugger from the Debug menu when a job is active in the workspace. Select Start debug, set properties for the execution, then click OK. The debug mode begins. While in debug mode, all other Designer features are set to read-only. To exit the debug mode and return other Designer features to read/write, click the Stop debug button on the interactive debugger toolbar.

The Debug mode provides the interactive debugger's windows, menus, and tool bar buttons that you can use to control the pace of the job and view data by pausing the job execution using filters and breakpoints.

All interactive debugger commands are listed in the Designer's Debug menu. The Designer enables the appropriate commands as you progress through an interactive debugging session.

Before you start a debugging session, however, you might want to set the following:
• Filters and breakpoints
• Interactive debugger port between the Designer and an engine

Setting filters and breakpoints

You can set any combination of filters and breakpoints in a data flow before you start the interactive debugger. The debugger uses the filters and pauses at the breakpoints you set.

If you do not set predefined filters or breakpoints:

• The Designer will optimize the debug job execution. This often means that the first transform in each data flow of a job is pushed down to the source database. Consequently, you cannot view the data in a job between its source and the first transform unless you set a predefined breakpoint on that line.
• You can pause a job manually by using a debug option called Pause Debug (the job pauses before it encounters the next transform).

Related Topics
• Push-down optimizer

To set a filter or breakpoint
1. In the workspace, open the job that you want to debug.
2. Open one of its data flows.
3. Right-click the line that you want to examine and select Set Filter/Breakpoint.
   A line connects two objects in a workspace diagram. The Breakpoint window opens; its title bar displays the objects to which the line connects. For example, the following window represents the line between AL_ATTR (a source table) and Query (a Query transform).

4. Set and enable a filter or a breakpoint using the options in this window.
   A debug filter functions as a simple Query transform with a WHERE clause. Use a filter to reduce a data set in a debug job execution. Note that complex expressions are not supported in a debug filter. Place a debug filter on a line between a source and a transform or between two transforms.
   A breakpoint is the location where a debug job execution pauses and returns control to you. Like a filter, you can set a breakpoint between a source and a transform or between two transforms.
   If you set a filter and a breakpoint on the same line, the software applies the filter first, so the breakpoint can only see the filtered rows (see the sketch after this procedure).
   Choose to use a breakpoint with or without conditions:
   • If you use a breakpoint without a condition, the job execution pauses for the first row passed to the breakpoint.
   • If you use a breakpoint with a condition, the job execution pauses for the first row passed to the breakpoint that meets the condition. A breakpoint condition applies to the after image for UPDATE, NORMAL, and INSERT row types and to the before image for a DELETE row type.
   Instead of selecting a conditional or unconditional breakpoint, you can also use the Break after 'n' row(s) option. In this case, the execution pauses when the number of rows you specify pass through the breakpoint.
5. Click OK.
   The appropriate icon appears on the selected line.
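The way a filter and a breakpoint interact on the same line can be pictured with a short simulation. The following Python sketch is illustrative only; the row data, the filter predicate, and the pause handling are invented for the example, and conditions in the actual Breakpoint window are entered as expressions, not Python:

    # Illustrative simulation (not product code) of filter-then-breakpoint
    # semantics on one line: the filter is applied first, so a
    # "Break after 'n' row(s)" breakpoint counts only the filtered rows.
    def debug_line(rows, filter_cond=None, break_after=None):
        passed = 0
        for row in rows:
            if filter_cond and not filter_cond(row):
                continue  # the debug filter removes this row
            passed += 1
            yield row
            if break_after and passed % break_after == 0:
                print(f"-- paused: {passed} rows have passed the breakpoint --")

    rows = [{"ID": i, "REGION": "WEST" if i % 2 else "EAST"} for i in range(1, 9)]
    for row in debug_line(rows,
                          filter_cond=lambda r: r["REGION"] == "WEST",
                          break_after=2):
        print(row)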

The software provides the following filter and breakpoint conditions, each indicated by its own icon on the line:
• Breakpoint disabled
• Breakpoint enabled
• Filter disabled
• Filter enabled
• Filter and breakpoint disabled
• Filter and breakpoint enabled
• Filter enabled and breakpoint disabled
• Filter disabled and breakpoint enabled

In addition to the filter and breakpoint icons that can appear on a line, the debugger highlights a line when it pauses there. A red locator box also indicates your current location in the data flow. For example, when you start the interactive debugger, the job pauses at your breakpoint. The locator box appears over the breakpoint icon as shown in the following diagram:

A View Data button also appears over the breakpoint. You can use this button to open and close the View Data panes.

As the debugger steps through your job's data flow logic, it highlights subsequent lines and displays the locator box at your current position.

Related Topics
• Panes

Changing the interactive debugger port

The Designer uses a port to an engine to start and stop the interactive debugger. The interactive debugger port is set to 5001 by default.

To change the interactive debugger port setting
1. Select Tools > Options > Designer > Environment.
2. Enter a value in the Interactive Debugger box.
3. Click OK.
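A new port value presumably needs to be one that is not already in use on the machine where the engine runs. The following helper is purely illustrative and not part of the product; the localhost address and the default port 5001 are the only values assumed:

    # Illustrative helper (not product code): test whether anything is
    # already listening on a candidate interactive debugger port.
    import socket

    def port_in_use(port, host="127.0.0.1"):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            return s.connect_ex((host, port)) == 0  # 0 means something answered

    print(port_in_use(5001))  # True if a listener already occupies the port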

Starting and stopping the interactive debugger

A job must be active in the workspace before you can start the interactive debugger. You can select a job from the object library or from the project area to activate it in the workspace. Once a job is active, the Designer enables the Start Debug option on the Debug menu and tool bar.

To start the interactive debugger
1. In the project area, right-click a job and select Start debug.
   Alternatively, in the project area you can click a job and then:
   • Press Ctrl+F8
   • From the Debug menu, click Start debug.
   • Click the Start debug button on the tool bar.
   The Debug Properties window opens. The Debug Properties window includes three parameters similar to the Execution Properties window (used when you just want to run a job). You will also find more information about the Trace and Global Variable options.
   The options unique to the Debug Properties window are:
   • Data sample rate — The number of rows cached for each line when a job executes using the interactive debugger. For example, in the following data flow diagram, if the source table has 1000 rows and you set the Data sample rate to 500, then the Designer displays up to 500 of the last rows that pass through a selected line (see the sketch after this procedure). The debugger displays the last row processed when it reaches a breakpoint.
   • Exit the debugger when the job is finished — Click to stop the debugger and return to normal mode after the job executes. Defaults to cleared.
2. Enter the debug properties that you want to use or use the defaults.
3. Click OK.
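The Data sample rate behaves like a fixed-size cache that keeps only the most recent rows for each line. A minimal sketch of that behavior, assuming a simple per-line ring buffer (the row shape and names are invented for illustration):

    # A minimal sketch (not product code) of the Data sample rate:
    # a per-line cache that retains only the last N rows that pass
    # through, discarding older rows once the limit is reached.
    from collections import deque

    data_sample_rate = 500
    line_cache = deque(maxlen=data_sample_rate)

    for row_id in range(1, 1001):      # e.g. a 1000-row source table
        line_cache.append({"ROW_ID": row_id})

    print(len(line_cache))             # 500 -- only the most recent rows remain
    print(line_cache[0]["ROW_ID"], line_cache[-1]["ROW_ID"])  # 501 1000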

The job you selected from the project area starts to run in debug mode. The Designer:
• Displays the interactive debugger windows.
• Adds Debugging Job<JobName> to its title bar.
• Enables the appropriate Debug menu and tool bar options.
• Displays the debug icon in the status bar.
• Sets the user interface to read-only.

Note: You cannot perform any operations that affect your repository (such as dropping objects into a data flow) when you execute a job in debug mode.

When the debugger encounters a breakpoint, it pauses the job execution. You now have control of the job execution. The interactive debugger windows display information about the job execution up to this point. They also update as you manually step through the job or allow the debugger to continue the execution.

Related Topics
• Reference Guide: Parameters

To stop a job in debug mode and exit the interactive debugger
• Click the Stop Debug button on the tool bar, press Shift+F8, or, from the Debug menu, click Stop debug.

Panes

When you start a job in the interactive debugger, the Designer displays three additional panes as well as the View Data panes beneath the work space. The following diagram shows the default locations for these panes:

1. View Data panes
2. Call Stack pane
3. Trace pane
4. Debug Variable pane

Each pane is docked in the Designer's window. To move a debugger pane, double-click its control bar to release it, then click and drag its title bar to re-dock it.

1. Control bar
2. Control buttons

For example, the following Call Stack window indicates that the data you are currently viewing is in a data flow called aSimple, and shows that the path taken began with a job called Simple and passed through a condition called Switch before it entered the data flow.

The Designer