TIBCO DataExchange Designer User's Guide
Important Information
SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE THE FUNCTIONALITY (OR PROVIDE LIMITED ADD-ON FUNCTIONALITY) OF THE LICENSED TIBCO SOFTWARE. THE EMBEDDED OR BUNDLED SOFTWARE IS NOT LICENSED TO BE USED OR ACCESSED BY ANY OTHER TIBCO SOFTWARE OR FOR ANY OTHER PURPOSE. USE OF TIBCO SOFTWARE AND THIS DOCUMENT IS SUBJECT TO THE TERMS AND CONDITIONS OF A LICENSE AGREEMENT FOUND IN EITHER A SEPARATELY EXECUTED SOFTWARE LICENSE AGREEMENT, OR, IF THERE IS NO SUCH SEPARATE AGREEMENT, THE CLICKWRAP END USER LICENSE AGREEMENT WHICH IS DISPLAYED DURING DOWNLOAD OR INSTALLATION OF THE SOFTWARE (AND WHICH IS DUPLICATED IN THE TIBCO DATAEXCHANGE INSTALLATION GUIDE). USE OF THIS DOCUMENT IS SUBJECT TO THOSE TERMS AND CONDITIONS, AND YOUR USE HEREOF SHALL CONSTITUTE ACCEPTANCE OF AND AN AGREEMENT TO BE BOUND BY THE SAME. This document contains confidential information that is subject to U.S. and international copyright laws and treaties. No part of this document may be reproduced in any form without the written authorization of TIBCO Software Inc. TIB, TIBCO, Information Bus, The Power of Now, TIBCO Adapter, TIBCO Administrator, TIBCO BusinessWorks, TIBCO Designer, TIBCO Hawk, TIBCO Rendezvous, TIBCO Runtime Agent, TIBCO Enterprise Message Service, TIBCO SmartSockets and TIBCO DataExchange are either registered trademarks or trademarks of TIBCO Software Inc. in the United States and/or other countries. EJB, J2EE, JMS and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. All other product and company names and marks mentioned in this document are the property of their respective owners and are mentioned for identification purposes only. This software may be available on multiple operating systems. However, not all operating system platforms for a specific software version are released at the same time. 
Please see the readme.txt file for the availability of this software version on a specific operating system platform. THIS DOCUMENT IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. THIS DOCUMENT COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF THIS DOCUMENT. TIBCO SOFTWARE INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS DOCUMENT AT ANY TIME. Copyright 2005 - 2006 TIBCO Software Inc. ALL RIGHTS RESERVED. TIBCO Software Inc. Confidential Information
Contents

Preface
  Related Documentation
    TIBCO DataExchange Documentation
    Other TIBCO Product Documentation
  How to Contact TIBCO Customer Support
Overview Window
Pop-up Windows
Toolbars
Custom Interface
Web Favorites
  To Display the Web Favorites Dialog Box
  Web Favorites Dialog Box
Use Column Map to Perform Union on Output of Column Splitters with Different Sources
Use Duplicate and Flat Files to Debug Data
Use equals() When Comparing Values
Use Auto-Layout
Adding, Finding, Renaming, and Deleting Tasks
  Add a New, Blank Task
  Import a Task from Another Project File
  Retrieve a Task from DataExchange Server
  Find a Task, Task Segment, or Task Chain
  Rename a Task, Task Segment, or Task Chain
  Delete a Task, Task Segment, or Task Chain
Common Tasks for All Objects
  Give Each Object a Descriptive Name
  Give Each Object a Detailed Description or Definition
  Delete an Object From a Task
Datasources (Sources and Targets)
  Flat File Datasources
  Table Datasources
  JMS Stream Datasources
  XML File Datasources
  Common Tasks for All Datasources
Readers and Writers
  Common Tasks for Readers and Writers
  Bulk Readers and Writers
  Flat File Readers and Writers
  JDBC Readers and Writers
  JMS Readers and Writers
  Null Transformers
  XML Readers and Writers
Transformers
  Common Characteristics of Transformers
  Column Map Transformer
  Column Splitter Transformer
  Duplicate Transformer
  Duplicate Elimination Transformer
  Group By Transformer
  Join Transformer
  Null Transformer
  Pivot Transformer
  Row Splitter Transformer
  Sort Transformer
  Temp Table Transformer
  Union Transformer
  Update/Delete Transformer
Data Streams
  Link Objects with Data Streams
Dependency Relationships
Task Segments
  Create a Task Segment
  Organize Task Segments with Folders
  Find a Task Segment
  Modify a Task Segment
  Add a Task Segment to a Task
  List Tasks That Use a Task Segment
Using Multiple Data Flows
Changing Model and Diagram Layouts
Saving and Using Quick Launch Settings
Log In and Logging Out of a DataExchange Server
  Log In to a Server from DataExchange Designer
  Log Out of a Server from DataExchange Designer
Deploying Tasks
  Refresh the Server Configuration
  Create a Task Chain
  Modify a Task Chain in DataExchange Designer
  Retrieve a Task Chain from DataExchange Server
Pivot Tutorial
Row Splitter Tutorial
Sort Tutorial
Union Tutorial
Update/Delete Tutorial
JDBC Tutorial
Column Map - JavaScript Tutorial
  Requirements
  Lesson 1: Using the Expression Editor
  Lesson 2: Getting and Setting Values
  Lesson 3: Column-Level Computation
  Lesson 4: Cell-Level Computation
Preface
This preface describes the TIBCO DataExchange documentation set and other related TIBCO products. It also explains how to contact TIBCO Customer Support.
Topics
Related Documentation, page x
How to Contact TIBCO Customer Support, page xii
Related Documentation
This section lists documentation resources you may find useful.
TIBCO Administrator: This graphical user interface enables users to deploy, monitor, and start and stop TIBCO applications.

TIBCO BusinessWorks: This software provides an easy-to-use integration platform that allows you to develop integration projects. It includes a graphical user interface for defining business processes and an engine that executes the processes.

TIBCO Designer: This graphical user interface is used for designing and creating integration project configurations and building an Enterprise Archive (EAR) for the project. The EAR can then be used by TIBCO Administrator for deploying and running the application.

TIBCO Hawk: This is a tool for monitoring and managing distributed applications and operating systems.

TIBCO Rendezvous: This software enables programs running on many different kinds of computers on a network to communicate seamlessly. It includes two main components: the Rendezvous application programming interface (API), available in several languages, and the Rendezvous daemon.

TIBCO Enterprise Message Service: This software lets application programs send and receive messages using the Java Message Service (JMS) protocol. It also integrates with the TIBCO Rendezvous and TIBCO SmartSockets messaging products.
TIBCO DataExchange is a tool for transforming, migrating, and integrating large quantities of data from disparate sources.
Topics
What is TIBCO DataExchange?, page 2
Architecture and Components, page 4
Development Process, page 6
Project Elements, page 8
[Diagram: extract-transform-load flow. Data Source 1 through Data Source N feed the Transformation Engine, which loads Data Target 1 through Data Target N.]
TIBCO DataExchange is typically used to:

- Leverage pertinent data for business intelligence use. For example, generating the star schema in a data warehouse for a business analytics application.
- Merge data from multiple sources based on a recurring schedule. For example, doing a nightly batch merge of related data from multiple application databases.
- Do a one-time conversion from one system or application to another. For example, converting a legacy flat file system to an RDBMS-based application.
- Synchronize data. For example, doing sequenced batches of data transformations and loads into a data target, ensuring that the end result is current and free of data conflicts, as opposed to real-time snapshots that may be unsynchronized.
- Simplify complex data infrastructure. For example, cleansing and unifying fragmented and uncorrelated data from multiple organizations within a corporation to produce a role-specific view.
Features
TIBCO DataExchange includes the following major features:

- A visual data modeler and data flow designer: a graphical user interface that supports drag and drop and provides wizards for easy configuration.
- Source and target analysis capabilities through integrated data modeling.
- The ability to execute tasks on demand or schedule them to run later.
[Diagram: DataExchange architecture. DataExchange Designer and DataExchange Console (used for debugging during development) run on Microsoft Windows; TIBCO Administrator provides a Web interface. Data sources (Oracle, SQL Server, DB2, Sybase, Informix, Access, Excel, bulk read, Datacom, XML, flat files, JMS, JCA) supply raw data to the DataExchange Server, which loads transformed data into data targets (Oracle, SQL Server, DB2, Sybase, Informix, Access, Excel, bulk write, XML, flat files, JMS, JCA). The server stores runtime data and metadata in the Repository.]
DataExchange consists of several software components plus a number of utilities:

- DataExchange Designer: The design-time component used to create, modify, and deploy tasks to the DataExchange server. DataExchange Designer runs only on Windows; the other components are Web- or Java-based and run on various operating systems.
- TIBCO Administrator: The management and administrative component that allows you to manage the tasks created in DataExchange Designer and deployed to the DataExchange server. Administrator is used to start and stop DataExchange servers, set DataExchange server security, schedule tasks, and set other options. TIBCO Administrator is also the security provider for DataExchange: users and roles must be defined in TIBCO Administrator for authentication and authorization purposes.
- DataExchange Console: Includes debugging and impact analysis tools for use when developing tasks in DataExchange Designer. DataExchange Console should be installed with DataExchange Designer on development machines for use when debugging tasks.
- DataExchange Server: The server component that actually performs the tasks. For development it is convenient to have a copy of DataExchange server on the development system, but this is not required and your license may not permit it. In production, for best performance, DataExchange server should be installed on a different system than the ones running the source and target databases.
- Repository: A database used by DataExchange server to store tasks and related data. See the TIBCO DataExchange Installation Guide for a list of supported databases.
- TIBCO BusinessWorks plug-in: A plug-in that provides the DataExchange palette in TIBCO BusinessWorks. The palette includes resources for connecting to the DataExchange server and invoking tasks within BusinessWorks processes.
- DxUtility: A utility that allows you to deploy, run, or undeploy a list of tasks or task chains specified in a configuration file or on the command line. The utility is installed with DataExchange server at install-path\tibco\dx\5.3\bin. See the DxUtility.readme file (located in that directory) for information about using the utility.
- cmdlineconsole: A scriptable command-line alternative to using TIBCO Administrator to manage tasks and task chains. Installed with DataExchange server at install-path\tibco\dx\5.3\bin. User management tasks, however, should be performed using TIBCO Administrator.
Development Process
This section outlines the DataExchange development process. In the real world, development may be complicated by such things as multiple data flows within a task or division of labor among development, QA, and production staff. The next diagram shows the major steps in the development process.
[Diagram: Add Project → Create Data Model → Add Transformer for Source Data → Add Transformer for Target Data → Test and Debug → Deploy to Production Server → Schedule]
Design-time Tasks
The following tasks are performed in DataExchange Designer.

Create a Project and Add a Data Model

- Create a new project file.
- Add data model(s). Typically you will have one data model (or set of data models) from which you will select source database tables or flat files for your tasks, and another data model (or set of data models) which you will use as targets for the transformed data.

  Ideally, during development the source data model should be a test database that is an identical copy or subset of the production database, running on the same operating system and RDBMS, with the same drivers. If that is not possible, use flat files exported from the production database or, if it will not interfere with operations, the production database itself.

  During initial development, while adding transformers to your task, the target data model should always be flat files, since until the task is complete the data will not be in the format required by the target database. Once the task is complete, ideally both the source and the target data models should be test databases that are identical copies of the production databases. To the extent this is impossible, substitute flat files.
- Create a task. A task contains the set of data flows that are to be executed in parallel.
Add Source and Target Transformers

- Add only the source(s) the transformer uses as inputs. When a source is a flat file on your local machine, and your DataExchange server is running on a different machine, you can attach a copy of the local file to the task when deploying to DataExchange server.
- Add a reader for each source.
- Add the transformer.
- Add a target for each of the transformer's outputs.
- Add a writer for each target.
- Add data streams to link:
  - the source(s) to the corresponding reader(s)
  - the target(s) to the corresponding writer(s)
  - the reader(s) to the transformer
  - the transformer to the writer(s)

Test and Debug the Transformers

- Deploy the task, selecting the Execute tasks immediately option.
- Run the task on the DataExchange server to see whether the task succeeds or aborts. If it succeeds, verify that the output data is correct.
Run-time Tasks
You deploy tasks to the production DataExchange server using DataExchange Designer. You schedule, monitor, and manage tasks using TIBCO Administrator.

Deploy Tested Task for Production

- Replace the test environment sources and targets with production environment sources and targets.
- Deploy and execute the task in the production environment.

Schedule the Deployed Task

If you wish, in TIBCO Administrator, schedule the task to run automatically. See the TIBCO DataExchange Administrator's Guide for details.
Project Elements
Broadly speaking, a DataExchange Designer project file contains at least one data model and one or more tasks. After the tasks have been deployed, a set of corresponding server-side elements is created on the DataExchange server machine.
Project File
Before you can create a task, you must create a project file to hold it. The file extension for DataExchange project files is .dt1, so project files are commonly called DT1s. A project file contains one or more tasks, plus data models that define the source and target databases, files or both for those tasks. When a task is deployed, the DT1 that contains it may be attached and stored in the DataExchange server repository, from which it may later be retrieved.
Data Models
As shown in the next diagram, data models are graphical representations of source or target objects. DataExchange Designer lets you reverse engineer existing databases to create physical data models. You can then visually construct your task diagrams using the tables in your models. Having access to the physical design of your databases while creating data transformation projects makes moving and transforming data from different sources easy to manage.
Data model diagrams display the tables, relationships between the tables, and constraints that make up your databases. DataExchange Designer also provides you with the tools to edit your data models, create new data models, and generate new databases from those models. Data models may represent:

- relational databases
- flat files
- XML files
- JMS streams

Models for any or all of these types may be included in a single project file.
Submodels
DataExchange Designer lets you create independent views, called submodels, of all or part of your physical model. Submodels let you display characteristics for a subject area independently of the main model. Any changes to submodel objects automatically propagate to the main model, and you can create any number of submodels within your data model.

Use submodels to help you organize your data model. For example, if you have a large number of tables in the main model, you may want to organize them in submodels; a submodel lets you focus on a smaller set of tables. You can use the same table in multiple submodels while the table remains in the main model. Any changes to this table or its relationships are reflected in all submodels as well as the main model.

Changes to display settings, such as color and layout, are not reflected across submodels or the main model. This is so that submodels can also be leveraged to store different views or displays of the models. For example, you can use a conceptual table display setting for business users not interested in attribute information, or a technical data type display setting for developers who might need more information.
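The propagation behavior described above can be pictured as shared object references: each submodel refers to the same table objects as the main model, while display settings are stored per view. The sketch below is purely illustrative (plain JavaScript with hypothetical names, not the DataExchange object model):

```javascript
// Illustrative sketch of submodel propagation; the object names are
// hypothetical and do not reflect the actual DataExchange object model.
const customers = { name: "CUSTOMERS", columns: ["ID", "NAME"] };

// Both views hold a reference to the same table object,
// but each view keeps its own display settings.
const mainModel = { tables: [customers], display: { color: "white" } };
const submodel = { tables: [customers], display: { color: "blue" } };

// A structural change made through the submodel...
submodel.tables[0].columns.push("EMAIL");

// ...is visible in the main model, because the table object is shared:
console.log(mainModel.tables[0].columns); // [ 'ID', 'NAME', 'EMAIL' ]

// Display settings, by contrast, do not propagate between views:
console.log(mainModel.display.color, submodel.display.color); // white blue
```

The design choice this models: structural edits propagate because there is only one underlying table object, while cosmetic settings can diverge because each view owns its own copy of them.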
Tasks
A task is a data flow, or a set of data flows, to be executed in parallel. As shown in the next diagram, each data flow is assembled graphically by adding objects to the task's workspace.
[Diagram: Data Source → Reader → Transformer → Writers → Data Target]
Task object types include:

- Data Source: Defines a database table, flat file, or other source from which data will be extracted.
- Reader: Specifies the details of an extraction from a particular source. For example, a JDBC reader may include a SELECT to extract only the necessary columns. In some cases, a reader may have more than one source; for example, a JDBC reader may have several source tables and query the RDBMS to join them before sending the data to DataExchange server.
- Transformer: Specifies a specific transformation to be performed on the data, such as concatenating columns, eliminating duplicate rows, or sorting.
- Writer: Specifies the details of a load to a particular target.
- Data Target: Defines a database table, flat file, or other target into which data will be loaded.
- Data Stream: A line that defines a portion of the path of a data flow by connecting two objects. A data stream may connect a source and a reader, a reader and a transformer, one transformer and another, a reader or transformer and a writer, or a writer and a target.
- Dependency relationship: Keeps a reader or writer on hold until a writer on which it is dependent is finished; for example, when one writer loads a set of primary keys that will be used as foreign keys by the second writer's data.

A data flow is a set of objects in a task that represent and define a single extract-transform-load operation to be performed by DataExchange server. A task may contain multiple data flows (see Using Multiple Data Flows on page 123).
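As a mental model, a single data flow can be pictured as a chain of functions: the reader extracts rows from the source, each transformer reshapes them, and the writer loads the result into the target, with data streams connecting the stages in order. The sketch below is purely illustrative (plain JavaScript, not the DataExchange API), using a duplicate-elimination step as the transformer:

```javascript
// Illustrative extract-transform-load sketch; not the DataExchange API.
const dataSource = [
  { id: 1, city: "Austin" },
  { id: 1, city: "Austin" }, // duplicate row
  { id: 2, city: "Boston" },
];
const dataTarget = [];

// Reader: extract rows from the source.
const read = (source) => source.map((row) => ({ ...row }));

// Transformer (duplicate elimination): keep the first copy of each row.
const eliminateDuplicates = (rows) => {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
};

// Writer: load the transformed rows into the target.
const write = (rows, target) => target.push(...rows);

// The data streams connect the stages in order: source → reader →
// transformer → writer → target.
write(eliminateDuplicates(read(dataSource)), dataTarget);
console.log(dataTarget.length); // 2
```

In the actual product the equivalent wiring is done graphically with data streams, and the server executes the resulting flow; the sketch only shows the shape of the pipeline.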
Server-Side Elements

Once a task is deployed from DataExchange Designer to DataExchange server, the project gains some additional server-side elements. These topics are documented in the TIBCO DataExchange Administrator's Guide.

- Deployed tasks: When you deploy a task from DataExchange Designer, a file is created in DataExchange server's repository. That file, essentially a script, controls what DataExchange server does when the task is run. Use TIBCO Administrator to view, run, or delete deployed tasks.
- Attached DT1s: By default, a deployed task includes a copy of the DT1 file from which it was deployed, allowing you to retrieve the task back into DataExchange Designer for editing. This is particularly useful when you roll a deployed task back to an earlier version.
- Task chains: Deployed tasks may be batched together as a task chain and then run or scheduled like a single task.
- Runtime properties: Various properties, such as datasource file names and RDBMS username and password, may be specified at runtime, overriding the settings in the deployed task. You may also define custom variables in DataExchange Designer to allow you to specify virtually any aspect of the task at runtime. Runtime parameters may be entered manually or picked up from a parameter file, which may be created in a text editor or written out to a file.
- Version history: By default, when you modify and redeploy a task in DataExchange Designer, it overwrites the existing deployed task in DataExchange server. You may set DataExchange server to retain multiple versions of deployed tasks instead. You may make an older version active temporarily, or roll back to an earlier version permanently.
- Schedules: Optionally, you may set DataExchange server to run a task automatically by defining a schedule. A schedule may be a one-time operation or may recur at any interval, and you may define multiple schedules for a task.
Chapter 1
This chapter describes the main windows that are part of the DataExchange Designer interface.
Topics
Overview
Main Window
Diagram Explorer
Diagram Window
Other Windows and Toolbars
Web Favorites
Overview
The DataExchange Designer interface is divided into two tabbed windows that allow you to toggle between Data Model and Task views. DataExchange Designer also includes a tree that lets you easily navigate Data Models and Tasks. DataExchange Designer includes context-sensitive toolbars that change depending on your workspace focus. Most items on the toolbar are accessible from the application and shortcut menus. Application and shortcut menus are also context-sensitive and change depending on workspace focus.
Main Window
The DataExchange Designer interface is divided into two windows or panes. The left pane is the Diagram Explorer and the right pane is the Diagram Window. The Diagram Explorer has tabs that offer easy access to important functionality and lets you efficiently manage your Data Models, Servers, Data Dictionary, and Macros. You can easily navigate large Data Models and reuse design elements using the Diagram Explorer. The Diagram Window offers tabs that display Data Model, Task, Task Chain and Task Segment diagrams. Within the Diagram Window, several small windows provide detailed information about your Data Model or Task. You can find the Zoom window, Pop-up windows, and the Overview window in the Diagram Window whether you are working on a Data Model or a Task.
Diagram Explorer
The Diagram Explorer has the following tabbed panes:

Main: Lets you locate, modify, and create Sources, Targets, and their objects.
Servers: Lets you manage the DataExchange servers and the tasks deployed on the servers.
Data Dictionary: Lets you locate, modify, and create Data Dictionary objects such as defaults, rules, user datatypes, and domains.
Macros: Lets you locate, add, edit, rename, delete, or run macros.
Main Tab
The Main tab of the Diagram Explorer displays all your Source and Target models. DataExchange Designer separates your models by model type. You can create multiple models for tables, flat files, JMS streams, and XML files. You can drill down to objects using the Data Model pane in the Diagram Explorer to directly access individual components within objects, such as columns contained in tables.
Servers Tab
The Servers tab of the Diagram Explorer displays the selected project's Task components and registered machines.
Tasks consist of the Transformers, Sources, and Targets selected in the project. The Machines node maintains a list of registered machines that are running DataExchange server. DataExchange Designer loads registered machine data from the system registry.
Data Dictionary Tab
You can import a Data Dictionary from another model for use in the current model.
Macros Tab
The Macros tab of the Diagram Explorer displays all macros. Use the Macros tab to navigate your macros. You can organize your macros into folders, allowing you easy access to different types of macros. You can create or delete folders to better suit your organizational needs. The Macros tab also displays sample macros. DataExchange Designer includes several sample macros with the installation. These macros demonstrate how to use the Automation Interface.
Diagram Window
The Diagram Window has tabbed panes that provide a workspace for creating Data Model, Task, Task Segment and Task Chain diagrams. The workspaces are customizable and allow you to zoom in or out of specific work areas. The next diagram shows the location of the Diagram Workspace tabs that allow you to toggle between displaying the panes.
The next table describes the tabs of the Diagram Workspace:

Data Model: Lets you create physical Data Model diagrams of your Source and Target databases or flat files.
Task: Lets you create Task diagrams. Lets you add and edit all the components of your Task, including Source and Target objects, Transformers, and Data Streams.
Task Segment: Lets you create a partial task and reuse it in other tasks.
Task Chain: Lets you run or schedule two or more tasks as a single operation.
Right-clicking model objects opens context-sensitive menus that give you access to functions for that object type. For example, right-clicking a table opens a menu that lets you edit or delete the table, change its background color, create a submodel or view, or cut or copy the table. Right-clicking a relationship opens a context-sensitive menu that lets you edit or delete the relationship, create a submodel or view, or copy the relationship.
Task Tab
The Task tab of the Diagram Window displays the objects that comprise your Task. You can add and edit any component of your Task in the Task tab of the Diagram Window. You can also customize the look and feel of the Task by changing colors and fonts, or aligning objects. If your Task is large, you can zoom into a specific work area. The Diagram Window can contain multiple Task diagrams. Each Task diagram has a separate tab at the bottom of the Diagram Window. This multi-tab layout makes it easy to navigate Projects that include multiple Tasks.
Right-clicking an object opens other context-sensitive menus that let you edit, delete, or change the display properties of individual objects. For example, right-clicking a transformer lets you edit or delete the transformer, or change its background color. Right-clicking a Data Stream opens a context-sensitive menu that lets you edit or delete the path.
Zoom Window
DataExchange Designer offers a Zoom Window to help you focus on the details of a specific area of a large, reduced diagram. This feature is only available in the Diagram Window. You can open and close the Zoom Window as needed. You can also move the window by dragging it by its title bar. To open or close it, click Zoom Window on the View menu.
Overview Window
The Overview Window helps you keep track of where you are in a large, reduced diagram. This feature is only available in the Diagram Window. You can open and close the Overview Window as needed. You can also move the window by dragging it by its title bar. To open or close it, click Overview Window on the View menu.
Pop-up Windows
Pop-up Windows provide brief definitions of each object in your Data Model or Task diagrams. When you place your pointer over any object for a second, DataExchange Designer opens a pop-up window. The pop-up window displays the name of the object and a brief description of the object.
Toolbars
DataExchange Designer toolbars are context-sensitive and change to reflect the element of the application you are using. Each toolbar contains buttons that quickly access commonly used features of DataExchange Designer. You can float or dock your toolbars. Double-click the handle of a docked toolbar to float it. If your toolbar is floating, double-click the title bar to redock it. You can also move toolbars to horizontal or vertical positions anywhere on the screen.
Custom Interface
You can customize the DataExchange interface by choosing among Microsoft Office 97, XP, and 2003 interface styles, and OneNote-style tabs. To change styles, right-click in the toolbar area and select Customize.
Web Favorites
DataExchange Designer lets you access external resources on the Internet or an intranet. You can add URLs that you visit often. Use the Web Favorites dialog box to add, edit, or delete favorite URLs.
Chapter 2
This chapter contains detailed instructions for working with project files and data models.
Topics
Creating Project Files
Adding and Working With Data Models
With the Retrieve and open the file immediately option selected, the file is always created in install-path\tibco\dx\5.3\dxdesigner\Model\, even if you set a different path in Tools > Options > Directories > Models.
5. Click the folder button in the upper right corner of the dialog, then navigate to and double-click the TIBCO DataExchange file (DT1). 6. Select the checkbox for the model(s) to be imported, then click Finish. For more information, see Data Models.
Native / SQL Server: Specify the machine name or named instance. In DataExchange Designer, you specify a SQL Server named instance Windows-style, with a backslash (SERVER1\INSTANCE1), but in DataExchange Console you specify it Java-style, with a forward slash (SERVER1/INSTANCE1).
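The separator difference noted above amounts to swapping a backslash for a forward slash. As a minimal sketch (the class and method names are invented for illustration):

```java
public class InstanceName {
    // Convert a Designer-style (Windows backslash) named instance to the
    // Console-style (Java forward slash) form described above.
    static String toConsoleStyle(String designerName) {
        return designerName.replace('\\', '/');
    }

    public static void main(String[] args) {
        System.out.println(toConsoleStyle("SERVER1\\INSTANCE1")); // SERVER1/INSTANCE1
    }
}
```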
Native / Sybase ASE: Specify the server name as defined in the Sybase client.
Database List: Select the database to reverse-engineer. This option is active only for native connections to MS SQL Server or Sybase ASE; with other connection types, the Datasource setting selects the database.
Owner List: Only the tables and/or views of the selected owner(s) will be available for selection in the next step of the wizard. By default, this is set to the default system owner (IBM DB2: db2admin; MS SQL Server / Sybase: dbo; Oracle: system), so all tables and views will be available.
Include System Tables / Include System Views: If checked, the RDBMS metaschema tables and views will be available for selection.
Infer Primary Keys: If checked, DataExchange Designer infers primary keys from the existence of unique indexes on tables. If more than one unique index exists on a table, it picks the one with the fewest columns.
Infer Foreign Keys from Indexes: If checked, DataExchange Designer infers foreign keys by looking for indexes whose columns match the names, data type properties, and column sequences of a primary key. If the child index is a primary key index, it must contain more columns than the parent primary key. In this case, an identifying relationship is created.
Infer Foreign Keys from Names: If checked, DataExchange Designer infers foreign keys by looking for columns that (1) match the names and data type properties of a primary key and (2) do not have role names. In this case, a non-identifying relationship is created.
Infer Domains: If checked, DataExchange Designer infers a domain for each unique combination of a column name and its associated data type properties.
Reverse Engineer View Dependencies: When checked, ensures that imported views are valid by automatically importing all tables on which the views depend. When unchecked, if you neglect to select all of the tables on which a selected view depends, the imported view will be invalid.
To import only a view and the associated tables, check Include User Tables and Include User Views, deselect all tables, deselect all views but the one to be imported, and check Reverse Engineer View Dependencies.
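The Infer Primary Keys rule above (when a table has several unique indexes, pick the one with the fewest columns) can be sketched as follows; representing an index as a list of column names is an assumption for illustration, not the DataExchange data model:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PrimaryKeyInference {
    // Given the column lists of a table's unique indexes, return the
    // inferred primary key: the unique index with the fewest columns.
    static List<String> inferPrimaryKey(List<List<String>> uniqueIndexes) {
        return uniqueIndexes.stream()
                .min(Comparator.comparingInt(ix -> ix.size()))
                .orElse(Arrays.asList());
    }

    public static void main(String[] args) {
        List<List<String>> indexes = Arrays.asList(
                Arrays.asList("OrderID", "LineNo"), // two-column unique index
                Arrays.asList("OrderGUID"));        // single-column unique index
        System.out.println(inferPrimaryKey(indexes)); // [OrderGUID]
    }
}
```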
Initial Layout Option: Controls how the model objects are laid out in the Diagram Window. The default Hierarchical setting usually gives the best results. You can experiment with other options after reverse-engineering is complete.
See also:
Change a Reverse-Engineered Model's Datasource Settings
Update a Model from Another Model or a Live Database (Compare and Merge)
3. To generate the database directly in the target RDBMS, select Generate Objects with a Database Connection; to create a SQL script, leave Generate a Single, Ordered Script File selected. At this point, you may restore previously saved settings (see Saving and Using Quick Launch Settings). After doing so, to generate the database immediately, click Go; otherwise, skip to step 6.
4. If you selected Generate Objects with a Database Connection, click Connect, enter the connection information, and click OK. The target platform and connection type must be the same as the source model's. For example, you cannot generate an Oracle database from a SQL Server model.
5. For SQL Server or Sybase ASE only, select or specify the target database.
6. Click Next.
7. Select the objects to be generated, set any options as desired, then click Next.
8. Optionally, save your settings for future reuse (see Saving and Using Quick Launch Settings).
9. Optionally, click SQL Preview to view the DDL.
10. Click Finish to generate the SQL file or database.
For additional information, see Create a Model by Reverse-Engineering a Database.
Update a Model from Another Model or a Live Database (Compare and Merge)
The Compare and Merge utility allows you to compare a model in a project file (DT1) with another model in the same file or in another DT1, ER/Studio, or SQL file, or with a live database.
1. Select the Main tab of the Diagram Explorer.
2. Right-click the model icon, then select Compare and Merge Utility. The Compare and Merge wizard will walk you through the settings for comparing the selected model with another model or a live database. At this point, you may restore previously saved settings (see Saving and Using Quick Launch Settings). After doing so, to run the compare operation immediately, click Go; otherwise, click Next and modify the restored settings as desired.
3. After DataExchange Designer compares the two models, it displays differences as follows, with the current model on the left:
If this were a comparison of a reverse-engineered model (left) with the source database (right), the display above would tell you that there has been at least one change to the Categories table, and that a SupportContacts table has been added. To see what has changed in the Categories table, expand the tree by clicking the + icons.
Now you can see that the only change is that one of the columns has been renamed. To update the current model with the changes to the database, change Set All Resolutions to Set All to Merge into Current.
Alternatively, to merge only a subset of changes, change Ignore to Merge into Current for the changes you wish to merge.
If, due to different names or some other change, objects are not matched with their counterparts, you may force a match: click the object in the left column, right-click the object in the right column, and select Match Objects.
4. After you have set the Resolution settings as desired, click Finish to perform the merge, or, if merging into a live database, click Next and follow the wizard's instructions to generate and optionally run a SQL script. Changes to tables may also be propagated directly to the database by the JDBC Writer during task execution; see Add a JDBC Writer to a Task.
For additional information, see Create a Model by Reverse-Engineering a Database.
To import metadata
1. Select File > Import File > From External Metadata.
2. Follow the prompts to complete the import.
To export model metadata
1. Select the Main tab of the Diagram Explorer.
2. Right-click the model's icon, then select Export Model Metadata.
3. Follow the prompts to complete the export.
Dimensional Models
If you import a dimensional model from ER/Studio, the model notation will automatically be set to Dimensional, and each table will be assigned one of the following types, which are distinguished by icons. You may also change a model's notation from Relational to Dimensional or vice versa manually with the Model > Notation commands, and change the table types manually on the Dimensional tab of the Table Editor (see Edit a Table Datasource).

Fact: Represents tables with one or more foreign keys and no children. Fact tables contain the individual records, the data for which the database is being designed. These are the central tables in a star schema.
Dimension: A group of related data, such as date, hours, minutes, and seconds, represented by one key, such as time, in a fact table.
Snowflake: A more-normalized format in which elements of dimension tables are listed. For date, such a table might contain day of week, holiday, Julian date, and so on.
Bridge: Used to support multi-valued dimensions or complex hierarchies. Also known as a helper table or an associative table. It is the only way to implement two (or more) one-to-many relationships, or many-to-many relationships.
Hierarchy Navigation: Used to support certain kinds of complex hierarchies.
Undefined: All other tables (for example, one with a many-to-many relationship, or one that is parent to both a fact table and a dimension table). Assign this type manually to flag tables for which you have not yet determined the appropriate type.
Chapter 3
Topics
Optimizing Tasks for a Production Environment
Best Practices for Creating and Editing Tasks
Adding, Finding, Renaming, and Deleting Tasks
Common Tasks for All Objects
Datasources (Sources and Targets)
Minimize Logging
In a production environment, minimize file-based logging and logging to the repository. In TIBCO Administrator, expand the DataExchange console folder and select Configuration. Under the Log Settings tab, ensure that only Critical and Error are selected.
Color-Code Objects
Color-code your sources and targets so you can see at a glance where each source comes from and where each target goes.
Use Column Map to Perform Union on Output of Column Splitters with Different Sources
When you need to perform a union of a common subset of columns from column splitter sources with different columns, you face a chicken-and-egg obstacle: a union transformer will not allow you to connect another source unless it has the same set of columns as previously connected sources, so you can't connect the column splitters to the union transformer until you drop the columns. However, you cannot configure a column splitter transformer until after you connect it to its target.
Consequently, in the example shown next, when you try to connect the second column splitter to the union, you will get an error.
The workaround is simple: after the second column splitter, put a column map transformer, connect the column splitter to the column map, configure the column splitter, then connect the column map to the union transformer (you do not need to configure the column map).
transformer, the duplicate transformer to the flat-file writer and JDBC writer, and the flat-file writer to the scratch flat file, as shown next:
5. Deploy and run the task, then examine the scratch flat file to check the data. If it looks like the problem is farther back in the data flow, you can simply delete a few data streams, move the duplicate/writer/flat-file segment back a step, and reconnect.
When you use == to compare two Java objects, you are comparing their references (addresses) and not their contents, so this will return true only if you are comparing the object with itself. For example, if you called getValueAt() twice using the same colIndex and rowIndex and then compared the two results using ==, it would return true. To compare the contents of the objects, use the equals() method.
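A minimal illustration of the point above:

```java
public class ReferenceVsEquals {
    public static void main(String[] args) {
        // Two distinct objects with identical contents.
        String a = new String("FullName");
        String b = new String("FullName");

        System.out.println(a == b);      // false: different references
        System.out.println(a.equals(b)); // true: same contents
        System.out.println(a == a);      // true: an object compared with itself
    }
}
```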
Use Auto-Layout
If a task becomes messy and hard to read as it grows more complex, save the project, select Layout > Hierarchical Layout from the main menu, and click Yes to confirm that you want to proceed even though the operation can't be undone. This will automatically redo the layout, often making it easier to read. If you are not happy with the results, try Orthogonal Layout and Tree Layout. (The other types are sometimes useful for database models but don't make much sense for ETL data flows.) If none of the results are an improvement, close and reopen the file to restore your previous layout.
Understanding Propagation

By default, changes to a task are automatically propagated down the data stream. For example, when you add a new column in an update/delete transformer, the column is automatically added to any downstream transformers. The one exception is that DataExchange Designer always asks for confirmation before propagating changes to flat-file, XML, and JMS targets.

This propagation can damage certain transformers. For example, say your task has an update/delete transformer that concatenates LastName and FirstName columns into a FullName column, followed by a column map that drops the LastName and FirstName columns. You find that you need to trim trailing spaces from the LastName data, so you add another column map before the update/delete. With propagation on, when you delete the link between the reader and the update transformer, the column map transformer is reset. Consequently, when you link the new column map into the stream, LastName and FirstName appear in the output.

To avoid this unwanted change, turn propagation off by clicking the appropriate Propagation button, delete the link, click the button again to turn propagation back on, then create and link the new column map. With this approach, the output remains the same as before the change.
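The transformer logic in the example above (trim trailing spaces from LastName, then concatenate LastName and FirstName into FullName) amounts to something like the following; the helper names and the comma separator are assumptions for illustration, not DataExchange behavior:

```java
public class FullNameTransform {
    // Trim trailing spaces only, as the extra column map in the example does.
    static String trimTrailing(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    // Concatenate LastName and FirstName into FullName (separator assumed).
    static String fullName(String lastName, String firstName) {
        return trimTrailing(lastName) + ", " + firstName;
    }

    public static void main(String[] args) {
        System.out.println(fullName("Smith   ", "Jane")); // Smith, Jane
    }
}
```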
Add a Flat File Datasource to a Data Model

For flat files you plan to use as sources, if a sample file is available, DataExchange Designer can pick up the column names from the file. For flat files used only as targets this is unnecessary, as the writer will set the target's column names.
1. Select the Diagram Explorer's Main tab.
2. In the tree, double-click Model--Flat File (or another flat-file model).
3. Right-click in a blank area of the workspace, then select Insert Flat File.
4. Click in a blank area to add the flat file, right-click in a blank area to return to the select cursor, then double-click the new flat file.
5. Revise the datasource name to make it more descriptive (you might want to make this the same as the file name), enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
6. Select the appropriate location:
File System / Location @: LOCAL MACHINE means the flat file is on your computer (the one running DataExchange Designer); <computer name> means the flat file is on the default DataExchange server. (If the default DataExchange server is the local machine, LOCAL MACHINE is the only choice.) If the flat file is to be used only as a target, not a source, select LOCAL MACHINE only when you are deploying the task to a local instance of DataExchange server. After selecting the location, click to select the file, or type the full path and file name.
Attach a copy of this local file to the task when deploying to DataExchange server: Select this option when (1) the file will be used as a source, (2) you have selected LOCAL MACHINE as the location, and (3) you are deploying to a different machine. When you deploy the task, the file will be copied to the file-system repository on the DataExchange server machine, with the file name <task name>$$<Datasource Name> (no extension).
FTP Server: The specified server must be reachable by DataExchange server.
It does not need to be reachable by DataExchange Designer.
7. Follow the instructions in the wizard to complete the rest of the settings. A few notes on those that are not self-explanatory:
For comma-delimited files, set the text qualifier (typically double quote).
On the column specifications page, double-click any property to revise it. For flat files used only for output, there is no need to enter column specifications, as they will be provided by the writer.
If you check First row contains column names, click Setting to choose the column header format for output files. If none of the predefined styles suit your purposes, select Customize a Header Style and define your own, using the keywords NAME, TYPE, LENGTH, and/or SCALE (do not use a keyword more than once), spaces, and any desired punctuation.
For fixed-width files, set the column length. If you have a sample file at the location specified in step 6 above, the first few rows will appear in the wizard to help you adjust the settings. For example, if the sample shows that the first column is two characters too long, its length needs to be changed from 10 to 8.
To add another column, double-click in the Column field of the blank row at the bottom of the list and enter a column name. After entering the specifications, use the Up or Down buttons to reposition the column.
When debugging, or any other time it's useful, give your target flat file the following settings and end the file name with the extension .csv. Then, after DataExchange server creates or updates the file, you can double-click it to open it in Excel.

File Format: Delimited
First row contains column names: checked
Row delimiter: Windows (CR/LF)
Text delimiter: Double quote
Column delimiter: Comma
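The settings in the table above produce output like the following sketch builds; the column names and data are invented for illustration:

```java
public class CsvRowDemo {
    // Build one CSV row using the settings above: comma-delimited,
    // double-quote text delimiter, Windows (CR/LF) row delimiter.
    static String row(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            sb.append('"').append(fields[i]).append('"');
        }
        return sb.append("\r\n").toString();
    }

    public static void main(String[] args) {
        // Header row (First row contains column names) plus one data row.
        String csv = row("CustomerID", "FullName") + row("1001", "Smith, Jane");
        System.out.print(csv);
    }
}
```

Because "Smith, Jane" is enclosed in double quotes, its embedded comma is not treated as a column delimiter when Excel opens the file.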
For more information, see Flat File Datasources. For a hands-on demonstration, see Flat File Tutorial.

Edit a Flat File Datasource's Properties

1. If you are editing a task, double-click the flat file. Alternatively, select the Diagram Explorer's Main tab, expand the flat-file branch of the tree, and double-click the flat file's icon.
2. Make changes as required:
To change the file's location, select the Datasource tab.
To change the column type from delimited to fixed-length or vice versa, select the Datasource tab.
To change row, column, or text delimiters (for either type), select the Configuration tab.
To change column names or data types, or to add or delete columns, select the Datasource tab.
To add, remove, or change column headings for output files, select the Configuration tab and click Setting.
To change the truncation setting, select the Configuration tab.
To change how nulls are represented, select the Datasource tab.
For more information on the above settings, see Add a Flat File Datasource to a Data Model.
3. If you make a mistake, click Cancel to discard your changes; otherwise, when finished, click OK.
For more information, see Flat File Datasources.

Add a Flat File Datasource to a Task

1. If you have not done so already, create the flat file datasource (see Add a Flat File Datasource to a Data Model).
2. Display the task in the Diagram Window (double-click the task in the Diagram Explorer).
3. Select the Diagram Explorer's Main tab.
4. Expand the flat-file branch of the data model tree.
5. Click and drag the flat file's icon from the data model tree and drop it in the task workspace.
6. If the flat file is a source, right-click in the task workspace to the right of the flat file, then select Add Reader > Flat File Reader. If the flat file is a target, right-click in the task workspace to the left of the flat file, then select Add Writer > Flat File Writer.
7. Link the flat file and its reader or writer with a data stream (see Link Objects with Data Streams).
8. Double-click the reader or writer and configure it (see Flat File Readers and Writers).
For more information, see Flat File Datasources. For a hands-on demonstration, see Flat File Tutorial.

View a Flat File's First Several Rows in DataExchange Designer

1. Display the task in the Diagram Window (double-click the task in the Diagram Explorer).
2. Double-click the flat file's icon, then select the Configuration tab. The first few rows of the file are shown at the bottom of the dialog. (If you see nothing, check the location settings discussed in step 6 of Add a Flat File Datasource to a Data Model.)
For more information, see Flat File Datasources.
Table Datasources
A table datasource represents a table or view in an RDBMS that DataExchange server will extract data from or load data to. A table datasource may be used in multiple tasks within a project file as a source, a target, or both (one task in a chain could load data into the table, then a later task in the chain could extract data from the same table). For step-by-step instructions, see: Add a Table Datasource to a Task Use a Table Datasource Alias Edit a Table Datasource For a hands-on demonstration, see JDBC Tutorial
Add a Table Datasource to a Task

1. If you have not done so already, add a data model that includes the table (see Adding and Working With Data Models).
2. Display the task in the Diagram Window (double-click the task in the Diagram Explorer).
3. Select the Diagram Explorer's Main tab.
4. In the data model tree, expand the branch containing the table.
5. Click and drag the table's icon from the data model tree and drop it in the task workspace.
6. If the table is a source, right-click in the task workspace to the right of the table, then select Add Reader > JDBC Reader. If the table is a target, right-click in the task workspace to the left of the table, then select Add Writer > JDBC Writer. In a production environment, in certain circumstances a bulk reader or writer may give better performance (see Bulk Readers and Writers).
7. Link the table and its reader or writer with a data stream (see Link Objects with Data Streams).
8. Double-click the reader or writer and configure it (see JDBC Readers and Writers).
For more information, see Table Datasources. For a hands-on demonstration, see JDBC Tutorial.

Use a Table Datasource Alias

You may use property groups as datasource aliases. For details, see Change a Reverse-Engineered Model's Datasource Settings.

Edit a Table Datasource

With a few exceptions, the properties in DataExchange Designer's Table Editor are irrelevant to normal task development, and you should edit them only if you plan to use the Generate Database or Compare and Merge commands or the update options in the JDBC writer to create or update a database schema. (See Create a Database from a Reverse-Engineered Model (Generate Database) or Update a Model from Another Model or a Live Database (Compare and Merge).) To do the following, use transformers rather than editing table properties:
To add columns, use a column map or pivot transformer.
To change a column's name, data type, or nullability, use a column map.
To delete columns, use a column splitter.
To edit a table
1. Double-click the table in the data model tree or a task workspace.
2. Make changes as necessary:
Datasource tab: These properties should normally not be changed for individual tables, but rather set globally for the whole model (see Change a Reverse-Engineered Model's Datasource Settings).
Columns tab: In a task, you may use this tab to change the column output order. Other settings should be changed only when creating or updating a database schema.
DDL tab: Display only.
Definition tab: Has no effect except when creating or updating a database schema.
Foreign Keys tab: Display only.
Indexes tab: Change only when creating or updating a database schema.
Note tab: Any notes about the table entered here will appear in the reports generated by Tools > Intranet Dictionary Report.
Constraints tab: Change only when creating or updating a database schema.
PreSQL and PostSQL tab: When creating or updating a database schema, use this tab to specify any SQL to be executed before and after the CREATE TABLE statement displayed on the DDL tab.
Reference Values tab: Change only when creating or updating a database schema.
Attachment Bindings tab: Change only when creating or updating a database schema.
For more information, see Table Datasources.
To use a JMS stream as a TIBCO DataExchange datasource, do the following:
1. Install your JMS provider on the DataExchange server machine.
2. Add a JMS Stream Datasource to a Data Model.
3. Add a JMS Stream Datasource to a Task.
See also Edit a JMS Stream Datasource.

Add a JMS Stream Datasource to a Data Model

Before following these instructions, you must register a JMS stream datasource in DataExchange Console or in TIBCO Administrator.
1. Select the Diagram Explorer's Main tab.
2. In the data model tree, double-click Model--JMS Stream (or another JMS model).
3. Right-click in a blank area of the workspace, then select Insert JMS.
4. Click in a blank area to add the JMS datasource, right-click in a blank area to return to the select cursor, then double-click the new JMS datasource.
5. Revise the datasource name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
6. Select the JMS type, then click Next. Binary and Object types are not supported. If a JMS reader receives a message of a different type than the one set here, it ignores the message.
7. Set the JMS Source Name to match one registered in DataExchange Console and the destination settings as appropriate for your JMS provider, then click Next. For JMS messages of type Map or Stream, values cannot be null. By default, the JMS Writer replaces nulls with the following values: boolean, false; integer / short / long / float / double / decimal, 0; date / time / timestamp, current date / time; binary, 0x00. If you need to set nulls to a different value, set them in a column map inserted before the JMS writer (see Add a column map to a task). 8. If necessary, adjust the JMS stream options, then click Next. JMS Rowset Wait Time is the maximum number of seconds the JMS reader will wait before sending the next rowset to its output. A setting of 0 (rarely useful) means the reader will wait until it accumulates the number of rows set by DataExchange server's autotune routine or the CHUNKSIZE runtime property. There is no setting for Acknowledge Mode, as only AUTO is supported. 9. If the JMS datasource will be used as a source, define columns to match the source stream (see step 7 of Add a Flat File to a Data Model), then click Finish. (There is no need to define columns for a JMS datasource to be used only as a target, as it will pick up those settings from its writer.) For more information, see JMS Stream Datasources.
Edit a JMS Stream Datasource
1. If you are editing a task, double-click the JMS source. Alternatively, select the Diagram Explorer's Main tab, expand the JMS stream branch of the tree, and double-click the datasource's icon. 2. Make changes as required. For information on the settings, see Add a JMS Stream Datasource to a Data Model. 3. If you make a mistake, click Cancel to lose your changes; otherwise, when finished, click OK. For more information, see JMS Stream Datasources.
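The default null replacements described in step 7 can be summarized as a lookup by data type. The sketch below mirrors the documented defaults (false, 0, current date/time, 0x00); the function name and type strings are hypothetical, chosen only for illustration.

```javascript
// Sketch of the JMS Writer's documented default null replacements
// for Map/Stream messages. Names are hypothetical, defaults are
// those listed in step 7 above.
function replaceNull(value, type) {
  if (value !== null && value !== undefined) return value;
  switch (type) {
    case 'boolean':
      return false;                 // boolean null -> false
    case 'integer': case 'short': case 'long':
    case 'float': case 'double': case 'decimal':
      return 0;                     // numeric null -> 0
    case 'date': case 'time': case 'timestamp':
      return new Date();            // temporal null -> current date/time
    case 'binary':
      return 0x00;                  // binary null -> 0x00
    default:
      return value;
  }
}
```

To substitute different replacement values, the documentation directs you to a column map inserted before the JMS writer rather than changing these defaults.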
Add a JMS Stream Datasource to a Task
For a task containing a JMS stream datasource to execute successfully, the JMS provider must be installed and running on the DataExchange server machine when you run the task.
TIBCO DataExchange Designer User's Guide
1. If you have not done so already, create the JMS datasource (see Add a JMS Stream Datasource to a Data Model). 2. Display the task in the Diagram Window (double-click the task in the Diagram Explorer). 3. Select the Diagram Explorer's Main tab. 4. Expand the JMS stream branch of the data model tree. 5. Click and drag the JMS datasource's icon from the data model tree and drop it in the task workspace. 6. If the JMS stream is a source, right-click in the task workspace to the right of the JMS datasource, then select Add Reader > JMS Reader. If the JMS stream is a target, right-click in the task workspace to the left of the JMS datasource, then select Add Writer > JMS Writer. 7. Link the JMS datasource and its reader or writer with a data stream (see Link Objects with Data Streams). 8. Double-click the reader or writer and configure it (see JMS Readers and Writers). For more information, see JMS Stream Datasources.
2. In the data model tree, double-click Model--XML Files (or another XML model). 3. Right-click in a blank area of the workspace, then select Insert XML File. 4. Click in a blank area to add the XML file, right-click in a blank area to return to the select cursor, then double-click the new XML file. 5. Revise the datasource name to make it more descriptive (you might want to make this the same as the file name), enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.) 6. Specify the name and location of the XML file relative to DataExchange server. If you select File System Location and Name, specify the full path to and name of the XML file, relative to DataExchange server. If DataExchange server is running on the same computer as DataExchange Designer, or there is a copy of the XML file at the same path on both machines, you may click the browse button to browse to and select the file. If you select FTP Server, click Configure and set the FTP parameters relative to DataExchange server. The specified server must be reachable by DataExchange server; it does not need to be reachable by DataExchange Designer. 7. Select DTD file or XML schema file, click the browse button, browse to and select the DTD or schema file, then click Next. DataExchange Designer includes the information from the DTD or schema in the deployed task. It is not necessary to put a copy of the DTD or XML schema file on the DataExchange server machine.
8. If the DTD contains multiple root elements, select the correct one from the Root Element list. An XML datasource can extract data only from entities that share the same root element. To extract data from entities that are in the same file but have different root elements, you must create multiple datasources for the file. 9. If you wish to limit how much of the selected root element's hierarchy is shown, click the appropriate button and select an entity. 10. Select an XML attribute (@) or text child (C) and click Add. You cannot select an XML entity (E) directly; instead, select its Value$ attribute, and set the column name to match the entity name. 11. Set Column Name to something appropriate. For example, a reasonable choice for the column mapped to a selected Value$ text child might be EmployeeName. 12. If necessary, correct the data type and associated options. To change the format of DATE, DATETIME, or TIME values, double-click in the Column Format field and enter a new format (see Date-Time Format). 13. To filter the data, click Occurrence. Records that do not match the condition will not be extracted (similar to a WHERE clause in a SQL statement). 14. To add another column, repeat from step 10. Otherwise, click Next, then click Finish. For more information, see XML File Datasources.
Edit an XML File Datasource
1. If you are editing a task, double-click the XML datasource. Alternatively, select the Diagram Explorer's Main tab, expand the XML branch of the tree, and double-click the XML datasource's icon. 2. Make changes as required: To change the XML, DTD, or XML schema file name or location, select the Datasource tab. To change what data is extracted or the format of that data, select the Columns tab. For more information on the above settings, see Add an XML File Datasource to a Data Model. 3. If you make a mistake, click Cancel to lose your changes; otherwise, when finished, click OK, then if prompted click Yes to regenerate the XML.
For more information, see XML File Datasources.
Add an XML File Datasource to a Task 1. If you have not done so already, create the XML datasource (see Add an XML File Datasource to a Data Model). 2. Display the task in the Diagram Window (double-click the task in the Diagram Explorer). 3. Select the Diagram Explorer's Main tab. 4. Expand the XML branch of the data model tree. 5. Click and drag the XML file's icon from the data model tree and drop it in the task workspace. 6. If the XML file is a source, right-click in the task workspace to the right of the XML file, then select Add Reader > XML Reader. If the XML file is a target, right-click in the task workspace to the left of the XML file, then select Add Writer > XML Writer. 7. Link the XML file and its reader or writer with a data stream (see Link Objects with Data Streams). 8. Double-click the reader or writer and configure it (see Flat File Readers and Writers). For more information, see XML File Datasources.
datasource, you would typically alias the server and database name; for a flat-file table datasource, you would alias the file name. See the TIBCO DataExchange Administrator's Guide for details on the datasource settings that may be provided at runtime for each datasource type. If a property group alias is specified at the reader or writer level, it will override the setting at the datasource level. A reader/writer-level property group alias must be specified after the model-level alias or it will be overwritten. When you deploy the task, the property group specified in Property Group Alias must exist on DataExchange server. At runtime, that property group (or whatever other property group you specify at runtime) must provide any datasource settings left blank in the task, or they must be set manually in TIBCO Administrator; otherwise, execution will fail.
Replace a source with another source
1. Click the appropriate Propagation button to disable propagation (next to the zoom control on the Application toolbar; see Delete an Object From a Task for discussion). 2. Select the source, press the Delete key, and click Yes to confirm deletion. The data stream linking the source to its reader is also deleted. 3. Add the new source to the task (see Add a Flat File Datasource to a Task, Add a Table Datasource to a Task, Add a JMS Stream Datasource to a Task, or Add an XML File Datasource to a Task). 4. If the new source is of a different type, delete the old reader and add the appropriate reader for the new type. 5. Link the source to the reader. If you replaced the reader, recreate that link as well. 6. Click the appropriate Propagation button again to re-enable propagation.
Replace a target with another target
1. Select the target, press the Delete key, and click Yes to confirm deletion. The data stream linking the writer to the deleted target is also deleted. 2. Add the new target to the task (see Add a Flat File Datasource to a Task, Add a Table Datasource to a Task, Add a JMS Stream Datasource to a Task, or Add an XML File Datasource to a Task). 3. If the new target is of a different type, delete the old writer and add the appropriate writer for the new type.
4. Link the writer to the target. If you replaced the writer, recreate that link as well.
Flat File Readers and Writers
JDBC Readers and Writers
JMS Readers and Writers
Null Transformers
Bulk writer vs. JDBC writer: The relative performance of JDBC and bulk writers varies depending on the platform. The following are generalizations; feel free to experiment and see what gives you the best performance. Oracle: A bulk writer will often perform no better than a JDBC writer using a native connection (at least when using Oracle's current JDBC drivers). SQL Server: A bulk writer generally will perform better than a JDBC writer. DataExchange server can access SQL Server's bulk loader in native mode (i.e., using APIs to communicate with SQL Server directly rather than going through Microsoft's bulk utility), which gives better performance.
Notes on the properties that are not self-explanatory: Name and Description: see Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition. Use native mode Bulk component for SQL Server: If you are using SQL Server, check this. It will give better performance than going through SQL Server's bulk utility. Data File (What is the name and location of the file containing the exported data?): The location of the temporary file for use by the RDBMS's bulk utility. In a reader, the source RDBMS's bulk utility writes to this file and
DataExchange server reads from it. In a writer, DataExchange server writes to this file and the target RDBMS's bulk utility reads from it. When running in back-to-back mode, reader and writer must be set to the same file. Bulk Utility: When you use a bulk reader or writer, DataExchange server uses the RDBMS's bulk utility. This setting specifies the path and name of that utility. On Windows platforms, DB2's utility is db2.exe, SQL Server's and Sybase ASE's utility is bcp.exe, and Oracle's is sqlldr.exe. If Use native mode Bulk component for SQL Server is checked, this setting is disabled. Run the component in back to back mode: Use this option to copy all the data from one table to an identical table on the same platform. Create a simple task with the source table, the target table, a bulk reader, and a bulk writer. Check this option in both reader and writer. If using SQL Server, check Use native mode Bulk component for SQL Server in both reader and writer; if using DB2 or Sybase ASE, set Data File to the same file in both reader and writer. Transformations are not supported in back-to-back mode. Oracle does not support back-to-back mode, since its bulk utility is write-only.
Datasource settings are normally set globally for the whole model (see Change a Reverse-Engineered Model's Datasource Settings). Modify the Server, Database, Owner, User, Password, Port, or Property Group Alias (see Use Datasource Aliasing) settings only if you are sure you know what you are doing. All changes you make to these settings will be overwritten if any of the model's datasource settings are changed. If a property group alias is specified at the model level, the only one of these settings that will have any effect is Property Group Alias (which must be specified after the model-level alias or it will be overwritten). RowID Column (reader only): Optionally, DataExchange server will add additional columns and populate them with sequential ID numbers. Load Mode (writer only): Enabled only for DB2 and Oracle. Make sure that the load mode corresponds to the current settings of the target table. For example, if you select Truncate, your target table must be empty. Note that (unlike with the JDBC writer) regardless of which load mode is selected, should the task fail before the bulk writer executes, the target table will be unchanged.
Use parallel writing process (writer only): If checked, DataExchange server will run multiple simultaneous transactions over the same connection. This may improve performance if the database server's CPU or other resources are underutilized. Note, however, that if the write mode is Update or Delete, checking this option may result in contention problems. Ignore invalid rows for output (writer only): If checked, invalid rows (as defined by the isError() function in a column map transformer) are dropped from the output; if unchecked, invalid rows are included in the output. Typically you will also check Enable logging of error records to capture any dropped rows. Enable logging of error records (writer only): If checked, invalid rows are captured to the log file install-path\tibco\dx\5.3\logs\task\task name_execution ID_writer name_ErrorRecords.log, regardless of the Ignore invalid rows for output setting. Enable explicit insertion of values into SQL Server Identity columns (writer with SQL Server target only): If checked, values from the input column mapped to the target's identity column will be written to the identity column. If unchecked, SQL Server will generate the values for the identity column itself. Pass-Through Parameters: Specifies parameters (command-line switches) to be passed to the RDBMS's bulk writer utility. Enter the switch (e.g. -c or /s) in Parameter Name and the value in Parameter Value. If Use native mode Bulk component for SQL Server is checked, this option is disabled.
Ignore invalid rows for output (writer only): If checked, invalid rows (as defined by the isError() function in a column map transformer) are dropped from the output; if unchecked, invalid rows are included in the output. Typically you will also check Enable logging of error records to capture any dropped rows. Enable logging of error records (writer only): If checked, invalid rows are captured to the log file install-path\tibco\dx\5.3\logs\task\task name_execution ID_writer name_ErrorRecords.log, regardless of the Ignore invalid rows for output setting. Null Values Replacement (writer only): With the default setting, Replace with NULL, the DataExchange server replaces null values (as defined on the Options tab of the datasource properties) with explicit NULLs. Alternatively, select Replace with and specify replacement values for the various data types (see Date-Time Format). Output Columns: By default, all columns are selected. Double-click a column to move it from Output Columns (included in output) to Source Columns (not included in output) or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns.
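The interaction of the Ignore invalid rows for output and Enable logging of error records settings described above can be sketched as follows. This is an illustrative model only; the function name, option names, and the isError flag on each row are hypothetical stand-ins for the writer's internal behavior.

```javascript
// Sketch of how a writer treats invalid rows under the two settings above.
// ignoreInvalid ~ "Ignore invalid rows for output"
// logErrors     ~ "Enable logging of error records"
function writeRows(rows, opts) {
  const output = [];
  const errorLog = [];
  for (const row of rows) {
    if (row.isError) {
      // Error records are logged regardless of the ignore setting.
      if (opts.logErrors) errorLog.push(row);
      // Only the ignore setting controls whether the row reaches the output.
      if (opts.ignoreInvalid) continue;
    }
    output.push(row);
  }
  return { output, errorLog };
}
```

The key point the sketch captures: logging and dropping are independent; checking only Enable logging of error records captures invalid rows while still writing them to the target.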
3. Link the source(s) and the reader with data stream(s) (see Link Objects with Data Streams). 4. Double-click the reader. 5. Edit the Name and Description (see Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition), then click Next. 6. The Datasource settings are normally set globally for the whole model (see Change a Reverse-Engineered Model's Datasource Settings). The JDBC Configuration - Advanced Configuration dialog (not available at the model level) includes a drop-down list showing available JDBC drivers for this platform, and indicates whether the currently selected driver was bundled with TIBCO DataExchange and whether it has been certified. If the source table is in a Teradata Warehouse 7 (V2R5) database and the reader uses a JDBC type 3 driver, click Advanced Configuration and edit the URL string to replace <gatewayname> with the appropriate IP address or network name. For example, if the URL were jdbc:teradata://<gatewayname>:7060/DemoTDAT and the JDBC gateway were 10.0.0.40, you would change the URL to jdbc:teradata://10.0.0.40:7060/DemoTDAT. Modify the Server, Database, User, Password, Port, Property Group Alias (see Use Datasource Aliasing), or Advanced Configuration settings only if you are sure you know what you are doing. All changes you make to these settings will be overwritten if any of the model's datasource settings are changed. If a property group alias is specified at the model level, the only one of these settings that will have any effect is Property Group Alias (which must be specified after the model-level alias or it will be overwritten). When finished making any necessary changes to the datasource settings, click Next. 7. Table Alias: If you wish, double-click in the Alias column to define table aliases. This can make it easier to read the automatically generated SQL script or to write a custom SQL script. When finished, click Next. 8.
Columns: Configure the output columns: By default, all columns are selected for output. Double-click a column to move it from Selected Columns (included in output) to Available Columns (not included in output) or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. Include only those columns used by the task in the Selected Columns list. Including unused columns and dropping them later in the task degrades performance. Column Alias: Renames the column within the task. Double-click in the Alias column of the Selected Columns list to add, edit, or delete. Use custom SQL: Check this to disable automatic SQL generation and paste your own script into the SQL tab instead. If you use this option, you may have to define the output columns manually with New Column / Edit Column. Include stored procedure: Check this if your custom SQL calls stored procedures. New Column: Adds a new output column. If Use custom SQL is checked, leave Column Expression blank; if it is unchecked, specify a Column Expression to calculate the values. For example:
(CONVERT(money,("Order Details".UnitPrice*Quantity*(1-Discount)/100)*100))
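This expression computes an extended price rounded to two decimal places: dividing by 100, converting to SQL Server's four-decimal money type (which rounds), then multiplying by 100 leaves the value rounded to cents. A JavaScript sketch of the equivalent arithmetic (function name hypothetical, for illustration only):

```javascript
// Equivalent of CONVERT(money, (UnitPrice*Quantity*(1-Discount)/100)*100):
// the divide/convert/multiply dance in SQL amounts to rounding the
// extended price to two decimal places.
function extendedPrice(unitPrice, quantity, discount) {
  return Math.round(unitPrice * quantity * (1 - discount) * 100) / 100;
}
```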
To add a column containing automatically generated IDs, use RowID Column (the last step in the Wizard), not New Column. When finished configuring columns, click Next. 9. WHERE: Specify a WHERE clause that will extract the rows required by the task. Extracting unneeded rows and filtering them out later in the task degrades performance, so the more precise you can make the WHERE clause, the better. When finished, click Next. If the reader has multiple source tables, define the required output in this WHERE clause using JOIN, GROUP BY, or ORDER BY. Otherwise (assuming the database is SQL92-compliant), the reader defaults to a Cartesian join. 10. If you checked Use custom SQL, enter your custom SQL statement here, then click Next. If you did not check Use custom SQL, read through the automatically generated SQL statement to make sure it will extract the desired data. If it appears wrong, click Back to correct the settings; otherwise, click Next.
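The Cartesian-join default mentioned in step 9 matters for performance: a reader over multiple source tables with no join condition in the WHERE clause emits every pairing of rows, so two 1,000-row tables would produce 1,000,000 output rows. A sketch of what the default produces (function name hypothetical):

```javascript
// Sketch: a multi-table reader with no join condition defaults to a
// Cartesian join, pairing every left row with every right row.
function cartesianJoin(left, right) {
  const out = [];
  for (const l of left) {
    for (const r of right) {
      out.push({ ...l, ...r }); // one output row per pairing
    }
  }
  return out; // out.length === left.length * right.length
}
```

This is why the text recommends making the WHERE clause as precise as possible: join conditions and filters applied at the reader prevent this row explosion from ever reaching the rest of the task.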
11. PreSQL: If you need to run some custom SQL before the SQL statement in the previous step is executed, check Run session-specific SQL scripts before reading and compose or paste the SQL in the field. When finished, click Next. 12. PostSQL: If you need to run some custom SQL after the SQL statement in the previous step is executed, check Run session-specific SQL scripts after reading and compose or paste the SQL in the field. 13. Check Ignore Pre-SQL and Post-SQL errors if you do not want the task to abort when errors are encountered in processing the PreSQL or PostSQL scripts. Then click Next. 14. If you wish, define one or more row ID columns (columns that DataExchange server will add to the output data stream and populate with sequential ID numbers). Then click Finish. For more general information, see JDBC Readers and Writers. For a hands-on demonstration, see JDBC Tutorial.
Add a JDBC Writer to a Task
1. Add the target table as described in Add a Table Datasource to a Task. 2. Right-click in the task workspace to the left of the table, then select Add Writer > JDBC Writer. 3. Link the writer and the target with a data stream (see Link Objects with Data Streams). 4. Double-click the writer. 5. Edit the Name and Description (see Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition), then click Next. 6. The Datasource settings are normally set globally for the whole model (see Change a Reverse-Engineered Model's Datasource Settings). The JDBC Configuration - Advanced Configuration dialog (not available at the model level) includes a drop-down list showing available JDBC drivers for this platform, and indicates whether the currently selected driver was bundled with TIBCO DataExchange and whether it has been certified.
If the target table is in a Teradata Warehouse 7 (V2R5) database and the writer uses a JDBC type 3 driver, click Advanced Configuration and edit the URL string to replace <gatewayname> with the appropriate IP address or network name. For example, if the URL were jdbc:teradata://<gatewayname>:7060/DemoTDAT and the gateway were 10.0.0.40, you would change the URL to jdbc:teradata://10.0.0.40:7060/DemoTDAT.
Modify the Server, Database, User, Password, Port, Property Group Alias (see Use Datasource Aliasing), or Advanced Configuration settings only if you are sure you know what you are doing. All changes you make to these settings will be overwritten if any of the model's datasource settings are changed. If a property group alias is specified at the model level, the only one of these settings that will have any effect is Property Group Alias (which must be specified after the model-level alias or it will be overwritten). When finished making any necessary changes to the datasource settings, click Next. 7. The Owner setting is also normally set globally for the whole model. Unless you are sure you know what you are doing, leave it unchanged and click Next. 8. PreSQL: If you need to run some custom SQL before the data is loaded into the target database, check Run session-specific SQL scripts before loading and compose or paste the SQL in the field. When finished, click Next. 9. PostSQL: If you need to run some custom SQL after the data is loaded, check Run session-specific SQL scripts after loading and compose or paste the SQL in the field. 10. Check Ignore Pre-SQL and Post-SQL errors if you do not want the task to abort when errors are encountered in processing the PreSQL or PostSQL scripts. Then click Next. 11. Configuration: Specify the details of the write operation: Use parallel writing process: If checked, DataExchange server will run multiple simultaneous transactions over the same connection. This may improve performance if the database server's CPU or other resources are underutilized. Note, however, that when the write mode is Update or Delete, checking this option may result in contention problems. Abort task with error threshold value: If checked, the task is aborted after the specified number of errors. To define error conditions, use the isError() function in a column map transformer.
Ignore invalid rows for output (writer only): If checked, invalid rows (as defined by the isError() function in a column map transformer) are dropped from the output; if unchecked, invalid rows are included in the
output. Typically you will also check Enable logging of error records to capture any dropped rows. Enable logging of error records (writer only): If checked, invalid rows are captured to the log file install-path\tibco\dx\5.3\logs\task\task name_execution ID_writer name_ErrorRecords.log, regardless of the Ignore invalid rows for output setting. Enable explicit insertion of values into SQL Server Identity columns (writer with SQL Server target only): If checked, values from the input column mapped to the target's identity column will be written to the identity column. 12. Mode Operation and Option: Select the operation to be performed: Insert, Update, Delete, or User defined SQL. If you select Insert, select the type of insert. If you have modified datasource table or column properties in DataExchange Designer and want to propagate them to the target database, select Drop and Re-Create Table, then Insert. If you have added a new table from scratch and want to propagate it to the target database, select Create Table, Then Insert. If you select Create Table, Then Insert, after running the task successfully once you must edit the writer and select a different option. If you select Update, select the type of update. Column Specifier should normally be left at the default ?; one rare exception is when one or more column names contain a question mark. 13. If in the previous step you selected Insert, click Next. Otherwise, specify the update or delete condition, or enter the user-defined SQL, then click Next. 14. Mapping: Select Source to Target if you want to swap the position of the columns. Then click Next. 15. End Transaction: If appropriate, define a condition (for example, ROWID==0) that will signal DataExchange server to end the current transaction and start a new one. The row containing the end condition is included in the current transaction. The final transaction is committed even if the last row does not meet the end condition. 16.
SQL Script: If the generated SQL looks good, click Finish. For more general information, see JDBC Readers and Writers. For a hands-on demonstration, see JDBC Tutorial.
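The End Transaction behavior in step 15 can be sketched as partitioning the row stream into transactions: a row matching the end condition closes out (and is included in) the current transaction, and the final transaction is committed even if its last row never matched. The function and row shapes below are hypothetical illustrations, not DataExchange APIs.

```javascript
// Sketch of step 15's End Transaction semantics: split a row stream
// into transactions using an end condition such as row.ROWID === 0.
function splitTransactions(rows, endCondition) {
  const transactions = [];
  let current = [];
  for (const row of rows) {
    current.push(row); // the row containing the end condition is included
    if (endCondition(row)) {
      transactions.push(current); // end current transaction, start a new one
      current = [];
    }
  }
  // The final transaction is committed even if its last row
  // does not meet the end condition.
  if (current.length > 0) transactions.push(current);
  return transactions;
}
```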
[Figure: JMS wait-time timeline. Data received for 15 sec.; four 10-sec. wait-time intervals; a 35-second timeout; connection active for 40 sec.]
Property Group Alias: see Use Datasource Aliasing. RowID Column (reader only): Optionally, DataExchange server will add additional columns and populate them with sequential ID numbers.
Null Transformers
A null transformer is a dummy reader or writer used when a transformer requires an input or output that has no function in the task. For example, you might use a null transformer to allow creation of a column map transformer that runs a script to set global variables.
When used in place of a reader, a null transformer has an output and no input; when used in place of a writer, it has an input and no output. Its only editable properties are name and description (see Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition).
Transformers
Transformers are the core of TIBCO DataExchange tasks. Each transformer in a task takes the output of one or more readers and/or transformers, performs a specific transformation on the data, and passes the transformed data to the input of one or more transformers and/or writers. As detailed in the following table, TIBCO DataExchange includes dedicated transformers for common transformations such as joins and pivots. In addition, the general-purpose Column Map transformer can, using the bundled JavaScript methods or a custom JavaScript or Java program, perform virtually any conceivable transformation.
Transformer: Use
Column Splitter Transformer: Multiple uses: 1. Route columns from a single source to multiple targets. 2. Drop columns.
Column Map Transformer: Multiple uses: 1. Process data using bundled JavaScript methods. 2. Run a custom JavaScript or Java program. 3. Change output column order. 4. Add columns. 5. Change column properties such as name and data type. 6. Change null values. 7. Add index values.
Duplicate Transformer: Split an input data stream into two or more identical data streams.
Duplicate Elimination Transformer: Remove duplicate rows.
Group By Transformer: Perform aggregate function(s).
Join Transformer: Perform SQL join(s) on multiple sources.
Null Transformer: Acts as a dummy reader or writer. See Null Transformers.
Pivot Transformer: Consolidate values from several columns into one new column and track the source in another.
Row Splitter Transformer: Route rows from a single source to multiple destinations.
Sort Transformer: Sort rows.
Temp Table Transformer: Store the output of one transformer on disk, then pass it on to a second transformer.
Union Transformer: Merge multiple sources that have identical columns.
Update/Delete Transformer: Multiple uses: 1. Update values (UPDATE ... SET ... WHERE ...). New values may be generated by concatenating or performing mathematical operations on input values. 2. Delete rows that match specified criteria (DELETE ... WHERE ...).
For additional information: For discussion of transformers' role in TIBCO DataExchange, see Development Process on page 6.
Data Block Size: Specifies the size (in KB) of the chunks in which DataExchange server reads and writes disk data. For best performance, set to the block size of the disk volume of %DT_HOME%.
Both may be set at runtime. See the TIBCO DataExchange Administrator's Guide for details.
The JavaScript and Java options make the column map transformer a general-purpose tool capable of performing virtually any data-transformation task for which no specialized transformer is provided. See Column Map JavaScript Tutorial on page 164 for details. Notes: Drop unneeded columns with a column splitter before the column map. Delete a column in a column map only if it is no longer needed after it has been used by the JavaScript expression, or if you want to use it as a source column for another output column. For best performance, do not use a column map if a more specialized transformer can handle the job. For example, to concatenate fields, use an update/delete transformer.
The input and output column numbers in the Columns tab indicate only output order, not mappings. To see which columns have been renamed, added, or deleted, look at the Script tab, under "Rename Input Column", "Add new columns", and "Delete columns".
For step-by-step instructions, see Add a column map to a task or Edit a column map. For a hands-on demonstration, see Column Map Tutorial.
Add a column map to a task
1. Right-click in the diagram workspace and select Add Column Map. 2. Create data streams linking the input reader or transformer to the column map and the column map to the output transformer or writer. (See Link Objects with Data Streams.) 3. Double-click the column map. 4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.) 5. Define one or more transformations: To rename a column, change its data type or options for a specific type, or apply values, double-click an output column to open the Column Editor. The Source Column drop-down list will offer a choice only if one or more of the input columns is currently unmapped. To replace null values, double-click an output column to open the Column Editor, check Replace Null Value, enter the new null value, and click OK. (Note that for flat file sources, the null value to be replaced is set in the datasource's Options tab.) To add a new output column, click Add, enter the column name, data type, and any other appropriate settings, and click OK. When you add a decimal or numeric column, Scale defaults to zero. If that is not appropriate for your purposes, be sure to change it. To drop a column from the output (see notes above), select an output column and click Delete. To change output column order, select an output column and click Up or Down. To transform the data using JavaScript or a custom Java program, click Edit Expr. (There is only one expression in the column map, so it makes no difference
TIBCO DataExchange Designer Users Guide
Transformers 85
whether a column is selected.) See Column Map - JavaScript Tutorial on page 164 for details. When finished, click Next., then click Yes to generate the script. 6. Optionally, set a Script Timeout value (a number of minutes after which the column map will fail if the script has not completed). 7. Click Validate. If the script validates successfully, click Next. 8. If in the Expression Editor you called a function with custom properties, enter the values for those properties. Optionally, define common properties or supply external data (see Define Common Properties or Supply External Data for details). 9. Click Finish to close the Wizard. For more information, see Column Map Transformer. For a hands-on demonstration, see Column Map Tutorial. Edit a column map 1. Double-click the column map. 2. If you wish, revise the name and/or description. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.) 3. To revise the existing transformations or define additional ones, select the Columns tab, then: To rename a column, change its data type or options for a specific type, or apply values, double-click an output column to open the Column Editor. To add a new output column, click Add, enter the column name, data type, and any other appropriate settings, and click OK. To drop a column from the output (see notes above), select an output column and click Delete. To change output column order, select an output column and click Up or Down. To transform the data using JavaScript or a custom Java program, click Edit Expr. (There is only one expression in the column map, so it makes no difference whether a column is selected.) See the TIBCO DataExchange Column Map JavaScript Tutorial for instructions on using the Expression Editor. 4. When finished, select the Script tab, then click Yes to regenerate the script. 5. Click Validate. If the script validates successfully, click OK to dismiss the validation message.
86
| Chapter 3
6. To revise custom properties, common properties, or external data, select the Custom Properties tab. See Define Common Properties or Supply External Data for more information.
7. When through making changes, click OK to close the editor.
For more information, see Column Map Transformer.

Define Common Properties or Supply External Data
The Custom Properties tab in a column map transformer allows you to define values, property-value pairs, and datasets for use by the transformer's script.

Define a Common Property
1. In the Custom Properties tree, select Common Properties.
2. Click in the blank cell at the bottom of the Property Name column and type a name for the new property.
3. Click in the adjacent Data Type cell and select the appropriate data type. Types marked with an asterisk are encrypted.
4. Click in the adjacent Value cell and type or select the value.
To call a common property in your JavaScript, assign it to a variable using the following syntax:
var <property name> = DTComponent.getCommonProperties().get("<property name>");
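For example, a script might read a common property named batchSize. The sketch below shows the documented call pattern; the property name is hypothetical, and a minimal stand-in for the DTComponent object (which DataExchange supplies only inside the transformer's script environment) is included so the pattern can be run on its own.

```javascript
// Minimal stand-in for the DTComponent object that DataExchange exposes
// to column map scripts. Only the call pattern matters; the property
// name "batchSize" and its value are illustrative.
var DTComponent = {
  getCommonProperties: function () {
    var props = { batchSize: "500" }; // values as set in the Value cell
    return { get: function (name) { return props[name]; } };
  }
};

// The documented call pattern: assign the common property to a variable.
var batchSize = DTComponent.getCommonProperties().get("batchSize");
```

Inside the product, only the last line is needed; the stand-in above exists solely so the snippet is self-contained.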
Supply External Data
1. In the Custom Properties tree, select External Data.
2. Click in the blank cell at the bottom of the Dataset Name column and type a name for the dataset.
3. Click in the adjacent Value cell to open the Dataset Editor.
4. Click the button, navigate to and double-click the data file, and click OK.
For more information, see Column Map Transformer.
Column Splitter Transformer
Column splitters are also used to drop columns from a data flow after the transformations that involve those columns have been performed. In that context you could also use a column map, but a column splitter makes it much easier to restore a column to the flow when you make a mistake or change your mind.

Notes
- To route all the columns from a single source to multiple targets, use a Duplicate Transformer.
- If a column is not required by any of the steps in a data flow, remove it from the selected columns in the reader.
For additional information
For step-by-step instructions, see Add a column splitter to a task or Edit a column splitter. For a hands-on demonstration, see Column Splitter Tutorial.

Add a column splitter to a task
1. Add the source, target(s), and associated reader and writer(s).
2. Right-click in the diagram workspace and select Add Column Splitter.
3. Create data streams linking the input reader or transformer to the column splitter and the column splitter to the output transformer(s) or writer(s). (See Link Objects with Data Streams.)
4. Double-click the column splitter.
5. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
6. Double-click one of the writers in the Available Output list to open the Column Split Editor.
7. Double-click a column to move it from Selected Columns (included in output) to Available Columns (not included in output) or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. When finished, click OK.
8. Repeat for each writer in the Available Output list. When done, click Finish.
For more information, see Column Splitter Transformer. For a hands-on demonstration, see Column Splitter Tutorial.

Edit a column splitter
1. Double-click the column splitter.
2. If you wish, revise the name and/or description. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
3. To revise the output columns, select the Columns tab, then double-click one of the writers in the Available Output list to open the Column Split Editor.
4. Double-click a column to move it from Selected Columns (included in output) to Available Columns (not included in output) or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. When finished, click OK.
5. Repeat for each writer in the Available Output list you wish to modify. When done, click Finish.
For more information, see Column Splitter Transformer.
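The splitting behavior itself can be pictured as a simple projection: each output writer receives only the columns selected for it. The sketch below is illustrative only (the column names are hypothetical), not the product's implementation.

```javascript
// Sketch of column splitter semantics: each output receives only its
// selected columns. Column names (id, name, ssn) are hypothetical.
function splitColumns(rows, selectedColumns) {
  return rows.map(function (row) {
    var out = {};
    selectedColumns.forEach(function (col) { out[col] = row[col]; });
    return out;
  });
}

var input = [{ id: 1, name: "Ann", ssn: "123-45-6789" }];
// Writer A keeps id and name; writer B keeps only id.
var writerA = splitColumns(input, ["id", "name"]);
var writerB = splitColumns(input, ["id"]);
```

Restoring a column is just a matter of moving it back into the selected list, which is why a splitter is easier to undo than a column map script.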
Duplicate Transformer
A duplicate transformer simply duplicates its input data stream to two or more output data streams. There are no properties to set except the name and description (see Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition). For a hands-on demonstration, see Duplicate Transformer Tutorial.

To add a duplicate transformer to a task:
1. Right-click in the diagram workspace and select Add Duplicate.
2. Create data streams linking the input reader or transformer to the duplicate transformer and the duplicate transformer to the output transformer(s) and/or writer(s). (See Link Objects with Data Streams.)
3. Double-click the duplicate transformer.
4. Revise the name to make it more descriptive, enter a description, then click Finish. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
Duplicate Elimination Transformer
For additional information
For step-by-step instructions, see Add a duplicate elimination transformer to a task or Edit a duplicate elimination transformer. For a hands-on demonstration, see Duplicate Elimination Tutorial.

Add a duplicate elimination transformer to a task
1. Right-click in the diagram workspace and select Add Duplicate Elimination.
2. Create data streams linking the input reader or transformer to the duplicate elimination transformer and the duplicate elimination transformer to the output transformer or writer. (See Link Objects with Data Streams.)
3. Double-click the duplicate elimination transformer.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. Double-click a column to move it from Available Columns (not used in the duplicate search criteria) to Search Columns or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. If a row's values in all the search columns match those of a previous row, it is treated as a duplicate. To eliminate only exact duplicates, include all columns in your criteria. Input order affects output: for each set of rows having the same values in the search columns, the first row is passed to the primary output, and the rest of the rows are purged as duplicates.
6. Clear the Ignore Case checkbox for any columns for which you want evaluation to be case-sensitive.
7. Click Next to skip specifying the filter condition. (In rare circumstances, you might wish to specify criteria for rows to be deleted prior to the duplicate elimination transformation. For example, this would be appropriate if some rows had been flagged as duplicates in a previous transformation. For a key to the operator symbols, see Add an update/delete transformer to a task.)
8. If necessary, correct the settings for the target and duplicate data flows, memory, and/or data block size (see Memory and Data Block Size Settings), then click Finish.
For more information, see Duplicate Elimination Transformer. For a hands-on demonstration, see Duplicate Elimination Tutorial.

Edit a duplicate elimination transformer
See Add a duplicate elimination transformer to a task for discussion of filter conditions (Filter tab).
1. Double-click the duplicate elimination transformer.
2. To change the duplicate elimination criteria, select the Search Key tab. Double-click a column to move it from Available Columns (not used in the duplicate search criteria) to Search Columns or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. If a row's values in all the search columns match those of a previous row, it is treated as a duplicate. Clear the Ignore Case checkbox for any columns for which you want evaluation to be case-sensitive.
3. If necessary, select the Configuration tab to correct the settings for the target and duplicate data flows, memory, and/or data block size (see Memory and Data Block Size Settings).
4. Click Finish.
For more information, see Duplicate Elimination Transformer.
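The search-key semantics described above can be sketched in a few lines: the first row in each matching set goes to the primary output and later matches are purged as duplicates. This is an illustrative simulation (the column name and data are hypothetical), not the transformer's implementation.

```javascript
// Sketch of duplicate elimination: rows whose values in all search
// columns match a previous row are purged as duplicates; the first
// row of each set is passed to the primary output.
function eliminateDuplicates(rows, searchColumns, ignoreCase) {
  var seen = {}, primary = [], duplicates = [];
  rows.forEach(function (row) {
    var key = searchColumns.map(function (col) {
      var v = String(row[col]);
      return ignoreCase ? v.toLowerCase() : v; // Ignore Case checked
    }).join("\u0000");
    if (seen[key]) { duplicates.push(row); }
    else { seen[key] = true; primary.push(row); }
  });
  return { primary: primary, duplicates: duplicates };
}

// With Ignore Case checked, "Ann" and "ANN" match; "Ann" arrives first,
// so it is kept and "ANN" is routed to the duplicates output.
var result = eliminateDuplicates(
  [{ name: "Ann" }, { name: "ANN" }, { name: "Bob" }],
  ["name"], true);
```

Note how input order determines which row survives, exactly as the procedure warns.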
Group By Transformer
A group by transformer performs one or more aggregate functions (AVG, COUNT, MIN, MAX, or SUM) on a data stream. For each input column selected on the Group By tab, the output includes one column containing the distinct input values from that column, and one or more additional columns containing the corresponding aggregate values. If you select more than one column in the Group By tab, the output includes one row for each distinct set of values. For example, if the selected columns were Sex and MaritalStatus, there would be four output rows, one each for single males, married males, single females, and married females (assuming those four combinations and no others were found in the input data). You must define at least one aggregate column. If you need only the distinct sets of Group By column values, define one aggregate column, then use a column splitter to drop it.
The flat-file output might look something like this:

MaritalStatus:CHAR:1  Sex:CHAR:1  Subtotal:NUMERIC
M                     F           248
S                     F           183
M                     M           227
S                     M           195
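The grouping behavior behind that output can be sketched as follows. This is an illustrative simulation with made-up detail amounts (chosen so the subtotals match the table above), not the transformer's implementation.

```javascript
// Sketch of group by semantics: one output row per distinct set of
// values in the Group By columns, plus a SUM aggregate column.
function groupBy(rows, groupCols, aggCol, aggName) {
  var groups = {};
  rows.forEach(function (row) {
    var key = groupCols.map(function (c) { return row[c]; }).join("|");
    if (!groups[key]) {
      var g = {};
      groupCols.forEach(function (c) { g[c] = row[c]; });
      g[aggName] = 0;
      groups[key] = g;
    }
    groups[key][aggName] += row[aggCol]; // SUM aggregate
  });
  return Object.keys(groups).map(function (k) { return groups[k]; });
}

// Hypothetical detail rows; the married-female amounts sum to 248.
var rows = [
  { MaritalStatus: "M", Sex: "F", Amount: 100 },
  { MaritalStatus: "M", Sex: "F", Amount: 148 },
  { MaritalStatus: "S", Sex: "F", Amount: 183 },
  { MaritalStatus: "M", Sex: "M", Amount: 227 },
  { MaritalStatus: "S", Sex: "M", Amount: 195 }
];
var out = groupBy(rows, ["MaritalStatus", "Sex"], "Amount", "Subtotal");
// Four output rows, one per distinct (MaritalStatus, Sex) pair.
```

The other aggregate functions (AVG, COUNT, MIN, MAX) differ only in how the running value per group is updated.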
For additional information
For step-by-step instructions, see Add a group by transformer to a task or Edit a group by transformer. For a hands-on demonstration, see Group By Tutorial.

Add a group by transformer to a task
1. Right-click in the diagram workspace and select Add Group By.
2. Create data streams linking the input reader or transformer to the group by transformer and the group by transformer to the output transformer or writer. (See Link Objects with Data Streams.)
3. Double-click the group by transformer.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. Double-click a column to move it from Available Columns to Group By Columns or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. Clear the Ignore Case checkbox for any columns for which you want evaluation to be case-sensitive. Then click Next.
6. Click Add.
7. Define an aggregate column by giving it a name, selecting an aggregate function, and selecting the input column containing the data to be aggregated.
8. To define another aggregate column, click Add and repeat step 7. Otherwise, click Close, then Finish.
For more information, see Group By Transformer. For a hands-on demonstration, see Group By Tutorial.

Edit a group by transformer
1. Double-click the group by transformer.
2. To change the way the output data is grouped, select the Group By tab. Double-click a column to move it from Available Columns to Group By Columns or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. Clear the Ignore Case checkbox for any columns for which you want evaluation to be case-sensitive.
3. To change the aggregate data, select the Aggregate Column tab.
   - To edit an aggregate column, double-click it. Revise the name, change the aggregate function, and/or select a different input column, then click OK.
   - To add an aggregate column, click Add. Enter a name, select an aggregate function, select the input column containing the data to be aggregated, and click Add. Repeat to add another column, or click Close.
   - To delete a column, select it and click Delete.
4. When finished making changes, click OK.
For more information, see Group By Transformer.
Join Transformer
A join transformer performs one or more standard SQL joins (inner, left outer, right outer, or full outer) on two or more input data streams, optionally dropping any unwanted columns from the result before passing it to a single output data stream, as shown in the simple example below:
The following settings are from a join transformer with three inputs and thus two joins:
Orders and Customers tables joined on CustomerID column; Component indicates the sources are coming from input data streams:
Orders and OrderDetail joined on OrderID column; note the indication that the left source is the results of the first join (i.e. that this join is nested):
Output columns selection (columns left in Source Columns list are dropped):
For additional information
For step-by-step instructions, see Add a join transformer to a task or Edit a join transformer. For a hands-on demonstration, see Join Tutorial.

Add a join transformer to a task
When joining untransformed rows from table datasources in the same model, best practice is to use an appropriate view or, if no such view exists or can be created in the database, a JDBC reader with multiple source tables (see Add a JDBC Reader to a Task). These approaches give much better performance than using separate JDBC readers for each table and combining their outputs with a join transformer.
1. Right-click in the diagram workspace and select Add Join.
2. Create data streams linking the input reader(s) and/or transformer(s) to the join transformer and the join transformer to the output transformer or writer. (See Link Objects with Data Streams.)
3. Double-click the join transformer.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. Select the join type (inner, left outer, right outer, or full outer).
6. Click Add.
7. Enter a descriptive name for the join in the Join Name field.
8. If necessary, select the appropriate left and right sources. The field above the Source box is not editable; it displays [Component] to indicate that the source is an input data stream or, for a nested join, the name of the preceding join.
9. Select the left and right join columns. It does not matter if the names are different.
10. If appropriate (rare), change the operator:

    Operator  Description
    ==        equals (default)
    !=        does not equal
    >         is greater than
    <         is less than
    >=        is greater than or equal to
    <=        is less than or equal to
11. If you want the join to be case-sensitive, clear Ignore Case Sensitivity.
12. Click OK.
13. To add another join, return to step 7. Otherwise, click Next.
14. Select the output columns. Double-click a column to move it from Source Columns (dropped from output) to Output Columns (included in output) or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. When finished, click Next.
15. If necessary, correct the memory and/or data block size settings (see Memory and Data Block Size Settings).
16. If necessary, change the Trim spaces in strings prior to comparison setting. If checked (the default), both leading and trailing spaces are removed from values in join columns with string-based data types. If unchecked, the values are left unchanged. For example, with this option unchecked, the values "John " and " John" would not match. With the option checked, both values would be changed to "John", and would match.
17. Click Finish.
For more information, see Join Transformer. For a hands-on demonstration, see Join Tutorial.

Edit a join transformer
1. Double-click the join transformer.
   - To change the join type, select the Join Columns tab and change the Join Type selection.
   - To revise a join, select the Join Columns tab and double-click a join. Revise as appropriate, then click OK.
   - To add or remove columns from the output, select the Output Columns tab. Double-click a column to move it from Source Columns (dropped from output) to Output Columns (included in output) or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns.
   - To change the memory, data block size, or trim spaces settings, select the Configuration tab. For information about these settings, see Add a join transformer to a task.
2. When finished making changes, click OK.
For more information, see Join Transformer. For a hands-on demonstration, see Join Tutorial.
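The core of what a single inner join with the default == operator does can be sketched as follows. This is an illustrative simulation only; the table and column names echo the Orders/Customers example above, and the data is made up. The transformer additionally supports outer join types, other operators, and nested joins.

```javascript
// Sketch of an inner join on one pair of join columns: every left row
// is matched against every right row, and matching pairs are merged.
function innerJoin(left, right, leftCol, rightCol) {
  var joined = [];
  left.forEach(function (l) {
    right.forEach(function (r) {
      if (l[leftCol] === r[rightCol]) {
        var row = {};
        Object.keys(l).forEach(function (k) { row[k] = l[k]; });
        Object.keys(r).forEach(function (k) { row[k] = r[k]; });
        joined.push(row);
      }
    });
  });
  return joined;
}

// Hypothetical data: only order 1 has a matching customer.
var orders = [{ OrderID: 1, CustomerID: "A" }, { OrderID: 2, CustomerID: "B" }];
var customers = [{ CustomerID: "A", Name: "Acme" }];
var result = innerJoin(orders, customers, "CustomerID", "CustomerID");
```

A left outer join would additionally emit order 2 with null customer columns; the other outer types extend this symmetrically.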
Null Transformer
A null transformer takes the place of a reader or writer, so it is discussed in that section (see Null Transformers).
Pivot Transformer
A pivot transformer reorganizes data from a set of input columns into a single new output column: the data is pivoted from horizontal (multiple cells from each input row) to vertical (cells in the same column of multiple output rows). Optionally, the transformer also creates a second new output column indicating which of the dropped source columns each value came from. For example, if you have the following input table:

Name   Q1UnitSales  Q2UnitSales  Q3UnitSales  Q4UnitSales
Wyatt  100          125          150          210
Bob    200          190          220          240

you could pivot the data to create the table below:

Name   Quarter  UnitSales
Wyatt  1        100
Wyatt  2        125
Wyatt  3        150
Wyatt  4        210
Bob    1        200
Bob    2        190
Bob    3        220
Bob    4        240
The detailed settings for the Quarter (left) and UnitSales (right) output columns are:
With the above settings, for each input row, the output contains four rows, generated as follows:
- The input row's Name value is copied to the Name field of all four output rows (Existing).
- The input row's Q1UnitSales value is copied to the UnitSales field of the first output row, the Q2UnitSales value to the second, and so on (Derived).
- Each output row's Quarter field is set to a value indicating which input column the UnitSales value came from (Tracking).
When the above task is run with the following input data:

Name   Q1UnitSales  Q2UnitSales  Q3UnitSales  Q4UnitSales
Wyatt  100          125          150          210
Bob    200          190          220          240
The output data looks like this:

Name   Quarter  UnitSales
Wyatt  1        100
Wyatt  2        125
Wyatt  3        150
Wyatt  4        210
Bob    1        200
Bob    2        190
Bob    3        220
Bob    4        240
For each input row of five cells, the output includes four rows of three cells:
- The Name column is unchanged from the input data.
- The derived column, UnitSales, includes the values from the dropped Q1UnitSales, Q2UnitSales, Q3UnitSales, and Q4UnitSales columns.
- The tracking column, Quarter, indicates which of the dropped columns each value in the UnitSales column was derived from.
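The existing/derived/tracking mechanics above can be sketched as a simulation. The data reproduces the Wyatt/Bob example; the function itself is illustrative, not the transformer's implementation.

```javascript
// Sketch of pivot semantics: each input row yields one output row per
// pivoted column, carrying existing columns, a derived value, and a
// tracking value identifying the source column.
function pivot(rows, existingCols, pivotCols, derivedName, trackingName, trackingValues) {
  var out = [];
  rows.forEach(function (row) {
    pivotCols.forEach(function (col, i) {
      var o = {};
      existingCols.forEach(function (c) { o[c] = row[c]; }); // Existing
      o[trackingName] = trackingValues[i];                   // Tracking
      o[derivedName] = row[col];                             // Derived
      out.push(o);
    });
  });
  return out;
}

var input = [
  { Name: "Wyatt", Q1UnitSales: 100, Q2UnitSales: 125, Q3UnitSales: 150, Q4UnitSales: 210 },
  { Name: "Bob",   Q1UnitSales: 200, Q2UnitSales: 190, Q3UnitSales: 220, Q4UnitSales: 240 }
];
var output = pivot(input, ["Name"],
  ["Q1UnitSales", "Q2UnitSales", "Q3UnitSales", "Q4UnitSales"],
  "UnitSales", "Quarter", [1, 2, 3, 4]);
// Eight rows of three cells, matching the output table above.
```

The trackingValues array plays the role of the per-input-column values entered in the Add Tracking Column dialog.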
For additional information
For step-by-step instructions, see Add a pivot transformer to a task or Edit a pivot transformer. For a hands-on demonstration, see Pivot Tutorial.

Add a pivot transformer to a task
1. Right-click in the diagram workspace and select Add Pivot.
2. Create data streams linking the input reader or transformer to the pivot transformer and the pivot transformer to the output transformer or writer. (See Link Objects with Data Streams.)
3. Double-click the pivot transformer.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
Delete unneeded columns from the output
5. In the Output Columns list, select the columns to be pivoted, as well as any other columns you do not want in the output, then click Delete. (Ctrl-click or Shift-click to select multiple columns.)
Create the derived and tracking columns
6. Click Add Derived.
7. Check the columns to be pivoted and enter a name for the output column.
8. Click Add Tracking Column.
9. Enter a name for the output column, set the data type and width, and enter an appropriate value for each input column. Then click Finish to close the Add Tracking Column dialog.
10. Click Finish to close the Add Derived Column dialog.

Adjust the output column order
11. If necessary, adjust the output column order: select a column, then click Up or Down.
12. Click Finish.
For more information, see Pivot Transformer. For a hands-on demonstration, see Pivot Tutorial.

Edit a pivot transformer
1. Double-click the pivot transformer.
2. Select the Columns tab.
3. Make the necessary changes (see Add a pivot transformer to a task).
4. When finished making changes, click OK.
For more information, see Pivot Transformer.
Row Splitter Transformer
For additional information
For step-by-step instructions, see Add a row splitter transformer to a task or Edit a row splitter transformer. For a hands-on demonstration, see Row Splitter Tutorial.

Add a row splitter transformer to a task
1. Right-click in the diagram workspace and select Add Row Splitter.
2. Create data streams linking the input reader or transformer to the row splitter and the row splitter to the output transformer(s) and/or writer(s). (See Link Objects with Data Streams.)
3. Double-click the row splitter.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)

Set the options that affect all row filters
5. If necessary, change the Trim the values of string-based columns option. If checked (the default), both leading and trailing spaces are removed from values in columns with string-based data types. If unchecked, the values are left unchanged. For example, with this option unchecked, the values "John " and " John" would not match the row filter [firstName]==(cs = FALSE) "John". With the option checked, both values would be changed to "John", and would match the row filter.
6. If necessary, change the Treat conditions as being mutually exclusive option. If checked (the default), each row will be included only in the first output whose criteria it matches; in other words, when a row is passed through a filter, it is removed from the dataset processed by successive filters. If unchecked, each row will be included in every output whose criteria it matches; in other words, all filters process the full input dataset. When this setting is checked, the rejected rows target (if any) is specified only once, in the last filter. When it is unchecked, each filter may specify its own rejected rows target. If you change this option from unchecked to checked, make sure that the Rejected Rows Data Flow setting of all but the last filter is <Discard> before deploying the task.

Add a row filter
7. Click Add.
8. From the Target Data Flow list, select the transformer or writer to receive rows passed by this filter.
9. From the Rejected Rows Data Flow list, select the transformer or writer to receive rows rejected by this filter, or <Discard>. If the Treat conditions as being mutually exclusive option is checked, the Rejected Rows Data Flow setting for all but the last filter must be <Discard>.
10. If necessary, change the Ignore Case Sensitivity option. If the columns referenced by the row filter are not string-based, it does not matter which way this is set.
11. Create the row filter. You may add a column name or operator by selecting one from one of the lists and clicking the associated << button, or simply type the filter. For example:
[fName]==(cs = FALSE) [FIRSTN] && ([FIRSTN] IS NOT NULL || [ROLE]==(cs = FALSE) "SOCI")
This row filter contains three clauses joined by a logical AND (&&) operator. The second clause (in parentheses) has two subclauses joined by a logical OR (||) operator.
- [fName]==(cs = FALSE) [FIRSTN] checks that the fName and FIRSTN values match; (cs = FALSE) makes the comparison case-insensitive.
- [FIRSTN] IS NOT NULL checks that FIRSTN is not null.
- [ROLE]==(cs = FALSE) "SOCI" checks that the ROLE value is SOCI.

Putting all that together, a row will be passed to this row filter's output when (1) the fName and FIRSTN values match and (2) either the FIRSTN value is not null or the ROLE value is SOCI. The available operators are:

Operator     Description
==           equals
!=           does not equal
>            is greater than
>=           is greater than or equal to
<            is less than
<=           is less than or equal to
||           OR
&&           AND
+            add or concatenate
-            subtract or negate
*            multiply
/            divide
%            modulo
IS NULL      is null
IS NOT NULL  is not null
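How the example filter evaluates for a given row can be sketched by writing the case-insensitive comparison out by hand. The column names come from the example filter; the row data is hypothetical, and this is only an illustration of the boolean logic, not the product's filter engine.

```javascript
// Sketch of the example row filter's logic:
// [fName]==(cs = FALSE) [FIRSTN] && ([FIRSTN] IS NOT NULL || [ROLE]==(cs = FALSE) "SOCI")
function matchesFilter(row) {
  // (cs = FALSE): compare strings case-insensitively; nulls never match ==
  var eqNoCase = function (a, b) {
    return a != null && b != null &&
           String(a).toLowerCase() === String(b).toLowerCase();
  };
  return eqNoCase(row.fName, row.FIRSTN) &&
         (row.FIRSTN != null || eqNoCase(row.ROLE, "SOCI"));
}

// Names match ignoring case and FIRSTN is not null: row is passed.
var passed = matchesFilter({ fName: "ann", FIRSTN: "ANN", ROLE: "X" });
// Names do not match, so the AND fails: row is rejected.
var rejected = matchesFilter({ fName: "ann", FIRSTN: "BOB", ROLE: "SOCI" });
```

Passed rows go to the filter's Target Data Flow; rejected rows go to its Rejected Rows Data Flow or are discarded.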
12. When you finish the row filter, click Add, then Close.
13. To add another row filter, return to step 7. Otherwise, click Finish.
For more information, see Row Splitter Transformer. For a hands-on demonstration, see Row Splitter Tutorial.

Edit a row splitter transformer
1. Double-click the row splitter transformer.
2. Select the Row Filter tab.
3. Make the necessary changes:
   - To add a row filter, follow the instructions at step 7 of Add a row splitter transformer to a task.
   - To edit a row filter, double-click it. Make the necessary changes, then click OK. (See step 8 et seq. of Add a row splitter transformer to a task for information about the settings.)
   - To change the order in which row filters are processed (which will affect the output only if Treat conditions as mutually exclusive is checked in the Row Splitter Editor dialog), select a row filter and click Up or Down.
   - To delete a row filter, select it and click Delete.
4. If you make a mistake, click Cancel and then Yes to confirm you want to discard your changes. Otherwise, when finished making changes, click OK.
For more information, see Add a row splitter transformer to a task.
Sort Transformer
A sort transformer sorts the rows of a single input data stream and passes them to a single output. Optionally, it may filter out rows that meet specified conditions and/or limit output to the first or last n rows of the sort; the rejected rows may be discarded or passed to a second output.

Before adding a sort transformer, consider carefully whether it is really necessary. Normally any sort necessary to the performance of the task is performed in the RDBMS prior to extraction into the DataExchange server, and the order in which rows are loaded into a target RDBMS should not matter.

For additional information
For step-by-step instructions, see Add a sort transformer to a task or Edit a sort transformer. For a hands-on demonstration, see Sort Tutorial.
Add a sort transformer to a task
1. Right-click in the diagram workspace and select Add Sort.
2. Create data streams linking the input reader or transformer to the sort and the sort to the output transformer(s) and/or writer(s). (See Link Objects with Data Streams.)
3. Double-click the sort.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. Select the sort columns. Double-click a column to move it from Available Columns to Sort Columns. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns. When finished, click Next.
6. If appropriate, change the sort order (Ascending or Descending). This selection applies to all sort columns.
7. To limit output to the first or last n rows of the sort, check Enable Rank, select the appropriate Rank Order (FIRST or LAST), and set Rank Records to the number of rows to be passed to the output. Rank order is affected by sort order. For example, say the sort column in a 100-row dataset has sequential values from 1 to 100 and Rank Records is set to 5. If the sort order is Ascending and the Rank Order is FIRST, the output values will be 1-5, but if the sort order is Descending the output values will be 96-100.
8. To filter out rows from the dataset before the sort is performed, check Use a filter condition and specify the filter criteria (see the discussion of the WHERE clause in Update/Delete Transformer for examples).
9. Click Next.
10. If necessary, correct the output settings. If you did not specify a filter condition, the Rejected Rows Data Flow must be N/A.
11. If necessary, correct the settings for memory and/or data block size (see Memory and Data Block Size Settings), then click Finish.
For more information, see Sort Transformer. For a hands-on demonstration, see Sort Tutorial.
Edit a sort transformer
1. Double-click the sort transformer.
2. Make the necessary changes:
   - To change the sort columns, select the Sort Columns tab. Double-click a column to move it from Available Columns to Sort Columns or vice versa. Alternatively, use the buttons: the single-arrow buttons move the currently selected column(s); the double-arrow buttons move all columns.
   - To change the sort order, select the Configuration tab. The selection (Ascending or Descending) applies to all sort columns.
   - To change the rank or filter settings, select the Filter tab. See step 7 or step 8 of Add a sort transformer to a task for instructions.
   - To change the rejected rows target, select the Configuration tab and change the Rejected Rows Data Flow setting. If you do not specify a filter condition, the Rejected Rows Data Flow must be N/A.
3. If you make a mistake, click Cancel and then Yes to confirm you want to discard your changes. Otherwise, when finished making changes, click OK.
For more information, see Add a sort transformer to a task. For a hands-on demonstration, see Sort Tutorial.
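The interaction between sort order and rank order described in step 7 can be sketched as follows. This is an illustrative simulation of the 100-row example (the column name is hypothetical), not the transformer's implementation.

```javascript
// Sketch of sort-plus-rank: the rank is taken AFTER the sort, so
// Descending + FIRST yields the largest values.
function sortAndRank(rows, col, order, rankOrder, rankRecords) {
  var sorted = rows.slice().sort(function (a, b) {
    return order === "Ascending" ? a[col] - b[col] : b[col] - a[col];
  });
  if (rankOrder === "FIRST") return sorted.slice(0, rankRecords);
  return sorted.slice(sorted.length - rankRecords); // LAST
}

// 100 rows with sequential values 1..100, as in the example above.
var rows = [];
for (var i = 1; i <= 100; i++) rows.push({ n: i });

var firstAsc = sortAndRank(rows, "n", "Ascending", "FIRST", 5);
// firstAsc carries the values 1-5.
var firstDesc = sortAndRank(rows, "n", "Descending", "FIRST", 5);
// firstDesc carries the values 100 down to 96.
```

This is why flipping the sort order changes which end of the dataset the rank keeps.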
To add a temp table to a task:
1. Right-click in the diagram workspace and select Add Temp Table.
2. Create data streams linking the input transformer to the temp table and the temp table to the output transformer. (See Link Objects with Data Streams.)
3. Double-click the temp table.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. Enter the path (not a file name) to the directory where the temporary file should be created, or click the button, navigate to the directory, and click OK. Then click Finish.

To change where the temporary file is created:
1. Double-click the temp table.
2. Select the Options tab.
3. Change the temporary file's directory, then click OK.
Union Transformer
A union transformer combines two or more data streams and optionally changes the output column order. The inputs must contain identical columns, but the columns do not have to be in the same order.
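The union semantics (identical columns, possibly listed in different orders, emitted in one agreed output order) can be sketched in Python. This is an illustrative model, not DataExchange code; the stream and column names are invented.

```python
def union(streams, column_order):
    """Concatenate rows from streams that share the same columns,
    emitting every row with a single output column order."""
    out = []
    for stream in streams:
        for row in stream:
            out.append({col: row[col] for col in column_order})
    return out

a = [{"id": 1, "name": "x"}]
b = [{"name": "y", "id": 2}]   # same columns, different order
rows = union([a, b], ["id", "name"])
```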
To perform a union on a common subset of columns from sources with different sets of columns, see Use Column Map to Perform Union on Output of Column Splitters with Different Sources in TIBCO DataExchange Best Practices. For a hands-on demonstration, see Union Tutorial.
Add a union transformer to a task
1. Right-click in the diagram workspace and select Add Union.
TIBCO DataExchange Designer Users Guide
2. Create data streams linking the input reader(s) and/or transformer(s) to the union transformer and the union transformer to the output transformer or writer. (See Link Objects with Data Streams.)
3. Double-click the union transformer.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. If necessary, adjust the output column order: select a column, then click Up or Down.
6. Click Finish.
Change the output column order
1. Double-click the union transformer and select the Output Columns tab.
2. Select a column, then click Up or Down. When the order is correct, click Finish.
Update/Delete Transformer
An update/delete transformer sets values in one or more columns based on the contents of each row and/or deletes rows that match specified criteria. For a very simple example, an UPDATE statement with a Set Clause of [LastName]="Smith" and a Where Clause of [LastName]=="Jones" will change all occurrences of Jones in the LastName column to Smith.
Both the SET and WHERE clauses may be complex, involving concatenation and/or mathematical operations on multiple columns. The WHERE clause may have multiple subclauses connected by logical AND and OR operators.
There are four types of update/delete statements:
UPDATE: Sets values for one column in rows that match the WHERE clause criteria or, if the WHERE clause is blank, in all rows.
CASE: Functionally the same as a set of UPDATE statements, but displayed and managed as a group. The WHERE clauses should specify different values or ranges of the same column(s), for example [Type]=1, [Type]=2, [Type]=3, ...
DELETE: Deletes rows that match the WHERE clause.
SELECT*: Deletes rows that do not match the WHERE clause.
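The four statement types can be modeled with a short Python sketch. This is illustrative only, not DataExchange code: here a row is a dict, a WHERE clause is a predicate function, and a SET clause is a function that modifies a matching row.

```python
def update(rows, set_fn, where=None):
    """UPDATE: apply set_fn to rows matching where (all rows if blank)."""
    for r in rows:
        if where is None or where(r):
            set_fn(r)
    return rows

def delete(rows, where):
    """DELETE: drop rows that match the WHERE clause."""
    return [r for r in rows if not where(r)]

def select_star(rows, where):
    """SELECT*: drop rows that do NOT match the WHERE clause."""
    return [r for r in rows if where(r)]

# The Jones-to-Smith example from above, in this model
people = [{"LastName": "Jones"}, {"LastName": "Brown"}]
update(people, lambda r: r.update(LastName="Smith"),
       where=lambda r: r["LastName"] == "Jones")
```

A CASE statement is then simply a list of (set clause, where clause) pairs applied with update() one after another.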
If you define multiple update/delete statements, they are processed in order from top to bottom.
Set Operators and Sample Set Clauses
Operators: =  +  *  /  %
Concatenate values:
[FullName]=[LastName]+", "+[FirstName]
Add values:
[YearSales]=[Q1SALES]+[Q2SALES]+[Q3SALES]+[Q4SALES]
Where Operators and Sample Where Clauses
Operator      Description
==            equals
!=            does not equal
>             is greater than
>=            is greater than or equal to
<             is less than
<=            is less than or equal to
||            OR
&&            AND
IS NULL
IS NOT NULL
Match specific values:
[DEPT]=="ACCTNG"
[CustomerID]==24893
[HireDate]==
For additional information
For step-by-step instructions, see Add an update/delete transformer to a task or Edit an update/delete transformer. For a hands-on demonstration, see Update/Delete Tutorial.
Add an update/delete transformer to a task
1. Right-click in the diagram workspace and select Add Update/Delete.
2. Create data streams linking the input reader or transformer to the update/delete transformer and the update/delete transformer to the output transformer or writer. (See Link Objects with Data Streams.)
3. Double-click the update/delete transformer.
4. Revise the name to make it more descriptive, enter a description, then click Next. (See Give Each Object a Descriptive Name and Give Each Object a Detailed Description or Definition.)
5. Click Add.
6. Select the Query Action Type.
7. Create the statement (see Update/Delete Transformer for examples and an operator key):
For an UPDATE statement, enter the Set Clause and optionally the Where Clause, then click OK.
For a DELETE or SELECT* statement, enter the Where Clause, then click OK.
For a CASE statement:
a. Click Add.
b. Enter a Set Clause and optionally a Where Clause, and click Add.
c. Enter another Set Clause and optionally a Where Clause, and click Add.
d. Repeat step (c) until you have entered all your cases, then click Close.
e. Click the Add button at the bottom of the Add Update/Delete Statement dialog.
f. Click Close.
8. If you wish to add additional update/delete statements, repeat from step 5; otherwise click Finish.
For more information, see Update/Delete Transformer.
Edit an update/delete transformer
1. Double-click the update/delete transformer.
2. Select the Operation tab.
3. Make the necessary changes:
To edit an UPDATE statement, double-click it, revise the Set Clause and/or the Where Clause, and click OK.
To edit a DELETE or SELECT* statement, double-click it, revise the Where Clause, and click OK.
To edit a CASE statement, double-click it, then:
To edit a case, double-click it, make the necessary changes, and click OK.
To add another case, click Add, enter the clause(s), click Add, and click Close.
To change the execution order, select a case and click Up or Down.
To delete a case, select it and click Delete.
When finished making changes, click Close.
To add another statement, click Add, then follow the instructions from step 6 of Add an update/delete transformer to a task.
To change the order in which the statements are executed, select a statement and click Up or Down.
To delete a statement, select it and click Del.
4. If you make a mistake, click Cancel and then Yes to confirm you want to discard your changes. Otherwise, when finished making changes, click OK.
For more information, see Update/Delete Transformer. For a hands-on demonstration, see Update/Delete Tutorial.
Data Streams
Every component of a task must be properly linked to other components with data streams before the task can be deployed. The following table describes valid configurations for each type of object. For step-by-step instructions, see Link Objects with Data Streams.
Object Type            Valid # of Inputs   Valid Input Component(s)       Valid # of Outputs   Valid Output Component(s)
Source                 0                                                  1                    Reader
Null used as source    0                                                  1                    Column Map
JDBC Reader            1 or more           Source(s)                      1                    Transformer or Writer
other Reader           1                   Source                         1                    Transformer or Writer
Column Splitter        1                   Reader or Transformer          1 or more            Transformer(s) and/or Writer(s)
Column Map             1                   Reader or Transformer          1                    Transformer or Writer
Duplicate              1                   Reader or Transformer          2 or more            Transformers and/or Writers
Duplicate Elimination  1                   Reader or Transformer          1 or 2*              Transformer(s) and/or Writer(s)
Group By               1                   Reader or Transformer          1                    Transformer or Writer
Join                   2 or more           Readers and/or Transformers    1                    Transformer or Writer
Pivot                  1                   Reader or Transformer          1                    Transformer or Writer
Row Splitter           1                   Reader or Transformer          1 or more            Transformer(s) and/or Writer(s)
Sort                   1                   Reader or Transformer          1 or 2**             Transformer(s) and/or Writer(s)
Temp Table             1                   Reader or Transformer          2 or more            Transformers and/or Writers
Union                  2 or more           Readers and/or Transformers    1                    Transformer or Writer
Update/Delete          1                   Reader or Transformer          1                    Transformer or Writer
JDBC Writer            1                   Reader or Transformer          1                    Target
other Writer           1                   Reader or Transformer          1                    Target
Target                 1                   Writer                         0
Null used as target    1                   Transformer                    0
*If two outputs are connected to a duplicate elimination transformer, one receives the data purged of duplicates and the other receives the duplicate rows that were purged. You may swap the outputs in the transformer's Configuration tab.
**If two outputs are connected to a sort transformer, one receives the sorted data and the other receives any rows rejected by the sort. You may swap the outputs in the transformer's Configuration tab.
***A dotted line connecting a writer and reader or two writers is a dependency relationship (see Dependency Relationships).
Dependency Relationships
A dependency relationship keeps a reader or writer on hold until a writer on which it is dependent is finished, for example when one writer loads a set of primary keys that will be used as foreign keys by the second writer's data. A dependency relationship is displayed in the task as a dotted blue line between the writers:
In the above example, all data will be written to output1.txt before any data is written to output2.txt. The two components may be in the same data flow or different data flows.
To add a dependency relationship to a task
1. Right-click in the task workspace and select Add Dependency Relationship.
2. Click the writer that must finish first, then click the dependent reader or writer.
Task Segments
A task segment is a part of a task (one or more source-reader pairs, one or more transformers, and/or one or more writer-target pairs, plus the data streams that link them) that can be shared by multiple tasks. A task segment is encapsulated into a single object that can be placed in a task. For example, if you have a number of tasks that read the same data, you could define a table datasource and its associated JDBC reader as a task segment. When connection data or other datasource details change, you only need to update the task segment, and all the tasks that use the segment are updated automatically.
If a task segment includes source(s) and/or target(s), the associated model(s) will be stored with the deployed segment and retrieved in their entirety into any project file (DT1) that uses the task segment. If a retrieved data model is identical to one already in the task, it will not be duplicated.
For detailed instructions, see:
Create a Task Segment
Find a Task Segment
Modify a Task Segment
Add a Task Segment to a Task
List Tasks That Use a Task Segment
3. Enter a descriptive name for the segment, then click OK.
4. Create the task segment just as you would a full task. Define the input and/or output connection(s) for the task segment with task segment connectors (right-click in a blank area of the task, then select Add Reader > Task Segment Connector).
5. Select Task Segment > Deploy Task Segment.
6. If the correct DataExchange server is not already selected, select it.
7. Click Deploy. If prompted to log in, do so. If the segment does not deploy successfully, make a note of the error.
8. Click Close, then Close again.
Create a task segment from a portion of an existing task
1. Display the task containing the task segment, or create a new task and design the segment.
2. Select all objects to be included in the task segment, then right-click and select Create Task Segment. To select multiple objects, press the Ctrl key and click the objects with your mouse.
3. Enter a descriptive name for the segment, then click OK.
4. Select Task Segment > Deploy Task Segment.
5. If the correct DataExchange server is not already selected, select it.
6. Click Deploy. If prompted to log in, do so. If the segment does not deploy successfully, make a note of the error.
7. Click Close, then Close again.
For more information, see Task Segments.
When you add a task segment to a task, a read-only copy is added to the DT1. You can recognize these read-only copies by the square brackets around their names. A read-only task segment cannot be modified or deployed.
1. Select the Diagram Explorer's Servers tab.
2. Expand the tree to display the appropriate server's task segments.
3. Right-click the task segment you want to edit, then select Import Deployed Task Segment.
4. Modify the task segment as necessary. Do not make any changes to the task segment that would affect its inputs or outputs, since that would break all the tasks that use the segment. Any attempt to deploy such modifications should fail.
5. When done, select Task Segment > Deploy Task Segment and proceed as when deploying a task. Click Yes to overwrite the old version of the task segment on the DataExchange server.
For more information, see Task Segments.
Multiple data flows improve performance in a development environment in two ways:
All data flows in a task are run in parallel, maximizing CPU utilization on the DataExchange server machine.
The data flows can share resources, reducing server overhead. For example, if the three data flows in the task shown above were run as separate tasks, connections to the source and target databases would have to be opened and closed three times. With the data flows in the same task, the databases are opened once.
To create a task with multiple data flows, first design and debug the individual data flows in separate tasks. Then, when the tested data flows are ready to be put into production, copy and paste them into a single task with a name that will be meaningful to production staff, for example Monday_Morning_Tasks or End_of_Quarter_Jobs.
3. To create a new set, enter a name for the set and click OK. To update an existing set, enter its name, then click OK.
Save Quick Launch Settings to a File
1. Check Use file-based Quick Launch settings.
2. Click Save As.
3. To create a new set, enter a name for the set, then click Save. To update an existing set, double-click it, then click Yes.
Chapter 4
A deployed task is one that has been sent from DataExchange Designer to DataExchange server. Deployed tasks are stored in the repository and appear in TIBCO Administrator and DataExchange Console.
Topics
Configuring DataExchange Designer, page 128
Logging In and Logging Out of a DataExchange Server, page 131
Deploying Tasks, page 132
128
| Chapter 4
6. Click Next.
7. Select the appropriate machine group, then click Finish.
8. Right-click the newly registered DataExchange server, then select Edit. If prompted to log in, do so. (You must have a user account that is configured in TIBCO Administrator by your DataExchange administrator.) If DataExchange Server Version or Build Number are blank or incorrect, click Refresh Configuration. Then click OK.
Deploying Tasks
To deploy a task, the permissions required depend on whether the task is new or modified:
For a new task: System Permissions > Tasks > Create, or administrator.
For a modified version of a previously deployed task: System Permissions > Tasks > Create and Task Permissions > Update and Execute, or task owner or administrator.
When you deploy a task, DataExchange Designer automatically saves the current project file (DT1) without prompting for confirmation.
1. Display the task diagram.
2. Select Task > Deploy Task.
3. If the desired DataExchange server is not selected, select it.
4. Select Execute tasks immediately if you wish the task to run immediately, or Run task later from Console if you will start it later from TIBCO Administrator.
5. Check or clear Attach project file for archiving on the server, as appropriate. If you do not attach the project file, you will not be able to retrieve the task from the DataExchange server after rolling back to an earlier version.
6. Click Deploy. If prompted to log in, do so. If the task does not deploy successfully, make a note of the error.
7. Click Close, then Close again.
Updating a task (deploying a task with the same name as one previously deployed) does not change the owner.
6. Drag each of the tasks you want to chain from the tree onto the white area of the Task Chain Deployment dialog.
7. To link the tasks, right-click in a blank area of the diagram and select Add Link. Click the task that starts the link, then the other task. Repeat until you have linked all the tasks, then click in a blank area to return the cursor to normal mode.
8. Make any necessary additional changes to the task, then select Task Chain > Deploy Task Chain.
Notes:
To delete a task or link, right-click it and select Delete.
You may include a task in the chain more than once; multiple instances are distinguished by a dollar sign and integer appended to the task name.
To create a conditional branch like the one shown above, link a task to two successor tasks, one to be executed when the first task completes successfully, the other when it aborts. To change a link's type, right-click the link and select Change Condition to Aborted or Change Condition to Completed.
Two or more conditional branches may lead to the same task via an OR operator. If one or more of the tasks immediately preceding the OR operator completes successfully, the task following the operator will execute once (and only once, regardless of how many of the preceding tasks complete); otherwise, it will not be executed. To add an OR, right-click in a blank area of the task chain diagram window, then select Add OR.
To guarantee that the task after the OR operator will always execute, every task preceding the OR operator must have successor tasks for both Succeeded and Aborted conditions, as shown in the above examples. Otherwise, if any of the preceding tasks aborts, the task after the OR operator will not execute, even if one of the tasks immediately preceding the OR operator succeeds.
See also Modify a Task Chain in DataExchange Designer.
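The basic OR-operator firing rule can be modeled with a small sketch. This is illustrative Python, not DataExchange code; the task names are invented, and the both-branches caveat described above is not modeled.

```python
def or_node_fires(statuses, predecessors):
    """An OR node's successor runs once (and only once) if at least
    one immediately preceding task completed successfully."""
    return any(statuses[t] == "Completed" for t in predecessors)

# Invented task names for illustration
statuses = {"load_accounts": "Completed", "load_orders": "Aborted"}
fires = or_node_fires(statuses, ["load_accounts", "load_orders"])
```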
Chapter 5
Tutorials
This section provides hands-on tutorials for flat file and JDBC datasources and for each of DataExchange Designer's transformers.
Topics
Before Starting, page 138
Flat File Tutorial, page 140
Column Map Tutorial, page 143
Column Splitter Tutorial, page 144
Duplicate Transformer Tutorial, page 146
Duplicate Elimination Tutorial, page 147
Group By Tutorial, page 149
Join Tutorial, page 150
Pivot Tutorial, page 152
Row Splitter Tutorial, page 155
Sort Tutorial, page 156
Union Tutorial, page 157
Update/Delete Tutorial, page 158
JDBC Tutorial, page 159
Column Map - JavaScript Tutorial, page 164
138
| Chapter 5
Tutorials
Before Starting
Read the following before starting the tutorials:
The DataExchange server must be configured, deployed, and running. See the TIBCO DataExchange Administrator's Guide for details.
You must set up DataExchange Designer with a default DataExchange server.
The server runs in secured mode. You must have the server user name and password to log in to the server. These credentials are set in TIBCO Administrator. See the TIBCO DataExchange Administrator's Guide for details.
The tutorials are installed when installing TIBCO DataExchange. The default location is install-path\tibco\dx\5.3\tutorials\. The tutorials described in this chapter assume you have installed the tutorials in the default location. If the tutorials have not been installed, you must rerun the DataExchange installation utility and select the tutorial component.
The tutorials use datasources that are provided in the install-path\tibco\dx\5.3\tutorials\sample_data folder. If you have installed DataExchange software on a drive other than C, or are running the tutorials on a DataExchange server on a Unix platform, you must change the datasource folder location to your installed location. To do so, for each tutorial that uses a datasource, click the Data Model tab at the bottom of the Diagram Explorer window and double-click each datasource. The folder location is set under the Datasource tab.
When configuring the tutorials, you should know how to place readers and writers in the diagram window and link them with data streams. To review these topics, see Bulk Readers and Writers on page 70 and Link Objects with Data Streams on page 117.
When connecting a datasource to a reader or writer, a dialog can appear. For each tutorial, click Yes when the dialog appears. See Understanding Propagation on page 52 for information about the message in the dialog.
Each tutorial has a corresponding completed project file that you can open. The completed project files are in the install-path\tibco\dx\5.3\tutorials\project_files directory.
For information about how to deploy a tutorial to the DataExchange server, see Deploying Tasks on page 132. After deploying a tutorial to the DataExchange server, you can manage the task from the TIBCO Administrator DataExchange Management console. See the TIBCO DataExchange Administrator's Guide for details. You can:
rerun a task from Administrator
create and set task global variables
schedule task execution runs
set file and database tracing options for a task
manage task versions
set one or multiple directory monitors to manage task output
set runtime properties for a task
configure email notification for a task
january.csv.
7. Check Attach a copy of this local file to the task when deploying to DataExchange Server, then click Next.
8. Set the column delimiter to Comma, change the text qualifier to Double quote ("), then click Next.
9. Check First row contains column names, then click Next, then Next again.
10. Double-click in the Data Type field for Application_Nbr and select INTEGER.
11. Double-click in the Data Type field for Approval_Date and select TIMESTAMP.
12. Double-click in the Date Format field for Approval_Date, enter dd-MMM-yy, then click Finish.
Add the flat file target to the data model
13. Right-click in a blank area of the workspace, then select Insert Flat File.
14. Click in a blank area to add the flat file, right-click in a blank area to return to the select cursor, then double-click the new flat file.
15. Change the name to flatfile_out.csv, then click Next.
16. In the field to the left of the button, enter C:\scratch\flatfile_out.csv, then click Next.
17. If prompted, click OK to confirm that the specified file does not exist. (It will be created when the task is run.)
18. Set the column delimiter to Comma, change the text qualifier to Double quote ("), then click Next.
19. Check First row contains column names, then click Next, then Next again, then Finish.
Create the task
20. In the Main tab of the Diagram Explorer, double-click tutorial_FlatFile.
21. Expand the Model--Flat File portion of the data model tree until you can see january.csv and flatfile_out.csv.
22. Click and drag the january.csv icon from the data model tree and drop it on the left side of the task workspace.
23. Click and drag the flatfile_out.csv icon from the data model tree and drop it on the right side of the task workspace.
24. Right-click in the blank space to the right of january.csv and select Add Reader > Flat File Reader.
25. Right-click in the blank space to the left of flatfile_out.csv and select Add Writer > Flat File Writer.
26. Add data streams to connect january.csv to the reader, the writer to flatfile_out.csv, and the reader to the writer (see Link Objects with Data Streams). The task should look something like this:
Set the date format for the output
27. Double-click flatfile_out.csv and select the Columns tab.
28. Double-click in the Date Format field for Approval_Date, enter dd-MMM-yyyy, then click OK.
Run the task
29. Optionally, deploy and run the task (see Deploying Tasks). Compare january.csv and flatfile_out.csv and you will see that they are identical except that the date formats are slightly different and the latter includes data type information in the column header.
See also JDBC Tutorial (contains a flat file writer)
10. Optionally, deploy and run the task. To make tasks easier to understand and maintain, each column map should perform a single transformation. For more information, see Use Separate Transformers for Each Column Add/Delete/Rename Operation on page 44. See also JDBC Tutorial (contains a column map transformer)
14. Click OK, then click Finish. The task should look something like this:
15. Optionally, deploy and run the task. A column splitter can pass all, none, or a subset of the input columns to each of its outputs. In this task, the country_code column appears in both outputs, coordinates and TZ each appear in a single output, and comments is dropped. When dropping columns in the middle of a data flow, use a column splitter with a single output (not a column map). For more information, see Use Column Splitter to Drop Columns on page 44.
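The column splitter behavior described above (each output receives its own list of columns; columns in no list are dropped) can be sketched in Python. This is an illustrative model, not DataExchange code; the rows and values are invented.

```python
def column_split(rows, output_specs):
    """One output stream per spec; each spec names the columns that
    output receives. Columns appearing in no spec are dropped."""
    return [[{c: r[c] for c in spec} for r in rows] for spec in output_specs]

# country_code goes to both outputs; comments is dropped entirely
rows = [{"country_code": "FR", "coordinates": "48N,2E",
         "TZ": "Europe/Paris", "comments": "x"}]
out1, out2 = column_split(rows, [["country_code", "coordinates"],
                                 ["country_code", "TZ"]])
```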
2. In the Main tab of the Diagram Explorer, double-click tutorial_Duplicate.
3. Right-click in the blank space in the middle of the partially completed task and select Add Duplicate.
4. Add data streams to connect the reader to the duplicate transformer and the transformer to the two writers (see Link Objects with Data Streams).
5. In a real-world project, you would give the duplicate transformer an appropriate name and description. Since there is nothing else to configure, for this tutorial the task is complete. It should look something like this:
2. In the Main tab of the Diagram Explorer, double-click tutorial_DuplicateElim.
3. Right-click in the blank space in the middle of the partially completed task and select Add Duplicate Elimination.
4. Add data streams to connect the reader to the duplicate elimination transformer and the transformer to the two writers (see Link Objects with Data Streams).
5. Double-click the duplicate elimination transformer.
6. In a real-world project, you would give the transformer an appropriate name and description. For this tutorial, just click Next.
7. Click the >> button to add all columns to the Search Columns list, then click Next.
8. Click Next to skip specifying the filter condition.
9. Notice that Target Data Flow (for non-duplicate rows) has defaulted to FWrite_dupelim_out_1.csv, and Duplicate Rows Data Flow (for the purged
duplicate rows) to FWrite_dupelim_out_2.csv, then click Finish. The task should look something like this:
10. Deploy and run the task. Look at the output data and you will see that dupelim_out_1.csv contains 50 rows, each with a unique set of values, while dupelim_out_2.csv contains 19 purged duplicate rows.
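The split into unique and purged rows can be sketched in Python. This is an illustrative model, not DataExchange code; that the first occurrence of each key is the row kept is an assumption of this sketch.

```python
def eliminate_duplicates(rows, search_columns):
    """Split rows into (unique, purged) on the search columns.
    Assumption: the first row seen with a given key is kept."""
    seen, unique, purged = set(), [], []
    for row in rows:
        key = tuple(row[c] for c in search_columns)
        if key in seen:
            purged.append(row)   # goes to the Duplicate Rows Data Flow
        else:
            seen.add(key)
            unique.append(row)   # goes to the Target Data Flow
    return unique, purged

rows = [{"id": 1}, {"id": 2}, {"id": 1}, {"id": 1}]
unique, purged = eliminate_duplicates(rows, ["id"])
```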
Group By Tutorial
This sample task demonstrates how to use the group by transformer to perform aggregate functions. For more information about group by transformers, see Group By Transformer.
1. Open install-path\tibco\dx\5.3\Tutorials\project_files\GroupBy.dt1.
2. In the Main tab of the Diagram Explorer, double-click tutorial_GroupBy.
3. Right-click in the blank space in the middle of the partially completed task and select Add Group By.
4. Add data streams to connect the reader to the group by transformer and the transformer to the writer (see Link Objects with Data Streams).
5. Double-click the transformer.
6. In a real-world project, you would give the transformer an appropriate name and description. For this tutorial, just click Next.
7. Click the << button to clear the Group By Columns list.
8. Under Available Columns, double-click Product, then click Next.
9. Click Add.
10. In the Aggregate Column Name field, enter Count.
11. For Function, select Count.
12. For Input Column Name, select Approved.
13. Click Add, then Close, then Finish. The task should look something like this:
14. Deploy and run the task. Look at the output (groupby_out.csv) and you will see it includes a row for each product. Each row includes the product name and a count of the number of applications approved for that product (i.e., the number of rows with that product name in the source data).
See also JDBC Tutorial (contains a group by transformer)
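The Count aggregate configured above can be sketched in Python. This is an illustrative model, not DataExchange code; the sample rows and dates are invented.

```python
from collections import Counter

def group_by_count(rows, group_col, count_col, agg_name="Count"):
    """One output row per distinct group value, with a count of the
    input rows that have a value in count_col."""
    counts = Counter()
    for row in rows:
        if row.get(count_col) is not None:
            counts[row[group_col]] += 1
    return [{group_col: g, agg_name: n} for g, n in counts.items()]

rows = [
    {"Product": "A", "Approved": "2020-01-01"},
    {"Product": "A", "Approved": "2020-02-01"},
    {"Product": "B", "Approved": "2020-03-01"},
]
out = group_by_count(rows, "Product", "Approved")
```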
Join Tutorial
This sample task demonstrates how to use the join transformer to perform a standard inner join. For more information about join transformers, see Join Transformer.
When joining untransformed rows from table datasources in the same model, best practice is to use an appropriate view or, if no such view exists or can be created in the database, a JDBC reader with multiple source tables (see Add a JDBC Reader to a Task). These approaches give much better performance than using separate JDBC readers for each table and combining their outputs with a join transformer.
1. Open install-path\tibco\dx\5.3\Tutorials\project_files\Join.dt1.
2. In the Main tab of the Diagram Explorer, double-click tutorial_Join.
3. Right-click in the blank space in the middle of the partially completed task and select Add Join.
4. Add data streams to connect the two readers to the join transformer and the transformer to the writer (see Link Objects with Data Streams).
5. Double-click the transformer.
6. In a real-world project, you would give the transformer an appropriate name and description. For this tutorial, just click Next.
7. Leave the join type set to the default (Inner), and click Add.
8. In the Join Name field, enter 1 (or any name you like). Notice that the other settings indicate that the join will be on the country_code column in both sources.
9. Click OK, then Next.
10. Double-click the Source cells for country_code_1 and coordinates to remove those columns from the Output Columns list.
11. Click Next, then Finish. The task should look something like this:
12. Deploy and run the task. Look at the output (join_out.csv) and you will see it has the same rows as timezones.csv, except with a country_name column added (values taken from countries.csv) and the coordinates column dropped.
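The inner join performed here can be sketched in Python. This is an illustrative model, not DataExchange code; the sample rows are invented (only the column names come from the tutorial).

```python
def inner_join(left, right, key):
    """Inner join: emit one combined row per pair of input rows
    whose key values match."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    out = []
    for row in left:
        for match in index.get(row[key], []):
            combined = dict(match)   # rows with no match are dropped
            combined.update(row)
            out.append(combined)
    return out

timezones = [{"country_code": "FR", "TZ": "Europe/Paris"}]
countries = [{"country_code": "FR", "country_name": "France"},
             {"country_code": "DE", "country_name": "Germany"}]
joined = inner_join(timezones, countries, "country_code")
```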
Pivot Tutorial
This sample task demonstrates how to use the pivot transformer to convert rowwise data into columnwise data. For each row containing four quarterly sales values, the output contains four rows, one for each quarter. A tracking column in the output indicates which quarter each value represents. For more information about pivot transformers, see Pivot Transformer.
1. Open install-path\tibco\dx\5.3\Tutorials\project_files\Pivot.dt1.
2. In the Main tab of the Diagram Explorer, double-click tutorial_Pivot.
3. Right-click in the blank space in the middle of the partially completed task and select Add Pivot.
4. Add data streams to connect the reader to the pivot transformer and the transformer to the writer (see Link Objects with Data Streams).
5. Double-click the transformer.
6. In a real-world project, you would give the transformer an appropriate name and description. For this tutorial, just click Next.
7. First step: drop the columns to be consolidated from the output. Under Output Columns, select the four quarterly unit sales columns, then click Delete.
8. Second step: consolidate the dropped columns. Click Add Derived. 9. Under Input Columns, check the four quarterly unit sales columns:
10. In the Output Column Name field, enter UnitSales. 11. Third step: add a tracking column to indicate the quarters of the consolidated UnitSales values. Click Add Tracking Column, then click Yes. 12. In the Output Column Name field, enter Quarter. 13. Change DataType to INTEGER and Width to 1. 14. In the Value column, enter 1, 2, 3, and 4, then click in the last entry to enter it. You should see this:
15. Click Finish. 16. Under Output Columns, select Quarter, then click Up. You should see this:
17. Click Finish. The task should look something like this:
18. Deploy and run the task. Compare the source and output to see that the data has been pivoted as follows
quarterly_sales.csv:
Name    Q1UnitSales    Q2UnitSales    Q3UnitSales    Q4UnitSales
Wyatt   100            125            150            210
Bob     200            190            220            240
pivot_out.csv:
Name:CHAR:255    Quarter:INTEGER    UnitSales:INTEGER
Wyatt            1                  100
Wyatt            2                  125
Wyatt            3                  150
Wyatt            4                  210
Bob              1                  200
Bob              2                  190
Bob              3                  220
Bob              4                  240
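The pivot from quarterly_sales.csv to pivot_out.csv can be sketched in Python. This is an illustrative model, not DataExchange code; the function and parameter names are invented.

```python
def pivot(rows, keep, value_cols, value_name, tracking_name, tracking_values):
    """For each input row, emit one output row per consolidated
    column, plus a tracking column identifying which one it was."""
    out = []
    for row in rows:
        for col, tag in zip(value_cols, tracking_values):
            rec = {k: row[k] for k in keep}
            rec[tracking_name] = tag     # e.g. Quarter = 1..4
            rec[value_name] = row[col]   # the consolidated UnitSales value
            out.append(rec)
    return out

sales = [{"Name": "Wyatt", "Q1UnitSales": 100, "Q2UnitSales": 125,
          "Q3UnitSales": 150, "Q4UnitSales": 210}]
out = pivot(sales, ["Name"],
            ["Q1UnitSales", "Q2UnitSales", "Q3UnitSales", "Q4UnitSales"],
            "UnitSales", "Quarter", [1, 2, 3, 4])
```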
10. Click Add, then Close, then Finish. The task should look something like this:
11. Deploy and run the task. Look at the two output files and compare the Approval_Date columns: in rowsplit_out_1.csv the column's values are valid dates; in rowsplit_out_2.csv they all read prior to 1982.
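The row splitter's routing can be sketched in Python. This is an illustrative model, not DataExchange code; the sample dates and the cutoff condition are invented for the example.

```python
from datetime import date

def row_split(rows, condition):
    """Route each row to output 1 if it satisfies the condition,
    otherwise to output 2."""
    out1 = [r for r in rows if condition(r)]
    out2 = [r for r in rows if not condition(r)]
    return out1, out2

rows = [{"Approval_Date": date(1990, 5, 1)},
        {"Approval_Date": date(1975, 3, 2)}]
recent, early = row_split(rows,
                          lambda r: r["Approval_Date"] >= date(1982, 1, 1))
```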
Sort Tutorial
This sample task sorts a data stream. For more information about sort transformers, see Sort Transformer.
Before adding a sort transformer, consider carefully whether it is really necessary. Normally any sort necessary to the performance of the task is performed in the RDBMS prior to extraction into DataExchange server, and the order in which rows are loaded into a target RDBMS should not matter.
1. Open install-path\tibco\dx\5.3\Tutorials\project_files\Sort.dt1.
2. In the Main tab of the Diagram Explorer, double-click tutorial_Sort.
3. Right-click in the blank space in the middle of the partially completed task and select Add Sort.
4. Add data streams to connect the reader to the sort transformer and the transformer to the writer (see Link Objects with Data Streams).
5. Double-click the transformer.
6. In a real-world project, you would give the transformer an appropriate name and description. For this tutorial, just click Next.
7. Under Available Columns, double-click Trade_Name, then double-click Application_Nbr, then click Next.
8. Click Next to accept the default sort order and not use the rank or filter options, then click Finish. The task should look something like this:
9. Deploy and run the task. Compare patents.csv and sort_out.csv and you will see that the rows have been sorted, first by the trade name, then by the application number.
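The two-key ordering chosen in steps 7 and 8 can be sketched in plain JavaScript (the sample rows are invented; this is not the product's API):

```javascript
// Sort first by Trade_Name, then by Application_Nbr, matching the
// column order selected in the Sort wizard.
function sortRows(rows) {
  return rows.slice().sort(function (a, b) {
    if (a.Trade_Name !== b.Trade_Name) {
      return a.Trade_Name < b.Trade_Name ? -1 : 1;
    }
    return a.Application_Nbr - b.Application_Nbr;
  });
}

var patents = [
  { Trade_Name: "Widget", Application_Nbr: 200 },
  { Trade_Name: "Gadget", Application_Nbr: 150 },
  { Trade_Name: "Widget", Application_Nbr: 100 }
];
var sorted = sortRows(patents);
```

The second key only breaks ties on the first, which is exactly the effect of listing Trade_Name before Application_Nbr in the wizard.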
Union Tutorial
This sample task performs a standard union. For more information about unions, see Union Transformer.

For best performance, do not combine two data streams with a union and follow it with a row splitter to drop unwanted data. Instead, drop the unwanted data first using two row splitters, then feed both their outputs into the union. To perform a union on a common subset of columns from different sources, see Use Column Map to Perform Union on Output of Column Splitters with Different Sources on page 45.

1. Open install-path\tibco\dx\5.3\Tutorials\project_files\Union.dt1.
2. In the Main tab of the Diagram Explorer, double-click tutorial_Union.
3. Right-click in the blank space in the middle of the partially completed task and select Add Union.
4. Add data streams to connect the two readers to the union transformer and the transformer to the writer (see Link Objects with Data Streams).
5. In a real-world project, you would double-click the transformer, give it an appropriate name and description, and optionally set the output order, but for the purposes of this tutorial, the task is complete. It should look something like this:
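Conceptually a union just concatenates rows from two streams with compatible columns, which is why the filtering advice above matters: filtering before the union means it handles fewer rows. A minimal plain-JavaScript sketch (invented data, not the product's API):

```javascript
// Union two streams whose rows share the same columns.
function union(streamA, streamB) {
  return streamA.concat(streamB);
}

var a = [{ id: 1 }, { id: 2 }];
var b = [{ id: 3 }];
var combined = union(a, b);
```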
Update/Delete Tutorial
This sample task reads a flat file containing a mix of Fahrenheit and centigrade values, uses an update/delete transformer to convert the Fahrenheit values to centigrade, and creates an all-centigrade output file. For more information about update/delete transformers, see Update/Delete Transformer.

1. Open install-path\tibco\dx\5.3\Tutorials\project_files\UpdateDelete.dt1.
2. In the Main tab of the Diagram Explorer, double-click tutorial_UpdateDelete.
3. Right-click in the blank space in the middle of the partially completed task and select Add Update/Delete.
4. Add data streams to connect the reader to the update/delete transformer and the transformer to the writer (see Link Objects with Data Streams).
5. Double-click the transformer.
6. In a real-world project, you would give the transformer an appropriate name and description. For this tutorial, just click Next.
7. Click Add.
8. In the Set Clause field, enter: [temperature]=([temperature]-32)/1.8
9. In the Where Clause field, enter: [scale]=="F"
10. Click Add.
11. In the Set Clause field, enter: [scale]="C"
12. In the Where Clause field, enter: [scale]=="F"
13. Click Add, then Close, then Finish. The task should look something like this:
14. Deploy and run the task. Compare the source and target files to see that the Fahrenheit entries in temperatures.csv have all been converted to centigrade in updel_out.csv.

See also JDBC Tutorial (which also contains an update/delete transformer).
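The effect of the two set/where rules entered in steps 7 through 12 can be sketched in plain JavaScript (invented sample rows; this is not the product's API):

```javascript
// Apply the two update rules in order, per row:
//   1) [temperature]=([temperature]-32)/1.8  where [scale]=="F"
//   2) [scale]="C"                           where [scale]=="F"
function updateRows(rows) {
  rows.forEach(function (r) {
    if (r.scale === "F") {
      r.temperature = (r.temperature - 32) / 1.8;
    }
    // This rule must run after the first: once scale becomes "C",
    // the temperature rule's where clause would no longer match.
    if (r.scale === "F") {
      r.scale = "C";
    }
  });
  return rows;
}

var temps = [
  { temperature: 212, scale: "F" },
  { temperature: 100, scale: "C" }
];
updateRows(temps);
```

Note the ordering: entering the scale rule second in the wizard matters, because it shares a where clause with the temperature rule.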
JDBC Tutorial
This sample task demonstrates how to use a JDBC reader to perform a join using a WHERE clause, and how to replace a flat file target with a database target. To illustrate a common real-world use, the task uses several other transformers to calculate total sales by customer. In sequence, the transformers do the following:

Read_Order_Details_Orders: This JDBC reader extracts data from the Orders and OrderDetails tables, joining them on their OrderID columns. The reader's output includes only four columns (CustomerID from Orders and UnitPrice, Quantity, and Discount from OrderDetails) and has one row for each row in OrderDetails.

add_totals_column: This column map transformer adds a fifth, empty column, OrderItemTotal, which will be used by the next transformer.

calculate_totals: For each row, this update/delete transformer calculates the total cost of the detail item (quantity times discounted unit price) and stores the result in OrderItemTotal.

sum_totals_by_customer: This group-by transformer calculates the sum of the OrderItemTotal values for each unique CustomerID and stores the result in a new column, OrderTotal. This transformer's output includes only two columns, CustomerID and OrderTotal, and has one row for each unique CustomerID value.

FWrite_jdbc_out.csv: This flat-file writer writes the transformed data to a file.

Write_CustomerTotals: This JDBC writer drops and re-creates the CustomerTotals table, then loads it with the transformed data.
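The calculate_totals and sum_totals_by_customer stages can be sketched in plain JavaScript. The sample rows are invented; in the real task this data comes from the Orders/OrderDetails join.

```javascript
var joined = [
  { CustomerID: "ALFKI", UnitPrice: 10, Quantity: 2, Discount: 0.0 },
  { CustomerID: "ALFKI", UnitPrice: 20, Quantity: 1, Discount: 0.5 },
  { CustomerID: "BONAP", UnitPrice: 5,  Quantity: 4, Discount: 0.0 }
];

// calculate_totals:
//   [OrderItemTotal]=[Quantity]*([UnitPrice]*(1-[Discount]))
joined.forEach(function (r) {
  r.OrderItemTotal = r.Quantity * (r.UnitPrice * (1 - r.Discount));
});

// sum_totals_by_customer: SUM(OrderItemTotal) grouped by CustomerID
var totals = {};
joined.forEach(function (r) {
  totals[r.CustomerID] = (totals[r.CustomerID] || 0) + r.OrderItemTotal;
});
```

The output has one entry per unique CustomerID, matching the group-by transformer's one-row-per-customer result.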
For more information about JDBC datasources, readers, and writers, see Adding and Working With Data Models, Table Datasources, and JDBC Readers and Writers.

1. Open install-path\tibco\dx\5.3\Tutorials\project_files\JDBC.dt1.

Build the task

2. In the Main tab of the Diagram Explorer, double-click tutorial_JDBC.
3. Right-click in the blank space to the right of the Order Details and Orders datasources and select Add Reader > JDBC Reader.
4. Right-click in the blank space to the right of the reader and select Add Column Map.
5. Right-click in the blank space to the right of the column map and select Add Update/Delete.
TIBCO DataExchange Designer Users Guide
6. Right-click in the blank space to the left of FWrite_jdbc_out.csv and select Add Group By. The task should look something like this:
7. Add data streams to connect the Order Details to Reader_16, Orders to Reader_16, Reader_16 to Col_Map_17, Col_Map_17 to Update_18, Update_18 to Group_By_19, and Group_By_19 to FWrite_jdbc_out.csv. (The numbers may vary.) The task should now look something like this:
Configure the JDBC reader

8. Double-click the reader. For purposes of this task, the default name is OK. Since the reader's purpose is self-explanatory, no description is needed, so click Next.
9. Connection information is best entered for the entire model, not for individual tables, so click Next.
10. Define aliases for the table names to simplify manual SQL entry: in the Alias column, double-click next to Order Details and enter OD, then double-click next to Orders and enter O. Then click Next.
11. Click the << button to clear the Selected Columns list.
12. In the Available Columns list, under Order Details (OD), double-click UnitPrice, Quantity, and Discount, and under Orders (O) double-click CustomerID. Then click Next.
13. Edit the WHERE clause to read WHERE OD.OrderID=O.OrderID, then click Next.
14. The remaining options are not required for this task. Click Next three more times, then Finish, then OK.

Configure the column map

15. Double-click the column map.
16. Change the name to add_totals_column, then click Next.
17. Click Add.
18. In Column Name, enter OrderItemTotal; change Data Type to DECIMAL; change Width to 19; and change Scale to 2.
19. Click OK, then Next, then Yes, then Finish. OrderItemTotal is added to the target's columns.

Configure the update/delete transformer

20. Double-click the update/delete transformer.
21. Change the name to calculate_totals, then click Next.
22. Click Add.
23. In the Set Clause field, enter: [OrderItemTotal]=[Quantity]*([UnitPrice]*(1-[Discount]))
24. Click Add, then Close, then Finish.

Configure the group by transformer

25. Double-click the group by transformer.
26. Change the name to sum_totals_by_customer, then click Next.
27. Click the << button to clear the Group By Columns list.
28. Under Available Columns, double-click CustomerID, then click Next.
29. Click Add.
30. In the Aggregate Column Name field, enter OrderTotal; change Function to SUM; and change Input Column Name to OrderItemTotal.
31. Click Add, then Close, then Finish. The task is now complete, and should look something like this:
32. Optionally, change the model's datasource settings to connect to Microsoft SQL Server's sample Northwind database (see Change a Reverse-Engineered Model's Datasource Settings), then deploy and run the task. Look at the jdbc_out.csv file and you will see it includes totals for 89 customers.

Add a new database table to the model

33. In the Main tab of the Diagram Explorer, double-click Model 1.
34. Right-click in a blank space in the model diagram and select Insert Table.
35. Click in the blank space to insert a new table, right-click in the blank space to return to the select cursor, and double-click the new table.
36. Change Table Name to CustomerTotals.
37. Select the Columns tab, then click Add.
38. In Column Name, enter CustomerID; change Width to 5; and click Add.
39. In Column Name, enter OrderTotal; change the data type to numeric; change Width to 19; change Scale to 2; click Add; then click OK.
Replace the flat file target with the new database table

40. In the Main tab of the Diagram Explorer, double-click tutorial_JDBC.
41. Click and drag the mouse to select FWrite_jdbc_out.csv and jdbc_out.csv, then press Delete. (Since these objects are at the end of the task, there is no need to turn off propagation.)
42. Right-click in the blank space to the right of sum_totals_by_customer and select Add Writer > JDBC Writer.
43. Click and drag the CustomerTotals table's icon from the data model tree and drop it in the task workspace to the right of the JDBC writer.
44. Add data streams to connect the writer to CustomerTotals and sum_totals_by_customer to the writer. The task should now look something like this:
45. Double-click the writer, then click Next repeatedly until you get to step 7 of the wizard.
46. Change Insert Only to Drop and Re-Create table, then Insert.
47. Click Next repeatedly until you get to the last step of the wizard, then click Finish.
48. If in step 32 above you set the model's datasource settings to connect to a sample Northwind database, deploy and run the task. Compare the new CustomerTotals table with the jdbc_out.csv file and you will see that the data is identical.
Requirements
To follow the instructions in this tutorial, you must have installed the tutorials when installing TIBCO DataExchange. If you have not, you must rerun the DataExchange installer and select the tutorials component. To run the sample tasks, you must have access to a DataExchange server.
2. In the Main tab of the Diagram Explorer, double-click Ex_1_Task_1.
3. Double-click Col_Map_1. Since this transformer has not yet been configured, the Column Map Wizard opens.
4. Click Next, then click Add.
5. In the Column Name field, type CustomerName.
6. From the Data Type drop-down, select VARCHAR, then click OK.
7. In the Output Columns table, select the LastName and FirstName rows (Shift-click or click and drag to select both), then click Delete.

Create the JavaScript Expression

Next we will create a JavaScript expression to concatenate FirstName and LastName and store the result in CustomerName.

8. Click Edit Expr to open the Expression Editor:
(The Expression Editor contains the Function Tree, a parameter grid, the expression field, a function search box, help for the selected function, and the Java function that will be called when the task is run.)
9. In the Function Tree, select Transformation APIs > String Functions > Column Level Transformation > rtrim(1).
10. Triple-click in the Value field and select LastName.
11. Click Insert. The following JavaScript code appears in the expression field:
transformerString.rtrim( "LastName" );
12. In the Function Tree, select Transformation APIs > String Functions > Column Level Transformation > appendString(2).
13. Triple-click in the colName row's Value field and select LastName.
14. Double-click in the str row's Value field and type ", " (double quotes enclosing a comma followed by a space). Note that the quotes are necessary only because the string ends with a space.
15. Click Insert. The following JavaScript code is added to the expression:

transformerString.appendString( "LastName", ", " );
16. In the Function Tree, select Transformation APIs > Field Functions > Rowset Transformation > combineFields(3).
17. Triple-click in the colName_1 row's Value field and select LastName.
18. Triple-click in the colName_2 row's Value field and select FirstName.
19. Triple-click in the colName_Target row's Value field and select CustomerName.
20. Click Insert. The following JavaScript code is added:
transformerField.combineFields( "LastName", "FirstName", "CustomerName" );
21. Click OK to close the Expression Editor.
22. Click Next to go on to the next step of the Column Map Wizard.
23. Click Yes to regenerate. TIBCO DataExchange generates and displays the complete JavaScript, including class and variable declarations, cleanup, and comments. Non-editable portions of the script are displayed with a gray background:
24. Click Next, then Finish to close the Column Map Wizard, then click Yes to propagate the changes.
Run the Task

If you have access to DataExchange server, you can run the task and confirm that the output is as expected.

25. From the main menu, select Task > Deploy Task.
26. In the Deploy to Machine tree, select the DataExchange server, then click Deploy.
27. If you see a Do you want to login now? dialog, click Yes.
28. Log in. (You must have a user account that is configured in TIBCO Administrator by your DataExchange administrator.)
29. Click OK to confirm successful login.
30. When you see a Complete Task Deployment message, click Close, then Close again to dismiss the Deploy dialog.
31. The output is saved to install-path\tibco\dx\5.3\tutorials\JavaScript\example_1.txt on the DataExchange server machine. If DataExchange server is running on the same machine as DataExchange Designer, double-click the Example_1.txt object and select the Configuration tab. The output data is displayed at the bottom of the dialog:
2. In the Main tab of the Diagram Explorer, double-click Ex_2_Task_1.
3. Double-click Col_Map_1. Since this transformer has not yet been configured, the Column Map Wizard opens.
4. Click Next, then click Add.
5. In the Column Name field, type CustomerName.
6. From the Data Type drop-down, select VARCHAR, then click OK.
7. In the Output Columns table, select the LastName and FirstName rows (Shift-click or click and drag to select both), then click Delete.
8. Click Edit Expr.
9. Add the following script in the expression area (you can copy it and paste it into DataExchange Designer):
// Get row count
var rowCount = inputRowset1.getRowCount();

// Loop over all records in the rowset
for ( var i = 0; i < rowCount; i++ )
{
    // Get first input value
    var firstName = inputRowset1.getValueAt( inputRowset1.getColumnIndex("FirstName"), i );

    // Get second input value
    var lastName = inputRowset1.getValueAt( inputRowset1.getColumnIndex("LastName"), i );

    // Concatenate
    var customerName = lastName + ", " + firstName;

    // Set output value
    inputRowset1.setValueAt( inputRowset1.getColumnIndex("CustomerName"), i, customerName );
}
10. Click OK, then Next, then Yes, then Finish.
11. If you have access to DataExchange server, you can run the task and check the output (see Run the Task).
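You can also try the loop outside Designer by wrapping it in a minimal stand-in for the rowset API. The mock below is an illustration only (it reproduces just the four calls used above and is not part of the product); the sample names are invented.

```javascript
// Minimal mock of the rowset interface used by the expression:
// getRowCount, getColumnIndex, getValueAt, setValueAt.
function Rowset(columns, rows) {
  this.columns = columns; // array of column names
  this.rows = rows;       // array of row arrays
}
Rowset.prototype.getRowCount = function () { return this.rows.length; };
Rowset.prototype.getColumnIndex = function (name) { return this.columns.indexOf(name); };
Rowset.prototype.getValueAt = function (col, row) { return this.rows[row][col]; };
Rowset.prototype.setValueAt = function (col, row, v) { this.rows[row][col] = v; };

var inputRowset1 = new Rowset(
  ["FirstName", "LastName", "CustomerName"],
  [["Ada", "Lovelace", null], ["Alan", "Turing", null]]
);

// The tutorial's loop, unchanged:
var rowCount = inputRowset1.getRowCount();
for (var i = 0; i < rowCount; i++) {
  var firstName = inputRowset1.getValueAt(inputRowset1.getColumnIndex("FirstName"), i);
  var lastName = inputRowset1.getValueAt(inputRowset1.getColumnIndex("LastName"), i);
  inputRowset1.setValueAt(inputRowset1.getColumnIndex("CustomerName"), i, lastName + ", " + firstName);
}
```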
2. In the Main tab of the Diagram Explorer, double-click Ex_3_Task_1.
3. Double-click Col_Map_1. Since this transformer has not yet been configured, the Column Map Wizard opens.
4. Click Next, then click Add.
5. In the Column Name field, type NameLength.
6. From the Data Type drop-down, select INTEGER, then click OK.
7. Click Edit Expr.
8. Add the following script in the expression area (you can copy it and paste it into DataExchange Designer):
// Get row count
var rowCount = inputRowset1.getRowCount();

// Perform len operation
var nameLengthArray = transformerString.len("FirstName");

// Loop over all records in the rowset
for ( var i = 0; i < rowCount; i++ )
{
    // Set output value
    inputRowset1.setValueAt( inputRowset1.getColumnIndex("NameLength"), i, nameLengthArray[i] );
}
9. Click OK, then Next, then Yes, then Finish.
10. If you have access to DataExchange server, you can run the task and check the output (see Run the Task).
2. In the Main tab of the Diagram Explorer, double-click Ex_4_Task_1.
3. Double-click Col_Map_1. Since this transformer has not yet been configured, the Column Map Wizard opens.
4. Click Next, then click Add.
5. In the Column Name field, type TAN_VALUE.
6. From the Data Type drop-down, select NUMERIC, then click OK.
7. Click Edit Expr.
8. Add the following script in the expression area (you can copy it and paste it into DataExchange Designer):
// Get row count
var rowCount = inputRowset1.getRowCount();

// Loop over all records in the rowset
for ( var i = 0; i < rowCount; i++ )
{
    // Perform tan operation
    var tmp = transformerMath.tan( "VALUE", i );

    // Set output value
    inputRowset1.setValueAt( inputRowset1.getColumnIndex("TAN_VALUE"), i, tmp );
}
9. Click OK, then Next, then Yes, then Finish, then No.
10. If you have access to DataExchange server, you can run the task and check the output (see Run the Task).
Chapter 6
Reference
This chapter includes reference material regarding Control Characters, Data Type Mappings, and Date-Time Format.
Topics
Control Characters, page 172
Data Type Mappings, page 173
Date-Time Format, page 180
Control Characters
TIBCO DataExchange supports the following escape sequences to represent control characters:

Escape sequence   Control character
\n                newline (line feed)
\r                carriage return
\t                horizontal tab
SQL Server using Inet JDBC driver

SQL Type           JDBC Type       Java Type (DT internal type)
binary             binary          byte[]
bit                bit             boolean
char               char            char[]
datetime           timestamp       long
decimal            decimal         BigDecimal
float              float           double
image              longvarbinary   byte[]
int                integer         int
money              numeric         double
nchar              char            char[]
ntext              (-10)           other (Object)
numeric            numeric         double
nvarchar           varchar         char[]
real               real            float
smalldatetime      timestamp       long
smallint           smallint        short
smallmoney         numeric         double
sysname*           -               -
text               longvarchar     char[]
timestamp          binary          byte[]
tinyint            tinyint         byte
uniqueidentifier*  -               -
varbinary          varbinary       byte[]
varchar            varchar         char[]

*unsupported
MS SQL Server using Avenir JDBC driver

SQL Type           JDBC Type       Java Type (DT internal type)
binary             binary          byte[]
bit                bit             boolean
char               char            char[]
datetime           date            long
decimal            decimal         BigDecimal
float              float           double
image              longvarbinary   byte[]
int                integer         int
money              decimal         BigDecimal
nchar              char            char[]
ntext              longvarchar     char[]
numeric            numeric         double
nvarchar           varchar         char[]
real               real            float
smalldatetime      date            long
smallint           smallint        short
smallmoney         decimal         BigDecimal
sysname*           -               -
text               longvarchar     char[]
timestamp          binary          byte[]
tinyint            tinyint         byte
uniqueidentifier*  -               -
varbinary          varbinary       byte[]
varchar            varchar         char[]

*unsupported
MS SQL Server using WebLogic JDBC driver

SQL Type           JDBC Type       Java Type (DT internal type)
binary             binary          byte[]
bit                bit             boolean
char               char            char[]
datetime           timestamp       long
decimal            decimal         BigDecimal
float              float           double
image              longvarbinary   byte[]
int                integer         int
money              decimal         BigDecimal
nchar              char            char[]
ntext              longvarchar     char[]
numeric            decimal         BigDecimal
nvarchar           varchar         char[]
real               float           double
smalldatetime      timestamp       long
smallint           smallint        short
smallmoney         decimal         BigDecimal
sysname*           -               -
text               longvarchar     char[]
timestamp          binary          byte[]
tinyint            tinyint         byte
uniqueidentifier*  -               -
varbinary          varbinary       byte[]
varchar            varchar         char[]

*unsupported
Oracle using Oracle JDBC driver

SQL Type    JDBC Type   Java Type (DT internal type)
bfile
blob*
char        char        char[]
clob*
date
long
long raw
nchar
nclob*
number
nvarchar2
raw
rowid*
urowid*
varchar2    varchar     char[]

*unsupported
Sybase ASE using JConnect 5.2 JDBC driver

SQL Type        JDBC Type       Java Type (DT internal type)
binary          binary          byte[]
bit             bit             boolean
char            char            char[]
datetime        timestamp       long
decimal         decimal         BigDecimal
double          double          double
float           double          double
image           longvarbinary   byte[]
int             integer         int
money           decimal         BigDecimal
nchar           char            char[]
numeric         numeric         double
nvarchar        varchar         char[]
real            real            float
smalldatetime   timestamp       long
smallint        smallint        short
smallmoney      decimal         BigDecimal
text            longvarchar     char[]
timestamp       varbinary       byte[]
tinyint         tinyint         byte
varbinary       varbinary       byte[]
varchar         varchar         char[]
Date-Time Format
In TIBCO DataExchange, date formats are defined using pattern strings, using the following system:

Character   Description             Presentation       Example
G           era designator          (Text)             AD
y           year                    (Number)           1996
M           month in year           (Text or Number)   July or 07
d           day in month            (Number)           10
h           hour in am/pm (1-12)    (Number)           12
H           hour in day (0-23)      (Number)           0
m           minute in hour          (Number)           30
s           second in minute        (Number)           55
S           millisecond             (Number)           978
E           day in week             (Text)             Tuesday
D           day in year             (Number)           189
F           day of week in month    (Number)           2 (second Wednesday in July)
w           week in year            (Number)           27
W           week in month           (Number)           2
a           AM/PM marker            (Text)             PM
k           hour in day (1-24)      (Number)           24
K           hour in am/pm (0-11)    (Number)           0
z           time zone               (Text)             Pacific Standard Time
The count of pattern letters determines the format, as follows:

Format           Description
Text             4 or more pattern letters uses the full form. For example, MMMM returns August in full-form text.
Number           The minimum number of digits; shorter numbers are zero-padded. Years are handled so that if the count of y is 2, the year is truncated to 2 digits. For example, MM returns a month as a number, 08 for August.
Text or Number   3 or more pattern letters uses text. For example, MMM returns AUG, the abbreviated text form.
Any characters in the pattern that are not in the ranges ['a'..'z'] and ['A'..'Z'] are treated as quoted text. For instance, characters like ':', '.', ' ', '#' and '@' appear in the resulting time text even if they are not enclosed in single quotes. A pattern containing any invalid pattern letter results in an exception during formatting or parsing. If the am/pm marker 'a' is left out of the format pattern while the "hour in am/pm" pattern symbol is used, information loss may occur when formatting a time in the PM. If the target type is DATE, then the format CYYMMDD (where C represents the first two digits of the year minus 19) may be used. C can be any value between 0 and 9; 0 represents 19, and 9 represents 28. (For example, 0010101 represents 1 Jan 1901.) This format matches the formatting used by the java.util.SimpleDateFormat class.
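The CYYMMDD rule can be made concrete with a short JavaScript sketch (an illustration only, not the product's parser):

```javascript
// Decode a CYYMMDD string: C is (first two digits of the year) - 19,
// so C=0 covers 1900-1999 and C=9 covers 2800-2899.
function parseCYYMMDD(s) {
  var century = 19 + parseInt(s.charAt(0), 10);
  return {
    year:  century * 100 + parseInt(s.substr(1, 2), 10),
    month: parseInt(s.substr(3, 2), 10),
    day:   parseInt(s.substr(5, 2), 10)
  };
}

var d = parseCYYMMDD("0010101"); // the example from the text: 1 Jan 1901
```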
Glossary
data flow
A set of objects in a task that represent and define a single extract-transform-load operation to be performed by DataExchange server (see Tasks for more information). A task may contain multiple data flows (see Using Multiple Data Flows).

data stream
A line that defines a portion of the path of a data flow by connecting two objects. A data stream may connect a source and a reader, a reader and a transformer, one transformer and another, a reader or transformer and a writer, or a writer and a target.

datasource
(1) A database table, flat file, XML file, or JMS stream, used as a source or target by a TIBCO DataExchange task. (2) An object in a DataExchange Designer model that represents such a datasource. (3) An object in a DataExchange Designer task that represents such a datasource. To avoid confusion, in this document we generally refer to (3) as a source (if the task reads the datasource) or a target (if the task writes to the datasource).

derived column
See Pivot Transformer.
Diagram Explorer
A portion of the DataExchange Designer workspace that presents tree views of the project. Datasources are added to tasks by dragging and dropping from the Main tab, task segments by dragging and dropping from the Servers tab. DataExchange servers are registered, and the default DataExchange server set, on the Servers tab.

Diagram Window
A portion of the DataExchange Designer workspace that displays data models, tasks, task segments, and task chains.
DTRowset
See rowset.

repository
A database in a third-party RDBMS used by DataExchange server to store tasks and related data.

rowset
The internal data format of DataExchange server, based on the JDBC rowset. As DataExchange server extracts rows from a datasource, it groups them into rowsets of manageable size. Most transformations are performed one rowset at a time. For those transformations that require the entire dataset, DataExchange server caches all rowsets.

tracking column
See Pivot Transformer.

transformer
An object in a data flow that defines a specific transformation to be performed on the data, such as concatenating columns, eliminating duplicate rows, or sorting (see Transformers).
Index
A
Add 119
Alias
  datasource 65
  individual table 74
  property group 65
Align, see Diagram
Attached DT1, see DT1 file
Auto-Propagation 51
D
Data Block Size property 82
Data Diagram Window 19
  data model tab 19
  task tab 20
Data Dictionary 17
Data flow
  defined 183
  multiple, in a single task 123
Data model
  adding
    by reverse-engineering a database 30
    from another project 28
    from ER/Studio 28
  datasource settings, for reverse-engineered 32
  importing
    from another project 28
    from DT1 file 28
    from ER/Studio 28
    from ERWin 29
    from SQL file 29
  merging data models 33
  metadata
    exporting 35
    importing 35
  overview 9
  reverse-engineering 30
Data stream
  defined 183
  linking objects 117
  list of valid connections 116
  role in tasks 11
B
Bulk reader
  properties 70
Bulk writer
  adding to a task 69
  editing 69
  properties 70
C
cmdlineconsole 5
Column Map transformer
  adding to a task 84
  editing 85
  overview 83
Column Splitter transformer
  adding to a task 87
  editing 88
  overview 87
Command-line utilities
  cmdlineconsole 5
Data types
  date format 180
  mapping
    DB2 173
    Oracle 178
    SQL Server using Avenir JDBC driver 175
    SQL Server using Inet JDBC driver 173
    SQL Server using Weblogic JDBC driver 177
    Sybase ASE using JConnect JDBC driver 178
  time format 180
  timestamp format 180
  unsupported 173
Database
  creating 32
  generating 32
  modifying 33
  updating with changes to a reverse-engineered model 33
DataDictionary 17
Datasource
  alias 65
  database, see Table datasource
  defined 183
  flat-file, see Flat file datasource
  JMS stream, see JMS stream datasource
  replacing, see Source or Target
  table, see Table datasource
  XML file, see XML file datasource
DB2
  data type mapping 173
DDL file
  importing model from 29
Debugging 7
Dependency relationship
  adding 118
  overview 118
  role in tasks 12
Deployed task, see Task, deployed
Diagram
  aligning objects 124
  distributing objects 124
  Explorer (illustration) 184
  layout properties 124
  Window (illustration) 184
DM1
  importing model from 28
DT1 file
  attached to deployed task 12
  importing model from 28
  overview 8
Duplicate Elimination transformer
  adding to a task 90
  editing 91
  overview 89
Duplicate transformer 89
DX/Console
  described 5
DX/Designer
  changing default SQL client 129
  described 4
  logging in to a DX/Server 131
  logging out of a DX/Server 131
  options 129
  refreshing the server configuration 133
  registering a DX/Server with 128
  setting default DX/Server for 129
  unlocking 128
  unregistering a DX/Server 129
  validating license 128
DX/Server
  data type mappings 173
  described 5
  registering with DX/Designer 128
  refreshing configuration 133
  retrieve task chain from 136
  retrieve task from 49
DxUtility 5
E
Edit Web Favorite Entry dialog 24
Enable Auto-Propagation 51
ER/Studio
  importing model from 28
ERWin
  importing model from 29
Explorer Tree 16
  data dictionary tab 17
  data model tab 16
  macro tab 18
  task tab 16
Extract, see Reader 68
J
JDBC reader
  adding to a task 73
  properties 74
JDBC writer
  adding to a task 76
  editing 69
  properties 76
JMS reader
  properties 79
JMS stream datasource
  adding to data model 60
  adding to task as source or target 61
  editing properties 61
JMS writer
  adding to a task 69
  editing 69
  properties 79
Join transformer
  adding to a task 96
  editing 98
  overview 94
F
Flat file datasource
  adding to data model 54
  adding to task as source or target 56
  editing properties 56
  viewing 57
Flat file reader
  properties 72
Flat file writer
  adding to a task 69
  editing 69
  properties 72
G
Generate Database command 32
global variables, setting
  when to use null transformer 79
  when to use temp table 109
  with column map 83
Glossary 183
Group By transformer
  adding to a task 92
  editing 93
  overview 91
L
Layout, see Diagram
Licensing DX/Designer 128
Load, see Writer 68
M
Macros 18
Memory property 82
Metawizard 35
N
Null transformer 79
  adding to a task 69
  editing 69
R
Reader
  bulk, see Bulk reader
  Description property 51
  flat file, see Flat file reader
  JDBC, see JDBC reader
  JMS, see JMS reader
  renaming 51
  role in tasks 11
  XML, see XML reader
Repository
  defined 185
Reverse engineering
  adding a data model 30
  creating a new database from a reverse-engineered data model 32
  propagating changes to data model back to database 33
Row Splitter transformer
  adding to a task 104
  editing 107
  overview 104
Rowset
  defined 185
Runtime properties
  overview 12
O
Oracle
  bulk reader not supported 70
  bulk writer vs. JDBC writer 70
  data type mapping 178
P
Panes
  data diagram window 19
  explorer tree 16
Pivot transformer
  adding to a task 101
  editing 103
  overview 99
Project file
  creating 26
    blank 26
    from template 26
    using existing file as template 26
  overview 8
  retrieving from DX/Server 26
Property group
  alias 65
S
Schedule
  defined 12
Sort transformer
  adding to a task 108
  editing 109
  overview 107
Source
  Description property 51
  flat-file
    adding to task 56
    viewing 57
  JMS stream
    adding to task 61
  renaming 51
  replacing a source with another source 66
  role in tasks 11
  XML file
    adding to task 65
SQL file
  importing model from a DDL file 29
SQL Server
  bulk writer vs. JDBC writer 70
  data type mapping 173
  identity columns 72, 78
  named instances 31
  native mode for bulk writer 70
Submodels 10
support, contacting xii
Sybase ASE
  data type mapping 178
T
Table datasource
  adding to a data model, see Reverse-engineering
  adding to a task as source or target 58
  alias
    datasource 65
    individual table 74
  editing properties 58
Target
  Description property 51
  flat-file
    adding to task 56
    viewing 57
  JMS stream
    adding to task 61
  renaming 51
  replacing a target with another target 66
  role in tasks 11
  XML file
    adding to task 65
Task
  adding
    blank 49
    from a deployed task 49
    from another project 49
  delete 50
  deployed
    defined 12
    deploying for production 7
  overview 11
  rename 50
  retrieve from DX/Server 49
  scheduling 7
  segment, see Task segment
Task chain
  defined 12
  retrieve from DX/Server 136
Task segment
  add to task 122
  create 119
  deploy
    modified 121
    new 119
  list deployed tasks that use a 122
  modify 121
  overview 119
technical support xii
Temp Table transformer
  adding to a task 110
  changing temp file location 110
  overview 109
Transformer
  Column Map, see Column Map transformer
  Column Splitter, see Column Splitter transformer
  defined 185
  Description property 51
  Duplicate Elimination, see Duplicate Elimination transformer
  Duplicate, see Duplicate transformer 89
  Group By, see Group By transformer
  Join, see Join transformer
  list of transformers 81
  list of valid connections 116
  Null, see Null transformer
  overview 81
  Pivot, see Pivot transformer
  renaming 51
  role in tasks 11
  Row Splitter, see Row Splitter transformer
  Sort, see Sort transformer
  Temp Table, see Temp Table transformer
  Union, see Union transformer
  Update/Delete, see Update/Delete transformer
Transformers
  adding 7
  debugging 7
W
Web Favorites Dialog Box 24
Writer
  bulk, see Bulk writer
  Description property 51
  flat file, see Flat-file writer
  JDBC, see JDBC writer
  JMS, see JMS writer
  renaming 51
  role in tasks 11
  XML, see XML writer
X
XML file datasource
  adding to data model 62
  adding to task as source or target 65
  editing properties 64
XML reader
  properties 80
XML writer
  adding to a task 69
  editing 69
  properties 80
U
Union transformer
  adding to a task 110
  changing output column order 111
  overview 110
Unlocking DX/Designer 128
Update/Delete transformer
  adding to a task 113
  editing 114
  overview 111
V
Validating DX/Designer license 128
Version history
  described 12