Ale Ribeiro June 6, 2006
• What is PowerCenter?
• PowerCenter Client Applications
• Demo
• PowerCenter – Designer, Workflow Manager, Workflow Monitor
• PowerCenter Architecture
• Where do we use PowerCenter in IT?
• Q&A
What is PowerCenter?
• A single, unified enterprise data integration platform that allows companies and government organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise at any speed
• An ETL tool (Extract, Transform and Load)
PowerCenter Client Applications
Administration
• Repository Manager: manage repository connections, folders, objects, users and groups
• Administration Console (browser-based): perform domain and repository service tasks
  • Create/configure nodes and repository services
  • Upgrade/delete
  • Start/stop
  • Backup/restore
Development
• Designer: create ETL mappings
• Workflow Manager: create and start workflows
• Workflow Monitor: monitor and control workflows
Designer Tools – Create mappings
• Source Analyzer: create source objects
• Target Designer: create target objects
• Transformation Developer: create reusable transformations
• Mapplet Designer: create mapplets
• Mapping Designer: create mappings
Mapping
Logically defines the ETL process:
• Reads data from sources
• Applies transformation logic to data
• Writes transformed data to targets
Source → Transformations → Target
Note: Sources and targets can be flat files, relational tables, XML files, message queues, application systems, etc.
Mapping (cont’d)
• A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. When the Integration Service runs a session, it uses the instructions configured in the mapping to read, transform, and write data.
• Every mapping must contain the following components:
  ♦ Source definition. Describes the characteristics of a source table or file.
  ♦ Transformation. Modifies data before writing it to targets. Use different transformation objects to perform different functions.
  ♦ Target definition. Defines the target table or file.
  ♦ Links. Connect sources, targets, and transformations so the Integration Service can move the data as it transforms it.
• A mapping can also contain one or more mapplets. A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings.
Example
• Give me an Excel file with Total Order Amount per Customer. I also need to know when this data was extracted (date) and the customer type initial (first letter of the customer type)
• Define the sources
  • Orders
  • Customers
• Define any required transformation
  • Sum of order amount
  • Get extracted date
  • Get first letter of customer type
• Create the file
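To make the logic of this example concrete, here is a minimal sketch of the same request in plain Python, outside PowerCenter. The sample rows, the field names (`customer_id`, `customer_type`, `amount`), and the CSV output standing in for the Excel file are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample data standing in for the Orders and Customers sources.
customers = [
    {"customer_id": 1, "name": "Acme", "customer_type": "Retail"},
    {"customer_id": 2, "name": "Globex", "customer_type": "Wholesale"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 100.0},
    {"order_id": 11, "customer_id": 1, "amount": 50.0},
    {"order_id": 12, "customer_id": 2, "amount": 75.5},
]


def build_report(customers, orders, extracted_on):
    """Return one row per customer with the three derived columns."""
    # Sum of order amount per customer (the aggregation step).
    totals = defaultdict(float)
    for order in orders:
        totals[order["customer_id"]] += order["amount"]

    rows = []
    for cust in customers:
        rows.append({
            "customer": cust["name"],
            "total_order_amount": totals[cust["customer_id"]],
            # Date on which the data was extracted.
            "extracted_date": extracted_on.isoformat(),
            # First letter of the customer type.
            "customer_type_initial": cust["customer_type"][:1],
        })
    return rows


def write_csv(rows):
    """Write the rows as CSV text (a stand-in for the Excel target)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


report = build_report(customers, orders, date.today())
print(write_csv(report))
```

In a mapping, each of these steps would roughly correspond to a separate object: two source definitions, transformations for the sum, the date and the initial, and a file target.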
Transformations
• Generate, modify, or pass data
• Data passes into and out of transformations through ports that you link in a mapping
• Passive transformations do not change the number of rows received
• Active transformations can change the number of rows received
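The passive/active distinction can be illustrated with a small Python sketch. The function names, the sample rows, and the 10% rate are hypothetical; the point is only that the first function returns one row per input row while the second may return fewer.

```python
def expression(rows):
    """Passive: one output row for every input row (row-level calculation)."""
    # The 1.1 multiplier is an arbitrary, assumed calculation.
    return [{**row, "amount_with_tax": round(row["amount"] * 1.1, 2)} for row in rows]


def filter_rows(rows, condition):
    """Active: may drop rows, so the output row count can differ from the input."""
    return [row for row in rows if condition(row)]


rows = [{"amount": 10.0}, {"amount": 250.0}, {"amount": 40.0}]
passed = expression(rows)
kept = filter_rows(rows, lambda row: row["amount"] >= 40.0)
print(len(rows), len(passed), len(kept))
```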
PowerCenter Transformations (partial list)
• Source Qualifier: reads data from flat file and relational sources
• Expression: performs row-level calculations
• Filter: drops rows conditionally
• Sorter: sorts data
• Aggregator: performs aggregate calculations
• Joiner: joins heterogeneous sources
• Lookup: looks up values and passes them to other objects
• Update Strategy: tags rows for insert, update, delete, reject
• Router: routes rows conditionally
• Transaction Control: allows data-driven commits and rollbacks
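As a rough illustration of conditional routing, the sketch below sends each row to every group whose condition it meets and collects unmatched rows in a default group. This is a generic Python analogy, not PowerCenter's implementation; the group names, the `DEFAULT` label, and the sample rows are assumptions.

```python
def route(rows, groups):
    """Send each row to every group whose condition it meets.

    Rows that meet no condition go to a "DEFAULT" group (an assumed name).
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


rows = [
    {"region": "EU", "amount": 5},
    {"region": "US", "amount": 500},
    {"region": "APAC", "amount": 20},
]
groups = {
    "EUROPE": lambda row: row["region"] == "EU",
    "LARGE": lambda row: row["amount"] > 100,
}
result = route(rows, groups)
```

Unlike a filter, nothing is dropped here: every input row ends up in at least one output group.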
Advanced PowerCenter Transformations
• Union: performs a union-all join between two data streams
• Java: allows Java syntax to be used within PowerCenter
• Midstream XML Parser: reads XML from anywhere in mapping
• Midstream XML Generator: writes XML to anywhere
• More Source Qualifiers: read from XML, message queues and applications
Mapplet – A reusable set of transformations
• Mapplet Input and Output transformations pass data from or to the mapping
• Created with the Mapplet Designer tool
Example: Data Sources Defined Outside Mapplet
[Figure: a mapping in which the source data is defined outside the mapplet; the mapplet contains a Mapplet Input transformation and a Mapplet Output transformation]
Recap
1. ETL
2. Designer
3. Mapping
4. Transformation
5. Mapplet
a. Extract, transform and load data
b. Create mapping objects
c. Logically defines the ETL process
d. Generates or manipulates data
e. Set of transformations that can be reused in multiple mappings
Workflow Manager Tools – Create and Start Workflow
• Create reusable tasks
• Create worklets
• Create workflows
Task
• An executable set of actions, functions or commands
• Examples:
  • Session task runs a mapping
  • Command task runs a shell script
  • Email task sends an email
  • Decision task branches workflow conditionally
  • Timer task waits for a specified period
Session
• Task that executes a mapping
• Define Log Options, Error handling, Connections
Decision Task
• Tests for a condition during the workflow and sets a flag based on the condition
• Use a link condition (or a Control task) downstream to test the flag and control execution flow
• Can use workflow variables in condition
• Options on all tasks to fail parent and disable
• Treat inputs as AND/OR
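The flag-then-branch pattern described for the Decision task might be sketched in Python as follows. The task names (`s_load_orders`, `s_build_report`, `email_failure_notice`) and the status strings are hypothetical, and the two `if` branches stand in for link conditions that test the flag.

```python
def decision_task(workflow_vars):
    """Evaluate a condition and return the flag a Decision task would set."""
    # Hypothetical condition: both upstream sessions must have succeeded.
    return (workflow_vars["s_load_orders"] == "SUCCEEDED"
            and workflow_vars["s_load_customers"] == "SUCCEEDED")


def run_after_decision(workflow_vars):
    """Follow one of two links depending on the decision flag."""
    flag = decision_task(workflow_vars)
    executed = []
    if flag:
        # Link condition: flag is true, so continue with the next session.
        executed.append("s_build_report")
    else:
        # Link condition: flag is false, so notify instead.
        executed.append("email_failure_notice")
    return executed


print(run_after_decision({"s_load_orders": "SUCCEEDED", "s_load_customers": "FAILED"}))
```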
Email Task
• Sends an email within a workflow
• Note: emails can also be sent post-session in a Session task
• Can be used with a link condition to notify success or failure of prior tasks
Event Wait Task
• Pauses processing of the pipeline until a specified event occurs
• Events can be:
  • Pre-defined – file watch
  • User-defined – created by an Event Raise task elsewhere in the workflow
Event Wait Task (cont’d)
• Events tab: specify either a pre-defined or user-defined event
• User-defined events must be declared in the workflow Events tab
Event Raise Task
• Sets the location of a user-defined event in the workflow
• User-defined events are triggered when the PowerCenter Server executes the Event Raise Task
• User-defined events must be declared in the workflow Events tab
• Used with the Event Wait Task
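The interplay between Event Wait and Event Raise resembles ordinary thread signalling. The sketch below uses Python's `threading.Event` as an analogy only; the event name `orders_loaded` and the two functions are hypothetical, and the dictionary of declared events loosely mirrors the requirement to declare user-defined events up front.

```python
import threading

# Events declared up front, loosely mirroring the workflow Events tab.
declared_events = {"orders_loaded": threading.Event()}
log = []


def load_orders():
    """Does its work, then raises the user-defined event."""
    log.append("load_orders finished")
    declared_events["orders_loaded"].set()  # the "Event Raise" step


def wait_then_report():
    """Pauses until the event occurs, then continues."""
    # The "Event Wait" step; the timeout only keeps the sketch from hanging.
    if declared_events["orders_loaded"].wait(timeout=5):
        log.append("report started")


waiter = threading.Thread(target=wait_then_report)
raiser = threading.Thread(target=load_orders)
waiter.start()
raiser.start()
waiter.join()
raiser.join()
print(log)
```

The waiting branch is started first here on purpose: it still runs second, because it cannot proceed until the event is raised.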
Command Task
• Specifies one or more UNIX commands or shell scripts, DOS commands or batch files for the Integration Service to run during a workflow
• Note: UNIX and DOS commands can also be run pre- or post-session in a Session task
• Command task status (success or failure) is held in the task-specific variable $command_task_name.STATUS
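A rough Python analogy for a Command task is a function that runs a list of commands and records a per-task status that later steps can test. The task names, the `SUCCEEDED`/`FAILED` strings, and the choice to stop at the first failing command are assumptions of this sketch; the Python interpreter is used as a portable stand-in for shell commands.

```python
import subprocess
import sys


def run_command_task(commands):
    """Run commands in order; stop at the first failure and report a status."""
    for command in commands:
        completed = subprocess.run(command)
        if completed.returncode != 0:
            return "FAILED"
    return "SUCCEEDED"


# Portable stand-ins for shell commands: the Python interpreter itself.
ok = [sys.executable, "-c", "print('archiving files')"]
bad = [sys.executable, "-c", "import sys; sys.exit(1)"]

# Status per task, playing the role of a task-specific status variable.
status = {
    "cmd_archive": run_command_task([ok]),
    "cmd_cleanup": run_command_task([ok, bad]),
}
print(status)
```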
Command Task (cont’d)
[Screenshot: Add Cmd and Remove Cmd buttons]
Reusable Tasks
• Session, Email and Command tasks can be reusable
• Use the Task Developer to create reusable tasks
• Reusable tasks appear in the Navigator Tasks node and can be dragged and dropped into any workflow
In a workflow, a reusable task is indicated by a special symbol
Worklet
• An object representing a set or grouping of Tasks
• Can contain any Task available in the Workflow Manager
• Worklets expand and execute inside a Workflow
• A Workflow which contains a Worklet is called the “parent Workflow”
• Worklets CAN be nested
• Reusable Worklets – create in the Worklet Designer
• Non-reusable Worklets – create in the Workflow Designer
Workflow
• A collection of ordered tasks
• Tasks can be linked sequentially, concurrently and/or combined
• Links can be conditional on previous tasks completing
Workflow Structure
• Workflow 1
  • Session 1
  • Worklet A
    • Session A1
    • Session A2
    • Session A3
  • Worklet B
    • Session B1
    • Session B2
    • Worklet C
      • Session C1
      • Session C2
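The nesting on this slide can be modelled as a small tree. The sketch below represents each node as a `(name, children)` tuple and lists the sessions by walking the tree depth-first; the representation and the strictly sequential, depth-first order are assumptions, since the slide does not preserve the actual run order.

```python
# Each node is (name, children); sessions are leaves with no children.
workflow = ("Workflow 1", [
    ("Session 1", []),
    ("Worklet A", [("Session A1", []), ("Session A2", []), ("Session A3", [])]),
    ("Worklet B", [
        ("Session B1", []),
        ("Session B2", []),
        ("Worklet C", [("Session C1", []), ("Session C2", [])]),
    ]),
])


def sessions_in_order(node):
    """Expand worklets recursively and return the leaf sessions in order."""
    name, children = node
    if not children:
        return [name]
    result = []
    for child in children:
        result.extend(sessions_in_order(child))
    return result


print(sessions_in_order(workflow))
```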
Workflow Schedule
• Workflow can be scheduled to run continuously, repeat at a given time or interval, or start manually.
• The Integration Service runs a workflow unless the prior workflow run fails.
• When a workflow fails, the Integration Service removes the workflow from the schedule, and you must reschedule it
Workflow Monitor
• Check Workflow Status
• Recover Workflow
• Get session log
Recap
1. Workflow
2. Worklet
3. Task
4. Workflow Manager
5. Workflow Monitor
a. A collection of ordered tasks
b. Set of tasks
c. An executable mapping, functions or commands
d. Create and start workflows
e. Monitor and control workflows
PowerCenter Architecture
[Diagram components: Domain, Sources, Integration Service, Targets, Repository Service, Repository Service Process, Administration Console, PowerCenter Client, Repository]
Architecture – Components
• Domain is a collection of nodes and services. Primary unit of administration.
• The Repository Service manages connections to the PowerCenter repository from client applications. The Repository Service is a separate, multi-threaded process that retrieves, inserts, and updates metadata in the repository database tables. The Repository Service ensures the consistency of metadata in the repository.
• The PowerCenter repository resides in a relational database. The repository database tables contain the instructions required to extract, transform, and load data. PowerCenter Client applications access the repository database tables through the Repository Service.
• The Administration Console is a web application that you use to manage a PowerCenter domain. If you have a user login to the domain, you can access the Administration Console. Use the Administration Console to perform administrative tasks such as managing logs, user accounts, and domain objects. Domain objects include services, nodes, and licenses.
• The Integration Service reads mapping and session information from the repository. It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules that you configure in the mapping. The Integration Service loads the transformed data into the mapping targets.
Metadata
• Defines data and processes
• Examples:
  • Source and target definitions
    • Type (flat file, database table, XML file, etc.)
    • Datatype (character string, integer, decimal, etc.)
    • Other attributes (length, precision, etc.)
  • Mapping logic
  • Workflow logic
• Stored in a metadata repository
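One way to picture this kind of metadata is as plain data structures that describe a source without holding any of its rows. The sketch below is a hypothetical Python model; the class names, field names, and the `ORDERS` example are assumptions and do not reflect the actual repository schema.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    datatype: str       # e.g. "string", "integer", "decimal"
    length: int = 0     # other attributes
    precision: int = 0


@dataclass
class SourceDefinition:
    name: str
    source_type: str    # e.g. "flat file", "database table", "XML file"
    columns: list = field(default_factory=list)


# Metadata describes the ORDERS source; it contains no order data itself.
orders_source = SourceDefinition(
    name="ORDERS",
    source_type="database table",
    columns=[
        Column("ORDER_ID", "integer", precision=10),
        Column("AMOUNT", "decimal", precision=12),
    ],
)
print([column.name for column in orders_source.columns])
```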
Recap
Match the terms and explanations:
1. Metadata
2. Repository
3. Repository Manager
4. Integration Service
a. Defines data and processes
b. Collection of tables that contains PowerCenter metadata
c. Repository organization and security
d. ETL processing engine
Where do we use PowerCenter?
• Data Warehouse (SalesVision) and Data Mart (Horizon) Loads
• Customer Hub Load
• Interfaces:
  • PowerCafe Orders Peoplesoft
  • Magic Leads PowerCafe
  • Customer Portal Online Support Access Atlas
  • ADS Sales Rep Accounts SalesPortal LDAP
PowerCenter Connect Options
• Packaged Applications and Systems: Hyperion Essbase, Lotus Notes, PeopleSoft, SAP Netweaver BW, SAS, Siebel
• Databases and Flat Files: DB2, Flat files, Informix, Netezza, SQL Server, Sybase, Teradata, Web logs
• Messaging and Standards: HTTP, IBM MQSeries, JMS, LDAP, MSMQ, ODBC, TIBCO Rendezvous, webMethods, Web Services, XML
• Hierarchical*: Adabas, C-ISAM, Complex flat files, Datacom, IDMS, IMS, VSAM
• Software as a Service (SaaS): salesforce.com
Questions?