Introduction

ETL Life Cycle
The typical real-life ETL cycle consists of the following execution steps:
1. Cycle initiation
2. Build reference data
3. Extract (from sources)
4. Validate
5. Transform (clean, apply business rules, check for data integrity, create aggregates or disaggregates)
6. Stage (load into staging tables, if used)
7. Audit reports (for example, on compliance with business rules; in case of failure, these also help to diagnose and repair)
8. Publish (to target tables)
9. Archive
10. Clean up

Best practices

Four-layered approach for ETL architecture design
• Functional layer: Core functional ETL processing (extract, transform, and load).
• Operational management layer: Job-stream definition and management, parameters, scheduling, monitoring, communication and alerting.
• Audit, balance and control (ABC) layer: Job-execution statistics, balancing and controls, rejects- and error-handling, codes management.
• Utility layer: Common components supporting all other layers.

Use file-based ETL processing where possible
• Storage costs relatively little.
• Intermediate files serve multiple purposes: they are used for testing and debugging, for restart and recovery processing, and to calculate control statistics.
• Helps to reduce dependencies and enables modular programming.
• Allows flexibility for job execution and scheduling.
• Better performance if coded properly, and can take advantage of parallel processing capabilities when the need arises.

Use data-driven methods and minimize custom ETL coding
• Parameter-driven jobs, functions, and job control.
• Code definitions and mappings kept in the database.
• Consideration for data-driven tables to support more complex code mappings and business-rule application (a small SQL sketch follows this section).

Qualities of a good ETL architecture design
• Performance
• Scalable
• Migratable
• Recoverable (run_id, ...)
• Operable (completion codes for phases, re-running from checkpoints, etc.)
• Auditable (in two dimensions: business requirements and technical troubleshooting)

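As a small illustration of the data-driven code-mapping practice above, the sketch below keeps source-to-target code translations in a reference table instead of hard-coding them in ETL logic. The table and column names (CODE_MAP, STG_ORDERS, and so on) are hypothetical and only for illustration:

-- Reference table holding code mappings maintained alongside the ETL, not inside it
CREATE TABLE CODE_MAP (
    SRC_SYSTEM   VARCHAR(30) NOT NULL,   -- originating system
    SRC_CODE     VARCHAR(30) NOT NULL,   -- code as delivered by the source
    TARGET_CODE  VARCHAR(30) NOT NULL,   -- conformed warehouse code
    PRIMARY KEY (SRC_SYSTEM, SRC_CODE)
);

-- Applying the mapping during the transform step; unmapped codes fall back to 'UNK'
SELECT s.ORDER_ID,
       COALESCE(m.TARGET_CODE, 'UNK') AS ORDER_STATUS
FROM   STG_ORDERS s
LEFT JOIN CODE_MAP m
       ON  m.SRC_SYSTEM = s.SRC_SYSTEM
       AND m.SRC_CODE   = s.ORDER_STATUS_CODE;

Adding a new code then becomes a row insert rather than a code change, which is the point of the practice.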
What is Informatica
Informatica Power Center is a powerful ETL tool from Informatica Corporation. Informatica Corporation products are:
• Informatica Power Center
• Informatica On Demand
• Informatica B2B Data Exchange
• Informatica Data Quality
• Informatica Data Explorer

Informatica Power Center is a single, unified enterprise data integration platform for accessing, discovering, and integrating data from virtually any business system, in any format, and delivering that data throughout the enterprise at any speed.

Informatica Power Center Editions

Because every data integration project is different and includes many variables, such as data volumes, latency requirements, IT infrastructure, and methodologies, Informatica offers three Power Center Editions and a suite of Power Center Options to meet your project's and organization's specific needs:
• Standard Edition
• Real Time Edition
• Advanced Edition

Informatica Power Center Standard Edition
Power Center Standard Edition is a single, unified enterprise data integration platform for discovering, accessing, and integrating data from virtually any business system, in any format, and delivering that data throughout the enterprise to improve operational efficiency. Key features include:
• A high-performance data integration server
• A global metadata infrastructure
• Visual tools for development and centralized administration
• Productivity tools to facilitate collaboration among architects, analysts, and developers

Informatica Power Center Real Time Edition
Packaged for simplicity and flexibility, Power Center Real Time Edition extends Power Center Standard Edition with additional capabilities for integrating and provisioning transactional or operational data in real time. Power Center Real Time Edition provides the ideal platform for developing sophisticated data services and delivering timely information as a service, to support all business needs. It is the real-time data integration complement to service-oriented architectures and application integration approaches such as enterprise application integration (EAI), enterprise service buses (ESB), and business process management (BPM). Key features include:
• Change data capture for relational data sources
• Integration with messaging systems
• Built-in support for Web services
• Dynamic partitioning with data smart parallelism
• Process orchestration and human workflow capabilities

Informatica Power Center Advanced Edition
Power Center Advanced Edition addresses the requirements of organizations that are standardizing data integration at an enterprise level, across a number of projects and departments. It combines all the capabilities of Power Center Standard Edition with additional capabilities that are ideal for data governance and Integration Competency Centers. Key features include:
• Dynamic partitioning with data smart parallelism
• Powerful metadata analysis capabilities
• Web-based data profiling and reporting capabilities

Power Center includes the following components:
• Power Center domain
• Administration Console
• Power Center repository
• Power Center Client
• Repository Service
• Integration Service
• Web Services Hub
• SAP BW Service
• Data Analyzer
• Metadata Manager
• Power Center Repository Reports

POWERCENTER CLIENT
The Power Center Client consists of the following applications that we use to manage the repository, design mappings and mapplets, and create sessions to load the data:
1. Designer
2. Data Stencil
3. Repository Manager
4. Workflow Manager
5. Workflow Monitor

1. Designer

Use the Designer to create mappings that contain transformation instructions for the Integration Service. The Designer has the following tools that you use to analyze sources, design target schemas, and build source-to-target mappings:
• Source Analyzer: Import or create source definitions.
• Target Designer: Import or create target definitions.
• Transformation Developer: Develop transformations to use in mappings.
• Mapplet Designer: Create sets of transformations to use in mappings.
• Mapping Designer: Create mappings that the Integration Service uses to extract, transform, and load data.

You can also develop user-defined functions to use in expressions.

2. Data Stencil
Use the Data Stencil to create mapping templates that can be used to generate multiple mappings. Data Stencil uses the Microsoft Office Visio interface to create mapping templates. It is not usually used by a developer.

3. Repository Manager
Use the Repository Manager to administer repositories. You can navigate through multiple folders and repositories and complete the following tasks:
• Manage users and groups: Create, edit, and delete repository users and user groups. We can assign and revoke repository privileges and folder permissions.
• Perform folder functions: Create, edit, copy, and delete folders. Work we perform in the Designer and Workflow Manager is stored in folders. If we want to share metadata, you can configure a folder to be shared.
• View metadata: Analyze sources, targets, mappings, and shortcut dependencies, search by keyword, and view the properties of repository objects.

We create repository objects using the Designer and Workflow Manager client tools. We can view the following objects in the Navigator window of the Repository Manager:
• Source definitions: Definitions of database objects (tables, views, synonyms) or files that provide source data.
• Target definitions: Definitions of database objects or files that contain the target data.
• Mappings: A set of source and target definitions along with transformations containing the business logic that you build into the transformation. These are the instructions that the Integration Service uses to transform and move data.
• Reusable transformations: Transformations that we use in multiple mappings.
• Mapplets: A set of transformations that you use in multiple mappings.
• Sessions and workflows: Sessions and workflows store information about how and when the Integration Service moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

4. Workflow Manager
Use the Workflow Manager to create, schedule, and run workflows. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. The Workflow Manager has the following tools to help us develop a workflow:
• Task Developer: Create the tasks we want to accomplish in the workflow.
• Worklet Designer: Create a worklet in the Worklet Designer. A worklet is an object that groups a set of tasks. A worklet is similar to a workflow, but without scheduling information. We can nest worklets inside a workflow.
• Workflow Designer: Create a workflow by connecting tasks with links in the Workflow Designer. You can also create tasks in the Workflow Designer as you develop the workflow.

When we create a workflow in the Workflow Designer, we add tasks to the workflow. The Workflow Manager includes tasks such as the Session task, the Command task, and the Email task so you can design a workflow. The Session task is based on a mapping we build in the Designer. We then connect tasks with links to specify the order of execution for the tasks we created. Use conditional links and workflow variables to create branches in the workflow.

5. Workflow Monitor
Use the Workflow Monitor to monitor scheduled and running workflows for each Integration Service. We can view details about a workflow or task in Gantt Chart view or Task view. We can run, stop, abort, and resume workflows from the Workflow Monitor. We can view session and workflow log events in the Workflow Monitor Log Viewer. The Workflow Monitor displays workflows that have run at least once. The Workflow Monitor continuously receives information from the Integration Service and Repository Service. It also fetches information from the repository to display historic information.

Services Behind the Scenes

INTEGRATION SERVICE PROCESS
The Integration Service starts an Integration Service process to run and monitor workflows. The Integration Service process accepts requests from the Power Center Client and from pmcmd. It performs the following tasks:
• Manages workflow scheduling.
• Locks and reads the workflow.
• Reads the parameter file.
• Creates the workflow log.
• Runs workflow tasks and evaluates the conditional links connecting tasks.
• Starts the DTM process or processes to run the session.
• Writes historical run information to the repository.
• Sends post-session email in the event of a DTM failure.

LOAD BALANCER
The Load Balancer is a component of the Integration Service that dispatches tasks to achieve optimal performance and scalability. When we run a workflow, the Load Balancer dispatches the Session, Command, and predefined Event-Wait tasks within the workflow. The Load Balancer dispatches tasks in the order it receives them. When the Load Balancer needs to dispatch more Session and Command tasks than the Integration Service can run, it places the tasks it cannot run in a queue. When nodes become available, the Load Balancer dispatches tasks from the queue in the order determined by the workflow service level.

DTM PROCESS
When the workflow reaches a session, the Integration Service process starts the DTM process. The DTM is the process associated with the session task. The DTM process performs the following tasks:
• Retrieves and validates session information from the repository.
• Performs pushdown optimization when the session is configured for pushdown optimization.
• Adds partitions to the session when the session is configured for dynamic partitioning.
• Expands the service process variables, session parameters, and mapping variables and parameters.
• Creates the session log.
• Validates source and target code pages.
• Verifies connection object permissions.
• Runs pre-session shell commands, stored procedures, and SQL (see the sketch after this list).
• Sends a request to start worker DTM processes on other nodes when the session is configured to run on a grid.
• Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
• Runs post-session stored procedures, SQL, and shell commands.
• Sends post-session email.

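The pre- and post-session SQL mentioned above is ordinary SQL run against the source or target connection before or after the session. As a hedged illustration only (the staging table and index names are hypothetical, not defined anywhere in this document), a pre-session command often prepares the target area and a post-session command restores it:

-- Pre-session SQL: clear the staging table so the load starts from empty
TRUNCATE TABLE STG_SALES_DAILY;

-- Post-session SQL: rebuild an index that went stale during the bulk load (Oracle syntax;
-- other databases use different statements)
ALTER INDEX IDX_SALES_DAILY_DT REBUILD;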
PROCESSING THREADS
The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. The default memory allocation is 12,000,000 bytes. The DTM uses multiple threads to process data in a session. The main DTM thread is called the master thread. The master thread can create the following types of threads:
• Mapping Threads: One mapping thread for each session.
• Pre- and Post-Session Threads: One thread created.
• Reader Threads: One thread for each partition.
• Transformation Threads: One thread for each partition.
• Writer Threads: One thread for each partition.

CODE PAGES AND DATA MOVEMENT
A code page contains the encoding to specify characters in a set of one or more languages. An encoding is the assignment of a number to a character in the character set. The Integration Service can move data in either ASCII or Unicode data movement mode. These modes determine how the Integration Service handles character data. We choose the data movement mode in the Integration Service configuration settings. If we want to move multibyte data, choose Unicode data movement mode.
• ASCII Data Movement Mode: In ASCII mode, the Integration Service recognizes 7-bit ASCII and EBCDIC characters and stores each character in a single byte.
• Unicode Data Movement Mode: Use Unicode data movement mode when sources or targets use 8-bit or multibyte character sets and contain character data.

Try Your Hands on the Admin Console

Repository Manager Tasks:
• Add domain connection information
• Add and connect to a repository
• Work with Power Center domain and repository connections
• Search for repository objects or keywords
• View object dependencies
• Compare repository objects
• Truncate session and workflow log entries
• View user connections
• Release locks
• Exchange metadata with other business intelligence tools

Add a repository to the Navigator, and then configure the domain connection information when we connect to the repository.

1. Adding a Repository to the Navigator
1. In any of the Power Center Client tools, click Repository > Add.
2. Enter the name of the repository and a valid repository user name.
3. Click OK.
Before we can connect to the repository for the first time, we must configure the connection information for the domain that the repository belongs to.

2. Configuring a Domain Connection
1. In a Power Center Client tool, select the Repositories node in the Navigator.
2. Click Repository > Configure Domains to open the Configure Domains dialog box.
3. Click the Add button. The Add Domain dialog box appears.
4. Enter the domain name, gateway host name, and gateway port number.
5. Click OK to add the domain connection.

3. Connecting to a Repository
1. Launch a Power Center Client tool.
2. Select the repository in the Navigator and click Repository > Connect, or double-click the repository.
3. Enter a valid repository user name and password.
4. Click Connect. To add, change, or view domain information, click the More button.

4. Viewing Object Dependencies
Before we change or delete repository objects, we can view dependencies to see the impact on other objects. For example, before you remove a session, you can find out which workflows use the session. We can view dependencies for repository objects in the Repository Manager, Workflow Manager, and Designer tools.
Steps:
1. Connect to the repository.
2. Select the object in the Navigator.
3. Click Analyze and select the dependency we want to view.

5. Validating Multiple Objects
We can validate multiple objects in the repository without fetching them into the workspace. We can validate sessions, mappings, mapplets, workflows, and worklets. We can save and optionally check in objects that change from invalid to valid status as a result of the validation.
Steps:
1. Select the objects you want to validate.
2. Click Analyze and select Validate.
3. Select validation options from the Validate Objects dialog box.
4. Click Validate.
5. Click a link to view the objects in the results group.

6. Comparing Repository Objects
We can compare two repository objects of the same type to identify differences between the objects. For example, we can compare two sessions to check for differences. When we compare two objects, the Repository Manager displays their attributes.
Steps:
1. In the Repository Manager, connect to the repository.
2. In the Navigator, select the objects you want to compare.
3. Click Edit > Compare Objects.
4. Click Compare in the dialog box displayed.

7. Truncating Workflow and Session Log Entries
When we configure a session or workflow to archive session logs or workflow logs, the Integration Service saves those logs in local directories. The repository also creates an entry for each saved workflow log and session log. If we move or delete a session log or workflow log from the workflow log directory or session log directory, we can remove the entries from the repository.
Steps:
1. In the Repository Manager, select the workflow in the Navigator window or in the Main window.
2. Choose Edit > Truncate Log. The Truncate Workflow Log dialog box appears.
3. Choose to delete all workflow and session log entries, or to delete all workflow and session log entries with an end time before a particular date.
4. If you want to delete all entries older than a certain date, enter the date and time.
5. Click OK.

8. Managing User Connections and Locks
In the Repository Manager, we can view and manage the following items:
• Repository object locks: The repository locks repository objects and folders by user. The repository creates different types of locks depending on the task. The Repository Service locks and unlocks all objects in the repository.
• User connections: Use the Repository Manager to monitor user connections to the repository. We can end connections when necessary.
Types of locks created:
1. In-use lock: Placed on objects we want to view.
2. Write-intent lock: Placed on objects we want to modify.
3. Execute lock: Locks objects we want to run, such as workflows and sessions.
Steps:
1. Launch the Repository Manager and connect to the repository.
2. Click Edit > Show User Connections or Show Locks.
3. The locks or user connections will be displayed in a window.

9. Managing Users and Groups
Steps:
1. In the Repository Manager, connect to a repository.
2. Click Security > Manage Users and Privileges.
3. Select the options available to add, edit, and remove users and groups.
4. Click the Groups tab to create groups.
5. Click the Users tab to create users.
6. Click the Privileges tab to give permissions to groups and users.
There are two default repository user groups:
• Administrators: This group initially contains two users that are created by default. The default users are Administrator and the database user that created the repository. We cannot delete these users from the repository or remove them from the Administrators group.
• Public: The Repository Manager does not create any default users in the Public group.

10. Working with Folders
We can create, edit, or delete folders as per our need.
Steps:
1. In the Repository Manager, connect to a repository.
2. Click Folder > Create.
3. Enter the required folder information.
4. Click OK. We can then edit or delete the folder as per our need.

Difference Between 7.1 and 8.6
1. Target from Transformation: In Informatica 8.x we can create a target from a transformation by dragging the transformation into the Target Designer.
2. Pushdown optimization: Increases performance by pushing transformation logic to the database, analyzing the transformations and issuing SQL statements to sources and targets. The Integration Service only processes transformation logic that it cannot push to the database (see the sketch after this list).
3. New functions in the expression editor: New functions have been introduced in Informatica 8.x, such as reg_extract and reg_match.
4. Repository queries are available in both versioned and non-versioned repositories; previously they were available only for versioned repositories.
5. UDF (user-defined function), similar to a macro in Excel.
6. FTP: We can have partitioned FTP targets and indirect FTP file sources (with a file list).
7. Propagating port descriptions: In Informatica 8 we can edit a port description and propagate the description to other transformations in the mapping.
8. Environment SQL enhancements: Environment SQL can still be used to execute an SQL statement at the start of a connection to the database. We can use SQL commands that depend upon a transaction being opened during the entire read or write process. For example, the following SQL command modifies the session's date format handling: Alter session set NLS_DATE_FORMAT='DD/MM/YYYY';
9. Concurrently write to multiple files in a session with partitioned targets.
10. Flat file enhancements:
• Reduced conversion of data types
• Delimited file performance has improved
• Flat files can now have integer and double data types
• Data can be appended to existing flat files

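With pushdown optimization (item 2 above), the Integration Service translates as much transformation logic as it can into SQL and runs it in the source or target database instead of in the DTM. The statement below is only a hedged illustration of the idea, not output captured from Power Center, and the table and column names are hypothetical:

-- A filter plus an expression transformation pushed to the target database
-- as a single INSERT ... SELECT instead of row-by-row processing in the DTM
INSERT INTO T_ORDERS_FACT (ORDER_ID, ORDER_AMT, ORDER_STATUS)
SELECT o.ORDER_ID,
       o.QTY * o.UNIT_PRICE                                   AS ORDER_AMT,    -- expression logic
       CASE WHEN o.STATUS = 'C' THEN 'CLOSED' ELSE 'OPEN' END AS ORDER_STATUS  -- decode logic
FROM   SRC_ORDERS o
WHERE  o.ORDER_DATE >= DATE '2008-01-01';                                      -- filter logic

Transformation logic with no SQL equivalent stays in the Integration Service, which is what item 2 means by processing only the logic it cannot push to the database.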
Informatica Power Center 8 has the following features, which make it more powerful and easier to use and manage when compared to previous versions:
• Supports service-oriented architecture
• Access to structured, unstructured and semi-structured data
• Support for grid computing
• High availability
• Pushdown optimization
• Dynamic partitioning
• Metadata exchange enhancements
• Team-based development
• Global web-based Admin console
• New transformations
• 23 new functions
• User-defined functions
• Custom transformation enhancements
• Flat file enhancements
• New Data Federation option
• Enterprise grid

Testing

Unit Testing
Unit testing can be broadly classified into 2 categories.

Quantitative Testing
Validate your source and target:
a) Ensure that your connectors are configured properly.
b) If you are using flat files, make sure you have enough read/write permission on the file share.
c) You need to document all the connector information.
Analyze the load time:
a) Execute the session and review the session statistics.
b) Check the Read and Write counters to see how long it takes to perform the load.
c) Use the session and workflow logs to capture the load statistics.
d) You need to document all the load timing information.
Analyze the success rows and rejections:
a) Have customized SQL queries to check the sources and targets; here we will perform the Record Count Verification (a SQL sketch follows the Integration Testing list below).
b) Analyze the rejections and build a process to handle those rejections. This requires a clear business requirement from the business on how to handle the data rejections. Do we need to reload, or reject and inform, etc.? Discussions are required and an appropriate process must be developed.
Performance improvement:
a) Network performance
b) Session performance
c) Database performance
d) Analyze and, if required, define the Informatica and database partitioning requirements.

Qualitative Testing
Analyze and validate your transformation business rules. This is more of a functional testing exercise.
e) You need to review field by field from source to target and ensure that the required transformation logic is applied.
f) If you are making changes to existing mappings, make use of the data lineage feature available with Informatica Power Center. This will help you to find the consequences of altering or deleting a port from an existing mapping.
g) Ensure that appropriate dimension lookups have been used and your development is in sync with your business requirements.

Integration Testing
After unit testing is complete, it should form the basis of starting integration testing. Integration testing should test out initial and incremental loading of the data warehouse. Integration testing will involve the following:
1. Sequence of ETL jobs in batch.
2. Initial loading of records on the data warehouse.
3. Incremental loading of records at a later date to verify the newly inserted or updated data.
4. Testing the rejected records that don't fulfill transformation rules.
5. Error log generation.

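The Record Count Verification mentioned under Quantitative Testing (and used again as the first integration check below) is usually the first reconciliation run after a load. A hedged sketch, assuming a staging table STG_CUSTOMER feeding a warehouse table DW_CUSTOMER (both names hypothetical):

-- Source vs. target row counts for the current load date
SELECT 'SOURCE' AS side, COUNT(*) AS row_cnt
FROM   STG_CUSTOMER
WHERE  LOAD_DATE = DATE '2008-08-01'
UNION ALL
SELECT 'TARGET', COUNT(*)
FROM   DW_CUSTOMER
WHERE  LOAD_DATE = DATE '2008-08-01';

-- Rows present in the source but missing from the target (should return no rows)
SELECT s.CUSTOMER_ID
FROM   STG_CUSTOMER s
LEFT JOIN DW_CUSTOMER t ON t.CUSTOMER_ID = s.CUSTOMER_ID
WHERE  t.CUSTOMER_ID IS NULL;

A mismatch in either query points to rejected rows or filter logic that needs the rejection-handling analysis described above.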
Integration Testing would cover end-to-end testing for the DWH. The coverage of the tests would include the below:

Count Validation
Record Count Verification: DWH backend/reporting queries against source and target as an initial check.

Dimensional Analysis
Data integrity between the various source tables and relationships.

Statistical Analysis
Validation for various calculations:
• Hash totals: A technique for improving data accuracy, whereby totals are obtained on identifier fields (i.e., fields for which it would logically be meaningless to construct a total), such as account number, social security number, part number, or employee number. These totals have no significance other than for internal system control purposes.
• Control totals: To ensure accuracy in data entry and processing, control totals can be compared by the system with manually entered or otherwise calculated control totals, using data fields such as quantities, line items, documents, or dollars, or simple record counts.
• Limit checks: The program tests specified data fields against defined high or low value limits (e.g., quantities or dollars) for acceptability before further processing.
• Sign test: A test for a numeric data field containing a designation of an algebraic sign, + or -, which can be used to denote, for example, debits or credits for financial data fields.
• Size test: Tests the full size of the data field. For example, a social security number in the United States should have nine digits.
• Format checks: Used to determine that data are entered in the proper mode, as numeric or alphabetical characters, within designated fields of information. The proper mode in each case depends on the data field definition.
• Overflow checks: A limit check based on the capacity of a data field or data file area to accept data. This programming technique can be used to detect the truncation of a financial or quantity data field value after computation (e.g., addition, multiplication, and division). Usually, the first digit is the one lost.

Data Quality Validation
Check for missing data, negatives, and consistency. Field-by-field data verification can be done to check the consistency of source and target data.

Granularity
Validate at the lowest granular level possible.

Other Validations
Audit trails, transaction logs, error logs, and validity checks.

Note: Based on your project and business needs you might have additional testing requirements.

User Acceptance Test
In this phase you will involve the user to test the end results and ensure that the business is satisfied with the quality of the data. Any changes to the business requirement will follow the change management process, and eventually those changes have to follow the SDLC process.

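The hash totals and control totals described under Statistical Analysis above are easy to compute with plain SQL. A hedged sketch, using hypothetical STG_SALES and DW_SALES tables with an amount column and an identifier column:

-- Control total on a quantity field and a hash total on an identifier field.
-- If both pairs of numbers match across source and target, the load moved
-- the rows and amounts intact.
SELECT 'SOURCE' AS side,
       SUM(SALE_AMT)   AS control_total,  -- meaningful business total
       SUM(ACCOUNT_NO) AS hash_total      -- meaningless sum, used only as a check
FROM   STG_SALES
UNION ALL
SELECT 'TARGET',
       SUM(SALE_AMT),
       SUM(ACCOUNT_NO)
FROM   DW_SALES;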
When you validate the calculations, you do not need to load the entire set of rows into the target and validate it. Instead, use the Enable Test Load feature available in Informatica Power Center.

Test Load Options – Relational Targets
• Enable Test Load: You can configure the Integration Service to perform a test load. With a test load, the Integration Service reads and transforms data without writing to targets. The Integration Service reads the number of rows you configure for the test load, generates all session files, and performs all pre- and post-session functions, as if running the full session. The Integration Service writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the Integration Service does not write data to the targets. You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails. You cannot perform a test load on sessions using XML sources.
• Number of Rows to Test: Enter the number of source rows you want the Integration Service to test load, in the Number of Rows to Test field.

Optimize Development, Testing, and Training Systems
• Dramatically accelerate development and test cycles and reduce storage costs by creating fully functional, smaller targeted data subsets for development, testing, and training systems, while maintaining full data integrity.
• Quickly build and update nonproduction systems with a small subset of production data and replicate current subsets of nonproduction copies faster.
• Simplify test data management and shrink the footprint of nonproduction systems to significantly reduce IT infrastructure and maintenance costs.
• Lower training costs by standardizing on one approach and one infrastructure.
• Train employees effectively using reliable, production-like data in training systems.

Support Corporate Divestitures and Reorganizations
• Untangle complex operational systems and separate data along business lines to quickly build the divested organization's system.
• Accelerate the provisioning of new systems by using only data that's relevant to the divested organization.
• Decrease the cost and time of data divestiture with no reimplementation costs.

Reduce the Total Cost of Storage Ownership
• Dramatically increase an IT team's productivity by reusing a comprehensive list of data objects for data selection and updating processes across multiple projects, instead of coding by hand, which is expensive, resource intensive, and time consuming.
• Accelerate application delivery by decreasing R&D cycle time and streamlining test data management.
• Improve the reliability of application delivery by ensuring IT teams have ready access to updated, quality production data.
• Reduce application and upgrade deployment risks by properly testing configuration updates with up-to-date, realistic data before introducing them into production.
• Easily customize provisioning rules to meet each organization's changing business requirements.
• Lower administration costs by centrally managing data growth solutions across all packaged and custom applications.
• Substantially accelerate time to value for subsets of packaged applications.
• Decrease maintenance costs by eliminating custom code and scripting.

Informatica Power Center Testing
Debugger: A very useful tool for debugging a valid mapping to gain troubleshooting information about data and error conditions. Refer to the Informatica documentation to learn more about the Debugger tool.

Syntax Testing: Test your customized queries using your Source Qualifier before executing the session. Use the Power Center conditional filter in the Source Qualifier to improve performance.

Performance Testing for identifying the following bottlenecks:
• Target
• Source
• Mapping
• Session
• System

Use the following methods to identify performance bottlenecks:
• Run test sessions. You can configure a test session to read from a flat file source or to write to a flat file target to identify source and target bottlenecks.
• Analyze performance details, such as performance counters, to determine where session performance decreases.
• Analyze thread statistics to determine the optimal number of partition points.
• Monitor system performance. You can use system monitoring tools to view the percentage of CPU use, I/O waits, and paging to identify system bottlenecks. You can also use the Workflow Monitor to view system resource usage.

Running the Integration Service in Safe Mode
• Test a development environment. Run the Integration Service in safe mode to test a development environment before migrating to production.
• Troubleshoot the Integration Service. Configure the Integration Service to fail over in safe mode and troubleshoot errors when you migrate or test a production environment configured for high availability. After the Integration Service fails over in safe mode, you can correct the error that caused the Integration Service to fail over.

Share Metadata
You can share metadata with a third party. For example, you want to send a mapping to someone else for testing or analysis, but you do not want to disclose repository connection information for security reasons. You can export the mapping to an XML file and edit the repository connection information before sending the XML file. The third party can import the mapping from the XML file and analyze the metadata.

Debugger
You can debug a valid mapping to gain troubleshooting information about data and error conditions. To debug a mapping, you configure and run the Debugger from within the Mapping Designer. The Debugger uses a session to run the mapping on the Integration Service. When you run the Debugger, it pauses at breakpoints and you can view and edit transformation output data.

You might want to run the Debugger in the following situations:
• Before you run a session. After you save a mapping, you can run some initial tests with a debug session before you create and configure a session in the Workflow Manager.
• After you run a session. If a session fails or if you receive unexpected results in the target, you can run the Debugger against the session. You might also want to run the Debugger against a session if you want to debug the mapping using the configured session properties.

Debugger Session Types
You can select three different Debugger session types when you configure the Debugger. The Debugger runs a workflow for each session type. You can choose from the following Debugger session types when you configure the Debugger:
• Use an existing non-reusable session. The Debugger uses existing source, target, and session configuration properties. When you run the Debugger, the Integration Service runs the non-reusable session and the existing workflow. The Debugger does not suspend on error.
• Use an existing reusable session. The Debugger uses existing source, target, and session configuration properties. When you run the Debugger, the Integration Service runs a debug instance of the reusable session and creates and runs a debug workflow for the session.
• Create a debug session instance. You can configure source, target, and session configuration properties through the Debugger Wizard. When you run the Debugger, the Integration Service runs a debug instance of the debug workflow and creates and runs a debug workflow for the session.

Debug Process
To debug a mapping, complete the following steps:
1. Create breakpoints. Create breakpoints in a mapping where you want the Integration Service to evaluate data and error conditions.
2. Configure the Debugger. Use the Debugger Wizard to configure the Debugger for the mapping. Select the session type the Integration Service uses when it runs the Debugger. When you create a debug session, you configure a subset of session properties within the Debugger Wizard, such as source and target location. You can also choose to load or discard target data.
3. Run the Debugger. Run the Debugger from within the Mapping Designer. When you run the Debugger, the Designer connects to the Integration Service. The Integration Service initializes the Debugger and runs the debugging session and workflow. The Integration Service reads the breakpoints and pauses the Debugger when the breakpoints evaluate to true.
4. Monitor the Debugger. While you run the Debugger, you can monitor the target data, transformation and mapplet output data, the debug log, and the session log. When you run the Debugger, the Designer displays the following windows:
• Debug log: View messages from the Debugger.
• Target window: View target data.
• Instance window: View transformation data.

5. Modify data and breakpoints. When the Debugger pauses, you can modify data and see the effect on transformations, mapplets, and targets as the data moves through the pipeline. You can also modify breakpoint information.

The Designer saves mapping breakpoint and Debugger information in the workspace files. You can copy breakpoint information and the Debugger configuration to another mapping. If you want to run the Debugger from another Power Center Client machine, you can copy the breakpoint information and the Debugger configuration to the other Power Center Client machine.

Running the Debugger
When you complete the Debugger Wizard, the Integration Service starts the session and initializes the Debugger. After initialization, the Debugger moves in and out of running and paused states based on breakpoints and commands that you issue from the Mapping Designer. The Debugger can be in one of the following states:
• Initializing: The Designer connects to the Integration Service.
• Running: The Integration Service processes the data.
• Paused: The Integration Service encounters a break and pauses the Debugger.

Note: To enable multiple users to debug the same mapping at the same time, each user must configure different port numbers in the Tools > Options > Debug tab. The Debugger does not use the high availability functionality.

Monitoring the Debugger
When you run the Debugger, you can monitor the following information:
• Session status: Monitor the status of the session.
• Data movement: Monitor data as it moves through transformations.
• Breakpoints: Monitor data that meets breakpoint conditions.
• Target data: Monitor target data on a row-by-row basis.

The Mapping Designer displays windows and debug indicators that help you monitor the session:
• Debug indicators: Debug indicators on transformations help you follow breakpoints and data flow.
• Instance window: When the Debugger pauses, you can view transformation data and row information in the Instance window.
• Target window: View target data for each target in the mapping.
• Output window: The Integration Service writes messages to the following tabs in the Output window: the Debugger tab (the debug log displays in the Debugger tab), the Session Log tab (the session log displays in the Session Log tab), and the Notifications tab (displays messages from the Repository Service).

While you monitor the Debugger, you might want to change the transformation output data to see the effect on subsequent transformations or targets in the data flow. You might also want to edit or add more breakpoint information to monitor the session more closely.

Restrictions
You cannot change data for the following output ports:
• Normalizer transformation: Generated Keys and Generated Column ID ports.
• Rank transformation: RANKINDEX port.
• Router transformation: All output ports.
• Sequence Generator transformation: CURRVAL and NEXTVAL ports.
• Lookup transformation: NewLookupRow port for a Lookup transformation configured to use a dynamic cache.
• Custom transformation: Ports in output groups other than the current output group.
• Java transformation: Ports in output groups other than the current output group.

Additionally, you cannot change data associated with the following:
• Mapplets that are not selected for debugging
• Input or input/output ports
• Output ports when the Debugger pauses on an error breakpoint

Constraint-Based Loading
In the Workflow Manager, you can specify constraint-based loading for a session. When you select this option, the Integration Service orders the target load on a row-by-row basis. For every row generated by an active source, the Integration Service loads the corresponding transformed row first to the primary key table, then to any foreign key tables.

Constraint-based loading establishes the order in which the Integration Service loads individual targets within a set of targets receiving data from a single source qualifier. It depends on the following requirements:
• Active source: Related target tables must have the same active source.
• Key relationships: Target tables must have key relationships.
• Target connection groups: Targets must be in one target connection group.
• Treat rows as insert: Use this option when you insert into the target. You cannot use updates with constraint-based loading.

Active Source: When target tables receive rows from different active sources, the Integration Service reverts to normal loading for those tables, but loads all other targets in the session using constraint-based loading when possible. For example, a mapping contains three distinct pipelines. The first two contain a source, source qualifier, and target. Since these two targets receive data from different active sources, the Integration Service reverts to normal loading for both targets. The third pipeline contains a source, Normalizer, and two targets. Since these two targets share a single active source (the Normalizer), the Integration Service performs constraint-based loading: loading the primary key table first, then the foreign key table.

Key Relationships: When target tables have no key relationships, the Integration Service does not perform constraint-based loading. Similarly, when target tables have circular key relationships, the Integration Service reverts to a normal load. For example, you have one target containing a primary key and a foreign key related to the primary key in a second target. The second target also contains a foreign key that references the primary key in the first target. The Integration Service cannot enforce constraint-based loading for these tables; it reverts to a normal load. (A SQL sketch of the key relationships involved follows below.)

Target Connection Groups: The Integration Service enforces constraint-based loading for targets in the same target connection group. If you want to specify constraint-based loading for multiple targets that receive data from the same active source, you must verify that the tables are in the same target connection group. If the tables with the primary key-foreign key relationship are in different target connection groups, the Integration Service cannot enforce constraint-based loading when you run the workflow. To verify that all targets are in the same target connection group, complete the following tasks:
• Verify all targets are in the same target load order group and receive data from the same active source.
• Use the default partition properties and do not add partitions or partition points.
• Define the same target type for all targets in the session properties.
• Define the same database connection name for all targets in the session properties.
• Choose normal mode for the target load type for all targets in the session properties.

Treat Rows as Insert: Use constraint-based loading when the session option Treat Source Rows As is set to Insert. You might get inconsistent data if you select a different Treat Source Rows As option and you configure the session for constraint-based loading. When the mapping contains Update Strategy transformations and you need to load data to a primary key table first, split the mapping using one of the following options:
• Load the primary key table in one mapping and the dependent tables in another mapping. Use constraint-based loading to load the primary table.
• Perform inserts in one mapping and updates in another mapping.

Constraint-based loading does not affect the target load ordering of the mapping. Target load ordering defines the order in which the Integration Service reads the sources in each target load order group in the mapping. A target load order group is a collection of source qualifiers, transformations, and targets linked together in a mapping.

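The key relationships that constraint-based loading relies on are ordinary primary key/foreign key constraints on the target tables. A hedged sketch using the T_1 and T_2 targets from the example that follows (the column names are hypothetical):

-- T_1 is the parent (primary key) target
CREATE TABLE T_1 (
    ITEM_ID   INTEGER PRIMARY KEY,
    ITEM_NAME VARCHAR(60)
);

-- T_2 is a child target; its foreign key is why T_1 must be loaded first
CREATE TABLE T_2 (
    ORDER_ID  INTEGER PRIMARY KEY,
    ITEM_ID   INTEGER REFERENCES T_1 (ITEM_ID),
    QTY       INTEGER
);

With constraint-based loading enabled, each row is written to T_1 before the related row is written to T_2, so the foreign key never points at a missing parent.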
Example
The following mapping is configured to perform constraint-based loading. In the first pipeline, target T_1 has a primary key, and T_2 and T_3 contain foreign keys referencing the T_1 primary key. T_3 has a primary key that T_4 references as a foreign key. Since these tables receive records from a single active source, SQ_A, the Integration Service loads rows to the targets in the following order:
1. T_1
2. T_2 and T_3 (in no particular order)
3. T_4
The Integration Service loads T_1 first because it has no foreign key dependencies and contains a primary key referenced by T_2 and T_3. The Integration Service then loads T_2 and T_3, but since they have no dependencies on each other, they are not loaded in any particular order. The Integration Service loads T_4 last, because it has a foreign key that references a primary key in T_3.

T_1, T_2, T_3, and T_4 are in one target connection group if you use the same database connection for each target and you use the default partition properties. T_5 and T_6 are in another target connection group together if you use the same database connection for each target and you use the default partition properties. The Integration Service includes T_5 and T_6 in a different target connection group because they are in a different target load order group from the first four targets. If there are no key relationships between T_5 and T_6, the Integration Service reverts to a normal load for both targets. If T_6 has a foreign key that references a primary key in T_5, then, since T_5 and T_6 receive data from a single active source, the Integration Service loads rows to the tables in the following order:
• T_5
• T_6

Enabling Constraint-Based Loading
To enable constraint-based loading:
1. In the General Options settings of the Properties tab, choose Insert for the Treat Source Rows As property.
2. Click the Config Object tab. In the Advanced settings, select Constraint Based Load Ordering.
3. Click OK.

Target Load Order
You can set the order in which the Integration Service sends rows to targets in different target load order groups in a mapping. A target load order group is the collection of source qualifiers, transformations, and targets linked together in a mapping. You can set the target load order if you want to maintain referential integrity when inserting, deleting, or updating tables that have primary key and foreign key constraints. When you use a mapplet in a mapping, the Mapping Designer lets you set the target load plan for sources within the mapplet. The Integration Service reads sources in a target load order group concurrently, and it processes target load order groups sequentially. To specify the order in which the Integration Service sends data to targets, create one source qualifier for each target within a mapping.

The following figure (not reproduced here) shows two target load order groups in one mapping. In this mapping, the first target load order group includes ITEMS, SQ_ITEMS, the Aggregator AGGTRANS, and T_ITEMS. The second target load order group includes all other objects in the mapping, including the TOTAL_ORDERS target. The Integration Service processes the first target load order group, and then the second target load order group. After loading the first set of targets, the Integration Service begins reading source B. When it processes the second target load order group, it reads data from both sources at the same time.

Setting the Target Load Order
You can configure the target load order for a mapping containing any type of target definition. To set the target load order:
1. In the Designer, create a mapping that contains multiple target load order groups.
2. Click Mappings > Target Load Plan. The Target Load Plan dialog box lists all Source Qualifier transformations in the mapping and the targets that receive data from each source qualifier.
3. Select a source qualifier from the list.
4. Click the Up and Down buttons to move the source qualifier within the load order.
5. Repeat steps 3 to 4 for other source qualifiers you want to reorder.
6. Click OK.

Advanced Concepts

MAPPING PARAMETERS & VARIABLES
Mapping parameters and variables represent values in mappings and mapplets. When we use a mapping parameter or variable in a mapping, first we declare the mapping parameter or variable for use in each mapplet or mapping. Then we define a value for the mapping parameter or variable before we run the session.

MAPPING PARAMETERS
• A mapping parameter represents a constant value that we can define before running a session.
• A mapping parameter retains the same value throughout the entire session.
• After we create a parameter, it appears in the Expression Editor. We can then use the parameter in any expression in the mapplet or mapping, and in the Expression Editor of reusable transformations.
• We can also use parameters in a source qualifier filter, user-defined join, or extract override.

Example: When we want to extract records of a particular month during the ETL process, we create a mapping parameter of date/time data type and use it in the query to compare it with the timestamp field in the SQL override.

MAPPING VARIABLES
• Unlike mapping parameters, mapping variables are values that can change between sessions.
• The Integration Service saves the latest value of a mapping variable to the repository at the end of each successful session.
• We can override a saved value with the parameter file.
• We can also clear all saved values for the session in the Workflow Manager.
• We might use a mapping variable to perform an incremental read of the source. For example, we have a source table containing time-stamped transactions and we want to evaluate the transactions on a daily basis. Instead of manually entering a session override to filter source data each time we run the session, we can create a mapping variable, $$IncludeDateTime. In the source qualifier, create a filter to read only rows whose transaction date equals $$IncludeDateTime, such as: TIMESTAMP = $$IncludeDateTime. In the mapping, use a variable function to set the variable value to increment one day each time the session runs. If we set the initial value of $$IncludeDateTime to 8/1/2004, the first time the Integration Service runs the session, it reads only rows dated 8/1/2004. During the session, the Integration Service sets $$IncludeDateTime to 8/2/2004 and saves 8/2/2004 to the repository at the end of the session. The next time it runs the session, it reads only rows from August 2, 2004 (see the SQL sketch after this section).

Mapping variables are used in the following transformations:
• Expression
• Filter
• Router
• Update Strategy

Initial and Default Value
When we declare a mapping parameter or variable in a mapping or a mapplet, we can enter an initial value. When the Integration Service needs an initial value and we did not declare an initial value for the parameter or variable, the Integration Service uses a default value based on the data type of the parameter or variable:
• Numeric -> 0
• String -> Empty string
• Date/time -> 1/1/1

Variable Values: Start value and current value of a mapping variable.
Start Value: The start value is the value of the variable at the start of the session. The Integration Service looks for the start value in the following order:
1. Value in parameter file
2. Value saved in the repository
3. Initial value
4. Default value
Current Value: The current value is the value of the variable as the session progresses. When a session starts, the current value of a variable is the same as the start value. The final current value for a variable is saved to the repository at the end of a successful session. When a session fails to complete, the Integration Service does not update the value of the variable in the repository.

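As a hedged illustration of the incremental-read pattern above, this is roughly the query a Source Qualifier override might issue once $$IncludeDateTime has been expanded for the run; the table and column names are hypothetical:

-- Source Qualifier SQL override before expansion (Informatica substitutes the variable):
--   SELECT ... FROM TRANSACTIONS WHERE TRANS_TIMESTAMP = $$IncludeDateTime
-- What the database receives on the 8/2/2004 run, after expansion:
SELECT TRANS_ID,
       ACCOUNT_NO,
       TRANS_AMT,
       TRANS_TIMESTAMP
FROM   TRANSACTIONS
WHERE  TRANS_TIMESTAMP = DATE '2004-08-02';

Each successful session then advances $$IncludeDateTime by one day (for example with SETVARIABLE and ADD_TO_DATE in an Expression transformation), so the next run picks up the next day's rows.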
Creating Mapping Parameters and Variables
1. Open the folder where we want to create the parameter or variable.
2. In the Mapping Designer, click Mappings > Parameters and Variables; or, in the Mapplet Designer, click Mapplet > Parameters and Variables.
3. Click the add button.
4. Enter the name. Do not remove the $$ from the name.
5. Select the Type and Data type. Select the Aggregation type for mapping variables.
6. Give the Initial Value. Click OK.

Variable Data Type and Aggregation Type
When we declare a mapping variable in a mapping, we need to configure the data type and aggregation type for the variable. The IS uses the aggregate type of a mapping variable to determine the final current value of the mapping variable. At the end of a session, it compares the final current value of the variable to the start value of the variable and, based on the aggregate type of the variable, saves a final value to the repository.
Aggregation types are:
• Count: Integer and small integer data types are valid only.
• Max: All transformation data types except the binary data type are valid.
• Min: All transformation data types except the binary data type are valid.
Note: COUNT is visible only when the data type is INT or SMALLINT.

Variable Functions
Variable functions determine how the Integration Service calculates the current value of a mapping variable in a pipeline.
• SetMaxVariable: Sets the variable to the maximum value of a group of values. It ignores rows marked for update, delete, or reject.
• SetMinVariable: Sets the variable to the minimum value of a group of values. It ignores rows marked for update, delete, or reject.
• SetCountVariable: Increments the variable value by one. It adds one to the variable value when a row is marked for insertion, and subtracts one when the row is marked for deletion. It ignores rows marked for update or reject.
• SetVariable: Sets the variable to the configured value.
Note: If a variable function is not used to calculate the current value of a mapping variable, the start value of the variable is saved to the repository.

Example: Use of Mapping Parameters and Variables
• EMP will be the source table.
• Create a target table MP_MV_EXAMPLE having columns: EMPNO, ENAME, DEPTNO, TOTAL_SAL, MAX_VAR, MIN_VAR, COUNT_VAR and SET_VAR.
• TOTAL_SAL = SAL + COMM + $$BONUS ($$Bonus is a mapping parameter that changes every month).
• SET_VAR: We will add one month to the HIREDATE of every employee.
• Create shortcuts as necessary.

Creating Mapping
1. Open the folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_mp_mv_example
4. Drag EMP and the target table.
5. Transformation -> Create -> Select Expression from the list -> Create -> Done.
6. Drag EMPNO, ENAME, HIREDATE, SAL, COMM and DEPTNO to the Expression.
7. Create parameter $$Bonus and give the initial value as 200.
8. Create variable $$var_max of MAX aggregation type and initial value 1500.
9. Create variable $$var_min of MIN aggregation type and initial value 1500.
10. Create variable $$var_count of COUNT aggregation type and initial value 0.
11. Create variable $$var_set of MAX aggregation type.

12. Create 5 output ports: out_TOTAL_SAL, out_MAX_VAR, out_MIN_VAR, out_COUNT_VAR and out_SET_VAR.
13. Open the expression editor for out_TOTAL_SAL. Do the same as we did earlier for SAL + COMM. To add $$BONUS to it, select the variable tab and select the parameter from the mapping parameters: SAL + COMM + $$Bonus
14. Validate the expression and click OK.
15. Open the expression editor for out_MAX_VAR.
16. Select the variable function SETMAXVARIABLE from the left-side pane. Select $$var_max from the variable tab and SAL from the ports tab: SETMAXVARIABLE($$var_max, SAL)
17. Validate the expression and click OK.
18. Open the expression editor for out_MIN_VAR and write the following expression: SETMINVARIABLE($$var_min, SAL). Validate the expression.
19. Open the expression editor for out_COUNT_VAR and write the following expression: SETCOUNTVARIABLE($$var_count). Validate.
20. Open the expression editor for out_SET_VAR and write the following expression: SETVARIABLE($$var_set, ADD_TO_DATE(HIREDATE, 'MM', 1)). Validate.
21. Click OK. The Expression transformation is shown in the screenshot (not reproduced here).
22. Link all ports from the Expression to the target, then validate the mapping and save it.

• Make the session and workflow.
• Give connection information for the source and target tables.
• Run the workflow and see the result. (See the mapping picture, not included here.)

PARAMETER FILE
• A parameter file is a list of parameters and associated values for a workflow, worklet, or session.
• Parameter files provide flexibility to change these values each time we run a workflow or session.
• We can create multiple parameter files and change the file we use for a session or workflow.
• We can create a parameter file using a text editor such as WordPad or Notepad.
• Enter the parameter file name and directory in the workflow or session properties.

A parameter file contains the following types of parameters and variables:
• Workflow variable: References values and records information in a workflow.
• Worklet variable: References values and records information in a worklet. Use predefined worklet variables in a parent workflow, but we cannot use workflow variables from the parent workflow in a worklet.
• Session parameter: Defines a value that can change from session to session, such as a database connection or file name.
• Mapping parameter and mapping variable.

USING A PARAMETER FILE
Parameter files contain several sections preceded by a heading. The heading identifies the Integration Service, Integration Service process, workflow, worklet, or session to which we want to assign parameters or variables. In the parameter file, folder and session names are case sensitive.

Sample parameter file for our example: create a text file in Notepad with the name Para_File.txt:

[Practice.ST:s_m_MP_MV_Example]
$$Bonus=1000
$$var_max=500
$$var_min=1200
$$var_count=0

CONFIGURING PARAMETER FILE

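Before the Workflow Manager configuration steps, note that a parameter file can also be supplied directly at run time from the command line. The sketch below uses pmcmd; the Integration Service name (IS_Dev), domain (Domain_Dev), credentials, and workflow name (wf_mp_mv_example) are placeholders and not values defined in this document, so substitute your own:

pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u Administrator -p password -f Practice -paramfile D:\Files\Para_File.txt wf_mp_mv_example

The -paramfile option points that particular run at the parameter file, so a different file (for example, one per month with a different $$Bonus value) can be used without editing the workflow or session properties.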
We can create multiple pipelines in a mapplet. Repository -> Save Use of mapplet in mapping: . Click Mapplets-> Create-> Give name. Steps: 1.txt or $PMSourceFileDir\Para_File. So instead of making 5 transformations in every 10 mapping. we create a mapplet of these 5 transformations. Drag EMP and DEPT table. Pass data to multiple transformations: We can create a mapplet to feed data to multiple transformations. Use Joiner transformation as described earlier to join them. 3. Then calculate total salary. Example1: We will join EMP and DEPT table. Accept data from sources in a mapping Include multiple transformations: As many transformations as we need. Example: To create a surrogate key in target. Give the output to mapplet out transformation. 10. Contain unused ports: We do not have to connect all mapplet input and output ports in a mapping. · EMP and DEPT will be source tables. • • We use Mapplet Input transformation to give input to mapplet. Mapplets help simplify mappings in the following ways: • • • • • Include source definitions: Use multiple source definitions and source qualifiers to provide source data for a mapping. 5. Created in Mapplet Designer in Designer Tool. Click Workflows > Edit. 8. We need to use same set of 5 transformations in say 10 mappings. 4. 4. 2. We create a mapplet using a stored procedure to create Primary key for target table. MAPPLETS • • • A mapplet is a reusable object that we create in the Mapplet Designer. Pass all ports from joiner to expression and then calculate total salary as described in expression transformation. A mapplet must contain at least one Output transformation with at least one connected port in the mapplet. Mapplet Input: Mapplet input can originate from a source definition and/or from an Input transformation in the mapplet. Click the Properties tab and open the General Options settings. Enter the parameter directory and name in the Parameter Filename field. We give target table name and key column name as input to mapplet and get the Surrogate key as output. Mapplet Output: The output of a mapplet is not connected to any target table. Open a Workflow in the Workflow Manager. Transformation -> Create -> Select Expression for list -> Create -> Done 7. 2. To enter a parameter file in the workflow properties: 1. Now we use this mapplet in all 10 mappings. • • We must use Mapplet Output transformation to store mapplet output. Enter the parameter directory and name in the Parameter Filename field. Open a session in the Workflow Manager. Pass all ports from expression to Mapplet output.txt 5. 9. 2. Click OK. Ex: mplt_example1 4. Click the Properties tab. Now Transformation -> Create -> Select Mapplet Out from list –> Create -> Give name and then done. Use of Mapplet Input transformation is optional. Click OK. Mapplet -> Validate 11. · Output will be given to transformation Mapplet_Out. 3.We can specify the parameter file name and directory in the workflow or session properties. 5. Example: D:\Files\Para_File. Open folder where we want to create the mapping. Click Tools -> Mapplet Designer. 3. 6. It contains a set of transformations and lets us reuse that transformation logic in multiple mappings. To enter a parameter file in the session properties: 1. Each Output transformation in a mapplet represents one output group in a mapplet.

2. Make sure to give correct connection information in session. Drag mplt_Example1 and target table. When the Integration Service runs the session. 6. Making a mapping: We will use mplt_example1. Validate mapping and Save it. Partition points • • • By default. Ex: m_mplt_example1 4. PARTITIONING A partition is a pipeline stage that executes in a single reader. 7. Give connection information for mapplet source tables. transformation. Click Mapping-> Create-> Give name. and then create a filter transformation to filter records whose Total Salary is >= 1500. Number of Partitions • we can define up to 64 partitions at any partition point in a pipeline. Drag all ports from mplt_example1 to filter and give filter condition. Transformation -> Create -> Select Filter for list -> Create -> Done. 8. A pipeline consists of a source qualifier and all the transformations and Targets that receive data from that source qualifier. These are referred to as the mapplet input and mapplet output ports. it can achieve higher Performance by partitioning the pipeline and performing the extract. IS sets partition points at various transformations in the pipeline. The number of partitions in any pipeline stage equals the number of Threads in the stage. the mapplet object displays only the ports from the Input and Output transformations. or Writer thread. Transformation. By default. A stage is a section of a pipeline between any two partition points. We can add more transformations after filter if needed. When we use the mapplet in a mapping. · mplt_example1 will be source. and load for each partition in parallel. · Create target table same as Mapplet_out transformation as in picture above. 2. 3. . Open folder where we want to create the mapping. Connect all ports from filter to target. • • • • • • Make session and workflow. the Integration Service creates one partition in every pipeline stage. Creating Mapping 1. Run workflow and see result. PARTITIONING ATTRIBUTES 1. Give connection information for target table. 5. Partition points mark thread boundaries and divide the pipeline into stages.• • • We can mapplet in mapping by just dragging the mapplet from mapplet folder on left pane as we drag source and target tables. Click Tools -> Mapping Designer.

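As a small worked example of the thread arithmetic: suppose a pipeline has partition points at the Source Qualifier, an Aggregator, and the target, giving three stages. With the default single partition, the session runs one reader thread, one transformation thread and one writer thread. If two partitions are configured at each partition point, each stage runs two threads, so the session uses roughly 2 x 3 = 6 threads. This is why adding partitions or partition points increases the number of threads, and with them the CPU and memory the session needs.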
when a session has three partitions and the database has five partitions. PARTITIONING TYPES 1. Round Robin Partition Type • • • In round-robin partitioning. the Workflow Manager increases or decreases the number of partitions at all Partition points in the pipeline. we can change the partition type. All rows in a single partition stay in that partition after crossing a pass-Through partition point. Use any number of pipeline partitions and any number of database partitions. . If the session has three partitions and the database table has two partitions. 3 rd Session partition will receive Data from the remaining 1 DB partition. the Integration Service processes data without Redistributing rows among partitions. the Integration Service distributes rows of data evenly to all partitions. 1 st and 2nd session partitions will receive data from 2 database partitions each. The partition type controls how the Integration Service distributes data among partitions at partition points. increasing the number of partitions or partition points increases the number of threads. For example. Use round-robin partitioning when we need to distribute rows evenly and do not need to group data among partitions. For one partition. Pass-Through Partition Type • • • 3. 3. In pass-through partitioning. This option is purchased separately. Use pass-through partitioning when we want to increase data throughput. one of the session partitions receives no data. Thus four DB partitions used. Database Partitioning Partition Type • • • Database Partitioning with One Source When we use database partitioning with a source qualifier with one source. one database connection will be used. Partition types • • • The Integration Service creates a default partition type at each partition point. 2. We can improve performance when the number of pipeline partitions equals the number of database partitions. Use database partitioning for Oracle and IBM DB2 sources and IBM DB2 targets only. The number of partitions we create equals the number of connections to the source or target. the Integration Service generates SQL queries for each database partition and distributes the data from the database partitions among the session partitions Equally.• • • When we increase or decrease the number of partitions at any partition point. If we have the Partitioning option. but we do not want to increase the number of partitions. Each partition processes approximately the same number of rows. Partitioning a Source Qualifier with Multiple Sources Tables The Integration Service creates SQL queries for database partitions based on the Number of partitions in the database table with the most partitions.

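To make the single-source database partitioning case above concrete: if the Oracle source table EMP were divided into three database partitions named P1, P2 and P3 (hypothetical names), a session using the database partitioning type would read each database partition separately, conceptually similar to running one partition-extended query per session partition:

SELECT EMPNO, ENAME, SAL, DEPTNO FROM EMP PARTITION (P1)
SELECT EMPNO, ENAME, SAL, DEPTNO FROM EMP PARTITION (P2)
SELECT EMPNO, ENAME, SAL, DEPTNO FROM EMP PARTITION (P3)

The exact SQL the Integration Service generates depends on the database and the session configuration; the sketch only illustrates how the data is divided among the session partitions.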
The Workflow Manager does not allow us to use links to create loops in the workflow. Use hash auto-keys partitioning at or before Rank. 101-200 in another and so on. The Expression Editor provides predefined workflow variables. We can specify conditions with links to create branches in the workflow. If we do not specify conditions for each link. In the Workflow Designer workspace. Valid Workflow : Example of loop: Specifying Link Conditions: • • • Once we create links between tasks. Example: Customer 1-100 in one partition. Use predefined or user-defined workflow variables in the link condition. . the Integration Service runs the next task in the workflow by default. and Boolean and arithmetic operators. user-defined workflow variables. Key range Partition Type WORKING WITH LINKS • • • Use links to connect each workflow task.4. 2. Use key range partitioning where the sources or targets in the pipeline are Partitioned by key range. The Integration Service uses a hash function to group rows of data among Partitions. double-click the link you want to specify. 5. we choose the ports that define the partition key . Joiner. The Integration Service passes data to each partition depending on the Ranges we specify for each port. We Define the range for each partition. and Unsorted Aggregator transformations to ensure that rows are grouped Properly before they enter these transformations. we can specify conditions for each link to determine the order of execution in the workflow. Hash User-Keys Partition Type • • • • • • • 6. In the Expression Editor. Sorter. Validate the expression using the Validate button. 3. Hash Auto-Keys Partition Type • • The Integration Service uses all grouped or sorted ports as a compound Partition key. we define the number of ports to generate the partition key. variable functions. enter the link condition. Steps: 1. We specify one or more ports to form a compound partition key. 4. Each link in the workflow can run only once. The Expression Editor appears.

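For example, a link condition can combine predefined task variables so that the next task runs only when the previous session succeeded and actually loaded rows. Using the session name from the earlier examples (Status and TgtSuccessRows are standard predefined workflow variables):

$S_M_FILTER_EXAMPLE.Status = SUCCEEDED AND $S_M_FILTER_EXAMPLE.TgtSuccessRows > 0

If the condition evaluates to false, the Integration Service does not run the task on that branch.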
Using the Expression Editor: The Workflow Manager provides an Expression Editor for any expressions in the workflow. The Integration Service does not run the workflow if: The prior workflow run fails. or we can manually start a workflow. In the General tab. 3. There are 3 run options: 1. If we change schedule settings. 4. To make the workflows valid. it reschedules all workflows. the Workflow Manager lets us create reusable schedulers so we can reuse the same set of scheduling settings for workflows in the folder. If we delete a folder. We remove the workflow from the schedule The Integration Service is running in safe mode For each folder. We can enter expressions using the Expression Editor for the following: • • • Link conditions Decision task Assignment task SCHEDULERS We can schedule a workflow to run continuously. the Integration Service removes workflows from the schedule. Open the folder where we want to create the scheduler. Click Apply and OK. all workflows that use the deleted scheduler becomes invalid. start options. schedule options. If we choose a different Integration Service for the workflow or restart the Integration Service. By default. the Integration Service reschedules the workflow according to the new settings. and end options for the schedule. Click Add to add a new scheduler. In the Workflow Designer. repeat at a given time or interval. Configure the scheduler settings in the Scheduler tab. the workflow runs on demand. • • • • • • • • • • • • A scheduler is a repository object that contains a set of schedule settings. The Workflow Manager marks a workflow invalid if we delete the scheduler associated with the workflow. 5. The Integration Service runs a scheduled workflow as configured. 2. Creating a Reusable Scheduler Steps: 1. We can change the schedule settings by editing the scheduler. we must edit them and replace the missing scheduler. Run Continuously 3. Run on Demand 2. enter a name for the scheduler. Run on Server initialization . Use a reusable scheduler so we do not need to configure the same set of scheduling settings in each workflow. Configuring Scheduler Settings Configure the Schedule tab of the scheduler to set run options. click Workflows > Schedulers. Scheduler can be non-reusable or reusable. When we delete a reusable scheduler. 6.

as configured. Customized Repeat: Integration Service runs the workflow on the dates and times specified in the Repeat dialog box. choose a reusable scheduler from the Scheduler 8. If we select Reusable. Schedule options for Run on Server initialization: • • • Run Once: To run the workflow just once. Click the right side of the Scheduler field to edit scheduling settings for the non. create one before we choose Reusable. In the Workflow Designer. 6. 4. 9. we must 5. Run on Server initialization Integration Service runs the workflow as soon as the service is initialized. Run every: Run the workflow at regular intervals. 3. In the Scheduler tab.reusable scheduler 7. Run Continuously: Integration Service runs the workflow as soon as the service initializes. Note: If we do not have a reusable scheduler in the folder. Start options for Run on Server initialization: · Start Date · Start Time End options for Run on Server initialization: • • • • End on: IS stops scheduling the workflow in the selected date. The Integration Service then starts the next run of the workflow as soon as it finishes the previous run. Forever: IS schedules the workflow as long as the workflow does not fail. Click Ok. End After: IS stops scheduling the workflow after the set number of workflow runs. 2. Click Workflows > Edit. choose Non-reusable. open the workflow. Creating a Non-Reusable Scheduler 1.1. The Integration Service then starts the next run of the workflow according to settings in Schedule Options. 2. Browser dialog box. Select Reusable if we want to select an existing reusable scheduler for the workflow. Points to Ponder : . 3. Run on Demand: Integration Service runs the workflow when we start the workflow manually.

Click Apply -> Ok. Click the Open button in the Email Text field to open the Email Editor. EMAIL TASK • • Steps: 1. We can run as many sessions in a workflow as we need. The Power Center Server creates several files and in-memory caches depending on the transformations and options used in the session. We can run the Session tasks sequentially or concurrently. depending on our needs. Create a workflow wf_sample_email 2. To reschedule a workflow on its original schedule. right-click the workflow in the Navigator window and choose Schedule Workflow. In the Task Developer or Workflow Designer. Types of tasks: Task Type Session Email Command Event-Raise Event-Wait Timer Decision Assignment Control SESSION TASK • • • • Tool where task can Reusable or not be created Task Developer Workflow Designer Worklet Designer Workflow Designer Worklet Designer Yes Yes Yes No No No No No No A session is a set of instructions that tells the Power Center Server how and when to move data from sources to targets. In Value.• • To remove a workflow from its schedule. 9. 6. Created by Administrator usually and we just drag and use it in our mapping. right-click the workflow in the Navigator window and choose Unscheduled Workflow. we must first create a workflow to contain the Session task. 5. 6. choose Tasks-Create. See On Success Email Option there and configure it. Enter the subject of the email in the Email Subject field. you can leave this field blank. select the email task to be used. 5. 3. We can create reusable tasks in the Task Developer. Edit Session task and go to Components tab. Enter the fully qualified email address of the mail recipient in the Email User Name field. In Type select reusable or Non-reusable. We can set the option to send email on success or failure in components tab of a session task. Click OK twice to save your changes. Select an Email task and enter a name for the task. 4. The Edit Tasks dialog box appears. 3. Click Done. Click the Properties tab. 8. Or. The Workflow Manager provides an Email task that allows us to send email during a workflow. Click Create. 2. WORKING WITH TASKS –Part 1 The Workflow Manager contains many types of tasks to help you build workflows and worklets. 7. 4. 7. To run a session. Double-click the Email task in the workspace. Validate workflow and Repository -> Save • • We can also drag the email task and use as per need. . Example: To send an email when a session completes: Steps: 1. 8. Drag any session task to workspace.

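The body entered in the Email Editor can include Email task variables that the Integration Service expands when the post-session email is sent. A small illustrative body for the On Success Email is shown below; the %-codes are standard post-session email variables, but verify the full list in the documentation for your PowerCenter version:

Session %s completed with status %e.
Rows loaded: %l   Rows rejected: %r
Session start time: %b   Session completion time: %c

Here %s expands to the session name, %e to the session status, %l and %r to the total rows loaded and rejected, and %b and %c to the start and completion times.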
8. click the Edit button to open the Command Editor. User-defined event: A user-defined event is a sequence of tasks in the Workflow. Open any workflow where we want to create an event. Go to commands tab. Workflow -> Create -> Give name and click ok. This is done in COMPONENTS TAB of a session. In the Commands tab. Double-click the Command task. We use this task to raise a user defined event. 5. Click Apply -> Ok. Start is displayed. Types of Events Tasks: • • EVENT RAISE: Event-Raise task represents a user-defined event. $S_M_FILTER_EXAMPLE. EVENT WAIT: Event-Wait task waits for a file watcher event or user defined event to occur before executing the next session in the workflow. Open Workflow Designer. Example: to copy a file sample. click the Add button to add a command. Standalone Command task: We can use a Command task anywhere in the workflow or worklet to run shell commands. Drag session say s_m_Filter_example and command task. 2. We create events and then raise them as per need. Click Workflow-> Edit -> Events tab. Steps to create the workflow using command task: 1.COMMAND TASK The Command task allows us to specify one or more shell commands in UNIX or DOS commands in Windows to run during the workflow.or post-session shell command for a Session task. enter a name for the new command. Steps for creating workflow: .txt from D drive to E. 10. Types of Events: • • Pre-defined event: A pre-defined event is a file-watch event. 4. 2. Enter a name for the Command task. Click to Add button to add events and give the names as per need. 3. Steps for creating User Defined Event: 1. 9. 4. choose Tasks-Create. Workflow-> Validate 8. 7. Repeat steps 5-9 to add more commands in the task. In the Name field. copy a file. 3. In the Task Developer or Workflow Designer. Repository –> Save WORKING WITH EVENT TASKS We can define events in the workflow to specify the sequence of task execution. Example1: Use an event wait task and make sure that session s_filter_example runs when abc. Click OK. We can run it in Pre-Session Command or Post Session Success Command or Post Session Failure Command. This event Waits for a specified file to arrive at a given location. Enter only one command in the Command Editor. In the Command field.txt E:\ in windows Steps for creating command task: 1. 11. Validate the workflow and Save it.and post-session shell command: We can call a Command task as the pre. 3. Link Start to Session task and Session to Command Task. Click Create. Then click done. Command: COPY D:\sample. we can specify shell commands in the Command task to delete reject files. Select Command Task for the task type. Create a task using the above steps to copy a file in Task Developer. Double click link between Session and Command and give condition in editor as 6. 6. 4.Status=SUCCEEDED 7. 2. 2. Select the Value and Type option as we did in Email task. For example. or archive target files. 5. Click OK to close the Command Editor. Pre.txt file is present in D:\FILES folder. Ways of using command task: 1.

5. give directory and filename to watch. Click Create and then done. 16. 10. Workflow -> Edit -> Events Tab and add events EVENT1 there. 13. Workflow -> Create -> Give name wf_event_wait_file_watch -> Click ok. Click Tasks -> Create -> Select EVENT WAIT from list. Workflow -> Create -> Give name wf_event_wait_event_raise -> Click ok. 2. Link Start to Event Wait task. Example: D:\FILES\abc. Click Tasks -> Create -> Select EVENT RAISE from list. 4.Status=SUCCEEDED 8. the parent workflow. Click link between ER_Example and s_m_filter_example and give the condition $S_M_FILTER_EXAMPLE. ER_Example. Select the Event1 by clicking Browse Events button. Run workflow and see.1. Repository -> Save. Give name 5. 4. 6. 3. Select User Defined there. Click Create and then done. Drag s_filter_example to workspace and link it to event wait task. Mapping -> Validate 15. 2. 9. Example 2: Raise a user defined event when session s_m_filter_example succeeds. Link EW_WAIT to START task. The Timer task has two types of settings: • • Absolute time: We specify the exact date and time or we can choose a user-defined workflow variable to specify the exact time. Click create and done. Capture this event in event wait task and run session S_M_TOTAL_SAL_EXAMPLE Steps for creating workflow: 1. Task -> Create -> Select Event Wait. Right click ER_Example -> EDIT -> Properties Tab -> Open Value for User Defined Event and Select EVENT1 from the list displayed. Workflow validate and Repository Save. Right click EW_WAIT -> EDIT-> EVENTS tab. In the blank space. Give name. Apply -> OK. Right click on event wait task and click EDIT -> EVENTS tab. Drag s_m_filter_example and link it to START task.tct 7. The next task in workflow will run as per the date and time specified. Apply -> OK.Link ER_Example to s_m_filter_example. Example: Run session s_m_filter_example relative to 1 min after the timer task. or the top-level workflow starts. . 11. 7. Give name EW_WAIT. Select Pre Defined option there. WORKING WITH TASKS –Part 2 TIMER TASK The Timer task allows us to specify the period of time to wait before the Power Center Server runs the next task in the workflow. 12. 14. 6. Drag S_M_TOTAL_SAL_EXAMPLE and link it to EW_WAIT. Relative time: We instruct the Power Center Server to wait for a specified period of time after the Timer task. 3.

Click Tasks -> Create -> Select DECISION from list. Click Create and then done. Default is AND. Apply and click OK. Drag s_m_filter_example and link it to TIMER_Example. 9. 8. 2. 6. We can specify one decision condition per Decision task. Now edit decision task again and go to PROPERTIES Tab. 6. Drag s_m_filter_example and S_M_TOTAL_SAL_EXAMPLE to workspace and link both of them to START task. Give name TIMER_Example. Double click link between Command task and DECISION_Example and give the condition: $DECISION_Example. 4. Apply -> OK. The Decision task has a pre-defined variable called $Decision_task_name. Workflow-> Validate and Repository -> Save. similar to a link condition. 5. Click Tasks -> Create -> Select TIMER from list. Right click TIMER_Example-> EDIT -> TIMER tab. Steps for creating workflow: 1. DECISION TASK • • • • The Decision task allows us to enter a condition that determines the execution of the workflow. 11. 8. Click Create and then done. Drag command task and S_m_sample_mapping_EMP task to workspace and link them to DECISION_Example task. 2. 12. 3. Workflow -> Create -> Give name wf_decision_task_example -> Click ok. Validate and click OK. Open the Expression editor by clicking the VALUE section of Decision Name attribute and enter the following condition: $S_M_FILTER_EXAMPLE. Link DECISION_Example to both s_m_filter_example and S_M_TOTAL_SAL_EXAMPLE.Status = SUCCEEDED 7. 4. Validate & click OK. Double click link between S_m_sample_mapping_EMP & DECISION_Example & give the condition: $DECISION_Example. Right click DECISION_Example-> EDIT -> GENERAL tab. Example: Command Task should run only if either s_m_filter_example or S_M_TOTAL_SAL_EXAMPLE succeeds. Give name DECISION_Example. The Power Center Server evaluates the condition in the Decision task and sets the pre-defined condition variable to True (1) or False (0). 10.Condition = 0.condition that represents the result of the decision condition.Status = SUCCEEDED OR $S_M_TOTAL_SAL_EXAMPLE.Condition = 1. Link TIMER_Example to START task. Validate the condition -> Click Apply -> OK. 3. Workflow -> Create -> Give name wf_timer_task_example -> Click ok. Set ‘Treat Input Links As’ to OR. 7. .Steps for creating workflow: 1. Workflow Validate and repository Save. 5. Select Relative Time Option and Give 1 min and Select ‘From start time of this task’ Option. If any of s_m_filter_example or S_M_TOTAL_SAL_EXAMPLE fails then S_m_sample_mapping_EMP should run. Run workflow and see the result.

Aborts the WF or worklet that contains the Control task. A parent workflow or worklet is the workflow or worklet that contains the Control task.Status = SUCCEEDED. To use an Assignment task in the workflow. Right click cntr_task-> EDIT -> GENERAL tab. We give the condition to the link connected to Control Task. 2. Link all sessions to the control task cntr_task. Workflow Validate and repository Save. 7. Set ‘Treat Input Links As’ to OR. Description Fails the control task. 3. Run workflow and see the result. Click Tasks -> Create -> Select CONTROL from list. Click Create and then done. 6. Repeat above step for remaining 2 sessions also. Fails the workflow that is running. Example: Drag any 3 sessions and if anyone fails. 9. Stops the WF or worklet that contains the Control task. Abort Top-Level WF Aborts the workflow that is running. then Abort the top level workflow. 5. Control Option Fail Me Fail Parent Stop Parent Abort Parent Fail Top-Level WF Stop Top-Level WF Stops the workflow that is running. Steps for creating workflow: 1. Workflow -> Create -> Give name wf_control_task_example -> Click ok. Go to PROPERTIES tab of cntr_task and select the value ‘Fail top level 10. Marks the status of the WF or worklet that contains the Control task as failed. Drag any 3 sessions to workspace and link all of them to START task. 11. Default is AND. 12. Click Apply and OK. ASSIGNMENT TASK • • • The Assignment task allows us to assign a value to a user-defined workflow variable. or fail the top-level workflow or the parent workflow based on an input link condition. See Workflow variable topic to add user defined variables. Give name cntr_task. Workflow’ for Control Option. abort. first create and add the . 8. 4.CONTROL TASK • • • We can use the Control task to stop. Double click link between cntr_task and any session say s_m_filter_example and give the condition: $S_M_FILTER_EXAMPLE.

Click Apply. 5. Click OK. Solution1: 1. 9. Import one flat file definition and make the mapping as per need. Now make a notepad file that contains the location and name of each 10 flat files. 12. Now open session after workflow completes. Do same for remaining files. 7. All the flat files have same number of columns and data type. Select Assignment Task for the task type.txt E:\FILES\DWH\EMP3. $InputFileName=EMP1. 4. Now edit parameter file and give value of second file.txt E:\EMP2. Now in Fieldname use $InputFileName. Steps to create Assignment Task: 1. Import one flat file definition and make the mapping as per need. 2. 5. 8. INDIRECT LOADING FOR FLAT FILES Suppose. 3. Validate Session 7. In Source file type field. Click Create. 5.txt and so on 3. 4. Enter a name for the Assignment task. Names of files are say EMP1. 2. Change the Filename and Directory to give information of second file. Now we need to transfer all the 10 files to same target. Then configure the Assignment task to assign values or expressions to userdefined variables. Click the Edit button in the Expression field to open the Expression Editor. Solution2: 1. 6. 4. Click the Open button in the User Defined Variables field. Choose Tasks-Create. Now in session give the Source File name and Source File Directory location of one file. Edit Workflow and add user defined variables. Then click Done. Now make a session and in Source file name and Source File Directory location fields. give the name and location of above created file. Double-click the Assignment task to open the Edit Task dialog box. Repeat steps 7-10 to add more variable assignments as necessary. Enter the value or expression you want to assign. Now in session give the Source Directory location of the files. 10. Save it to repository and run. 6. Make Workflow.• • Assignment task to the workflow. 2. Do the above for all 10 files. 3. Run workflow again. 2. Run the workflow 6.txt 5. you have 10 flat files of same structure. 4. Open any workflow where we want to use Assignment task. 11. This is a session parameter. select Indirect. Solution3: 1. EMP2 and so on. SCD – Type 1 . Make workflow and run. click Add to add an assignment. Select the variable for which you want to assign a value. Import one flat file definition and make the mapping as per need. Click OK. On the Expressions tab. Now make a parameter file and give the value of $InputFileName. We cannot assign values to pre-defined workflow. 7. Run workflow again. 3. Sample: D:\EMP1.

Slowly Changing Dimensions (SCDs) are dimensions that have data that changes slowly, rather than changing on a time-based, regular schedule.

For example, you may have a dimension in your database that tracks the sales records of your company's salespeople. Creating sales reports seems simple enough, until a salesperson is transferred from one regional office to another. How do you record such a change in your sales dimension? You could sum or average the sales by salesperson, but if you use that to compare the performance of salesmen, that might give misleading information. If the salesperson that was transferred used to work in a hot market where sales were easy, and now works in a market where sales are infrequent, her totals will look much stronger than the other salespeople in her new region, even if they are just as good. Or you could create a second salesperson record and treat the transferred person as a new salesperson, but that creates problems also. Dealing with these issues involves SCD management methodologies:

Type 1: The Type 1 methodology overwrites old data with new data, and therefore does not track historical data at all. This is most appropriate when correcting certain types of data errors, such as the spelling of a name. (Assuming you won't ever need to know how it used to be misspelled in the past.)

Here is an example of a database table that keeps supplier information:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA

In this example, Supplier_Code is the natural key and Supplier_Key is a surrogate key. Technically, the surrogate key is not necessary, since the table will be unique by the natural key (Supplier_Code). However, the joins will perform better on an integer than on a character string.

Now imagine that this supplier moves their headquarters to Illinois. The updated table would simply overwrite this record:

Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL

The obvious disadvantage to this method of managing SCDs is that there is no historical record kept in the data warehouse. You can't tell if your suppliers are tending to move to the Midwest, for example. But an advantage to Type 1 SCDs is that they are very easy to maintain.

Explanation with an Example:

Source Table: (01-01-11)
Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

Target Table: (01-01-11)
Empno  Ename  Sal
101    A      1000
102    B      2000
103    C      3000

The necessity of the Lookup transformation is illustrated using the above source and target tables.

Source Table: (01-02-11)
Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

Target Table: (01-02-11)
Empno  Ename  Sal
101    A      1000
102    B      2500
103    C      3000
104    D      4000

In the second month, one more employee (Ename D) is added to the table, and the salary of employee B is changed to 2500 instead of 2000.

Step 1: Import the source table and the target table.
• Create a table by the name emp_source with three columns, as shown above, in Oracle.
• Import the source from the Source Analyzer.
• In the same way as above, create two target tables with the names emp_target1 and emp_target2.
• Go to the Targets menu and click on Generate and Execute to confirm the creation of the target tables.

Necessity and the usage of all the transformations will be discussed in detail below. • Here in this transformation we are about to use four kinds of transformations namely Lookup transformation. update. Transformation port should be Empno1 and Operator should ‘=’. Update Transformation. . Delete. delete or reject rows. Step 2: Design the mapping and apply the necessary transformation. The Input Port for the first column should be unchked where as the other ports like Output and lookup box should be checked. Look up Transformation: The purpose of this transformation is to determine whether to insert. In the Properties tab (i) Lookup table name ->Emp_Target. • • The first thing that we are goanna do is to create a look up transformation and connect the Empno from the source qualifier to the transformation. (ii)Look up Policy on Multiple Mismatch -> use First Value. Expression Transformation. (iii) Connection Information ->Oracle. In the Ports tab we should add a new column and name it as empno1 and this is column for which we are gonna connect from the Source Qualifier. Filter Transformation. • • • • What Lookup transformation does in our mapping is it looks in to the target table (emp_table) and compares it with the Source Qualifier and determines whether to insert.• The snap shot of the connections using different kinds of transformations are shown below. Update or reject the rows in to target table. For the newly created column only input and output boxes should be checked. The snapshot of choosing the Target table is shown below. • In the Conditions tab (i) Click on Add a new condition (ii)Lookup Table Column should be Empno.

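For orientation, with the configuration above the Lookup transformation caches the target table and evaluates the condition against that cache. Conceptually the default lookup query is roughly the following; the Integration Service builds the actual SQL itself, so treat this only as an illustration:

SELECT EMPNO, ENAME, SAL FROM EMP_TARGET

Each source EMPNO (connected in as EMPNO1) is then matched against the cached EMPNO values using the condition EMPNO = EMPNO1. When no match is found, the lookup ports return NULL, which is exactly what the Expression transformation in the next step tests with ISNULL.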
Ename. The steps to create an Expression Transformation are shown below. • • • • (i) The value for the filter condition 1 is Insert. Input à IsNull(EMPNO1) Output à iif(Not isnull (EMPNO1) and Decode(SAL.0) . • • We are all done here . (ii) The value for the filter condition 1 is Update. Later now connect the Empno.0)=0. If there is no change in input data then filter transformation 1 forwards the complete input to update strategy transformation 1 and same output is gonna appear in the target table. Sal from the expression transformation to both filter transformation. Connect the Insert column from the expression transformation to the insert column in the first filter transformation and in the same way we are gonna connect the update column in the expression transformation to the update column in the second filter.SAL1.Expression Transformation: After we are done with the Lookup Transformation we are using an expression transformation to check whether we need to insert the records the same records or we need to update the records.Click on apply and then OK. Go to the Properties tab on the Edit transformation Filter Transformation: we are gonna have two filter transformations one to insert and other to update. If there is any change in input data then filter transformation 2 forwards the complete input to the update strategy transformation 2 then it is gonna forward the updated input to the target table. • • The condition that we want to parse through our output data are listed below.1. Both these columns are gonna be our output data so we need to have check mark only in front of the Output check box. . • • Drag all the columns from both the source and the look up transformation and drop them all on to the Expression transformation. • The Closer view of the filter Connection is shown below. The Snap shot for the Edit transformation window is shown below.1. Now double click on the Transformation and go to the Ports tab and create two new columns and name it as insert and update.

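Restated with normal punctuation, the two output ports added to the Expression transformation carry these conditions:

INSERT = ISNULL(EMPNO1)
UPDATE = IIF(NOT ISNULL(EMPNO1) AND DECODE(SAL,SAL1,1,0)=0, 1, 0)

A row is flagged for insert when the lookup found no matching EMPNO in the target, and flagged for update when the employee already exists but the incoming SAL differs from the target's SAL1 value.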
• • • • • • • Drag the respective Empno.Ename. Change Bulk to the Normal.(01-01-2010) we are provided with an source table with the three columns and three rows in it like (EMpno. Type 2 Let us drive the point home using a simple scenario. Ename A B Sal 1000 2000 Source Table: (01-01-11) Emp no 101 102 . We are all set here finally connect the outputs of the update transformations to the target table. Run the work flow from task. Step 4: Preview the Output in the target table. update or reject the rows.Sal). Step 3: Create the task and Run the work flow. delete. Now go to the Properties tab and the value for the update strategy expression is 1 (on the 2 nd update transformation). We are gonna use the SCD-2 style to extract and load the records in to target table.. Now go to the Properties tab and the value for the update strategy expression is 0 (on the 1 st update transformation). For eg. Ename and Sal from the filter transformations and drop them on the respective Update Strategy Transformation. There is a new employee added and one change in the records in the month (01-02-2010). in the current month ie. • The thing to be noticed here is if there is any update in the salary of any employee then the history of that employee is displayed with the current date as the start date and the previous date as the end date.. Don’t check the truncate table option.Update Strategy Transformation: Determines whether to insert.

Sequence Generator. Step 2: Design the mapping and apply the necessary transformation. Expression Transformation (3). Here in this transformation we are about to use four kinds of transformations namely Lookup transformation (1). Version. Drag the Target table twice on to the mapping designer to facilitate insert or update process. Look up Transformation: The purpose of this transformation is to Lookup on the target table and to compare the same with the Source using the Lookup Condition. Necessity and the usage of all the transformations will be discussed in detail below. • • • • • Create a table by name emp_source with three columns as shown above in oracle. Filter Transformation (2). The snap shot of the connections using different kinds of transformations are shown below. Import the source from the source analyzer. Flag. S_date . Step 1: Is to import Source Table and Target table. Go to the targets Menu and click on generate and execute to confirm the creation of the target tables.E_Date). .103 C 3000 Target Table: (01-01-11) Skey 100 200 300 Source Table: (01-02-11) Emp no 101 102 103 104 Ename A B C D Sal 1000 2500 3000 4000 Emp no 101 102 103 Ename A B C Sal 1000 2000 3000 S-date 01-01-10 01-01-10 01-01-10 E-date Null Null Null Ver 1 1 1 Flag 1 1 1 Target Table: (01-02-11) Skey 100 200 300 201 400 Emp no 101 102 103 102 104 Ename A B C B D Sal 1000 2000 3000 2500 4000 S-date 01-02-10 01-02-10 01-02-10 01-02-10 01-02-10 E-date Null Null Null 01-01-10 Null Ver 1 1 1 2 1 Flag 1 1 1 0 1 In the second Month we have one more employee added up to the table with the Ename D and salary of the Employee is changed to the 2500 instead of 2000. • • In The Target Table we are goanna add five columns (Skey.

1. The Input Port for only the Empno1 should be checked. The steps to create an Expression Transformation are shown below. • The condition that we want to parse through our output data are listed below. • In the Conditions tab (i) Click on Add a new condition (ii)Lookup Table Column should be Empno. Now double click on the Transformation and go to the Ports tab and create two new columns and name it as insert and update. The snapshot of choosing the Target table is shown below. We specify the condition here whether to insert or to update the table.0) .0)=0. Filter Transformation: We need two filter transformations the purpose the first filter is to filter out the records which we are goanna insert and the next is vice versa. Both these columns are goanna be our output data so we need to have unchecked input check box. (ii)Look up Policy on Multiple Mismatch -> use Last Value.SAL1. • • • Drag all the columns from both the source and the look up transformation and drop them all on to the Expression transformation. In the Properties tab (i) Lookup table name ->Emp_Target.1. The Snap shot for the Edit transformation window is shown below. • We are all done here . If there is any change in input data then filter transformation 2 forwards the complete input to the Exp 2 then it is gonna forward the updated input to the target table. • • If there is no change in input data then filter transformation 1 forwards the complete input to Exp 1 and same output is goanna appear in the target table. • • • Drag the Empno column from the Source Qualifier to the Lookup Transformation. Insert : IsNull(EmpNO1) Update: iif(Not isnull (Skey) and Decode(SAL.Click on apply and then OK. . (iii) Connection Information ->Oracle.• • The first thing that we are gonna do is to create a look up transformation and connect the Empno from the source qualifier to the transformation. Transformation port should be Empno1 and Operator should ‘=’. Expression Transformation: After we are done with the Lookup Transformation we are using an expression transformation to find whether the data on the source table matches with the target table.

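Written out with normal punctuation, the two output-port expressions used in this Expression transformation are:

INSERT = ISNULL(EMPNO1)
UPDATE = IIF(NOT ISNULL(SKEY) AND DECODE(SAL,SAL1,1,0)=0, 1, 0)

INSERT is true when the employee does not yet exist in the target (the lookup returned no SKEY), and UPDATE is true when the employee exists but the salary coming from the source differs from the looked-up SAL1 value.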
Connect the output of the sequence transformation to the Exp 1. .• Go to the Properties tab on the Edit transformation (i) The value for the filter condition 1 is Insert. • • We are gonna have a sequence generator and the purpose of the sequence generator is to increment the values of the skey in the multiples of 100 (bandwidth of 100). We are goanna make the s-date as the o/p and the expression for it is sysdate.The purpose of this in our mapping is to increment the skey in the bandwidth of 100. • The closer view of the connections from the expression to the filter is shown below. Now add a new column as N_skey and the expression for it is gonna be Nextval1*100. • • • • Drag all the columns from the filter 1 to the Exp 1. Sequence Generator: We use this to generate an incremental cycle of sequential range of number. Else the there is no modification done on the target table . (ii) The value for the filter condition 2 is Update. Point to be noticed here is skey gets multiplied by 100 and a new row is generated if there is any new EMP added to the list. Expression Transformation: Exp 1: It updates the target table with the skey values. Flag is also made as output and expression parsed through it is 1.

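For reference, the output-port expressions described for Exp 1 can be written as follows, using the port names from this example (VERSION is described just below):

N_SKEY  = NEXTVAL1 * 100
S_DATE  = SYSDATE
FLAG    = 1
VERSION = 1

NEXTVAL1 is the port coming from the Sequence Generator; multiplying it by 100 gives each newly inserted employee a surrogate key in the bandwidth of 100 described above.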
Change Bulk to the Normal. Step 3: Create the task and Run the work flow. . Run the work flow from task. Both the S_date and E_date is gonna be sysdate. Exp 3: If any record of in the source table gets updated then we make it only as the output.• Version is also made as output and expression parsed through it is 1. The update strategy expression is set to 1. Don’t check the truncate table option. Update Strategy: This is place from where the update instruction is set on the target table.F • • • Drag all the columns from the filter 2 to the Exp 2. • • • • • • If change is found then we are gonna update the E_Date to S_Date. Now add a new column as N_skey and the expression for it is gonna be Skey+1. Exp 2: If same employee is found with any updates in his records then Skey gets added by 1 and version changes to the next higher number. Create the task and run the work flow.

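Pulling together the update-path expressions stated above (only what the text itself specifies; anything not spelled out there, such as the FLAG handling, is left to your own design):

Exp 2 (builds the new version of a changed employee):
N_SKEY  = SKEY + 1
VERSION = VERSION + 1   (one reading of "the next higher number")

Exp 3 (feeds the Update Strategy that closes the old row):
E_DATE  = SYSDATE

The Update Strategy expression of 1 (DD_UPDATE, i.e. treat the row as an update) then applies the change to the existing target row.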
SCD Type 3 This Method has limited history preservation. Source table: (01-01-2011) Empno 101 102 103 Ename A B C Sal 1000 2000 3000 Target Table: (01-01-2011) Empno Ename 101 102 103 A B C C-sal 1000 2000 3000 P-sal - Source Table: (01-02-2011) Empno 101 102 103 Ename A B C Sal 1000 4566 3000 Target Table (01-02-2011): Empno Ename C-sal P-sal .Step 4: Preview the Output in the target table. and we are goanna use skey as the Primary key here.

Update Strategy 1: This is intended to insert in to the target table. And in this mapping I’m using lookup.Stuff’s logically. These two ports are goanna be just output ports. Step 2: here we are goanna see the purpose and usage of all the transformations that we have used in the above mapping. • Drag all the ports except the insert from the first filter in to this.101 102 103 102 A B C B 1000 4566 3000 4544 Null 4566 So hope u got what I’m trying to do with the above tables: Step 1: Initially in the mapping designer I’m goanna create a mapping as below. Insert: isnull(ENO1 ) Update: iif(not isnull(ENO1) and decode(SAL. Prior to this Look up transformation has to look at the target table. filter. expression.Curr_Sal. In the Properties tab specify the Filter condition as Insert. • • • As usually we are goanna connect Empno column from the Source Qualifier and connect it to look up transformation. . Explanation of each and every Transformation is given below.1. insert.1. Add two Ports and Rename them as Insert. Expression Transformation: We are using the Expression Transformation to separate out the Insert-stuff’s and Update. Filter 1: • • Drag the Insert and other three ports which came from source qualifier in to the Expression in to first filter.0) Filter Transformation: We are goanna use two filter Transformation to filter out the data physically in to two separate sections one for insert and the other for the update process to happen. In the Properties tab specify the Filter condition as update. Update Strategy: Finally we need the update strategy to insert or to update in to the target table. Update. Finally specify that connection Information (Oracle) and look up policy on multiple mismatches (use last value) in the Properties tab. Filter 2: • • Drag the update and other four ports which came from Look up in to the Expression in to Second filter. update strategy to drive the purpose. Based on the Look up condition it decides whether we need to update. Look up Transformation: The look Transformation looks the target table and compares the same with the source table. Next to this we are goanna specify the look up condition empno =empno1. • • • Drag all the ports from the Source Qualifier and Look up in to Expression. Specify the below conditions in the Expression editor for the ports respectively. and delete the data from being loaded in to the target table.0)=0.

If it finds a corresponding group. rather than forcing it to process the entire source and recalculate the same data each time you run the session. (iii)When writing to the target. Drag all the ports except the update from the second filter in to this. you can configure the session to process those changes. If the source changes incrementally and you can capture changes. On March 2. At the end of the session. The Integration Service uses system memory to process these functions in addition to the cache memory you configure in the session properties. It saves modified aggregate data in the index and data files to be used as historical data the next time you run the session. Step 3: Create a session for this mapping and Run the work flow. the Integration Service performs the aggregate operation incrementally. For example. Use incremental aggregation when the changes do not significantly change the target. You can capture those incremental changes because you have added a filter condition to the mapping that removes pre-existing data from the flow of data. As a result. you use the incremental source changes in the session. using the aggregate data for that group. Incremental changes do not significantly change the target. (iv) If the source changes significantly and you want the Integration Service to continue saving aggregate data for future incremental changes. you might have a session using a source that receives new data every day. In the Properties tab specify the condition as the 1 or dd_update. drop the table and recreate the target with complete source data. For each input record. and saves the incremental change. Use incremental aggregation when you can capture new source data each time you run the session. Note: Do not use incremental aggregation if the mapping contains percentile or median functions. the index file and the data file.Consider using incremental aggregation in the following circumstances: • • You can capture new source data. the Integration Service processes the entire source. Use a Stored Procedure or Filter transformation to process new data. If processing the incrementally changed source alters more than half the existing target. the Integration Service stores aggregate data from that session run in two files. the Integration Service creates a new group and saves the record data. you use the entire source. . When using incremental aggregation. Step 4: Observe the output it would same as the second target table Incremental Aggregation: When we enable the session option-> Incremental Aggregation the Integration Service performs incremental aggregation. configure the Integration Service to overwrite existing aggregate data with new aggregate data. you filter out all the records except those time-stamped March 2. The Integration Service then processes the new data and updates the target accordingly. the session may not benefit from using incremental aggregation. Integration Service Processing for Incremental Aggregation (i)The first time you run an incremental aggregation session. it passes source data through the mapping and uses historical cache data to perform aggregation calculations incrementally. This allows the Integration Service to update the target incrementally. The Integration Service creates the files in the cache directory specified in the Aggregator transformation properties. This allows the Integration Service to read and store the necessary aggregate data. Update Strategy 2: This is intended to update in to the target table. 
In this case.• • • In the Properties tab specify the condition as the 0 or dd_insert. you apply captured changes in the source to aggregate calculations in a session. when you run the session again. the Integration Service checks historical information in the index file for a corresponding group. When the session runs with incremental aggregation enabled for the first time on March 1. Finally connect both the update strategy in to two instances of the target. If it does not find a corresponding group. You then enable incremental aggregation. the Integration Service does not store incremental aggregation values for percentile and median functions in disk caches. (ii)Each subsequent time you run the session with incremental aggregation. the Integration Service applies the changes to the existing target.
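One common way to implement the "filter out all records except those time-stamped with the current run" step is a mapping variable combined with a filter or source-qualifier condition. This is a sketch, not part of the original example: the column name LAST_UPDATED and the variable $$LastRunTime are assumed names, while SESSSTARTTIME is the built-in session start timestamp.

Source Qualifier or Filter condition:
LAST_UPDATED > $$LastRunTime

Expression transformation output port (so the value is saved to the repository at the end of the run):
SETVARIABLE($$LastRunTime, SESSSTARTTIME)

On the next run, the saved value becomes the new start value of $$LastRunTime, so only rows changed since the previous session run reach the Aggregator and the incremental aggregation cache.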

Save and publish a mapping template to create the mapping template files. Configuring the Mapping Before enabling incremental aggregation. Move the aggregate files without correcting the configured path or directory for the files in the session properties. • • The index and data files grow in proportion to the source data. Create links:. Configure the session for incremental aggregation and verify that the file directory has enough disk space for the aggregate files. in the Workflow Manager. by using the process variable for all sessions using incremental aggregation. you can easily change the cache directory when necessary by changing $PMCacheDir. You can configure rules and parameters in a mapping template to specify the transformation logic. Change the configured path or directory for the aggregate files without moving the files to the new location. the data in the previous files is lost. Integration Services rebuild incremental aggregation files they cannot find. Creating a Mapping Template Manually: You can use the Informatica Stencil and the Informatica toolbar to create a mapping template. complete the following steps: 1. Note: To protect the incremental aggregation files from file corruption or disk failure. $PMCacheDir. Verify that the Informatica Stencil and Informatica toolbar are available . The Informatica Stencil contains shapes that represent mapping objects that you can use to create a mapping template. 3. Configure the session to reinitialize the aggregate cache. 2.Each subsequent time you run a session with incremental aggregation. the Integration Service creates a backup of the incremental aggregation files. (v)When you partition a session that uses incremental aggregation. Changing the cache directory without moving the files causes the Integration Service to reinitialize the aggregate cache and gather new aggregate data. You can configure the session for incremental aggregation in the Performance settings on the Properties tab. Preparing for Incremental Aggregation: When you use incremental aggregation. Delete cache files.Use the mapping objects to create visual representation of the mapping. To create a mapping template manually. When the Integration Service rebuilds incremental aggregation files. You can use a Filter or Stored Procedure transformation in the mapping to remove pre-existing source data during a session. when you perform one of the following tasks: • • • • • • Save a new version of the mapping. 4. Use the Informatica Stencil and the Informatica toolbar in the Mapping Architect for Visio to create a mapping template. The Informatica toolbar contains buttons for the tasks you can perform on mapping template. Start Mapping Architect for Visio. The cache directory for the Aggregator transformation must contain enough disk space for two sets of the files. instead of using historical data. periodically back up the files. If you choose to reinitialize the cache. In a grid. • • Mapping Templates A mapping template is a drawing in Visio that represents a PowerCenter mapping. the Integration Service creates one set of cache files for each partition. enter the appropriate directory for the process variable. Decrease the number of partitions. you must capture changes in source data. Be sure the cache directory has enough disk space to store historical data for the session. decide where you want the files stored. When you run multiple sessions with incremental aggregation. it loses aggregate history. 
• • (ii) Verify the incremental aggregation settings in the session properties. Drag the mapping objects from the Informatica Stencil to the drawing window:. However. Then. or you can create a mapping template by importing a Power Center mapping. Configuring the Session Use the following guidelines when you configure the session for incremental aggregation: (i) Verify the location where you want to store the aggregate files. . When an Integration Service rebuilds incremental aggregation files.Create links to connect mapping objects. You can also configure the session to reinitialize the aggregate cache. the Workflow Manager displays a warning indicating the Integration Service overwrites the existing cache and a reminder to clear this option after running the session. You can create a mapping template manually. You can enter sessionspecific directories for the index and data files. you need to configure both mapping and session properties: • • Implement mapping logic or filter to remove pre-existing data. The Integration Service creates new aggregate data.

Creating a Mapping Template Manually:
You can create a mapping template manually, or you can create a mapping template by importing a Power Center mapping. To create a mapping template manually, complete the following steps:
1. Start Mapping Architect for Visio.
2. Verify that the Informatica stencil and Informatica toolbar are available.
3. Drag the mapping objects from the Informatica Stencil to the drawing window.
4. Create links to connect the mapping objects.
5. Configure link rules. Configure rules for each link in the mapping template to indicate how data moves from one mapping object to another. Use parameters to make the rules flexible.
6. Configure the mapping objects. Add a group or expression required by the transformations in the mapping template. To create multiple mappings, set a parameter for the source or target definition.
7. Declare mapping parameters and variables to use when you run sessions in Power Center. After you import the mappings created from the mapping template into Power Center, you can use the mapping parameters and variables in the session or workflow.
8. Validate the mapping template.
9. Save the mapping template. Save changes to the mapping template drawing file.
10. Publish the mapping template. When you publish the mapping template, Mapping Architect for Visio generates a mapping template XML file and a mapping template parameter file (param.xml). If you edit the mapping template drawing file after you publish it, you need to publish again. Do not edit the mapping template XML file.

Importing a Mapping Template from a Power Center Mapping:
If you have a Power Center mapping that you want to use as a basis for a mapping template, export the mapping to a mapping XML file and then use the mapping XML file to create a mapping template.
Note: Export the mapping XML file within the current Power Center release. Informatica does not support imported objects from a different release.
To import a mapping template from a Power Center mapping, complete the following steps:
1. Export a Power Center mapping. In the Designer, select the mapping that you want to base the mapping template on and export it to an XML file.
2. Start Mapping Architect for Visio.
3. Verify that the Informatica stencil and Informatica toolbar are available.
4. Import the mapping. On the Informatica toolbar, click the Create Template from Mapping XML button. Mapping Architect for Visio determines the mapping objects and links included in the mapping and adds the appropriate objects to the drawing window.
5. Verify links. Create or verify the links that connect the mapping objects.
6. Configure link rules. Configure rules for each link in the mapping template to indicate how data moves from one mapping object to another. Use parameters to make the rules flexible.
7. Configure the mapping objects. Add a group or expression required by the transformations in the mapping template. To create multiple mappings, set a parameter for the source or target definition.
8. Declare mapping parameters and variables to use when you run the session in Power Center. After you import the mappings created from the mapping template into Power Center, you can use the mapping parameters and variables in the session or workflow. Note: If the Power Center mapping contains mapping parameters and variables, it is possible that the mapping parameters and variables ($$ParameterName) may not work for all mappings you plan to create from the mapping template. Modify or declare new mapping parameters and variables appropriate for running the new mappings created from the mapping template.
9. Validate the mapping template.
10. Save the mapping template. Save changes to the mapping template drawing file.
11. Publish the mapping template. When you publish the mapping template, Mapping Architect for Visio generates a mapping template XML file and a mapping template parameter file (param.xml). If you make any change to the mapping template after publishing, you need to publish the mapping template again. Do not edit the mapping template XML file.
Note: Mapping Architect for Visio fails to create a mapping template if you import a mapping that includes an unsupported source type, target type, or mapping object.

Grid Processing
When a Power Center domain contains multiple nodes, you can configure workflows and sessions to run on a grid. When you run a workflow on a grid, the Integration Service runs a service process on each available node of the grid to increase performance and scalability. When you run a session on a grid, the Integration Service distributes session threads to multiple DTM processes on nodes in the grid to increase performance and scalability.
The Integration Service distributes workflow tasks and session threads based on how you configure the workflow or session to run:
• Running workflows on a grid: The Integration Service distributes workflows across the nodes in a grid. It also distributes the Session, Command, and predefined Event-Wait tasks within workflows across the nodes in a grid.
• Running sessions on a grid: The Integration Service distributes session threads across the nodes in a grid.
Note: To run workflows on a grid, you must have the Server grid option. To run sessions on a grid, you must have the Session on Grid option.
You create the grid and configure the Integration Service in the Administration Console. To run a workflow on a grid, you configure the workflow to run on the Integration Service associated with the grid. To run a session on a grid, you configure the session to run on the grid.

Running Workflows on a Grid:
When you run a workflow on a grid, the master service process runs the workflow and all tasks except Session, Command, and predefined Event-Wait tasks, which it may distribute to other nodes. The master service process is the Integration Service process that runs the workflow, monitors service processes running on other nodes, and runs the Load Balancer. The Scheduler runs on the master service process node, so it uses the date and time of the master service process node to start scheduled workflows.
The Load Balancer is the component of the Integration Service that dispatches Session, Command, and predefined Event-Wait tasks to the nodes in the grid. The Load Balancer distributes tasks based on node availability. If the Integration Service is configured to check resources, the Load Balancer also distributes tasks based on resource availability.

For example, a workflow contains a Session task, a Decision task, and a Command task. You specify a resource requirement for the Session task. The grid contains four nodes, and Node 4 is unavailable. The master service process runs the Start and Decision tasks. The Load Balancer distributes the Session and Command tasks to nodes on the grid based on resource availability and node availability.

Running Sessions on a Grid:
When you run a session on a grid, the master service process runs the workflow and all tasks except Session, Command, and predefined Event-Wait tasks, as it does when you run a workflow on a grid. The Scheduler runs on the master service process node, so it uses the date and time of the master service process node to start scheduled workflows. In addition, the Load Balancer distributes session threads to DTM processes running on different nodes. When you run a session on a grid, the Load Balancer distributes session threads based on the following factors:
• Node availability: The Load Balancer verifies which nodes are currently running, enabled, and available for task dispatch.
• Resource availability: If the Integration Service is configured to check resources, it identifies nodes that have the resources required by the mapping objects in the session.
• Partitioning configuration: The Load Balancer dispatches groups of session threads to separate nodes based on the partitioning configuration.
You might want to configure a session to run on a grid when the workflow contains a session that takes a long time to run.

Grid Connectivity and Recovery
When you run a workflow or session on a grid, service processes and DTM processes run on different nodes. Network failures can cause connectivity loss between processes running on separate nodes. Services may shut down unexpectedly, or you may disable the Integration Service or service processes while a workflow or session is running. The Integration Service failover and recovery behavior in these situations depends on the service process that is disabled, shuts down, or loses connectivity. Recovery behavior also depends on the following factors:
• High availability option: When you have high availability, workflows fail over to another node if the node or service shuts down. If you do not have high availability, you can manually restart a workflow on another node to recover it.
• Recovery strategy: You can configure a workflow to suspend on error. When a workflow suspends, the recovery behavior depends on the recovery strategy you configure for each task in the workflow.
• Shutdown mode: When you disable an Integration Service or service process, you can specify that the service completes, aborts, or stops processes running on the service. Behavior differs when you disable the Integration Service or you disable a service process, and it also differs when you disable a master service process or a worker service process. The Integration Service or service process may also shut down unexpectedly. In this case, the failover and recovery behavior depend on which service process shuts down and the configured recovery strategy.
• Running mode: If the workflow runs on a grid, the Integration Service can recover workflows and tasks on another node. If a session runs on a grid, you cannot configure a resume recovery strategy. You configure a recovery strategy for tasks within the workflow.
• Operating mode: If the Integration Service runs in safe mode, recovery is disabled for sessions and workflows.
Note: You cannot configure an Integration Service to fail over in safe mode if it runs on a grid.

Workflow Variables
You can create and use variables in a workflow to reference values and record information. For example, use a variable in a Decision task to determine whether the previous task ran properly. If it did, you can run the next task. If not, you can stop the workflow. Use the following types of workflow variables:
• Predefined workflow variables: The Workflow Manager provides predefined workflow variables for tasks within a workflow.
• User-defined workflow variables: You create user-defined workflow variables when you create a workflow.
Use workflow variables when you configure the following types of tasks:
• Assignment tasks: Use an Assignment task to assign a value to a user-defined workflow variable. For example, you can increment a user-defined counter variable by setting the variable to its current value plus 1.
• Decision tasks: Decision tasks determine how the Integration Service runs a workflow. For example, use the Status variable to run a second session only if the first session completes successfully.
• Links: Links connect each workflow task. Use workflow variables in links to create branches in the workflow. For example, after a Decision task, you can create one link to follow when the decision condition evaluates to true, and another link to follow when the decision condition evaluates to false.
• Timer tasks: Timer tasks specify when the Integration Service begins to run the next task in the workflow. Use a user-defined date/time variable to specify the time the Integration Service starts to run the next task.
Use the following keywords to write expressions for user-defined and predefined workflow variables: AND, OR, NOT, TRUE, FALSE, NULL, SYSDATE.
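For instance, a link condition that combines these keywords with the predefined task-specific variables described in the next section might look like the following (the session name s_m_load_orders is an illustrative placeholder, not a name defined in this document):

    $s_m_load_orders.Status = SUCCEEDED AND $s_m_load_orders.ErrorCode = 0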

Predefined Workflow Variables:
Each workflow contains a set of predefined variables that you use to evaluate workflow and task conditions. Use the following types of predefined variables:
• Task-specific variables: The Workflow Manager provides a set of task-specific variables for each task in the workflow. Use task-specific variables in a link condition to control the path the Integration Service takes when running the workflow. The Workflow Manager lists task-specific variables under the task name in the Expression Editor.
• Built-in variables: Use built-in variables in a workflow to return run-time or system information such as the folder name, Integration Service name, system date, or workflow start time. The Workflow Manager lists built-in variables under the Built-in node in the Expression Editor.

The task-specific variables, the task types they apply to, and their data types are:

Condition (Decision, Integer): Evaluation result of the decision condition expression. If the task fails, the Workflow Manager keeps the condition set to null. Sample syntax: $Dec_TaskStatus.Condition = <TRUE | FALSE | NULL | any integer>

EndTime (All tasks, Date/Time): Date and time the associated task ended. Precision is to the second. Sample syntax: $s_item_summary.EndTime > TO_DATE('11/10/2004 08:13:25')

ErrorCode (All tasks, Integer): Last error code for the associated task. If there is no error, the Integration Service sets ErrorCode to 0 when the task completes. Sample syntax: $s_item_summary.ErrorCode = 24013. Note: You might use this variable when a task consistently fails with this final error message.

ErrorMsg (All tasks, Nstring): Last error message for the associated task. If there is no error, the Integration Service sets ErrorMsg to an empty string when the task completes. Variables of type Nstring can have a maximum length of 600 characters. Sample syntax: $s_item_summary.ErrorMsg = 'PETL_24013 Session run completed with failure'. Note: You might use this variable when a task consistently fails with this final error message.

FirstErrorCode (Session, Integer): Error code for the first error message in the session. If there is no error, the Integration Service sets FirstErrorCode to 0 when the session completes. Sample syntax: $s_item_summary.FirstErrorCode = 7086

FirstErrorMsg (Session, Nstring): First error message in the session. If there is no error, the Integration Service sets FirstErrorMsg to an empty string when the task completes. Variables of type Nstring can have a maximum length of 600 characters. Sample syntax: $s_item_summary.FirstErrorMsg = 'TE_7086 Tscrubber: Debug info… Failed to evalWrapUp'

PrevTaskStatus (All tasks, Integer): Status of the previous task in the workflow that the Integration Service ran. Statuses include ABORTED, FAILED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the previous task. Sample syntax: $Dec_TaskStatus.PrevTaskStatus = FAILED

SrcFailedRows (Session, Integer): Total number of rows the Integration Service failed to read from the source. Sample syntax: $s_dist_loc.SrcFailedRows = 0

SrcSuccessRows (Session, Integer): Total number of rows successfully read from the sources. Sample syntax: $s_dist_loc.SrcSuccessRows > 2500

StartTime (All tasks, Date/Time): Date and time the associated task started. Precision is to the second. Sample syntax: $s_item_summary.StartTime > TO_DATE('11/10/2004 08:13:25')

Status (All tasks, Integer): Status of the task in the workflow. Statuses include ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the current task. Sample syntax: $s_dist_loc.Status = SUCCEEDED

TgtFailedRows (Session, Integer): Total number of rows the Integration Service failed to write to the target. Sample syntax: $s_dist_loc.TgtFailedRows = 0

TgtSuccessRows (Session, Integer): Total number of rows successfully written to the target. Sample syntax: $s_dist_loc.TgtSuccessRows > 0

TotalTransErrors (Session, Integer): Total number of transformation errors. Sample syntax: $s_dist_loc.TotalTransErrors = 5
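Putting a few of these variables together, a link or decision condition could, for instance, check that a session read and wrote the same number of rows without transformation errors (this reuses the s_dist_loc session name from the sample syntax above and is only an illustrative sketch):

    $s_dist_loc.SrcSuccessRows = $s_dist_loc.TgtSuccessRows AND $s_dist_loc.TotalTransErrors = 0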

User-Defined Workflow Variables:
You can create variables within a workflow. When you create a variable in a workflow, it is valid only in that workflow. Use the variable in tasks within that workflow. You can edit and delete user-defined workflow variables.
Use user-defined variables when you need to make a workflow decision based on criteria you specify. For example, you create a workflow to load data to an orders database nightly. You also need to load a subset of this data to headquarters periodically, every tenth time you update the local orders database. Create separate sessions to update the local database and the one at headquarters, and use a user-defined variable to determine when to run the session that updates the orders database at headquarters.
To configure user-defined workflow variables, complete the following steps:
1. Create a persistent workflow variable, $$WorkflowCount, to represent the number of times the workflow has run.
2. Add a Start task and both sessions to the workflow.
3. Place a Decision task after the session that updates the local orders database. Set up the decision condition to check whether the number of workflow runs is evenly divisible by 10. Use the modulus (MOD) function to do this.
4. Create an Assignment task to increment the $$WorkflowCount variable by one.
5. Link the Decision task to the session that updates the database at headquarters when the decision condition evaluates to true. Link it to the Assignment task when the decision condition evaluates to false.
When you configure workflow variables using these conditions, the session that updates the local database runs every time the workflow runs, and the session that updates the database at headquarters runs every tenth time the workflow runs.
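As a rough sketch of the expressions this example implies (the Decision task name $Dec_CheckCount is an illustrative placeholder, not a name used elsewhere in this document):

    Assignment task expression:        $$WorkflowCount = $$WorkflowCount + 1
    Decision task condition:           MOD($$WorkflowCount, 10) = 0
    Link to the headquarters session:  $Dec_CheckCount.Condition = TRUE
    Link to the Assignment task:       $Dec_CheckCount.Condition = FALSE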

Creating User-Defined Workflow Variables:
You can create workflow variables for a workflow in the workflow properties. To create a workflow variable:
1. In the Workflow Designer, create a new workflow or edit an existing one.
2. Select the Variables tab.
3. Click Add.
4. Enter the following information and click OK:
• Name: Variable name. The correct format is $$VariableName. Workflow variable names are not case sensitive. Do not use a single dollar sign ($) for a user-defined workflow variable; the single dollar sign is reserved for predefined workflow variables.
• Data type: Data type of the variable. You can select from the following data types: Date/Time, Double, Integer, and Nstring.
• Persistent: Whether the variable is persistent. Enable this option if you want the value of the variable retained from one execution of the workflow to the next.
• Default Value: Default value of the variable. The Integration Service uses this value for the variable during sessions if you do not set a value for the variable in the parameter file and there is no value stored in the repository. Variables of type Date/Time can have the following formats: MM/DD/RR, MM/DD/RR HH24:MI, MM/DD/RR HH24:MI:SS, MM/DD/RR HH24:MI:SS.MS, MM/DD/RR HH24:MI:SS.US, MM/DD/RR HH24:MI:SS.NS, MM/DD/YYYY, MM/DD/YYYY HH24:MI, MM/DD/YYYY HH24:MI:SS, MM/DD/YYYY HH24:MI:SS.MS, MM/DD/YYYY HH24:MI:SS.US, MM/DD/YYYY HH24:MI:SS.NS. You can use the following separators: dash (-), slash (/), backslash (\), colon (:), period (.), and space. The Integration Service ignores extra spaces. You cannot use one- or three-digit values for the year or the "HH12" format for the hour. Variables of type Nstring can have a maximum length of 600 characters.
• Is Null: Whether the default value of the variable is null. If the default value is null, enable this option.
• Description: Description associated with the variable.
5. To validate the default value of the new workflow variable, click the Validate button.
6. Click Apply to save the new workflow variable.
7. Click OK.

Interview Zone
Hi readers, these are the questions I would normally expect an interviewee to know when I sit on an interview panel. I request my readers to start posting their answers to these questions in the discussion forum under the "Informatica technical interview guidance" tag; I will review them, and only valid answers will be kept.
1. Explain your project.
2. What are your daily routines?
3. How many mappings have you created altogether in your project?
4. In which account does your project fall?
5. What is your reporting hierarchy?
6. How many complex mappings have you created? Could you please explain a situation for which you developed such a complex mapping?
7. What is your involvement in performance tuning of your project?
8. What is the schema of your project, and why did you opt for that particular schema?
9. What are your roles in this project?
10. Can you give one situation where an approach you adopted improved performance dramatically?
11. Were you involved in more than two projects simultaneously?
12. Do you have any experience in production support?
13. What kinds of testing have you done on your project (unit, integration, system, or UAT)? And what enhancements were done after testing?
14. How many dimension tables are there in your project, and how are they linked to the fact table?
15. How do we do the fact load?
16. How did you implement CDC in your project?
17. What does your mapping in File to Load look like?
18. What does your mapping in Load to Stage look like?
19. What does your mapping in Stage to ODS look like?
20. What is the size of your data warehouse?
21. What are your daily feed size and weekly feed size?

22. Which approach (top-down or bottom-up) was used in building your project?
23. How do you access your sources (are they flat files or relational)?
24. Have you developed any stored procedures or triggers in this project? How did you use them and in which situations?
25. Did your project go live? What issues did you face while moving your project from the test environment to the production environment?
26. What is the biggest challenge that you encountered in this project?
27. What is the scheduler tool you have used in this project? How did you schedule jobs using it?

Informatica Experienced Interview Questions – part 1
1. Difference between Informatica 7x and 8x?
2. Difference between connected and unconnected Lookup transformations in Informatica?
3. Difference between stop and abort in Informatica?
4. Difference between static and dynamic caches?
5. What is a persistent lookup cache? What is its significance?
6. Difference between a reusable transformation and a mapplet?
7. How does the Informatica server sort string values in the Rank transformation?
8. Is the Sorter an active or passive transformation? When do we consider it to be active and when passive?
9. Explain the Informatica server architecture.
10. In an Update Strategy, which gives better performance, a relational table or a flat file? Why?
11. What are the output files that the Informatica server creates while running a session?
12. Can you explain what error tables in Informatica are and how we do error handling in Informatica?
13. Difference between constraint-based loading and target load plan?
14. Difference between the IIF and DECODE functions?
15. How do you import an Oracle sequence into Informatica?
16. What is a parameter file?
17. Difference between normal load and bulk load?
18. How will you create a header and footer in a target using Informatica?
19. What are the session parameters?
20. Where does Informatica store rejected data? How do we view it?
21. What is the difference between partitioning of relational targets and file targets?
22. What are mapping parameters and variables, and in which situations can we use them?
23. What do you mean by direct loading and indirect loading in session properties?
24. How do we implement a recovery strategy while running concurrent batches?
25. Explain the versioning concept in Informatica.
26. What is data driven?
27. What is a batch? Explain the types of batches.
28. What are the types of metadata that the repository stores?
29. Can you use the mapping parameters or variables created in one mapping in another mapping?
30. Why did we use a stored procedure in our ETL application?
31. When we can join tables at the Source Qualifier itself, why do we go for the Joiner transformation?
32. What is the default join operation performed by the Lookup transformation?
33. What is a hash table in Informatica?
34. In a Joiner transformation, you should specify the table with fewer rows as the master table. Why?
35. Difference between cached lookup and uncached lookup?
36. Explain what the DTM does when you start a workflow.
37. Explain what the Load Manager does when you start a workflow.
38. In a sequential batch, how do I stop one particular session from running?
39. What are the types of aggregations available in Informatica?
40. How do I create indexes after the load process is done?
41. How do we improve the performance of the Aggregator transformation?
42. What are the different types of caches available in Informatica? Explain in detail.
43. What is polling?
44. What are the limitations of the Joiner transformation?
45. What is a mapplet?
46. What are active and passive transformations?

47. What are the options in the target session for the Update Strategy transformation?
48. What is a code page? Explain the types of code pages.
49. What do you mean by rank cache?
50. How can you delete duplicate rows without using a dynamic lookup? Tell me any other ways of deleting duplicate rows using a lookup.
51. Can you copy a session into a different folder or repository?
52. What is tracing level and what are its types?
53. What is the command used to run a batch?
54. What are the unsupported repository objects for a mapplet?
55. If your workflow is running slow, what is your approach towards performance tuning?
56. What are the types of mapping wizards available in Informatica?
57. After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can we map these three ports directly to the target?
58. Why do we use the Stored Procedure transformation?
59. Which object is required by the debugger to create a valid debug session?
60. Can we use an active transformation after an Update Strategy transformation?
61. Explain how we set the update strategy at the mapping level and at the session level.
62. What is the exact use of the 'Online' and 'Offline' server connect options while defining a workflow in the Workflow Monitor? (The system hangs when the 'Online' server connect option is used.)
63. What is change data capture?
64. Write a session parameter file which will change the source and targets for every session, i.e., different sources and targets for each session run.
65. What are partition points?
66. What are the different threads in the DTM process?
67. Can we do ranking on two ports? If yes, explain how.
68. What is a transformation?
69. What does the Stored Procedure transformation do that is special compared to other transformations?
70. How do you recognize whether the newly added rows got inserted or updated?
71. What is data cleansing?
72. My flat file's size is 400 MB and I want to see the data inside the file without opening it. How do I do that?
73. Difference between Filter and Router?
74. How do you handle the decimal places when you are importing a flat file?
75. What is the difference between $ and $$ in a mapping or parameter file? In which cases are they generally used?
76. While importing a relational source definition from a database, what metadata of the source do you import?
77. Difference between PowerMart and PowerCenter?
78. What kinds of sources and targets can be used in Informatica?
79. If a Sequence Generator (with an increment of 1) is connected to, say, 3 targets and each target uses the NEXTVAL port, what value will each target get?
80. What do you mean by SQL override?
81. What is a shortcut in Informatica?
82. How does Informatica do variable initialization for Number/String/Date types?
83. How many different locks are available for repository objects?
84. What are the transformations that use a cache for performance?
85. What is the use of Forward/Reject rows in a mapping?
86. In how many ways can you filter the records?
87. How do you delete duplicate records from a source database or flat files? Can we use post-SQL to delete these records? In the case of a flat file, how can you delete duplicates before it starts loading?
88. You are required to perform bulk loading using Informatica on Oracle. What actions would you perform at the Informatica and Oracle levels for a successful load?
89. What precautions do you need to take when you use a reusable Sequence Generator transformation for concurrent sessions?
90. Is a negative increment possible in a Sequence Generator? If yes, how would you accomplish it?
91. In which directory does Informatica look for the parameter file, and what happens if it is missing when the session starts? Does the session stop after it starts?
92. Informatica is installed on a personal laptop and is complaining that the server could not be reached. What steps would you take?
93. You have more than five mappings that use the same lookup. How can you manage the lookup?
94. What will happen if you copy a mapping from one repository to another repository and there is no identical source?
95. How can you limit the number of running sessions in a workflow?
96. An Aggregator transformation has 4 ports (sum(col1), group by col2, col3); which port should be the output?
97. What is a dynamic lookup and what is the significance of NewLookupRow? How will you use them for rejecting duplicate records?
98. If you have more than one pipeline in your mapping, how will you change the order of load?
99. When you export a workflow from the Repository Manager, what does the XML contain? The workflow only?
100. Your session failed, and when you try to open the log file it complains that the session details are not available. How would you trace the error? Which log file would you look for?
101. You want to attach a file as an email attachment from a particular directory using the Email task in Informatica. How will you do it?
102. You have a requirement to be alerted of any long-running sessions in your workflow. How can you create a workflow that will send you an email for sessions running more than 30 minutes? You can use any method: a shell script, a procedure, or an Informatica mapping or workflow control.

Data warehousing Concepts Based Interview Questions
1. What is a data warehouse?
2. What are data marts?
3. What is an ER diagram?
4. What is a star schema?
5. What is dimensional modelling?
6. What is a snowflake schema?
7. What are the different methods of loading dimension tables?
8. What are aggregate tables?
9. What is the difference between OLTP and OLAP?
10. What is ETL?
11. What are the various ETL tools in the market?
12. What are the various reporting tools in the market?
13. What is a fact table?
14. What is a dimension table?
15. What is a lookup table?
16. What is a general-purpose scheduling tool? Name some of them.
17. What modeling tools are available in the market? Name some of them.
18. What is real-time data warehousing?
19. What is data mining?

20. What is normalization? What are First Normal Form, Second Normal Form, and Third Normal Form?
21. What is an ODS?
22. What type of indexing mechanism do we need to use for a typical data warehouse?
23. Which columns go to the fact table and which columns go to the dimension table? (If the user needs to see <data element> broken by <data element>, all elements before "broken by" are fact measures and all elements after "broken by" are dimension elements.)
24. What is the level of granularity of a fact table? What does this signify? (With weekly-level summarization, there is no need to keep the invoice number in the fact table anymore.)
25. How are dimension tables designed? (De-normalized, wide, short, using surrogate keys, and containing additional date fields and flags.)
26. What are slowly changing dimensions?
27. What are non-additive facts? (For example, inventory levels or account balances in a bank.)
28. What are conformed dimensions?
29. What is a VLDB? (If the database is too large to back up in the available time frame, then it is a VLDB.)
30. What are SCD1, SCD2 and SCD3?
