
An Introduction to Informatica

Chandrashekar P

Abstract

Informatica's ETL product is known as Informatica PowerCenter.

It is a tool that supports all the steps of the Extraction, Transformation
and Load process, and it is easy to use. It can communicate
with all major data sources (mainframe, RDBMS, flat
files, XML, VSAM, SAP, etc.) and can move and transform data between them.
It can move huge volumes of data very effectively.

It can effectively join data from two distinct data sources.

This document gives you an introduction to Informatica.

Table of Contents

1. An Overview of DWH
2. Informatica Architecture
2.1. Informatica PowerCenter Client Tools
2.2. Application Services
3. Informatica Transformations
3.1. Source Qualifier Transformation
3.2. Expression Transformation
3.3. Aggregate Transformation
3.4. Filter Transformation
3.5. Router Transformation
3.6. Sorter Transformation
3.7. Joiner Transformation
3.8. Lookup Transformation
3.9. Union Transformation
4. Workflow Creation
5. Summary


1. An Overview of DWH
A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing. It usually contains historical data derived
from transaction data, but it can include data from other sources. In addition to a
relational database, a data warehouse environment includes an extraction,
transportation, transformation, and loading (ETL) solution.

ETL Technology (shown below with arrows) is an important component of the
Data Warehousing Architecture. It is used to copy data from Operational
Applications to the Data Warehouse Staging Area, from the DW Staging Area into
the Data Warehouse and finally from the Data Warehouse into a set of conformed
Data Marts that are accessible by decision makers.
Several ETL tools are available for data warehousing, including Informatica,
Ab Initio, DataStage, and Oracle Data Integrator.

Let's take a detailed look at Informatica.


2. Informatica Architecture

[Figure: Tool view]

2.1. Informatica PowerCenter Client Tools
These are the development tools installed at the developer's end. These tools enable a developer to:
• Define the transformation process, known as a mapping (Designer).

Click on Designer: the Repository Navigator window is displayed. Right-click the required repository, then select Connect. Provide the credentials to log in and access the folders, then select the corresponding folder → click Open.

The designing window is displayed.

• Define run-time properties for a mapping, known as sessions (Workflow Manager).

• Monitor the execution of sessions (Workflow Monitor).
• Manage the repository, which is useful for administrators (Repository Manager).

2.2. Application Services
Application services are a group of services that represent PowerCenter server-based functionality. When you configure an application service, you designate the node where it runs.

Types of Application Services:

Repository Service: The Repository Service is an application service that manages the repository. It retrieves, inserts, and updates metadata in the repository database tables.

Integration Service: The Integration Service is an application service that runs data integration sessions and workflows.

SAP BW Service: The SAP BW Service is an application service that listens for RFC requests from SAP BW and initiates workflows to extract from or load to SAP BW.

Web Services Hub: The Web Services Hub is a web service gateway for external clients. It processes SOAP requests from web service clients that want to access PowerCenter functionality through web services. Web service clients access the Integration Service and Repository Service through the Web Services Hub.

Core Services: The PowerCenter architecture has a new set of Core Services, comprising the Log Service, Gateway Service, Administration Service, Configuration Service, Authentication Service, and Domain Service.

3. Informatica Transformations
A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. A transformation can be active or passive, and connected or unconnected, as noted for each transformation in the sections that follow.

Types of Transformations:
Note: To view all the available types of transformations → click on Transformations in the toolbar.

3.1. Source Qualifier Transformation
1. Active and Connected transformation.
2. It is used to join data originating from the same source database.
3. It can filter rows when the Integration Service reads source data.
4. It can specify an outer join rather than the default inner join.
5. It can specify sorted ports.
6. It can select only distinct values from the source.

Hands-on:
Select the type of source. (The source can be a DB, an XML file, or a flat file.) In this example the source is a DB. Click on "Import from Database", then select the data source and the source table.

The table is imported and stored in the "Source" folder. Now navigate to the Mapping Designer. Click on Mapping → a popup window is displayed; provide the mapping name (e.g. m_*****).

Now open the Source folder and drag the required table into the Mapping Designer window. As each source definition is dragged in, it gets its own Source Qualifier (SQ). Double-click on the SQ → a popup window opens. In the Ports tab → provide the needed ports in the same order as the columns returned by your query. Note: Not all the fields may be required; in that case, delete the unwanted ports. While linking the fields, make sure the datatypes match.
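The Source Qualifier options listed in section 3.1 (source filter, distinct values, sorted ports) behave much like clauses of the SELECT statement that is sent to the source database. The sketch below is not Informatica code. It uses Python's built-in sqlite3 module and a hypothetical `employees` table, purely to illustrate the kind of query those options correspond to, and why the port order has to follow the column order of the query.

```python
import sqlite3

# Hypothetical source table standing in for a relational source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER, name TEXT, dept_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(3, "Asha", 10), (1, "Ravi", 20), (2, "Meena", 10), (2, "Meena", 10)],
)

# Rough correspondence between Source Qualifier options and SQL:
#   source filter   -> WHERE clause
#   select distinct -> DISTINCT keyword
#   sorted ports    -> ORDER BY on the leading columns
query = """
    SELECT DISTINCT emp_id, name, dept_id
    FROM employees
    WHERE dept_id = 10
    ORDER BY emp_id
"""

# Each result tuple follows the SELECT column order: emp_id, name, dept_id.
rows = conn.execute(query).fetchall()
print(rows)
```

The duplicate row for emp_id 2 is returned once, and the row for department 20 never leaves the database, which is the main benefit of filtering at the source.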

In the Properties tab, the query can be generated as per the requirement.

3.2. Expression Transformation
• Passive and Connected transformation.
• It lets you perform calculations on a row-by-row basis only.

Example: discount of each product, concatenating names.

Click on the Expression Transformation icon and drag it into the designer window. Double-click on the dragged Exp_Trns → the Ports tab shows the details of the ports and their types (input, output or variable).
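To make the row-by-row behaviour concrete, here is a minimal Python sketch of what an Expression transformation does conceptually. It is an illustration only, not PowerCenter expression syntax. The input port names (`first_name`, `last_name`, `monthly_income`) and the formulas for `Name` and `Annual_Income` are assumptions made for this example.

```python
def expression_transform(row):
    """Derive output ports from a single input row; no other row is consulted."""
    out = dict(row)
    # Concatenate names (hypothetical input ports).
    out["Name"] = f"{row['first_name']} {row['last_name']}"
    # Assumed formula: annual income is twelve times the monthly income.
    out["Annual_Income"] = row["monthly_income"] * 12
    return out


source_rows = [
    {"first_name": "Asha", "last_name": "Rao", "monthly_income": 5000},
    {"first_name": "Ravi", "last_name": "Kumar", "monthly_income": 4200},
]

# One output row per input row: the row count never changes (passive).
output_rows = [expression_transform(r) for r in source_rows]
for r in output_rows:
    print(r["Name"], r["Annual_Income"])
```

Because every output row depends only on its own input row, the transformation cannot drop or add rows, which is what "passive" means here.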

In the above example we have variables named Name and Annual_Income. Click on the expression field of the port and write the required expression. In the Functions tab → the built-in functions of the tool can be seen along with their syntax.

Import Target: Navigate to the Target Designer → select Create and provide the target table name.

• Add the required columns as per the output, with the appropriate datatype and precision.

Now navigate to the Mapping Designer and drag the created target into the designer window. After linking all the fields, save the mapping. The output window shows the status of the mapping.

[Figure: Source values]

[Figure: Output]
To get the output, run the corresponding workflow. (Please refer to section 4, Workflow Creation, for details.)

3.3. Aggregate Transformation
1. Active and Connected transformation.
2. It lets you perform calculations on groups.
3. Aggregate functions such as average, sum and count operate on multiple rows or groups.

Example: to calculate the total of daily sales, or the average of monthly or yearly sales.

Aggregate functions: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.

Click on the Aggregate Transformation icon and drag it into the designer window. Double-click on the dragged Aggr_Trns → the Ports tab shows the details of the ports and their types (input, output or variable). Select the column on which the grouping needs to be done.
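The grouping step can be pictured with a short Python sketch. This is a conceptual illustration under assumed data, not the tool's implementation: the `sales` rows and the column names `sale_date` and `amount` are hypothetical, and only SUM, AVG and COUNT are shown.

```python
from collections import defaultdict


def aggregate(rows, group_by, value):
    """Group rows on one column and compute SUM, AVG and COUNT per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value])
    return [
        {
            group_by: key,
            "SUM": sum(vals),
            "AVG": sum(vals) / len(vals),
            "COUNT": len(vals),
        }
        for key, vals in groups.items()
    ]


# Hypothetical daily sales rows (assumption for this example).
sales = [
    {"sale_date": "2024-01-01", "amount": 100.0},
    {"sale_date": "2024-01-01", "amount": 50.0},
    {"sale_date": "2024-01-02", "amount": 75.0},
]

daily_totals = aggregate(sales, "sale_date", "amount")
for row in daily_totals:
    print(row)
```

Three input rows become two output rows, one per group. A transformation that can change the number of rows in this way is called active.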

[Figure: Source values]
[Figure: Output]

3.4. Filter Transformation
Active and Connected transformation. It is used to filter out rows in a mapping that do not meet the condition.

Example: employees who work in department 10; products whose rate falls between $500 and $1000.

Click on the Filter Transformation icon and drag it into the designer window. Double-click on the dragged Filtr_Trns → the Ports tab will have the conditions for filtering.
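The two filter examples above can be sketched in Python as follows. This only illustrates the idea of a filter condition. The sample rows and the column names `dept_id` and `rate` are assumptions, and the price range is treated as inclusive.

```python
def filter_transform(rows, condition):
    """Pass through only the rows for which the condition is true."""
    return [row for row in rows if condition(row)]


# Hypothetical sample data (assumption for this example).
employees = [
    {"name": "Asha", "dept_id": 10},
    {"name": "Ravi", "dept_id": 20},
    {"name": "Meena", "dept_id": 10},
]
products = [
    {"product": "Chair", "rate": 450},
    {"product": "Desk", "rate": 700},
    {"product": "Cabinet", "rate": 1200},
]

dept_10 = filter_transform(employees, lambda r: r["dept_id"] == 10)
mid_range = filter_transform(products, lambda r: 500 <= r["rate"] <= 1000)

print([r["name"] for r in dept_10])
print([r["product"] for r in mid_range])
```

Rows that fail the condition are simply discarded; there is no second output for them, which is the gap the Router transformation fills.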

[Figure: Output]

3.5. Router Transformation
• Active and Connected transformation.
• It is similar to the Filter transformation because both allow you to apply a condition to test data.
• The only difference is that the Filter transformation drops the data that does not meet the condition, whereas the Router can capture the data that does not meet any condition and route it to a default output group.
• The Router transformation is more efficient than using several Filter transformations for the same job.

Example: one group for State = Michigan, one group for State = California, and a default group for all other states.
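The state example can be sketched in Python as below. This is a conceptual model only. The group names NewGroup1 and NewGroup2 follow the screenshots in the original document, while the sample rows and the helper name `route` are assumptions. The sketch also assumes that a row satisfying more than one group condition is sent to every matching group.

```python
def route(rows, groups):
    """Send each row to every group whose condition it meets.

    Rows that meet no condition go to the DEFAULT group instead of being dropped.
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


# Hypothetical customer rows (assumption for this example).
customers = [
    {"name": "A", "state": "Michigan"},
    {"name": "B", "state": "California"},
    {"name": "C", "state": "Ohio"},
]

routed = route(
    customers,
    {
        "NewGroup1": lambda r: r["state"] == "Michigan",
        "NewGroup2": lambda r: r["state"] == "California",
    },
)
for group, rows in routed.items():
    print(group, [r["name"] for r in rows])
```

The input is read once and tested against all conditions in a single pass, which is why one Router is usually preferred over a separate Filter per condition.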

[Figure: NewGroup1 output]

[Figure: NewGroup2 output]
[Figure: Default group output]

3.6. Sorter Transformation
• Active and Connected transformation.
• It is used to sort data in either ascending or descending order according to a specified sort key.
• It can also be configured for case-sensitive sorting, and you can specify whether the output rows should be distinct.
• When you create a Sorter transformation in a mapping, you specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order.

Example: fetching from a staging table → sorting without a query (Sorter).
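The Sorter options described above (sort direction, case sensitivity, distinct output) can be modelled with a small Python function. This is an illustrative sketch with a single sort key. The function name `sorter`, its parameters and the sample rows are assumptions, not part of the tool.

```python
def sorter(rows, key, descending=False, case_sensitive=True, distinct=False):
    """Sort rows on one key, optionally ignoring case and removing duplicate rows."""
    if distinct:
        seen = set()
        unique = []
        for row in rows:
            marker = tuple(sorted(row.items()))  # hashable fingerprint of the row
            if marker not in seen:
                seen.add(marker)
                unique.append(row)
        rows = unique

    def sort_key(row):
        value = row[key]
        if isinstance(value, str) and not case_sensitive:
            return value.lower()
        return value

    return sorted(rows, key=sort_key, reverse=descending)


# Hypothetical staging rows (assumption for this example).
staging = [{"name": "bob"}, {"name": "Carol"}, {"name": "alice"}, {"name": "bob"}]

case_sensitive_sort = [r["name"] for r in sorter(staging, "name")]
distinct_sort = [
    r["name"] for r in sorter(staging, "name", case_sensitive=False, distinct=True)
]
print(case_sensitive_sort)
print(distinct_sort)
```

With case-sensitive sorting, uppercase letters sort ahead of lowercase ones in this sketch, so "Carol" comes first. The distinct option reduces the row count, which is why the Sorter is classed as active.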

3.7. Joiner Transformation
• Active and Connected transformation.
• It is used to join data from two related heterogeneous sources residing in different locations.
• It can also be used to join data from the same source.

Note: To join two sources, there must be at least one pair of matching columns between them, and you must specify one source as the master and the other as the detail.

Click on the Joiner Transformation icon and drag it into the designer window. Double-click on the dragged Jnr_Trns → the Condition tab will have the conditions for joining, and the Ports tab will have the type of join.
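A master/detail join on a matching column can be sketched in Python as follows. This models only a normal (inner) join, in which detail rows without a matching master row are dropped. The `salary_hiked_details` name comes from the original example, but the column names (`designation`, `hike_pct`, `name`) and the sample rows are assumptions.

```python
def joiner(master_rows, detail_rows, master_key, detail_key):
    """Normal (inner) join: keep only detail rows that match a master row."""
    # Index the master source first, then stream the detail rows past it.
    index = {}
    for m in master_rows:
        index.setdefault(m[master_key], []).append(m)

    joined = []
    for d in detail_rows:
        for m in index.get(d[detail_key], []):
            joined.append({**m, **d})
    return joined


# Hypothetical master source: hike percentage per designation (assumption).
salary_hiked_details = [
    {"designation": "Developer", "hike_pct": 10},
    {"designation": "Tester", "hike_pct": 8},
]
# Hypothetical detail source: employees (assumption).
employees = [
    {"name": "Asha", "designation": "Developer"},
    {"name": "Ravi", "designation": "Project Manager"},
]

joined = joiner(salary_hiked_details, employees, "designation", "designation")
print(joined)
```

Because the master source has no row for "Project Manager", that employee does not appear in the joined output.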

Because the joined salary_hiked_details table does not hold information for Project Manager, the output does not contain the PM information.

3.8. Lookup Transformation
• Passive transformation; Connected or Unconnected.
• It is used to look up data in a flat file, relational table, view, or synonym.
• It compares the Lookup transformation ports (input ports) to the source column values based on the lookup condition. The returned values can later be passed to other transformations.
• You can create a lookup definition from a source qualifier, and you can use multiple Lookup transformations in a mapping.

Function → unconnected → returns 1 port
Procedure → connected → returns multiple ports

For example: Emp_details holds all the existing employee-level information, and emp_ids need to be created for New_Joinees. A Lookup can be used to verify the existing ids and to generate new ids. (A Joiner will not work here, as there is no common key.)

Select the Lookup icon → select the lookup table from Source / Target, or otherwise import it. Drag the needed values from the Source Qualifier to the Lookup. Double-click the Lookup transformation → name the fields coming from the Source Qualifier as "In_****".

In the Condition tab → provide the required conditions. Link the fields appropriately → save the mapping.
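The Emp_details / New_Joinees scenario can be pictured with a short Python sketch. It models the lookup as a dictionary built from the lookup table and probed once per input row. The document does not say which column the lookup condition uses, so matching on `name` is purely an assumption, as are the helper name `assign_emp_ids`, the sample data and the rule of numbering new ids after the current maximum.

```python
def assign_emp_ids(new_joinees, emp_details, key="name"):
    """Look up each new joinee in emp_details; reuse an existing id or generate one."""
    # Lookup cache: lookup-condition column -> emp_id.
    cache = {row[key]: row["emp_id"] for row in emp_details}
    next_id = max(cache.values(), default=0) + 1

    output = []
    for row in new_joinees:
        emp_id = cache.get(row[key])  # lookup condition: In_name = name
        if emp_id is None:            # no match: generate a new id
            emp_id = next_id
            cache[row[key]] = emp_id
            next_id += 1
        output.append({**row, "emp_id": emp_id})
    return output


# Hypothetical lookup table and input rows (assumptions for this example).
emp_details = [{"emp_id": 101, "name": "Asha"}, {"emp_id": 102, "name": "Ravi"}]
new_joinees = [{"name": "Meena"}, {"name": "Ravi"}, {"name": "Kiran"}]

result = assign_emp_ids(new_joinees, emp_details)
for row in result:
    print(row)
```

Every input row produces exactly one output row, whether or not the lookup finds a match, so the row count is unchanged.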

[Figure: Source table values]
[Figure: Output] (The MSI percentage is assigned to each employee depending on performance and designation.)

Unconnected Lookup:

[Figure: Output]

3.9. Union Transformation
• Active and Connected transformation.
• The Union transformation is a multiple input group transformation that you use to merge data from multiple pipelines or pipeline branches into one pipeline branch.
• It merges data from multiple sources, similar to the UNION ALL SQL statement, which combines the results of two or more SQL statements.
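The merging behaviour can be sketched in a few lines of Python. This is a conceptual illustration only. The function name `union_transform` and the sample pipelines are assumptions, and the sketch assumes that all input groups share the same set of ports.

```python
from itertools import chain


def union_transform(*pipelines):
    """Merge several input groups into one output, like SQL UNION ALL."""
    return list(chain.from_iterable(pipelines))


# Two hypothetical pipelines with the same ports (assumption for this example).
pipeline_a = [{"emp_id": 1, "name": "Asha"}, {"emp_id": 2, "name": "Ravi"}]
pipeline_b = [{"emp_id": 2, "name": "Ravi"}, {"emp_id": 3, "name": "Meena"}]

merged = union_transform(pipeline_a, pipeline_b)
print(len(merged))
```

The row for emp_id 2 appears twice in the merged result, since nothing in the merge compares rows with each other.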

Similar to the UNION ALL statement, the Union transformation does not remove duplicate rows.

4. Workflow Creation
Navigate to the Workflow Manager → click on Workflow Designer → select Workflow → Create → provide the new workflow name (wf_*****).

Click on Task → Create → provide the new session name → select the appropriate mapping for the workflow.

Click Tasks → Link Task → then link the session.

Double-click on the session:
General → select "Fail parent if this task fails".
Properties → provide the log file name and path details.

Mapping → select the DB connection details for the source, and provide the target file type and location for the output file.

Note: The SQL query can be modified at session level. If the query in the SQ and the query in the session are different, the job uses the session-level query.
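The precedence rule in the note above amounts to a simple fallback, sketched here in Python. The function name `effective_query` and the sample queries are hypothetical and are not part of any Informatica API. The sketch only restates the rule that a session-level query, when present, wins over the one defined in the Source Qualifier.

```python
def effective_query(sq_query, session_query=None):
    """Return the query the job will run: the session-level one if it is set."""
    return session_query if session_query else sq_query


# Hypothetical queries (assumptions for this example).
sq_query = "SELECT emp_id, name FROM employees"
session_query = "SELECT emp_id, name FROM employees WHERE dept_id = 10"

print(effective_query(sq_query))
print(effective_query(sq_query, session_query))
```

One practical consequence is that a change made only in the mapping's Source Qualifier has no effect while a different query is still set at session level.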

After making all the modifications → save the workflow. To run the workflow → right-click on the appropriate workflow → Start Workflow.

After the workflow is started → the Monitor window opens. To view the reason for a failure → right-click on the session → Get Session Log.

After making the required changes → refresh the mapping → then run the workflow again.