
Part 3,4,5,6 & 7: Informatica

Informatica Overview and Transformations

Intellipaat Software Solutions Pvt. Ltd. © Copyright Intellipaat.com All rights reserved
Introduction:
• Is a GUI-based ETL product from Informatica Corporation.
• Is a client-server technology.
• Is developed using the Java language.
• Is an integrated tool set (to design, to run, to monitor).
Versions:
• 5.0
• 6.0
• 7.1.1
• 8.1.1
• 8.5
• 8.6
• 9.0
• 9.1
• 9.5
• 9.6
• 10

Informatica Architecture



Components of Informatica
Service Components

1. Repository Services
2. Integration Services

Client Components

1. Administrator Console
2. Repository Manager
3. Mapping Designer
4. Workflow Manager
5. Workflow Monitor

Development flow across the client tools (diagram):

Mapping Designer:
1. Import the source definition.
2. Import the target metadata.
3. Design the mapping (M_xyz).
4. Save the mapping to the repository.

Workflow Manager:
1. Create a session (S_xyz) for the mapping and save it.
2. Create a workflow, save it, and start it.

Workflow Monitor:
• Monitor the session and the mapping it runs.

Workflows are executed on the Informatica server. Integration services are responsible for execution.

Admin Console:
• Used for administrative purposes.

Domain:
A domain is the primary unit for management and administration of services in PowerCenter.
A domain contains one or more nodes.

Node:
A node is the logical representation of a machine in a domain.
There are two kinds of nodes:
1. Gateway node - A gateway node can run the management services for the domain and can
communicate with the domain database.
Only one gateway node runs the management services at a time, and only one node
talks to the domain database at a time, regardless of how many nodes are in the domain.
The node that performs these tasks is the master gateway node. There is no upper limit on
the number of nodes in a domain, but each domain has a minimum of one gateway node.

2. Worker node - A worker node runs a Service Manager process and can run application services.
It cannot run the extra management processes, and it does not communicate with the
domain database. This saves the extra resources that management requires, but a worker
node cannot take over as the master gateway node.



Server components:
Repository Services:
Responsible for maintaining Informatica metadata and providing access to it for other services.
The Repository Service manages connections to the PowerCenter repository from client applications.
The Repository Service is a multithreaded process that inserts, retrieves, deletes and updates metadata in the repository.
The Repository Service ensures the consistency of the metadata in the repository.
The following PowerCenter applications can access the Repository Service:
• Power Center Client
• Integration Service
• Web Service Hub
• Command Line Program (For backup and Recovery for administrative purpose)
Repository is also known by the following other names:
• Data Dictionary
• Registry
• Catalog
• Meta Data

Integration Services
Responsible for the movement of data from sources to targets.
The Integration Service reads mapping and session information from the repository.
It extracts the data from the mapping sources and stores it in memory (the staging area), where it
applies the transformation rules that you configure in the mapping.
The Integration Service loads the transformed data into the mapping targets.
The Integration Service connects to the repository through the Repository Service to fetch the
metadata.
Client components:

Mapping Designer:
It is a GUI-based client component that allows you to design the plan of an ETL process, called a mapping.
The following types of metadata objects can be created using the Designer client:
• Source definitions
• Target definitions
• Mappings, with or without a transformation rule

Workflow Manager:
It is a GUI-based client component that allows you to perform the following tasks:
• Create session for each mapping
• Create workflow
• Execute workflow
• Schedule workflow

Workflow Monitor:
It is a GUI-based client component that provides the following functions:
• Show the workflow and session status (Succeeded or Failed).
• Get Session Log from the repository.
• Start, Stop sessions and workflows.

Repository Manager:
The Repository Manager is a GUI-based administrative client that allows you to do the following:
• Create, edit and delete folders, which are required to organize the metadata in the repository.
• Create users and user groups, and assign permissions and privileges.

Session: A session is a set of instructions that performs extraction, transformation and loading. A session is created to make a
mapping available for execution.

Workflow: A workflow is a set of instructions, beginning with a start task, that executes other tasks such as sessions. The workflow is the
top object in the PowerCenter development hierarchy.

Schedule Workflow: Scheduling a workflow is an administrative task that specifies the date and time at which the workflow runs.

Transformation:

A transformation is an object used to define business logic for processing the data.

Transformations can be categorized in two ways:

• Based on the number of rows processed
• Based on connection
Based on the number of rows processed, there are two types of transformation:
• Active Transformation
• Passive Transformation

Active Transformation:

A transformation that can affect the number of rows while data moves from source to target is
known as an active transformation.

The following active transformations are used for processing the data:
• Source Qualifier Transformation
• Filter Transformation
• Aggregator Transformation
• Joiner Transformation
• Router Transformation
• Rank Transformation
• Sorter Transformation
• Update Strategy Transformation
• Transaction Control Transformation
• Union Transformation
• Normalizer Transformation
• XML Source Qualifier
• Java Transformation
• SQL Transformation

Passive Transformation:

A transformation that does not affect the number of rows while data moves from source to
target is known as a passive transformation.

The following passive transformations are used for processing the data:
• Expression Transformation
• Sequence Generator Transformation
• Stored Procedure Transformation
• Lookup Transformation

Filter Transformation:

This is an active transformation that allows you to filter the data based on a given condition.
A condition is created with three elements:
• Port
• Operator
• Operand
The integration service evaluates the filter condition against each input record and returns TRUE or FALSE.

The integration service returns TRUE when a record satisfies the condition; such records are
passed on for further processing or for loading into the target.

The integration service returns FALSE when an input record does not satisfy the condition; such
records are rejected by the filter transformation.

Filter transformation does not support the "IN" operator.

The filter transformation can send data to only a single target.
Use the filter transformation to perform data cleansing.
The filter transformation functions like the WHERE clause in SQL.

Ex: SAL > 10000
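
The filter behaviour described above can be sketched outside Informatica in a few lines of Python. This is only an illustration of the TRUE/FALSE evaluation per record, not how the Integration Service is implemented; the sample rows and the function name filter_condition are hypothetical, and treating a NULL salary as FALSE is an assumption made for the sketch.

```python
# Hypothetical sample rows; the SAL port name comes from the slide's example.
rows = [
    {"ENAME": "A", "SAL": 8000},
    {"ENAME": "B", "SAL": 12000},
    {"ENAME": "C", "SAL": None},
]


def filter_condition(row):
    """Return True when the row satisfies SAL > 10000."""
    sal = row["SAL"]
    # Assumption for this sketch: a NULL salary does not satisfy the condition.
    return sal is not None and sal > 10000


# Rows that evaluate to TRUE continue down the pipeline.
passed = [r for r in rows if filter_condition(r)]
# Rows that evaluate to FALSE are dropped by the filter.
rejected = [r for r in rows if not filter_condition(r)]
```

Only one output list (passed) flows onward, which mirrors the single-target behaviour of the filter.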


Router Transformation:

Router transformation is an active transformation that allows you to apply multiple conditions in order to
load multiple target tables.

It is created with two types of group:

1. Input Group: receives the data from the source.
2. Output Group: sends the data to the target.

Output groups are also of two types:

1. User-defined group: allows you to apply a condition.
2. Default group: captures the records that satisfy none of the conditions.

Ex: Newport1 - Type = Local
Newport2 - Type = Retro
Default port.
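
The grouping above can be sketched in Python as follows. It is a minimal illustration, assuming that a row is sent to every user-defined group whose condition it satisfies and to the default group only when it satisfies none; the function name route, the TYPE and ID ports and the sample rows are hypothetical.

```python
def route(rows, groups):
    """Send each row to every user-defined group whose condition is true.

    Rows that match no user-defined group go to the default group.
    """
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                out[name].append(row)
                matched = True
        if not matched:
            out["DEFAULT"].append(row)
    return out


# User-defined groups taken from the slide's example.
groups = {
    "Newport1": lambda r: r["TYPE"] == "Local",
    "Newport2": lambda r: r["TYPE"] == "Retro",
}
rows = [
    {"ID": 1, "TYPE": "Local"},
    {"ID": 2, "TYPE": "Retro"},
    {"ID": 3, "TYPE": "Other"},
]
routed = route(rows, groups)
```

Each key of routed stands for one output group that could be connected to its own target.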



Expression Transformation:

This is a passive transformation that allows you to calculate an expression for each record.
Expressions can be defined only on output ports.
Use the expression transformation to perform data cleansing and data scrubbing.

Ex: Price/Quantity
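
A row-by-row expression such as the one above can be sketched in Python. The output port name UNIT_VALUE and the division-by-zero guard are assumptions made for the illustration; the point is that one row comes out for every row that goes in.

```python
def expression(row):
    """Pass the input ports through and add one calculated output port."""
    out = dict(row)
    # Assumption: return None instead of failing when QUANTITY is zero.
    if row["QUANTITY"]:
        out["UNIT_VALUE"] = row["PRICE"] / row["QUANTITY"]
    else:
        out["UNIT_VALUE"] = None
    return out


rows = [
    {"PRICE": 100.0, "QUANTITY": 4},
    {"PRICE": 50.0, "QUANTITY": 0},
]
# Passive behaviour: the number of rows is unchanged.
result = [expression(r) for r in rows]
```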

Aggregator Transformation:

This is an active transformation that allows you to calculate a summary for a group of records.

Aggregator transformation is created with the following four components.

1. Group by: Defines the port on which the groups are formed and for which summaries are calculated. Ex. Deptno
2. Aggregate Expression: Aggregate expressions can be developed only on output ports, using aggregate
functions such as the following:
• Sum( )
• Max( )
• Avg( )
• Min( )
3. Sorted Input: The aggregator transformation can receive sorted data as input to improve the performance of
summary calculations.
• The ports on which the group is defined must be sorted using a sorter transformation. (Only
the group by ports need to be sorted.)
4. Aggregate Cache: The Integration Service creates the aggregate cache the first time the session runs.
• The aggregate cache is stored on the server hard drive.
• Incremental aggregation uses the aggregate cache to improve the performance of the session.
• Data cache - Stores row values
• Index cache - Stores group values
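
Why sorted input helps can be seen in a small Python sketch: when rows arrive ordered by the group by port, each group can be summarized and emitted as soon as the key changes, without holding every group in memory. The sketch uses itertools.groupby, which relies on exactly that ordering; the DEPTNO and SAL ports and the output port names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"DEPTNO": 20, "SAL": 3000},
    {"DEPTNO": 10, "SAL": 1000},
    {"DEPTNO": 20, "SAL": 1000},
    {"DEPTNO": 10, "SAL": 2000},
]

# Stand-in for a sorter placed before the aggregator: sort on the group by port.
sorted_rows = sorted(rows, key=itemgetter("DEPTNO"))


def aggregate(sorted_input):
    """Yield one summary row per DEPTNO; the input must be sorted on DEPTNO."""
    for deptno, group in groupby(sorted_input, key=itemgetter("DEPTNO")):
        sals = [r["SAL"] for r in group]
        yield {
            "DEPTNO": deptno,
            "SUM_SAL": sum(sals),
            "MAX_SAL": max(sals),
            "MIN_SAL": min(sals),
            "AVG_SAL": sum(sals) / len(sals),
        }


summary = list(aggregate(sorted_rows))
```

Four input rows become two output rows, which is why the aggregator counts as an active transformation.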
Sorter Transformation:

This is an active transformation that sorts the data in ascending or descending order.

• The port on which sorting takes place is designated as a key.
• Use the sorter transformation to eliminate duplicates.
Ex: Ascending/Descending
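
Both uses of the sorter, ordering on a key and eliminating duplicates, can be sketched in Python. The sample tuples and the function name sort_distinct are hypothetical; the first element of each tuple plays the role of the key port.

```python
rows = [("B", 2), ("A", 1), ("B", 2), ("C", 3)]

# Sort on the key port (first element), ascending and descending.
ascending = sorted(rows, key=lambda r: r[0])
descending = sorted(rows, key=lambda r: r[0], reverse=True)


def sort_distinct(rows):
    """Sort the rows and drop rows identical to the one just emitted."""
    out = []
    for row in sorted(rows):
        if not out or out[-1] != row:
            out.append(row)
    return out


distinct_rows = sort_distinct(rows)
```

Because identical rows end up next to each other after sorting, removing duplicates only requires comparing each row with the previous one.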

Sequence Generator Transformation:

Sequence generator transformation is a passive and connected transformation. The sequence generator
transformation is used for:

• Generating unique primary key values.
• Replacing missing primary keys.
• Generating surrogate keys for dimension tables in SCDs.
• Cycling through a sequential range of numbers.

Ex: 1, 2, 3,…
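
The two behaviours listed above, an ever-increasing key and a cycling range, can be sketched with Python's itertools. The SK port name, the start value and the range 1 to 3 are hypothetical choices for the illustration.

```python
from itertools import count, cycle

# Surrogate key generation: 1, 2, 3, ... one value per row.
sequence = count(start=1, step=1)
rows = [{"NAME": name} for name in ("x", "y", "z")]
keyed = [{"SK": next(sequence), **row} for row in rows]

# Cycling through a sequential range: 1, 2, 3, 1, 2, ...
cycling = cycle(range(1, 4))
cycled = [next(cycling) for _ in range(5)]
```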

Rank Transformation:

This is an active transformation that allows you to identify the TOP and BOTTOM
performers.
The rank transformation can be created with the following types of ports.
1. Input Port
2. Output Port
3. Rank Port (R)
4. Variable Port (V)

Rank Port: The port on which the rank is determined is known as the rank port.
Variable Port: A port that can store data temporarily is known as a variable port.

The following properties need to be set for calculating the ranks:

• Top/Bottom
• Number of Ranks
By default, the rank transformation is created with an output port called Rank Index.

Dense Ranking: It is a process of calculating the ranks for each group.
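
A minimal Python sketch of ranking within groups follows, assuming the Top/Bottom and Number of Ranks properties described above. The function name rank, the RANKINDEX output name and the sample data are hypothetical, and the sketch does not attempt to reproduce how ties are ranked.

```python
from collections import defaultdict


def rank(rows, rank_port, group_port, top=True, number_of_ranks=2):
    """Return the top (or bottom) N rows of each group with a rank index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_port]].append(row)

    out = []
    for key in sorted(groups):
        # Top: highest value first. Bottom: lowest value first.
        ordered = sorted(groups[key], key=lambda r: r[rank_port], reverse=top)
        for index, row in enumerate(ordered[:number_of_ranks], start=1):
            out.append({**row, "RANKINDEX": index})
    return out


rows = [
    {"DEPTNO": 10, "SAL": 1000},
    {"DEPTNO": 10, "SAL": 3000},
    {"DEPTNO": 10, "SAL": 2000},
    {"DEPTNO": 20, "SAL": 500},
]
ranked = rank(rows, rank_port="SAL", group_port="DEPTNO", top=True, number_of_ranks=2)
```

Rows outside the requested number of ranks are dropped, which is what makes the transformation active.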

Joiner Transformation:

This is an active transformation that allows you to combine data from multiple
sources into a single output based on a given join condition.
The joiner transformation is created with the following types of ports.
1. Input Port
2. Output Port
3. Master Port (M)
The source with the smaller number of records is designated as the master source.
A master source is created with the master ports. The joiner transformation can be created with the following
types of join.
1. Normal join (Equi Join)
2. Master outer join
3. Detail outer join
4. Full outer join.

The default type of joiner transformation is Normal join (Equi Join).

Normal Join keeps only matching rows on the condition.


Master Outer Join Keeps all rows from detail and matching rows from master.
Detail Outer Join Keeps all rows from master and matching rows from detail.
Full Outer Join Keeps all rows from both master and detail.

Joiner transformation does not support non-equi join.
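
The four join types can be sketched in Python with an equi-join on a single key. This is an illustration only: the function name join, the join_type strings and the sample rows are hypothetical, and unmatched rows simply lack the other source's columns instead of carrying NULL values.

```python
def join(master, detail, key, join_type="normal"):
    """Equi-join master and detail rows on one key port."""
    # Index the master rows by key (the smaller source, per the slide).
    master_by_key = {}
    for m in master:
        master_by_key.setdefault(m[key], []).append(m)

    out = []
    matched_keys = set()
    for d in detail:
        matches = master_by_key.get(d[key], [])
        if matches:
            matched_keys.add(d[key])
            out.extend({**m, **d} for m in matches)
        elif join_type in ("master_outer", "full_outer"):
            # Keep detail rows that have no matching master row.
            out.append(dict(d))

    if join_type in ("detail_outer", "full_outer"):
        # Keep master rows that have no matching detail row.
        for k, rows in master_by_key.items():
            if k not in matched_keys:
                out.extend(dict(m) for m in rows)
    return out


master = [{"DEPTNO": 10, "DNAME": "A"}, {"DEPTNO": 30, "DNAME": "C"}]
detail = [{"DEPTNO": 10, "ENAME": "x"}, {"DEPTNO": 20, "ENAME": "y"}]

normal = join(master, detail, "DEPTNO", "normal")
master_outer = join(master, detail, "DEPTNO", "master_outer")
detail_outer = join(master, detail, "DEPTNO", "detail_outer")
full_outer = join(master, detail, "DEPTNO", "full_outer")
```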


Use the joiner transformation to merge data records horizontally.
Use the joiner transformation to join the following types of sources:
Table + Table
Flat file + Flat file
XML file + XML file
Table + Flat file
Table + XML file
Flat file + XML file

[Figure: Venn diagrams of the master (M) and detail (D) sources for the normal join, master outer join, detail outer join and full outer join]

Merging:
• Horizontally: Joiner Transformation
• Vertically: Union Transformation



Union Transformation:

• Union transformation is an active and connected transformation.
• It is a multi-input-group transformation used to merge the data from multiple pipelines into a single
pipeline.
• It merges data from multiple sources just like the UNION ALL set operator in SQL.
• Union transformation contains only one output group and can have multiple input groups.
• The input groups and the output group should have matching ports. The datatype, precision and scale
must be the same.
• Union transformation does not remove duplicate rows. To remove them, use a sorter
transformation with the "select distinct" option after the union transformation.
• Union transformation does not generate transactions.
• You cannot connect a sequence generator transformation to the union transformation.
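
The UNION ALL behaviour, and the extra distinct step needed to remove duplicates, can be sketched in Python. The function names union_all and sort_distinct and the sample pipelines are hypothetical.

```python
def union_all(*pipelines):
    """Merge several pipelines with matching ports; duplicates are kept."""
    merged = []
    for pipeline in pipelines:
        merged.extend(pipeline)
    return merged


def sort_distinct(rows):
    """Stand-in for a sorter with the distinct option after the union."""
    return sorted(set(rows))


pipeline_a = [(1, "x"), (2, "y")]
pipeline_b = [(2, "y"), (3, "z")]

merged = union_all(pipeline_a, pipeline_b)
deduplicated = sort_distinct(merged)
```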

Source Qualifier Transformation:

• The source qualifier transformation is an active, connected transformation used to represent the
rows that the Integration Service reads when it runs a session.
• You need to connect the source qualifier transformation to the relational or flat file definition in a
mapping.
• The source qualifier transformation converts the source data types to the Informatica native data
types, so you should not alter the data types of the ports in the source qualifier transformation.
• The source qualifier (SQ) is used to do the following tasks:

• Joins
• Filter rows
• Sorting Input
• Distinct rows
• Custom SQL Query
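
For a relational source, the tasks listed above amount to work that a single SQL query can do at the source. The Python sketch below uses an in-memory SQLite database to show one query that joins, filters, sorts and returns distinct rows. The emp and dept tables, their columns and the query are hypothetical and are not the SQL that Informatica itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER, sal INTEGER);
    CREATE TABLE dept (deptno INTEGER, dname TEXT);
    INSERT INTO emp VALUES (1, 'A', 10, 12000), (2, 'B', 20, 8000), (3, 'C', 10, 15000);
    INSERT INTO dept VALUES (10, 'SALES'), (20, 'HR');
    """
)

# Join, filter, sort and distinct combined in one custom query.
query = """
    SELECT DISTINCT e.ename, d.dname, e.sal
    FROM emp e
    JOIN dept d ON e.deptno = d.deptno
    WHERE e.sal > 10000
    ORDER BY e.sal
"""
source_rows = conn.execute(query).fetchall()
conn.close()
```

Doing this work in the source database means fewer rows have to be read into the pipeline.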

Lookup Transformation:

Lookup can be a passive or an active transformation and can be used in both connected and unconnected modes.

From Informatica version 9 onwards, the lookup can be configured as an active transformation: it can
return a single row or multiple rows. The lookup transformation is used to look up data in a flat file, relational
table, view or synonym.

The lookup transformation is used to perform the following tasks:

• Get a related value


• Get multiple values
• Perform Calculation
• Update SCD tables

The lookup transformation can be configured as the following types of lookup:

• Flat file or relational lookup
• Pipeline lookup
• Connected or unconnected lookup
• Cached or uncached lookup

Connected lookup - Receives source data, performs a lookup and returns data to the pipeline.
Unconnected lookup - Receives source data from a :LKP expression, performs a lookup and returns one
column of data at a time to the calling transformation.

Types of lookup cache:

• Static - Rows in the cache are not inserted or updated during the session.
• Dynamic - Rows in the cache are inserted or updated as rows pass through.
• Persistent - The cache is saved and reused, so rows are inserted or updated in the existing cache.
• Named/Shared - The cache is named and can be shared across mappings.
• Unnamed/Shared - The cache is unnamed and is shared when the lookup condition and
the output ports match.
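
The difference between a static and a dynamic cache can be sketched in Python with a dictionary keyed on the lookup condition port. The function names and the "insert" / "update" / "unchanged" return values are hypothetical labels for this illustration, not Informatica's own indicators.

```python
def build_cache(lookup_rows, key_port):
    """Read the lookup source once and index it by the condition port."""
    return {row[key_port]: dict(row) for row in lookup_rows}


def static_lookup(cache, key_value, return_port):
    """Static cache: read only; returns None when no row matches."""
    row = cache.get(key_value)
    return None if row is None else row[return_port]


def dynamic_lookup(cache, row, key_port):
    """Dynamic cache: insert or update the cache as rows pass through."""
    key = row[key_port]
    if key not in cache:
        cache[key] = dict(row)
        return "insert"
    if cache[key] != row:
        cache[key] = dict(row)
        return "update"
    return "unchanged"


dept = [{"DEPTNO": 10, "DNAME": "SALES"}, {"DEPTNO": 20, "DNAME": "HR"}]
cache = build_cache(dept, "DEPTNO")

found = static_lookup(cache, 10, "DNAME")
missing = static_lookup(cache, 99, "DNAME")

incoming = [
    {"DEPTNO": 30, "DNAME": "IT"},
    {"DEPTNO": 20, "DNAME": "PEOPLE"},
    {"DEPTNO": 10, "DNAME": "SALES"},
]
actions = [dynamic_lookup(cache, row, "DEPTNO") for row in incoming]
```

The value returned by the dynamic lookup is the kind of signal a downstream step can use to decide between inserting and updating a target row.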

Update Strategy Transformation:

Update strategy transformation is an active and connected transformation. It is used to insert,
update and delete records in the target table. It can also reject records
so that they do not reach the target table.

In Informatica, you can set the update strategy at two different levels:
• Session Level: Configuring at session level instructs the integration service either to treat all rows
in the same way (insert, update or delete) or to use the instructions coded in the session mapping to
flag rows for different database operations.
• Mapping Level: Use the update strategy transformation to flag rows for insert, update, delete or reject.

Flags in Update Strategy

• DD_INSERT: Numeric value is 0. Used for flagging the row as Insert.
• DD_UPDATE: Numeric value is 1. Used for flagging the row as Update.
• DD_DELETE: Numeric value is 2. Used for flagging the row as Delete.
• DD_REJECT: Numeric value is 3. Used for flagging the row as Reject.
The integration service treats any other numeric value as an insert.

Important: update strategy works only when a primary key is defined on the target table.
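
How the four flags drive the operation on the target can be sketched in Python. The constant values come from the list above; the flagging rule, the ID, AMOUNT and DELETED ports and the dictionary standing in for a target table keyed on its primary key are all hypothetical.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row, target):
    """Hypothetical flagging rule, standing in for an update strategy expression."""
    if row.get("DELETED"):
        return DD_DELETE
    if row.get("AMOUNT") is None:
        return DD_REJECT
    return DD_UPDATE if row["ID"] in target else DD_INSERT


def apply_rows(rows, target):
    """Apply flagged rows to a target keyed by primary key; return the rejects."""
    rejected = []
    for row in rows:
        flag = flag_row(row, target)
        if flag == DD_REJECT:
            rejected.append(row)
        elif flag == DD_DELETE:
            target.pop(row["ID"], None)
        else:
            # Insert and update both write the row under its primary key.
            target[row["ID"]] = {"AMOUNT": row["AMOUNT"]}
    return rejected


target = {1: {"AMOUNT": 10}}
rows = [
    {"ID": 1, "AMOUNT": 20},
    {"ID": 2, "AMOUNT": 5},
    {"ID": 3, "AMOUNT": None},
    {"ID": 2, "AMOUNT": 5, "DELETED": True},
]
rejected = apply_rows(rows, target)
```

The update and delete branches can find the right row only because the target is keyed, which is the reason a primary key is required on the target table.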
SQL Transformation:

SQL Transformation is a connected transformation used to process SQL queries in the midstream of a
pipeline. We can insert, update, delete and retrieve rows from the database at run time using the SQL
transformation.

The following SQL statements can be used in the SQL transformation.


• Data Definition Statements (CREATE, ALTER, DROP, TRUNCATE, RENAME)
• Data Manipulation Statements (INSERT, UPDATE, DELETE, MERGE)
• Data Retrieval Statement (SELECT)
• Data Control Language Statements (GRANT, REVOKE)
• Transaction Control Statements (COMMIT, ROLLBACK)

Transaction control Transformation:

Transaction Control is an active and connected transformation. The transaction control transformation is used
to control the commit and rollback of transactions. You can define a transaction based on varying number of input
rows.

We can define the transaction at the following levels:


• Mapping level: Use the transaction control transformation to define the transactions.
• Session level: You can specify the "Commit Type" option in the session properties tab. The different
options of "Commit Type" are Target, Source and User Defined. If you have used the transaction control
transformation in the mapping, then the "Commit Type" will always be "User Defined".

Use the following built-in variables in the expression editor of the transaction control transformation
(IS stands for Integration Service):
• TC_CONTINUE_TRANSACTION - The IS does not perform any transaction change for this row.
• TC_COMMIT_BEFORE - The IS commits the transaction, begins a new transaction, and writes the current
row to the target. The current row is in the new transaction.
• TC_COMMIT_AFTER - The IS writes the current row to the target, commits the transaction, and begins a
new transaction. The current row is in the committed transaction.
• TC_ROLLBACK_BEFORE - The IS rolls back the current transaction, begins a new transaction, and writes
the current row to the target. The current row is in the new transaction.
• TC_ROLLBACK_AFTER - The IS writes the current row to the target, rolls back the transaction, and begins
a new transaction. The current row is in the rolled back transaction.
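
The placement of the current row relative to the commit can be sketched in Python for the continue and commit variables. The string constants reuse the variable names above, but the functions, the DEPTNO port and the final commit at end of data are assumptions made for this illustration; the rollback variables are left out.

```python
TC_CONTINUE_TRANSACTION = "TC_CONTINUE_TRANSACTION"
TC_COMMIT_BEFORE = "TC_COMMIT_BEFORE"
TC_COMMIT_AFTER = "TC_COMMIT_AFTER"


def run_transactions(rows, decide):
    """Group rows into committed transactions according to decide()."""
    committed = []
    current = []
    previous = None
    for row in rows:
        action = decide(previous, row)
        if action == TC_COMMIT_BEFORE:
            # Commit what is open, then start a new transaction with this row.
            if current:
                committed.append(current)
            current = [row]
        elif action == TC_COMMIT_AFTER:
            # Write this row, then commit; the row is in the committed transaction.
            current.append(row)
            committed.append(current)
            current = []
        else:
            current.append(row)
        previous = row
    # Assumption: whatever is still open is committed at end of data.
    if current:
        committed.append(current)
    return committed


def commit_on_new_dept(previous, row):
    """Hypothetical rule: start a new transaction whenever DEPTNO changes."""
    if previous is not None and previous["DEPTNO"] != row["DEPTNO"]:
        return TC_COMMIT_BEFORE
    return TC_CONTINUE_TRANSACTION


rows = [{"DEPTNO": d} for d in (10, 10, 20, 20, 30)]
transactions = run_transactions(rows, commit_on_new_dept)
```

With this rule each department's rows form their own transaction, which is an example of defining a transaction on a varying number of input rows.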
Stored Procedure Transformation:

Stored Procedure Transformation is a passive transformation that can be used in
both connected and unconnected mode. Stored procedures are stored and run within the database. They
contain a pre-compiled collection of PL/SQL statements.

In the database, stored procedures are executed using the Execute or Call statements. Informatica provides the
stored procedure transformation to run stored procedures in the database from a mapping.

The property, "Stored Procedure Type" is used to specify when the stored procedure runs. The different values of this
property are shown below:

• Normal.
• Pre-load of the source.
• Post-load of the source.
• Pre-load of the target.
• Post-load of the target.

Normalizer Transformation:

Normalizer transformation is active and connected. The Normalizer transformation is used in place of the
Source Qualifier transformation when you wish to read data from a COBOL copybook source.

A Normalizer transformation is also used to convert column-wise data to row-wise data. This is similar to the
transpose feature of MS Excel. You can use this feature if your source is a COBOL copybook file or a relational database
table. The Normalizer transformation converts columns to rows and also generates an index for each converted row.

Normalizer can be used to deal with:

• multiple-occurring columns, and
• multiple record types created using REDEFINES.
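
The column-to-row conversion with a generated index can be sketched in Python. The function name normalize, the INDEX output name and the quarterly sales row are hypothetical.

```python
def normalize(row, key_port, occurring_ports, out_port):
    """Turn one row with repeating columns into one row per occurrence."""
    for index, port in enumerate(occurring_ports, start=1):
        yield {key_port: row[key_port], out_port: row[port], "INDEX": index}


# One source row with a column that occurs four times (Q1..Q4).
source_row = {"STORE": "S1", "Q1": 100, "Q2": 120, "Q3": 90, "Q4": 150}
normalized = list(normalize(source_row, "STORE", ["Q1", "Q2", "Q3", "Q4"], "SALES"))
```

One input row produces four output rows, which is why the Normalizer is an active transformation.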

Thank You

Email us – support@intellipaat.com

Visit us - https://intellipaat.com
