Informatica
A session recovery strategy is used when a session fails and needs to restart from where it failed. Informatica uses a built-in table, OPB_SRVR_RECOVERY, to determine the last committed record and starts again from the next record. The session also needs to be configured for this:
Go to Session --> Properties --> set the 'Recovery Strategy' attribute to 'Resume from last checkpoint'.
The commit strategy also matters here. If the commit interval is 500 records and the transaction fails at the 1100th record, the new run will start from the 1001st record.
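The restart-point arithmetic can be sketched in a few lines of Python (a toy illustration only; in a real recovery the restart point comes from OPB_SRVR_RECOVERY, and restart_row is a made-up name):

```python
# Illustrative only: derive the first row a recovered session should process
# from the commit interval and the row at which the session failed.

def restart_row(commit_interval: int, failed_at_row: int) -> int:
    """Rows up to the last full commit are already in the target."""
    last_committed = (failed_at_row // commit_interval) * commit_interval
    return last_committed + 1

# Commit interval 500, failure at row 1100: rows 1..1000 are committed,
# so recovery resumes at row 1001.
print(restart_row(500, 1100))  # 1001
```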
A parameter in Informatica is one whose value is specified in a parameter file and cannot be changed during the session run. A variable, by contrast, is one whose value can change during the session run.
A dynamic lookup transformation is used when we have to account for data that changes in the lookup table during the session run, for example when the session itself updates that table. Example: SCD Type II.
8. When Dynamic Cache is selected, a new default port will be created. What
does the new port do?
When the cache is dynamic, it is updated as the lookup table changes during the session run, and a new default port named 'NewLookupRow' is created. It generates a value per row, on which a downstream Update Strategy transformation can act:
0 = The Integration Service does not update or insert the row in the cache (no change).
1 = The Integration Service inserts the row into the cache (new record).
2 = The Integration Service updates the row in the cache (changed record).
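A toy Python model of the dynamic-cache behaviour behind these values (process and the cache dict are illustrative, not Informatica APIs):

```python
# Toy model of a dynamic lookup cache emitting NewLookupRow values:
# 0 = no change, 1 = inserted into cache, 2 = updated in cache.

def process(rows, cache):
    flags = []
    for key, value in rows:
        if key not in cache:
            cache[key] = value
            flags.append(1)      # new row: insert into the cache
        elif cache[key] != value:
            cache[key] = value
            flags.append(2)      # changed row: update the cache
        else:
            flags.append(0)      # unchanged: leave the cache as-is
    return flags

cache = {101: "A"}
print(process([(101, "A"), (101, "B"), (102, "C")], cache))  # [0, 2, 1]
```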
9. What are the different types of ports in Expression transformation?
Input,
Output
Variable
12. A row has 5 columns. How do you take that single row of 5 ports and create 5
rows from it?
This can be done using a Normalizer transformation (the built-in GK and GCID ports
will be created).
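The effect of the Normalizer here can be sketched as follows (GK/GCID generation is simplified, and normalize is a made-up helper):

```python
# Rough analogue of a Normalizer: one row with 5 columns becomes 5 rows,
# each carrying a generated key (GK) and a generated column id (GCID).

def normalize(row, gk):
    return [(gk, gcid, value) for gcid, value in enumerate(row, start=1)]

print(normalize(["a", "b", "c", "d", "e"], gk=1))
# [(1, 1, 'a'), (1, 2, 'b'), (1, 3, 'c'), (1, 4, 'd'), (1, 5, 'e')]
```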
When we have multiple targets to load in Informatica, we can decide which target to
load first and which next. This is helpful when two or more targets have a primary
key/foreign key relationship between them.
To do this: go to Mappings --> Target Load Plan --> select a target load plan.
14. While comparing the source with the target, when there is a need to avoid
duplicate rows coming from the source, then which lookup should be used?
Dynamic Lookup should be used, as it creates a dynamic cache of the target which
changes as the rows are processed. In this case, rows can be determined for insert or
update. A normal lookup creates a static cache of the target.
15. You are reading data from a huge (100 million row) flat file; the target is an
Oracle table receiving either inserts or updates, and Oracle has correct indexes.
What do you look at in Informatica to improve performance?
Dropping the indexes before the session run and re-creating the indexes once the
session completes.
Database partitioning: This can be used with Oracle or IBM DB2 source instances on
a multi-node tablespace or with DB2 targets. It reads partitioned data from the
corresponding nodes in the database. The PowerCenter Integration Service queries the
IBM DB2 or Oracle system for table partition information.
Hash partitioning: This can be used when you want the PowerCenter Integration
Service to distribute rows to the partitions by the group. For example, you need to sort
items by item ID, but you do not know how many items have a particular ID number.
You can use the following types of hash partitioning:
Hash auto-keys: A compound partition key is created by the PowerCenter Integration Service
using all grouped or sorted ports. You may need to use hash auto-keys partitioning at rank,
sorter, and unsorted aggregator transformations.
Hash user keys: A hash function is used by the PowerCenter Integration Service to group rows of
data among partitions. You define the number of ports to generate the partition key.
Key range: The PowerCenter Integration Service passes data to each partition depending on the
ranges you specify for each port. One or more ports can be used to form a compound partition
key. Use key range partitioning where the sources or targets in the pipeline are partitioned by key
range.
Pass-through: Choose pass-through partitioning where you want to create an additional pipeline
stage to improve performance, but do not want to change the distribution of data across
partitions. The PowerCenter Integration Service passes all rows at one partition point to the next
partition point without redistributing them.
Round-robin: This can be used so that each partition processes rows based on the number and
size of the blocks. The PowerCenter Integration Service distributes blocks of data to one or more
partitions.
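Two of the schemes above can be sketched in Python (hash_partition and round_robin are illustrative helpers, not PowerCenter APIs):

```python
# Minimal sketches: a user-keys hash sends rows with the same key to the
# same partition; round-robin spreads rows evenly regardless of key.

def hash_partition(key, n_partitions):
    return hash(key) % n_partitions

def round_robin(rows, n_partitions):
    return [i % n_partitions for i, _ in enumerate(rows)]

rows = ["item-1", "item-2", "item-1", "item-3"]
# Same key always lands in the same partition:
assert hash_partition("item-1", 4) == hash_partition("item-1", 4)
print(round_robin(rows, 3))  # [0, 1, 2, 0]
```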
An Update Strategy transformation is used to insert, update, and delete records in the
target table. It can also reject records before they reach the target. It flags rows in
the mapping as below:
DD_INSERT: Numeric value 0. Flags the row as Insert.
DD_UPDATE: Numeric value 1. Flags the row as Update.
DD_DELETE: Numeric value 2. Flags the row as Delete.
DD_REJECT: Numeric value 3. Flags the row as Reject.
At the session level, set 'Treat source rows as' to 'Data Driven'.
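A minimal sketch of data-driven flagging with these constants (the set-membership test stands in for a real target lookup; flag_row is a made-up helper):

```python
# Sketch of flagging rows with the DD_* constants. Whether the key exists
# in the target decides insert vs update; invalid rows are rejected.

DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(key, target_keys, valid=True):
    if not valid:
        return DD_REJECT
    return DD_UPDATE if key in target_keys else DD_INSERT

target_keys = {10, 20}
print([flag_row(k, target_keys) for k in (10, 30)])  # [1, 0]
print(flag_row(99, target_keys, valid=False))        # 3
```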
18. While doing Insert and Update on Target table, Update Strategy is poor in
performance. Without using update strategy, how do you perform Insert and
Updates? How do you design the mapping?
We can create two mappings: one for inserting the new records and another for
updating the existing ones. In the update mapping, connect the key column and the
columns of the target table that have to be updated. In that session, check only the
'Update' option for target rows.
19. Given the constants for insert, update, delete, and reject, what does the
expression of an Update Strategy transformation evaluate to?
The expression is evaluated for each row and returns one of the constants
(DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT), which flags that row for the
corresponding operation.
20. If the Update Strategy evaluates a record to Reject, what will Informatica do?
The Integration Service skips the row; if the transformation is configured to forward
rejected rows, it writes the row to the session reject file.
We can improve joiner performance by using sorted input and by choosing the table
with fewer records as the master table.
24. Can a mapplet read data from a source, contain an Expression transformation,
and write data to a target?
We can add source definitions that act as sources of data for the mapplet, as many
as we want. Another way to feed data into a mapplet is with an Input transformation.
A mapplet can contain an Expression transformation. The output of a mapplet,
however, cannot be connected to any target table.
You cannot include the following objects in a mapplet:
Normalizer transformations.
COBOL sources.
XML Source Qualifier transformations.
XML sources.
Target definitions.
Pre- and post-session stored procedures.
Other mapplets.
25. You design a mapplet. When you drag the mapplet into a mapping, which of its
ports are visible? How do you pass data from the mapping into the mapplet, and
what are the output ports?
Only the ports of the mapplet's Input and Output transformations are visible in the
mapping. Data is passed into the mapplet by connecting mapping ports to its Input
transformation ports; the output ports are those of its Output transformation.
The following can be overridden at the session level:
Table names.
Properties to treat the target rows (insert, update).
Joining condition.
Filter condition.
Source qualifier override.
29. If you run a workflow and it fails, how would you investigate the failure?
Check the workflow log and the session log from the Workflow Monitor; the session
log shows the first error code and message, and the reject (bad) file shows the rows
that were rejected.
31. What sections are present in a workflow parameter file, and what type of
information do they contain?
Sample structure:
[Global]
[Folder_Name.WF:wf_name]
$param1=
It might contain log file names, parameter values (such as dates to be passed), and
connection strings. The information it contains remains the same for the current
session run.
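A fuller sample parameter file might look like the following; the folder, workflow, session, and parameter names here are made up for illustration ($PMFailureEmailUser is a built-in service variable):

```ini
[Global]
$PMFailureEmailUser=support@example.com

[MyFolder.WF:wf_load_sales]
$$LoadDate=2020-01-31
$DBConnection_Src=ORA_SRC

[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$param1=20
```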
A mapping variable can be assigned to a workflow variable in the Workflow Manager.
To do this: go to Session → Edit → Components → Pre/Post-session variable assignment.
35. How to use session name and workflow name in email task without
hardcoding?
This can be done by using %s for the session name and %w for the workflow name.
36. We have a scenario where we want our session to stop processing further
when the 2nd error is detected while running it. How to achieve this?
There is a session-level property, 'Stop on errors'.
Set its value to 2.
37. What type of sessions allow the variable assignment from mapping to a
workflow?
Only non-reusable sessions. For reusable sessions, we cannot assign mapping
variables to workflow variables.
Implementing the NOT EXISTS operator is easy in Informatica. For example, to get
only the records that are in table A but not in table B, use a Joiner transformation
with A as master and B as detail. Specify the join condition and, for the join type,
select Detail Outer Join. This returns all the records from table A and only the
matching records from table B. Connect the joiner to a Filter transformation with the
condition that the B port IS NULL. This yields the records that are in A and not in B.
Then connect the filter to the target definition.
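The same NOT EXISTS logic, sketched with plain Python sets (the table contents are made up):

```python
# The joiner + IS NULL filter above is a NOT EXISTS. A detail outer join
# keeps all master (A) rows; the filter keeps those with no match in B.

table_a = {1, 2, 3, 4}   # master
table_b = {2, 4}         # detail

not_in_b = sorted(k for k in table_a if k not in table_b)
print(not_in_b)  # [1, 3]
```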
40. If a parameter file path is defined at both the session level and the workflow
level, which path is taken while running the workflow?
The Integration Service uses the workflow-level parameter file and ignores the
session-level one.
An unconnected lookup should be used when the same lookup has to be performed
in multiple places. It is a kind of reusable lookup, called as a function from any
transformation using the :LKP expression. It uses only a static cache.
A connected lookup should be used when the lookup is not needed in multiple
places. It is part of the data flow. It can use either a static or a dynamic cache.
42. When we right click on a running session from the workflow monitor, Stop
and Abort options are available. What is the difference between both of them?
The Stop option makes the Integration Service stop reading input from the source,
but it continues processing and writing the records already read to the target.
The Abort option also stops reading from the input, but additionally stops the
in-process records: the session terminates after a 60-second timeout if processing
has not finished.
43. What is the scenario which compels the Informatica server to reject files?
We can make use of a Sorter transformation and select the Distinct option to delete
duplicate rows.
46. Under what condition selecting Sorted Input in aggregator may fail the
session?
Even if the input data is properly sorted, the session may fail if the sort-order ports
and the group-by ports of the aggregator are not in the same order.
We can configure the way the sorter transformation treats null values. Enable the
property Null Treated Low if we want to treat null values as lower than any other value
when it performs the sort operation. Disable this option if we want the Integration
Service to treat null values as higher than any other value.
Solution:
1. Go to the mapping, then the 'Parameters and Variables' tab in the Informatica Designer. Give the name
as $$v1, choose the type 'parameter' (you can also choose 'variable'), data type integer, and an initial
value of 20.
2. Create a mapping as shown in the figure (a simple scenario where a particular department id is
filtered to the target).
3. In the filter, set deptno=$$v1 (so only deptno 20 records will go to the target).
4. A mapping parameter's value cannot change during the session, but a variable's can. Parameter
values are supplied through a parameter (text) file.
Solution:
1. In the Repository Manager, go to the 'Tools' menu, then 'Queries'. The Query Browser dialog box will
appear. Click the New button.
2. In the Query Editor, choose the folder name and object type as shown in the picture.
35. Target table rows, with each row as the sum of all previous rows from the source table.
Scenario: How to produce rows in the target table with every row as the sum of all previous rows in the
source table? See the source and target tables to understand the scenario.
SOURCE TABLE
id Sal
1 200
2 300
3 500
4 560
TARGET TABLE
Id Sal
1 200
2 500
3 1000
4 1560
2. In an Expression transformation, add one output port (sal1) and make the sal port input-only.
We will make use of the CUME() function to solve the problem, rather than using any complex
mapping. Write the expression for sal1 as CUME(sal) and send the output rows to the target.
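CUME() keeps a running total across rows; Python's itertools.accumulate does the same for the sample table above:

```python
# Running total over the source salaries; matches the target table:
# 200, 500, 1000, 1560.
from itertools import accumulate

sal = [200, 300, 500, 560]
print(list(accumulate(sal)))  # [200, 500, 1000, 1560]
```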
36. Sending records to target tables in cyclic order
Scenario: There is a source table and 3 target tables T1, T2, T3. How to insert records 1 to 10 into T1,
11 to 20 into T2, and 21 to 30 into T3, then again 31 to 40 into T1, 41 to 50 into T2, 51 to 60 into T3,
and so on, i.e., in cyclic order?
Solution:
1. Drag the source and connect it to an Expression transformation. Connect the NEXTVAL port of a
Sequence Generator to the expression.
2. Send all the ports to a Router and make three groups as below:
Group1: mod(NEXTVAL,30) >= 1 and mod(NEXTVAL,30) <= 10
Group2: mod(NEXTVAL,30) >= 11 and mod(NEXTVAL,30) <= 20
Group3: mod(NEXTVAL,30) >= 21 and mod(NEXTVAL,30) <= 29 or mod(NEXTVAL,30) = 0
3. Finally connect Group1 to T1, Group2 to T2 and Group3 to T3.
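The router logic can be checked with a quick Python sketch (route is a made-up helper mirroring the group conditions):

```python
# Evaluate the router conditions for a stream of NEXTVAL values: rows land
# in T1/T2/T3 in blocks of ten, cyclically.

def route(nextval: int) -> str:
    m = nextval % 30
    if 1 <= m <= 10:
        return "T1"
    if 11 <= m <= 20:
        return "T2"
    return "T3"   # m in 21..29, or m == 0 (i.e., multiples of 30)

print([route(n) for n in (1, 10, 11, 20, 21, 30, 31)])
# ['T1', 'T1', 'T2', 'T2', 'T3', 'T3', 'T1']
```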
37. Segregating rows on group count basis
Scenario: There are 4 departments in the Emp table. The first has 100 employees, the 2nd has 5, the
3rd has 30, and the 4th has 12. Extract the department numbers which have more than 5 employees to
a target table.
Solution:
https://forgetcode.com/informatica/1448-count-number-of-rows-with-not-null-values
1. Put the source into the mapping and connect the ports to an Aggregator transformation.
2. Make 4 output ports in the aggregator as in the picture above: count_d10, count_d20, count_d30,
count_d40, and write a count expression for each port.
3. Then connect to a Router transformation and create groups whose conditions pass the department
numbers with a count greater than 5.
4. Finally connect to a target table having one column, deptno.
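The underlying logic, count per department and keep counts above 5, can be sketched as follows (the employee counts come from the scenario text):

```python
# Count employees per department and keep departments with more than 5.
from collections import Counter

dept_of_emp = [10] * 100 + [20] * 5 + [30] * 30 + [40] * 12
counts = Counter(dept_of_emp)
print(sorted(d for d, c in counts.items() if c > 5))  # [10, 30, 40]
```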
Solution:
2. Arrange the salary in descending order in a Sorter and send the records to an Expression
transformation.
3. Connect the NEXTVAL port of a Sequence Generator to the expression (start the value from 1 in
the sequence generator).
4. Connect the expression transformation to a Filter or Router, and in the properties set the condition
on the sequence number.
5. Finally connect to the target.
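The question for this solution is omitted above, but the steps match the classic top-N pattern (sort descending, attach a sequence number, filter on it); assuming it asks for the top 3 salaries, a Python sketch looks like this:

```python
# Top-N by salary: sort descending, number the rows, keep seq <= 3.
rows = [(1, 200), (2, 900), (3, 500), (4, 700)]   # (id, sal), made-up data

ranked = sorted(rows, key=lambda r: r[1], reverse=True)
top3 = [row for seq, row in enumerate(ranked, start=1) if seq <= 3]
print(top3)  # [(2, 900), (4, 700), (3, 500)]
```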
Scenario: In the Dept table there are four departments (deptno 40, 30, 20, 10). Separate the records
into different targets department-wise.
Solution: