
4. What is meant by Session Recovery Strategy in Informatica?

Session Recovery Strategy is used when a session fails and needs to restart from
where it failed. Informatica uses an inbuilt table OPB_SRVR_RECOVERY to determine
the last committed record and starts again from the next record. The session also needs
to be configured for this. To do this:
Go to Session --> Properties --> for the attribute 'Recovery Strategy', set the value to 'Resume from last checkpoint'.
The commit strategy also matters here. If the commit interval is 500 records and the session fails at the 1,100th record, the new run will start from the 1,001st record.

5. What is the difference between a variable and parameter in Informatica?

A parameter in Informatica is one for which the value is specified in a parameter file and
that value cannot be changed during the run of that session. A variable, on the contrary,
is the one whose value can change during the session run.

6. What is the difference between a Static cache and a dynamic Cache?


 Static Lookup cache:
o When the Dynamic Lookup Cache property is not selected, the cache is static and the data in the cache stays the same for the entire session. PowerCenter does not update the cache while it processes the transformation.
 Dynamic Lookup cache:
o When the Lookup Caching Enabled and Dynamic Lookup Cache properties are selected, the lookup cache is dynamic.
o While the session runs, whenever a row is inserted into or updated in the target table, the Integration Service also inserts or updates that row in the cache.

7. Under what situation do we use dynamic lookup transformation?

Dynamic lookup transformation is used when we have to consider data that changes in the lookup table during the session run itself, for example when the mapping also inserts or updates rows in that table. Example: SCD Type 2.

8. When Dynamic Cache is selected, a new default port will be created. What
does the new port do?

When Dynamic Lookup Cache is selected, the cache gets updated whenever the lookup table changes during the session run, and a new default port named NewLookupRow is created. For each row it generates a value (0 = no change, 1 = new record, 2 = changed record), and based on these values the Update Strategy transformation can act further (see the sketch after the list below).

 0 = Integration Service does not update or insert the row in the cache.
 1 = Integration Service inserts the row into the cache.
 2 = Integration Service updates the row in the cache.
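For example, a minimal sketch of an Update Strategy expression that acts on these values; rejecting unchanged rows is an illustrative choice, not part of the original answer:

-- illustrative Update Strategy expression driven by the NewLookupRow port
IIF(NewLookupRow = 1, DD_INSERT,
IIF(NewLookupRow = 2, DD_UPDATE,
DD_REJECT))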
9. What are the different types of ports in Expression transformation?

There are 3 types of ports:

 Input,
 Output
 Variable

10. In an expression transformation, there are 2 variable ports at the beginning, 1 output port and 1 input port. What is the sequence of execution of these ports?

Input ports are evaluated first, then variable ports in the order they appear, and output ports last. So the sequence will be: Input --> Variable1 --> Variable2 --> Output

11. What is MD5 function?

MD5 (Message-Digest algorithm 5) is a hash function in Informatica which is used to evaluate data integrity. The MD5 function calculates the checksum of the input value using the Message-Digest Algorithm 5. MD5 is a one-way cryptographic hash function with a 128-bit hash value.
MD5 returns a 32-character string of hexadecimal digits (0-9 and a-f) and returns NULL if the input is a null value.
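A minimal sketch of how MD5 is commonly used in an expression transformation to detect changed records; the port names CUST_ID, CUST_NAME, CUST_CITY and the output port CHECKSUM are illustrative:

-- output port CHECKSUM (string, 32): concatenate the columns to compare and hash them
MD5(TO_CHAR(CUST_ID) || '|' || CUST_NAME || '|' || CUST_CITY)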

12. A row has 5 columns. How do you take those 5 ports, i.e. a single row, and create 5 rows from it?

This can be done using the Normalizer transformation (the built-in GK and GCID ports will be created). Set the Occurs value of the column to 5 so that one input row produces 5 output rows.

13. What is meant by the Target load plan in Informatica?

When we have multiple targets to load in Informatica, we can decide which target to load first and which to load next. This is helpful when two or more targets have a primary key / foreign key relationship between them.
To do this: go to Mappings --> Target Load Plan --> arrange the target load order.

14. While comparing the source with the target, when there is a need to avoid
duplicate rows coming from the source, then which lookup should be used?

Dynamic Lookup should be used, as it creates a dynamic cache of the target which
changes as the rows are processed. In this case, rows can be determined for insert or
update. A normal lookup creates a static cache of the target.

15. Reading data from huge (100 Million rows) flat file, the target is oracle
table which will have either insert or update on target. Oracle has correct
indexes. What are the performance things you look for from Informatica to
improve performance?

Dropping the indexes before the session run and re-creating the indexes once the
session completes.

16. What are the different partitioning methods in Informatica?

Database partitioning: This can be used with Oracle or IBM DB2 source instances on
a multi-node tablespace or with DB2 targets. It reads partitioned data from the
corresponding nodes in the database. The PowerCenter Integration Service queries the
IBM DB2 or Oracle system for table partition information.
Hash partitioning: This can be used when you want the PowerCenter Integration
Service to distribute rows to the partitions by the group. For example, you need to sort
items by item ID, but you do not know how many items have a particular ID number.
You can use the following types of hash partitioning:

 Hash auto-keys: A compound partition key is created by the PowerCenter Integration Service
using all grouped or sorted ports. You may need to use hash auto-keys partitioning at rank,
sorter, and unsorted aggregator transformations.
 Hash user keys: A hash function is used by the PowerCenter Integration Service to group rows of
data among partitions. You define the number of ports to generate the partition key.
 Key range: The PowerCenter Integration Service passes data to each partition depending on the
ranges you specify for each port. One or more ports can be used to form a compound partition
key. Use key range partitioning where the sources or targets in the pipeline are partitioned by key
range.
 Pass-through: Choose pass-through partitioning where you want to create an additional pipeline
stage to improve performance, but do not want to change the distribution of data across
partitions. The PowerCenter Integration Service passes all rows at one partition point to the next
partition point without redistributing them.
 Round-robin: This can be used so that each partition processes rows based on the number and
size of the blocks. The PowerCenter Integration Service distributes blocks of data to one or more
partitions.

17. How does Update Strategy work in Informatica?

Update strategy transformation is used to insert, update, and delete records in the
target table. It can also reject the records without reaching the target table. It can also
flag rows in mapping with update strategy as below:
DD_INSERT: Numeric value is 0. Used for flagging the row as Insert.
DD_UPDATE: Numeric value is 1. Used for flagging the row as Update.
DD_DELETE: Numeric value is 2. Used for flagging the row as Delete.
DD_REJECT: Numeric value is 3. Used for flagging the row as Reject.
At the session level, set 'Treat source rows as' to 'Data Driven'.
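A typical (illustrative) Update Strategy expression using these constants; the lookup port lkp_CUST_ID is an assumption:

-- flag the row as insert when the key is not found in the target lookup, otherwise as update
IIF(ISNULL(lkp_CUST_ID), DD_INSERT, DD_UPDATE)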

18. While doing Insert and Update on Target table, Update Strategy is poor in
performance. Without using update strategy, how do you perform Insert and
Updates? How do you design the mapping?
We can create two mappings: one for inserting the new records and another for updating the existing records. In the update mapping, connect the key column and the columns that have to be updated to the target table. In the session of the update mapping, check only the 'Update' option for the target (treat the rows as updates).

19. Given the constants for insert, update, delete and reject in the Update Strategy, what does the expression of an Update Strategy lead to?

The expression evaluates to one of these constants for each row, and the corresponding operation is performed on that row.

20. If the update strategy evaluates a record to Reject, what will Informatica do?

The Integration Service skips that record and, by default (when 'Forward Rejected Rows' is enabled), writes it to the session reject file.

21. How do you improve the performance of a Joiner Transformation?

Use sorted input and choose the table with fewer records as the master table.

22. What is Transaction Control transformation?

A Transaction Control transformation is used to control the commit and rollback of transactions. The following built-in variables are available:

 TC_CONTINUE_TRANSACTION: The Integration Service does not perform any transaction


change for this row. This is the default value of the expression.
 TC_COMMIT_BEFORE: The Integration Service commits the transaction, begins a new
transaction, and writes the current row to the target. The current row is in the new transaction.
 TC_COMMIT_AFTER: The Integration Service writes the current row to the target, commits the
transaction, and begins a new transaction. The current row is in the committed transaction.
 TC_ROLLBACK_BEFORE: The Integration Service rolls back the current transaction, begins a
new transaction, and writes the current row to the target. The current row is in the new
transaction.
 TC_ROLLBACK_AFTER: The Integration Service writes the current row to the target, rolls back
the transaction, and begins a new transaction. The current row is in the rolled back transaction.

23. What is the difference between Reusable transformation and a Mapplet?

A Reusable transformation is a single transformation which can be reused in many mappings.
A Mapplet is a group of transformations that together implement a particular piece of logic and can be reused in many mappings.

24. Can you have a Mapplet that reads data from a source, passes it through an expression transformation, and writes data to a target?
We can add source definitions that act as a source of data for our mapplet. We can add
as many sources as we want. Another way to feed data through a mapplet is with an
input transformation. Mapplet can have an expression transformation. The output of a
mapplet cannot be connected to any target table.
You cannot include the following objects in a mapplet:

 Normalizer transformations.
 Cobol sources.
 XML Source Qualifier transformations.
 XML sources.
 Target definitions.
 Pre- and post- session stored procedures.
 Other mapplets.

25. When you drag a Mapplet into a mapping, which ports of the Mapplet are visible? How do we pass data to a Mapplet from the mapping, and what are the output ports?

Only the Input and Output transformation ports of the mapplet are visible. Data is passed into the mapplet through the ports of its Input transformation, and the ports of its Output transformation are what the rest of the mapping connects to.

26. What are the different types of tasks?

Different types of tasks include:

1. Assignment- Used to assign a value to a workflow variable.


2. Command -Used to run a shell command during the workflow.
3. Control -Used to stop or abort the workflow.
4. Decision -Tells a condition to evaluate.
5. Email- Used to send an email during the workflow.
6. Event-Raise -Notifies the Event-Wait task that an event has occurred.
7. Event-Wait - It waits for the event to complete in order to start the next task.
8. Session - Used to run a mapping created in the Designer by linking it to the session.
9. Timer - Waits until a specified time, or for a specified period, before the next task in the workflow starts.

27. A Session Task is nothing but a mapping: I design a mapping and somebody uses it in a workflow. Can you name a few properties of the mapping that we can override at the session level?

Some of the properties of mapping that we can override are:

 Table names.
 Properties to treat the target rows (insert, update).
 Joining condition.
 Filter condition.
 Source qualifier override.

28. What is a Control Task in a workflow?


Control Task can be used to stop or abort the workflow.

29. If you run a Workflow and it fails, how would you investigate the failure?

We can do this by looking at the session and workflow logs.

30. How do you access Session or Workflow log?

We can access the session and workflow logs from the Workflow Monitor.

31. What are the sections present in creating Workflow Parameter file? What
is the type of information present?

Sample structure:
[Global]
[Folder_Name.WF:wf_name]
$param1 =
It might contain the log file name, parameter values (such as dates that have to be passed in), and connection strings.
The values remain the same for the duration of the current session/workflow run.
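A slightly fuller sketch of such a file; the folder, workflow, session, parameter, and connection names are all illustrative:

[Global]
[MyFolder.WF:wf_load_customers]
$$LOAD_DATE=2020-01-31
$DBConnection_SRC=ORA_SOURCE_DEV
[MyFolder.WF:wf_load_customers.ST:s_m_load_customers]
$PMSessionLogFile=s_m_load_customers.log
$$v_max_id=0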

32. How do you execute a workflow from a Unix shell script?

Using the pmcmd command:
pmcmd startworkflow -sv ${INTEGRATION_SERVICE} -d ${DOMAIN_NAME} -u ${INFA_USR} -p ${INFA_PWD} -f ${FOLDER_NAME} -wait ${WORKFLOW_NAME}
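A minimal wrapper sketch around this command that checks the return code; the log file name is illustrative and the variables are assumed to be exported beforehand:

pmcmd startworkflow -sv ${INTEGRATION_SERVICE} -d ${DOMAIN_NAME} -u ${INFA_USR} -p ${INFA_PWD} -f ${FOLDER_NAME} -wait ${WORKFLOW_NAME}
if [ $? -ne 0 ]; then
    # pmcmd returns non-zero when the workflow fails or cannot be started
    echo "Workflow ${WORKFLOW_NAME} failed" >> load.log
    exit 1
fi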

33. How can a mapping variable be used in a workflow?

A mapping variable can be assigned to a workflow variable in the Workflow Manager.
To do this: go to Session → Edit → Components → Pre/Post-session variable assignment.

34. How to use session name in a command task in a workflow?

This can be done with the help of the built-in variable $PMSessionName.
Example: echo "Session: $PMSessionName" >> Dummy.txt

35. How to use session name and workflow name in email task without
hardcoding?

This can be done by using %s for the session name and %w for the workflow name in the Email task.

36. We have a scenario where we want our session to stop processing further
when the 2nd error is detected while running it. How to achieve this?
There is a session-level property 'Stop on errors'. Set this value to 2.

37. What type of sessions allow the variable assignment from mapping to a
workflow?

Only the non-reusable sessions. For Re-usable sessions, we cannot assign the
mapping variable to the workflow variable.

38. What are different types of tracing levels in Informatica 10.1.1?


 Note: Applicable only at the session level. The Integration Service uses the tracing levels
configured in the mapping.
 Terse: logs initialization information, error messages, and notification of rejected data in the
session log file.
 Normal: Integration Service logs initialization and status information, errors encountered and
skipped rows due to transformation row errors. Summarizes session results, but not at the level of
individual rows.
 Verbose Initialization: In addition to normal tracing, the Integration Service logs additional
initialization details; names of index and data files used, and detailed transformation statistics.
 Verbose Data: In case of Verbose data, in addition to the verbose initialization tracing, the logs of
each row that passes into the mapping is kept by the Integration Service. It also logs where the
Integration Service truncates string data to fit the precision of a column and provides detailed
transformation statistics. The Integration Service writes row data for all rows in a block when it
processes a transformation when you configure the tracing level to verbose data.

39. How can the ‘not exists’ operator be implemented in Informatica?

Implementing the Not Exists operator is very easy in Informatica. For example: If we
want to get only the records which are available in table A and not in table B, we use a
joiner transformation with A as master and B as detail. We specify the join condition and
in the join type, select detail outer join. This will get all the records from table A and only the matching records from table B. Connect the joiner to a filter transformation and specify the filter condition as ISNULL(B_port). This will give the records which are in A and not in B. Then connect the filter to the target definition.

40. If a parameter file path is defined at both the session level as well as at
the workflow level, which path will be taken while running the workflow?

The workflow-level parameter file is picked up, irrespective of whether a parameter file is defined at the session level or not.

41. When do we select a connected/unconnected transformation in Informatica?

An Unconnected Lookup should be selected when the same lookup has to be performed at multiple places. It is a kind of reusable lookup which is called like a function from any transformation using a :LKP expression. It uses only a static cache.
A Connected Lookup should be used when the lookup does not need to be reused at multiple places. It is part of the data flow and it can use either a static or a dynamic cache.

42. When we right click on a running session from the workflow monitor, Stop
and Abort options are available. What is the difference between both of them?

The Stop option makes the Integration Service stop reading data from the source, but it continues processing and writing to the target the records it has already read.
The Abort option makes the Integration Service stop reading from the source and also stops processing the in-flight records; if processing does not finish within a 60-second timeout, the session is terminated.

43. What is the scenario which compels the Informatica server to reject files?

This happens when a row is flagged as DD_Reject in an Update Strategy transformation. It also happens when a row violates a database constraint or a field in the row is truncated or overflows; such rows are written to the reject (bad) file.

44. How can we keep last 20 session logs in Informatica?

Go to Session → Properties → Config Object → Log Options and set these 2 properties:
Save Session Log by: Session runs
Save Session Log for These Runs: 20

45. How can we delete duplicate rows from flat files?

We can make use of sorter transformation and select distinct option to delete the
duplicate rows.

46. Under what condition selecting Sorted Input in aggregator may fail the
session?

Even if the input data is properly sorted, the session may fail if the sort order of the ports and the group-by ports of the aggregator are not in the same order.

47. How does Sorter handle NULL values?

We can configure the way the sorter transformation treats null values. Enable the
property Null Treated Low if we want to treat null values as lower than any other value
when it performs the sort operation. Disable this option if we want the Integration
Service to treat null values as higher than any other value.

48. How does rank transformation handle string values?


The Rank transformation can return strings at the top or the bottom of the session sort order. When the Integration Service runs in Unicode mode, it sorts character data using the session sort order associated with the code page of the Integration Service, which may be French, German, etc. When the Integration Service runs in ASCII mode, it ignores this setting and uses a binary sort order to sort character data.

49. What is a Dimensional Model?

 Data Modeling: It is the process of designing the database to fulfill the business requirement specifications.
 A Data Modeler (or Database Architect) designs the warehouse database using a GUI-based data modeling tool such as ERwin.
 ERwin is a data modeling tool from Computer Associates (CA).
 Dimensional modeling consists of the following types of schemas designed for the data warehouse:
o Star Schema.
o Snowflake Schema.
o Galaxy Schema (fact constellation).
 A schema is a data model which consists of one or more tables.

50. What are the different ways of debugging a code in Informatica?

 Use the Debugger at the mapping level. The Debugger gives you row-by-row data.
 Use the workflow/session logs.
 Create and add a target after any transformation whose output data you want to check.
 Change the tracing level at the session level / transformation level to Verbose Data.

Important Tips to remember while preparing for an interview:
 The Interview questions of any ETL tool like Informatica PowerCenter/ IBM Datastage mainly
consist of 3 type of questions:
o Theoretical Questions related to the tool.
o Scenario-Based Questions related to the tool.
o How to implement a particular functionality through the tool and with SQL/Unix (whichever is applicable). Basic knowledge of Unix and SQL is essential for clearing any ETL interview.
 A lot of questions are generally asked about Automation, which you may have implemented in
your project using Informatica PowerCenter.
o For example, export and import automation, which can be done using Unix and PowerCenter.
 Thoroughly go through all the properties at the session level, workflow level, and mapping level; a lot of questions are asked about them.
 Be prepared for General Questions like ‘What is the most complex requirement you have
implemented in Informatica PowerCenter’ or ‘What is the most complex bug you have faced and
how did you fix it’.
33. Using mapping parameter and variable in mapping

Scenario: How to use a mapping parameter and variable in a mapping?

Solution:

1. In the Informatica Designer, go to the mapping and then the Parameters and Variables tab. Give the name as $$v1, choose the type as parameter (you can also choose variable), the data type as integer, and give the initial value as 20.

2. Create a simple mapping where a particular department id is filtered to the target (the original figure is not reproduced here).
3. In the filter, set the condition deptno = $$v1 (that means only dept no 20 records will go to the target).

4. A mapping parameter value cannot change during the session run, but a mapping variable value can change. The variable value can also be supplied through a parameter (text) file.
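As an aside, a mapping variable can also be updated from within the mapping itself; a minimal sketch using a variable port in an expression transformation, where $$v_max_id and ID are illustrative names:

-- variable port in an expression transformation: keeps the highest ID seen so far
SETMAXVARIABLE($$v_max_id, ID)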

34. Validating all mappings in the repository

Scenario: How to validate all mappings in the repository?

Solution:

1. In the Repository Manager, go to the "Tools" menu and then "Queries". The Query Browser dialog box will appear. Then click on the New button.
2. In the Query Editor, choose the folder name and the object type (Mapping).

3. After that, execute it (by clicking the blue arrow button).


4. The Query Results window will appear. Select a single mapping or all the mappings (by pressing Ctrl + A) and go to "Tools", then the "Validate" option, to validate them.

35. Target table rows, with each row as the sum of all previous rows from the source table

Scenario: How to produce rows in the target table with every row as the sum of all previous rows in the source table? See the source and target tables below to understand the scenario.

SOURCE TABLE
id   Sal
1    200
2    300
3    500
4    560

TARGET TABLE
Id   Sal
1    200
2    500
3    1000
4    1560

1. Pull the source into the mapping and then connect it to an expression transformation.
2. In the expression, add one output port (sal1) and keep the sal port as input only.
We will make use of a function named CUME() to solve the problem, rather than using any complex mapping. Write the expression in sal1 as CUME(sal) and send the output rows to the target.
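A small sketch of the expression ports for this scenario (sal and sal1 as in the steps above):

-- input-only port : sal
-- output port sal1, expression:
CUME(sal)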
36. Sending records to target tables in cyclic order

Scenario: There is a source table and 3 destination tables T1, T2, T3. How to insert records 1 to 10 into T1, records 11 to 20 into T2 and 21 to 30 into T3, then again 31 to 40 into T1, 41 to 50 into T2 and 51 to 60 into T3, and so on, i.e. in cyclic order?

Solution:

1. Drag the source into the mapping and connect it to an expression transformation. Connect the NEXTVAL port of a sequence generator (start value 1) to the expression.
2. Send all the ports to a router and make three groups as below:
Group1: mod(NEXTVAL,30) >= 1 and mod(NEXTVAL,30) <= 10
Group2: mod(NEXTVAL,30) >= 11 and mod(NEXTVAL,30) <= 20
Group3: (mod(NEXTVAL,30) >= 21 and mod(NEXTVAL,30) <= 29) or mod(NEXTVAL,30) = 0
3. Finally connect Group1 to T1, Group2 to T2 and Group3 to T3.
37. Segregating rows on group count basis

Scenario: There are 4 departments in the Emp table. The first has 100 employees, the 2nd has 5, the 3rd has 30 and the 4th has 12. Extract those dept numbers which have more than 5 employees in them to a target table.

Solution:

https://forgetcode.com/informatica/1448-count-number-of-rows-with-not-null-values

1. Put the source into the mapping and connect the ports to an aggregator transformation.
2. Make 4 output ports in the aggregator: count_d10, count_d20, count_d30, count_d40, and write an expression for each port (the original picture is not reproduced; see the sketch after these steps).
3. Then send the ports to an expression transformation.
4. In the expression, make four output ports (dept10, dept20, dept30, dept40) to validate the dept no and provide the expressions (see the sketch below).
5. Then connect to a router transformation, create a group and fill in the condition (see the sketch below).
6. Finally connect to the target table, which has one column, dept no.
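The original pictures are not reproduced here; one plausible set of expressions for these steps, with port names and conditions as assumptions:

-- Aggregator output ports (no group-by port, so a single summary row is produced):
count_d10 : SUM(IIF(DEPTNO = 10, 1, 0))
count_d20 : SUM(IIF(DEPTNO = 20, 1, 0))
count_d30 : SUM(IIF(DEPTNO = 30, 1, 0))
count_d40 : SUM(IIF(DEPTNO = 40, 1, 0))
-- Expression output ports (return the dept no only when its count exceeds 5, else 0):
dept10 : IIF(count_d10 > 5, 10)
dept20 : IIF(count_d20 > 5, 20)
dept30 : IIF(count_d30 > 5, 30)
dept40 : IIF(count_d40 > 5, 40)
-- Router group condition, e.g. for department 10:
dept10 > 0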

38. Get top 5 records to target without using rank

Scenario: How to get the top 5 records to the target without using a Rank transformation?

Solution:

1. Drag the source into the mapping and connect it to a sorter transformation.
2. In the sorter, arrange the salary in descending order and send the records to an expression transformation.
(figure: sorter properties)
3. Add the NEXTVAL port of a sequence generator to the expression (start the value from 1 in the sequence generator).
(figure: sorter to expression mapping)
4. Connect the expression transformation to a filter or router and set the condition in its properties (see the sketch below).
5. Finally connect to the target.
(figure: final mapping)
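The filter/router condition from the original screenshot is not reproduced; it would be along these lines, where NEXTVAL is the sequence generator port added in step 3:

NEXTVAL <= 5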

39. Separate rows on group basis

Scenario: In the Dept table there are four departments (dept no 40, 30, 20, 10). Separate the records into different targets, department wise.

Solution:

Step 1: Drag the source into the mapping.

Step 2: Connect a router transformation to the source and in the router make 4 groups, giving the conditions (see the sketch below).
(figure: router transformation)

Step 3: Based on the groups, map them to the different targets.

The final mapping looks like the figure below.
(figure: router to target)
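The router group conditions from the original screenshot are not reproduced; they would be along these lines, one group per department:

DEPTNO = 10
DEPTNO = 20
DEPTNO = 30
DEPTNO = 40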
