
1. What is INFORMATICA? Why do we need it?

Answer: INFORMATICA is a software development company that offers data integration solutions for ETL, data virtualization, master data management, data quality, data replication, ultra messaging, etc.

Some of the popular INFORMATICA products are:

INFORMATICA PowerCenter
INFORMATICA PowerConnect
INFORMATICA Power Mart
INFORMATICA Power Exchange
INFORMATICA Power Analysis
INFORMATICA Power Quality
We need INFORMATICA while working with data systems that contain data on which certain operations must be performed along with a set of rules. INFORMATICA facilitates operations like cleaning and modifying data from structured and unstructured data systems.

2. What is a Filter Transformation and why is it an Active one?


Answer: A Filter transformation is an Active and Connected transformation that can filter
rows in a mapping.

Only the rows that meet the Filter Condition pass through the Filter transformation to the next
transformation in the pipeline. TRUE and FALSE are the implicit return values from any
filter condition we set. If the filter condition evaluates to NULL, the row is assumed to be
FALSE.

The numeric equivalent of FALSE is zero (0) and any non-zero value is the equivalent of
TRUE.

As an ACTIVE transformation, the Filter transformation may change the number of rows
passed through it. A filter condition returns TRUE or FALSE for each row that passes
through the transformation, depending on whether a row meets the specified condition. Only
rows that return TRUE pass through this transformation. Discarded rows do not appear in the
session log or reject files.
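These pass/drop semantics can be illustrated with a minimal Python sketch (this is not PowerCenter's engine; `passes_filter` and the sample rows are invented for illustration):

```python
def passes_filter(condition_value):
    """Mimic Filter transformation semantics for a condition's return value."""
    # NULL (None here) is treated as FALSE, so the row is dropped
    if condition_value is None:
        return False
    # 0 is the numeric equivalent of FALSE; any non-zero value is TRUE
    return condition_value != 0

rows = [("A", 1), ("B", 0), ("C", None), ("D", -5)]
# Only rows whose condition evaluates to TRUE pass; B and C are discarded
kept = [name for name, cond in rows if passes_filter(cond)]
```

As in a real Filter transformation, the discarded rows simply vanish from the pipeline; nothing is written for them anywhere.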

3. Does an Informatica Aggregator Transformation support only Aggregate expressions?


Answer: Apart from aggregate expressions, the Informatica Aggregator also supports non-aggregate expressions and conditional clauses.

4. How will the document be delivered to me?


Answer: The Master Informatica Question and Answer Set™ will be delivered in PDF format to the email ID you provide during the checkout process. Therefore, please ensure that you enter your email ID accurately during the buying/check-out process. Delivery generally takes place within 8 hours of purchase.

5. What kind of support is available if I have difficulty understanding the question or answer?

Answer: If you need any assistance understanding the questions or answers of the Master Informatica Question & Answer Set™, or if you have any doubt, please feel free to contact our forum. In the forum, you can post your questions with the MIQA tag to draw the attention of our experts, who will happily provide personalized guidance for each and every query you have.
6. Is Master Informatica Question & Answer set™ copyrighted?
Answer: The Master Informatica Question and Answer Set™ is copyrighted and only intended to be used by the buyer of the material. Any breach of the copyright will be subject to legal prosecution. You are explicitly prohibited from copying or distributing the material in full or in part for any purpose other than within the context of personal use.

7. What Informatica product line does Master Informatica Question and Answer set™
cover?
Answer: We cover Informatica PowerCenter® Mapping Designer, Workflow Manager,
Repository Manager, Workflow Monitor and PowerCenter® Administration Console
(version in

8. What is the difference between the Source Qualifier transformation's Source Filter and the Filter transformation?
Answer:
A Source Qualifier transformation filters rows when reading from a source, whereas a Filter transformation filters rows from within a mapping.
A Source Qualifier transformation can only filter rows from Relational Sources, whereas a Filter transformation filters rows coming from any type of source system at the mapping level.
The Source Qualifier limits the row set extracted from a source, whereas the Filter transformation limits the row set sent to a target.
The Source Qualifier reduces the number of rows used throughout the mapping and hence provides better performance. To maximize session performance with a Filter transformation, include it as close to the sources in the mapping as possible to filter out unwanted data early in the flow of data from sources to targets.
The filter condition in the Source Qualifier transformation only uses standard SQL, as it runs in the database. The Filter transformation can define a condition using any statement or transformation function that returns either a TRUE or FALSE value.

9. What are some examples of Informatica ETL programs?


Answer:
Some basic Informatica programs are:
Mappings: A mapping is designed in the Designer. It defines all the ETL processes. Mappings read data from the original sources before transformation logic is applied to the read data. The transformed data is later written to the targets.
Workflows: A collection of different tasks that describes the runtime ETL processes is known as a workflow. Workflows are designed in the Workflow Manager.
Task: This is a set of actions, commands, or functions that are executable. How an ETL
process behaves during runtime can be defined by a sequence of different tasks.

10. Which development components of Informatica have the highest usage?


Answer: There are many development components in Informatica. However, these are the
most widely used of them:
Expression: This can be used to transform data with the help of functions.
Lookups: They are extensively used to join data.
Sorter and Aggregator: This is the right tool for sorting data and aggregating them.
Java transformation: Java transformation is the choice of developers if they want to invoke variables, Java methods, third-party APIs, and built-in Java packages.
Source qualifiers: Many people use this component to convert source data types to the
equivalent Informatica data types.
Transaction control: If you want to create transactions and have absolute control over
rollbacks and commits, count on this component to bail you out.

11. What are the uses of ETL tools?


Answer: ETL tools are quite different from other tools. They are used for performing actions such as:

Loading important data into a data warehouse or any other destination, known as the Target.
Extracting data from sources such as database tables or files.
Transforming the data received from different sources in an organized way. Some of the notable sources from which data are received include SAP solutions, Teradata, or web services.

12. What is a Joiner Transformation and why is it an Active one?


Answer: A Joiner is an Active and Connected transformation used to join source data from
the same source system or from two related heterogeneous sources residing in different
locations or file systems.
The Joiner transformation joins sources with at least one matching column. The Joiner
transformation uses a condition that matches one or more pairs of columns between the two
sources.
The two input pipelines include a master pipeline and a detail pipeline, or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.
In the Joiner transformation, we must configure the transformation properties namely Join
Condition, Join Type and Sorted Input option to improve Integration Service performance.
The join condition contains ports from both input sources that must match for the Integration
Service to join two rows. Depending on the type of join selected, the Integration Service
either adds the row to the result set or discards the row.
The Joiner transformation produces result sets based on the join type, condition, and input
data sources. Hence it is an Active transformation.

13. State the limitations where we cannot use a Joiner in the mapping pipeline.
Answer: The Joiner transformation accepts input from most transformations. However, the
following are the limitations:
Joiner transformation cannot be used when either of the input pipelines contains an Update
Strategy transformation.
Joiner transformation cannot be used if we connect a Sequence Generator transformation
directly before the Joiner transformation.

14. Define the various Join Types of Joiner Transformation?


Answer: In a normal join, the Integration Service discards all rows of data from the master
and detail source that do not match, based on the join condition.
A master outer join keeps all rows of data from the detail source and the matching rows from
the master source. It discards the unmatched rows from the master source.
A detail outer join keeps all rows of data from the master source and the matching rows from
the detail source. It discards the unmatched rows from the detail source.
A full outer join keeps all rows of data from both the master and detail sources. 
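The four join types can be sketched in plain Python over lists of dicts (an illustrative toy model only; `joiner`, `master_rows`, and `detail_rows` are invented names, and the real Integration Service works very differently):

```python
def joiner(master, detail, key, join_type="normal"):
    """Toy model of the Joiner's four join types; the join condition
    is equality on `key`."""
    master_keys = {m[key] for m in master}
    detail_keys = {d[key] for d in detail}
    # normal join: only rows matching on the join condition
    joined = [{**m, **d} for d in detail for m in master if m[key] == d[key]]
    if join_type in ("master outer", "full outer"):
        # master outer keeps all detail rows, even those with no master match
        joined += [d for d in detail if d[key] not in master_keys]
    if join_type in ("detail outer", "full outer"):
        # detail outer keeps all master rows, even those with no detail match
        joined += [m for m in master if m[key] not in detail_keys]
    return joined

master_rows = [{"id": 1, "name": "pen"}, {"id": 2, "name": "ink"}]
detail_rows = [{"id": 1, "qty": 10}, {"id": 3, "qty": 4}]
```

Running the sketch on these two sources shows why the transformation is Active: the number of output rows depends on the join type, not just on the input row count.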

15. What is INFORMATICA Worklet?


Answer: A Worklet works like a Mapplet with the feature of reusability; the difference is that a worklet is applied at the workflow level in INFORMATICA, whereas a mapplet is applied at the mapping level. A worklet saves the logic and tasks in a single place for reuse.
A Worklet is much like a Mapplet and is defined as a group of tasks that can be either reusable or non-reusable at the workflow level. It can be added to as many workflows as required. With its reusability feature, much time is saved, as reusable logic can be developed once and placed wherever it can be reused.
In the INFORMATICA Power center environment, Mapplets are considered as the most
advantageous feature. Mapplets are created in Mapplet designer and are a part of the
Designer tool.
It basically contains a set of transformations that are designed to be reused in multiple mappings.
Mapplets are said to be reusable objects which simplify mapping by:
Including multiple transformations and source definitions.
Not requiring connections to all input and output ports.
Accepting data from sources and passing it to multiple transformations.
Overall, when it is required to reuse mapping logic, the logic should be placed in a Mapplet.

16. What is Code page Compatibility?


Answer: Data transfer from one code page to another will not fail provided that both code pages have the same character sets; in that case, the code pages are said to be compatible.

17. Write the prerequisite tasks to achieve the session partition?


Answer: In order to perform session partitioning, one needs to configure the session to partition source data and then install the Informatica server on a machine with multiple CPUs.

18. In the Informatica server, which files are created during a session run?
Answer: The error log, bad file, workflow log, and session log files are created during a session run.

19. How to implement Security Measures using Repository manager?


Answer:
There are 3 ways to implement security measures.
They are:
Folder permissions for owner, groups, and users.
Locking (Read, Write, Retrieve, Save and Execute).
Repository Privileges viz.
Browse Repository.
Use Workflow Manager(To create session and batches and set its properties).
Workflow Operator(To execute Session and batches).
Use Designer, Admin Repository(Allows any user to create and manage repository).
Admin User(Allows the user to create a repository server and set its properties).
SuperUser(All the privileges are granted to the user).

20. What are the types of groups in router transformation?


Answer:
Input group
Output group
Default group

21. How can we store previous session logs?


Answer: If you run the session in timestamp mode, the session log will automatically not overwrite the current session log.
Go to Session Properties –> Config Object –> Log Options
Select the properties as follows:
Save session log by –> Session Runs
Save session log for these runs –> Change the number to the number of log files you want to save (Default is 0)

If you want to save all of the log files created by every run, then select Save session log by –> Session TimeStamp

You can find these properties in the session/workflow properties.

22. Suppose we configure Sorter transformations in the master and detail pipelines with
the following sorted ports in order: ITEM_NO, ITEM_NAME, PRICE. When we
configure the join condition, what are the guidelines we need to follow to maintain the
sort order?
Answer: If we have sorted both the master and detail pipelines in order of the ports say
ITEM_NO, ITEM_NAME and PRICE we must ensure that:
Use ITEM_NO in the First Join Condition.
If we add a Second Join Condition, we must use ITEM_NAME.
If we want to use PRICE as a Join Condition apart from ITEM_NO, we must also use
ITEM_NAME in the Second Join Condition.
If we skip ITEM_NAME and join on ITEM_NO and PRICE, we will lose the input sort order
and the Integration Service fails the session.

23. How an Expression Transformation differs from Aggregator Transformation?


Answer: An Expression Transformation performs a calculation on a row-by-row basis. An
Aggregator Transformation performs calculations on groups.
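The row-by-row versus per-group distinction can be sketched in Python (an illustrative sketch only; the sample rows and variable names are invented):

```python
from collections import defaultdict

rows = [
    {"dept": "A", "salary": 100, "bonus": 10},
    {"dept": "A", "salary": 200, "bonus": 20},
    {"dept": "B", "salary": 300, "bonus": 30},
]

# Expression-style: a calculation per input row (one output row per input row)
total_pay = [r["salary"] + r["bonus"] for r in rows]

# Aggregator-style: a calculation per group (one output row per group)
by_dept = defaultdict(int)
for r in rows:
    by_dept[r["dept"]] += r["salary"]
```

Note the row counts: the Expression-style result has one entry per input row, while the Aggregator-style result collapses to one entry per group, which is also why the Aggregator is an Active transformation.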

24. What is Incremental Aggregation?


Answer: We can enable the session option, Incremental Aggregation for a session that
includes an Aggregator Transformation. When the Integration Service performs incremental
aggregation, it actually passes changed source data through the mapping and uses the
historical cache data to perform aggregate calculations incrementally.
For reference check Implementing Informatica Incremental Aggregation.
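The idea can be modeled as a cache of running aggregates to which only the changed source rows are applied (a hypothetical sketch; Informatica's incremental aggregation cache files are more involved than this):

```python
def incremental_aggregate(cache, changed_rows, key, value):
    """Apply only the changed source rows to a historical cache of SUMs."""
    for r in changed_rows:
        cache[r[key]] = cache.get(r[key], 0) + r[value]
    return cache

# The first run processes the full source; later runs pass only new rows,
# so the historical totals are updated incrementally instead of recomputed.
cache = incremental_aggregate({}, [{"k": "A", "v": 5}, {"k": "B", "v": 7}], "k", "v")
cache = incremental_aggregate(cache, [{"k": "A", "v": 3}], "k", "v")
```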

25. What are the performance considerations when working with Aggregator
Transformation?
Answer:
1. Filter the unnecessary data before aggregating it. Place a Filter transformation in the
mapping before the Aggregator transformation to reduce unnecessary aggregation.
2. Improve performance by connecting only the necessary input/output ports to subsequent
transformations, thereby reducing the size of the data cache.
3. Use Sorted input which reduces the amount of data cached and improves session
performance.

26. What will happen if the SELECT list COLUMNS in the Custom override SQL
Query and the OUTPUT PORTS order in SQ transformation do not match?
Answer: A mismatch, or changing the order of the list of selected columns relative to the connected transformation output ports, may result in session failure.
27. What are the different ways to implement parallel processing in Informatica?
Answer: We can implement parallel processing using various types of partition algorithms:
Database partitioning: The Integration Service queries the database system for table partition
information. It reads partitioned data from the corresponding nodes in the database.

Round-Robin Partitioning: Using this partitioning algorithm, the Integration service


distributes data evenly among all partitions. It makes sense to use round-robin partitioning
when you need to distribute rows evenly and do not need to group data among partitions.

Hash Auto-Keys Partitioning: The PowerCenter Server uses a hash function to group rows of
data among partitions. When the hash auto-key partition is used, the Integration Service uses
all grouped or sorted ports as a compound partition key. You can use hash auto-keys
partitioning at or before Rank, Sorter, and unsorted Aggregator transformations to ensure that
rows are grouped properly before they enter these transformations.

Hash User-Keys Partitioning: Here, the Integration Service uses a hash function to group
rows of data among partitions based on a user-defined partition key. You can individually
choose the ports that define the partition key.

Key Range Partitioning: With this type of partitioning, you can specify one or more ports to
form a compound partition key for a source or target. The Integration Service then passes
data to each partition depending on the ranges you specify for each port.

Pass-through Partitioning: In this type of partitioning, the Integration Service passes all rows
from one partition point to the next partition point without redistributing them.
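A few of these algorithms are easy to mimic in Python (toy versions only; the function names and the `(low, high)` range format are assumptions for illustration, not Informatica APIs):

```python
def round_robin(rows, n):
    """Distribute rows evenly across n partitions in turn."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_keys(rows, n, key):
    """Hash a key so that rows with the same key land in the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts

def key_range(rows, ranges, key):
    """Route each row to the partition whose half-open [low, high) range
    contains its key value."""
    parts = [[] for _ in ranges]
    for row in rows:
        for i, (low, high) in enumerate(ranges):
            if low <= row[key] < high:
                parts[i].append(row)
                break
    return parts

rows = [{"id": i} for i in range(6)]
```

Pass-through partitioning would simply return the input partitions unchanged, which is why it adds no redistribution overhead.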

28. Mention a few design and development best practices for Informatica?
Answer: Mapping design tips: Standards – sticking to consistent standards is beneficial in the
long run. This includes naming conventions, descriptions, environment settings, parameter
files, documentation, among others.
Reusability – in order to react quickly to potential changes, use reusable Informatica components like mapplets, worklets, and reusable transformations.
Scalability – when designing and developing mappings, it is a good practice to keep volumes in mind. This includes caching, queries, partitioning, and initial vs incremental loads.
Simplicity – it is recommended to create multiple mappings instead of a few complex ones.
Use Staging Area and try to keep the processing logic as clear and simple as possible.
Modularity – use the modular design technique (common error handling, reprocessing).
Mapping development best practices

Source Qualifier – use shortcuts, extract only the necessary data, and limit the reading of columns and rows at the source. Try to use the default query options (User Defined Join, Filter) instead of a SQL Query override, which may impact database resources and prevent the use of partitioning and push-down optimization.
Expressions – use local variables to limit the number of redundant calculations, avoid
datatype conversions, reduce invoking external scripts (coding outside of Informatica),
provide comments, use operators (||, +, /) instead of functions. Keep in mind that numeric
operations are generally faster than string operations.
Filter – use the Filter transformation as close to the source as possible. If multiple filters need
to be applied, usually it’s more efficient to replace them with Router.
Aggregator – use sorted input, also use as early (close to the source) as possible and filter the
data before aggregating.
Joiner – try to join the data in Source Qualifier wherever possible, and avoid outer joins. It is
good practice to use a source with fewer rows, such as a Master source.
Lookup – relational lookup should only return ports that meet the condition. Call
Unconnected Lookup in expression (IIF). Replace large lookup tables with joins whenever
possible. Review the database objects and add indexes to database columns when possible.
Use Cache Calculator in session to eliminate paging in lookup cache.

29. What is Enterprise Data Warehousing?


Answer: When a large amount of data is assembled at a single access point, it is called Enterprise Data Warehousing. This data can be reused and analyzed at regular intervals or as per the needs of the time.
Considered as the central database or say a single point of access, Enterprise data
warehousing provides a complete global view and thus helps in decision support.
It can be more understood from the following points which define its features:
All important business information stored in this unified database can be accessed from
anywhere across the organization.
Although the time required is more, periodic analysis on this single source always produces
better results.
Security and integrity of data are never compromised while making it accessible across the
organization.

30. How can we use Batches?


Answer: Batches are collections of sessions used to migrate data from source to target on a server. A batch can contain a large number of sessions, but large batches cause more network traffic, whereas a batch with fewer sessions can be moved rapidly.

31. What is a Sequence Generator Transformation?


Answer: A Sequence Generator transformation is a Passive and Connected transformation
that generates numeric values. It is used to create unique primary key values, replace missing
primary keys, or cycle through a sequential range of numbers. This transformation by default
contains ONLY two OUTPUT ports, namely CURRVAL and NEXTVAL. We cannot edit or delete these ports, nor can we add ports to this unique transformation. We can create approximately two billion unique numeric values, with the widest range from 1 to 2147483647.
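NEXTVAL behavior, including cycling through a sequential range, can be modeled with a small sketch (a hypothetical model only; PowerCenter's actual NEXTVAL/CURRVAL semantics have additional rules not captured here):

```python
class SequenceGenerator:
    """Minimal model: nextval() hands out the next number in the range,
    optionally cycling back to the start value after the end value."""
    def __init__(self, start=1, end=2147483647, increment=1, cycle=False):
        self.start, self.end = start, end
        self.increment, self.cycle = increment, cycle
        self.current = start - increment  # so the first nextval() is `start`

    def nextval(self):
        nxt = self.current + self.increment
        if nxt > self.end:
            if not self.cycle:
                raise OverflowError("sequence exhausted")
            nxt = self.start  # wrap around, as with the Cycle option
        self.current = nxt
        return nxt

seq = SequenceGenerator(start=1, end=3, cycle=True)
vals = [seq.nextval() for _ in range(5)]
```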

32. What is Decode in INFORMATICA?


Answer: To understand Decode in an easy way, let’s consider it as similar to the CASE
statement in SQL. It is basically the function that is used by an expression transformation in
order to search a specific value in a record.
There can be unlimited searches within the Decode function where a port is specified for
returning result values. This function is usually used in cases where it is required to replace
nested IF statements or to replace lookup values by searching in small tables with constant
values.
Decode is a function that is used within Expression Transformation. It is used just like CASE
Statement in SQL to search a specific record.
Below is a simple example of a CASE in SQL:
Syntax:
SELECT EMPLOYEE_ID,
CASE
    WHEN EMPLOYEE_AGE <= 20 THEN 'Young'
    WHEN EMPLOYEE_AGE > 30 AND EMPLOYEE_AGE <= 40 THEN 'Knowledgeable'
    WHEN EMPLOYEE_AGE > 40 AND EMPLOYEE_AGE <= 60 THEN 'Wise'
    ELSE 'Very Wise'
END AS EMPLOYEE_WISDOM
FROM EMPLOYEE
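The search-and-return behavior of Decode can be approximated in Python (the `decode` helper and its signature below are invented for illustration and only approximate the real transformation-language function):

```python
def decode(value, *args, default=None):
    """DECODE-like lookup: decode(x, search1, result1, search2, result2, ...),
    returning `default` when no search value matches."""
    for search, result in zip(args[::2], args[1::2]):
        if value == search:
            return result
    return default

# Replaces a nested IF: look up a state name by its code
state = decode("CA", "CA", "California", "NY", "New York", default="Unknown")
```

Used this way, one flat list of search/result pairs replaces a chain of nested IF statements, which is exactly the use case described above.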

33. What is Mapping?


Answer: Mapping is a collection of source and targets which are linked with each other
through certain sets of transformations such as Expression Transformation, Sorter
Transformation, Aggregator Transformation, Router Transformation, etc.

34. What is Rank Transformation?


Answer:
1. This is a type of active transformation which allows you to find either the top performers or the bottom performers.
2. The Rank transformation is created with the following types of ports:

i. Input Port (I)


ii. Output Port (O)
iii. Rank Port (R)
iv. Variable Port (V)

35. What are the transformations that cannot be placed between the sort origin and the
Joiner transformation so that we do not lose the input sort order?
Answer: The best option is to place the Joiner transformation directly after the sort origin to
maintain sorted data. However, do not place any of the following transformations between
the sort origin and the Joiner transformation:
Custom
Unsorted Aggregator
Normalizer
Rank
Union transformation
XML Parser transformation
XML Generator transformation
Mapplet [if it contains any one of the above-mentioned transformations]

36. What is the format of INFORMATICA objects in a repository? What are the databases that INFORMATICA can connect to on Windows?
Answer: INFORMATICA objects can be written in XML format.

Following is the list of databases that INFORMATICA can connect to:

 SQL Server
 Oracle
 MS Access
 MS Excel
 DB2
 Sybase
 Teradata.
37. How does the Aggregator Transformation handle NULL values?
Answer: By default, the aggregator transformation treats null values as NULL in aggregate functions. But we can specify whether to treat null values in aggregate functions as NULL or zero.
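Both behaviors can be sketched for SUM (illustrative only; `agg_sum` is a made-up helper, not an Informatica API):

```python
def agg_sum(values, nulls_as_zero=False):
    """SUM that either ignores NULLs (default) or treats them as zero.
    An all-NULL input yields NULL unless nulls are treated as zero."""
    vals = [0 if v is None and nulls_as_zero else v
            for v in values if v is not None or nulls_as_zero]
    return sum(vals) if vals else None
```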
38. What differs when we choose the Sorted Input for Aggregator Transformation?
Answer: The Integration Service creates the index and data cache files in memory to process the Aggregator transformation. If the Integration Service requires more space than allocated for the index and data cache sizes in the transformation properties, it stores overflow values in cache files, i.e., paging to disk. One way to increase session performance is to increase the index and data cache sizes in the transformation properties. But when we check Sorted Input, the Integration Service uses memory to process the Aggregator transformation; it does not use cache files.

39. Under what conditions selecting Sorted Input in aggregator will still not boost
session performance?
Answer:
1. Incremental Aggregation, session option is enabled.
2. The aggregate expression contains nested aggregate functions.
3. Source data is data-driven.

40. Suppose we have a Source Qualifier transformation that populates two target tables.
How do you ensure TGT2 is loaded after TGT1?
Answer: In the Workflow Manager, we can Configure Constraint-based load ordering for a
session. The Integration Service orders the target load on a row-by-row basis. For every row
generated by an active source, the Integration Service loads the corresponding transformed
row first to the primary key table, then to the foreign key table.

Hence if we have one Source Qualifier transformation that provides data for multiple target
tables having primary and foreign key relationships, we will go for Constraint-based load
ordering.

41. How can we create Index after completion of Load Process?


Answer: Command Tasks are used to create the Index. Command Task scripts can be used in
a session of the workflow to create an index.

42. What are Sessions?


Answer: A session is a set of instructions that is used while moving data from source to destination. We can partition a session to implement several sequences of sessions to improve server performance.

After creating a session, we can use the server manager or the command-line program pmcmd to stop or start the session.

43. What is the benefit of Session Partitioning?


Answer: While the Integration Service is running in the environment, the workflow is partitioned for better performance. These partitions are then used to perform Extraction, Transformation, and Loading.

44. What is a predefined event?


Answer: Predefined event is the file-watch event. It waits for a specific file to arrive at a
specific location.
45. What is Joiner Transformation in INFORMATICA?
Answer: With the help of Joiner Transformation, you can make use of Joins in
INFORMATICA.

It is based on two sources namely:

 Master Source
 Detail Source

The following joins can be created using the Joiner transformation, as in SQL:

 Normal Join
 Full Outer Join
 Master Outer Join (Right Outer Join)
 Detail Outer Join (Left Outer Join)
46. What is Aggregator Transformation in INFORMATICA?
Answer: The Aggregator Transformation is an Active and Connected transformation. It works like the GROUP BY clause in SQL and is useful for performing aggregate calculations on groups in INFORMATICA PowerCenter. It performs aggregate calculations on data using aggregate-type functions viz. SUM, AVG, MAX, and MIN.

47. What is the Sequence Generator Transformation in INFORMATICA?


Answer: The Sequence Generator Transformation is a Passive and Connected transformation. Its basic use is to generate integer values with NEXTVAL and CURRVAL.

48. What is Rank Index in Rank Transformation?


Answer: The Rank Index is assigned by the task designer to each record. The rank index port is used to store the ranking position for each row. The Rank Transformation identifies each row from top to bottom and then assigns a Rank Index.
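Top-n ranking with a RANKINDEX-style output can be sketched as follows (a toy model; `rank_top` and the sample data are invented for illustration):

```python
def rank_top(rows, port, n):
    """Return the top-n rows by `port`, each with a RANKINDEX-style
    position where 1 is the best rank."""
    ranked = sorted(rows, key=lambda r: r[port], reverse=True)[:n]
    return [{**r, "RANKINDEX": i + 1} for i, r in enumerate(ranked)]

sales = [{"rep": "a", "amt": 50}, {"rep": "b", "amt": 80}, {"rep": "c", "amt": 30}]
top2 = rank_top(sales, "amt", 2)
```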
