
Interview Questions Set 1


1. What is ETL and what are the ETL tools?


ETL is the process of Extracting, Transforming, and Loading data into a data warehouse in the required
format for decision making.

ETL Tools List:

         Informatica PowerCenter

         IBM DataStage

         Ab Initio

         Oracle Warehouse Builder

         Microsoft SQL Server Integration Services (SSIS)

         DT/Studio

         Talend Studio, etc.

2. What is Informatica?

Informatica is an ETL tool from Informatica Corporation. It is used for the Extraction,
Transformation, and Load process across all types of databases. Nowadays Informatica is also being
used as a data integration tool, and it is built on a service-oriented architecture.

3. What are the Informatica PowerCenter Tools?


Informatica PowerCenter has 2 types of tools:
Server Tools: Administrator Console and Integration Service.
Client Tools: Repository Manager, Mapping Designer, Workflow Manager and Workflow
Monitor.

4. What are the PowerCenter Client Side Tools and their uses?
Informatica PowerCenter Client Side Tools are Repository Manager, Mapping Designer,
Workflow Manager, Workflow Monitor, Data Transformation and Developer Client.

Repository Manager: It is used to organize and secure metadata by creating folders.


Mapping Designer: It is used to create and store mapping metadata in the repository. The
Designer helps you create source definitions, target definitions and transformations to build
the mappings. The Designer has tools to help you build mappings and mapplets so you can
specify how to move and transform data between sources and targets.
Workflow Manager: It is used to store workflow metadata and connection object
information in the repository. A workflow contains a session and any other task you may
want to perform when you run a session. Tasks can include a session, email notification, or
scheduling information. You connect each task with links in the workflow. In the Workflow
Manager, you define a set of instructions called a workflow to execute mappings you build in
the Designer. 
Workflow Monitor: It is used to retrieve workflow run status information and session logs
written by the Integration Service. A workflow is a set of instructions that tells an
Integration Service how to run tasks. Integration Services run on nodes or grids. The nodes,
grids, and services are all part of a domain. You can monitor workflows and tasks in the
Workflow Monitor. 

5. What are the PowerCenter Server Side Tools and their uses?
Informatica PowerCenter Server Side Tools are Administrator Console and Integration
Service.

Administrator Console: Informatica Administrator is the administration tool that you use
to administer the Informatica domain and Informatica security. It is used for domain
administrative tasks and security administrative tasks.
Integration Service: The PowerCenter Integration Service is an application service that
runs sessions and workflows. Use the Administrator tool to manage it: create a PowerCenter
Integration Service, enable or disable it, configure normal or safe mode, configure its
properties, configure the associated repository, configure its processes, configure
permissions on it, and remove it.

6. What happens if the number of source records is more than the Sequence Generator's
configured end value?
Error Message: Overflow error. The Sequence Generator transformation has reached the
end of the configured end value.

7. What is Source Qualifier and tasks it performs?


The Source Qualifier transformation represents the rows that the Integration Service reads
when it runs a session. The Source Qualifier transformation converts relational or flat-file
datatypes into Informatica datatypes.

Tasks it performs:
         Join data originating from the same source database

         Filter rows when the Integration Service reads source data

         Specify an outer join rather than the default inner join

         Specify sorted ports


         Select only distinct values from the source

         Create a custom query to issue a special SELECT statement for the Integration Service to
read source data
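
As a rough sketch, a custom Source Qualifier query might join and filter two tables in the same database (the table and column names here are hypothetical):

SELECT E.EMP_ID, E.EMP_NAME, D.DEPT_NAME
FROM EMP E
INNER JOIN DEPT D ON E.DEPT_ID = D.DEPT_ID
WHERE E.STATUS = 'ACTIVE'
ORDER BY E.EMP_ID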

8. What are Filter and Router, and what is the difference between them?

Filter: The Filter transformation is used to filter out rows passing through it in a mapping.
It allows rows that meet the specified filter condition to pass through and drops rows that
do not meet the condition. You can filter data based on one or more conditions.
Router: A Router transformation is similar to a Filter transformation because both
transformations allow you to use a condition to test data. A Filter transformation tests data
for one condition and drops the rows of data that do not meet the condition. However, a
Router transformation tests data for one or more conditions and gives you the option to
route rows of data that do not meet any of the conditions to a default output group.
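
For illustration, with hypothetical port names, a Filter transformation uses a single condition while a Router defines one condition per group:

Filter condition:        SALARY > 30000
Router group GRP_US:     COUNTRY = 'US'
Router group GRP_UK:     COUNTRY = 'UK'

Rows that match neither router condition can be routed to the default output group.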

9. What is Joiner and why we use sorted input option?


The Joiner transformation is used to join source data from two related heterogeneous sources
residing in different locations or file systems. We can also join data from the same source.
The sources used with a Joiner transformation should have at least one matching column. The
Joiner transformation uses a condition that matches one or more pairs of columns between
the two sources.

Use of Sorted Input: Configure the Joiner transformation to use sorted input to improve
session performance. When you configure the Joiner transformation to use sorted data, the
Integration Service improves performance by minimizing disk input and output. If the Sorted
Input option is selected, we are telling the Integration Service that we are passing sorted
data to the Joiner transformation.

10. What is Union Transformation?

The Union transformation is a multiple input group transformation that you use to merge
data from multiple pipelines or pipeline branches into one pipeline branch. It merges data
from multiple sources, similar to using the UNION ALL SQL statement to combine the results
of two or more SELECT statements. Like UNION ALL, the Union transformation does not remove
duplicate rows.

11. What is Update Strategy and Forward rejected rows?


The Update Strategy transformation is used to flag rows for update, delete, or reject in the
target table, rather than only insert. In general, data passing from source to target is flagged
for insert only.
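
A minimal sketch of an Update Strategy expression using the DD_ constants (the port and flag values are hypothetical):

IIF(ISNULL(EMP_ID), DD_REJECT, IIF(OP_FLAG = 'D', DD_DELETE, IIF(OP_FLAG = 'U', DD_UPDATE, DD_INSERT)))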

Forward Rejected Rows:


The Forward Rejected Rows option is used to either pass rejected rows to the next
transformation or drop them. By default, the Integration Service forwards rejected rows to
the next transformation; it flags the rows for reject and writes them to the session reject
file. If you do not select the Forward Rejected Rows option, the Integration Service drops
rejected rows and writes them to the session log file.

12. What is transaction control transformation?


The Transaction Control transformation is used to control the data flowing through it by
committing and rolling back transactions. A transaction is the set of rows bound by commit or
rollback rows. We can define a transaction based on a varying number of input rows.
Transactions can be controlled at two levels: within a mapping or within a session.

13. What is Lookup and How many types of Lookup?


The Lookup transformation is used to look up data in a flat file, relational table, view, or
synonym based on lookup conditions before loading data into the target.

Types of Lookup:

Connected: 

         A connected Lookup transformation receives input directly from the mapping pipeline.

         It can use a dynamic or static cache.

         Cache includes the lookup source columns in the lookup condition and the lookup source
columns that are output ports.

         Can return multiple columns from the same row or insert into the dynamic lookup cache.

         If there is no match for the lookup condition, the Integration Service returns the default
value for all output ports. If you configure dynamic caching, the Integration Service inserts
the row into the cache or leaves it unchanged.

         If there is a match for the lookup condition, the Integration Service returns the result of the
lookup condition for all lookup/output ports. If you configure dynamic caching, the Integration
Service either updates the row in the cache or leaves the row unchanged.

        Pass multiple output values to another transformation. Link lookup/output ports to another
transformation.

         Supports user-defined default values.

Un-Connected:
         An unconnected Lookup transformation receives input from the result of a :LKP expression in
another transformation.

         Use a static cache.

        Cache includes all lookup/output ports in the lookup condition and the lookup/return port.

         Designate one return port (R). Returns one column from each row.

         If there is no match for the lookup condition, the Integration Service returns NULL.

         If there is a match for the lookup condition, the Integration Service returns the result of the
lookup condition into the return port.

         Pass one output value to another transformation. The lookup/output/return port passes the
value to the transformation calling :LKP expression.

         Does not support user-defined default values.
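
A minimal sketch of calling an unconnected Lookup from an Expression transformation output port (the lookup transformation name and ports are hypothetical):

DEPT_NAME_OUT = :LKP.LKP_GET_DEPT_NAME(DEPT_ID)

If no row matches the lookup condition, the expression receives NULL, which can then be handled with IIF/ISNULL.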

14. What is cache and How many types of cache?

The cache is the memory the Integration Service allocates for a transformation instance on the server. The Integration Service
builds a cache in memory when it processes the first row of data in a cached Lookup
transformation. It allocates memory for the cache based on the amount you configure in the
transformation or session properties.

Types of cache:

Persistent cache: Saves the lookup cache files once and reuses them the next time the
Integration Service processes a Lookup transformation configured to use the cache.

Re-cache from lookup source: If the persistent cache is not synchronized with the lookup
table, you can configure the Lookup transformation to rebuild the lookup cache.
Static cache: The cache that cannot be changed through the session. It caches the lookup
file or table and looks up values in the cache for each row that comes into the
transformation. When the lookup condition is true, the Integration Service returns a value
from the lookup cache. The Integration Service does not update the cache while it processes
the Lookup transformation. By default the Integration Service creates a static cache and it
read only.
Dynamic cache: The cache that can be changed through the session. The Integration
Service dynamically inserts or updates data in the lookup cache and passes the data to the
target. The dynamic cache is synchronized with the target. To cache a table, flat file, or
source definition and update the cache, configure a Lookup transformation with dynamic
cache.
Shared cache: A cache that can be shared between multiple Lookup transformations. An unnamed
shared cache can be used in more than one place within the same mapping; a named shared cache
can be used across mappings and sessions.
Pre-build lookup cache: When the session is configured to build sequential caches, the Integration
Service creates caches as the source rows enter the Lookup transformation. When you
configure the session to build concurrent caches, the Integration Service does not wait for
the first row to enter the Lookup transformation before it creates caches. Instead, it builds
multiple caches concurrently.

15. What is difference between static and dynamic cache?


Static cache:

         We cannot insert or update the cache

         We can use relational, flat-file or pipeline lookup.

         When the condition is true the Integration Service returns a value from the lookup table or
cache.

        When the condition is false the Integration Service returns the default value for connected
lookup and NULL for un-connected lookup.

Dynamic cache:

         We can insert or update rows in the cache as the rows pass to the target.

         We can use relational, flat-file or pipeline lookup.

         When the condition is true the Integration Service either updates rows in the cache or
leaves the cache unchanged, depending on the row type. This indicates that the row is in
the cache and target table. It can pass updated rows to a target.

        When the condition is not true, the Integration Service either inserts rows into the cache or
leaves the cache unchanged, depending on the row type. This indicates that the row is not
in the cache or target. It can pass inserted rows to a target table.

16. What is named cache?

When you want to share a persistent cache file across mappings, you give the cache a name so
it can be reused. The caching structures must match or be compatible with a named cache. You
can share both static and dynamic named caches.

17. What is difference between re-usable transformation and mapplet?


A transformation created in the Transformation Developer is called a reusable
transformation; it can be used in more than one mapping.

A mapplet is a set of transformations with business logic which can be used in more than
one mapping.

18. What is Normalizer and why we use it?


The Normalizer transformation converts a single row into a set of multiple rows.

Uses of it:
The Normalizer transformation is used to convert a row that contains multiple-occurring
columns into a row for each instance of the multiple-occurring data.

The Normalizer transformation is used as Source Qualifier for Cobol/VSAM files.

19. What happens if the cache overflows?

If the cache data overflows, the Integration Service stores the overflow values in cache
files. Once the session completes, the Integration Service deletes the cache files unless
you configure the Lookup transformation to use a persistent cache.

20. What is data driven?


The Integration Service follows instructions coded into Update Strategy and Custom
transformations within the session mapping to determine how to flag rows for insert, delete,
update, or reject.

21. What is target load plan?


Target Load Plan is the order in which the Integration Service sends rows to targets in
different target load order groups in a mapping.

A target load order group is the collection of source qualifiers, transformations, and targets
linked together in a mapping. You can set the target load order if you want to maintain
referential integrity when inserting, deleting, or updating tables that have primary key
and foreign key constraints.

The Integration Service reads sources in a target load order group concurrently, and it
processes target load order groups sequentially.

 
22. What happens if one of the Source Qualifier ports is not connected to the source
definition?
The session fails and the Integration Service throws an error such as: The Source Qualifier
contains an unbounded field [LOC].

23. What is constraint based load, can we update in constraint based load?
Constraint-based load ordering is a process to load data first into the parent table and then
into the child table. When you specify constraint-based loading for a session, the Integration Service
orders the target load on a row-by-row basis. For every row generated by an active source,
the Integration Service loads the corresponding transformed row first to the primary key
table, then to any foreign key tables. The Constraint based load ordering attribute applies
only to insert operations.

When you enable complete constraint-based loading, change data is loaded to targets in the
same Transaction Control Unit (TCU) by using the row ID assigned to the data by the CDC
Reader. As a result, data is applied to the targets in the same order in which it was applied
to the sources. The following message will be issued in the session log to indicate that this
support is enabled:

WRT_8417 Complete Constraint-Based Load Ordering is enabled.

To enable complete constraint-based loading, specify FullCBLOSupport=Yes in


the Custom Properties attribute on the Config Object tab. This property can also be set
on the PowerCenter Integration Service, which makes it applicable to all workflows and
sessions that use that PowerCenter Integration Service.

If you use complete constraint-based loading, your mapping must not contain active
transformations which change the row ID generated by the CDC Reader. The following
transformations change the row ID value: Aggregator, Custom (configured as an active
transformation), Joiner, Normalizer, Rank, and Sorter.

24. How to update target without using update strategy?

By selecting Treat source rows as “update” in session properties tab.

25. How many ways we can update target table?


We can update target table in 3 ways.

1.    By selecting Treat source rows as “update” in session properties tab.

2.    By using Update Strategy in mapping with DD_UPDATE function in it and by selecting Treat
source rows as “Data driven” in session properties tab.
3.    By using Update Override in Target instance in mapping.

26. What is aggregate transformation and what is use of sorted input?


The Aggregator transformation is used to perform aggregate calculations, such as averages,
sums, medians, etc. It performs calculations on groups of rows based on the group-by ports.

By selecting Sorted Input we are informing the Integration Service that the input data is
already sorted. The Sorted Input option improves session performance.
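
For example, with a hypothetical group-by port DEPT_ID, the Aggregator output ports could be defined as:

TOTAL_SALARY = SUM(SALARY)
AVG_SALARY   = AVG(SALARY)
EMP_COUNT    = COUNT(EMP_ID)

If Sorted Input is selected, the incoming data must already be sorted on DEPT_ID, for example by a Sorter transformation placed before the Aggregator.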

27. Why we use joiner transformation, what are different types of joins?
The Joiner transformation is used to join source data from two related heterogeneous sources
residing in different locations or file systems. It can also be used to join sources from the
same database. To be joined, the sources should have at least one matching column.

Types of joins:
Normal Join: It discards all rows of data from both the master and detail sources that do
not match, based on the condition.
Master Outer Join: A master outer join keeps all rows of data from the detail source and the
matching rows from the master source. It discards the unmatched rows from the master
source.
Detail Outer Join: A detail outer join keeps all rows of data from the master source and the
matching rows from the detail source. It discards the unmatched rows from the detail
source.
Full Outer Join: A full outer join keeps all rows of data from both the master and detail
sources.

28. What is RANK transformation and use of group by port in it?


The Rank transformation is used to select the top or bottom rank of source data.

The group-by port is used to get the top or bottom rank within each group instead of across
the complete data.

29. What is performance tuning?


The performance tuning is a process to optimize session performance by eliminating
performance bottlenecks. To tune session performance, first identify a performance
bottleneck, eliminate it, and then identify the next performance bottleneck until you are
satisfied with the session performance. You can use the test load option to run sessions
when you tune session performance.

30. Difference between stop and abort in Informatica?


Stop: It stops reading from the source but finishes processing and loading the records
already in the buffer that were fetched after the last commit point.
Abort: It stops the process immediately, without processing the buffered records.

31. What is Persistent Lookup cache? What is its significance?


A persistent lookup cache lets us save the lookup cache files once and reuse them
the next time the Integration Service processes a Lookup transformation configured to use
the cache.
Significance:

 If you want to save the cache files and reuse them, you can configure the
transformation to use a persistent cache. Use a persistent cache when you know the
lookup table does not change between session runs.
 The first time the Integration Service runs a session using a persistent lookup
cache, it saves the cache files to disk instead of deleting them. The next time the
Integration Service runs the session, it builds the memory cache from the cache files. If
the lookup table changes occasionally, you can override session properties to re-cache
the lookup from the database.
 When you use a persistent lookup cache, you can specify a name for the cache
files. When you specify a named cache, you can share the lookup cache across sessions.
32. How does the Informatica server sort string values in the Rank transformation?
If the Integration Service runs in the ASCII data movement mode, it sorts session data
using a binary sort order.

If the Integration Service runs in Unicode data movement mode, the Integration Service
uses the sort order configured for the session. You select the session sort order in the
session properties. The session properties lists all available sort orders based on the code
page used by the Integration Service.

33. What is Normal and Bulk load?


In normal loading, the Integration Service loads record by record and writes database log
entries for each record. It takes comparatively longer to load data to the target.
In bulk loading, it loads a batch of records at a time into the target database and does not
write database log entries. It takes less time to load data to the target.

34. What is Constraint Based load and Target Load Plan?


Constraint Based Load: In constraint-based loading, the data is loaded based on
constraints: first into the parent table and then into the child table, based on the
constraints defined.
Target Load Plan: In a target load plan we specify the order in which the Integration
Service loads the data to the target instances.

Note: We will define Target Load Plan in case we have more than one flow in the mapping.

35. Is Sorter an active or passive transformation? When do we consider it to be active and
passive?
The Sorter transformation is an active transformation by default, but it can behave like a
passive one.

 In the Properties tab, if you select the Distinct option, it acts as active because it can change the number of rows.
 In the Ports tab, if you only select one or more ports for ascending or descending order, it behaves like a passive transformation.
36. Is SQL an active or passive transformation? When do we consider it to be
active and passive?
The SQL transformation is an active transformation by default. But you can configure it as a
passive transformation when you create the transformation.

Passive: When you configure the transformation to run in script mode, you create a passive
transformation. The transformation returns one row for each input row. The output row
contains results of the query and any database error.
Active: When you configure the SQL transformation to run in query mode, you create an
active transformation. The transformation can return multiple rows for each input row.
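
As a sketch of query mode, a static query can bind input ports with parameter binding (the table and port names are hypothetical, and the ?port? notation is the assumed binding syntax):

SELECT EMP_NAME, SALARY FROM EMP WHERE DEPT_ID = ?DEPT_ID?

For each input row, the Integration Service substitutes the value of the DEPT_ID input port and can return one or more result rows.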

37. Using Informatica, how do we update a target table which doesn't have any primary
key in the database?
We can do this in 2 ways:
1. By marking any port of the target instance as a primary key, and at session level selecting
Treat Source Rows As "update".

2. By using a Target Update Override with the :TU syntax.


Example: UPDATE T_SALES SET EMP_NAME = :TU.EMP_NAME, DATE_SHIPPED =
:TU.DATE_SHIPPED, TOTAL_SALES = :TU.TOTAL_SALES WHERE EMP_ID = :TU.EMP_ID  

39. What are the output files that the Informatica server creates during running a
workflow/session?
The PowerCenter Integration Service creates the following output files: 

 Workflow log
 Session log
 Session details file
 Performance details file
 Reject files
 Row error logs
 Recovery tables and files
 Control file
 Post-session email
 Output file
 Cache files
40. What is an Active and Passive transformation?
Active: An active transformation can change the number of rows that pass through the
transformation, change the transaction boundary, or change the row type.
Passive: A passive transformation does not change the number of rows that pass through
the transformation; it maintains the transaction boundary and the row type.

41. What are error tables in Informatica and how do we do error handling in Informatica?
Error handling in Informatica:
Error Handling settings allow you to determine if the session fails or continues when it
encounters pre-session command errors, stored procedure errors, or a specified number of
session errors.

42. What is difference between IIF and DECODE function?


You can use nested IIF statements to test multiple conditions. The following example tests
for various conditions and returns 0 if sales is zero or negative:
IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200,
SALARY3, BONUS))), 0 )

You can use DECODE instead of IIF in many cases. DECODE may improve readability. The
following shows how you can use DECODE:
DECODE(
SALES > 0 and SALES < 50, SALARY1,
SALES > 49 AND SALES < 100, SALARY2,
SALES > 99 AND SALES < 200, SALARY3,
SALES > 199, BONUS)

43. How to import oracle sequence into Informatica?


The same way we import a relational source: in the Source Analyzer, click Sources > Import
from Database.

44. How many ways can you create a source or target instance in the Designer?
The source definition can be created in 2 ways:
1. In the Source Analyzer, click Sources > Import from Database/File/Cobol File/XML
Definition.
2. In the Source Analyzer, click Sources > Create.

The target definition can be created in 4 ways:

1. In the Target Designer, click Targets > Import from Database/File/XML Definition.
2. In the Target Designer, click Targets > Create.
3. By creating a target definition from a source definition.
4. By creating a target definition from a transformation.

45. What is parameter file?


A parameter file contains a list of parameters and variables with assigned values. Group
parameters and variables in different sections of the parameter file. 

Each section is preceded by a heading that identifies the Integration Service, Integration
Service process, workflow, worklet, or session to which you want to define parameters or
variables.

We define parameters and variables directly below the heading, entering each parameter or
variable on a new line. You can list parameters and variables in any order within a section.
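
A minimal sketch of a parameter file (the folder, workflow, session, and parameter names are hypothetical):

[MyFolder.WF:wf_daily_load.ST:s_m_load_sales]
$$LoadDate=2015-01-01
$InputFile_Sales=/data/in/sales_20150101.csv
$DBConnection_Target=DWH_ORACLE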

46. How will you create header/trailer record in target file using Informatica?
Header: A header can be created in 3 ways:
1. In the session Mapping tab: Header Options > Output Field Names.
2. In the session Mapping tab: Header Options > Use header command output (the shell
command whose output will be used as the header).
3. Manually created in the mapping.
Trailer: A trailer can be created in 2 ways:
1. In the session Mapping tab: Footer Command (the shell command whose output will be used
as the footer, which will be appended to the output data; no footer if the command is
empty).
2. Manually created in the mapping.
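
For example, the header and footer commands could be simple shell echoes (the column names and trailer format are hypothetical):

Header command: echo "EMP_ID|EMP_NAME|SALARY"
Footer command: echo "TRAILER|`date +%Y%m%d`"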

47. What is mapping parameter and mapping variable?


Mapping parameters and variables make mappings more flexible; they represent values in
mappings and mapplets. If you declare
mapping parameters and variables in a mapping, you can reuse a mapping by altering the
parameter and variable values of the mapping in the session. This can reduce the overhead
of creating multiple mappings when only certain attributes of a mapping need to be
changed.

48. What are the session parameters?


Session Parameters are 2 types, user-defined and built-in session parameters.

User-defined:
$PMSessionLogFile: Defines the name of the session log between session runs. 
$DynamicPartitionCount: Defines the number of partitions for a session. 
$InputFileName: Defines a source file name. 
$LookupFileName: Defines a lookup file name. 
$OutputFileNames: Defines a target file name. 
$BadFileName: Defines a reject file name. 
$DBConnectionName: Defines a relational database connection for a source, target,
lookup, or stored procedure. 
$LoaderConnectionName: Defines external loader connections. 
$FTPConnectionName: Defines FTP connections. 
$QueueConnectionName: Defines database connections for message queues. 
$AppConnectionName: Defines connections to source and target applications. 
$ParamName: Defines any other session property. For example, you can use this
parameter to define a table owner name, table name prefix, FTP file or directory name,
lookup cache file name prefix, or email address. You can use this parameter to define
source, lookup, target, and reject file names, but not the session log file name or database
connections. 

Built-in:
$PMFolderName: Returns the folder name. 
$PMIntegrationServiceName: Returns the Integration Service name. 
$PMMappingName: Returns the mapping name. 
$PMRepositoryServiceName: Returns the Repository Service name. 
$PMRepositoryUserName: Returns the repository user name. 
$PMSessionName: Returns the session name. 
$PMSessionRunMode: Returns the session run mode (normal or recovery). 
$PMSourceQualifierName@numAffectedRows: Returns the number of rows the
Integration Service successfully read from the named Source Qualifier. 
$PMSourceQualifierName@numAppliedRows: Returns the number of rows the
Integration Service successfully read from the named Source Qualifier. 
$PMSourceQualifierName@numRejectedRows: Returns the number of rows the
Integration Service dropped when reading from the named Source Qualifier. 
$PMSourceName@TableName: Returns the table name for the named source instance. 
$PMTargetName@numAffectedRows: Returns the number of rows affected by the
specified operation for the named target instance. 
$PMTargetName@numAppliedRows: Returns the number of rows the Integration
Service successfully applied to the named target instance. 
$PMTargetName@numRejectedRows: Returns the number of rows the Integration
Service rejected when writing to the named target instance. 
$PMTargetName@TableName: Returns the table name for the named target instance. 
$PMWorkflowName: Returns the workflow name. 
$PMWorkflowRunId: Returns the workflow run ID. 
$PMWorkflowRunInstanceName: Returns the workflow run instance name. 

49. Where does Informatica store rejected data? How do we view them?
By default, the Integration Service stores rejected data in a reject file named after the
target instance: target_name.bad. Optionally, use the $BadFileName session parameter to set
the file name.

50. What is difference between partitioning of relational target and file targets?
Relational Target Partitioning: The Integration Service creates a separate connection to
the target database for each partition at the target instance. It concurrently loads data for
each partition into the target database.

File Target Partitioning: You can write the target output to a separate file for each
partition or to a merge file that contains the target output for all partitions. When you run
the session, the Integration Service writes to the individual output files or to the merge file
concurrently. You can also send the data for a single partition or for all target partitions to
an operating system command.
51. What do you mean by direct loading and Indirect loading in session
properties?
Direct loading: Indicates the source file contains the source data.
Indirect loading: Indicates the source file contains a list of files with the same file
properties. When you select Indirect, the Integration Service finds the file list and reads
each listed file when it runs the session.
 
52. What is the status code?
The status code provides error handling for the Informatica server during the session. The
stored procedure issues a status code that notifies whether or not the stored procedure
completed successfully. This value cannot be seen by the user; it is used only by the
Informatica server to determine whether to continue running the session or stop.
 
53. What is Data driven?
The Integration Service follows instructions coded into Update Strategy transformations
within the session mapping to determine how to flag records for insert, update, delete, or
reject. If you do not choose the Data Driven option, the Integration Service ignores
all Update Strategy transformations in the mapping.
 
54. Can you use the mapping parameters or variables created in one mapping into
another mapping?
No, we cannot use the mapping parameters or variables created in one mapping into
another mapping.
 
55. When we can join tables at the Source qualifier itself, why do we go for joiner
transformation?
The Source Qualifier transformation is used to join source data from relational sources
residing in the same schema, database, or system.
The Joiner transformation is used to join source data from two related heterogeneous
sources residing in different locations or file systems. You can also join data from the same
source.
 
56. What are a DTM and Load Manager?
DTM: The PowerCenter Integration Service starts a DTM process to run each Session and
Command task within a workflow. The DTM process performs session validations, creates
threads to initialize the session, read, write, and transform data, and handles pre- and post-
session operations.
Load Manager: The Load Manager process starts the workflow, locks it, reads the parameter
file, creates the workflow log, starts the DTM process to run sessions, and sends
post-session email when the session completes.
 
57. What is tracing level and what are its types?
The tracing level is the amount of detail displayed in the session log for this transformation.
 
Tracing levels are of 4 types:
Normal: Integration Service logs initialization and status information, errors encountered,
and skipped rows due to transformation row errors. Summarizes session results, but not at
the level of individual rows.
Terse: Integration Service logs initialization information and error messages and
notification of rejected data.
Verbose Initialization: In addition to normal tracing, Integration Service logs additional
initialization details, names of index and data files used, and detailed transformation
statistics.
Verbose Data: In addition to verbose initialization tracing, Integration Service logs each
row that passes into the mapping. Also notes where the Integration Service truncates string
data to fit the precision of a column and provides detailed transformation statistics.
Allows the Integration Service to write errors to both the session log and error log when you
enable row error logging.
When you configure the tracing level to verbose data, the Integration Service writes row
data for all rows in a block when it processes a transformation.
 
58. What is the command used to run a batch?
pmcmd
 
59. What is sequential and concurrent run?
Sequential Run: In Sequential run the tasks will run one after another based on the link
condition.
Concurrent Run: In a concurrent run, a set of tasks runs in parallel based on the link
conditions.
 
60. How do we improve the performance of the aggregator transformation?
 By passing sorted input to the Aggregator transformation (by selecting the Sorted
Input option), which reduces the amount of data cached during the session and improves
session performance.
 By limiting the number of connected input/output or output ports to reduce the
amount of data the Aggregator transformation stores in the data cache.
 By using a Filter transformation in the mapping, placed before the Aggregator
transformation, to reduce unnecessary aggregation.
61. What is a code page? Explain the types of the code pages?
A code page contains encoding to specify characters in a set of one or more languages and
is selected based on source of the data. The set code page refers to a specific set of data
that describes the characters the application recognizes. This influences the way that
application stores, receives, and sends character data.
 
62. How many ways you can delete duplicate records?
We can delete duplicate records in 3 ways:
1. By selecting the Select Distinct option in the Source Qualifier (in case of a relational
source, if there is no SQL override query).
2. By selecting the Distinct option in the Sorter transformation (in case the source is a file).
3. By selecting group by on all ports in the Aggregator transformation.
 
63. What is the difference between Power Center & Power Mart?
PowerCenter - ability to organize repositories into a data mart domain and share metadata
across repositories.
PowerMart - only local repository can be created.
 
64. Can you copy the session into a different folder or repository?
Yes. By using the Copy Session wizard you can copy a session into a different folder or
repository. But that target folder or repository should contain the mapping of that session.
If the target folder or repository does not have the mapping of the session being copied,
you have to copy that mapping first before you copy the session.
 
65. After dragging the ports of three sources (SQL server, oracle, Informix) to a
single source qualifier, can we map these three ports directly to target?
No, each source definition requires a separate Source Qualifier because they are tables from
different databases.
 
66. What is debugger?
The Debugger is used to validate a mapping and gain troubleshooting information about data and
error conditions. To debug a mapping, we configure and run the Debugger from within the
Mapping Designer. The Debugger uses a session to run the mapping on the Integration
Service. When you run the Debugger, it pauses at breakpoints and we can view and edit
transformation output data.
 
You might want to run the Debugger in the following situations:
         Before you run a session: After we save a mapping, we can run some initial tests with a
debug session before we create and configure a session in the Workflow Manager.
         After you run a session: If a session fails or if you receive unexpected results in the
target, we can run the Debugger against the session. We might also want to run the
Debugger against a session if we want to debug the mapping using the configured session
properties.
 
67. What are the different threads in DTM process?
The PowerCenter Integration Service process starts the DTM process to run a session. The
DTM process is also known as the pmdtm process. The DTM is the process associated with
the session task.
 
Read the Session Information: The PowerCenter Integration Service process provides
the DTM with session instance information when it starts the DTM. The DTM retrieves the
mapping and session metadata from the repository and validates it.
Perform Pushdown Optimization: If the session is configured for pushdown optimization,
the DTM runs an SQL statement to push transformation logic to the source or target
database.
Create Dynamic Partitions: The DTM adds partitions to the session if you configure the
session for dynamic partitioning. The DTM scales the number of session partitions based on
factors such as source database partitions or the number of nodes in a grid.
Form Partition Groups: If you run a session on a grid, the DTM forms partition groups. A
partition group is a group of reader, writer, and transformation threads that runs in a single
DTM process. The DTM process forms partition groups and distributes them to worker DTM
processes running on nodes in the grid.
Expand Variables and Parameters: If the workflow uses a parameter file, the
PowerCenter Integration Service process sends the parameter file to the DTM when it starts
the DTM. The DTM creates and expands session-level, service-level, and mapping-level
variables and parameters.
Create the Session Log: The DTM creates logs for the session. The session log contains a
complete history of the session run, including initialization, transformation, status, and error
messages. You can use information in the session log in conjunction with the PowerCenter
Integration Service log and the workflow log to troubleshoot system or session problems.
Validate Code Pages: The PowerCenter Integration Service processes data internally using
the UCS-2 character set. When you disable data code page validation, the PowerCenter
Integration Service verifies that the source query, target query, lookup database query, and
stored procedure call text convert from the source, target, lookup, or stored procedure data
code page to the UCS-2 character set without loss of data in conversion. If the PowerCenter
Integration Service encounters an error when converting data, it writes an error message to
the session log.
Verify Connection Object Permissions: After validating the session code pages, the DTM
verifies permissions for connection objects used in the session. The DTM verifies that the
user who started or scheduled the workflow has execute permissions for connection objects
associated with the session.
Start Worker DTM Processes: The DTM sends a request to the PowerCenter Integration
Service process to start worker DTM processes on other nodes when the session is
configured to run on a grid.
Run Pre-Session Operations: After verifying connection object permissions, the DTM runs
pre-session shell commands. The DTM then runs pre-session stored procedures and SQL
commands.
Run the Processing Threads: After initializing the session, the DTM uses reader,
transformation, and writer threads to extract, transform, and load data. The number of
threads the DTM uses to run the session depends on the number of partitions configured for
the session.
Run Post-Session Operations: After the DTM runs the processing threads, it runs post-
session SQL commands and stored procedures. The DTM then runs post-session shell
commands.
Send Post-Session Email: When the session finishes, the DTM composes and sends email
that reports session completion or failure. If the DTM terminates abnormally, the
PowerCenter Integration Service process sends post-session email.
 
68. What are the data movement modes in informatica?
The data movement mode determines how the Informatica server handles character data. You
choose the data movement mode in the Informatica server configuration settings.
 
Two types of data movement modes are available in Informatica:
ASCII mode
Unicode mode.
 
69. What is the difference between $ & $$ in mapping or parameter file?
$-prefixed variables are system defined and $$-prefixed variables are user defined.
 
70. What is aggregate cache in aggregator transformation?
The Integration Service stores data in the aggregate cache until it completes aggregate
calculations. The Integration Service stores group values in an index cache and it stores row
data in the data cache. If the Informatica server requires more space, it stores overflow
values in cache files.
 
71. What do you mean by SQL override?
SQL Override: Overriding the default SQL query generated by the Integration Service with a customized, user-defined SQL query.
 
72. What is a shortcut in Informatica?
Any object that is used in more than one folder is called a shortcut; it should be created in
a shared folder.
At any point of time, if you want to modify a shortcut, you need to modify the original
object in the shared folder.
 
73. What will happen if you copy the mapping from one repository to
another repository and there is an identical source?
It will ask for source to Rename, Replace, Reuse or Skip.
 
74. What is a dynamic lookup and what is the significance of NewLookupRow?
To cache a table, flat file, or source definition and update the cache, configure a Lookup
transformation with dynamic cache. The Integration Service dynamically inserts or updates
data in the lookup cache and passes the data to the target. The dynamic cache is
synchronized with the target.
 
Significance of NewLookupRow:
The Designer adds this port to a Lookup transformation configured to use a dynamic cache.
Indicates with a numeric value whether the Integration Service inserts or updates the row in
the cache, or makes no change to the cache. To keep the lookup cache and the target table
synchronized, pass rows to the target when the NewLookupRow value is equal to 1 or 2.
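
For example, a Router transformation placed after the dynamic Lookup might use group conditions like these (the group names are hypothetical):

INSERT_ROWS group:  NewLookupRow = 1
UPDATE_ROWS group:  NewLookupRow = 2

Rows with NewLookupRow = 0 indicate no change to the cache and are typically not passed on to the target.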
 
75. What happens if the cache size gets full?
If the cache size gets full, the Integration Service stores the overflow values in the
cache files. When the session completes, the Integration Service releases cache memory
and deletes the cache files unless you configure the Lookup transformation to use a
persistent cache.
76. How do you assign values from one session to another session?
1. Create a workflow variable.
2. Once the first session succeeds, use "Post-session on success variable assignment" to
assign the session value to the workflow variable.

3. In the second session, use "Pre-session variable assignment" to re-assign the workflow
variable value to the session.

77. I want to run the second session only after running the first session 3 times. How?
1. Create a workflow variable, Name: $$Session_Counter, Datatype: Integer, Persistent:
Checked.

2. Create an Assignment task and place it between the two sessions.

3. Edit the Assignment task > Expressions tab and add the following expression for
$$Session_Counter:

IIF($$Session_Counter = 3, 1, $$Session_Counter + 1)

4. On the link from the Assignment task to the second session, set the link condition
$$Session_Counter = 3 so that the second session runs only on every third workflow run.

78. How do you override default lookup order by?


We can override the default lookup ORDER BY clause by placing two dashes '--' as comment
notation after our own ORDER BY clause, to suppress the ORDER BY clause that the Integration
Service generates.
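
A sketch of a lookup SQL override using this technique (the table and column names are hypothetical):

SELECT EMP_ID, EMP_NAME, DEPT_ID FROM EMP ORDER BY DEPT_ID, EMP_ID --

The trailing two dashes comment out the ORDER BY clause that the Integration Service appends, so only the ORDER BY we specified is used.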

79. What happens if one of the Source Qualifier ports is not linked with the source
instance?
The session fails and the Integration Service throws an error such as: The Source Qualifier
contains an unbounded field [LOC].

80. What is difference between TRUE and FALSE in filter transformation?


A filter condition returns TRUE or FALSE for each row that the Integration Service
evaluates, depending on whether the row meets the specified condition.
         For each row that returns TRUE, the Integration Service passes the row through the transformation.
         For each row that returns FALSE, the Integration Service drops the row and writes a message to
the session log.

81. Can We Use Source and Target objects inside Mapplets?


We can use Source objects inside the mapplets, but we cannot use Target objects inside the
mapplets.

82. Explain the User Defined Functions in Informatica?


The User-defined functions extend the PowerCenter transformation language. We can create
and manage user-defined functions with the PowerCenter transformation language in the
Designer. We can add them to expressions in the Designer or Workflow Manager to reuse
expression logic and build complex expressions. User-defined functions are available to
other users in a repository.

83. What is pmcmd command and why we use it?


pmcmd is a program you use to communicate with the Integration Service. With pmcmd,
we can perform some of the tasks that you can also perform in the Workflow Manager, such
as starting and stopping workflows and sessions.

84. How you run workflow or session using command?


Command to run Workflow:
pmcmd startworkflow -sv IntService -d Domain -u user -p password -f Practice
wf_s_m_sample

Command to run Session:


pmcmd starttask -sv IntService -d Domain -u user -p password -f Practice -w wf_s_sample
s_m_sample

85. How many ways you can run a workflow or session or task?
We can run a workflow or session or task in 2 ways, Command line mode or Interactive
mode.

86. What is use of code page?


A code page contains encoding to specify characters in a set of one or more languages. The
code page is selected based on the source of the data. For example, if the source contains Japanese
text then the code page should be selected to support Japanese text.

When a code page is chosen, the program or application for which the code page is set,
refers to a specific set of data that describes the characters the application recognizes. This
influences the way that application stores, receives, and sends character data.

87. What is a surrogate key?


A surrogate key is a substitution for the natural primary key. It is a unique identifier or
number for each record of a dimension table that can be used for the primary key to the
table instead of a "real" or natural key.

88. What is difference between Mapplet and reusable transformation?


Mapplet: A mapplet is a set of transformations created in the Mapplet Designer based on
some business logic. It can be reused in more than one mapping, or across other folders
if it is created in a shared folder.
Reusable Transformation: It is a single transformation having some business logic, created
in the Transformation Developer. It can be reused in more than one mapplet or mapping, or
across other folders if it is created in a shared folder.

89. What is lookup function and how do you call it?


The LOOKUP function is nothing but searching for a value in lookup source columns. The
LOOKUP function compares data in a lookup source to a value we specify. When the
Integration Service finds the search value in the lookup table, it returns the value from a
specified column in the same row of the lookup table.

Example:
Syntax: LOOKUP(result, search1, value1, search2, value2), where result is the column value
to return, search is the lookup source column to compare, and value is the input value to match.

LOOKUP(:TD.EMP.HIREDATE, :TD.EMP.EMPID, 112233, :TD.EMP.DEPTNO, 10)        

Note:
         This Lookup function is used in expression transformation instead of un-connected lookup
or uncached lookup.
         NULL if the search does not find any matching values.
         Error if the search finds more than one matching value.

90. What is the difference between truncate and delete options in the session?
The Truncate option is used to truncate all data from the target table before loading.
The Delete option is used to delete the records from the table which are flagged for deletion
in the Update Strategy.

91. What is Forward Rejected Rows?


The Forward Rejected Rows option is used to either pass rejected rows to the next
transformation or drop them. By default, the Integration Service forwards rejected rows to
the next transformation; it flags the rows for reject and writes them to the session reject
file. If you do not select the Forward Rejected Rows option, the Integration Service drops
rejected rows and writes them to the session log file.

92. Under what condition selecting Sorted Input in aggregator may fail the
session?
If unsorted data is passed to the Aggregator and the Sorted Input option is checked, the
session fails.

93. What are the difference between joiner transformation and source qualifier
transformation?
The Source Qualifier transformation is used to join source data from relational sources
residing in the same schema, database, or system.
The Joiner transformation is used to join source data from two related heterogeneous
sources residing in different locations or file systems. You can also join data from the same
source.

94. What are two types of processes that run the session?
The two types of processes that runs the session are Load Manager and DTM process.

 Load manager process starts the session, creates DTM process, and sends post
session email when the session completes.
 DTM process creates threads to initialize the session, read, write and transform
data and handle pre-session and post-session operations.
95. What are the Update Strategy transformation Constant and its values?
The Update Strategy transformation Constant are Insert, Update, Delete and Reject.
Operation   Constant     Numeric Value
Insert      DD_INSERT    0
Update      DD_UPDATE    1
Delete      DD_DELETE    2
Reject      DD_REJECT    3

96. What is a shared folder?


A shared folder is a global or local folder where we can create objects like Source, Target,
Transformation, Reusable Transformation, Mapplet, Mapping, Session, and Workflow. After
creating any object in a shared folder, it can be used in any other folder within the
repository as a shortcut object.

Any time you make a modification to the object in the shared folder, the change is
automatically reflected in the other folders.

97. What is a worklet and its uses?


A worklet is an object that represents a set of tasks created to reuse a set of business
logic. It can be used in more than one workflow within the folder.

98. What is file list and why it is used?


A file list contains the names and paths of files which have the same definition or structure.
When you specify the source file type as "Indirect", the Integration Service reads the data
from each file in the file list when the session runs.

99. What is Session and Batches?


Session: A session is a set of instructions that tells the Integration Service how and
when to move data from source to target.
Batch: A group of sessions is called a batch.

100. What are slowly changing dimensions?


Over a period of time, dimension data changes; if you want to maintain the history of those
changes, the process of maintaining that history is called a Slowly Changing Dimension.
