Project Related Questions
These are the questions which I would normally expect an interviewee to know when I sit on a panel. I request my readers to start posting your answers to these questions in the discussion forum under the Informatica technical interview guidance tag; I will review them, and only valid answers will be kept, the rest will be deleted.
3. How many mappings have you created altogether in your project?
6. How many complex mappings have you created? Could you please describe the situations for which you developed them?
8. What is the Schema of your Project? And why did you opt for that particular schema?
10. Can you describe one situation where an approach you adopted improved performance dramatically?
13. What kinds of testing have you done on your project (Unit, Integration, System, or UAT)?
14. How many dimension tables are there in your project and how are they linked to the fact table?
21. What is your Daily feed size and weekly feed size?
22. Which Approach (Top down or Bottom Up) was used in building your project?
23. How do you access your sources (are they flat files or relational)?
24. Have you developed any Stored Procedure or triggers in this project? How did you use them and
in which situation?
25. Did your project go live? What issues did you face while moving your project from the test environment to production?
26. What is the biggest Challenge that you encountered in this project?
27. What is the scheduler tool you have used in this project? How did you schedule jobs using it?
28. What is the difference between Informatica 7.x and 8.x?
34. How does the Informatica server sort string values in the Rank transformation?
passive?
37. In an Update Strategy, which gives better performance as the target, a relational table or a flat file? Why?
38. What are the output files that the Informatica server creates while running a session?
39. Can you explain what error tables in Informatica are and how we do error handling in Informatica?
40. What is the difference between constraint-based loading and a target load plan?
45. How will you create a header and footer in the target using Informatica?
47. Where does Informatica store rejected data? How do we view them?
48. What is the difference between partitioning relational targets and file targets?
49. What are mapping parameters and variables, and in which situations can we use them?
50. What do you mean by direct loading and Indirect loading in session properties?
mapping?
59. When we can join tables at the Source qualifier itself, why do we go for joiner transformation?
60. What is the default join operation performed by the look up transformation?
62. In a joiner transformation, you should specify the table with lesser rows as the master table.
Why?
64. Explain what the DTM does when you start a workflow.
65. Explain what the Load Manager does when you start a workflow.
66. In a sequential batch, how do I stop one particular session from running?
70. What are the different types of caches available in Informatica? Explain in detail.
75. What are the options in the target session of update strategy transformation?
76. What is a code page? Explain the types of the code pages?
78. How can you delete duplicate rows without using a Dynamic Lookup? Tell me any other ways of doing it.
55. If your workflow is running slow, what is your approach to performance tuning?
57. After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can you map these three ports directly to the target?
61. Explain how we set the update strategy at the mapping level and at the session level.
62. What is the exact use of the 'Online' and 'Offline' server connect options while defining a workflow in the Workflow Monitor? The system hangs when the 'Online' server connect option is used; Informatica is installed on a personal laptop.
64. Write a session parameter file which will change the sources and targets for every session, i.e.
68. What is a transformation?
69. What does the Stored Procedure transformation do that is special compared to other transformations?
70. How do you recognize whether newly added rows got inserted or updated?
72. My flat file's size is 400 MB and I want to see the data inside the flat file without opening it. How do I do that?
74. How do you handle decimal places when you are importing a flat file?
75. What is the difference between $ and $$ in a mapping or parameter file? In which cases are they generally used?
76. While importing a relational source definition from the database, what metadata of the source do you import?
79. If a Sequence Generator (with an increment of 1) is connected to (say) 3 targets and each target uses the NEXTVAL port, what value will each target get?
87. How do you delete duplicate records from a source database or flat files? Can we use post-SQL to delete these records? In the case of a flat file, how can you delete duplicates before it starts loading?
88. You are required to perform "bulk loading" using Informatica on Oracle. What actions would you perform?
concurrent sessions?
90. Is a negative increment possible in the Sequence Generator? If yes, how would you accomplish it?
91. In which directory does Informatica look for the parameter file, and what happens if it is missing when you start the session?
92. Informatica is complaining that the server could not be reached. What steps would you take?
93. You have more than five mappings that use the same lookup. How can you manage the lookup?
94. What will happen if you copy a mapping from one repository to another repository and there is no identical source?
96. An Aggregator transformation has 4 ports (one is sum(col1), group by col2 and col3); which port should be the output?
97. What is a dynamic lookup and what is the significance of NewLookupRow? How will you use them?
98. If you have more than one pipeline in your mapping, how will you change the order of load?
99. When you export a workflow from Repository Manager, what does the XML contain? The workflow only?
100. Your session failed and when you try to open a log file, it complains that the session details are not available. How would you trace the error? Which log file would you look for?
101. You want to attach a file as an email attachment from a particular directory using the 'email task' in Informatica. How do you do it?
102. You have a requirement to be alerted of any long-running sessions in your workflow. How can you create a workflow that will send you an email for sessions running more than 30 minutes? You can use any method: a shell script, a procedure, or an Informatica mapping or workflow control.
1. What is a data-warehouse?
3. What is ER Diagram?
4. What is a Star Schema?
17. What are modeling tools available in the Market? Name some of them?
20. What is Normalization? First Normal Form, Second Normal Form , Third Normal Form?
Data warehouse?
23. Which columns go to the fact table and which columns go to the dimension table? (My user needs to
25. How are the dimension tables designed? (De-normalized, wide, short, use surrogate keys.)
29. What is VLDB? (Database is too large to back up in a time frame then it's a VLDB)
Informatica Designer
1) Development Projects.
2) Enhancement Projects
3) Migration Projects
4) Production support Projects.
-> The following are the different phases involved in an ETL project development life cycle.
-> Business requirement gathering is started by the Business Analyst, the onsite technical lead and the client business users.
-> In this phase, a Business Analyst prepares the Business Requirement Document ( BRD ) (or) Business Requirement Specifications ( BRS ).
-> BRS :- The Business Analyst gathers the business requirements and documents them in the BRS.
-> SRS :- Senior technical people (or) the ETL architect prepare the SRS, which contains the s/w and h/w requirements.
An ETL Architect and a DWH Architect participate in designing a solution to build a DWH.
An HLD document is prepared based on the Business Requirement.
Based on the HLD, a senior ETL developer prepares the Low Level Design document.
The LLD contains more technical details of the ETL system.
An LLD contains a data flow diagram ( DFD ) and details of the sources and targets of each mapping.
An LLD also contains information about full and incremental loads.
After the LLD is complete, the development phase starts.
-> Based on the LLD, the ETL team creates the mappings ( ETL code ).
-> After designing the mappings, the code ( mappings ) is reviewed by developers.
Code Review :-
Peer Review :-
-> The code will be reviewed by your team member ( a third-party developer ).
Testing:-
--------------------------------
Unit Testing :-
-> A unit test for the DWH is white-box testing; it should check the ETL procedures and mappings.
-> The following test cases can be executed by an ETL developer.
1) Verify that there is no data loss
2) No. of records in the source and target
3) Data load / Insert
4) Data load / Update
5) Incremental load
6) Data accuracy
7) Verify naming standards
8) Verify column mapping
-> The unit test is carried out by the ETL developer in the development phase (a sample record-count check is sketched below).
-> The ETL developer also has to do the data validations in this phase.
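As a minimal sketch of the record-count check (test case 2 above), assuming hypothetical source and target table names SRC_ORDERS and TGT_ORDERS:

    SELECT 'SOURCE' AS side, COUNT(*) AS row_count FROM src_orders
    UNION ALL
    SELECT 'TARGET' AS side, COUNT(*) AS row_count FROM tgt_orders;

After a full load the two counts should match; any difference indicates data loss or duplication.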
-> This test is carried out in the presence of client side technical users to verify the data migration from
source to destination.
Production Environment :-
---------------------------------
-> Migrate the code into the Go-Live environment from test environment ( QA Environment ).
It depends on what aspect of the project you are talking about; for instance, this was an example of LLD and HLD in the context of Business Rules and Data Mapping.
People who have been involved in software projects will constantly hear the terms High Level Design (HLD) and Low Level Design (LLD). So what are the differences between these two design stages, and when are they respectively used?
High Level Design gives the overall system design in terms of functional architecture and database design. It designs the overall architecture of the entire system, from the main module down to all sub-modules. This is very useful for the developers to understand the flow of the system. In this phase the design team, the review team (testers) and the customers play a major role. The entry criterion for this is the requirement document, that is, the SRS. The exit criteria will be the HLD, project standards, the functional design documents, and the database design document. Further, the High Level Design gives an overview of the development of the product, in other words how the program is going to be divided into functions, modules, subdivisions, etc.
Low Level Design (LLD): During the detailed design phase, the view of the application developed during high level design is broken down into modules and programs. Logic design is done for every program and then documented as program specifications. For every program, a unit test plan is created. The entry criterion for this will be the HLD document, and the exit criteria will be the program specifications and unit test plan (LLD).
The Low Level Design document gives the design of the actual program code, which is designed based on the High Level Design document. It defines the internal logic of the corresponding sub-module; designers prepare and map individual LLDs to every module. A good Low Level Design document makes the program very easy for developers to build, because if proper analysis is done and the Low Level Design document is prepared, then the code can be developed directly from the Low Level Design document with minimal effort spent on debugging and testing.
High Level Design means precisely that. A high level design discusses an overall view of how something
should work and the top level components that will comprise the proposed solution. It should have very
little detail on implementation, i.e. no explicit class definitions, and in some cases not even details such as
database type (relational or object) and programming language and platform.
A low level design has nuts and bolts type detail in it which must come after high level design has been
signed off by the users, as the high level design is much easier to change than the low level design.
HLD: It refers to the functionality to be achieved to meet the client requirement. Precisely speaking, it is a diagrammatic representation of the client's operational systems, staging areas, DWH and data marts, and also of how and at what frequency the data is extracted and loaded into the target database.
LLD: It is prepared for every mapping along with a unit test plan. It contains the names of the source definitions, target definitions, transformations used, column names, data types, the business logic written, the source-to-target field matrix, the session name and the mapping name.
HLD: Based on the SRS, software analysts convert the requirements into a usable product. They design an application which will help the programmers in coding. In the design process, the product is broken into independent modules; each module is then taken one at a time and broken down further to arrive at micro levels. The HLD document will contain the following items at a macro level:
- list of modules and a brief description of each module
- brief functionality of each module
- interface relationships among modules
- dependencies between modules
- database tables identified along with key elements
- overall architecture diagrams along with technology details
LLD: The HLD contains details at a macro level, so it cannot be given to programmers as a document for coding. So the system analysts prepare a micro-level design document, called the LLD. This document describes each and every module in an elaborate manner, so that the programmer can directly code the program based on it. There will be at least one document for each module, and there may be more for a module. The LLD will contain:
- detailed functional logic of the module, in pseudo code
- database tables, with all elements, including their type and size
- all interface details with complete API references (both requests and responses)
- all dependency issues
- error message listings
- complete inputs and outputs for a module
(courtesy 'anonimas')
HLD is the first output in the system design phase (in the SDLC). Here we design the overall architecture of the system; the main functional or core modules are given shape here. This also includes the control flow between main modules, the E-R status, etc.
The main outputs are the E-R diagram, flow charts, DFDs, etc.
In the LLD we create a more detailed and specific design of the system: how exactly we build the DB structure, the interface design, etc.
The main outputs are the DB schema, frameworks, interface designs, etc.
Informatica Batch Processing
Explain how batch processing works in Informatica. When would it be useful in real-time projects?
When we run multiple sessions in a single workflow sequentially, that is called batch processing. It is useful in real projects when several dependent loads have to run in a defined order, for example when building a company's relational data warehouse.
At the session level, we can select the Source File Type as Indirect. When you select Indirect, the Integration Service finds the file list and reads each listed file when it runs the session. So inside the file list we can mention the file names that change frequently.
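For example, a file list (say, cust_files.txt, a hypothetical name) is just a plain text file with one source file path per line:

    /data/incoming/sales_20240101.dat
    /data/incoming/sales_20240102.dat
    /data/incoming/sales_20240103.dat

In the session properties, the Source filename points to this list file and the Source filetype is set to Indirect, so the Integration Service reads each listed file in turn.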
The dictionary meaning of homogeneous is uniform and of heterogeneous is mixed. For example, if a mapping uses only Oracle sources, or only flat files, or only DB2 or XML or any other single type, then they are called homogeneous sources.
An example of heterogeneous sources is a mapping that uses an Oracle source table, a flat file, a DB2 source and an XML source together.
We simply cannot say that a flat file is different from those two.
First we need to save the Excel file as a CSV file; then we can use the XML transformation in Informatica to convert the CSV file to an XML file.
An Excel source is possible through a File DSN, and an XML target can also be created; read the documentation or help for more information.
Alternate Index
What is the use of an alternate index? Is using an alternate index in file processing fast?
An alternate index is used to access the records from a file using an alternate key when the primary key is not available. But accessing records through this index is slow, because the alternate index stores the alternate key together with the primary key: using the alternate key we first get the primary key from the alternate index file, and only then do we search the file using that primary key. Hence access through an alternate index is slow.
Use the CONCAT (||) operator and the SUBSTR function in an Expression transformation.
You can use an Expression transformation: add an output port, then use the TO_CHAR(date_port, format) function, for example:
TO_CHAR(date_port, 'MM')
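A minimal sketch of such output ports in an Expression transformation, assuming hypothetical input ports FIRST_NAME, LAST_NAME and ORDER_DATE:

    O_FULL_NAME  = FIRST_NAME || ' ' || LAST_NAME
    O_SHORT_NAME = SUBSTR(FIRST_NAME, 1, 3)
    O_MONTH      = TO_CHAR(ORDER_DATE, 'MM')

CONCAT(FIRST_NAME, LAST_NAME) is equivalent to the || operator for two strings.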
What is a delete flag in Informatica and why is it used?
A delete flag is used to delete a record from the target when the flag condition given in the Update Strategy transformation evaluates to true. The flag typically carries a binary value, either true or false.
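A minimal sketch of the Update Strategy expression, assuming a hypothetical flag port named DELETE_FLAG that carries 'Y' for rows to be removed:

    IIF(DELETE_FLAG = 'Y', DD_DELETE, DD_UPDATE)

For the delete to reach the database, the session's Treat Source Rows As property must be set to Data Driven and the Delete option must be enabled on the target.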
Informatica Architecture
Can you explain the Informatica architecture? And what is the difference between service-based and service-oriented?
|------------------------|---------------------------|----------|
| SOURCES                | Client tools              | TARGET   |
| (e.g. Oracle, DB2)     | PowerCenter Repository    |          |
|                        | Repository Server         |          |
|------------------------|---------------------------|----------|
Hi, when the interviewer asks this question, he/she is asking about your current project work or task requirement. So you may describe everything from requirement gathering to report generation, or you may explain the work or task you were involved in. For example, if you were involved with the BEs in taking requirements and understanding them (KT with Business Engineers), then you were involved with the design, testing and code migration for Informatica, and you may also extend this explanation if you know about the reporting.
Closing 1 excel when multiple instances of excel are open during runtime in
QTP
I have written a VB script to batch run the QTP scripts. My VB script takes input from the "ControlFile" Excel workbook to get the name of the test script to open and execute in QTP, so I need to keep this "ControlFile" workbook open throughout the execution of all the scripts in the batch. The problem is that my scripts open some Excel files for comparison, and when they are closed with appexcel.Quit, even my "ControlFile" workbook closes, and hence I am unable to get the script names after that. The execution stops there. Can anyone please help me with this, i.e. how to close one particular instance of Excel during runtime? Thanks in advance for the help!
Informatica ERROR REP_12014 : Error occured while accessing the Registry
Hi,
I am learning Informatica 8.1 (which is what I could get my hands on).
DataBase error: ORA-01455: converting column overflows integer datatype ORA-01455: converting
column overflows integer datatype Database driver error... Function Name : Fetch SQL Stmt : SELECT
OBJECT_NAME, OBJECT_TYPE, OBJECT_SUBTYPE, USER_NAME, USER_PASSWORD2,
CONNECT_STRING, CODE_PAGE, COMMENTS, OWNER_ID, GROUP_ID, LAST_SAVED, CREATE_INFO,
OPB_OBJECT_ID, OBJVERSION, COMP_VERSION FROM OPB_CNX WHERE OBJECT_ID = ?
I can see some data in the table and I have an idea about the offending column
The structure of the table is
Organization problems
Consider an organization with which you are familiar. If the organization is using a file processing system, what problems will the organization face?
Explain with suitable examples.
This can be done by using a Java transformation.
Could we use the dsjob command on a Linux or Unix platform to achieve the activity of extracting parameters from a job?
Upload a flat file in your program into an internal table and then pass the internal table to BAPI
Splitting and merging of file using sort
Hi,I have a file which contains 3 types of transactions with account number,transaction type and
transaction creation date.The transaction types are 35,39 and 41.The file has duplicate records for all
transactions(ie 2 records for each account number for each transaction).Now I need to remove the
duplicate for the 39 type transaction alone and keep the remaining.That is only for the transaction type
39 the duplicate record(record with old transaction date) need to be removed and for the remaining
transactions the duplicates need to be there.How to achieve this in a single sort step?
2. Degenerate Dimension: a dimension that is derived from the fact table and does not have a dimension table of its own.
I guess in the Workflow Monitor you just right-click on the session and select Get Run Properties; this option shows the total source rows, the number of rejected rows and the total rows moved to the target.
Normal load:
In a normal load, the entire source data is processed into the target with constraint checking.
Bulk load:
In a bulk load, the entire source data is processed into the target without checking constraints, bypassing the database log; this is faster, but session recovery is not possible.
If both tables are relational we can join them with a SQL override in the Source Qualifier, but if one table is relational and the other is a flat file, then we have to use a Joiner transformation.
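A minimal sketch of such a Source Qualifier SQL override, assuming two hypothetical Oracle tables CUSTOMERS and ORDERS:

    SELECT c.customer_id, c.customer_name, o.order_id, o.order_amount
    FROM   customers c
    JOIN   orders o ON o.customer_id = c.customer_id

The order of the columns in the SELECT list must match the order of the connected ports in the Source Qualifier.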
If a session fails after loading 10,000 records into the target, how can you load
the records from the 10,001st record when you run the session the next time in
Informatica 6.1?
In Informatica 8.6 the recovery feature is improved. The Informatica server writes real-time recovery information to a queue, which helps maintain data integrity during recovery, so no data is lost or duplicated. The recovery queue stores the reader state, the commit number and the message IDs the Informatica server committed to the target. During recovery, the Informatica server uses this recovery information to determine where it stopped processing.
The recovery ignore list stores message IDs that the Integration Service wrote to the target for a failed session. The Informatica server writes recovery information to the list if there is a chance that the source did not receive an acknowledgement. While recovering, the Informatica server uses the recovery ignore list to prevent data duplication.
In a star schema, the fact table is normalized and the dimension tables are denormalized.
In a snowflake schema, the fact table is normalized and the dimension tables are also normalized (split out into further lookup tables).
Which tool do you use to create and manage sessions and batches and to monitor
and stop the Informatica server?
The Informatica Server Manager is the tool used to create and manage sessions and batches, and to monitor and stop the Informatica server.
Suppose I have one source which is linked to 3 targets. When the workflow runs for the
first time, only the first target should be populated and the other two (second and last) should
not be populated. When the workflow runs for the second time, only the second target
should be populated and the other two (first and last) should not be populated. When the
workflow runs for the third time, only the third target should be populated and the other
two (first and second) should not be populated.
First create a Sequence Generator with a start value of 1 and an end value of 3, and enable the Cycle option. Make sure the Number of Cached Values is set to 0.
In the data flow, use an Expression transformation to collect the dataflow ports and add a new port (ITERATION_NO) to hold NEXTVAL from the sequence. Pass this data to a Router where you create 3 groups: the first group condition is ITERATION_NO = 1, the second group condition is ITERATION_NO = 2 and the third group condition is ITERATION_NO = 3. This way, each session run will load the first, second and third target instance in cyclic mode.
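A compact sketch of the setup, assuming the Expression output port is named ITERATION_NO (a hypothetical name):

    ITERATION_NO = NEXTVAL              (connected from the Sequence Generator)
    Router group TARGET_1: ITERATION_NO = 1
    Router group TARGET_2: ITERATION_NO = 2
    Router group TARGET_3: ITERATION_NO = 3

Each group feeds one target instance, so successive runs cycle through the three targets.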
2. The PowerExchange tool was also introduced in Informatica 8.x, whereas it was not available in 7.x.
4. How do you get the records starting with a particular letter, like A, in Informatica?
A Router transformation can be used: use the same condition for both groups, which lets all rows pass through, and then insert into the same target table. To get the records starting with A, write an SQL query in the Source Qualifier transformation, as sketched below.
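A minimal sketch of that Source Qualifier query, assuming a hypothetical table EMP with a column ENAME:

    SELECT * FROM emp WHERE ename LIKE 'A%'

For a flat-file pipeline, the same effect can be achieved with a Filter transformation using SUBSTR(ENAME, 1, 1) = 'A'.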
A view is just a SQL statement stored in the database which is executed every single time it is
called.
A materialized view (MV) is a SQL statement whose resultant data is also stored in the database in
some form when it is created.
This helps in faster extraction of data; the downside is that the MV has to be refreshed on a regular
basis to get the latest data.
Materialized views are physical Oracle database objects; using these, the stored data can be refreshed
on a timely basis.
Views, on the other hand, are logical database objects: if any changes happen in the underlying table,
those changes are reflected in the respective view as well.
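A minimal sketch in Oracle SQL, assuming a hypothetical table SALES:

    CREATE VIEW v_daily_sales AS
      SELECT sale_date, SUM(amount) AS total_amount
      FROM   sales
      GROUP  BY sale_date;

    CREATE MATERIALIZED VIEW mv_daily_sales
      BUILD IMMEDIATE
      REFRESH COMPLETE ON DEMAND
      AS SELECT sale_date, SUM(amount) AS total_amount
         FROM   sales
         GROUP  BY sale_date;

The view re-runs its query every time it is referenced, while the materialized view stores the result and must be refreshed (for example with DBMS_MVIEW.REFRESH('MV_DAILY_SALES')) to pick up new data.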