
Informatica

1. While importing a relational source definition from a database, what
metadata of the source do you import?

Source name
Database location
Column names
Data types
Key constraints

2. How many ways can you update a relational source definition, and what
are they?

Two ways:
1. Edit the definition.
2. Reimport the definition.

3. Where should you place a flat file to import the flat file definition
into the Designer?

Place it in a local folder.

4. To provide support for mainframe source data, which files are used as
source definitions?

COBOL files.

5. Which transformation do you need when using COBOL sources as source
definitions?

The Normalizer transformation, which is used to normalize the data, since
COBOL sources consist of denormalized data.

6. How can you create or import a flat file definition into the Warehouse
Designer?

You cannot create or import a flat file definition into the Warehouse
Designer directly. Instead, you must analyze the file in the Source
Analyzer and then drag it into the Warehouse Designer. When you drag the
flat file source definition into the Warehouse Designer workspace, the
Warehouse Designer creates a relational target definition, not a file
definition. If you want to load to a file, configure the session to write
to a flat file. When the Informatica server runs the session, it creates
and loads the flat file.

7. What is a mapplet?

A mapplet is a set of transformations that you build in the Mapplet
Designer and can use in multiple mappings.

8. What is a transformation?

It is a repository object that generates, modifies, or passes data.

9. What are the Designer tools for creating transformations?

Mapping Designer
Transformation Developer
Mapplet Designer

10. What are active and passive transformations?

An active transformation can change the number of rows that pass through
it. A passive transformation does not change the number of rows that pass
through it.

11. What are connected and unconnected transformations?

An unconnected transformation is not connected to other transformations in
the mapping. A connected transformation is connected to other
transformations in the mapping.

12. How many ways can you create ports?

Two ways:
1. Drag the port from another transformation.
2. Click the Add button on the Ports tab.

14. What are reusable transformations?

Reusable transformations can be used in multiple mappings. When you need
to incorporate such a transformation into a mapping, you add an instance
of it to the mapping. Later, if you change the definition of the
transformation, all instances of it inherit the changes. Since each
instance of a reusable transformation is a pointer to that transformation,
you can change the transformation in the Transformation Developer and its
instances automatically reflect these changes. This feature can save you a
great deal of work.

15. What are the methods for creating reusable transformations?

Two methods:
1. Design it in the Transformation Developer.
2. Promote a standard transformation from the Mapping Designer. After you
add a transformation to a mapping, you can promote it to the status of a
reusable transformation.
Once you promote a standard transformation to reusable status, you cannot
demote it back to a standard transformation.
If you change the properties of a reusable transformation in a mapping,
you can revert to the original reusable transformation properties by
clicking the Revert button.

16. What are the unsupported repository objects for a mapplet?

COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5-style LOOKUP functions
XML source definitions
IBM MQ source definitions

17. What are mapping parameters and mapping variables?

A mapping parameter represents a constant value that you can define before
running a session. A mapping parameter retains the same value throughout
the entire session. When you use a mapping parameter, you declare and use
the parameter in a mapping or mapplet, then define the value of the
parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can
change throughout the session. The Informatica server saves the value of a
mapping variable to the repository at the end of each session run and uses
that value the next time you run the session.

18. Can you use the mapping parameters or variables created in one
mapping in another mapping?

No. You can use mapping parameters or variables in any transformation of
the same mapping or mapplet in which you created them.

19. Can you use the mapping parameters or variables created in one
mapping in any other reusable transformation?

Yes, because a reusable transformation is not contained within any mapplet
or mapping.

20. How can you improve session performance in an Aggregator
transformation?

Use sorted input.

21. What is the aggregate cache in an Aggregator transformation?

The aggregator stores data in the aggregate cache until it completes the
aggregate calculations. When you run a session that uses an Aggregator
transformation, the Informatica server creates index and data caches in
memory to process the transformation. If the Informatica server requires
more space, it stores overflow values in cache files.

22. What are the differences between the Joiner transformation and the
Source Qualifier transformation?

You can join heterogeneous data sources with a Joiner transformation,
which you cannot achieve with a Source Qualifier transformation.
You need matching keys to join two relational sources in a Source
Qualifier transformation, whereas you do not need matching keys to join
two sources with a Joiner transformation.
The two relational sources must come from the same data source to be
joined in a Source Qualifier; with a Joiner transformation you can also
join relational sources that come from different sources.
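
For intuition, a Source Qualifier join of two homogeneous sources is
roughly equivalent to one SELECT issued against the shared database, as in
the following sketch (the table and column names are made up):

    -- Both tables live in the same database, so the Source Qualifier
    -- can push the join into a single generated query.
    SELECT e.empno, e.ename, d.dname
    FROM   emp  e,
           dept d
    WHERE  e.deptno = d.deptno   -- the matching keys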

23. In which conditions can we not use a Joiner transformation
(limitations of the Joiner transformation)?

Both pipelines begin with the same original data source.
Both input pipelines originate from the same Source Qualifier
transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipeline contains an Update Strategy transformation.
Either input pipeline contains a connected or unconnected Sequence
Generator transformation.

24. What are the settings that you use to configure the Joiner
transformation?

Master and detail source
Type of join
Condition of the join

25. What are the join types in the Joiner transformation?

Normal (default)
Master outer
Detail outer
Full outer
26. What are the Joiner caches?

When a Joiner transformation occurs in a session, the Informatica server
reads all the records from the master source and builds index and data
caches based on the master rows. After building the caches, the Joiner
transformation reads records from the detail source and performs the
joins.

27. What is the Lookup transformation?

Use a Lookup transformation in your mapping to look up data in a
relational table, view, or synonym. The Informatica server queries the
lookup table based on the lookup ports in the transformation. It compares
the Lookup transformation port values to lookup table column values based
on the lookup condition.

28. Why use the Lookup transformation?

To perform the following tasks:
Get a related value. For example, your source table includes an employee
ID, but you want to include the employee name in your target table to make
your summary data easier to read.
Perform a calculation. Many normalized tables include values used in a
calculation, such as gross sales per invoice or sales tax, but not the
calculated value (such as net sales).
Update slowly changing dimension tables. You can use a Lookup
transformation to determine whether records already exist in the target.
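
Conceptually, the "get a related value" case behaves like an outer join
against the lookup table. A SQL sketch with made-up names (in practice the
server caches the lookup table and probes it row by row):

    -- Each source row is matched on the lookup condition; rows with no
    -- match get the default value (NULL here).
    SELECT s.emp_id,
           s.sales_amount,
           e.emp_name        -- the related value returned by the lookup
    FROM   sales_staging s
    LEFT OUTER JOIN emp e
           ON e.emp_id = s.emp_id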

29. What are the types of lookup? Connected and unconnected.

30. Differences between connected and unconnected lookup?

Connected lookup: receives input values directly from the pipeline.
Unconnected lookup: receives input values from the result of a :LKP
expression in another transformation.

Connected lookup: you can use a dynamic or static cache.
Unconnected lookup: you can use a static cache only.

Connected lookup: the cache includes all lookup columns used in the
mapping.
Unconnected lookup: the cache includes all lookup/output ports in the
lookup condition and the lookup/return port.

Connected lookup: supports user-defined default values.
Unconnected lookup: does not support user-defined default values.

31. What is meant by lookup caches?

The Informatica server builds a cache in memory when it processes the
first row of data in a cached Lookup transformation. It allocates memory
for the cache based on the amount you configure in the transformation or
session properties. The Informatica server stores condition values in the
index cache and output values in the data cache.

32. What are the types of lookup caches?

Persistent cache: you can save the lookup cache files and reuse them the
next time the Informatica server processes a Lookup transformation
configured to use the cache.

Recache from database: if the persistent cache is not synchronized with
the lookup table, you can configure the Lookup transformation to rebuild
the lookup cache.

Static cache: you can configure a static, or read-only, cache for any
lookup table. By default, the Informatica server creates a static cache.
It caches the lookup table and lookup values in the cache for each row
that comes into the transformation. When the lookup condition is true, the
Informatica server does not update the cache while it processes the Lookup
transformation.

Dynamic cache: if you want to cache the target table and insert new rows
into both the cache and the target, you can configure the Lookup
transformation to use a dynamic cache. The Informatica server dynamically
inserts data into the target table.

Shared cache: you can share the lookup cache between multiple
transformations. You can share an unnamed cache between transformations in
the same mapping.

33. Difference between static cache and dynamic cache?

Static cache: you cannot insert into or update the cache. The Informatica
server returns a value from the lookup table or cache when the condition
is true. When the condition is not true, the Informatica server returns
the default value for connected transformations and NULL for unconnected
transformations.

Dynamic cache: you can insert rows into the cache as you pass them to the
target. The Informatica server inserts rows into the cache when the
condition is false; this indicates that the row is not in the cache or
target table, and you can pass these rows to the target table.

34. Which transformation should we use to normalize COBOL and relational
sources?

The Normalizer transformation. When you drag a COBOL source into the
Mapping Designer workspace, the Normalizer transformation automatically
appears, creating input and output ports for every column in the source.

35. How does the Informatica server sort string values in the Rank
transformation?

When the Informatica server runs in ASCII data movement mode, it sorts
session data using a binary sort order. If you configure the session to
use a binary sort order, the Informatica server calculates the binary
value of each string and returns the specified number of rows with the
highest binary values for the string.

36. What are the Rank caches?

During the session, the Informatica server compares an input row with rows
in the data cache. If the input row out-ranks a stored row, the
Informatica server replaces the stored row with the input row. The
Informatica server stores group information in an index cache and row data
in a data cache.

37. What is the RANKINDEX in the Rank transformation?

The Designer automatically creates a RANKINDEX port for each Rank
transformation. The Informatica server uses the rank index port to store
the ranking position for each record in a group. For example, if you
create a Rank transformation that ranks the top 5 salespersons for each
quarter, the rank index numbers the salespeople from 1 to 5.

38. What is the Router transformation?

A Router transformation is similar to a Filter transformation because both
transformations allow you to use a condition to test data. However, a
Filter transformation tests data for one condition and drops the rows of
data that do not meet the condition, whereas a Router transformation tests
data for one or more conditions and gives you the option to route rows of
data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use
a Router transformation in a mapping instead of creating multiple Filter
transformations to perform the same task.

39. What are the types of groups in the Router transformation?

Input group and output groups.
The Designer copies property information from the input ports of the input
group to create a set of output ports for each output group.
There are two types of output groups:
User-defined groups
Default group
You cannot modify or delete the default group.

40. Why do we use the Stored Procedure transformation?

For populating and maintaining databases. A Stored Procedure
transformation is an important tool for populating and maintaining
databases. Database administrators create stored procedures to automate
time-consuming tasks that are too complicated for standard SQL statements.

42. What are the types of data that pass between the Informatica server
and a stored procedure?

Three types of data:
Input/output parameters
Return values
Status code

43. What is the status code?

The status code provides error handling for the Informatica server during
the session. The stored procedure issues a status code that notifies
whether or not the stored procedure completed successfully. This value
cannot be seen by the user; it is used only by the Informatica server to
determine whether to continue running the session or to stop.

44. What is the Source Qualifier transformation?

When you add a relational or flat file source definition to a mapping, you
need to connect it to a Source Qualifier transformation. The Source
Qualifier transformation represents the records that the Informatica
server reads when it runs a session.

45. What are the tasks that the Source Qualifier performs?

Join data originating from the same source database.
Filter records when the Informatica server reads source data.
Specify an outer join rather than the default inner join.
Specify sorted records.
Select only distinct values from the source.
Create a custom query to issue a special SELECT statement for the
Informatica server to read source data.

46. What is the target load order?

You specify the target load order based on the source qualifiers in a
mapping. If you have multiple source qualifiers connected to multiple
targets, you can designate the order in which the Informatica server loads
data into the targets.

47. What is the default join that the Source Qualifier provides?

An inner equi-join.

48. What are the basic requirements to join two sources in a Source
Qualifier?

The two sources should have a primary key / foreign key relationship.
The two sources should have matching data types.

49. What is the Update Strategy transformation?

This transformation is used to maintain either full history data or just
the most recent changes in the target table.

50. Describe the two levels at which the update strategy is set?

Within a session: when you configure a session, you can instruct the
Informatica server either to treat all records in the same way (for
example, treat all records as inserts) or to use instructions coded into
the session mapping to flag records for different database operations.

Within a mapping: within a mapping, you use the Update Strategy
transformation to flag records for insert, delete, update, or reject.

51. What is the default source option for the Update Strategy
transformation?

Data driven.

52. What is Data driven?

The Informatica server follows instructions coded into Update Strategy
transformations within the session mapping to determine how to flag
records for insert, update, delete, or reject. If you do not choose the
Data driven option, the Informatica server ignores all Update Strategy
transformations in the mapping. A typical Update Strategy expression flags
each row with one of the DD_ constants, for example
IIF(ISNULL(existing_key), DD_INSERT, DD_UPDATE).

53. What are the options in the target session for the Update Strategy
transformation?

Insert
Delete
Update
  Update as update
  Update as insert
  Update else insert
Truncate table

54. What are the types of mapping wizards provided in Informatica?

The Designer provides two mapping wizards to help you create mappings
quickly and easily. Both wizards are designed to create mappings for
loading and maintaining star schemas, a series of dimensions related to a
central fact table.

Getting Started Wizard: creates mappings to load static fact and dimension
tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard: creates mappings to load slowly
changing dimension tables based on the amount of historical dimension data
you want to keep and the method you choose for handling historical
dimension data.

55. What are the types of mappings in the Getting Started Wizard?

Simple Pass Through mapping: loads a static fact or dimension table by
inserting all rows. Use this mapping when you want to drop all existing
data from your table before loading new data.

Slowly Growing Target mapping: loads a slowly growing fact or dimension
table by inserting new rows. Use this mapping to load new data when
existing data does not require updates.
56. What are the mappings that we use for slowly changing dimension
tables?

Type 1: rows containing changes to existing dimensions are updated in the
target by overwriting the existing dimension. In the Type 1 Dimension
mapping, all rows contain current dimension data. Use the Type 1 Dimension
mapping to update a slowly changing dimension table when you do not need
to keep any previous versions of dimensions in the table.

Type 2: the Type 2 Dimension/Version Data mapping inserts both new and
changed dimensions into the target. Changes are tracked in the target
table by versioning the primary key and creating a version number for each
dimension in the table. Use this mapping to update a slowly changing
dimension table when you want to keep a full history of dimension data in
the table. Version numbers and versioned primary keys track the order of
changes to each dimension.

Type 3: the Type 3 Dimension mapping filters source rows based on
user-defined comparisons and inserts only those found to be new dimensions
into the target. Rows containing changes to existing dimensions are
updated in the target. When updating an existing dimension, the
Informatica server saves the existing data in different columns of the
same row and replaces the existing data with the updates.

57. What are the different types of Type 2 dimension mappings?

Type 2 Dimension/Version Data mapping: in this mapping, an updated
dimension in the source is inserted into the target along with a new
version number, and a newly added dimension in the source is inserted into
the target with a new primary key.

Type 2 Dimension/Flag Current mapping: this mapping is also used for
slowly changing dimensions; in addition, it creates a flag value for each
changed or new dimension. The flag indicates whether the dimension is
current: current dimensions are saved with a flag value of 1, and updated
(historical) dimensions are saved with the value 0.

Type 2 Dimension/Effective Date Range mapping: this is another flavour of
the Type 2 mapping used for slowly changing dimensions. It also inserts
both new and changed dimensions into the target, and changes are tracked
by an effective date range for each version of each dimension.
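
For intuition, the versioned variant behaves roughly like the following
SQL applied to the changed source rows (a sketch with made-up table and
column names, not the wizard's actual generated logic):

    -- Keep the old version and add a new one with an incremented version
    -- number; the versioned primary key is (cust_key, version).
    INSERT INTO customer_dim (cust_key, version, cust_name, city)
    SELECT s.cust_key, d.version + 1, s.cust_name, s.city
    FROM   customer_src s
    JOIN   customer_dim d
           ON  d.cust_key = s.cust_key
           AND d.version  = (SELECT MAX(version)
                             FROM   customer_dim
                             WHERE  cust_key = s.cust_key)
    WHERE  s.cust_name <> d.cust_name   -- only rows that actually changed
       OR  s.city      <> d.city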

58. How can you recognise whether or not newly added rows in the source
get inserted into the target?

In the Type 2 mapping we have three options for recognising newly added
rows:
Version number
Flag value
Effective date range

59. What are the two types of processes that the Informatica server uses
to run a session?

Load Manager process: starts the session, creates the DTM process, and
sends post-session email when the session completes.
DTM process: creates threads to initialize the session, read, write, and
transform data, and handle pre- and post-session operations.

60. What are the new features of the Server Manager in Informatica 5.0?

You can use command line arguments for a session or batch. This allows you
to change the values of session parameters, mapping parameters, and
mapping variables.

Parallel data processing: this feature is available for PowerCenter only.
If you run the Informatica server on an SMP system, you can use multiple
CPUs to process a session concurrently.

Process session data using threads: the Informatica server runs the
session in two processes, as explained in the previous question.

61. Can you generate reports in Informatica?

Yes. By using the Metadata Reporter we can generate reports in
Informatica.

62. What is the Metadata Reporter?

It is a web-based application that enables you to run reports against
repository metadata. With the Metadata Reporter, you can access
information about your repository without knowledge of SQL, the
transformation language, or the underlying tables in the repository.

63. Define mapping and session?

Mapping: a set of source and target definitions linked by transformation
objects that define the rules for data transformation.
Session: a set of instructions that describe how and when to move data
from sources to targets.

64. Which tool do you use to create and manage sessions and batches, and
to monitor and stop the Informatica server?

The Informatica Server Manager.

65. Why do we partition a session in Informatica?

Partitioning improves session performance by reducing the time needed to
read the source and load the data into the target.

66. To achieve session partitioning, what are the necessary tasks you
have to do?

Configure the session to partition source data.
Install the Informatica server on a machine with multiple CPUs.

67. How does the Informatica server increase session performance through
partitioning the source?

For relational sources, the Informatica server creates one connection for
each partition of a single source and extracts a separate range of data
for each connection. The Informatica server reads multiple partitions of a
single source concurrently. Similarly, for loading, the Informatica server
creates multiple connections to the target and loads partitions of data
concurrently.

For XML and file sources, the Informatica server reads multiple files
concurrently. For loading the data, the Informatica server creates a
separate file for each partition of a source file. You can choose to merge
the target files.

68. Why do you use repository connectivity?

Each time you edit or schedule a session, the Informatica server
communicates directly with the repository to check whether or not the
session and users are valid. All the metadata of sessions and mappings is
stored in the repository.

69. What are the tasks that the Load Manager process performs?

Manages session and batch scheduling: when you start the Informatica
server, the Load Manager launches and queries the repository for a list of
sessions configured to run on the Informatica server. When you configure a
session, the Load Manager maintains a list of sessions and session start
times. When you start a session, the Load Manager fetches the session
information from the repository to perform validations and verifications
prior to starting the DTM process.

Locking and reading the session: when the Informatica server starts a
session, the Load Manager locks the session in the repository. Locking
prevents you from starting the same session again while it is running.

Reading the parameter file: if the session uses a parameter file, the Load
Manager reads the parameter file and verifies that the session-level
parameters are declared in the file.

Verifying permissions and privileges: when the session starts, the Load
Manager checks whether or not the user has the privileges to run the
session.

Creating log files: the Load Manager creates a log file containing the
status of the session.

70. What is the DTM process?

After the Load Manager performs validations for the session, it creates
the DTM process. The DTM's job is to create and manage the threads that
carry out the session tasks. It creates the master thread, which in turn
creates and manages all the other threads.

71. What are the different threads in the DTM process?

Master thread: creates and manages all other threads.
Mapping thread: one mapping thread is created for each session; it fetches
session and mapping information.
Pre- and post-session threads: created to perform pre- and post-session
operations.
Reader thread: one thread is created for each partition of a source; it
reads data from the source.
Writer thread: created to load data to the target.
Transformation thread: created to transform data.

72. What are the data movement modes in Informatica?

The data movement mode determines how the Informatica server handles
character data. You choose the data movement mode in the Informatica
server configuration settings. Two data movement modes are available:

ASCII mode
Unicode mode

73. What are the output files that the Informatica server creates during
a session run?

Informatica server log: the Informatica server (on Unix) creates a log for
all status and error messages (default name: pm.server.log). It also
creates an error log for error messages. These files are created in the
Informatica home directory.

Session log file: the Informatica server creates a session log file for
each session. It writes information about the session into the log file,
such as the initialization process, creation of SQL commands for reader
and writer threads, errors encountered, and the load summary. The amount
of detail in the session log file depends on the tracing level that you
set.

Session detail file: this file contains load statistics for each target in
the mapping, such as table name and number of rows written or rejected.
You can view this file by double-clicking on the session in the monitor
window.

Performance detail file: this file contains session performance details
that help you determine where performance can be improved. To generate
this file, select the performance detail option in the session property
sheet.

Reject file: this file contains the rows of data that the writer does not
write to targets.

Control file: the Informatica server creates a control file and a target
file when you run a session that uses the external loader. The control
file contains information about the target flat file, such as the data
format and loading instructions for the external loader.

Post-session email: post-session email allows you to automatically
communicate information about a session run to designated recipients. You
can create two different messages: one if the session completed
successfully, the other if the session fails.

Indicator file: if you use a flat file as a target, you can configure the
Informatica server to create an indicator file. For each target row, the
indicator file contains a number to indicate whether the row was marked
for insert, update, delete, or reject.

Output file: if the session writes to a target file, the Informatica
server creates the target file based on the file properties entered in the
session property sheet.

Cache files: when the Informatica server creates a memory cache, it also
creates cache files. The Informatica server creates index and data cache
files for the following transformations:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation

74. In which circumstances does the Informatica server create reject
files?

When it encounters DD_REJECT in an Update Strategy transformation.
When a row violates a database constraint.
When a field in a row is truncated or overflows.

75. What is polling?

It displays up-to-date information about the session in the monitor
window. The monitor window displays the status of each session when you
poll the Informatica server.

76. Can you copy a session to a different folder or repository?

Yes. Using the Copy Session Wizard you can copy a session to a different
folder or repository, but the target folder or repository must contain the
mapping of that session. If the target folder or repository does not have
the mapping of the session being copied, you have to copy that mapping
first, before you copy the session.

77. What is a batch, and what are the types of batches?

A grouping of sessions is known as a batch. Batches are of two types:
Sequential: runs sessions one after the other.
Concurrent: runs sessions at the same time.

If you have sessions with source-target dependencies, you have to use a
sequential batch to start the sessions one after another. If you have
several independent sessions, you can use a concurrent batch, which runs
all the sessions at the same time.

78. Can you copy batches? No.

79. How many sessions can you create in a batch? Any number of sessions.

80. When does the Informatica server mark a batch as failed?

If one of its sessions is configured to "run if previous completes" and
that previous session fails.

81. What command is used to run a batch? pmcmd is used to start a batch.

82. What are the different options used to configure sequential batches?

Two options:
Run the session only if the previous session completes successfully.
Always run the session.

83. In a sequential batch, can you run a session if the previous session
fails?

Yes, by setting the option to always run the session.

84. Can you start a batch within a batch?

You cannot. If you want to start a batch that resides in a batch, create a
new independent batch and copy the necessary sessions into the new batch.

85. Can you start a session inside a batch individually?

We can start a required session individually only in the case of a
sequential batch; in the case of a concurrent batch we cannot do this.
86. How can you stop a batch? By using the Server Manager or pmcmd.

87. What are the session parameters?

Session parameters are like mapping parameters; they represent values you
might want to change between sessions, such as database connections or
source files.

The Server Manager also allows you to create user-defined session
parameters. The following are user-defined session parameters:
Database connections
Source file name: use this parameter when you want to change the name or
location of the session source file between session runs.
Target file name: use this parameter when you want to change the name or
location of the session target file between session runs.
Reject file name: use this parameter when you want to change the name or
location of the session reject files between session runs.

88. What is a parameter file?

A parameter file defines the values for parameters and variables used in a
session. A parameter file is a text file created with an editor such as
WordPad or Notepad.
You can define the following values in a parameter file:
Mapping parameters
Mapping variables
Session parameters
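
A minimal sketch of what such a file might look like (the folder, session,
and parameter names are made up; by convention, names starting with $$ are
mapping parameters or variables and names starting with $ are session
parameters -- check your version's documentation for the exact section
header format):

    [ProductionFolder.s_load_customers]
    $$LoadDate=2001-01-31
    $InputFile1=/data/in/customers.dat
    $DBConnectionSource=ORA_SRC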

89. How can you access a remote source in your session?

Relational source: to access a relational source situated in a remote
location, you need to configure a database connection to the data source.
File source: to access a remote source file, you must configure an FTP
connection to the host machine before you create the session.
Heterogeneous: when your mapping contains more than one source type, the
Server Manager creates a heterogeneous session that displays source
options for all types.

90. What is the difference between partitioning of relational targets and
partitioning of file targets?

If you partition a session with a relational target, the Informatica
server creates multiple connections to the target database to write target
data concurrently. If you partition a session with a file target, the
Informatica server creates one target file for each partition; you can
configure the session properties to merge these target files.

91. What are the transformations that restrict the partitioning of
sessions?

Advanced External Procedure transformation and External Procedure
transformation: these transformations contain a check box on the
Properties tab to allow partitioning.
Aggregator transformation: if you use sorted ports, you cannot partition
the associated source.
Joiner transformation: you cannot partition the master source for a Joiner
transformation.
Normalizer transformation.
XML targets.

92. Performance tuning in Informatica?

The goal of performance tuning is to optimize session performance so that
sessions run during the available load window for the Informatica server.
You can increase session performance in the following ways.

The performance of the Informatica server is related to network
connections. Data generally moves across a network at less than 1 MB per
second, whereas a local disk moves data five to twenty times faster. Thus
network connections often affect session performance, so avoid unnecessary
network hops.

Flat files: if your flat files are stored on a machine other than the
Informatica server, move those files to the machine that hosts the
Informatica server.
Relational data sources: minimize the connections between sources,
targets, and the Informatica server to improve session performance. Moving
the target database onto the server system may improve session
performance.
Staging areas: if you use staging areas, you force the Informatica server
to perform multiple data passes. Removing staging areas may improve
session performance.

You can run multiple Informatica servers against the same repository.
Distributing the session load across multiple Informatica servers may
improve session performance.

Running the Informatica server in ASCII data movement mode improves
session performance, because ASCII data movement mode stores a character
value in one byte, whereas Unicode mode takes two bytes to store a
character.

If a session joins multiple source tables in one Source Qualifier,
optimizing the query may improve performance. Also, single-table SELECT
statements with an ORDER BY or GROUP BY clause may benefit from
optimization such as adding indexes.

We can improve session performance by configuring the network packet size,
which allows more data to cross the network at one time. To do this, go to
the Server Manager and choose Server Configure Database Connections.

If your target has key constraints and indexes, they slow the loading of
data. To improve session performance in this case, drop the constraints
and indexes before you run the session and rebuild them after the session
completes, as in the sketch below.
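
A hedged Oracle SQL sketch of the drop-and-rebuild approach (the table,
index, and constraint names are made up):

    -- Before the load: disable the constraint and drop the index.
    ALTER TABLE sales_fact DISABLE CONSTRAINT fk_sales_cust;
    DROP INDEX idx_sales_date;

    -- ... run the Informatica session ...

    -- After the load: rebuild the index and re-enable the constraint.
    CREATE INDEX idx_sales_date ON sales_fact (sale_date);
    ALTER TABLE sales_fact ENABLE CONSTRAINT fk_sales_cust;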

Running parallel sessions by using concurrent batches also reduces the
time needed to load the data, so concurrent batches may also increase
session performance.

Partitioning the session improves session performance by creating multiple
connections to sources and targets and loading data in parallel pipelines.

In some cases, if a session contains an Aggregator transformation, you can
use incremental aggregation to improve session performance.

Avoid transformation errors to improve session performance.

If the session contains a Lookup transformation, you can improve session
performance by enabling the lookup cache.

If your session contains a Filter transformation, place the Filter
transformation as close to the sources as possible, or use a filter
condition in the Source Qualifier.

Aggregator, Rank, and Joiner transformations often decrease session
performance because they must group data before processing it. To improve
session performance in this case, use the sorted ports option.

92. What is the difference between a mapplet and a reusable
transformation?

A mapplet consists of a set of transformations that is reusable; a
reusable transformation is a single transformation that can be reused.

If you create a variable or parameter in a mapplet, it cannot be used in
another mapping or mapplet. In contrast, the variables created in a
reusable transformation can be used in any other mapping or mapplet.

We cannot include source definitions in reusable transformations, but we
can add sources to a mapplet.

The whole transformation logic is hidden in the case of a mapplet, but it
is transparent in the case of a reusable transformation.

We cannot use COBOL Source Qualifier, Joiner, or Normalizer
transformations in a mapplet, whereas we can make them reusable
transformations.

93. Define the Informatica repository?

The Informatica repository is a relational database that stores
information, or metadata, used by the Informatica server and client tools.
Metadata can include information such as mappings describing how to
transform source data, sessions indicating when you want the Informatica
server to perform the transformations, and connect strings for sources and
targets.
The repository also stores administrative information such as usernames
and passwords, permissions and privileges, and product version.

Use the Repository Manager to create the repository. The Repository
Manager connects to the repository database and runs the code needed to
create the repository tables. These tables store metadata in the specific
format that the Informatica server and client tools use.

94. What are the types of metadata stored in the repository?

The following types of metadata are stored in the repository:
Database connections
Global objects
Mappings
Mapplets
Multidimensional metadata
Reusable transformations
Sessions and batches
Shortcuts
Source definitions
Target definitions
Transformations

95. What is the PowerCenter repository?

The PowerCenter repository allows you to share metadata across
repositories to create a data mart domain. In a data mart domain, you can
create a single global repository to store metadata used across an
enterprise, and a number of local repositories to share the global
metadata as needed.

96. How can you work with a remote database in Informatica? Did you work
directly by using remote connections?

To work with a remote data source, you need to connect to it with remote
connections, but it is not preferable to work with a remote source
directly in that way. Instead, bring that source onto the local machine
where the Informatica server resides. If you work directly with a remote
source, session performance decreases because less data can pass across
the network in a given time.

97. What are the new features in Informatica 5.0?

You can debug your mapping in the Mapping Designer.
You can view the workspace over the entire screen.
The Designer displays a new icon for invalid mappings in the Navigator
window.
You can use a dynamic lookup cache in a Lookup transformation.
You can create mapping parameters or mapping variables in a mapping or
mapplet to make mappings more flexible.
You can export objects to and import objects from the repository. When you
export a repository object, the Designer or Server Manager creates an XML
file to describe the repository metadata.
The Designer allows you to use the Router transformation to test data for
multiple conditions; the Router transformation allows you to route groups
of data to transformations or targets.
You can use XML data as a source or target.

Server enhancements:

You can use the command line program pmcmd to specify a parameter file to
run sessions or batches. This allows you to change the values of session
parameters, mapping parameters, and mapping variables at runtime.

If you run the Informatica server on a symmetric multiprocessing system,
you can use multiple CPUs to process a session concurrently. You configure
partitions in the session properties based on source qualifiers. The
Informatica server reads, transforms, and writes partitions of data in
parallel for a single session. This is available for PowerCenter only.

The Informatica server creates two processes, the Load Manager process and
the DTM process, to run sessions.

Metadata Reporter: a web-based application used to run reports against
repository metadata.

You can copy sessions across folders and repositories using the Copy
Session Wizard in the Informatica Server Manager.

With new email variables, you can configure post-session email to include
information such as the mapping used during the session.

98. What is incremental aggregation?

When using incremental aggregation, you apply captured changes in the
source to the aggregate calculations in a session. If the source changes
only incrementally and you can capture the changes, you can configure the
session to process only those changes. This allows the Informatica server
to update your target incrementally, rather than forcing it to process the
entire source and recalculate the same calculations each time you run the
session.
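
The idea, expressed as a hedged SQL sketch (made-up table names; the
server actually keeps its running aggregates in cache files rather than in
a table):

    -- Fold today's delta into an already-aggregated target instead of
    -- re-aggregating the full source.
    UPDATE sales_summary t
    SET    t.total_amount = t.total_amount +
           (SELECT SUM(s.amount)
            FROM   sales_delta s
            WHERE  s.region = t.region)
    WHERE  EXISTS (SELECT 1
                   FROM   sales_delta s
                   WHERE  s.region = t.region);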

99. What are the scheduling options to run a session?

You can schedule a session to run at a given time or interval, or you can
run the session manually.

The different scheduling options are:
Run only on demand: the server runs the session only when the user starts
the session explicitly.
Run once: the Informatica server runs the session only once, at a
specified date and time.
Run every: the Informatica server runs the session at regular intervals,
as you configure.
Customized repeat: the Informatica server runs the session at the dates
and times specified in the Repeat dialog box.

100. What is a tracing level, and what are the types of tracing levels?

The tracing level represents the amount of information that the
Informatica server writes to a log file.
Types of tracing levels:
Normal
Verbose
Verbose init
Verbose data

101. What is the difference between the Stored Procedure transformation
and the External Procedure transformation?

With a Stored Procedure transformation, the procedure is compiled and
executed in a relational data source; you need a database connection to
import the stored procedure into your mapping. With an External Procedure
transformation, the procedure or function is executed outside of the data
source, i.e., you need to build it as a DLL to access it in your mapping.
No database connection is needed in the case of the External Procedure
transformation.

102. Explain recovering sessions?

If you stop a session or if an error causes a session to stop, refer to
the session and error logs to determine the cause of failure. Correct the
errors, and then complete the session. The method you use to complete the
session depends on the properties of the mapping, session, and Informatica
server configuration.
Use one of the following methods to complete the session:
Run the session again if the Informatica server has not issued a commit.
Truncate the target tables and run the session again if the session is not
recoverable.
Consider performing recovery if the Informatica server has issued at least
one commit.

103. If a session fails after loading 10,000 records into the target, how
can you load the records starting from the 10,001st record when you run
the session the next time?

As explained above, the Informatica server has three methods for
recovering sessions. Use perform recovery to load the records from where
the session failed.

104. Explain perform recovery?

When the Informatica server starts a recovery session, it reads the
OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to
the target database. The Informatica server then reads all sources again
and starts processing from the next row ID. For example, if the
Informatica server commits 10,000 rows before the session fails, when you
run recovery, the Informatica server bypasses the rows up to 10,000 and
starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica server setup.
You must enable recovery in the Informatica server setup before you run a
session so the Informatica server can create and/or write entries in the
OPB_SRVR_RECOVERY table.

105. How do you recover a standalone session?

A standalone session is a session that is not nested in a batch. If a
standalone session fails, you can run recovery using a menu command or
pmcmd. These options are not available for batched sessions.

To recover a session using the menu:
1. In the Server Manager, highlight the session you want to recover.
2. Select Server Requests-Stop from the menu.
3. With the failed session highlighted, select Server Requests-Start
Session in Recovery Mode from the menu.

To recover a session using pmcmd:
1. From the command line, stop the session.
2. From the command line, start recovery.

106. How can you recover a session in a sequential batch?

If you configure a session in a sequential batch to stop on failure, you
can run recovery starting with the failed session. The Informatica server
completes the session and then runs the rest of the batch. Use the Perform
Recovery session property.

To recover a session in a sequential batch configured to stop on failure:
1. In the Server Manager, open the session property sheet.
2. On the Log Files tab, select Perform Recovery, and click OK.
3. Run the session.
4. After the batch completes, open the session property sheet.
5. Clear Perform Recovery, and click OK.

If you do not clear Perform Recovery, the next time you run the session,
the Informatica server attempts to recover the previous session.
If you do not configure a session in a sequential batch to stop on
failure, and the remaining sessions in the batch complete, recover the
failed session as a standalone session.
107. How do you recover sessions in concurrent batches?

If multiple sessions in a concurrent batch fail, you might want to
truncate all targets and run the batch again. However, if a session in a
concurrent batch fails and the rest of the sessions complete successfully,
you can recover the session as a standalone session.
To recover a session in a concurrent batch:
1. Copy the failed session using Operations-Copy Session.
2. Drag the copied session outside the batch to make it a standalone
session.
3. Follow the steps to recover a standalone session.
4. Delete the standalone copy.

108. How can you complete unrecoverable sessions?

Under certain circumstances, when a session does not complete, you need to
truncate the target tables and run the session from the beginning. Run the
session from the beginning when the Informatica server cannot run recovery
or when running recovery might result in inconsistent data.

109. What are the circumstances under which the Informatica server
results in an unrecoverable session?

The Source Qualifier transformation does not use sorted ports.
You change the partition information after the initial session fails.
Perform Recovery is disabled in the Informatica server configuration.
The sources or targets change after the initial session fails.
The mapping contains a Sequence Generator or Normalizer transformation.
A concurrent batch contains multiple failed sessions.
110. If I make any modifications to my table in the back-end database, do
they reflect in the Informatica Warehouse Designer, Mapping Designer, or
Source Analyzer?

No. Informatica is not directly aware of the back-end database; it
displays only the information that is stored in the repository. If you
want back-end changes reflected on the Informatica screens, you have to
import the definitions again from the back end over a valid connection and
replace the existing definitions with the imported ones.

111. After dragging the ports of three sources (SQL Server, Oracle,
Informix) to a single Source Qualifier, can you map these three ports
directly to a target?

No. Unless and until you join those three ports in the Source Qualifier,
you cannot map them directly.

Data Warehouse Questions

1. What is the difference between a normal database and an enterprise
data warehouse database?
2. Why do you use warehouse tools? What are the advantages of ETL?
3. How do you identify the sources for designing the target database?
4. What is a logical model and a physical model?
5. What procedures do you follow to design the target data warehouse?
6. What is normalization? Explain normalization if you have ever done it.
7. What is the size of the data warehouse? How many records does it hold?
Ans: 1 terabyte; 900 million records.
8. What is a DW key?
9. What are Slowly Changing Dimensions?
10. How did you implement the Type 2 dimension in your project?

Data Modeling Interview Questions


1. What is the difference between a Star Schema and a Snowflake Schema?
2. What is Dimensional Data Modeling?
3. What is a data mart? What is the size of a data mart?
4. How do you increase the performance of star schemas?
5. What will you do if you have five fact tables and you need the target
as one big table?
6. What are FACT tables? How many FACT tables are there? How many records
were in your fact table?
7. How big is your dataset, and what is the size of the biggest table?
8. What are the headaches in accessing such big tables? How do you fix
them?
9. How many dimensions were there, and how many facts?
10. What was your role in the modeling part?
11. What were your sources exactly, such as the columns and what the
records contained?

General Questions
1. Tell me about yourself?
2. How many years of Oracle experience do you have?
3. How many years of warehouse tools experience?
4. How many years of Informatica experience?
5. Have you been involved in all the phases of the software lifecycle?
6. What have you done in the analysis and design phases of your recent
project?
7. How many years of data warehousing experience?
8. How do you mentor/train the employees?
9. What encouraged you to work on your recent project?
10. Did you run into any problems during the analysis phase?
11. Have you followed any naming standards?
12. What kind of documents have you prepared?
13. What was your responsibility in your recent project?
14. With which version did you start your career in DataStage?
15. What is your operating system environment? Is it NT or Unix?
16. Have you written any Unix shell scripts?
17. How many databases do you know?
18. Do you have any questions?
19. Can you explain the last two projects you have done?
20. What is your visa status?
21. How did you learn Informatica?
22. Do you drive?
23. When did your last project get over?
24. How did you commute to work?
25. Give me the functional objective of your last project.
26. How did you implement your latest project?
27. What phases did you go through in the project?

UNIX Shell Scripting Questions

1. Have you ever run the batches through command mode?
2. What is the difference between grep and egrep in Unix?
3. Have you used a CASE statement/condition in your projects? What is it?
4. Have you used parameters while running sessions, and where have you
used them?
5. What does the cut command do?
6. How do you connect to Oracle from Unix?
7. How do you execute SQL in shell scripts?
8. What is dot in Unix?
9. How can we invoke another shell from the present one?
10. What are sed, awk, and grep?
11. How can we see the return value of the previously processed command?
12. What is the difference between $# and $*?
13. When we log in to Unix, it should prompt us to go to a particular
shell. How do we specify this in the .profile?
14. What does 'find' do?
15. How can we redirect the output of a command to an error file?
16. How can you migrate flat file data to a database using a shell script?
17. How can you do pattern matching?
18. How can you check the predefined format of a file in your shell
script?
19. How are you receiving the files from the various vendors?
20. What is the command to check disk usage in Unix?
21. What is the command to check the disk usage of a particular directory?
22. What is the maximum number of lines you have coded in a shell
program?
23. How do you connect to Oracle from Unix? Answer: Your script needs to
set the ORACLE_HOME environment variable and add the $ORACLE_HOME/lib
directory to the LD_LIBRARY_PATH. Make sure you can connect to the Oracle
server from Solaris with SQL*Plus, make sure you have the Oracle listener
running, and make sure you have a tnsnames.ora file set up. For Oracle 7.3
it is in the /var/opt/oracle directory.
24. How do you execute SQL in shell scripts?
25. What is dot in Unix? Answer: Dot files are special configuration
files, which are "invisible" to a normal ls command. If you want to see
your dot files, you can use the command 'ls -a'. Try it out and see what's
really hiding in your home directory.
26. What are I/O operations in Unix? Ans: Read, Write
UNIX:
1. What command is used to type files to the screen?
   cat
2. What command is used to remove a file?
   rm
3. What is the purpose of the grep command?
   Search for a keyword or a pattern in a file.
4. What is redirection and how is it used?
   Sending output to another file. It can be done with the > operator
   (redirect standard output to a file), the < operator (redirect
   standard input from a file), or the >> operator (append standard
   output to a file).

PL/SQL

1. Describe the difference between a procedure, a function, and an
anonymous PL/SQL block.
Ans: Functions and procedures are subprograms that can be called, but a
function returns a value. If no header is specified for a PL/SQL block,
the block is said to be an anonymous PL/SQL block.
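
A small illustration of all three (a sketch; the names are made up):

    -- A function: a callable subprogram that returns a value.
    CREATE OR REPLACE FUNCTION double_it (p_n NUMBER) RETURN NUMBER IS
    BEGIN
      RETURN p_n * 2;
    END;
    /
    -- A procedure: callable, but it does not return a value.
    CREATE OR REPLACE PROCEDURE show_double (p_n NUMBER) IS
    BEGIN
      DBMS_OUTPUT.PUT_LINE(double_it(p_n));
    END;
    /
    -- An anonymous block: no header, not stored, run as-is.
    BEGIN
      show_double(21);
    END;
    /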

2. What is the difference between an inner join and an outer join?
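
An inner join returns only the rows that satisfy the join condition; an
outer join additionally returns the non-matching rows from one (or both)
tables, with NULLs for the missing side. A sketch in Oracle syntax
(made-up tables):

    -- Inner join: only employees that have a matching department.
    SELECT e.ename, d.dname
    FROM   emp e, dept d
    WHERE  e.deptno = d.deptno;

    -- Outer join: also returns departments with no employees.
    SELECT e.ename, d.dname
    FROM   emp e, dept d
    WHERE  e.deptno (+) = d.deptno;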

3. What is a Cartesian product?
Ans: When each row from one table is combined with each row from the other
table, the operation is called a Cartesian product. If one table has m
rows and the other has n rows, we get m * n rows.
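
For example (a sketch; with 14 rows in emp and 4 rows in dept, this query
returns 56 rows):

    -- A join with no join condition produces the Cartesian product.
    SELECT e.ename, d.dname
    FROM   emp e, dept d;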

4. What is a cursor?
Ans: A cursor is a variable whose declaration specifies a set of tuples
(as a query result) such that the tuples can be processed in a
tuple-oriented way (i.e., one row at a time) using the FETCH statement.
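
A minimal sketch of tuple-at-a-time processing with an explicit cursor
(made-up table):

    DECLARE
      CURSOR c_emp IS
        SELECT ename, sal FROM emp;      -- the query the cursor ranges over
      v_ename emp.ename%TYPE;
      v_sal   emp.sal%TYPE;
    BEGIN
      OPEN c_emp;
      LOOP
        FETCH c_emp INTO v_ename, v_sal; -- one row at a time
        EXIT WHEN c_emp%NOTFOUND;
        DBMS_OUTPUT.PUT_LINE(v_ename || ': ' || v_sal);
      END LOOP;
      CLOSE c_emp;
    END;
    /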

5. How can you find, within a PL/SQL block, whether a cursor is open?
Ans: Use the %ISOPEN cursor status variable.

6. What are the types of triggers?
Ans: Row-level triggers (defined using the clause FOR EACH ROW) and
statement-level triggers (if the clause is not given, the trigger is
assumed to be a statement-level trigger).

7. You want to determine the location of identical rows in a table before
attempting to place a unique index on the table. How can this be done?
SELECT loc
FROM dept
GROUP BY loc
HAVING COUNT(*) > 1

8. Give one method for transferring a table from one schema to another:
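
One standard method, shown as a hedged sketch (the schema and table names
are made up; the export/import utilities are another common option):

    -- Copy the table (structure and data) into another schema,
    -- assuming the needed privileges on both schemas.
    CREATE TABLE new_schema.dept AS
      SELECT * FROM old_schema.dept;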

9. How do you handle exceptions in PL/SQL?
Ans: Each error or warning during the execution of a PL/SQL block raises
an exception. They can be handled in two ways: system-defined exceptions
and user-defined exceptions. System-defined exceptions are raised
automatically whenever the corresponding errors or warnings occur;
user-defined exceptions must be raised explicitly in a sequence of
statements using RAISE <exception name>.
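
A short sketch showing both styles (made-up names):

    DECLARE
      e_too_small EXCEPTION;            -- user-defined exception
      v_sal NUMBER;
    BEGIN
      SELECT sal INTO v_sal FROM emp WHERE empno = 7369;
      IF v_sal < 1000 THEN
        RAISE e_too_small;              -- raised explicitly
      END IF;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN           -- system-defined, raised automatically
        DBMS_OUTPUT.PUT_LINE('No such employee');
      WHEN e_too_small THEN
        DBMS_OUTPUT.PUT_LINE('Salary below threshold');
    END;
    /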
Informatica
Designer Questions

1. Explain the complex mapping you did?

2. Can we use two tables from two different database instances of the
same database type in a Source Qualifier transformation? How will the
server know which table is coming from which database?
Yes.

3. What is the Warehouse Designer in Informatica?
A tool to import or create target definitions.

4. What are the transformations that you did not use?
External Stored Procedure transformation, Advanced External Stored
Procedure transformation, Transaction Control transformation, XML Source
Qualifier transformation, Advanced Source Qualifier transformation.

5. Can you delete rows from a target table using a mapping? How? What are
the settings in the session properties that you need to configure?
Yes.

6. What is the purpose of the Source Qualifier transformation?
It is used to filter and join source data.

7. What is a SQL override?
It overrides the default SQL query.

8. Every mapping should have at least three components. What are they?
Source, transformation, and target.

9. What is the difference between active transformations and passive
transformations?
An active transformation can change the number of rows that pass through
it; a passive transformation cannot.

10. What are the transformations that use a cache?
Aggregator, Lookup, Joiner, and Rank transformations.

11. What is the difference between the Filter and Router transformations?
A Filter transformation tests data for one condition and drops the rows
of data that do not meet the condition, whereas a Router transformation
tests data for one or more conditions and gives you the option to route
rows of data that do not meet any of the conditions to a default output
group.

12. What is the Lookup transformation? What is the difference between
connected and unconnected Lookup transformations?
Use a Lookup transformation in your mapping to look up data in a
relational table.

Connected lookup: receives input values directly from the pipeline.
Unconnected lookup: receives input values from the result of a :LKP
expression in another transformation.

Connected lookup: you can use a dynamic or static cache.
Unconnected lookup: you can use a static cache only.

Connected lookup: the cache includes all lookup columns used in the
mapping (that is, lookup table columns included in the lookup condition
and lookup table columns linked as output ports to other transformations).
Unconnected lookup: the cache includes all lookup/output ports in the
lookup condition and the lookup/return port.

Connected lookup: can return multiple columns from the same row, or
insert into the dynamic lookup cache.
Unconnected lookup: you designate one return port (R); it returns one
column from each row.

Connected lookup: if there is no match for the lookup condition, the
Informatica server returns the default value for all output ports. If you
configure dynamic caching, the Informatica server inserts the row into
the cache or leaves it unchanged.
Unconnected lookup: if there is no match for the lookup condition, the
Informatica server returns NULL.

Connected lookup: if there is a match for the lookup condition, the
Informatica server returns the result of the lookup condition for all
lookup/output ports. If you configure dynamic caching, the Informatica
server either updates the row in the cache or leaves the row unchanged.
Unconnected lookup: if there is a match for the lookup condition, the
Informatica server returns the result of the lookup condition into the
return port.

Connected lookup: passes multiple output values to another
transformation; you link lookup/output ports to other transformations.
Unconnected lookup: passes one output value to another transformation;
the lookup/output/return port passes the value to the transformation
calling the :LKP expression.

Connected lookup: supports user-defined default values.
Unconnected lookup: does not support user-defined default values.

13. Two advantages of unconnected lookup over
connected lookup?
It can be called multiple times in one mapping via the :LKP
expression, and the lookup runs only when the expression is
evaluated, so it can be performed conditionally. A common use
for unconnected Lookup transformations is to update slowly
changing dimension tables.

14. Can you tell one Scenario where you used lookup
transformation

15. How do you decide when to cache a lookup and when
not to?
Cache the lookup when there are too many source rows to afford
a database query for each one and the lookup table fits
comfortably in memory; leave it uncached when the lookup table
is very large and only a few lookups will be performed.

16. How do you know how much cache you will need?
What does it depend on?
17. What is dynamic lookup caching? What is the
transformation that I must be using in this mapping?
What are the session properties settings that I must be
using in this situation?
18. Did you create stored procedures and what exactly did
you write?
19. Difference between the connected and unconnected Stored
Procedure transformations?
20. How do you get the return value of an unconnected
transformation?

21. Which transformation do you use to calculate the
standard deviation across rows?
Aggregator transformation.

22. How can you improve the performance of an
Aggregator transformation?
Sorted input (which reduces the use of the aggregate
cache) is used to improve session performance.

23. What is the difference between sorted aggregation
and unsorted aggregation? What happens when I use
sorted aggregation?

24. How do you update a row in a target table? How do you
update a row in the target table if you don't have any primary
key in the target table?
25. A particular mapping's source is in a database
schema. The schema is changed and even the columns
in the source are changed. How can we run the same
mapping?
26. What is a Mapplet?
27. Can I have a mapplet in a mapping with no inputs
passing into the mapplet transformation?
28. Is a mapplet an active transformation or a passive
transformation?
29. Can you have a mapplet inside another mapplet?
30. What are Pre-session and Post-session Options?
31. What is the Normalizer transformation?
32. Can you use the Normalizer for both normalizing and
de-normalizing a record?
33. What are the various dd_ commands? In which
transformation do you give them, and where do you give
dd_insert, dd_update and dd_delete? What are the
different Update Strategies?
34. In Sequence Generator, what happens when
NEXTVAL is connected and CURRVAL is not connected? What
happens if it is the reverse?
35. Can we send data from Normalizer and Filter to an
Expression transformation?
36. Explain this transformation:
IIF(ISNULL(C), IIF(UPPER(A)='SALARIED', 1, IIF(UPPER(B)='HOURLY', 2, 3)), 4)
where 'A' is not equal to 'SALARIED', B is not equal to
'HOURLY', and C is NULL. What is the o/p of this
transformation?
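Working through it: C is NULL, so ISNULL(C) is true and the outer
IIF takes its first branch; UPPER(A) <> 'SALARIED', so evaluation
falls through to the inner IIF; UPPER(B) <> 'HOURLY', so the final
else value is returned. Output: 3.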
37. I have a sequence "Y" which gets incremented
whenever data gets loaded into table "X". How do I
set "Y" to zero before the data gets loaded into table
"X" next time? Ans: check the "RESET" option in the
Sequence Generator properties.

38. IIF(ISNULL(A), NULL, IIF(ISNULL(B), 4, IIF(D='1', 0, -1)))
a. If D = 1 and B = null and A = not null
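Working through that case: A is not null, so the outer IIF takes
its false branch; B is null, so the middle IIF returns 4. Output: 4.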
39. Did you do Error Handling? (Null Handling?)
40. How do we migrate the Mappings and
Sessions from Development to QA or Testing?
41. For what Transformations do the Sessions run
slowly? If so, how do you fix them?
42. What is pmcmd?
43. Which Transformation is used to join
heterogeneous sources residing at different locations or
File Systems?
44. How do you truncate a table? What about the
truncate option on the target settings?
45. What are target load strategies? What were you
using in your latest project?
46. What is a Router transformation?
47. Which tool do you use to perform unlocks? Repository
Manager
48. Added features of Informatica 6.0 Designer
49. What is a worklet?
50. Difference between using a Joiner transformation
and SQL with multiple joins? Which do you prefer?
51. What does the Normalizer transformation do?
52. Which transformations should not be used in Mapplets?
53. What is the pmrep command?
54. Write the syntax for an unconnected Lookup where the
lookup name is "SATYAM" and two values are to be
passed, "SATYAM COMPUTERS" and "STC".
55. IIF (ISNULL (A), DD_INSERT, DD_UPDATE), what
is the O/P?
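For 55: each row where A is NULL is flagged DD_INSERT (numeric
value 0); every other row is flagged DD_UPDATE (numeric value 1).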
56. How do you run the server on UNIX machines? Ans:
Using the 'pmcmd' command.
57. You usually get flat files from legacy systems. They
can be joined with tables from relational sources using a
Joiner transformation.
58. Have you used FTP connections? Ans: Yes, we
used to get flat files from legacy systems. We used to
create FTP connections to predefined paths on remote
systems, and when the session is run, Informatica gets
the file from the remote system.
59. The order in which the Informatica server sends
records to various target definitions in the mapping is
known as?
60. What properties should be there for the shared
folder (shortcuts)?

Repository and Server Questions


1. What is Code Page?
2. How do you debug a Session ?
3. Purpose of Pre and Post Session?
4. Purpose of file wait in Pre Session
5. When loading data to a relational table, there are several
Target Options to choose from in Server Manager relating to
updates: Update (as Update), Update (as Insert), and
Update (else Insert). What is the difference between these
three Target Options?
Ans
Update (as Update): the server tries to perform an update on the
rows flagged for update.
Update (as Insert): the server performs an insert for the rows
flagged for update. This can be handy when you have two targets
in a mapping, one that you only want to do updates to and one
that you only want to do inserts to, and you don't want to use an
Update Strategy. You can just leave the session "treat rows as"
setting on Update and change the setting on the target on which
you want to do inserts to 'update as insert'.
Update (else Insert): for rows flagged as update, the server tries
to perform an update; if the row is not there, an insert is
attempted. Note that you have to have the Insert box checked on
the target table as well to allow the inserts to occur.

6. I need to know how I can execute an SQL command
from the Pre-Session commands dialogue box. Ans)
Execute the SQL from a script or stored procedure.
7. What are two Modes of Data Movement that Informatica
Server supports?
Ans. ASCII and Unicode
8. The Repository user profile is the same as database
user profile.
Ans False.
9. Child process that Load Manager creates when it
receives a request to start a session? Ans. Data
Transformation Manager
10. Processes created by Data Transformation Manager?
Ans. Reader and Writer
11. Can we stop the session in a concurrent batch?
Ans: Yes. Stop the session through the Server
Manager or by using the command line program pmcmd.
You can issue the stop or abort command in the Server
Manager at any point during the session run.

Performance Questions
1. What are the performance issues you have come across? And
how did you handle them?
2. How can you increase the performance at the mapping level?
3. Performance tuning at session level.
4. What is Repository tuning?
5. If a session fails, who are all the people to whom you were
sending the email? And how did you do that?
6. How do you improve query performance?

General Informatica Questions

1. Difference between PowerMart and PowerCenter and explain
their architecture?
2. How many mappings did you create?
3. Did you deal with any legacy systems?
4. What are the enhancements in Informatica 6.x?
5. How can Informatica be used as a data warehousing tool?
6. Which versions have you worked with and what are the
differences?
7. What are the enhancements from a previous version of
Informatica, like 4.7 to 5.1.1? Ans: Router transformation and
the debugger were introduced. Aggregator performance was
enhanced.
8. What were the problems you faced while upgrading from 5.1.1
to 6.0?
9. What are the differences between 5.1 and 6.1?
10. How many mapplets you have created?
11. Did you do parallel ETL?
12. Have you used multiple registered servers?
13. How many records are transferred from Source to
Target?
14. How many Informatica Servers?
15. How do you get flat files from Source tables?
16. Have you worked with DB2?
17. Tell me the problems you faced in your last project?
18. What is PowerConnect? Ans: PowerConnect is an
option that is available for PowerCenter to connect to IBM
DB2, SAP and PeopleSoft.
Data Handling Questions

1. How did you handle the rejected data? Ans: Open the log file
and rejected file and analyze the reason for rejection of each
row, then modify the data in the rejected file and, using the
reject load utility, reload the data into the target tables.
2. In how many ways can you load the target data?
3. Can we create a target table dynamically? How?
4. How can we use the same mapping for extracting data from a
source which comes with a different name every week, without
modifying the mapping?
5. What is the difference between bulk load and normal load?
6. There are three Targets "X","Y","Z" in a mapping. How do I
look at the mapping only for target "X", without having "Y" and
"Z" on the screen? Ans) Select "Layout" from the toolbar, go
to the option "Arrange", and select the target "X".
7. There are three Targets "X","Y","Z" in a mapping. How
do I set the load sequence so that the data gets loaded for
"X", then "Y" and then "Z"? Ans: Go to "Mappings" in the
toolbar and then select Target Load Plan.
8. How did you handle the data errors, say bad data?
9. How do you do a Test Load?
10. If there is no primary key on the Target Table, can we
update the Target Table? Ans: No
11. What are the different sources that Informatica can handle?
12. When can we run a Stored Procedure?
Ans:
a. Normal - when the stored procedure is supposed to be
executed after each and every row of data.
b. Pre-Load of the Source - before the session retrieves
the data from the source.
c. Post-Load of the Source - after the session retrieves the
data from the source.
d. Pre-Load of the Target - before the session sends the data
to the target.
e. Post-Load of the Target - after the session sends the data
to the target.

Data WareHouse Interview Questions

Informatica Group URL for Real Time Problems:
http://groups.yahoo.com/group/informaticadevelopment/

Ralph Kimball URLs:
http://www.dbmsmag.com/9612d05.html
http://www.dbmsmag.com/9701d05.html
STAR - SCHEMA :-
http://www.starlab.vub.ac.be/staff/robert/Information
%20Systems/Halpin%203rd%20ed/Infosys%20Ch1.pdf
ODS Design:-
http://www.compaq.nl/products/servers/alphaserver/pdf/SPD-ODS
%20Service-V1.0.pdf
http://www.intelligententerprise.com/010613/warehouse1_1.shtml?database

1. Can 2 Fact Tables share the same Dimension Tables? How many
Dimension tables are associated with one Fact Table in your project?
Ans: Yes

2. What are ROLAP, MOLAP, and DOLAP?

Ans: ROLAP (Relational OLAP), MOLAP (Multidimensional OLAP), and
DOLAP (Desktop OLAP). In these three OLAP architectures, the
interface to the analytic layer is typically the same; what is
quite different is how the data is physically stored.
In MOLAP, the premise is that online analytical processing is best
implemented by storing the data multidimensionally; that is,
data must be stored multidimensionally in order to be viewed in a
multidimensional manner.
In ROLAP, architects believe the data should be stored in the
relational model; that is, OLAP capabilities are best provided
against the relational database.
DOLAP is a variation that exists to provide portability for the OLAP
user. It creates multidimensional datasets that can be
transferred from server to desktop, requiring only the DOLAP software
to exist on the target system. This provides significant
advantages to portable computer users, such as salespeople who are
frequently on the road and do not have direct access to
their office server.

3. What is an MDDB? And what is the difference between MDDBs and
RDBMSs?
Ans: Multidimensional Database. There are two primary technologies
that are used for storing the data used in OLAP applications:
multidimensional databases (MDDB) and relational databases (RDBMS).
The major difference between MDDBs and RDBMSs is in how they store
data. Relational databases store their data in a series of tables
and columns. Multidimensional databases, on the other hand, store
their data in large multidimensional arrays.
For example, in an MDDB world, you might refer to a sales figure as
Sales with Date, Product, and Location coordinates of
12-1-2001, Car, and South, respectively.

Advantages of MDDB:
Retrieval is very fast because
 The data corresponding to any combination of dimension
members can be retrieved with a single I/O.
 Data is clustered compactly in a multidimensional array.
 Values are calculated ahead of time.
 The index is small and can therefore usually reside completely
in memory.
Storage is very efficient because
 The blocks contain only data.
 A single index locates the block corresponding to a
combination of sparse dimension numbers.

4. What is MDB modeling and RDB Modeling?


Ans:

5. What is a Mapplet and how do you create a Mapplet?

Ans: A mapplet is a reusable object that represents a set of
transformations. It allows you to reuse transformation logic and can
contain as many transformations as you need.
Create a mapplet when you want to use a standardized set of
transformation logic in several mappings. For example, if you
have several fact tables that require a series of dimension keys, you
can create a mapplet containing a series of Lookup
transformations to find each dimension key. You can then use the
mapplet in each fact table mapping, rather than recreate the
same lookup logic in each mapping.
To create a new mapplet:
1. In the Mapplet Designer, choose Mapplets-Create Mapplet.
2. Enter a descriptive mapplet name.
The recommended naming convention for mapplets is
mpltMappletName.
3. Click OK.
The Mapping Designer creates a new mapplet in the Mapplet
Designer.
4. Choose Repository-Save.

6. What are transformations used for?

Ans: Transformation is the manipulation of data from how it appears in
the source system(s) into another form in the data
warehouse or mart in a way that enhances or simplifies its meaning.
In short, you transform data into information.

This includes data merging, cleansing, and aggregation:

Data merging: the process of standardizing data types and fields.
Suppose one source system calls integer type data smallint
whereas another calls similar data decimal. The data from the two
source systems needs to be rationalized when moved into
the Oracle data format called number.
Cleansing: this involves identifying and changing inconsistencies
or inaccuracies.
- Eliminating inconsistencies in the data from multiple sources.
- Converting data from different systems into a single consistent data
set suitable for analysis.
- Meeting a standard for establishing data elements, codes, domains,
formats and naming conventions.
- Correcting data errors and filling in missing data values.
Aggregation: the process whereby multiple detailed values are
combined into a single summary value, typically summation numbers
representing dollars spent or units sold.
- Generate summarized data for use in aggregate fact and dimension
tables.

Data transformation is an interesting concept in that some
transformation can occur during the "extract", some during the
"transform", or even, in limited cases, during the "load" portion of
the ETL process. The type of transformation function you
need will most often determine where it should be performed. Some
transformation functions could even be performed in more
than one place, because many of the transformations you will want to
perform already exist in some form or another in more than
one of the three environments (source database or application, ETL
tool, or the target db).

7. What is the difference between OLTP & OLAP?

Ans: OLTP stands for Online Transaction Processing. This is a standard,
normalized database structure. OLTP is designed for
transactions, which means that inserts, updates, and deletes must be
fast. Imagine a call center that takes orders. Call takers are continually
taking calls and entering orders that may contain numerous items. Each
order and each item must be inserted into a database. Since the
performance of the database is critical, we want to maximize the speed of
inserts (and updates and deletes). To maximize performance, we
typically try to hold as few records in the database as possible.

OLAP stands for Online Analytical Processing. OLAP is a term that
means many things to many people. Here, we will use the terms OLAP
and Star Schema pretty much interchangeably. We will assume that a
star schema database is an OLAP system. (This is not the same
thing that Microsoft calls OLAP; they extend OLAP to mean the cube
structures built using their product, OLAP Services.) Here, we will
assume that any system of read-only, historical, aggregated data is
an OLAP system.

A data warehouse (or mart) is a way of storing data for later retrieval.
This retrieval is almost always used to support decision-making in the
organization. That is why many data warehouses are considered to be
DSS (Decision-Support Systems).

Both a data warehouse and a data mart are storage mechanisms for
read-only, historical, aggregated data.
By read-only, we mean that the person looking at the data won't be
changing it. If a user looks at the sales for yesterday for a certain
product, they should not have the ability to change that number.

The "historical" part may just be a few minutes old, but usually it is at
least a day old. A data warehouse usually holds data that goes back a
certain period in time, such as five years. In contrast, standard OLTP
systems usually only hold data as long as it is "current" or active. An
order table, for example, may move orders to an archive table once they
have been completed, shipped, and received by the customer.

When we say that data warehouses and data marts hold aggregated
data, we need to stress that there are many levels of aggregation in a
typical data warehouse.

8. If the data source is in the form of an Excel spreadsheet, then how
do you use it?
Ans: PowerMart and PowerCenter treat a Microsoft Excel source as a
relational database, not a flat file. Like relational sources,
the Designer uses ODBC to import a Microsoft Excel source. You do
not need database permissions to import Microsoft
Excel sources.
To import an Excel source definition, you need to complete the
following tasks:
 Install the Microsoft Excel ODBC driver on your system.
 Create a Microsoft Excel ODBC data source for each source file in
the ODBC 32-bit Administrator.
 Prepare Microsoft Excel spreadsheets by defining ranges and
formatting columns of numeric data.
 Import the source definitions in the Designer.
Once you define ranges and format cells, you can import the ranges in
the Designer. Ranges display as source definitions
when you import the source.

9. Which db is an RDBMS and which is an MDDB, can you name them?

Ans: MDDB ex. Oracle Express Server (OES), Essbase by Hyperion
Software, PowerPlay by Cognos; and
RDBMS ex. Oracle, SQL Server, etc.

10. What are the modules/tools in Business Objects? Explain their
purpose briefly.

Ans: BO Designer, Business Query for Excel, BO Reporter,
InfoView, Explorer, WebIntelligence, BO Publisher, Broadcast Agent,
and BO ZABO.
InfoView: IT portal entry into WebIntelligence & Business Objects.
Base module required for all options to view and refresh
reports.
Reporter: upgrade to create/modify reports on LAN or Web.
Explorer: upgrade to perform OLAP processing on LAN or Web.
Designer: creates a semantic layer between user and database.
Supervisor: administers and controls access for groups of users.
WebIntelligence: integrated query, reporting, and OLAP analysis
over the Web.
Broadcast Agent: used to schedule, run, publish, push, and
broadcast pre-built reports and spreadsheets, including event
notification and response capabilities, event
filtering, and calendar based notification, over the LAN, e-mail,
pager, fax, Personal Digital Assistant (PDA),
Short Messaging Service (SMS), etc.
Set Analyzer: applies set-based analysis to perform functions such
as exclusions, intersections, unions, and overlaps
visually.
Developer Suite: build packaged, analytical, or customized apps.

11. What are Ad hoc queries and Canned Queries/Reports? And how do
you create them?
(Plz check this page: C:\BObjects\Quries\Data Warehouse - About
Queries.htm)
Ans: The data warehouse will contain two types of query. There will be
fixed queries that are clearly defined and well understood, such as
regular reports, canned queries (standard reports) and common
aggregations. There will also be ad hoc queries that are
unpredictable, both in quantity and frequency.
Ad Hoc Query: Ad hoc queries are the starting point for any analysis
into a database. Any business analyst wants to know what is inside the
database. He then proceeds by calculating totals, averages, maximum
and minimum values for most attributes within the database. These are
the unpredictable element of a data warehouse. It is exactly that ability
to run any query when desired and expect a reasonable response that
makes the data warehouse worthwhile, and makes the design such a
significant challenge.
The end-user access tools are capable of automatically generating the
database query that answers any question posed by the user. The user
will typically pose questions in terms that they are familiar with (for
example, sales by store last week); this is converted into the database
query by the access tool, which is aware of the structure of information
within the data warehouse.

Canned queries: Canned queries are predefined queries. In most
instances, canned queries contain prompts that allow you to customize
the query for your specific needs. For example, a prompt may ask you
for a school, department, term, or section ID. In this instance you would
enter the name of the school, department or term, and the query will
retrieve the specified data from the warehouse. You can measure the
resource requirements of these queries, and the results can be used for
capacity planning and for database design.

The main reason for using a canned query or report rather than creating
your own is that your chances of misinterpreting data or getting the
wrong answer are reduced. You are assured of getting the right data and
the right answer.

12. How many Fact tables and how many dimension tables u did? Which
table precedes what?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp

13. What is the difference between STAR SCHEMA & SNOW FLAKE
SCHEMA?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp
14. Why did you choose STAR SCHEMA only? What are the benefits
of STAR SCHEMA?
Ans: Because of its denormalized structure, i.e., the Dimension Tables
are denormalized. Why denormalize? The first (and often
only) answer is: speed. An OLTP structure is designed for data inserts,
updates, and deletes, but not data retrieval. Therefore,
we can often squeeze some speed out of it by denormalizing some of
the tables and having queries go against fewer tables.
These queries are faster because they perform fewer joins to retrieve
the same recordset. Joins are also confusing to many
end users. By denormalizing, we can present the user with a view of
the data that is far easier for them to understand.

Benefits of STAR SCHEMA:
 Far fewer tables.
 Designed for analysis across time.
 Simplifies joins.
 Less database space.
 Supports "drilling" in reports.
 Flexibility to meet business and technical needs.
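For illustration, a sketch of a typical star-schema query (sales_fact,
product_dim and time_dim are hypothetical names): the fact table joins
directly to each dimension it needs, with no chains of intermediate
joins:

SELECT t.year, p.product_name, SUM(f.sales_amount)
FROM sales_fact f, product_dim p, time_dim t
WHERE f.product_key = p.product_key
AND f.time_key = t.time_key
GROUP BY t.year, p.product_name;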

15. How do you load the data using Informatica?

Ans: Using a session.

16. (i) What is FTP? (ii) How do you connect to a remote machine?
(iii) Is there another way to use FTP without a special utility?
Ans: (i): The FTP (File Transfer Protocol) utility program is commonly
used for copying files to and from other computers. These
computers may be at the same site or at different sites thousands of
miles apart. FTP is a general protocol that works on UNIX
systems as well as other non-UNIX systems.

(ii): Remote connect commands:

ftp machinename
ex: ftp 129.82.45.181 or ftp iesg
If the remote machine has been reached successfully, FTP responds
by asking for a login name and password. When you enter
your own login name and password for the remote machine, it returns
the prompt like below
ftp>
and permits you access to your own home directory on the remote
machine. You should be able to move around in your own directory
and to copy files to and from your local machine using the FTP
interface commands.
Note: You can set the mode of file transfer to ASCII (the default;
transmits seven bits per character).
Use the ASCII mode with any of the following:
- Raw data (e.g. *.dat or *.txt, codebooks, or other plain text
documents)
- SPSS portable files.
- HTML files.
Set the mode of file transfer to Binary for other files (the binary
mode transmits all eight bits per byte and thus provides less chance
of a transmission error, and must be used to transmit files other than
ASCII files).
For example, use binary mode for the following types of files:
- SPSS system files
- SAS datasets
- Graphic files (e.g., *.gif, *.jpg, *.bmp, etc.)
- Microsoft Office documents (*.doc, *.xls, etc.)

(iii): Yes. If you are using Windows, you can access a text-based FTP
utility from a DOS prompt.
To do this, perform the following steps:
1. From the Start menu, choose Programs, then MS-DOS Prompt.
2. Enter "ftp ftp.geocities.com". A prompt will appear.
(or)
Enter ftp to get the ftp prompt, then ftp> open hostname,
ex. ftp> open ftp.geocities.com (it connects to the specified host).
3. Enter your Yahoo! GeoCities member name.
4. Enter your Yahoo! GeoCities password.
You can now use standard FTP commands to manage the files in your
Yahoo! GeoCities directory.

17. What cmd is used to transfer multiple files at a time using FTP?

Ans: mget ==> copies multiple files from the remote machine to the local
machine. You will be prompted for a y/n answer before
transferring each file; mget * copies all files in the current
remote directory to your current local directory,
using the same file names.
mput ==> copies multiple files from the local machine to the
remote machine.

18. What is a Filter Transformation? Or what options do you have in a
Filter Transformation?
Ans: The Filter transformation provides the means for filtering
records in a mapping. You pass all the rows from a source
transformation through the Filter transformation, then enter a
filter condition for the transformation. All ports in a Filter
transformation are input/output, and only records that meet the
condition pass through the Filter transformation.
Note: Discarded rows do not appear in the session log or
reject files.
To maximize session performance, include the Filter
transformation as close to the sources in the mapping as
possible. Rather than passing records you plan to discard through
the mapping, you filter out unwanted data early in the
flow of data from sources to targets.

You cannot concatenate ports from more than one transformation
into the Filter transformation; the input ports for the filter
must come from a single transformation. Filter transformations
exist within the flow of the mapping and cannot be
unconnected. The Filter transformation does not allow setting
output default values.
19. What are the default sources supported by Informatica PowerMart?
Ans:
 Relational tables, views, and synonyms.
 Fixed-width and delimited flat files that do not contain binary data.
 COBOL files.

20. When do you create the Source Definition? Can I use this Source
Definition in any Transformation?
Ans: When working with a file that contains fixed-width binary data,
you must create the source definition.
The Designer displays the source definition as a table,
consisting of names, datatypes, and constraints. To use a source
definition in a mapping, connect a source definition to a
Source Qualifier or Normalizer transformation. The Informatica
Server uses these transformations to read the source data.

21. What are Active & Passive Transformations?

Ans: Transformations can be active or passive. An active transformation
can change the number of records passed through it. A
passive transformation never changes the record count. For
example, the Filter transformation removes rows that do not
meet the filter condition defined in the transformation.

Active transformations that might change the record count
include the following:
 Advanced External Procedure
 Aggregator
 Filter
 Joiner
 Normalizer
 Rank
 Source Qualifier
Note: If you use PowerConnect to access ERP sources, the ERP
Source Qualifier is also an active transformation.
You can connect only one of these active transformations to
the same transformation or target, since the Informatica
Server cannot determine how to concatenate data from different sets
of records with different numbers of rows.
Passive transformations that never change the record count
include the following:
 Lookup
 Expression
 External Procedure
 Sequence Generator
 Stored Procedure
 Update Strategy

You can connect any number of these passive transformations, or
connect one active transformation with any number of
passive transformations, to the same transformation or target.

22. What is staging Area and Work Area?


Ans: Staging Area : -
- Holding Tables on DW Server.
- Loaded from Extract Process
- Input for Integration/Transformation
- May function as Work Areas
- Output to a work area or Fact Table
Work Area: -
- Temporary Tables
- Memory
23. What is Metadata? (plz refer DATA WHING IN THE REAL WORLD
BOOK page # 125)
Ans: Defn: “Data About Data”
Metadata contains descriptive data for end users. In a data
warehouse the term metadata is used in a number of different
situations.
Metadata is used for:
 Data transformation and load
 Data management
 Query management
Data transformation and load:
Metadata may be used during data transformation and load to describe
the source data and any changes that need to be made. The advantage
of storing metadata about the data being transformed is that as source
data changes, the changes can be captured in the metadata, and
transformation programs automatically regenerated.
For each source data field the following information is required:
Source Field:
 Unique identifier (to avoid any confusion occurring between 2 fields
of the same name from different sources).
 Name (local field name).
 Type (storage type of data, like character, integer, floating point,
and so on).
 Location
- system (system it comes from, ex. Accounting system).
- object (object that contains it, ex. Account table).
The destination field needs to be described in a similar way to
the source:
Destination:
 Unique identifier
 Name
 Type (database data type, such as Char, Varchar, Number and so
on).
 Tablename (name of the table the field will be part of).
The other information that needs to be stored is the
transformation or transformations that need to be applied to turn
the source data into the destination data:
Transformation:
 Transformation(s)
- Name
- Language (name of the language the transformation is written
in).
- Module name
- Syntax
The Name is the unique identifier that differentiates this from any
other similar transformations.
The Language attribute contains the name of the language that the
transformation is written in.
The other attributes are module name and syntax. Generally these
will be mutually exclusive, with only one being defined. For simple
transformations such as simple SQL functions the syntax will be
stored. For complex transformations the name of the module that
contains the code is stored instead.
Data management:
Metadata is required to describe the data as it resides in the data
warehouse. This is needed by the warehouse manager to allow it to
track and control all data movements. Every object in the database
needs to be described.
Metadata is needed for all the following:
 Tables
- Columns
- name
- type
 Indexes
- Columns
- name
- type
 Views
- Columns
- name
- type
 Constraints
- name
- type
- table
- columns
Aggregation and partition information also need to be stored in
metadata (for details refer page # 30).
Query generation:
Metadata is also required by the query manager to enable it to generate
queries. The same metadata that is used by the warehouse manager to
describe the data in the data warehouse is also required by the query
manager.
The query manager will also generate metadata about the queries it has
run. This metadata can be used to build a history of all queries run and
generate a query profile for each user, group of users and the data
warehouse as a whole.
The metadata that is required for each query is:
- query
- tables accessed
- columns accessed
- name
- reference identifier
- restrictions applied
- column name
- table name
- reference identifier
- restriction
- join criteria applied
- aggregate functions used
- group by criteria
- sort criteria
- syntax
- execution plan
- resources

24. What kind of Unix flavours are you experienced with?

Ans: Solaris 2.5 / SunOS 5.5 (operating system)
Solaris 2.6 / SunOS 5.6 (operating system)
Solaris 2.8 / SunOS 5.8 (operating system)
AIX 4.0.3

SunOS   Solaris   Released   Platforms
5.5.1   2.5.1     May 96     sun4c, sun4m, sun4d, sun4u, x86, ppc
5.6     2.6       Aug. 97    sun4c, sun4m, sun4d, sun4u, x86
5.7     7         Oct. 98    sun4c, sun4m, sun4d, sun4u, x86
5.8     8         2000       sun4m, sun4d, sun4u, x86

25. What are the tasks that are done by Informatica Server?
Ans:The Informatica Server performs the following tasks:
 Manages the scheduling and execution of sessions and batches
 Executes sessions and batches
 Verifies permissions and privileges
 Interacts with the Server Manager and pmcmd.
The Informatica Server moves data from sources to targets based
on metadata stored in a repository. For instructions on how to
move and transform data, the Informatica Server reads a mapping
(a type of metadata that includes transformations and source and
target definitions). Each mapping uses a session to define
additional information and to optionally override mapping-level
options. You can group multiple sessions to run as a single unit,
known as a batch.
26. What are the two programs that communicate with the Informatica
Server?
Ans: Informatica provides Server Manager and pmcmd programs to
communicate with the Informatica Server:
Server Manager. A client application used to create and manage
sessions and batches, and to monitor and stop the Informatica Server.
You can use information provided through the Server Manager to
troubleshoot sessions and improve session performance.
pmcmd. A command-line program that allows you to start and stop
sessions and batches, stop the Informatica Server, and verify if the
Informatica Server is running.
27. When do you reinitialize the Aggregate Cache?
Ans: Reinitializing the aggregate cache overwrites historical aggregate
data with new aggregate data. When you reinitialize the
aggregate cache, instead of using the captured changes in source
tables, you typically need to use the entire source table.
For example, you can reinitialize the aggregate cache if the source
for a session changes incrementally every day and
completely changes once a month. When you receive the new
monthly source, you might configure the session to reinitialize
the aggregate cache, truncate the existing target, and use the new
source table during the session.

/? Note: To be clarified when Server Manager works for the following ?/
To reinitialize the aggregate cache:
1. In the Server Manager, open the session property sheet.
2. Click the Transformations tab.
3. Check Reinitialize Aggregate Cache.
4. Click OK three times to save your changes.
5. Run the session.

The Informatica Server creates a new aggregate cache, overwriting the
existing aggregate cache.
/? To be checked for step 6 & step 7 after a successful run of the
session ?/

6. After running the session, open the property sheet again.
7. Click the Data tab.
8. Clear Reinitialize Aggregate Cache.
9. Click OK.

28. (i) What is Target Load Order in Designer?
Ans: Target Load Order: In the Designer, you can set the order in which
the Informatica Server sends records to various target definitions in a
mapping. This feature is crucial if you want to maintain referential integrity
when inserting, deleting, or updating records in tables that have the primary
key and foreign key constraints applied to them. The Informatica Server
writes data to all the targets connected to the same Source Qualifier or
Normalizer simultaneously, to maximize performance.

28. (ii) What are the minimum conditions that you need to have so as to
use the Target Load Order option in Designer?
Ans: You need to have multiple Source Qualifier transformations.
To specify the order in which the Informatica Server sends data to
targets, create one Source Qualifier or Normalizer transformation for each
target within a mapping. To set the target load order, you then determine the
order in which each
Source Qualifier sends data to connected targets in the
mapping.
When a mapping includes a Joiner transformation, the
Informatica Server sends all records to targets connected to that
Joiner at the same time, regardless of the target load order.

28(iii). How do you set the Target load order?

Ans: To set the target load order:
1. Create a mapping that contains multiple Source Qualifier
transformations.
2. After you complete the mapping, choose Mappings-Target Load
Plan.
A dialog box lists all Source Qualifier transformations in the
mapping, as well as the targets that receive data from each Source
Qualifier.
3. Select a Source Qualifier from the list.
4. Click the Up and Down buttons to move the Source Qualifier within
the load order.
5. Repeat steps 3 and 4 for any other Source Qualifiers you wish to
reorder.
6. Click OK and Choose Repository-Save.

29. What can you do with the Repository Manager?

Ans: We can do the following tasks using the Repository Manager:
 To create usernames, you must have one of the following sets of
privileges:
- Administer Repository privilege
- Super User privilege
 To create a user group, you must have one of the following
privileges:
- Administer Repository privilege
- Super User privilege
 To assign or revoke privileges, you must have one of the
following privileges:
- Administer Repository privilege
- Super User privilege
Note: You cannot change the privileges of the default user groups or
the default repository users.

30. What can you do with the Designer?

Ans: The Designer client application provides five tools to help you
create mappings:
Source Analyzer. Use to import or create source definitions for flat
file, Cobol, ERP, and relational sources.
Warehouse Designer. Use to import or create target definitions.
Transformation Developer. Use to create reusable transformations.
Mapplet Designer. Use to create mapplets.
Mapping Designer. Use to create mappings.
Note:The Designer allows you to work with multiple tools at one time.
You can also work in multiple folders and repositories

31. What are the different types of Tracing Levels you have in
Transformations?
Ans: Tracing levels in transformations:
Terse - Indicates when the Informatica Server initializes
the session and its components. Summarizes session results,
but not at the level of individual records.
Normal - Includes initialization information as well as error
messages and notification of rejected data.
Verbose initialization - Includes all information provided with the
Normal setting plus more extensive information about
initializing transformations in the session.
Verbose data - Includes all information provided with the Verbose
initialization setting.

Note: By default, the tracing level for every transformation is Normal.

To add a slight performance boost, you can also set the tracing level to
Terse, writing the minimum of detail to the session log
when running a session containing the transformation.

31(i). What is the difference between a database, a data warehouse
and a data mart?
Ans: -- A database is an organized collection of information.
-- A data warehouse is a very large database with special sets
of tools to extract and cleanse data from operational systems and to
analyze data.
-- A data mart is a focused subset of a data warehouse that
deals with a single area of data and is organized for quick analysis.
32. What are Data Mart, Data Warehouse and Decision Support
System? Explain briefly.
Ans: Data Mart:
A data mart is a repository of data gathered from operational data and
other sources that is designed to serve a particular community of
knowledge workers. In scope, the data may derive from an enterprise-wide
database or data warehouse or be more specialized. The emphasis of a
data mart is on meeting the specific demands of a particular group of
knowledge users in terms of analysis, content, presentation, and
ease-of-use. Users of a data mart can expect to have data presented in
terms that are familiar.
In practice, the terms data mart and data warehouse each tend to imply
the presence of the other in some form. However, most writers using the
term seem to agree that the design of a data mart tends to start
from an analysis of user needs and that a data warehouse tends
to start from an analysis of what data already exists and how it
can be collected in such a way that the data can later be used. A
data warehouse is a central aggregation of data (which can be
distributed physically); a data mart is a data repository that may derive
from a data warehouse or not and that emphasizes ease of access and
usability for a particular designed purpose. In general, a data warehouse
tends to be a strategic but somewhat unfinished concept; a data mart
tends to be tactical and aimed at meeting an immediate need.

Data Warehouse:
A data warehouse is a central repository for all or significant parts of
the data that an enterprise's various business systems collect. The term
was coined by W. H. Inmon. IBM sometimes uses the term
"information warehouse."
Typically, a data warehouse is housed on an enterprise mainframe
server. Data from various online transaction processing (OLTP)
applications and other sources is selectively extracted and organized on
the data warehouse database for use by analytical applications and user
queries. Data warehousing emphasizes the capture of data from
diverse sources for useful analysis and access, but does not generally
start from the point-of-view of the end user or knowledge worker who
may need access to specialized, sometimes local databases. The latter
idea is known as the data mart.
Data mining, Web mining, and a decision support system (DSS)
are three kinds of applications that can make use of a data warehouse.

Decision Support System:

A decision support system (DSS) is a computer program application that
analyzes business data and presents it so that users can make business
decisions more easily. It is an "informational application" (in distinction
to an "operational application" that collects the data in the course of
normal business operation).

Typical information that a decision support application might
gather and present would be:
- Comparative sales figures between one week and the next
- Projected revenue figures based on new product sales assumptions
- The consequences of different decision alternatives, given past
experience in a context that is described

A decision support system may present information graphically and may
include an expert system or artificial intelligence (AI). It may be aimed
at business executives or some other group of knowledge workers.

33. What are the differences between Heterogeneous and Homogeneous?

Ans:
Heterogeneous: stored in different schemas; stored in different file or
db types; spread across several countries; different platform and H/W
configuration.
Homogeneous: common structure; same database type; same data center;
same platform and H/W configuration.

34. How do you use DDL commands in a PL/SQL block, ex. accept a table
name from the user and drop it if available, else display a message?
Ans: To invoke DDL commands in PL/SQL blocks we have to use
Dynamic SQL; the package used is DBMS_SQL.
35. What are the steps to work with Dynamic SQL?
Ans: Open a dynamic cursor, parse the SQL statement, bind input
variables (if any), execute the SQL statement of the dynamic cursor,
and close the cursor.
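A minimal sketch of those steps using DBMS_SQL to run a DDL statement
(the table name is illustrative; note that DDL takes no bind variables
and is actually carried out at parse time):

DECLARE
   v_cursor INTEGER;
   v_rows   INTEGER;
BEGIN
   v_cursor := DBMS_SQL.OPEN_CURSOR;  -- open a dynamic cursor
   DBMS_SQL.PARSE(v_cursor, 'DROP TABLE temp_sales',
                  DBMS_SQL.NATIVE);   -- parse (and, for DDL, execute)
   v_rows := DBMS_SQL.EXECUTE(v_cursor);  -- conventional execute step
   DBMS_SQL.CLOSE_CURSOR(v_cursor);   -- close the cursor
END;
/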

36. Which package and procedure are used to find/check the free space
available for db objects like tables/procedures/views/synonyms etc.?
Ans: The package is DBMS_SPACE
The procedure is UNUSED_SPACE
The table is DBA_OBJECTS

Note: See the script to find free space @
c:\informatica\tbl_free_space
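A sketch of calling the procedure for a table (the owner and table name
are illustrative; run with SERVEROUTPUT on to see the result):

DECLARE
   v_total_blocks  NUMBER;
   v_total_bytes   NUMBER;
   v_unused_blocks NUMBER;
   v_unused_bytes  NUMBER;
   v_file_id       NUMBER;
   v_block_id      NUMBER;
   v_last_block    NUMBER;
BEGIN
   DBMS_SPACE.UNUSED_SPACE('SCOTT', 'EMP', 'TABLE',
                           v_total_blocks, v_total_bytes,
                           v_unused_blocks, v_unused_bytes,
                           v_file_id, v_block_id, v_last_block);
   DBMS_OUTPUT.PUT_LINE('Unused bytes: ' || v_unused_bytes);
END;
/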

37. Does Informatica allow loading if EmpId is the primary key in the
target table and the source data has 2 rows with the same EmpId? If you
use a lookup for the same situation, does it allow loading 2 rows or
only 1?
Ans: => No, it generates a primary key constraint violation (it loads 1 row).
=> Even then no, if EmpId is the primary key.

38. If Ename is varchar2(40) from one source (Siebel) and Ename is
char(100) from another source (Oracle), and the target has Name
varchar2(50), then how does Informatica handle this situation? How does
Informatica handle string and number datatypes from sources?

39. How do you debug mappings? I mean, where do you attack?

40. How do you query the Metadata tables for Informatica?

41(i). When do you use a connected lookup and when do you use an
unconnected lookup?
Ans:
Connected Lookups : -
A connected Lookup transformation is part of the mapping data flow.
With connected lookups, you can have multiple return values. That is,
you can pass multiple values from the same row in the lookup table out
of the Lookup transformation.
Common uses for connected lookups include:
=> Finding a name based on a number ex. Finding a Dname based on
deptno
=> Finding a value based on a range of dates
=> Finding a value based on multiple conditions
Unconnected Lookups : -
An unconnected Lookup transformation exists separate from the data
flow in the mapping. You write an expression using the :LKP reference
qualifier to call the lookup within another transformation.
Some common uses for unconnected lookups include:
=> Testing the results of a lookup in an expression
=> Filtering records based on the lookup results
=> Marking records for update based on the result of a lookup (for
example, updating slowly changing dimension tables)
=> Calling the same lookup multiple times in one mapping

41(ii). What are the differences between Connected lookups and
Unconnected lookups?
Ans: Although both types of lookups perform the same basic task,
there are some important differences:

Connected Lookup:
- Part of the mapping data flow.
- Can return multiple values from the same row. You link the
lookup/output ports to another transformation.
- Supports default values. If there's no match for the lookup
condition, the server returns the default value for all output ports.
- More visible: shows the data passing in and out of the lookup.
- Cache includes all lookup columns used in the mapping (that is,
lookup table columns included in the lookup condition and lookup
table columns linked as output ports to other transformations).

Unconnected Lookup:
- Separate from the mapping data flow.
- Returns one value from each row. You designate the return value
with the Return port (R).
- Does not support default values. If there's no match for the
lookup condition, the server returns NULL.
- Less visible: you write an expression using :LKP to tell the
server when to perform the lookup.
- Cache includes lookup/output ports in the lookup condition and
the lookup/return port.

42. What do you need to concentrate on after getting the explain plan?

Ans: The 3 most significant columns in the plan table are named
OPERATION, OPTIONS, and OBJECT_NAME. For each step,
these tell you which operation is going to be performed and which object
is the target of that operation.
Ex:-
**************************
TO USE EXPLAIN PLAN FOR A QRY...
**************************
SQL> EXPLAIN PLAN
2 SET STATEMENT_ID = 'PKAR02'
3 FOR
4 SELECT JOB,MAX(SAL)
5 FROM EMP
6 GROUP BY JOB
7 HAVING MAX(SAL) >= 5000;

Explained.
**************************
TO QUERY THE PLAN TABLE :-
**************************
SQL> SELECT RTRIM(ID)||' '||
2 LPAD(' ', 2*(LEVEL-1))||OPERATION
3 ||' '||OPTIONS
4 ||' '||OBJECT_NAME STEP_DESCRIPTION
5 FROM PLAN_TABLE
6 START WITH ID = 0 AND STATEMENT_ID = 'PKAR02'
7 CONNECT BY PRIOR ID = PARENT_ID
8 AND STATEMENT_ID = 'PKAR02'
9 ORDER BY ID;

STEP_DESCRIPTION
----------------------------------------------------
0 SELECT STATEMENT
1 FILTER
2 SORT GROUP BY
3 TABLE ACCESS FULL EMP

43. How are components interfaced in PeopleSoft?
Ans:

44. How do you do the analysis of an ETL?
Ans:

=========================================================
=====

45. What are Standard and Reusable Transformations and a Mapplet?

Ans: Mappings contain two types of transformations, standard and
reusable. Standard transformations exist within a single
mapping. You cannot reuse a standard transformation you created in
another mapping, nor can you create a shortcut to that transformation.
However, often you want to create transformations that perform common
tasks, such as calculating the average salary in a department. Since a
standard transformation cannot be used by more than one mapping, you
have to set up the same transformation each time you want to calculate
the average salary in a department.
Mapplet: A mapplet is a reusable object that represents a set of
transformations. It allows you to reuse transformation logic
and can contain as many transformations as you need. A mapplet
can contain transformations, reusable transformations, and
shortcuts to transformations.

46. How do you copy a Mapping, Repository, or Session?

Ans: To copy an object (such as a mapping or reusable transformation)
from a shared folder, press the Ctrl key and drag and drop
the mapping into the destination folder.
To copy a mapping from a non-shared folder, drag and drop the
mapping into the destination folder.
In both cases, the destination folder must be open with the related tool
active.
For example, to copy a mapping, the Mapping Designer must be active.
To copy a Source Definition, the Source Analyzer must be active.

Copying Mapping:
 To copy the mapping, open a workbook.
 In the Navigator, click and drag the mapping slightly to the right, not
dragging it to the workbook.
 When asked if you want to make a copy, click Yes, then enter a new
name and click OK.
 Choose Repository-Save.
Repository Copying: You can copy a repository from one database
to another. You use this feature before upgrading, to
preserve the original repository. Copying repositories provides a quick
way to copy all metadata you want to use as a basis for
a new repository.
If the database into which you plan to copy the repository contains an
existing repository, the Repository Manager deletes the existing
repository. If you want to preserve the old repository, cancel the copy.
Then back up the existing repository before copying the new repository.
To copy a repository, you must have one of the following
privileges:
 Administer Repository privilege
 Super User privilege

To copy a repository:
1. In the Repository Manager, choose Repository-Copy Repository.
2. Select a repository you wish to copy, then enter the following
information:
Copy Repository Fields (all required):
- Repository: name for the repository copy. Each repository name
must be unique within the domain and should be easily distinguished
from all other repositories.
- Database Username: username required to connect to the database.
This login must have the appropriate database permissions to create
the repository.
- Database Password: password associated with the database username.
Must be in US-ASCII.
- ODBC Data Source: data source used to connect to the database.
- Native Connect String: connect string identifying the location of
the database.
- Code Page: character set associated with the repository. Must be a
superset of the code page of the repository you want to copy.

If you are not connected to the repository you want to copy, the
Repository Manager asks you to log in.
3. Click OK.
4. If asked whether you want to delete existing repository data in the
second repository, click OK to delete it. Click Cancel to preserve the
existing repository.

Copying Sessions:
In the Server Manager, you can copy stand-alone sessions within a
folder, or copy sessions in and out of batches.
To copy a session, you must have one of the following:
 Create Sessions and Batches privilege with read and write
permission
 Super User privilege
To copy a session:
1. In the Server Manager, select the session you wish to copy.
2. Click the Copy Session button or choose Operations-Copy Session.
The Server Manager makes a copy of the session. The Informatica
Server names the copy after the original session, appending a number,
such as session_name1.

47. What are shortcuts, and what is the advantage?

Ans: Shortcuts allow you to use metadata across folders without making
copies, ensuring uniform metadata. A shortcut inherits all
properties of the object to which it points. Once you create a shortcut,
you can configure the shortcut name and description.

When the object the shortcut references changes, the shortcut inherits
those changes. By using a shortcut instead of a copy,
you ensure each use of the shortcut exactly matches the original
object. For example, if you have a shortcut to a target
definition, and you add a column to the definition, the shortcut
automatically inherits the additional column.

Shortcuts allow you to reuse an object without creating multiple objects
in the repository. For example, you use a source
definition in ten mappings in ten different folders. Instead of creating
10 copies of the same source definition, one in each
folder, you can create 10 shortcuts to the original source definition.
You can create shortcuts to objects in shared folders. If you try to
create a shortcut to a non-shared folder, the Designer
creates a copy of the object instead.

You can create shortcuts to the following repository objects:


 Source definitions
 Reusable transformations
 Mapplets
 Mappings
 Target definitions
 Business components

You can create two types of shortcuts:


Local shortcut. A shortcut created in the same repository as the
original object.
Global shortcut. A shortcut created in a local repository that
references an object in a global repository.

Advantages: One of the primary advantages of using a
shortcut is maintenance. If you need to change all instances of an
object, you can edit the original repository object. All shortcuts
accessing the object automatically inherit the changes.
Shortcuts have the following advantages over copied
repository objects:
 You can maintain a common repository object in a single
location. If you need to edit the object, all shortcuts
immediately inherit the changes you make.
 You can restrict repository users to a set of predefined
metadata by asking users to incorporate the shortcuts into
their work instead of developing repository objects
independently.
 You can develop complex mappings, mapplets, or reusable
transformations, then reuse them easily in other folders.
 You can save space in your repository by keeping a single
repository object and using shortcuts to that object, instead
of creating copies of the object in multiple folders or
multiple repositories.

48. What are Pre-session and Post-session Options?

(Please refer to Help: Using Shell Commands and Post-Session Commands
and Email)
Ans: The Informatica Server can perform one or more shell commands
before or after the session runs. Shell commands are
operating system commands. You can use pre- or post- session shell
commands, for example, to delete a reject file or
session log, or to archive target files before the session begins.

The status of the shell command, whether it completed
successfully or failed, appears in the session log file.
To call a pre- or post-session shell command you must:
1. Use any valid UNIX command or shell script for UNIX servers, or
any valid DOS or batch file for Windows NT servers.
2. Configure the session to execute the pre- or post-session shell
commands.

You can configure a session to stop if the Informatica Server encounters
an error while executing pre-session shell commands.

For example, you might use a shell command to copy a file from one
directory to another. For a Windows NT server you would use the
following shell command to copy the SALES_ADJ file from the target
directory, L, to the source, H:
copy L:\sales\sales_adj H:\marketing\

For a UNIX server, you would use the following command line to
perform a similar operation:
cp sales/sales_adj marketing/

Tip: Each shell command runs in the same environment (UNIX or
Windows NT) as the Informatica Server. Environment settings in one
shell command script do not carry over to other scripts. To run all shell
commands in the same environment, call a single shell script that in turn
invokes other scripts.

49. What are Folder Versions?


Ans: In the Repository Manager, you can create different versions within
a folder to help you archive work in development. You can copy versions
to other folders as well. When you save a version, you save all metadata
at a particular point in development. Later versions contain new or
modified metadata, reflecting work that you have completed since the
last version.

Maintaining different versions lets you revert to earlier work when
needed. By archiving the contents of a folder into a version each time
you reach a development landmark, you can access those versions if
later edits prove unsuccessful.

You create a folder version after completing a version of a difficult
mapping, then continue working on the mapping. If you are unhappy with
the results of subsequent work, you can revert to the previous version,
then create a new version to continue development. Thus you keep the
landmark version intact, but available for regression.

Note: You can only work within one version of a folder at a time.
50. How do you automate/schedule sessions/batches, and did you use any
tool for automating sessions/batches?
Ans: We scheduled our sessions/batches using the Server Manager.
You can either schedule a session to run at a given time or interval, or
you can manually start the session.
You need to have the Create Sessions and Batches privilege with Read
and Execute permissions, or the Super User privilege.
If you configure a batch to run only on demand, you cannot
schedule it.

Note: We did not use any tool for automation process.

51. What are the differences between 4.7 and 5.1 versions?
Ans: New transformations were added, such as the XML Transformation and
the MQ Series Transformation, and PowerMart and PowerCenter are
the same from version 5.1.

52. What are the procedures that you need to undergo before moving
Mappings/sessions from Testing/Development to Production?
Ans:

53. How many values does it (the Informatica server) return when it passes
through a Connected Lookup and an Unconnected Lookup?
Ans: A Connected Lookup can return multiple values, whereas an Unconnected
Lookup returns only one value, the Return Value.
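
For example (a sketch only; lkp_get_dept and IN_DEPT_ID are hypothetical
names), an unconnected lookup is called from an expression port and
yields its single return value:

-- :LKP call inside an output port expression; the lookup name and
-- input port are assumed for illustration:
IIF(ISNULL(DEPT_NAME), :LKP.lkp_get_dept(IN_DEPT_ID), DEPT_NAME)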

54. What is the difference between PowerMart and PowerCenter in 4.7.2?


Ans: If You Are Using PowerCenter
PowerCenter allows you to register and run multiple Informatica
Servers against the same repository. Because you can run
these servers at the same time, you can distribute the repository
session load across available servers to improve overall
performance.
With PowerCenter, you receive all product functionality, including
distributed metadata, the ability to organize repositories into
a data mart domain and share metadata across repositories.
A PowerCenter license lets you create a single repository that you
can configure as a global repository, the core component
of a data warehouse.
If You Are Using PowerMart
This version of PowerMart includes all features except
distributed metadata and multiple registered servers. Also, the
various
options available with PowerCenter (such as PowerCenter
Integration Server for BW, PowerConnect for IBM DB2,
PowerConnect for SAP R/3, and PowerConnect for PeopleSoft)
are not available with PowerMart.

55. What kind of modifications can you do/perform with each
Transformation?
Ans: Using transformations, you can modify data in the
following ways:
----------------------------------------------- ------------------------
Task                                            Transformation
----------------------------------------------- ------------------------
Calculate a value                               Expression
Perform aggregate calculations                  Aggregator
Modify text                                     Expression
Filter records                                  Filter, Source Qualifier
Order records queried by the Informatica Server Source Qualifier
Call a stored procedure                         Stored Procedure
Call a procedure in a shared library or in the  External Procedure
COM layer of Windows NT
Generate primary keys                           Sequence Generator
Limit records to a top or bottom range          Rank
Normalize records, including those read         Normalizer
from COBOL sources
Look up values                                  Lookup
Determine whether to insert, delete, update,    Update Strategy
or reject records
Join records from different databases           Joiner
or flat file systems

56. Expressions in Transformations, Explain briefly how do u use?


Ans: Expressions in Transformations
To transform data passing through a transformation, you can
write an expression. The most obvious examples of these are the
Expression and Aggregator transformations, which perform
calculations on either single values or an entire range of values
within a port. Transformations that use expressions include the
following:
--------------- -----------------------------------------------
Transformation  How It Uses Expressions
--------------- -----------------------------------------------
Expression      Calculates the result of an expression for each
                row passing through the transformation, using
                values from one or more ports.
Aggregator      Calculates the result of an aggregate
                expression, such as a sum or average, based
                on all data passing through a port or on
                groups within that data.
Filter          Filters records based on a condition you
                enter using an expression.
Rank            Filters the top or bottom range of records,
                based on a condition you enter using an
                expression.
Update Strategy Assigns a numeric code to each record based
                on an expression, indicating whether the
                Informatica Server should use the information
                in the record to insert, delete, or update the
                target.
In each transformation, you use the Expression Editor to enter the
expression. The Expression Editor supports the transformation
language for building expressions. The transformation language
uses SQL-like functions, operators, and other components to
build the expression. For example, as in SQL, the transformation
language includes the functions COUNT and SUM. However, the
PowerMart/PowerCenter transformation language includes
additional functions not found in SQL.

When you enter the expression, you can use values available
through ports. For example, if the transformation has two input
ports representing a price and sales tax rate, you can calculate the
final sales tax using these two values. The ports used in the
expression can appear in the same transformation, or you can use
output ports in other transformations.
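
For example, a minimal sketch of the price/tax calculation mentioned
above (PRICE and TAX_RATE are assumed input port names):

-- Output port expression computing the final amount:
PRICE + (PRICE * TAX_RATE)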

57. In case a flat file (which comes through FTP as a source) has not
arrived, then what happens? Where do you set this option?
Ans: You get a fatal error, which causes the server to fail/stop the session.
You can set the Event-Based Scheduling option in Session
Properties under General tab --> Advanced options.
--------------------------- ------------------ ------------------------
Event-Based                 Required/Optional  Description
--------------------------- ------------------ ------------------------
Indicator File to Wait For  Optional           Required to use event-
                                               based scheduling. Enter
                                               the indicator file (or
                                               directory and file) whose
                                               arrival schedules the
                                               session. If you do not
                                               enter a directory, the
                                               Informatica Server assumes
                                               the file appears in the
                                               server variable directory
                                               $PMRootDir.

58. What is the Test Load Option and when do you use it in Server
Manager?
Ans: When testing sessions in development, you may not need to
process the entire source. If this is true, use the Test Load
Option (Session Properties --> General Tab --> Target Options:
choose Target Load options as Normal (option button), with
Test Load checked (check box) and No. of rows to test, e.g. 2000
(text box with scrolls)). You can also click the Start button.

----------------------------------------------------------------------------------
59. SCD Type 2 and SGT difference?

60. Differences between 4.7 and 5.1?

61. Tuning Informatica Server for improving performance?
Performance Issues?
Ans: See /* C:\pkar\Informatica\Performance Issues.doc */

62. What is Override Option? Which is better?

63. What will happen if you increase buffer size?

64. What will happen if you increase commit intervals? And also
decrease commit intervals?

65. What kind of complex mapping did you do? And what sort of
problems did you face?

66. If you have 10 mappings designed and you need to implement some
changes (maybe in an existing mapping, or a new mapping needs to
be designed), then how much time does it take, from easier to
complex?

67. Can you refresh the Repository in 4.7 and 5.1? And also, can you refresh
pieces (partially) of the repository in 4.7 and 5.1?
68. What is BI?
Ans: http://www.visionnet.com/bi/index.shtml

69. Benefits of BI?


Ans: http://www.visionnet.com/bi/bi-benefits.shtml

70. BI Faq
Ans: http://www.visionnet.com/bi/bi-faq.shtml

71. What is difference between data scrubbing and data cleansing?


Ans: Scrubbing data is the process of cleaning up the junk in legacy
data and making it accurate and useful for the next generations
of automated systems. This is perhaps the most difficult of all
conversion activities. Very often, this is made more difficult when
the customer wants to make good data out of bad data. This is
the dog work. It is also the most important and can not be done
without the active participation of the user.
DATA CLEANING - a two-step process including
DETECTION and then CORRECTION of errors in a data set

72. What is Metadata and Repository?


Ans:
Metadata. “Data about data” .
It contains descriptive data for end users.
Contains data that controls the ETL processing.
Contains data about the current state of the data warehouse.
ETL updates metadata, to provide the most current state.

Repository. The place where you store the metadata is called a
repository. The more sophisticated your repository, the more complex
and detailed metadata you can store in it. PowerMart and PowerCenter
use a relational database as the repository.

73. SQL * LOADER?


Ans: http://download-
west.oracle.com/otndoc/oracle9i/901_doc/server.901/a90192/ch03.h
tm#1004678

74. Debugger in Mapping?

75. Parameters passing in 5.1 version exposure?

76. What is the filename which you need to configure in Unix while
installing Informatica?

77. How do you select duplicate rows using Informatica, i.e., how do you
use Max(Rowid)/Min(Rowid) in Informatica?
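
One common approach (a sketch only, assuming a hypothetical EMP source
keyed by EMP_NO) is to push a Min(Rowid) comparison into the Source
Qualifier's SQL override so that only the duplicate rows pass through:

-- Rows whose rowid exceeds the minimum rowid for their key
-- value are the duplicates:
select e.* from emp e
where e.rowid > (select min(x.rowid) from emp x
                 where x.emp_no = e.emp_no);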

The Informatica Developer - What are the Desired Skills?

1. General: This document discusses the desired skill set of a
prospective Informatica Developer candidate.

2. Basic Skill Set:


A. Database
· Familiar with Oracle PL/SQL.
· Fluent in database design.
· Familiar with database performance and tuning skills.
B. UNIX
· UNIX shell scripting experience.
· Moderate UNIX administration (desired) to handle such
things as Informatica server installation, setting CRON
jobs, archiving logs and files, creating directories,
maintaining user profiles, etc.
C. Informatica
(1). Knowledgeable in the installation of an Informatica
server on a UNIX platform.
(2). Knowledgeable in the installation and configuration
of ODBC sources.
(3). Knowledgeable in the installation and configuration
of the Informatica client tools.
(4). Knowledgeable in the configuration and connectivity
of the target Informatica server
with the Informatica server manager.
(5). Has a basic, fundamental understanding of the
Informatica metadata repository.
(6). Knowledgeable in the administration of Informatica
security.
(7). Designer
(A) Source Analyzer
1. Understands how to import sources from a database.
2. Understands how to import a source from a file.
3. Understands how to import a source from a VSAM
source.
4. Understands how to create a source manually.
5. Understands how to reuse created objects.
6. Understands the concept of versioning within
Informatica.
(B) Target Definition
1. Understands how to create an automatic target
definition.
2. Understands how to create a manual target
definition.
(C) Understands how to use the schema generation
wizards.
(D) Understands the process of creating transformations
1. Understands how to perform calculations.
2. Understands how to declare local variables.
3. Understands how to call external procedures.
4. Understands how to lookup values from the
database.
5. Understands how to perform heterogeneous joins.
6. Understands how to filter data.
7. Understands how to use SQL to read sources.
8. Understands how to generate a unique sequence
number.
(E) Knows the twelve types of transformations and their
uses.
1. Relates to above.
2. (source qualifier, normalizer, expression, filter,
aggregator, rank, update strategy,
lookup, stored procedure, external procedure,
sequence generator, joiner).
(F) Is knowledgeable of the different types of Informatica
functions and their use.
1. Aggregate functions (i.e. avg, max, sum)
2. Character functions (i.e. chr, length, lpad).
3. Conversion functions (i.e. to_date, to_decimal,
to_integer).
4. Numerical functions (i.e. log, mod, round).
5. Special functions (abort, decode, iif).
6. Test Functions (i.e. is_null, is_spaces).
(G) Understands what an Informatica port is.
1. Understands what a required port is.
2. Understands what an optional port is.
3. Knows how to copy a port.
4. Knows how to manually create a port.
5. Knows what an input port is.
6. Knows what an output port is.
(H) Understands the Informatica datatypes.
(I) Understands the Informatica datatype conversions.
(J) Understands the concept of default values and how to
assign them.
(K) Understands the concept of validation and knows the
three levels of Designer validation (connection, expression,
and mapping).
(L) Understands the difference between a connected and
an unconnected lookup.
(M) Understands the concept of a reusable
transformation and when it should be applied.
(8). Server Manager
(A) Understands how to configure a server connection.
(B) Understands how to connect to a server.
(C) Understands how to disconnect from a server.
(D) Sessions.
a. Understands how to schedule a session.
b. Understands how to launch a session.
c. Understands how to create a session.
d. Understands how to override the SQL qualifier.
e. Understands how to set the source tablename
qualifier.
f. Understands how a session and a map interact.
g. Understands the purpose of pre-session scripts.
h. Understands the purpose of post-session scripts.
i. Understands how to configure email notification.
j. Understands the concept of on-demand processing.
k. Understands the concept of scheduling and how to
set it up.
l. Understands how to set the output file.
m. Understands how to define file output format (fixed
width or delimited).
n. Understands how to create batches.
o. Understands how to add sessions to batches.
p. Understands the concept of dependent vs.
independent batches.
q. Understands how to run batches in parallel.
r. Understands how to load one target table in parallel.
s. Understands how to load multiple targets in parallel.
t. Understands how to monitor sessions and batches.
u. Understands how to create control files.
(E) Troubleshooting
a. Understands where to look and how to troubleshoot
connectivity problems.
b. Understands where to look and how to read session
logs for troubleshooting.
c. Understands where to look and how to read event
and error logs for troubleshooting.
d. Understands the purpose of reject files and how they
are used with loads.
e. Understands how to reload rejected files with the
reject file loader.
f. Understands how to recover a failed session.
(9). Repository Manager
(A) Security and User Management
a. Understands the security philosophy employed by
Informatica (user id and password access, group and
user level privileges, repository level privileges, and
folder permissions)
b. Understands read, write, and execute permissions.
c. Understands how to backup a repository.
d. Understands how to copy a repository to another
database server.

3. Advanced Skill Set:


A. Database
· Familiar with Oracle partitioning.
· Familiar with materialized views.
· Familiar with bitmap indexing.
· Familiar with Data Replication.
· Familiar with Oracle Parallel Processing.
B. UNIX
(1) Knowledgeable about UNIX file system and
directory security (user, groups, and
others).
(2) Familiar with setting up user environments.
(3) Familiar with telnet.
(4) Familiar with FTP.
(5) Familiar with shared memory.
(6) Familiar with CRON administration.
(7) Familiar with NFS.
(8) Familiar with mounting file systems.
C. Informatica
(1) Performance
· Solid understanding of architectural tradeoffs when
positioning a mix of application servers together or
across server platforms or domains.
· Solid understanding of how to size a hardware server
for a given project.
· Solid understanding of how and when to deploy
multiple servers for a given project.
· Understanding of how to develop maps to be
optimized for maximum performance.
· Understanding of how to identify poor performing
maps and optimize them.
D. Data
(A) Data Granularity.
(B) Fundamental understanding of the concept of data
granularity and the implications thereof in information
reporting.
(C) Data Discovery
· Solid analytical skills which allow the data migration
architect to discover data elements required to populate
the warehouse dimension and fact tables.
· Solid understanding of the value of cataloging and
documenting data sources.
· Solid understanding of the value of data mapping and
developing metadata.
(D) Data Cleansing
· Solid understanding of where and how data should be
cleansed.
(E) Load Validation
· Solid understanding the need for and how to validate
the data when a load operation has completed.
(F) Data Integration
· Solid understanding of the issues involved with
integrating enterprise-wide data from disparate data
sources.
(G) Denormalization
· Solid understanding of the process of denormalizing
data and when it is appropriate to
apply it.
(H) Aggregation
· Solid understanding of the reasons for aggregating
data, and the pros and cons of data
aggregation.
(I) Partitioning
· Solid understanding of the benefits of partitioning data
and when it is appropriate to do
so.
(J) External Data.
· A solid understanding of how to handle external and
unstructured data and introduce it
into the data warehouse.
(K) Slowly Changing Dimensions
· Solid understanding of the concept of slowly changing
dimensions and how they are handled in the data
migration process.

Oracle Questions

1. How much Oracle experience do you have?
2. How many kinds of tablespaces are there?
3. What are the versions of Oracle you worked on?
4. What partitioning did you do in your last project?
5. What are segments and do you know Index segments?

PL/SQL Questions

1. How many years of experience in writing PL/SQL code?
2. Difference between implicit cursor and explicit cursor?
3. What are PL/SQL Tables?
4. Tell me a scenario where you have used PL/SQL procedures?
5. What happens when the cursor is not closed?
6. How you will output the data from PL/SQL?
7. What are the PL/SQL packages you have used?
8. What is the difference between Stored Procedures and
functions?
9. How many types of triggers do you know?
10. How to handle exceptions in PL/SQL
11. How to increase PL/SQL performance
12. How can you generate a surrogate key?

SQL Questions

1. How many years of experience in writing SQL?
2. How do you add a new column to an existing table?
3. Tell me one simple select statement with condition?
4. What is the difference between UNION and JOIN?
5. What is the difference between Inner Join and Outer Join?
6. What is the difference between ORDER BY and GROUP BY?
7. Why you use DISTINCT clause in sql statements?
8. How does DECODE work?
9. Sub query and correlated sub query
10. Difference between union and union all.
11. Purpose of using minus
12. And where do you mention about exception handler in
Oracle
13. (write sql) If you need to update one table depending on
rows of other table
14. Do you know about virtual FROM? A virtual FROM is a SELECT
statement in the FROM clause (an inline view)
15. Why do you need indexing?
16. Do you know about partitioning the tables?
17. How many types of partitions are possible?
18. What are the difference between Materialized Views and
Views?
19. Oracle 9i features for partitioning
20. How do you write SQL to reach a parent value?
21. How do CASE statements work in SELECT?
22. Types of joins
23. How can you implement if-else in a SELECT statement?

Sebastian

CISCO

1 bulk bind

2 bind variable

3 raise application error

4 difference between truncate and delete

5 materialized view - how is it used in a data warehouse? What is the
difference between a regular table and a materialized view?

6 oracle 8i features you can use in a data warehouse

7 mutating table - how to get around it

8 snapshot too old error - what is it? Have you encountered this error
in your project? What is the solution?

9 pseudo columns ROWNUM and ROWID

10 informatica - what all things have you done in your project

11 what was your role in your last project

12 explain plan, tkprof

13 errors you have encountered in informatica

14 how big was your database

15 what was your team size

16 what are hints? They are for query optimization: an option you are
giving to a query to choose how to execute a statement

17 common types of hints you have used ( index, rule, full)

18 bit map and b-tree index

19 kinds of triggers you have used.

20 trigger timing - what is INSTEAD OF used for?

Dba sections: sql/sqlplus, pl/sql, tuning, configuration, troubleshooting
developer sections: sql/sqlplus, pl/sql, data modeling
data modeler: data modeling
all candidates for unix shop: unix

PL/SQL QUESTIONS:

1. Describe the difference between a procedure, function and
anonymous pl/sql block.
Level: Low
Expected answer: Candidate should mention use of
DECLARE statement, a function must return a value while a
procedure doesn’t have to.

2. What is a mutating table error and how can you get around
it?
Level: Intermediate
Expected answer: This happens with triggers. It occurs
because the trigger is trying to update a row it is currently
using. The usual fix involves either use of views or temporary
tables so the database is selecting from one while updating
the other.

3. Describe the use of %ROWTYPE and %TYPE in PL/SQL


Level: Low
Expected answer: %ROWTYPE allows you to associate a
variable with an entire table row. The %TYPE associates a
variable with a single column type.
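
A minimal sketch (assuming the standard EMP demo table):

DECLARE
  emp_rec emp%ROWTYPE;      -- shaped like an entire EMP row
  v_name  emp.ename%TYPE;   -- matches the type of a single column
BEGIN
  SELECT * INTO emp_rec FROM emp WHERE empno = 7839;
  v_name := emp_rec.ename;
END;
/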

4. What packages (if any) has Oracle provided for use by
developers?
Level: Intermediate to high
Expected answer: Oracle provides the DBMS_ series of
packages. There are many which developers should be aware
of such as DBMS_SQL, DBMS_PIPE,
DBMS_TRANSACTION, DBMS_LOCK, DBMS_ALERT,
DBMS_OUTPUT, DBMS_JOB, DBMS_UTILITY, DBMS_DDL,
UTL_FILE. If they can mention a few of these and describe
how they used them, even better. If they include the SQL
routines provided by Oracle, great, but not really what was
asked.

5. Describe the use of PL/SQL tables


Level: Intermediate
Expected answer: PL/SQL tables are scalar arrays that can
be referenced by a binary integer. They can be used to hold
values for use in later queries or calculations. In Oracle 8 they
will be able to be of the %ROWTYPE designation, or
RECORD.
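
A minimal sketch (assuming the standard EMP demo table):

DECLARE
  TYPE name_tab IS TABLE OF emp.ename%TYPE
    INDEX BY BINARY_INTEGER;   -- scalar array keyed by integer
  names name_tab;
BEGIN
  names(1) := 'SMITH';         -- hold values for later use
  names(2) := 'JONES';
END;
/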

6. When is a declare statement needed?


Level: Low
The DECLARE statement is used in PL/SQL anonymous
blocks such as with stand alone, non-stored PL/SQL
procedures. It must come first in a PL/SQL stand-alone file if it
is used.

7. In what order should an open/fetch/loop set of commands in
a PL/SQL block be implemented if you use the %NOTFOUND
cursor variable in the exit when statement? Why?
Level: Intermediate
Expected answer: OPEN then FETCH then LOOP followed by
the exit when. If not specified in this order, it will result in the
final return being done twice because of the way the
%NOTFOUND is handled by PL/SQL.
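
A minimal sketch of that ordering (assuming the standard EMP demo
table):

DECLARE
  CURSOR c_emp IS SELECT ename FROM emp;
  v_name emp.ename%TYPE;
BEGIN
  OPEN c_emp;
  LOOP
    FETCH c_emp INTO v_name;
    EXIT WHEN c_emp%NOTFOUND;   -- test immediately after the FETCH
    DBMS_OUTPUT.PUT_LINE(v_name);
  END LOOP;
  CLOSE c_emp;
END;
/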

8. What are SQLCODE and SQLERRM and why are they
important for PL/SQL developers?
Level: Intermediate
Expected answer: SQLCODE returns the value of the error
number for the last error encountered. The SQLERRM returns
the actual error message for the last error encountered. They
can be used in exception handling to report, or, store in an
error log table, the error that occurred in the code. These are
especially useful for the WHEN OTHERS exception.
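
A minimal sketch (error_log is a hypothetical two-column log table;
note that SQLCODE and SQLERRM must be captured into variables before
they are used inside a SQL statement):

DECLARE
  v_code NUMBER;
  v_msg  VARCHAR2(200);
BEGIN
  INSERT INTO emp (empno) VALUES (NULL);  -- forces an error
EXCEPTION
  WHEN OTHERS THEN
    v_code := SQLCODE;                    -- capture first; these
    v_msg  := SUBSTR(SQLERRM, 1, 200);    -- cannot be referenced
    INSERT INTO error_log                 -- directly in SQL
    VALUES (v_code, v_msg);
END;
/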

9. How can you find, within a PL/SQL block, if a cursor is
open?
Level: Low
Expected answer: Use the %ISOPEN cursor status variable.

10. How can you generate debugging output from PL/SQL?


Level:Intermediate to high
Expected answer: Use the DBMS_OUTPUT package. Another
possible method is to just use the SHOW ERROR command,
but this only shows errors. The DBMS_OUTPUT package can
be used to show intermediate results from loops and the
status of variables as the procedure is executed. The new
package UTL_FILE can also be used.
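
A minimal sketch (run from SQL*Plus):

SET SERVEROUTPUT ON   -- SQL*Plus setting to display the output
BEGIN
  FOR i IN 1 .. 3 LOOP
    DBMS_OUTPUT.PUT_LINE('iteration ' || i);  -- intermediate results
  END LOOP;
END;
/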

11. What are the types of triggers?


Level:Intermediate to high
Expected Answer: There are 12 types of triggers in PL/SQL
that consist of combinations of the BEFORE, AFTER, ROW,
TABLE, INSERT, UPDATE, DELETE and ALL key words:
BEFORE ALL ROW INSERT
AFTER ALL ROW INSERT
BEFORE INSERT
AFTER INSERT

DBA:

1. Give one method for transferring a table from one schema
to another:
Level:Intermediate
Expected Answer: There are several possible methods,
export-import, CREATE TABLE... AS SELECT, or COPY.
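
For instance, the CREATE TABLE ... AS SELECT method (SCOTT and HR are
assumed schema names; grants, indexes and constraints must be
re-created separately):

CREATE TABLE hr.emp AS SELECT * FROM scott.emp;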

2. What is the purpose of the IMPORT option IGNORE? What
is its default setting?
Level: Low
Expected Answer: The IMPORT IGNORE option tells import
to ignore "already exists" errors. If it is not specified, the
tables that already exist will be skipped. If it is specified, the
error is ignored and the table's data will be inserted. The
default value is N.

3. You have a rollback segment in a version 7.2 database that
has expanded beyond optimal, how can it be restored to
optimal?
Level: Low
Expected answer: Use the ALTER ROLLBACK SEGMENT ... SHRINK
command.

4. If the DEFAULT and TEMPORARY tablespace clauses are
left out of a CREATE USER command what happens? Is this
bad or good? Why?
Level: Low
Expected answer: The user is assigned the SYSTEM
tablespace as a default and temporary tablespace. This is bad
because it causes user objects and temporary segments to be
placed into the SYSTEM tablespace resulting in fragmentation
and improper table placement (only data dictionary objects
and the system rollback segment should be in SYSTEM).

5. What are some of the Oracle provided packages that DBAs
should be aware of?
Level: Intermediate to High
Expected answer: Oracle provides a number of packages in
the form of the DBMS_ packages owned by the SYS user.
The packages used by DBAs may include:
DBMS_SHARED_POOL, DBMS_UTILITY, DBMS_SQL,
DBMS_DDL, DBMS_SESSION, DBMS_OUTPUT and
DBMS_SNAPSHOT. They may also try to answer with the
UTL*.SQL or CAT*.SQL series of SQL procedures. These can
be viewed as extra credit but aren’t part of the answer.

6. What happens if the constraint name is left out of a
constraint clause?
Level: Low
Expected answer: The Oracle system will use the default
name of SYS_Cxxxx where xxxx is a system generated
number. This is bad since it makes tracking which table the
constraint belongs to or what the constraint does harder.

7. What happens if a tablespace clause is left off of a primary
key constraint clause?
Level: Low
Expected answer: This results in the index that is
automatically generated being placed in the user's default
tablespace. Since this will usually be the same tablespace as
the table is being created in, this can cause serious
performance problems.

8. What is the proper method for disabling and re-enabling a
primary key constraint?
Level: Intermediate
Expected answer: You use the ALTER TABLE command for
both. However, for the enable clause you must specify the
USING INDEX and TABLESPACE clause for primary keys.

9. What happens if a primary key constraint is disabled and
then enabled without fully specifying the index clause?
Level: Intermediate
Expected answer: The index is created in the user’s default
tablespace and all sizing information is lost. Oracle doesn’t
store this information as a part of the constraint definition, but
only as part of the index definition, when the constraint was
disabled the index was dropped and the information is gone.

10. (On UNIX) When should more than one DB writer
process be used? How many should be used?
Level: High
Expected answer: If the UNIX system being used is capable
of asynchronous IO then only one is required; if the system is
not capable of asynchronous IO then up to twice the number
of disks used by Oracle should be specified as the number of
DB writers, by use of the db_writers initialization parameter.

11. You are using hot backup without being in archivelog
mode, can you recover in the event of a failure? Why or why
not?
Level: High
Expected answer: You can’t use hot backup without being in
archivelog mode. So no, you couldn’t recover.

12. What causes the "snapshot too old" error? How can this
be prevented or mitigated?
Level: Intermediate
Expected answer: This is caused by large or long running
transactions that have either wrapped onto their own rollback
space or have had another transaction write on part of their
rollback space. This can be prevented or mitigated by
breaking the transaction into a set of smaller transactions or
increasing the size of the rollback segments and their extents.

13. How can you tell if a database object is invalid?


Level: Low
Expected answer: By checking the status column of the
DBA_, ALL_ or USER_OBJECTS views, depending upon
whether you own or only have permission on the view or are
using a DBA account.
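
For example:

SELECT owner, object_name, object_type
  FROM dba_objects
 WHERE status = 'INVALID';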

14. A user is getting an ORA-00942 error yet you know you
have granted them permission on the table, what else should
you check?
Level: Low
Expected answer: You need to check that the user has
specified the full name of the object (select empid from
scott.emp; instead of select empid from emp;) or has a
synonym that points to the object (create synonym emp for
scott.emp;)

15. A developer is trying to create a view and the database
won't let him. He has the "DEVELOPER" role which has the
"CREATE VIEW" system privilege and SELECT grants on
the tables he is using, what is the problem?
Level: Intermediate
Expected answer: You need to verify the developer has direct
grants on all tables used in the view. You can’t create a stored
object with grants given through views.

16. If you have an example table, what is the best way to get
sizing data for the production table implementation?
Level: Intermediate
Expected answer: The best way is to analyze the table and
then use the data provided in the DBA_TABLES view to get
the average row length and other pertinent data for the
calculation. The quick and dirty way is to look at the number of
blocks the table is actually using and ratio the number of rows
in the table to its number of blocks against the number of
expected rows.

17. How can you find out how many users are currently
logged into the database? How can you find their operating
system id?
Level: high
Expected answer: There are several ways. One is to look at
the v$session or v$process views. Another way is to check
the current_logins parameter in the v$sysstat view. Another, if
you are on UNIX, is to do a "ps -ef|grep oracle|wc -l"
command, but this only works against a single instance
installation.
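
A sketch of the view-based approach:

-- Logged-in database users and their operating system ids:
SELECT username, osuser
  FROM v$session
 WHERE username IS NOT NULL;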

18. A user selects from a sequence and gets back two values,
his select is:
SELECT pk_seq.nextval FROM dual;
What is the problem?
Level: Intermediate
Expected answer: Somehow two values have been inserted
into the dual table. This table is a single row, single column
table that should only have one value in it.

19. How can you determine if an index needs to be dropped
and rebuilt?
Level: Intermediate
Expected answer: Run the ANALYZE INDEX command on the
index to validate its structure and then calculate the ratio
LF_BLK_LEN/(LF_BLK_LEN+BR_BLK_LEN); if it isn't near
1.0 (i.e. greater than 0.7 or so) then the index should be
rebuilt. Or if the ratio
BR_BLK_LEN/(LF_BLK_LEN+BR_BLK_LEN) is nearing 0.3.

SQL/ SQLPLUS
1. How can variables be passed to a SQL routine?
Level: Low
Expected answer: By use of the & symbol. For passing in
variables the numbers 1-8 can be used (&1, &2,...,&8) to pass
the values after the command into the SQLPLUS session. To
be prompted for a specific variable, place the ampersanded
variable in the code itself:
"select * from dba_tables where owner=&owner_name;".
Use of double ampersands tells SQLPLUS to resubstitute the
value for each subsequent use of the variable, a single
ampersand will cause a reprompt for the value unless an
ACCEPT statement is used to get the value from the user.

2. You want to include a carriage return/linefeed in your output
from a SQL script, how can you do this?
Level: Intermediate to high
Expected answer: The best method is to use the CHR()
function (CHR(10) is a return/linefeed) and the concatenation
function "||". Another method, although it is hard to
document and isn't always portable, is to use the
return/linefeed as a part of a quoted string.
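
For example:

SELECT 'line one' || CHR(10) || 'line two' FROM dual;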

3. How can you call a PL/SQL procedure from SQL?


Level: Intermediate
Expected answer: By use of the EXECUTE (short form EXEC)
command.

4. How do you execute a host operating system command
from within SQL?
Level: Low
Expected answer: By use of the exclamation point "!" (in
UNIX and some other OS) or the HOST (HO) command.

5. You want to use SQL to build SQL, what is this called and
give an example
Level: Intermediate to high
Expected answer: This is called dynamic SQL. An example
would be:
set lines 90 pages 0 termout off feedback off verify off
spool drop_all.sql
select 'drop user '||username||' cascade;' from dba_users
where username not in ('SYS','SYSTEM');
spool off
Essentially you are looking to see that they know to include a
command (in this case DROP USER...CASCADE;) and that
you need to concatenate using '||' with the values selected
from the database.

6. What SQLPlus command is used to format output from a
select?
Level: low
Expected answer: This is best done with the COLUMN
command.

7. You want to group the following set of select returns, what
can you group on?
Max(sum_of_cost), min(sum_of_cost), count(item_no),
item_no
Level: Intermediate
Expected answer: The only column that can be grouped on is
the "item_no" column, the rest have aggregate functions
associated with them.

8. What special Oracle feature allows you to specify how the
cost based system treats a SQL statement?
Level: Intermediate to high
Expected answer: The COST based system allows the use of
HINTs to control the optimizer path selection. If they can give
some example hints such as FIRST_ROWS, ALL_ROWS,
INDEX, STAR, even better.
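
For example, a hint forcing a full table scan (e is the table alias;
the hint lives in a comment immediately after the SELECT keyword):

SELECT /*+ FULL(e) */ ename
  FROM emp e
 WHERE empno = 7839;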
9. You want to determine the location of identical rows in a
table before attempting to place a unique index on the table,
how can this be done?
Level: High
Expected answer: Oracle tables always have one guaranteed
unique column, the rowid column. If you use a min/max
function against your rowid and then select against the
proposed primary key you can squeeze out the rowids of the
duplicate rows pretty quickly. For example:

select rowid from emp e
where e.rowid > (select min(x.rowid)
                   from emp x
                  where x.emp_no = e.emp_no);

In the situation where multiple columns make up the proposed
key, they must all be used in the where clause.

10. What is a Cartesian product?


Level: Low
Expected answer: A Cartesian product is the result of an
unrestricted join of two or more tables. The result set of a
three table Cartesian product will have x * y * z number of
rows where x, y, z correspond to the number of rows in each
table involved in the join.

11. You are joining a local and a remote table, the network
manager complains about the traffic involved, how can you
reduce the network traffic?
Level: High
Expected answer: Push the processing of the remote data to
the remote instance by using a view to pre-select the
information for the join. This will result in only the data
required for the join being sent across.

12. What is the default ordering of an ORDER BY clause in a
SELECT statement?
Level: Low
Expected answer: Ascending

13. What is tkprof and how is it used?


Level: Intermediate to high
Expected answer: The tkprof tool is a tuning tool used to
determine cpu and execution times for SQL statements. You
use it by first setting timed_statistics to true in the initialization
file and then turning on tracing for either the entire database
via the sql_trace parameter or for the session using the
ALTER SESSION command. Once the trace file is generated
you run the tkprof tool against the trace file and then look at
the output from the tkprof tool. This can also be used to
generate explain plan output.

14. What is explain plan and how is it used?


Level: Intermediate to high
Expected answer: The EXPLAIN PLAN command is a tool to
tune SQL statements. To use it you must have a plan table
generated for the user you are running the explain plan for.
This is created using the utlxplan.sql script.
Once the plan table exists you run the explain plan
command giving as its argument the SQL statement to be
explained. The plan table is then queried to see the
execution plan of the statement. Explain plans can also be run
using tkprof.
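
A minimal sketch (assuming the plan table created by utlxplan.sql):

EXPLAIN PLAN SET STATEMENT_ID = 'q1'
  FOR SELECT ename FROM emp WHERE empno = 7839;

SELECT LPAD(' ', 2*level) || operation || ' ' ||
       options || ' ' || object_name AS step
  FROM plan_table
 START WITH id = 0 AND statement_id = 'q1'
CONNECT BY PRIOR id = parent_id AND statement_id = 'q1';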

15. How do you set the number of lines on a page of output?
The width?
Level: Low
Expected answer: The SET command in SQLPLUS is used to
control the number of lines generated per page and the width
of those lines, for example SET PAGESIZE 60 LINESIZE 80
will generate reports that are 60 lines long with a line width of
80 characters. The PAGESIZE and LINESIZE options can be
shortened to PAGES and LINES.

16. How do you prevent output from coming to the screen?


Level: Low
Expected answer: The SET option TERMOUT controls output
to the screen. Setting TERMOUT OFF turns off screen output.
This option can be shortened to TERM.

17. How do you prevent Oracle from giving you informational
messages during and after a SQL statement execution?
Level: Low
Expected answer: The SET options FEEDBACK and VERIFY
can be set to OFF.

18. How do you generate file output from SQL?


Level: Low
Expected answer: By use of the SPOOL command

TUNING QUESTIONS:

1. A tablespace has a table with 30 extents in it. Is this bad?
Why or why not.
Level: Intermediate
Expected answer: Multiple extents in and of themselves aren’t
bad. However if you also have chained rows this can hurt
performance.

2. How do you set up tablespaces during an Oracle
installation?
Level: Low
Expected answer: You should always attempt to use the
Optimal Flexible Architecture standard or another partitioning
scheme to ensure proper separation of SYSTEM,
ROLLBACK, REDO LOG, DATA, TEMPORARY and INDEX
segments.
3. You see multiple fragments in the SYSTEM tablespace,
what should you check first?
Level: Low
Expected answer: Ensure that users don’t have the SYSTEM
tablespace as their TEMPORARY or DEFAULT tablespace
assignment by checking the DBA_USERS view.

4. What are some indications that you need to increase the
SHARED_POOL_SIZE parameter?
Level: Intermediate
Expected answer: Poor data dictionary or library cache hit
ratios, getting error ORA-04031. Another indication is steadily
decreasing performance with all other tuning parameters the
same.

5. What is the general guideline for sizing db_block_size and
db_multi_block_read for an application that does many full
table scans?
Level: High
Expected answer: Oracle almost always reads in 64k chunks.
The two should have a product equal to 64 or a multiple of 64.

6. What is the fastest query method for a table?


Level: Intermediate
Expected answer: Fetch by rowid

7. Explain the use of TKPROF? What initialization parameter
should be turned on to get full TKPROF output?
Level: High
Expected answer: The tkprof tool is a tuning tool used to
determine cpu and execution times for SQL statements. You
use it by first setting timed_statistics to true in the initialization
file and then turning on tracing for either the entire database
via the sql_trace parameter or for the session using the
ALTER SESSION command. Once the trace file is generated
you run the tkprof tool against the trace file and then look at
the output from the tkprof tool. This can also be used to
generate explain plan output.

8. When looking at v$sysstat you see that sorts (disk) is high.
Is this bad or good? If bad, how do you correct it?
Level: Intermediate
Expected answer: If you get excessive disk sorts this is bad.
This indicates you need to tune the sort area parameters in
the initialization files. The major sort area parameter is the
SORT_AREA_SIZE parameter.

9. When should you increase copy latches? What parameters
control copy latches?
Level: high
Expected answer: When you get excessive contention for the
copy latches as shown by the "redo copy" latch hit ratio. You
can increase copy latches via the initialization parameter
LOG_SIMULTANEOUS_COPIES to twice the number of
CPUs on your system.

10. Where can you get a list of all initialization parameters for
your instance? How about an indication if they are default
settings or have been changed?
Level: Low
Expected answer: You can look in the init<sid>.ora file for an
indication of manually set parameters. For all parameters,
their value and whether or not the current value is the default
value, look in the v$parameter view.

11. Describe hit ratio as it pertains to the database buffers.
What is the difference between instantaneous and cumulative
hit ratio and which should be used for tuning?
Level: Intermediate
Expected answer: The hit ratio is a measure of how many
times the database was able to read a value from the buffers
versus how many times it had to re-read a data value from the
disks. A value greater than 80-90% is good, less could
indicate problems. If you simply take the ratio of existing
parameters this will be a cumulative value since the database
started. If you do a comparison between pairs of readings
based on some arbitrary time span, this is the instantaneous
ratio for that time span. Generally speaking an instantaneous
reading gives more valuable data since it will tell you what
your instance is doing for the time span it was measured over.
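
The classic cumulative calculation, as a sketch against v$sysstat:

SELECT 1 - (phy.value / (db.value + con.value)) AS hit_ratio
  FROM v$sysstat phy, v$sysstat db, v$sysstat con
 WHERE phy.name = 'physical reads'
   AND db.name  = 'db block gets'
   AND con.name = 'consistent gets';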

12. Discuss row chaining, how does it happen? How can you
reduce it? How do you correct it?
Level: high
Expected answer: Row chaining occurs when a VARCHAR2
value is updated and the length of the new value is longer
than the old value and won’t fit in the remaining block space.
This results in the row chaining to another block. It can be
reduced by setting the storage parameters on the table to
appropriate values. It can be corrected by export and import
of the affected table.

13. When looking at the estat events report you see that you
are getting busy buffer waits. Is this bad? How can you find
what is causing it?
Level: high
Expected answer: Buffer busy waits could indicate contention
in redo, rollback or data blocks. You need to check the
v$waitstat view to see what areas are causing the problem.
The value of the "count" column tells where the problem is,
the "class" column tells you with what. UNDO is rollback
segments, DATA is database buffers.

14. If you see contention for library caches how can you fix it?

Level: Intermediate
Expected answer: Increase the size of the shared pool.
15. If you see statistics that deal with "undo" what are they
really talking about?
Level: Intermediate
Expected answer: Rollback segments and associated
structures.

16. If a tablespace has a default pctincrease of zero what will
this cause (in relationship to the smon process)?
Level: High
Expected answer: The SMON process won’t automatically
coalesce its free space fragments.

17. If a tablespace shows excessive fragmentation what are
some methods to defragment the tablespace? (7.1, 7.2 and
7.3 only)
Level: High
Expected answer: In Oracle 7.0 to 7.2 the use of the
alter session set events 'immediate trace name coalesce level ts#';
command is the easiest way to defragment contiguous free
space fragmentation. The ts# parameter corresponds to the
ts# value found in the ts$ SYS table. In version 7.3 the
alter tablespace <name> coalesce;
command is best. If the free space isn't
contiguous then export, drop and import of the tablespace
contents may be the only way to reclaim non-contiguous free
space.

18. How can you tell if a tablespace has excessive
fragmentation?
Level: Intermediate
If a select against the dba_free_space table shows that the
count of a tablespace's free extents is greater than the count
of its data files, then it is fragmented.
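
For example:

-- One row per tablespace; a free-extent count well above the
-- data file count suggests fragmentation:
SELECT tablespace_name, COUNT(*) AS free_extents
  FROM dba_free_space
 GROUP BY tablespace_name;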

19. You see the following on a status report:


redo log space requests 23
redo log space wait time 0

Is this something to worry about? What if redo log space wait
time is high? How can you fix this?
Level: Intermediate
Expected answer: Since the wait time is zero, no. If the wait
time was high it might indicate a need for more or larger redo
logs.

20. What can cause a high value for recursive calls? How can
this be fixed?
Level: High
Expected answer: A high value for recursive calls is caused by
improper cursor usage, excessive dynamic space
management actions, and/or excessive statement re-parses.
You need to determine the cause and correct it by either
relinking applications to hold cursors, using proper space
management techniques (proper storage and sizing), or
ensuring repeat queries are placed in packages for proper
reuse.

21. If you see a pin hit ratio of less than 0.8 in the estat library
cache report is this a problem? If so, how do you fix it?
Level: Intermediate
Expected answer: This indicates that the shared pool may be
too small. Increase the shared pool size.

22. If you see the value for reloads is high in the estat library
cache report is this a matter for concern?
Level: Intermediate
Expected answer: Yes, you should strive for zero reloads if
possible. If you see excessive reloads then increase the size
of the shared pool.

23. You look at the dba_rollback_segs view and see that there
is a large number of shrinks and they are of relatively small
size, is this a problem? How can it be fixed if it is a problem?
Level: High
Expected answer: A large number of small shrinks indicates a
need to increase the size of the rollback segment extents.
Ideally you should have no shrinks or a small number of large
shrinks. To fix this just increase the size of the extents and
adjust optimal accordingly.

24. You look at the dba_rollback_segs view and see that you
have a large number of wraps is this a problem?
Level: High
Expected answer: A large number of wraps indicates that the
extent size for your rollback segments is probably too small.
Increase the size of your extents to reduce the number of
wraps. You can look at the average transaction size in the
same view to get the information on transaction size.

25. In a system with an average of 40 concurrent users you
get the following from a query on rollback extents:

ROLLBACK   CUR EXTENTS
---------  -----------
R01        11
R02        8
R03        12
R04        9
SYSTEM     4

You have room for each to grow by 20 more extents each. Is
there a problem? Should you take any action?
Level: Intermediate
Expected answer: No there is not a problem. You have 40
extents showing and an average of 40 concurrent users.
Since there is plenty of room to grow no action is needed.

26. You see multiple extents in the temporary tablespace. Is
this a problem?
Level: Intermediate
Expected answer: As long as they are all the same size this
isn’t a problem. In fact, it can even improve performance since
Oracle won’t have to create a new extent when a user needs
one.

INSTALLATION/CONFIGURATION

1. Define OFA.
Level: Low
Expected answer: OFA stands for Optimal Flexible
Architecture. It is a method of placing directories and files in
an Oracle system so that you get the maximum flexibility for
future tuning and file placement.


2. How do you set up your tablespace on installation?


Level: Low
Expected answer: The answer here should show an
understanding of separation of redo and rollback, data and
indexes and isolation of SYSTEM tables from other tables. An
example would be to specify that at least 7 disks should be
used for an Oracle installation so that you can place SYSTEM
tablespace on one, redo logs on two (mirrored redo logs) the
TEMPORARY tablespace on another, ROLLBACK tablespace
on another and still have two for DATA and INDEXES. They
should indicate how they will handle archive logs and exports
as well. As long as they have a logical plan for combining or
further separation more or less disks can be specified.

3. What should be done prior to installing Oracle (for the OS
and the disks)?
Level: Low
Expected Answer: adjust kernel parameters or OS tuning
parameters in accordance with installation guide. Be sure
enough contiguous disk space is available.

4. You have installed Oracle and you are now setting up the
actual instance. You have been waiting an hour for the
initialization script to finish, what should you check first to
determine if there is a problem?
Level: Intermediate to high
Expected Answer: Check to make sure that the archiver isn’t
stuck. If archive logging is turned on during install a large
number of logs will be created. This can fill up your archive log
destination causing Oracle to stop to wait for more space.

5. When configuring SQLNET on the server what files must be
set up?
Level: Intermediate
Expected answer: INITIALIZATION file, TNSNAMES.ORA file,
SQLNET.ORA file

6. When configuring SQLNET on the client what files need to
be set up?
Level: Intermediate
Expected answer: SQLNET.ORA, TNSNAMES.ORA

7. What must be installed with ODBC on the client in order for
it to work with Oracle?
Level: Intermediate
Expected answer: SQLNET and PROTOCOL (for example:
TCPIP adapter) layers of the transport programs.

8. You have just started a new instance with a large SGA on a
busy existing server. Performance is terrible, what should you
check for?
Level: Intermediate
Expected answer: The first thing to check with a large SGA is
that it isn’t being swapped out.

9. What OS user should be used for the first part of an Oracle
installation (on UNIX)?
Level: low
Expected answer: You must use root first.

10. When should the default values for Oracle initialization
parameters be used as is?
Level: Low
Expected answer: Never

11. How many control files should you have? Where should
they be located?
Level: Low
Expected answer: At least 2 on separate disk spindles. Be
sure they say on separate disks, not just file systems.

12. How many redo logs should you have and how should
they be configured for maximum recoverability?
Level: Intermediate
Expected answer: You should have at least three groups of
two redo logs with the two logs each on a separate disk
spindle (mirrored by Oracle). The redo logs should not be on
raw devices on UNIX if it can be avoided.

13. You have a simple application with no "hot" tables (i.e.
uniform IO and access requirements). How many disks should
you have assuming standard layout for SYSTEM, USER,
TEMP and ROLLBACK tablespaces?
Expected answer: At least 7, see disk configuration answer
above.

DATA MODELER:
1. Describe third normal form?
Level: Low
Expected answer: Something like: In third normal form all
attributes in an entity are related to the primary key and only
to the primary key

2. Is the following statement true or false:

"All relational databases must be in third normal form"

Why or why not?


Level: Intermediate
Expected answer: False. While 3NF is good for logical design
most databases, if they have more than just a few tables, will
not perform well using full 3NF. Usually some entities will be
denormalized in the logical to physical transfer process.

3. What is an ERD?
Level: Low
Expected answer: An ERD is an Entity-Relationship-Diagram.
It is used to show the entities and relationships for a database
logical model.

4. Why are recursive relationships bad? How do you resolve
them?
Level: Intermediate
A recursive relationship (one where a table relates to itself) is
bad when it is a hard relationship (i.e. neither side is a
"may", both are "must") as this can result in it not being
possible to put in a top or perhaps a bottom of the table (for
example in the EMPLOYEE table you couldn't put in the
PRESIDENT of the company because he has no boss, or the
junior janitor because he has no subordinates). These types of
relationships are usually resolved by adding a small
intersection entity.

5. What does a hard one-to-one relationship mean (one
where the relationship on both ends is "must")?
Level: Low to intermediate
Expected answer: This means the two entities should
probably be made into one entity.

6. How should a many-to-many relationship be handled?


Level: Intermediate
Expected answer: By adding an intersection entity table
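
A minimal sketch (STUDENT, COURSE and ENROLLMENT are hypothetical
names):

-- The intersection table carries one row per student/course pair:
CREATE TABLE enrollment (
  student_id NUMBER REFERENCES student (student_id),
  course_id  NUMBER REFERENCES course (course_id),
  PRIMARY KEY (student_id, course_id)
);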

7. What is an artificial (derived) primary key? When should
an artificial (or derived) primary key be used?
Level: Intermediate
Expected answer: A derived key comes from a sequence.
Usually it is used when a concatenated key becomes too
cumbersome to use as a foreign key.

8. When should you consider denormalization?


Level: Intermediate
Expected answer: Whenever performance analysis indicates
it would be beneficial to do so without compromising data
integrity.

UNIX:

1. How can you determine the space left in a file system?


Level: Low
Expected answer: There are several commands to do this: du,
df, or bdf

2. How can you determine the number of SQLNET users
logged in to the UNIX system?
Level: Intermediate
Expected answer: SQLNET users will show up with a process-
unique name that begins with oracle<SID>; if you do a
ps -ef|grep oracle<SID>|wc -l you can get a count of the
number of users.

3. What command is used to type files to the screen?


Level: Low
Expected answer: cat, more, pg

4. What command is used to remove a file?


Level: Low
Expected answer: rm

5. Can you remove an open file under UNIX?


Level: Low
Expected answer: yes

6. How do you create a decision tree in a shell script?


Level: intermediate
Expected answer: depending on the shell, usually a case-esac
structure, or an if-fi (sh/ksh) or if-endif (csh) structure

7. What is the purpose of the grep command?


Level: Low
Expected answer: grep is a string search command that
parses the specified string from the specified file or files

8. The system has a program that always includes the word
nocomp in its name, how can you determine the number of
processes that are using this program?
Level: intermediate
Expected answer: ps -ef|grep nocomp|wc -l

9. What is an inode?
Level: Intermediate
Expected answer: an inode is a file status indicator. It is
stored in both disk and memory and tracks file status. There is
one inode for each file on the system.

10. The system administrator tells you that the system hasn’t
been rebooted in 6 months, should he be proud of this?
Level: High
Expected answer: Maybe. Some UNIX systems don’t clean up
well after themselves. Inode problems and dead user
processes can accumulate causing possible performance and
corruption problems. Most UNIX systems should have a
scheduled periodic reboot so file systems can be checked and
cleaned and dead or zombie processes cleared out.

11. What is redirection and how is it used?
Level: Intermediate
Expected answer: redirection is the process by which input or output to or from a process is sent somewhere other than its default. This can be done using the pipe symbol "|", the greater-than symbol ">", or the "tee" command. This is one of the strengths of UNIX, allowing the output from one command to be redirected directly into the input of another command.
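
A few illustrative one-liners (the file names are placeholders):

    ps -ef > processes.txt                # redirect output to a file
    ps -ef | grep oracle                  # pipe output into another command
    ps -ef | tee processes.txt | wc -l    # write to a file and pass the output along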

12. How can you find dead processes?
Level: Intermediate
Expected answer: ps -ef | grep defunct (zombie processes usually show up as <defunct> in ps output) or who -d, depending on the system.

13. How can you find all the processes on your system?
Level: Low
Expected answer: Use the ps command

14. How can you find your id on a system?
Level: Low
Expected answer: Use the "who am i" command.

15. What is the finger command?
Level: Low
Expected answer: The finger command uses data in the
passwd file to give information on system users.

16. What is the easiest method to create a file on UNIX?
Level: Low
Expected answer: Use the touch command

17. What does >> do?
Level: Intermediate
Expected answer: The ">>" redirection symbol appends the output from the command specified to the file specified. If the file does not already exist, it is created.
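
For example (log.txt is a placeholder name):

    echo "first"  > log.txt     # creates or truncates the file
    echo "second" >> log.txt    # appends; creates log.txt if it is missing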

18. If you aren't sure what command does a particular UNIX function, what is the best way to determine the command?
Expected answer: The UNIX man -k <value> command will search the man pages for the value specified. Review the results from the command to find the command of interest.
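
For instance, to find commands related to copying files:

    man -k copy | more    # lists man page entries whose descriptions mention "copy", e.g. cp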

ORACLE TROUBLESHOOTING:

1. How can you determine if an Oracle instance is up from the operating system level?
Level: Low
Expected answer: There are several base Oracle processes that will be running on multi-user operating systems; these will be smon, pmon, dbwr and lgwr. Any answer that has them using their operating system's process-listing feature to check for these is acceptable. For example, on UNIX a ps -ef | grep dbwr will show what instances are up.

2. Users from the PC clients are getting messages indicating:
Level: Low
ORA-06114: (Cnct err, can't get err txt. See Servr Msgs & Codes Manual)

What could the problem be?

Expected answer: The instance name is probably incorrect in their connection string.

3. Users from the PC clients are getting the following error stack:
Level: Low
ERROR: ORA-01034: ORACLE not available
ORA-07318: smsget: open error when opening sgadef.dbf file.
HP-UX Error: 2: No such file or directory

What is the probable cause?

Expected answer: The Oracle instance they are trying to access is shut down; restart the instance.

4. How can you determine if the SQLNET process is running for SQLNET V1? How about V2?
Level: Low
Expected answer: For SQLNET V1 check for the existence of the orasrv process. You can use the command "tcpctl status" to get a full status of the V1 TCPIP server; other protocols have similar command formats. For SQLNET V2 check for the presence of the LISTENER process(es) or you can issue the command "lsnrctl status".

5. What file will give you Oracle instance status information? Where is it located?
Level: Low
Expected answer: The alert<SID>.log file. It is located in the directory specified by the background_dump_dest parameter, visible in the v$parameter view.
6. Users aren't being allowed on the system. The following message is received:
Level: Intermediate

ORA-00257: archiver is stuck. Connect internal only, until freed

What is the problem?

Expected answer: The archive destination is probably full; back up the archive logs and remove them, and the archiver will restart.

7. Where would you look to find out if a redo log was corrupted, assuming you are using Oracle mirrored redo logs?
Level: Intermediate

Expected answer: There is no message that comes to the SQLDBA or SRVMGR programs during startup in this situation; you must check the alert<SID>.log file for this information.

8. You attempt to add a datafile and get:
Level: Intermediate

ORA-01118: cannot add any more datafiles: limit of 40 exceeded

What is the problem and how can you fix it?

Expected answer: When the database was created, the db_files parameter in the initialization file was set to 40. You can shut down and reset this to a higher value, up to the value of MAXDATAFILES as specified at database creation. If MAXDATAFILES is set too low, you will have to rebuild the control file to increase it before proceeding.

9. You look at your fragmentation report and see that smon hasn't coalesced any of your tablespaces, even though you know several have large chunks of contiguous free extents. What is the problem?
Level: High

Expected answer: Check the dba_tablespaces view for the value of pct_increase for the tablespaces. If pct_increase is zero, smon will not coalesce their free space.
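
A sketch of the check and one possible fix on these older, dictionary-managed releases (the tablespace name is a placeholder):

    SELECT tablespace_name, pct_increase
      FROM dba_tablespaces;                  -- look for pct_increase = 0

    ALTER TABLESPACE users
      DEFAULT STORAGE (PCTINCREASE 1);       -- a nonzero value lets smon coalesce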

10. Your users get the following error:
Level: Intermediate

ORA-00055 maximum number of DML locks exceeded

What is the problem and how do you fix it?

Expected answer: The number of DML locks is set by the initialization parameter DML_LOCKS. If this value is set too low (which it is by default) you will get this error. Increase the value of DML_LOCKS. If you are sure that this is just a temporary problem, you can have them wait and then try again later and the error should clear.
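
A sketch of checking and raising the value (400 is an arbitrary example; DML_LOCKS is not dynamically alterable on these releases, so the instance must be bounced):

    SELECT name, value FROM v$parameter
     WHERE name = 'dml_locks';     -- check the current setting

    # then, in the init<SID>.ora file, before restarting:
    DML_LOCKS = 400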

Data Handling Questions

19. How did you handle the rejected data? Ans: Open the log file and the rejected file, analyze the reason for rejection of each row, modify the data in the rejected file, and then reload the data into the target tables using the reject load utility.
20. In how many ways can you load the target data?
21. Can we create a target table dynamically? How?
22. How can we use the same mapping for extracting data from a source which comes with a different name every week, without modifying the mapping?
23. What is the difference between bulk load and normal
load?
27. There are three targets "X", "Y", "Z" in a mapping. How do I view the mapping only for target "X", without having "Y" and "Z" on the screen? Ans: Select "Layout" from the toolbar, go to the "Arrange" option, and select target "X".
28. There are three targets "X", "Y", "Z" in a mapping. How do I set the load sequence so that the data gets loaded for "X", then "Y", and then "Z"? Ans: Go to "Mappings" in the toolbar and then select Target Load Plan.
30. How did you handle the data errors, say bad data?
31. How do you do a Test Load?
32. If there is no primary key on the Target Table, can we update the Target Table? Ans: No
33. What is the difference between normal load and bulk load?
34. How did you handle the rejected data? Ans: Open the rejected file and analyze the reason for rejection of each row, modify the data in the rejected file, and then reload the data into the target tables using the reject load utility.
35. What are the different sources that Informatica can handle?
36. When can we run a Stored Procedure?
Ans:
f. Normal - when the stored procedure is supposed to be executed after each and every row of data.
g. Pre-Load of the Source - before the session retrieves the data from the source.
h. Post-Load of the Source - after the session retrieves the data from the source.
i. Pre-Load of the Target - before the session sends the data to the target.
j. Post-Load of the Target - after the session sends the data to the target.

General Informatica Questions

19. Difference between PowerMart and PowerCenter, and explain their architecture?
20. How many mappings did you create?
21. Did you deal with any legacy systems?
22. What are the enhancements in Informatica 6.x?
23. How can Informatica be used as a data warehousing tool?
24. Which versions have you worked with, and what are the differences?
25. What are the enhancements from a previous version of Informatica, like 4.7 to 5.1.1? Ans: The Router transformation and the debugger were introduced, and Aggregator performance was enhanced.
26. What were the problems you faced while upgrading from
5.1.1 to 6.0?
27. What are the differences between 5.1 and 6.1?
28. How many mapplets have you created?
29. Did you do parallel ETL?
30. Have you used multiple registered servers?
31. How many records are transferred from Source to
Target?
32. How many Informatica Servers?
33. How do you get flat files from Source tables?
34. Have you worked with DB2?
35. Tell me the problems you faced in your last project?
36. What is PowerConnect? Ans: PowerConnect is an option available for PowerCenter to connect to IBM DB2, SAP and PeopleSoft.

Designer Questions

61. Every mapping should have at least three components. What are they?
62. What is a Mapplet? What is a reusable Mapplet?
63. What is mapping?
64. How can you connect to the target database without
ODBC?
65. How many transformations did you use?
66. Difference between Active Transformations and Passive
Transformations?
67. What are the transformations, which use cache?
68. What is the difference between filter and Router
Transformations?
69. What is lookup transformation? What is the difference
between connected and unconnected lookup transformations?
70. What is the Normalizer transformation?
71. Can you use the Normalizer for both normalizing and de-normalizing a record?
72. A particular mapping’s source is in a database schema.
The Schema is changed and even the columns in the source are
changed. How can we run the same mapping?
73. What are the various dd_ commands? In which transformation do you give dd_insert, dd_update and dd_delete? What are the different Update Strategies?
74. How do you calculate the index cache?
75. In the Sequence Generator, what happens when NEXTVAL is connected and CURRVAL is not? What happens if it is the reverse?
76. Can we send data from a Normalizer and a Filter to an Expression transformation?
77. Explain this transformation: IIF(ISNULL(C), IIF(UPPER(A)='SALARIED', 1, IIF(UPPER(B)='HOURLY', 2, 3)), 4). 'A' is not equal to 'SALARIED', 'B' is not equal to 'HOURLY', and 'C' is NULL. What is the output of this transformation?
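A worked evaluation under the stated conditions:

    ISNULL(C) is TRUE (C is NULL)     -> take the inner IIF
    UPPER(A) = 'SALARIED' is FALSE    -> fall through
    UPPER(B) = 'HOURLY' is FALSE      -> result is 3

So the expression returns 3; the outer value 4 would be returned only if C were not NULL.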
78. I have a sequence "Y" which gets incremented whenever the data gets loaded into table "X". How do I set "Y" to zero before the data gets loaded into table "X" the next time? Ans: Check the "RESET" option in the sequence properties.

79. IIF(ISNULL(A), NULL, IIF(ISNULL(B), 4, IIF(D='1', 0, -1)))
a. If D = 1, B = NULL and A is not null, what is the output?
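A worked evaluation under those conditions:

    ISNULL(A) is FALSE (A is not null)   -> take the else branch
    ISNULL(B) is TRUE  (B is NULL)       -> result is 4

The expression returns 4; the D test is never reached because B is NULL.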
80. What is SQL Override?
81. Did you do Error Handling? (Null Handling?)
82. Explain the complex mapping you did?
83. Purpose of the Source Qualifier transformation?
84. How do you migrate the Mappings and Sessions from Development to QA or Testing?
85. Can we use two tables from two different databases in a join in the SQL override of a Source Qualifier transformation? Ans: No
86. Have you created a Stored Procedure?
87. For which transformations do sessions run slowly, and how do you fix them?
88. What are the Various Kinds of Ports that are used in
different Transformations?
89. Designer lets you add local variables to which transformations?
90. What is pmcmd?
91. What are the two ways to validate a mapping?
92. Which Transformation is used to join heterogeneous
sources residing at different locations or File Systems?
93. Can you tell one Scenario where you used lookup
transformation
94. Have you used any of the advanced configuration options
regarding performance?
95. Can you send data into a relational table and a flat file from the same source?
96. What was the target database?
97. How do you debug a procedure (or how do I get process
details)?
98. How do you truncate a table? What about the truncate option in the target settings?
99. What are target load strategies? What you were using in
your latest project?
100. What is a router transformation?
101. Can you have a mapplet inside another mapplet? Ans: No
102. Which tool do you use to perform unlocks? Ans: Repository Manager
103. What are the added features of the Informatica 6.0 Designer?
104. What is a worklet?
105. Difference between using a joiner transformation and
SQL with multiple joins? Which do you prefer?
106. What does the normalizer transformation do?
107. Which transformations should not be used in Mapplets?
108. What is the pmrep command?
109. Write the syntax for an unconnected Lookup where the lookup name is "SATYAM" and the two values to be passed are "SATYAM COMPUTERS" and "STC".
110. IIF (ISNULL (A), DD_INSERT, DD_UPDATE), what is
the O/P?
111. How do you run the server on UNIX machines? Ans: Using the 'pmcmd' command.
112. What is the warehouse designer in Informatica?
113. You usually get flat files from legacy systems. They can
be joined with tables from relational sources using a joiner
transformation.
114. Have you used FTP connections? Ans: Yes, we used to get flat files from legacy systems. We used to create FTP connections to predefined paths on remote systems, and when the session runs Informatica gets the file from the remote system.
115. The order in which the Informatica server sends records to the various target definitions in the mapping is known as?
116. What properties should be there for the shared folder (shortcuts)?
117. Did you create a stored procedure, and what exactly did you write?
118. What is the Sorter transformation for?

Integrity Interview Questions

1. What is the process of Integrity?
2. What is standardization in Integrity?
3. What is the matching process in Integrity? Give an example.
4. Can we interact with other systems through Integrity? What programming language do you use in Integrity?

Teradata Questions

1. What is the difference between Teradata and Oracle?
2. What are the tools in Teradata? And which tools have you worked on?
3. What are the differences between FastLoad and MultiLoad?

Data Stage Questions


Where exactly did you use DataStage with these tools?
Have you ever installed the DataStage software?
How many DataStage jobs did you create there?

1) Tell us about your career profile?
2) How many years of US work experience do you have?
3) What is your role in previous project?
4) What kind of jobs you created?
5) Can you draw and explain how exactly your jobs look and their
functionality?
6) What stages you used mostly?
7) How many dimensions and facts were in your previous project?
8) What kind of issues you will get while loading Fact table?
9) What does your FACT table contain?
10) How do you handle rejects in your job?
11) What will you do when you get those records?
12) Do you have special logic for those?
13) What kind of Datawarehouse Design it is?
14) What kind of experience you have with RDBMS?
15) Did you create all jobs with tables or sequential stage?
16) Did you ever take backups?
17) How did you take them?
18) What are the components of DataStage?
19) What are active and passive stages?
20) Did you write any functions in Datastage?
21) What kind of routines you used in DataStage?
22) Did you ever use Import/export commands?
23) Did you use any scheduling tools?
24) Did you develop any unix shell scripts?
25) Did you write any procedure/packages in pl/sql?
26) What kind of performance issues you got in your jobs?
27) What is the size of your data?
28) How do you import metadata and where will you keep it and
how often?
29) Did you write constraints in DataStage?
30) What is the advantage of Datawarehouse when compared to
business data?
31) Can you handle pressure working in a team with close
deadlines?
33) What kind of BI tool was used in your previous project?
34) What kind of issues you were getting with BI tool accessing
Datawarehouse tables?
35) Did you ever use the Sequence Generator?
36) Can you explain logic of Surrogate key?
37) How do you generate Surrogate key?
38) How do you send measures in to your Fact table?
39) Why did you use Unix shell scripts?
40) How do you run shell scripts?
41) What kind of window did you use to access Oracle?
42) Did you have Development, Testing and Production servers?
43) On what platform your DataStage server is?
44) How did you generate Surrogate keys?
45) How do you rate yourself in DataStage?
46) Did you ever pass parameters?
47) If you have to pass around 200 parameters in a job, how do you handle them?

DataStage Designer Questions

1. What is the delimiter in the source file?
Usually it is '|'; sometimes it is ','.

2. Have you written any JCL scripts?
In mainframe jobs we generated JCL scripts, which we run on the mainframe machine.

3. How did you handle parameters in DataStage with UNIX shell scripts?
You have to write wrapper scripts that handle the parameters, picking them up as positional parameters ($1, $2, ...) in your UNIX script. In the DataStage Director you also have the option to pass parameters.
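
A hedged sketch of the shell side, assuming the dsjob command-line interface (the project, job and parameter names are placeholders):

    #!/bin/sh
    # pass the script's first two arguments into the job as parameters
    dsjob -run -param SRC_FILE="$1" -param LOAD_DATE="$2" MyProject MyJob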

4. What are Hash Files? Why are they used? How do you optimize them for better performance?
Hash files are basically used as lookup references to make the fetch fast. To get the maximum performance, select only the required columns when creating the file.
5. Is there any chance that a hash file gets corrupted (assuming the OS isn't the source of corruption)?
Yes, hash files can get corrupted, so always make sure that you can re-create the hash file by re-running your job.

6. What are the best practices to handle a hash file? What is the limitation on the size of a hash file in DataStage 6.0?
The maximum size of a hash file is 2 GB by default. To create a hash file larger than that, you need to create it with the 64-bit option.

7. If the hash file gets corrupted (partially or fully), will there be any error whenever we use that particular hash file in an ETL job, or do we need to find it out after loading the junk data into the target stage?
The usual error message is "unable to open", which will abort your job fairly quickly - certainly before any rows are processed.

8. What is the difference in setting 'MFILES' in uvconfig and setting 'ulimit' in dsenv? If both are set to different values, what happens?

9. How can you generate a surrogate key in DataStage? I need to generate a sequence.
There are a few functions in DataStage which create a column of unique numbers, but from version 7.0 there is a separate Surrogate Key stage.
10. How can I suppress leading zeros? For example, my job is writing out 0001 but I want to write out 1; 010 should become 10, and 000023 should become 23.
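One possible approach in a transformer derivation (In.Col is a placeholder column): forcing arithmetic makes DataStage BASIC treat the string as a number, which drops the leading zeros, and Trim can do the same explicitly.

    In.Col + 0                (numeric coercion: "0001" becomes 1)
    Trim(In.Col, "0", "L")    (strips leading "0" characters: "000023" becomes "23")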
11. My source files are from mainframes and all are in EBCDIC format. Do I need to convert them to ASCII before using the files in DataStage? If yes, is there support from DataStage for doing so, and how expensive is the operation (in terms of coding effort)? Will it have a performance issue?
Ans: You can use the DS Ascii() function in a transformer to do the conversion, or you could ask that the files be supplied in ASCII format - there are special utilities on the mainframe that do just that. You could also use the Complex Flat File stage: on the output page, general tab, you can choose the data format as ASCII or EBCDIC. One caveat: only convert 'unpacked' fields to ASCII. If you have any packed fields (COMP-1 or COMP-3 fields, for example), they need to be handled separately and *not* converted. There are routines in the SDK for handling packed fields.

12. Will a DataStage job bomb when it runs out of disk space?
The answer is YES.

13. Will it bomb in such a way as to be difficult to reset?
The answer is YES.

14. Will it bomb in such a way that you have to do extreme things to reset it to a runnable state?
The answer is YES.

15. What is a reject link, and where do you use this?
There is nothing called a reject link in DataStage, so you need to send the data into a file based on a condition.

16. What is a Stream link and a Reference link?
A stream link is used with both active and passive stages, whereas a reference link is used only with active stages.

18. Have you written any BASIC subroutines in DataStage?
Answer: Yes. A subroutine is a self-contained set of instructions that performs a specific task.

19. What is the "Iconv" function?
Answer: It converts a string to an internal storage format.
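
A hedged example pair (exact conversion codes can vary): Oconv is the inverse, converting internal format back to external; internal day 0 is 31 DEC 1967.

    Iconv("31 DEC 1967", "D")    returns 0 (the internal day number)
    Oconv(0, "D")                returns "31 DEC 1967"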

20. I have two sequential files that I need to load into a stage table. The files are both the same format, just two different subsets of data based on time. How do I load both files in one job?

21. We are trying to construct a job where we update/insert into a dimension table in Oracle using the OCI plugin.
22. How were you handling complex flat files on mainframes?
The Complex Flat File stage is used for multi-dimensional flat files.
23. How many mainframe job stages have you used?
We have lookup, sort, complex flat file, delimited, fixed-width and multi-dimensional flat file stages.

24. Did you ever use the Lookup stage in mainframe jobs? If yes, how many types of lookups do you know?
We have only one type of lookup in mainframe jobs. Lookup is done only on a database, not on a flat file.
25. Did you ever use routines? If yes, please tell which ones.
We have an External Routine stage through which we can call a COBOL routine in a mainframe job.

26. What other inbuilt functions are there? Did you use any functions other than Iconv and Oconv?
We have a lot of functions like Trim, Cast and many timestamp functions.

27. What are constraints and derivations?
Constraints are written in the transformer stage; they are conditions based on which data is sent to the target.

28. Do you know about job control?
Job control is used for scheduling jobs if you don't have the job sequencer.

29. How many stages have you used in server jobs? Name and explain them.

30. Name the stages which are different in mainframe, parallel and server jobs.
In mainframe jobs we have the Lookup stage; in parallel jobs we have the Merge stage.

MetaRecon Questions

Where have you used MetaRecon?
What is the use of MetaRecon?
Why do we need to use MetaRecon?

Telephonic interviews – how to speak right


Many years after Alexander Graham Bell invented the telephone for basic communication purposes, we are using it on an entirely new platform: telephonic interviews. Yes, that's the latest concept in interview methods. These days, when candidates are screened from across the globe, who has the time to take off from their current job and travel to attend an interview which, for all you know, will be only the first round? So telephonic interviews have emerged as a godsend in these days of efficient time management.

We have a few tips for you to remember in case you have to face a telephonic interview.

First and foremost, when you get to know that you are
scheduled to have a telephonic interview with a company, do
a little bit of research. Homework always comes in handy
here. So gather some information about the company and
note it down in your note pad.

The number you have given your prospective employer should preferably be your residence number, where you will be relaxed. Disturbance of any kind is unwanted.

Most professionally managed groups will fix a time with you so as to ensure that you will be sufficiently free and relaxed. Make sure you are available at the appointed time. It would be a shame to miss out on this opportunity, wouldn't it?

Before you get your all-important phone call, there are some things to keep ready. Make sure you have a note pad and pen with you, to jot down all relevant details about the company and any queries that you might have to ask. This, says Appaji S.N. Rao, an Assistant Professor at Gandhi Institute of Technology & Management, Vizag, India, is a very important factor to be kept in mind during a telephonic interview.
When the phone call comes through, make sure you pick up the call yourself and introduce yourself properly. Get your interviewer's name right, as you wouldn't want to call Mr. Allan King "Mr. Fig", or commit some such inexcusable faux pas.

Pay attention to the questions and do not interrupt. And if it happens to be an international call, there tends to be a slight delay in speech. So keep this in mind and speak slowly and deliberately.

"Another important factor to keep in mind here is that what goes across to the person at the other end is only your voice; body language and gesticulations can't be seen. So ensure that your voice is well modulated and DO NOT HAVE LONG PAUSES INTERSPERSED WITH ER... UM... MAYBE... I GUESS - all strictly avoidable. It sends very wrong signals to the person on the other end," says Appaji.

Although you are indeed not in front of the interviewer, it helps to presume the person at the other end can see you. Be well dressed and do not lounge on the bed. Sit down at a table with your resume and note pad in front of you. "Be formally dressed and go through your interview seriously and believe me, it will be reflected at the other end," reminds Appaji.

Do not at any point ask for details about remuneration, perquisites, etc. This is surely not the final interview. Wait until the interviewer brings up the subject.

Finally, remember to thank the interviewer for his time and enquire as to when he will get back to you to further the discussion.

Telephonic interviews are gaining in popularity, especially in project-based companies. Appaji says that in today's time, when people are being hired from across the world, it is not possible to be at an appointed place at short notice, so the telephonic mode of interviewing is the simpler option. At Novell, they peruse the resumes and the short-listed candidates are thereafter interviewed telephonically. The candidate is checked for his technical expertise. While across the table he would be asked to draw a project he has handled, over the telephone he has to explain the entire project. This requires him to be thorough with his work. Further, candidates have to project their technical depth adequately in order to impress the interviewer. Thereafter, if required, they are met personally whenever they come down, or else the technical panel goes to the candidate's city of work and meets with them.

This concept is fast catching up with people looking to go West, especially in the software industry, where most interviews are conducted entirely over the telephone. Even MNCs that are setting up base here source people from abroad through telephonic interviews.

So all of you out there reading this: be confident, pick up the telephone and speak forth with élan. Don't fret, and surely the appointment letter will soon be in your hands.

All the best with your next telephonic interview!
