
1. What is Informatica PowerCenter?


Answer: Informatica PowerCenter is one of the Enterprise Data Integration products
developed by Informatica Corporation. Informatica PowerCenter is an ETL tool used
for extracting data from a source, transforming it, and loading it into a target.
The extraction part involves understanding, analyzing and cleaning the source data.
The transformation part involves cleaning the data more precisely and modifying it
as per the business requirements.
The loading part involves assigning the dimensional keys and loading the data into the
warehouse.

2. What is Mapplet in INFORMATICA?


Answer: A Mapplet is a reusable object in Informatica that contains a set of
transformations and transformation logic that can be used in multiple mappings.
A Mapplet is created in the Mapplet Designer within the Designer tool.

3. What is the Session task and Command task?


Answer: A Session task is a set of instructions that are applied while transferring
data from source to target. A session command can be either a pre-session command
or a post-session command.

A Command task is a specific task that allows one or more shell commands in UNIX, or
DOS/batch commands in Windows, to run during the workflow.

4. What is a Standalone command task?


Answer: The standalone Command task can be used to run shell commands anywhere and
at any time in the workflow.

5. What is Workflow? What are the components of the Workflow Manager?


Answer: A workflow is a collection of instructions that tells the server how and in
what order the tasks should be executed.

Given below are the three major components (client tools) of the Workflow Manager:

Task Developer
Worklet Designer
Workflow Designer

6. Describe the scenarios where we go for Joiner transformation instead of Source Qualifier transformation?

Answer: We use the Joiner transformation when we need to join source data from
heterogeneous sources or flat files. Use the Joiner transformation to join the
following types of sources:

Join data from different relational databases.
Join data from different flat files.
Join relational sources and flat files.

7. Describe the impact of several join conditions and join order in a Joiner
Transformation?
Answer: We can define one or more conditions based on equality between the
specified master and detail sources. Both ports in a condition must have the same
datatype.

If we need to use two ports in the join condition with non-matching datatypes we
must convert the datatypes so that they match. The Designer validates datatypes in
a join condition.

Additional ports in the join condition increase the time necessary to join two
sources.

The order of the ports in the join condition can impact the performance of the Joiner
transformation. If we use multiple ports in the join condition, the Integration Service
compares the ports in the order we specified.

8. What are the different LookUp Caches?


Answer:
An Informatica lookup can be either cached or uncached. The lookup cache is divided
into five types:

 Static Cache
 Dynamic Cache
 Recache
 Persistent Cache
 Shared Cache

A Static Cache remains unchanged while a session is running.

A Dynamic Cache is updated continuously while a session is running.

9. What is Source Qualifier Transformation in INFORMATICA?


Answer: The Source Qualifier transformation is automatically created whenever we add
a relational or flat file source to a mapping. It is an active and connected
transformation that represents the rows read by the Integration Service.

10. Suppose we have two Source Qualifier transformations SQ1 and SQ2 connected
to Target tables TGT1 and TGT2 respectively. How do you ensure TGT2 is loaded
after TGT1?
Answer: If we have multiple Source Qualifier transformations connected to multiple
targets, we can designate the order in which the Integration Service loads data into
the targets.

In the Mapping Designer, we need to configure the Target Load Plan based on the
Source Qualifier transformations in the mapping to specify the required loading order.

11. What are the different tools in the Workflow Manager?


Answer:
The different tools in the workflow manager are:

 Task Developer
 Worklet Designer
 Workflow Designer
12. Differentiate between a repository server and a powerhouse?
Answer: The repository server mainly guarantees repository reliability and consistency,
while the powerhouse server handles the execution of various processes between the
factors of the server's database repository.

13. What do you understand by a term domain?


Answer: A domain is the primary organizational unit through which all interlinked
nodes and relationships are administered from a single point.

14. What is the use of an aggregator cache file?


Answer: If extra memory is needed, the Aggregator provides extra cache files for keeping
the transformation values. It also keeps the intermediate values that are held in the
local buffer memory.

15. What are the advantages of Informatica?


Answer:
It is a GUI tool; developing in a graphical tool is generally faster than hand-coded scripting.
It can communicate with all major data sources (mainframe/RDBMS/flat files/XML/VSAM/SAP, etc.).
It can handle very large volumes of data effectively.
Mappings, extraction rules, cleansing rules, transformation rules, aggregation logic and
loading rules are separate objects in the ETL tool, so a change in any one object has
minimal impact on the others.
Objects (transformation rules) are reusable.
Informatica has different "adapters" for extracting data from packaged ERP
applications (such as SAP or PeopleSoft).
Resources are readily available in the market.
It can run on both Windows and UNIX environments.

16. What are pre and post-session shell commands?


Answer: A Command task can be called as the pre- or post-session shell command for a
Session task. One can run it as a pre-session command, a post-session success
command or a post-session failure command.

17. How can we create indexes after completing the load process?
Answer: With the help of the Command task at the session level, we can create indexes
after the loading procedure.

18. What are the advantages of using Informatica as an ETL tool over Teradata?
Answer: First up, Informatica is a data integration tool, while Teradata is an MPP
database with some scripting (BTEQ) and fast data movement (MLoad, FastLoad,
Parallel Transporter, etc.) capabilities.

Informatica over Teradata:

1) Metadata repository for the organization's ETL ecosystem. Informatica jobs (sessions)
can be arranged logically into worklets and workflows in folders. This leads to an
ecosystem that is easier to maintain and quicker for architects and analysts to analyze
and enhance.
2) Job monitoring and recovery – it is easy to monitor jobs using the Informatica Workflow
Monitor, easier to identify and recover failed or slow-running jobs, and possible to
restart from the failure row/step.
3) Informatica MarketPlace – a one-stop shop for lots of tools and accelerators to make
the SDLC faster and improve application support.
4) Plenty of developers in the market with varying skill levels and expertise.
5) Lots of connectors to various databases, including support for Teradata MLoad, TPump,
FastLoad and Parallel Transporter in addition to the regular (and slow) ODBC drivers.
Some 'exotic' connectors may need to be procured and hence could cost extra; examples
are PowerExchange for Facebook, Twitter, etc., which source data from such social media
sources.
6) Surrogate key generation through shared sequence generators inside Informatica could
be faster than generating them inside the database.
7) If the company decides to move away from Teradata to another solution, then vendors
like Infosys can execute migration projects to move the data and change the ETL code to
work with the new database quickly, accurately and efficiently using automated solutions.
8) Pushdown optimization can be used to process the data in the database.
9) Ability to code ETL such that the processing load is balanced between the ETL server
and the database box – useful if the database box is aging and/or the ETL server has a
fast disk / large enough memory and CPU to outperform the database in certain tasks.
10) Ability to publish processes as web services.

Teradata over Informatica:

Cheaper (initially) – no initial ETL tool license costs (which can be significant), and
lower OPEX costs as one doesn't need to pay for yearly support from Informatica Corp.
A great choice if all the data to be loaded is available as structured files, which can
then be processed inside the database after an initial stage load.
A good choice for a lower-complexity ecosystem.
Only Teradata developers or resources with good ANSI/Teradata SQL/BTEQ knowledge are
required to build and enhance the system.

19. How to elaborate PowerCenter Integration Service?


Answer: The Integration Service controls the workflow and execution of PowerCenter
processes.

There are three components of the Informatica Integration Service:

Integration Service Process: The Integration Service can start one or more Integration
Service processes to run and monitor workflows.

Load Balancing: Load balancing refers to distributing the entire workload across
several nodes in the grid. The Load Balancer dispatches different tasks, including
commands, sessions, etc.

Data Transformation Manager (DTM): The Data Transformation Manager performs the actual
data transformations. Transformations can be:

Active: Can change the number of rows in the output.
Passive: Cannot change the number of rows in the output.
Connected: Linked to other transformations.
Unconnected: Not linked to other transformations.

20. What is PowerCenter on Grid?


Answer: Informatica has a Grid Computing feature that can be utilized to scale
performance for very large data volumes. The grid feature is used for load balancing
and parallel processing.
A PowerCenter domain contains a set of nodes; the workload can be configured to run
on the grid across these nodes.

A domain is the foundation for efficient service administration in PowerCenter.

A node is an independent physical machine that is logically represented for running
the PowerCenter environment.

21. Name the different lookup cache(s)?


Answer: Informatica lookups can be cached or un-cached (no cache). Cached
lookups can be either static or dynamic. A lookup cache can also be divided as
persistent or non-persistent based on whether Informatica retains the cache even
after completing the session run or if it deletes it.

 Static cache
 Dynamic cache
 Persistent cache
 Shared cache
 Recache
22. What are the various types of transformation?
Answer:

 Aggregator transformation
 Expression transformation
 Filter transformation
 Joiner transformation
 Lookup transformation
 Normalizer transformation
 Rank transformation
 Router transformation
 Sequence generator transformation
 Stored procedure transformation
 Sorter transformation
 Update strategy transformation
 XML source qualifier transformation
23. When do you use SQL override in a lookup transformation?
Answer: You should override the lookup query in the following circumstances:

 Override the ORDER BY clause. Create the ORDER BY clause with fewer columns to
increase performance. When you override the ORDER BY clause, you must suppress the
generated ORDER BY clause with a comment notation (a sketch follows this list).
Note: If you use pushdown optimization, you cannot override the ORDER BY clause
or suppress the generated ORDER BY clause with a comment notation.
 A lookup table name or column name contains a reserved word. If the table name
or any column name in the lookup query contains a reserved word, you must ensure
that they are enclosed in quotes.
 Use parameters and variables. Use parameters and variables when you enter a
lookup SQL override. Use any parameter or variable type that you can define in the
parameter file. You can enter a parameter or variable within the SQL statement, or
use a parameter or variable as the SQL query. For example, you can use a session
parameter, $ParamMyLkpOverride, as the lookup SQL query, and set
$ParamMyLkpOverride to the SQL statement in a parameter file. The Designer cannot
expand parameters and variables in the query override and does not validate them
when you use a parameter or variable. The Integration Service expands the parameters
and variables when you run the session.
 A lookup column name contains a slash (/) character. When generating the default
lookup query, the Designer and Integration Service replace any slash character (/) in
the lookup column name with an underscore character. To query lookup column names
containing the slash character, override the default lookup query, replace the
underscore characters with the slash character, and enclose the column name in
double quotes.
 Add a WHERE clause. Use a lookup SQL override to add a WHERE clause to the
default SQL statement. You might want to use the WHERE clause to reduce the
number of rows included in the cache. When you add a WHERE clause to a Lookup
transformation using a dynamic cache, use a Filter transformation before the Lookup
transformation to pass rows into the dynamic cache that match the WHERE clause.
Note: The session fails if you include large object ports in a WHERE clause.
 Other. Use a lookup SQL override if you want to query lookup data from multiple
lookups or if you want to modify the data queried from the lookup table before the
Integration Service caches the lookup rows. For example, use TO_CHAR to convert
dates to strings.
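As a rough illustration of the ORDER BY and WHERE points above, here is a sketch of a
lookup SQL override, assuming a hypothetical CUSTOMER_DIM lookup table with ports
CUST_ID and CUST_NAME; the trailing comment notation (--) is what suppresses the ORDER
BY clause that the Integration Service would otherwise append:

SELECT CUSTOMER_DIM.CUST_NAME AS CUST_NAME,
       CUSTOMER_DIM.CUST_ID AS CUST_ID
FROM CUSTOMER_DIM
WHERE CUSTOMER_DIM.ACTIVE_FLAG = 'Y'
ORDER BY CUST_ID --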

24. What is the difference between a repository server and a powerhouse?


Answer: The repository server controls the complete repository, which includes tables,
charts, various procedures, etc.

A powerhouse server governs the implementation of various processes among the
factors of the server's database repository.

25. What are the transformations that are not supported in Mapplet?
Answer: Normalizer, Cobol sources, XML sources, XML Source Qualifier
transformations, Target definitions, Pre- and post-session Stored Procedures, Other
Mapplets.
26. Describe Data Concatenation?
Answer: Data concatenation is the process of bringing different pieces of a record
together.

27. Differentiate between sessions and batches?


Answer: A session is a set of commands for the server to move data to the target.

A batch is a group of one or more sessions that the server can run sequentially or
concurrently.

28. What are data-driven sessions?


Answer: When you configure a session using an Update Strategy transformation, the
session property "Treat source rows as: Data driven" instructs the Informatica server
to use the instructions coded in the mapping to flag the rows for insert, update,
delete or reject. This is done by specifying DD_UPDATE, DD_INSERT, DD_DELETE or
DD_REJECT in the Update Strategy transformation.

“Treat source rows as” property in session is set to “Data-Driven” by default when
using an update strategy transformation in a mapping.
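For example, a minimal sketch of an Update Strategy expression (the ports CUST_KEY and
CHANGED_FLAG are hypothetical) could be:

IIF( ISNULL(CUST_KEY), DD_INSERT,
     IIF( CHANGED_FLAG = 'Y', DD_UPDATE, DD_REJECT ) )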

29. What is the need for an ETL tool?


Answer: The problem comes with traditional programming languages where you
need to connect to multiple sources and you have to handle errors. For this, you have
to write complex code. ETL tools provide a ready-made solution for this. You don’t
need to worry about handling these things and can concentrate only on coding the
required part.

30. Which is the T/R that builds only a single cache memory?
Answer: Rank can build two types of cache memory, but the Sorter always builds only
one cache memory.
– The cache is also called a buffer.

31. What is a Snowflake Schema?


Answer:
In a snowflake schema, a large denormalized dimension table is split into multiple
normalized dimension tables.

Advantage:
Select query performance increases.

Disadvantage:
Maintenance cost increases due to a greater number of tables.
32. What is Mapping Debugger?
Answer:
– The Debugger is a tool. By using it, we can identify whether records are loaded or
not and whether correct data is passed from one T/R to another T/R.
– If a session succeeded but records are not loaded, we have to use the Debugger tool.

33. What is a Repository Manager?


Answer: It is a GUI-based administrative client which allows performing the following
administrative tasks:

1. Create, edit and delete folders.
2. Assign users access to the folders with read, write and execute permissions.
3. Back up and restore repository objects.

34. Scheduling a Workflow?


Answer:

1. A schedule is automation of running the workflow at a given date and time.


2. There are 2 types of schedulers:

(i) Reusable scheduler


(ii) Non Reusable scheduler

(i) Reusable scheduler:-


A reusable scheduler can be assigned to multiple workflows.

(ii) Non-Reusable scheduler:-


– A nonreusable scheduler is created specifically for the workflow.
– A nonreusable scheduler can be converted into a reusable scheduler.

The following are 3rd-party schedulers:

1. Cron (UNIX-based scheduling)
2. Tivoli
3. Control-M
4. Autosys
5. Tidal
6. WLM (Workload Manager)

– In most projects, the production support team handles the scheduling.
– Without a scheduler we run the workflow manually; running a workflow through a
schedule is called auto running.

35. What is Workflow Monitor?


Answer:

i. It is a GUI-based client application that allows users to monitor ETL objects running
on the ETL server.
ii. It collects runtime statistics such as:

a. No. of records extracted.
b. No. of records loaded.
c. No. of records rejected.
d. Fetch session log.
e. Throughput.

– Complete information can be accessed from the Workflow Monitor.
– For every session, one log file is created.

36. If Informatica has its own scheduler, why use a third-party scheduler?
Answer: Clients use various applications (mainframes, Oracle Apps, etc., which often
use scheduling tools such as Tivoli). Integrating those different applications and
scheduling them together is much easier with a third-party scheduler.

37. What is a Dimensional Model?


Answer:

1. Data modeling is the process of designing the database to fulfill the business
requirement specifications.
2. A data modeler (or database architect) designs the warehouse database using a
GUI-based data modeling tool called "Erwin".
3. Erwin is a data-modeling tool from Computer Associates (CA).
4. Dimensional modeling consists of the following types of schemas designed for a
data warehouse:

a. Star schema.
b. Snowflake schema.
c. Galaxy schema (fact constellation).
5. A schema is a data model that consists of one or more tables.

38. What are the new features of Informatica 9.x at the developer level?
Answer: From a developer’s perspective, some of the new features in Informatica 9.x
are as follows:
Now Lookup can be configured as an active transformation – it can return multiple
rows on a successful match
Now you can write SQL override on an un-cached lookup also. Previously you could
do it only on cached lookup
You can control the size of your session log. In a real-time environment, you can
control the session log file size or time
Database deadlock resilience feature – this will ensure that your session does not
immediately fail if it encounters a database deadlock; it will retry the operation.
You can configure the number of retry attempts.

39. Suppose we do not group by on any ports of the aggregator what will be the
output?
Answer: If we do not group values, the Integration Service will return only the last
row for the input rows.

40. Give one example for each of Conditional Aggregation, Non-Aggregate


expression, and Nested Aggregation?
Answer: Use conditional clauses in the aggregate expression to reduce the number
of rows used in the aggregation. The conditional clause can be any clause that
evaluates to TRUE or FALSE.
SUM( SALARY, JOB = 'CLERK' )
Use non-aggregate expressions in the group by ports to modify or replace groups.
IIF( PRODUCT = 'Brown Bread', 'Bread', PRODUCT )
The expression can also include one aggregate function nested within another
aggregate function, such as:
MAX( COUNT( PRODUCT ) )

41. What is a Rank Transform?


Answer: Rank is an Active Connected Informatica transformation used to select a
set of top or bottom values of data.

42. How does a Rank Transform differ from Aggregator Transform functions MAX
and MIN?
Answer: Like the Aggregator transformation, the Rank transformation lets us group
information. The Rank transformation allows us to select a group of top or bottom
values, not just one value as with the Aggregator MAX and MIN functions.

43. What are the restrictions of Union Transformation?


Answer:

1. All input groups and the output group must have matching ports. The precision,
datatype, and scale must be identical across all groups.
2. We can create multiple input groups, but only one default output group.
3. The Union transformation does not remove duplicate rows.
4. We cannot use a Sequence Generator or Update Strategy transformation
upstream from a Union transformation.
5. The Union transformation does not generate transactions.

44. What is Persistent Lookup Cache?


Answer: Lookups are cached by default in Informatica. Lookup cache can be either
non-persistent or persistent. The Integration Service saves or deletes lookup cache
files after a successful session run based on whether the Lookup cache is checked
as persistent or not.

1. How to remove duplicate records in Informatica? Explain the different ways to do it?
Answer:

There are many ways of eliminating duplicates:

1. If there are duplicates in the source database, a user can use the 'Select Distinct'
property in the Source Qualifier: go to the Transformation tab and check the 'Select
Distinct' option. A user can also use a SQL override for the same purpose: go to the
Properties tab and write a distinct query in the SQL Query field (a sketch follows this
list).

2. A user can use an Aggregator and select ports as the group-by key to get distinct
values. If a user wishes to find duplicates across entire rows, then all ports should be
selected as group-by keys.

3. The user can also use a Sorter with the Sort Distinct property to get distinct values.

4. Expression and Filter transformations can also be used to identify and remove
duplicate data. If data is not sorted, then it needs to be sorted first.

5. When the property in the Lookup transformation is changed to use a dynamic cache, a
new port is added to the transformation. This cache is updated as and when data is read.
If a source has duplicate records, the user can look in the dynamic lookup cache and
then a Router selects only one distinct record.
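For point 1 above, a minimal sketch of a Source Qualifier SQL override (assuming a
hypothetical EMPLOYEES source table) simply adds DISTINCT to the query:

SELECT DISTINCT EMPLOYEES.EMP_ID,
                EMPLOYEES.EMP_NAME,
                EMPLOYEES.SALARY
FROM EMPLOYEES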

2. What is the difference between Source Qualifier and Filter transformation?
Answer:

The Source Qualifier transformation is used to represent the rows that the Integration
Service reads in a session. It is an active transformation. Using a Source Qualifier,
the following tasks can be accomplished:

1. When two tables from the same source database have a primary key – foreign key
relationship, the sources can be linked to one Source Qualifier transformation.

2. Filtering rows, when the Integration Service adds a WHERE clause to the user's
default query.

3. When a user wants an outer join instead of an inner join, the join information is
replaced by metadata specified in the SQL query.

4. When sorted ports are specified, the Integration Service adds an ORDER BY clause
to the default query.

5. If a user chooses to find distinct values, the Integration Service adds SELECT
DISTINCT to the specified query.

When the data we need to filter is not from a relational source, the user should use a
Filter transformation. It lets rows that meet the specified filter condition pass
through and directly drops the rows that do not meet the condition. Multiple conditions
can be specified.

3. Design a mapping to load the last 3 rows from a flat file into the target?
Answer:

Suppose the flat file in consideration has the below data:

Column A

Aanchal

Priya

Karishma

Snehal

Nupura

Step 1: Assign row numbers to each record. Generate row numbers using an Expression
transformation by creating a variable port and incrementing it by 1. Then assign this
variable port to an output port. After the Expression transformation, the ports will
be as –


Variable_count= Variable_count+1

O_count=Variable_count

Create a dummy output port in the same Expression transformation and assign 1 to that
port. This dummy port will always return 1 for each row.

Finally, the transformation expression will be as follows:

Variable_count= Variable_count+1

O_count=Variable_count

Dummy_output=1

The output of this transformation will be:

Column A O_count Dummy_output
Aanchal 1 1
Priya 2 1
Karishma 3 1
Snehal 4 1
Nupura 5 1

Step 2: Pass the above output to an Aggregator and do not specify any group by
condition. A new output port O_total_records should be created in the Aggregator and
the O_count port assigned to it. The Aggregator will return the last row. This step's
final output will have the dummy port with value 1 and O_total_records holding the
total number of records in the source. The Aggregator output will be:

O_total_records Dummy_output
5 1

Step 3: Pass this output to a Joiner transformation and apply a join on the dummy port.
The property 'Sorted Input' should be checked in the Joiner transformation; only then
can the user connect both the Expression and Aggregator transformations to the Joiner
transformation. The Joiner transformation condition will be as follows:

Dummy_output (port from Aggregator transformation) = Dummy_output (port from Expression
transformation)

The output of the Joiner transformation will be:

Column A o_count o_total_records
Aanchal 1 5
Priya 2 5
Karishma 3 5
Snehal 4 5
Nupura 5 5

Step 4: After the Joiner transformation, we can send this output to a Filter
transformation and specify the filter condition as O_total_records (port from
Aggregator) - O_count (port from Expression) <= 2

The filter condition, as a result, will be:

O_total_records – O_count <= 2

The final output of the Filter transformation will be:

Column A o_count o_total_records
Karishma 3 5
Snehal 4 5
Nupura 5 5

4. How to load only NULL records into the target? Explain using mapping flow?
Answer:

Consider the below data as a source:

Emp_Id Emp_Name Salary City Pincode
619101 Aanchal Singh 20000 Pune 411051
619102 Nupura Pattihal 35000 Nagpur 411014
NULL NULL 15000 Mumbai 451021

The target tables have the same structure as the source. We will have two target
tables, one containing the NULL records and the other containing the non-NULL records.

The mapping can be as:


SQ –> EXP –> RTR –> TGT_NULL/TGT_NOT_NULL

EXP – Expression transformation create an output port

O_FLAG= IIF ( (ISNULL(emp_id) OR ISNULL(emp_name) OR ISNULL(salary)

OR ISNULL(City) OR ISNULL(Pincode)), ‘NULL’,’NNULL’)

RTR – Router transformation two groups

Group 1 connected to TGT_NULL ( Expression O_FLAG=’NULL’)

Group 2 connected to TGT_NOT_NULL ( Expression O_FLAG=’NNULL’)

5. Explain how the performance of the joiner condition can be increased?
Answer:

The performance of the joiner condition can be increased by following some simple steps:

1. The user should perform joins in the database whenever possible. When this is not
possible for some tables, the user can create a stored procedure and then join the
tables in the database.

2. Data should be sorted before applying the join whenever possible.

3. When data is unsorted, the source with fewer rows should be considered the master
source.

4. For a sorted Joiner transformation, the source with fewer duplicate key values
should be considered the master source.

1. Differentiate between Source Qualifier and Filter Transformation?

Source Qualifier vs Filter Transformation:

1. The Source Qualifier filters rows while reading the data from a source; the Filter
transformation filters rows from within the mapping.
2. The Source Qualifier can filter rows only from relational sources; the Filter can
filter rows from any type of source system.
3. The Source Qualifier limits the row set extracted from a source; the Filter limits
the row set sent to a target.
4. The Source Qualifier enhances performance by minimizing the number of rows used in
the mapping; the Filter is added close to the source to filter out unwanted data early
and maximize performance.
5. In the Source Qualifier, the filter condition uses standard SQL executed in the
database; the Filter defines a condition using any statement or transformation function
that evaluates to TRUE or FALSE.

2. How do you remove Duplicate records in Informatica? And how many ways are there to
do it?
There are several ways to remove duplicates.

i. If the source is a DBMS, you can use the property in the Source Qualifier to select
distinct records.
Or you can also use a SQL override to perform the same.

ii. You can use an Aggregator and select all the ports as key to get the distinct
values. After you pass all the required ports to the Aggregator, select as group-by
keys those ports you need for de-duplication. If you want to find the duplicates based
on the entire columns, select all the ports as group-by keys.
The mapping will look like this.

iii. You can use a Sorter and the Sort Distinct property to get the distinct values.
Configure the Sorter in the following way to enable this.

iv. You can use Expression and Filter transformations to identify and remove duplicates
if your data is sorted. If your data is not sorted, then you may first use a Sorter to
sort the data and then apply this logic:

 Bring the source into the Mapping Designer.
 Let's assume the data is not sorted. We are using a Sorter to sort the data. The key
for sorting would be Employee_ID. Configure the Sorter as mentioned below.
 Use one Expression transformation to flag the duplicates. We will use variable ports
to identify the duplicate entries, based on Employee_ID, as shown in the sketch after
this list.
 Use a Filter transformation to pass only IS_DUP = 0. From the previous Expression
transformation, IS_DUP = 0 is attached only to records which are unique. If IS_DUP > 0,
those are duplicate entries.
 Add the ports to the target. The entire mapping should look like this.
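A minimal sketch of the variable-port logic in step iv (assuming the data is sorted on
Employee_ID; the port names are illustrative, and the ports must be kept in this order
so the comparison sees the previous row's value):

V_IS_DUP (variable port)      = IIF( Employee_ID = V_PREV_EMP_ID, 1, 0 )
V_PREV_EMP_ID (variable port) = Employee_ID
IS_DUP (output port)          = V_IS_DUP

Filter transformation condition: IS_DUP = 0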

v. When you change the property of the Lookup transformation to use the Dynamic Cache,
a new port, NewLookupRow, is added to the transformation.

The Dynamic Cache can update the cache as and when it is reading the data.

If the source has duplicate records, you can also use a Dynamic Lookup cache and then a
Router to select only the distinct ones.

3. What are the differences between Source Qualifier and Joiner Transformation?
The Source Qualifier can join data originating from the same source database. We can
join two or more tables with primary key–foreign key relationships by linking the
sources to one Source Qualifier transformation.

If we have a requirement to join mid-stream or the sources are heterogeneous, then we
will have to use the Joiner transformation to join the data.

4. Differentiate between joiner and Lookup Transformation.


Below are the differences between lookup and joiner transformation:

 In Lookup we can override the query, but in Joiner we cannot.
 In Lookup we can use different types of operators like >, <, >=, <=, !=, but in
Joiner only the = (equal to) operator is available.
 In Lookup we can restrict the number of rows while reading the relational table
using a lookup override, but in Joiner we cannot restrict the number of rows while
reading.
 In Joiner we can join the tables based on Normal Join, Master Outer, Detail Outer
and Full Outer Join, but in Lookup this facility is not available. Lookup behaves like
a Left Outer Join of the database.

5. What is meant by Lookup Transformation? Explain the types of Lookup


transformation.
Lookup transformation in a mapping is used to look up data in a flat file,
relational table, view, or synonym. We can also create a lookup definition from a
source qualifier.

We have the following types of Lookup.


 Relational or flat file lookup. To perform a lookup on a flat file or a relational
table.
 Pipeline lookup. To perform a lookup on application sources such as JMS or
MSMQ.
 Connected or unconnected lookup.
o A connected Lookup transformation receives source data, performs a
lookup, and returns data to the pipeline.
o An unconnected Lookup transformation is not connected to a source or
target. A transformation in the pipeline calls the Lookup transformation
with a :LKP expression. The unconnected Lookup transformation returns
one column to the calling transformation.
 Cached or un-cached lookup. We can configure the Lookup transformation to
cache the lookup data or directly query the lookup source every time the lookup
is invoked. If the lookup source is a flat file, the lookup is always cached.

6. How can you increase the performance in joiner transformation?


Below are the ways in which you can improve the performance of Joiner
Transformation.

 Perform joins in a database when possible.


In some cases, this is not possible, such as joining tables from two different
databases or flat file systems. To perform a join in a database, we can use the
following options:
Create and Use a pre-session stored procedure to join the tables in a database.
Use the Source Qualifier transformation to perform the join.

 Join sorted data when possible


 For an unsorted Joiner transformation, designate the source with fewer rows as
the master source.
 For a sorted Joiner transformation, designate the source with fewer duplicate key
values as the master source.

7. What are the types of Caches in lookup? Explain them.


Based on the configurations done at lookup transformation/Session Property
level, we can have following types of Lookup Caches.

 Un-cached lookup – Here, the Lookup transformation does not create a cache.
For each record, it goes to the lookup source, performs the lookup and returns a
value. So for 10K rows, it will go to the lookup source 10K times to get the related
values.
 Cached lookup – In order to reduce the to-and-fro communication between the
lookup source and the Informatica server, we can configure the Lookup
transformation to create a cache. In this way, the entire data from the lookup
source is cached and all lookups are performed against the cache.

Based on the type of cache configured, we can have two types of caches: static and
dynamic. The Integration Service performs differently depending on whether the lookup
is uncached, uses a static cache, or uses a dynamic cache.

Persistent Cache

By default, the Lookup caches are deleted post successful completion of the
respective sessions but, we can configure to preserve the caches, to reuse it next
time.

Shared Cache

We can share the lookup cache between multiple transformations. We can share
an unnamed cache between transformations in the same mapping. We can
share a named cache between transformations in the same or different
mappings.

8. How do you update the records with or without using Update Strategy?


We can use the session configurations to update the records. We can have
several options for handling database operations such as insert, update, delete.

During session configuration, you can select a single database operation for all
rows using the Treat Source Rows As setting from the ‘Properties’ tab of the
session.

 Insert: – Treat all rows as inserts.


 Delete: – Treat all rows as deletes.
 Update: – Treat all rows as updates.
 Data Driven :- Integration Service follows instructions coded into Update
Strategy flag rows for insert, delete, update, or reject.

Once we have determined how to treat all rows in the session, we can also set options
for individual rows, which gives additional control over how each row behaves. We need
to define these options in the Transformations view on the Mapping tab of the session
properties.

 Insert: – Select this option to insert a row into a target table.


 Delete: – Select this option to delete a row from a table.
 Update :- You have the following options in this situation:
o Update as Update: – Update each row flagged for update if it exists in the
target table.
o Update as Insert: – Insert each row flagged for update.
o Update else Insert: – Update the row if it exists. Otherwise, insert it.
 Truncate Table: – Select this option to truncate the target table before loading
data.

Steps:

1. Design the mapping just like an ‘INSERT’ only mapping, without Lookup, Update

Strategy Transformation.
2. First set Treat Source Rows As property as shown in below image.

3. Next, set the properties for the target table as shown below. Choose the
properties Insert and Update else Insert.
These options will make the session as Update and Insert records without using
Update Strategy in Target Table.

When we need to update a huge table with few records and fewer inserts, we can use
this solution to improve the session performance.

The solution in such situations is to avoid using a Lookup transformation and Update
Strategy to insert and update records, because the Lookup transformation may not
perform well as the lookup table size increases, and it degrades the performance.

9. Why update strategy and union transformations are Active? Explain with
examples.

1. The Update Strategy changes the row types. It can assign the row types based on
the expression created to evaluate the rows. Like IIF (ISNULL (CUST_DIM_KEY),
DD_INSERT, DD_UPDATE). This expression, changes the row types to Insert for
which the CUST_DIM_KEY is NULL and to Update for which the CUST_DIM_KEY is
not null.
2. The Update Strategy can reject the rows. Thereby with proper configuration, we
can also filter out some rows. Hence, sometimes, the number of input rows, may
not be equal to number of output rows.

Like IIF (ISNULL (CUST_DIM_KEY), DD_INSERT,

IIF (SRC_CUST_ID != TGT_CUST_ID, DD_UPDATE, DD_REJECT))

Here we are checking: if CUST_DIM_KEY is null, the row is flagged for insert;
otherwise, if SRC_CUST_ID is not equal to TGT_CUST_ID, the row is flagged for update;
if they are equal, we do not take any action on those rows and they get rejected.

Union Transformation

In union transformation, though the total number of rows passing into the
Union is the same as the total number of rows passing out of it, the positions of
the rows are not preserved, i.e. row number 1 from input stream 1 might not be
row number 1 in the output stream. Union does not even guarantee that the
output is repeatable. Hence it is an Active Transformation.

10. How do you load only null records into target? Explain through mapping flow.
Let us say this is our source:

Cust_id Cust_name Cust_amount Cust_Place Cust_zip
101 AD 160 KL 700098
102 BG 170 KJ 560078
NULL NULL 180 KH 780098

The target structure is also the same, but we have got two tables, one which will
contain the NULL records and one which will contain non-NULL records.

We can design the mapping as mentioned below.

SQ –> EXP –> RTR –> TGT_NULL/TGT_NOT_NULL


EXP – Expression transformation create an output port

O_FLAG= IIF ( (ISNULL(cust_id) OR ISNULL(cust_name) OR ISNULL(cust_amount)


OR ISNULL(cust _place) OR ISNULL(cust_zip)), ‘NULL’,’NNULL’)
** Assuming you need to redirect in case   any of value is null


OR

O_FLAG= IIF ( (ISNULL(cust_name) AND ISNULL(cust_no) AND


ISNULL(cust_amount) AND ISNULL(cust _place) AND ISNULL(cust_zip)),
‘NULL’,’NNULL’)
** Assuming you need to redirect in case all of   value is null
RTR – Router transformation two groups

Group 1 connected to TGT_NULL ( Expression O_FLAG=’NULL’)


Group 2 connected to TGT_NOT_NULL ( Expression O_FLAG=’NNULL’)

11. How do you load alternate records into different tables through mapping flow?
The idea is to add a sequence number to the records and then divide the record number
by 2. If it is divisible, then move it to one target and if not, then move it to the
other target.

1. Drag the source and connect it to an Expression transformation.
2. Add the next value of a Sequence Generator to the Expression transformation.
3. In the Expression transformation make two ports, one "odd" and another "even".
4. Write the expressions as shown in the sketch below.
5. Connect a Router transformation to the Expression.
6. Make two groups in the Router.
7. Give the group conditions as shown in the sketch below.
8. Then send the two groups to different targets. This is the entire flow.
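A sketch of the expressions and router conditions for steps 4 and 7 (assuming NEXTVAL
comes from the Sequence Generator; the port names are illustrative):

even (output port) = IIF( MOD(NEXTVAL, 2) = 0, 1, 0 )
odd (output port)  = IIF( MOD(NEXTVAL, 2) = 1, 1, 0 )

Router group conditions:
EVEN group: even = 1
ODD group:  odd = 1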

12. How do you load first and last records into target table? How many ways are
there to do it? Explain through mapping flows.
The idea behind this is to add a sequence number to the records and then take
the Top 1 rank and Bottom 1 Rank from the records.

1. Drag and drop ports from source qualifier to two rank transformations.
2. Create a reusable sequence generator having start value 1 and connect the next
value to both rank transformations.

3. Set the Rank properties as follows. The newly added sequence port should be chosen
as the Rank Port. There is no need to select any port as the Group By port. (Rank – 1)
4. (Rank – 2)

5. Make two instances of the target and connect the output ports to the targets.

13. I have 100 records in the source table, but I want to load 1, 5, 10, 15, 20 ….. 100
into the target table. How can I do this? Explain in detailed mapping flow.
This is applicable for any n = 2, 3, 4, 5, 6 … For our example, n = 5. We can apply the
same logic for any n.
The idea behind this is to add a sequence number to the records and divide the sequence
number by n (for this case, it is 5). If it is completely divisible, i.e. no remainder,
then send the record to one target, else send it to the other one.

1. Connect an Expression transformation after the Source Qualifier.
2. Add the next value port of a Sequence Generator to the Expression transformation.
3. In the Expression, create a new port (validate) and write the expression as shown
in the sketch below.
4. Connect a Filter transformation to the Expression and write the condition in the
property as shown in the sketch below.
5. Finally connect to the target.
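A sketch of the expression and filter condition for steps 3 and 4 (assuming NEXTVAL
comes from the Sequence Generator and n = 5):

validate (output port) = MOD(NEXTVAL, 5)

Filter transformation condition: validate = 0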

14. How do you load unique records into one target table and duplicate records
into  a different target table?
Source Table:

COL1 COL2 COL3
a b c
x y z
a b c
r f u
a b c
v f r
v f r

Target Table 1: Table containing all the unique rows

COL1 COL2 COL3

a b c

x y z

r f u

v f r

Target Table 2: Table containing all the duplicate rows

COL1 COL2 COL3

a b c

a b c

v f r

1. Drag the source to the mapping and connect it to an Aggregator transformation.
2. In the Aggregator transformation, group by the key column and add a new port. Call
it count_rec to count the key column.
3. Connect a Router to the Aggregator from the previous step. In the Router make two
groups: one named "original" and another named "duplicate".
In the original group write count_rec=1 and in the duplicate group write count_rec>1.
The picture below depicts the group names and the filter conditions.
Connect the two groups to the corresponding target tables.

15. Differentiate between Router and Filter Transformation?
A Filter transformation evaluates rows against a single condition and drops the rows
that do not meet it. A Router transformation evaluates rows against multiple conditions
in one pass, can route a row to more than one output group, and passes rows that satisfy
none of the conditions to a default group instead of dropping them.

16. I have two different source structure tables, but I want to load into single
target table? How do I go about it? Explain in detail through mapping flow.

 We can use joiner, if we want to join the data sources. Use a joiner and use the
matching column to join the tables.
 We can also use a Union transformation, if the tables have some common columns and we
need to join the data vertically. Create one Union transformation, add the matching
ports from the two sources to two different input groups and send the output group to
the target.
The basic idea here is to use, either Joiner or Union transformation, to move the
data from two sources to a single target. Based on the requirement, we may
decide, which one should be used.

17. How do you load more than 1 Max Sal in each Department through Informatica or write
a SQL query in Oracle?
SQL query:
You can use this kind of query to fetch more than 1 max salary for each department.

SELECT * FROM (

SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, DEPARTMENT_ID, SALARY,
RANK() OVER (PARTITION BY DEPARTMENT_ID ORDER BY SALARY DESC) SAL_RANK
FROM EMPLOYEES)

WHERE SAL_RANK <= 2

Informatica Approach:

We can use the Rank transformation to achieve this. Use Department_ID as the group key.

In the Properties tab, select Top, 3.

The entire mapping should look like this.

This will give us the top 3 employees earning maximum salary in their respective
departments.

18. How do you convert single row from source into three rows into target?
We can use Normalizer transformation for this. If we do not want to use
Normalizer, then there is one alternate way for this.

We have a source table containing 3 columns: Col1, Col2 and Col3. There is only 1 row
in the table as follows:

Col1 Col2 Col3
a b c

The target table contains only 1 column, Col. Design a mapping so that the target table
contains 3 rows as follows:

Col
a
b
c

1. Create 3 expression transformations exp_1,exp_2 and exp_3 with 1 port each.


2. Connect col1 from Source Qualifier to port in exp_1.
3. Connect col2 from Source Qualifier to port in exp_2.
4. Connect col3 from source qualifier to port in exp_3.
5. Make 3 instances of the target. Connect port from exp_1 to target_1.
6. Connect port from exp_2 to target_2 and connect port from exp_3 to target_3.
19. I have three same source structure tables. But, I want to load into a single target
table. How do I do this? Explain in detail through mapping flow.
We will have to use the Union transformation here. Union transformation is a multiple
input group transformation and it has only one output group.

1. Drag all the sources into the mapping designer.
2. Add one Union transformation and configure its Group tab and Group Ports tab as
follows.
3. Connect the sources to the three input groups of the Union transformation.
4. Send the output to the target, or via an Expression transformation to the target.
The entire mapping should look like this.

20. How to join three sources using joiner? Explain through mapping flow.
We cannot join more than two sources using a single joiner. To join three
sources, we need to have two joiner transformations.

Let’s say, we want to join three tables – Employees, Departments and Locations –
using Joiner. We will need two joiners. Joiner-1 will join, Employees and
Departments and Joiner-2 will join, the output from the Joiner-1 and Locations
table.

Here are the steps.

1. Bring three sources into the mapping designer.


2. Create the Joiner -1 to join Employees and Departments using Department_ID.

3. Create the next joiner, Joiner-2. Take the Output from Joiner-1 and ports from
Locations Table and bring them to Joiner-2. Join these two data sources using
Location_ID.
4. The last step is to send the required ports from the Joiner-2 to the target or via
an expression transformation to the target table.
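For reference, the result of the two joiners is roughly equivalent to the following
SQL, a sketch assuming standard HR-style column names such as DEPARTMENT_ID and
LOCATION_ID:

SELECT E.EMPLOYEE_ID, E.FIRST_NAME, D.DEPARTMENT_NAME, L.CITY
FROM EMPLOYEES E
JOIN DEPARTMENTS D ON E.DEPARTMENT_ID = D.DEPARTMENT_ID
JOIN LOCATIONS L ON D.LOCATION_ID = L.LOCATION_ID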

21. What are the differences between OLTP and OLAP?
OLTP systems support day-to-day transaction processing: many short, concurrent
read/write operations on current, highly normalized data. OLAP systems support analysis
and reporting: complex, read-mostly queries over large volumes of historical data
organized dimensionally (star/snowflake schemas).

22. What are the types of Schemas we have in data warehouse and what are the
difference between them?
There are three different data models that exist.
1. Star schema

Here, the Sales fact table is a fact table and the surrogate keys of each dimension
table are referenced here through foreign keys, for example time key, item key, branch
key and location key. The fact table is surrounded by the dimension tables such as
Branch, Location, Time and Item. In the fact table there are dimension keys such as
time_key, item_key, branch_key and location_key, and the measures are units_sold,
dollars_sold and average_sales. Usually, the fact table consists of more rows compared
to the dimensions because it contains all the primary keys of the dimensions along with
its own measures.
2. Snowflake schema

In snowflake, the fact table is surrounded by dimension tables and the


dimension tables are also normalized to form the hierarchy. So in this example,
the dimension tables such as location, item are normalized further into smaller
dimensions forming a hierarchy.
3. Fact constellations

In a fact constellation, there are many fact tables sharing the same dimension tables.
This example illustrates a fact constellation in which the fact tables Sales and
Shipping share the dimension tables Time, Branch and Item.

23. What is Dimensional Table? Explain the different dimensions.


Dimension table is the one that describes business entities of an enterprise,
represented as hierarchical, categorical information such as time, departments,
locations, products etc.

Types of dimensions in data warehouse

A dimension table consists of the attributes about the facts. Dimensions store
the textual descriptions of the business. Without the dimensions, we cannot
measure the facts. The different types of dimension tables are explained in
detail below.

 Conformed Dimension:
Conformed dimensions mean the exact same thing with every possible fact table
to which they are joined.
Eg: The date dimension table connected to the sales facts is identical to the date
dimension connected to the inventory facts.
 Junk Dimension:
A junk dimension is a collection of random transactional codes flags and/or text
attributes that are unrelated to any particular dimension. The junk dimension is
simply a structure that provides a convenient place to store the junk attributes.
Eg: Assume that we have a gender dimension and marital status dimension. In
the fact table we need to maintain two keys referring to these dimensions.
Instead of that, create a junk dimension which has all the combinations of gender and
marital status (cross join the gender and marital status tables and create a junk
table, as sketched after this list). Now we can maintain only one key in the fact table.
 Degenerated Dimension:
A degenerate dimension is a dimension which is derived from the fact table and
doesn’t have its own dimension table.
Eg: A transactional code in a fact table.
 Role-playing dimension:
Dimensions which are often used for multiple purposes within the same
database are called role-playing dimensions. For example, a date dimension can
be used for “date of sale”, as well as “date of delivery”, or “date of hire”.
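A minimal SQL sketch of building the junk dimension described above (the table and
column names are hypothetical):

SELECT ROW_NUMBER() OVER (ORDER BY g.gender, m.marital_status) AS junk_key,
       g.gender,
       m.marital_status
FROM gender_dim g
CROSS JOIN marital_status_dim m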

24. What is Fact Table? Explain the different kinds of Facts.


The centralized table in the star schema is called the Fact table. A Fact table
typically contains two types of columns. Columns which contains the measure
called facts and columns, which are foreign keys to the dimension tables. The
Primary key of the fact table is usually the composite key that is made up of the
foreign keys of the dimension tables.

Types of Facts in Data Warehouse


A fact table is the one which consists of the measurements, metrics or facts of
business process. These measurable facts are used to know the business value
and to forecast the future business. The different types of facts are explained in
detail below.

 Additive:
Additive facts are facts that can be summed up through all of the dimensions in
the fact table. A sales fact is a good example for additive fact.
 Semi-Additive:
Semi-additive facts are facts that can be summed up for some of the dimensions
in the fact table, but not the others.
Eg: Daily balances fact can be summed up through the customers dimension but
not through the time dimension.
 Non-Additive:
Non-additive facts are facts that cannot be summed up for any of the
dimensions present in the fact table.
Eg: Facts which have percentages, ratios calculated.

Factless Fact Table:


In the real world, it is possible to have a fact table that contains no measures or
facts. These tables are called “Factless Fact tables”.
E.g.: A fact table which has only a product key and a date key is a factless fact.
There are no measures in this table, but you can still get the number of products sold
over a period of time (see the sketch below).

A fact table that contains aggregated facts are often called summary tables.
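For example, a rough SQL sketch of counting products sold from such a factless fact
table (the table and column names are hypothetical):

SELECT d.calendar_month, COUNT(*) AS products_sold
FROM product_sales_fact f
JOIN date_dim d ON f.date_key = d.date_key
GROUP BY d.calendar_month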

25. Explain in detail about SCD TYPE 1 through mapping.


SCD Type1 Mapping

The SCD Type 1 methodology overwrites old data with new data, and therefore
does not need to track historical data.

1. Here is the source.

2. We will compare the historical data based on key column CUSTOMER_ID.


3. This is the entire mapping:

4. Connect lookup to source. In Lookup fetch the data from target table and send
only CUSTOMER_ID port from source to lookup.
5. Give the lookup condition like this:

6. Then, send rest of the columns from source to one router transformation.
7. In router create two groups and give condition like this:

8. For new records we have to generate a new customer_id. For that, take a Sequence
Generator and connect its next value column to an Expression transformation. Connect
the new_rec group from the Router to target1 (bring two instances of the target into
the mapping, one for new records and the other for old records). Then connect next_val
from the Expression to the customer_id column of the target.

9. Bring the change_rec group of the Router to one Update Strategy and give the
condition as sketched below.
10. Instead of 1, you can give DD_UPDATE in the Update Strategy and then connect to the
target.
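A sketch of the key conditions in this SCD Type 1 mapping (assuming the Lookup returns
a port named lkp_CUSTOMER_ID from the target table; the names are illustrative):

Lookup condition:              CUSTOMER_ID = IN_CUSTOMER_ID
Router group NEW_REC:          ISNULL(lkp_CUSTOMER_ID)
Router group CHANGE_REC:       NOT ISNULL(lkp_CUSTOMER_ID)
Update Strategy (CHANGE_REC):  DD_UPDATE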

26. Explain in detail SCD TYPE 2 through mapping.


SCD Type2 Mapping

In a Type 2 Slowly Changing Dimension, if a new record is added to the existing table
with new information, then both the original and the new record will be present, with
the new record having its own primary key.

1. To identify new_rec, we should add one new_pm (new primary key) and one version_no.


2. This is the source:

3. This is the entire mapping:

4. All the steps are similar to the SCD Type 1 mapping. The only difference is that,
from the Router, new_rec will go to one Update Strategy where the condition dd_insert
will be given, and one new_pm and version_no will be added before sending to the target.
5. Old_rec will also go to an Update Strategy, the condition dd_insert will be given,
and then it will be sent to the target.

27. Explain SCD TYPE 3 through mapping.


SCD Type3 Mapping

In SCD Type 3, two columns are added to track changes to a single attribute. It stores
one level of historical data along with the current data.
1. This is the source:

2. This is the entire mapping:

3. Up to the Router transformation, the procedure is the same as described in SCD
Type 1.
4. The only difference is that after the Router, the new_rec group is given the
condition dd_insert and sent to the target.
5. Create one new primary key and send it to the target. For old_rec, send it to an
Update Strategy, set the condition dd_insert, and send it to the target.
6. You can create one effective_date column in the old_rec table.
28. Differentiate between Reusable Transformation and Mapplet.
Any Informatica Transformation created in the Transformation Developer or a
non-reusable promoted to reusable transformation from the mapping designer
which can be used in multiple mappings is known as Reusable Transformation.

When we add a reusable transformation to a mapping, we actually add an instance of the
transformation. Since the instance of a reusable transformation is a pointer to that
transformation, when we change the transformation in the Transformation Developer, its
instances reflect these changes.

A Mapplet is a reusable object created in the Mapplet Designer which contains a set of
transformations and lets us reuse the transformation logic in multiple mappings.

A Mapplet can contain as many transformations as we need. Like a reusable
transformation, when we use a mapplet in a mapping, we use an instance of the mapplet,
and any change made to the mapplet is inherited by all instances of the mapplet.

29. What is meant by Target load plan?


Target Load Order:

Target load order (or) Target load plan is used to specify the order in which the
integration service loads the targets. You can specify a target load order based
on the source qualifier transformations in a mapping. If you have multiple
source qualifier transformations connected to multiple targets, you can specify
the order in which the integration service loads the data into the targets.

Target Load Order Group:

A target load order group is the collection of source qualifiers, transformations and
targets linked in a mapping. The Integration Service reads a target load order group
concurrently and processes the target load order groups sequentially. A single mapping
can contain two or more target load order groups.
Use of Target Load Order:

Target load order will be useful when the data of one target depends on the
data of another target. For example, the employees table data depends on the
departments data because of the primary-key and foreign-key relationship. So,
the departments table should be loaded first and then the employees table.
Target load order is useful when you want to maintain referential integrity when
inserting, deleting or updating tables that have the primary key and foreign key
constraints.

Target Load Order Setting:

You can set the target load order or plan in the mapping designer. Follow the
below steps to configure the target load order:

1. Login to the PowerCenter Designer and create a mapping that contains multiple target load order groups.
2. Click on the Mappings in the toolbar and then on Target Load Plan. The
following dialog box will pop up listing all the source qualifier transformations in
the mapping and the targets that receive data from each source qualifier.
3. Select a source qualifier from the list.
4. Click the Up and Down buttons to move the source qualifier within the load
order.
5. Repeat steps 3 and 4 for other source qualifiers you want to reorder.
6. Click OK.

30. Write the Unconnected lookup syntax and how to return more than one
column.
We can return only one port from an Unconnected Lookup transformation. As the Unconnected Lookup is called from another transformation, we cannot return multiple columns from it directly.

However, there is a trick. We can use the SQL override and concatenate the multiple columns we need to return. When we call the lookup from another transformation, we separate the columns again using SUBSTR (see the sketch below).
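A minimal sketch of the idea, assuming a lookup named lkp_Customer_Master and '|' as the delimiter (both names are illustrative assumptions):

-- SQL override in the lookup: concatenate the columns to return
SELECT CUSTOMER_ID, NAME || '|' || PHONE AS NAME_PHONE FROM CUSTOMER_MASTER
-- Expression transformation, variable port v_name_phone (the unconnected lookup call syntax):
:LKP.lkp_Customer_Master(Customer_id)
-- Output port NAME: everything before the delimiter
SUBSTR(v_name_phone, 1, INSTR(v_name_phone, '|') - 1)
-- Output port PHONE: everything after the delimiter
SUBSTR(v_name_phone, INSTR(v_name_phone, '|') + 1)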

As a scenario, we are taking one source, containing the Customer_id and Order_id columns.

Source:

We need to look up the Customer_master table, which holds the Customer information, like Name, Phone etc.

The target should look like this:

Let’s have a look at the Unconnected Lookup.

The SQL Override, with concatenated port/column:

Entire mapping will look like this.

We are calling the unconnected lookup from one expression transformation.



Below is the screen shot of the expression transformation.


After execution of the above mapping, below is the target, that is populated.

I am pretty confident that after going through both these Informatica Interview Questions blogs, you will be fully prepared to take an Informatica interview without any hiccups. If you wish to deep dive into Informatica with use cases, I recommend you go through our website and enrol at the earliest.

1. What do you understand by enterprise data warehousing?


Enterprise data warehousing is the creation of the organization's data at a single point of access. The data is always accessed and viewed through a single source, since the server is linked to this single source. It also includes the periodic analysis of the source.

2. How many input parameters can be present in an unconnected lookup?


Any number of input parameters can be passed to an unconnected lookup. However, no matter how many parameters are passed in, the return value will be only one. For example, box 1, box 2, box 3 and box 4 can be passed as inputs to an unconnected lookup, but there is only one return value.

3. What is the domain?


The domain is the main organizational point that brings together all the interlinked and interconnected nodes and relationships. These links are administered mainly from one single point of the organization.

4. Explain how the performance of the joiner condition can be increased?


The performance of the joiner condition can be increased through some simple steps:

 Data should be sorted before applying the join whenever possible.
 Perform joins in the database whenever possible; if that is not possible for some tables, the user can create a stored procedure and then join the tables in the database.
 When data is unsorted, a source with fewer rows should be considered as the master source.
 For a sorted joiner transformation, a source with fewer duplicate key values should be considered as the master source.

5. Name some different features of complex mapping?


Three different features of complex mapping are:

 Numerous transformations
 Difficult requirements
 Complex business logic

6. How do you load alternate records into different tables through mapping
flow?
The concept is to add a sequence number to the records and then divide the record number by 2. If it is evenly divisible, then move it to one target, and if not, then move it to the other target. The steps are as follows (see the sketch after this list):

 Drag the source and connect it to an expression transformation
 Add the next value of a sequence generator to the expression transformation
 In the expression transformation make two ports, one is "odd" and the other is "even"
 Write the expressions below
 Connect a router transformation to the expression
 Make two groups in the router
 Give the conditions
 Then send the two groups to different targets
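A minimal sketch of these expressions and router conditions, assuming the sequence generator port is NEXTVAL (port and group names are illustrative):

-- Expression transformation output ports
odd  = MOD(NEXTVAL, 2)                  -- 1 for odd-numbered rows, 0 otherwise
even = IIF(MOD(NEXTVAL, 2) = 0, 1, 0)   -- 1 for even-numbered rows
-- Router group conditions
Odd_group:  odd = 1
Even_group: even = 1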

7. Define the target load order?


Target load order relies upon the source qualifiers in a mapping. Broadly, multiple source qualifiers are linked to a target load order.

8. Name the scenario in which the Informatica server rejects files?


When the server faces a rejection from the update strategy transformation, it rejects the files. In such cases the database containing the information and data also gets disrupted. This is a rare-case scenario.

9. What is meant by Informatica Power center architecture?


When Informatica PowerCenter is set up, the following components get installed:
 Power center domain
 Informatica administrator
 Power center clients 
 Power center repository
 Power center integration service 
 Power center repository service

10. What are the types of groups in router transformation?


The types of groups in a router transformation are:

 Input group
 User-defined output groups
 Default group

Unit Testing

In unit testing, what we need to do is something like below:

1. Validate source and target


   - Analyze & validate your transformation business rules.
   - We need to review field by field from source to target and ensure that the required transformation logic is applied.
   - We generally check the source and target counts for each mapping.
2. Analyze the success and reject rows
   - In this stage we generally write customized SQL queries to check the source and target.
   - Analyze the rejected rows and build the process to handle this rejection.
3. Calculate the load time
   - Run the session and view the statistics.
   - We observe how much time is taken by the reader and the writer.
   - We should look at the session log and workflow log to view the load statistics.
4. Testing performance
   - Source performance
   - Target performance
   - Session performance
   - Network performance
   - Database performance
After unit testing we generally prepare one document as described below.
5. UNIT TEST CASE FOR LOAN_MASTER

FUNCTIONALITY_ID | FIELD_NAME | DETAIL | VALUE PASSED | EXPECTED RESULT | ACTUAL RESULT | PASS/FAIL | REMARK
STG_SCHM_DTLS_001 | LOAN_TYPE_ID | LOAN_TYPE_ID SHOULD BE NOT NULL, FIRST CHARACTERS ALPHABET (INSCH) AND LAST 10 CHARACTERS NUMERIC VALUES AND ALSO ITS LENGTH IS 16 | INSCH00000000002 | ACCEPT RECORD | RECORD ACCEPTED | PASS |
STG_SCHM_DTLS_002 | LOAN_TYPE_ID | REJECT WHEN NOT NULL, FIRST 5 CHARACTERS NOT (INSCH) OR LAST 10 CHARACTERS NON NUMERIC VALUES AND ALSO ITS LENGTH <> 16 | INSCP001000000002 | REJECT RECORD | RECORD REJECTED | PASS | RECORD INSERTED INTO REJECTED FILE WITH AN ERROR_ID & ERROR_DETAILS INTO ERROR_TABLE
STG_SCHM_DTLS_003 | LOAN_COMPANY_ID | LOAN_COMPANY_ID MUST BE NOT NULL, FIRST 4 CHARACTERS ALPHABET (INCO) AND LAST 11 CHARACTERS NUMERIC VALUES AND ALSO LENGTH IS 15 | INCO00000000003 | ACCEPT RECORD | RECORD ACCEPTED | PASS |
STG_SCHM_DTLS_004 | LOAN_COMPANY_ID | REJECT WHEN NOT NULL, FIRST 4 CHARACTERS NOT (INCO) OR LAST 11 CHARACTERS NON NUMERIC VALUES AND ALSO ITS LENGTH <> 15 | INSO00000060003 | REJECT RECORD | RECORD REJECTED | PASS | RECORD INSERTED INTO REJECTED FILE WITH AN ERROR_ID & ERROR_DETAILS INTO ERROR_TABLE
STG_SCHM_DTLS_005 | START_DATE | START DATE SHOULD BE A VALID DATE | 12/9/1988 | ACCEPT RECORD | RECORD ACCEPTED | PASS |
STG_SCHM_DTLS_006 | START_DATE | START DATE SHOULD NOT BE LOADED WHEN IT IS NOT A VALID DATE | 33FeB/88 | REJECT RECORD | RECORD REJECTED | PASS | RECORD INSERTED INTO REJECTED FILE WITH AN ERROR_ID & ERROR_DETAILS INTO ERROR_TABLE
STG_SCHM_DTLS_007 | SCHEME_DESC | SCHEME_DESC SHOULD BE ALPHABETIC TYPE | AUTOMOBILE | ACCEPT RECORD | RECORD ACCEPTED | PASS |
STG_SCHM_DTLS_008 | SCHEME_DESC | REJECT WHEN SCHEME_DESC IS NOT ALPHABETIC TYPE | MOTO124 | REJECT RECORD | RECORD REJECTED | PASS | RECORD INSERTED INTO REJECTED FILE WITH AN ERROR_ID & ERROR_DETAILS INTO ERROR_TABLE
STG_SCHM_DTLS_009 | PREMIUM_PER_LACS | PREMIUM_PER_LACS SHOULD BE NUMERIC | 5000 | ACCEPT RECORD | RECORD ACCEPTED | PASS |

SCD Type3

In SCD Type3, two columns should be added to identify changes to a single attribute. It stores one-time historical data along with the current data.

1. This is the source

2. This is the entire mapping

3. Up to the router transformation, all the procedure is the same as described in Scenario_36 (SCD Type1).
4. The only difference is that after the router, bring the new_rec group to an update strategy and give the condition dd_insert, then send to the target. Create one new primary key and send it to the target.
5. For old_rec, send it to an update strategy, set the condition dd_insert and send it to the target.
6. You can create one effective_date column in the old_rec table.
SCD Type2

In Type 2 Slowly Changing Dimension, if a new record is added to the existing table with new information, both the original and the new record are kept, each with its own primary key.

1. To identify new_rec we should add one new_pm and one version_no.


2. This is the source.

3. This is the entire mapping.

4. All the procedure is the same as described in the SCD TYPE1 mapping. The only difference is that from the router, new_rec will come to one update_strategy, the condition dd_insert will be given, and one new_pm and version_no will be added before sending to the target.
5. Old_rec will also come to an update_strategy with the condition dd_insert, and will then be sent to the target.

SCD TYPE1

SCD TYPE1 MAPPING

The SCD Type 1 methodology overwrites old data with new data, and therefore does not need to track historical data.

1. Here is the source

2. We will compare the historical data based on key column CUSTOMER_ID.


3. This is the entire mapping
4. Connect lookup to source. In Lookup fetch the data from target table and send only
CUSTOMER_ID port from source to lookup

5. Give the lookup condition like this


6. Then send the rest of the columns from the source to one router transformation.

7. In the router create two groups and give conditions like this:

8. For new records we have to generate a new customer_id. For that, take a sequence generator and connect the NEXTVAL column to the expression. Connect the New_rec group from the router to target1 (bring two instances of the target to the mapping, one for new records and the other for old records). Then connect NEXTVAL from the expression to the customer_id column of the target.

9. Bring the Change_rec group of the router to one update strategy and give the condition like this:
10. Instead of 1 you can give dd_update in the update strategy, then connect to the target.
Extracting Middle Name From Ename

Suppose the ename column is like this:

empno            ename

1                   Sudhansu Sekher Dash

2                   Amiya Prasad Mishra

In target we have to send middle name like this

empno      ename

1              Sekher

2              Prasad

These are the steps for achieving this

1. Drag the source and connect to an expression transformation


2. In the Expression create two ports, one is Name1 (a variable port) and the other is Middle_Name (an output port).

3. In Name1 write the condition like this (see the sketch after these steps):

4. In Middle_Name write the condition like this:

5. Then send to target.
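A hedged sketch of the two expressions, assuming the name always has exactly three space-separated parts:

-- Variable port NAME1: strip the first name
SUBSTR(ENAME, INSTR(ENAME, ' ') + 1)
-- Output port MIDDLE_NAME: keep only the part before the next space
SUBSTR(NAME1, 1, INSTR(NAME1, ' ') - 1)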

Extracting first and last name

Suppose in the Ename column there is a first name and a last name like this:
empno   ename

1         Amit Rao

2         Chitra Dash

In the target we have to separate the ename column into firstname and lastname like this:

empno    firstname    lastname

1             Amit           Rao

2             Chitra          Dash

Steps for solving this scenario

1. Drag the source to the mapping area and connect it to an expression transformation as shown below.
2. In the expression transformation create two output ports, one is f_name and the other is l_name.

3. In f_name write the condition like this (see the sketch after these steps):

4. In l_name write the condition like this:

Then connect to the target.

These are the basic steps for achieving this scenario
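A hedged sketch of the two output-port expressions, assuming a single space separates the two names:

-- Output port f_name
SUBSTR(ENAME, 1, INSTR(ENAME, ' ') - 1)
-- Output port l_name
SUBSTR(ENAME, INSTR(ENAME, ' ') + 1)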

Targeting records of employees who joined in current month.

Scenario: Insert the records of those employees who have joined in the current month and reject the other rows.

Source

E_NO       JOIN_DATE
    -------       ---------
      1        07-JUL-11
      2        05-JUL-11
      3        05-MAY-11
If the current month is July 2011, then the target will be like this.

Target
   E_NO      JOIN_DATE
   -------       ---------
     1        07-JUL-11
     2        05-JUL-11

To insert current month records we have to follow these steps

1. Connect one update strategy transformation next to the SQF.

2. In the update strategy properties write the condition like this (see the sketch after these steps):

3. Send the required ports from the update strategy to the target.
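A hedged sketch of the update-strategy expression, assuming JOIN_DATE is a date port:

IIF(TO_CHAR(JOIN_DATE, 'MM-YYYY') = TO_CHAR(SYSDATE, 'MM-YYYY'), DD_INSERT, DD_REJECT)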


Convert Day No. to corresponding month and date of year

Scenario: Suppose you have a source like this:

Source
  E_NO    YEAR        DAYNO
   ------   --------- -         ---------
  1          01-JAN-07     301
  2          01-JAN-08     200
The Year column is a date and dayno is a numeric value that represents a day (as in 365 for 31-Dec of that year). Convert the dayno to the corresponding year's month and date and then send it to the target.

Target
  E_NO           YEAR_MONTH_DAY
   ------            --------- ----------
     1                  29-OCT-07
     2                  19-JUL-08

These are the basic steps for this scenario

1. Connect the SQF with an expression transformation.


2. In the expression create one output port c_year_mm_dd, make it date type, and in that port write the condition like this (see the sketch after these steps).

3. Finally send to target
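A hedged sketch of the expression for c_year_mm_dd; adding DAYNO days to the 01-JAN value reproduces the sample output above (29-OCT-07, 19-JUL-08):

ADD_TO_DATE(YEAR, 'DD', DAYNO)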

Sending to target with days difference more than 2 days

Scenario: From the order_delivery table, insert the records into the target where the day difference between order_date and delivery_date is greater than 2 days. (Note: see the last article, where we discussed finding the time in hours between two dates.)

Source
ORDER_NO    ORDER_DATE       DELIVERY_DATE
---------                      ---------                   ---------
        2                        11-JAN-83               13-JAN-83
        3                        04-FEB-83               07-FEB-83
        1                        08-DEC-81              09-DEC-81

Target
 ORDER_NO    ORDER_DATE   DELIVERY_ DATE    
 ---------             -------- ------    --- ----------
  2                      11-JAN-83       13-JAN-83         
  3                      04-FEB-83       07-FEB-83  

These are the steps for achieving this scenario


1. Connect all the rows from SQF to update strategy transformation.

2. In the update strategy properties write the expression like this (see the sketch after these steps):

3. Finally send to target.
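A hedged sketch of the update-strategy expression. Note that the sample target above keeps the 2-day row, so >= 2 matches the sample output; use > 2 for a strict reading of the requirement:

IIF(DATE_DIFF(DELIVERY_DATE, ORDER_DATE, 'DD') >= 2, DD_INSERT, DD_REJECT)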

Date Difference in Hours

Scenario: There is an order_delivery table having records like this:


   ORDER_NO      ORDER_DATE      DELIVERY_DATE
   ---------                        ---------             --------
        2                        11-JAN-83          13-JAN-83
        3                         04-FEB-83        07-FEB-83
        1                         08-DEC-81        09-DEC-81

We have to calculate the difference between order_date and delivery_date in hours and send it to the target.
o/p will be

 ORDER_NO   ORDER_DATE   DELIVERY_ DATE      DIFF_IN_HH


 ---------                     ---------                  ---------            ---------
     2                       11-JAN-83            13-JAN-83              48
     3                       04-FEB-83           07-FEB-83             72
     1                       08-DEC-81           09-DEC-81             24
These are the steps for achieving this scenario


1. Connect one expression transformation next to SQF.

     

2. In the expression create one output port "diff" and make it integer type.

3. In that port write the condition like this (see the sketch below) and send to the target.
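A hedged sketch of the expression for the diff port:

DATE_DIFF(DELIVERY_DATE, ORDER_DATE, 'HH')
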
Check the Hire-Date is Date or Not

Scenario:Suppose we have a table with records like this


 EMPNO        HIRE_DATE           
------ ----       -----------
     1             12-11-87
     2             02-04-88;
     3             02-2323
empno is a number and hire_date is in string format. We have to check the hire_date column: if it is in a date format like 'dd-mm-yy', then convert it to a date in the format 'mm/dd/yy' and send it to the target, else send null.
output
    EMPNO       HIRE_DATE
    --------          ---------
       1             11-DEC-87
       2                null
       3               null

These are the steps for achieving this scenario

1. Connect the ports from SQF to an expression transformation.

2. In the expression create another output port hire_date1 and make it date data-type, as shown in the picture.
3. In hire_date1 write the condition like this (see the sketch after these steps).

4. Send ports to target.
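A hedged sketch of the expression for hire_date1; the format mask is an assumption and must match the incoming string (the sample output 11-DEC-87 suggests MM-DD-YY):

IIF(IS_DATE(HIRE_DATE, 'MM-DD-YY'), TO_DATE(HIRE_DATE, 'MM-DD-YY'), NULL)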

Convert Numeric Value to Date Format

Scenario: Suppose you are importing a flat file emp.csv and the hire_date column is in numeric format, like 20101111. Our objective is to convert it to a date, with the format 'YYYYMMDD'.
 

source
  EMPNO       HIRE_DATE(numeric)           
  -------            -----------
     1                20101111
     2                20090909
target
EMPNO            HIRE_DATE (date)          
 ------                   -----------
     1                   11/11/2010
     2                    09/09/2009
1.  Connect SQF to an expression.

2. In the expression make hire_date an input-only port and make another port hire_date1 an output port with date data type.
3. In the output port hire_date1 write the condition as below (see the sketch after these steps).

4. Finally send to target
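A hedged sketch of the expression for hire_date1:

TO_DATE(TO_CHAR(HIRE_DATE), 'YYYYMMDD')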

Remove special characters from empno

Scenario: Suppose in a flat file some special symbols like @, %, $, # and & have been added to the empno column along with the actual data. How do we remove those special characters? (See the article on how to remove $ from salary.)
example
empno in source   
empno(in string format)
7@3%$,21
432#@1
324&*,$2
In target
empno
7321
4321
3242
Following are the steps for achieving this mapping

1. Connect the o/p columns of the SQF to an expression transformation.


2. In the expression make empno an input-only port and create another port empno1 as an output port with a string (or numeric) datatype. In empno1 write the condition like this (see the sketch below), and finally send it to the target.
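A hedged sketch of the expression for empno1, assuming the characters to strip are the ones listed in the scenario:

-- Strip the listed special characters; wrap in TO_INTEGER if the target port is numeric
REPLACECHR(0, EMPNO, '@%$#&*,', NULL)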

Your comments and suggestions keep me going, so don't forget to leave them.

Count the no. of vowels present in emp_name column

Scenario: Count the number of vowels present in the emp_name column of the EMP table as shown below.
     emp_name         total_vowels_count
       Allen                          2
       Scott                          1
       Ward                         1

These are the steps to achieve it

1. Connect required columns from SQF to an expression transformation.


2. In the Expression add 6 columns like in the picture below. You can also make it two columns (one for all the vowels and one for the vowel count). For better understanding I have added 6 columns, 5 for each of the vowels and one for the vowel count.

The way I achieved it: for each of the vowels in ename, I replaced it with null, and in the port total_vowels_count I subtract the length of the replaced string from the ename length, which gives me the individual count of that vowel; after adding these up for all vowels I get the total number of vowels present. Here are all the variable ports.
For A  write                              REPLACECHR(0,ENAME,'a',NULL)
For E  write                             REPLACECHR(0,ENAME,'e',NULL)
For I  write                              REPLACECHR(0,ENAME,'i',NULL)
For O  write                            REPLACECHR(0,ENAME,'o',NULL)
For U  write                           REPLACECHR(0,ENAME,'u',NULL)
And for o/p column total_vowels_count write expression  like this
(length(ENAME)-length(A))
+
(length(ENAME)-length(E))
+
(length(ENAME)-length(I))
+
(length(ENAME)-length(O))
+
(length(ENAME)-length(U))

3. Finally send to target.

Insert and reject records using update strategy

Scenario: There is an emp table; from that table insert the data into the target where sal < 3000 and reject the other rows.

Following are the steps for achieving it

1. Connect the outputs from the SQF to an Update Strategy transformation.


2. In the properties of the Update Strategy write the condition like this (see the sketch after these steps):

3. Connect the Update Strategy to the target.
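A hedged sketch of the update-strategy expression:

IIF(SAL < 3000, DD_INSERT, DD_REJECT)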

Converting '$' symbol to 'Rs.' in sal column

Q24 The Emp table contains the salary and commission in USD; in the target, the comm and sal will be converted to a given currency prefix, ex: Rs.

 Source

EMPNO ENAME      JOB              MGR HIREDATE         SAL            DEPTNO 

7369 SMITH            CLERK           7902 17-DEC-80        $800                20              

7499 ALLEN      SALESMAN        7698 20-FEB-81         $1600               30              

Target

EMPNO ENAME      JOB              MGR HIREDATE         SAL                 DEPTNO 


7369 SMITH            CLERK           7902 17-DEC-80        Rs.800                20    

7499 ALLEN      SALESMAN        7698 20-FEB-81          RS.1600               30             

1. Drag the source and connect it to an expression transformation.

2. In the expression make an output port sal1 and make sal an input-only port.
3. In sal1 write the condition like below (see the sketch after these steps):

4. Then send it to the target.
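A hedged sketch of the sal1 expression, assuming SAL is a string that starts with '$':

REPLACESTR(0, SAL, '$', 'Rs.')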

Sending data one after another to three tables in cyclic order

Q23 In the source there are some records. Suppose I want to send them to three targets. The first record will go to the first target, the second one will go to the second target and the third record will go to the third target; then the 4th to the 1st, 5th to the 2nd, 6th to the 3rd and so on.

1. Put the source to mapping and connect it to an expression transformation.


2. Drag a sequence generator transformation, set its properties like this and connect the next value port to the expression.

3. Drag all output ports of the expression to a router. In the router make three groups and give the conditions like this (see the sketch after these steps):

4. Connect the desired group to the desired target.
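A hedged sketch of the three router group conditions, assuming the sequence generator starts at 1 and the port is NEXTVAL:

Group1 (target 1): MOD(NEXTVAL, 3) = 1
Group2 (target 2): MOD(NEXTVAL, 3) = 2
Group3 (target 3): MOD(NEXTVAL, 3) = 0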

Currency convertor

Q22 Suppose that a source contains a column which holds the salary information
prefixed with the currency code , for example  
EMPNO ENAME      JOB              MGR HIREDATE         SAL            DEPTNO  
7369 SMITH            CLERK           7902 17-DEC-80        $300               20               
7499 ALLEN      SALESMAN        7698 20-FEB-81         £1600               30               
7521 WARD       SALESMAN        7698 22-FEB-81        ¥8500                30       
In the target the different currencies will be evaluated to a single currency value, for example convert all to Rupees.

1. The first thing we should consider is that there are different types of currency like pound, dollar, yen etc., so it's a good idea to use a mapping parameter or variable. Go to mapping => mapping parameters and variables, then create three parameters (for this example) and set their initial values as below.

2. Then drag the source to the mapping area and connect it to an expression transformation.
3. In the expression create an output port sal1 and make sal input only, as below.

4. In sal1 port write the condition as below

iif(instr(SAL,'$')!=0, TO_INTEGER(SUBSTR(SAL, INSTR(SAL,'$')+1, LENGTH(SAL)-1))*$$DOLAR,
iif(instr(SAL,'£')!=0, TO_INTEGER(SUBSTR(SAL, INSTR(SAL,'£')+1, LENGTH(SAL)-1))*$$POUND,
iif(instr(SAL,'¥')!=0, TO_INTEGER(SUBSTR(SAL, INSTR(SAL,'¥')+1, LENGTH(SAL)-1))*$$YEN
)
)
)
$$DOLAR, $$POUND and $$YEN are mapping parameters. You can also multiply by the price in rupees directly, for example the dollar price in rupees, i.e. 48.

5. Connect required output port from expression to target directly. And run the
session.

Removing '$' symbol from salary column

Q21: Reading a source file with the salary prefixed with $; in the target the sal column must be stored as a number.
Source
EMPNO ENAME      JOB              MGR HIREDATE         SAL            DEPTNO  
7369 SMITH            CLERK           7902 17-DEC-80        $800                20               
7499 ALLEN      SALESMAN        7698 20-FEB-81         $1600               30               
Target
EMPNO ENAME      JOB              MGR HIREDATE         SAL            DEPTNO  
7369 SMITH            CLERK           7902 17-DEC-80        800                20               
7499 ALLEN      SALESMAN        7698 20-FEB-81         1600               30     
1. Drag the source to the mapping area and connect each port to an expression transformation.

2. In the expression transformation add a new column sal1 and make it an output port, and make sal an input-only port as shown in the picture.
3. In the expression write the condition like this (see the sketch after these steps).

4. Connect the required ports to the target.
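A hedged sketch of the sal1 expression, assuming '$' is always the first character of SAL:

TO_DECIMAL(SUBSTR(SAL, 2))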

Using mapping parameter and variable in mapping

Scenario:How to use mapping parameter and variable in mapping ?

Solution:

1. Go to the mapping, then the parameters and variables tab in the Informatica Designer. Give the name as $$v1, choose type as parameter (you can also choose variable), data type as integer, and give the initial value as 20.

2. Create a mapping as shown in the figure( I have considered a simple scenario where
a particular department id will be filtered to the target).
3. In filter set deptno=$$v1 (that means only dept no 20 record will go to the target.)

4. A mapping parameter value can't change throughout the session, but a variable value can be changed. We can change a variable value by using a text (parameter) file. I'll show it in the next scenario.

Validating all mapping in repository

Scenario: How to validate all mappings in the repository?

Solution:

1. In the repository go to the menu "tool" then "queries". The Query Browser dialog box will appear. Then click on the new button.
2. In the Query Editor, choose the folder name and object type as I have shown in the picture.

3. After that, execute it (by clicking the blue arrow button).


4. The query results window will appear. You can select a single mapping or all mappings (by pressing Ctrl + A) and go to "tools" then the "validate" option to validate them.
Produce files as target with dynamic names

Scenario: How to generate a file name dynamically, with the sysdate in its name?

Solution:

1. Drag your target file to the Target Designer and add a column as shown in the picture. It's not a normal column; click on the 'add file name to the table' property (I have given a red mark there).
2. Then drag your source to mapping area and connect it to an expression
transformation.

3. In the expression transformation add a new port of string data type and make it an output port.

4. In that output port write the condition as described below (see the sketch after these steps) and then map it to the filename port of the target. Also send the other ports to the target. Finally run the session. You will find two files, one with the sysdate in its name and the other a '.out' file, which you can delete.
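A hedged sketch of the expression for the FileName output port; the 'EMP_' prefix and '.csv' suffix are illustrative assumptions:

'EMP_' || TO_CHAR(SYSDATE, 'YYYYMMDD') || '.csv'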

Target table rows , with each row as sum of all previous rows from source
table.

Scenario: How to produce rows in target table with every row  as sum of all previous
rows in source table ? See the source and target table to understand the scenario.

SOURCE TABLE

Id Sal

1 200

2 300

3 500

4 560

TARGET TABLE
Id Sal

1 200

2 500

3 1000

4 1560

1. Pull the source to mapping and then connect it to expression.

2. In the expression add one column and make it an output port (sal1), with the sal port as input only. We will make use of a function named CUME() to solve our problem, rather than using any complex mapping. Write the expression in sal1 as cume(sal) and send the output rows to the target.
Concatenation of duplicate value by comma separation

Scenario: You have two columns in source table T1, in which col1 may contain duplicate values. All the col2 values for a duplicated col1 value will be transformed into a comma-separated list in column col2 of target table T2.

Source Table: T1

Col1 Col2

A x

B y

C z

A m

B n

Target Table: T2

col1 col2

A x,m

B y,n

C z

Solution:
1. We have to use the following transformations as below. First connect a sorter transformation to the source, make col1 the key and set its order to ascending. After that connect it to an expression transformation.

2. In the Expression make four new ports and give them names as in the picture below.
3. In concat_val write an expression like the one described below (see the sketch after these steps) and send it to an aggregator.

4. In the aggregator group by col1 and send it to the target.

5. Finally run the session.
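A hedged sketch of the expression ports (data sorted on col1); the port names are illustrative, and the evaluation order matters because v_prev_col1 is read before it is reassigned:

-- Variable port v_concat (evaluated first; references see previous-row values)
IIF(col1 = v_prev_col1, v_concat || ',' || col2, col2)
-- Variable port v_prev_col1 (evaluated after v_concat)
col1
-- Output port o_concat_val
v_concat
-- Aggregator: group by col1, output port col2 = LAST(o_concat_val)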


Sending records to target tables in cyclic order

Scenario: There is a source table and 3 destination tables T1, T2, T3. How to insert the first 1 to 10 records in T1, records from 11 to 20 in T2 and 21 to 30 in T3, then again 31 to 40 into T1, 41 to 50 in T2 and 51 to 60 in T3 and so on, i.e. in cyclic order.

Solution:

1. Drag the source and connect it to an expression. Connect the next value port of a sequence generator to the expression.

2. Send all the ports to a router and make three groups as below.

Group1

mod(NEXTVAL,30) >= 1 and mod(NEXTVAL,30) <= 10

Group2

mod(NEXTVAL,30) >= 11 and mod(NEXTVAL,30) <= 20

Group3

mod(NEXTVAL,30) >= 21 and mod(NEXTVAL,30) <= 29 or mod(NEXTVAL,30) = 0


3. Finally connect Group1 to T1, Group2 to T2 and Group3 to T3.

 
Extracting every nth row

Scenario: How to load every nth row from a flat file / relational DB to the target? Suppose n = 3; then the rows numbered 3, 6, 9, 12 and so on will be loaded. This example takes every 3rd row to the target table.

Solution:

1. Connect an expression transformation after the source qualifier.


Add the next value port of a sequence generator to the expression transformation.

2. In the expression create a new port (validate) and write the expression like in the picture below (see the sketch after these steps).
3. Connect a filter transformation to the expression and write the condition in the property like in the picture below.

4. Finally connect to target.
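A hedged sketch for n = 3, assuming the sequence generator starts at 1:

-- Expression output port validate
MOD(NEXTVAL, 3)
-- Filter condition
validate = 0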

Segregating rows on group count basis

Scenario 13: There are 4 departments in the Emp table. The first one has 100 employees, the 2nd has 5, the 3rd has 30 and the 4th dept has 12 employees. Extract those dept numbers which have more than 5 employees in them, to a target table.

Solution:
1. Put the source to the mapping and connect the ports to an aggregator transformation.

2. Make 4 output ports in the aggregator as in the picture above: count_d10, count_d20, count_d30, count_d40.
For each port write an expression like in the picture below (see the sketch after these steps).
3. Then send it to an expression transformation.

4. In the expression make four output ports (dept10, dept20, dept30, dept40) to validate the dept no, and provide the expression like in the picture below.
5. Then connect to a router transformation, create a group and fill in the condition like below.

6. Finally connect to a target table having one column, that is dept no.
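A hedged sketch of one possible set of expressions (port names are illustrative):

-- Aggregator output ports (no group-by port, so a single summary row is produced)
count_d10 = SUM(IIF(DEPTNO = 10, 1, 0))
count_d20 = SUM(IIF(DEPTNO = 20, 1, 0))
count_d30 = SUM(IIF(DEPTNO = 30, 1, 0))
count_d40 = SUM(IIF(DEPTNO = 40, 1, 0))
-- Expression output ports (dept20, dept30, dept40 follow the same pattern)
dept10 = IIF(count_d10 > 5, 10, NULL)
-- Router group condition (one group per deptNN port)
NOT ISNULL(dept10)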

Get top 5 records to target without using rank

Scenario 12: How to get top 5 records to target without using rank ?

Solution:

1. Drag the source to mapping and connect it to sorter transformation.


2. Arrange the salary in descending order in sorter as follows and send the record to
expression.

sorter properties

3. Add the next value of sequence generator to expression.(start the value from 1 in
sequence generator).
sorter to exp mapping

4. Connect the expression transformation to a filter or router. In the property set the condition as follows (see the sketch after these steps):

5. Finally connect to the target.
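A hedged sketch of the filter/router condition, assuming the sequence generator starts at 1:

NEXTVAL <= 5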


final mapping sc12
Separate rows on group basis

Scenario 11: In the Dept table there are four departments (dept no 40, 30, 20, 10). Separate the records into different targets department wise.

Solution:
Step 1: Drag the source to mapping.
Step 2: Connect the router transformation to the source and in the router make 4 groups and give conditions like below (see the sketch after these steps).

router transformation

Step 3: Based on the group map it to different target.

The final mapping looks like below.


router to target
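A hedged sketch of the four router group conditions:

DEPTNO = 10
DEPTNO = 20
DEPTNO = 30
DEPTNO = 40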



Separate the original records in target

Scenario 10: How to separate the original (distinct) records from a source table into a separate target table by using a rank transformation?

Source Table

col1 col2 col3
a    b    c
X    y    z
A    B    c
R    F    u
A    B    c
V    F    r
V    F    r

Target Table

Col1 Col2 Col3
A    B    c
X    Y    z
R    F    u
V    F    r

Solution:
Step 1: Bring the source to mapping.

src to rank mapping

Step 2: Connect the rank to source.


Step 3: In rank, set the property like this.
rank property

Step 4: Then send it to target.


Run the session to see the result.

Sending alternate record to target

Scenario 9: How to send alternate record to target?


Or
Sending Odd numbered records to one target and even numbered records to another
target.

Solution:
Step 1: Drag the source and connect to an expression transformation.
Step2: Add the next value of a sequence generator to expression transformation.                                 

scr to seq mapping

Step 3: In the expression transformation make two ports, one is "odd" and another is "even", and write the expressions like below.

expression property
Step 4: Connect a router transformation to expression.
Make two groups in the router and give conditions like below.

rtr property

Step 5: Then send the two groups to different targets.

The entire mapping is as below


Final mapping view scenario 9

Sending second half record to target


 

Scenario 8: How to send second  half record to target?

Solution
Step 1: Drag and drop the source to mapping.

src to tgt mapping

Step 2: In the Source Qualifier, go to the properties and write the SQL query like:

select * from emp minus select * from emp where rownum <= (select count(*)/2 from emp)

src qualifier sql query

Step 3: Then connect to the target, and run the mapping to see the results.

Sending first half record to target

Scenario 6: How to send first half record to target?

Solution:

1.  Drag and drop the source to mapping.

2. In the Source Qualifier, go to the properties and write the SQL query like:

select * from emp where rownum <= (select count(*)/2 from emp)

3. Then connect to the target. Now you are ready to run the mapping to see it in action.

Remove header from your file

Scenario 6: How to remove header from a file ?

Solution

Step 1: After creating the mapping, go to the workflow and schedule it.


Step 2: Just double click on the session and go to the mapping option.
Step 3: Select the source and go to the set file properties.

flat file properties

Step 4: Choose the advanced option. Set "Number of initial rows to skip" to 1 (it can be more as per the requirement).
adv properties

It will skip the header.

Remove footer from your file

Scenario 5: How to remove footer from your file ?

For example the file content looks like as below:-

some Header here


col1    col2    col3     col4
data1  data2  data3  data4
data5  data6  data7  data8
data1  data2  data3  data4
data1  data2  data3  data4
footer

We just have to remove the footer from the file.


Solution:

Step1:  Drag the source to mapping area.

Step2: After that  connect a filter or router transformation.

Step 3: In the filter write the condition like in the picture (see the sketch after these steps).

Step 4: Finally pass it over to the target.
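A hedged sketch of the filter condition, assuming the footer row can be recognised by the literal text 'footer' in the first column:

UPPER(col1) != 'FOOTER'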


 

Retrieving first and last record from a table/file

Scenario 4:

How to get first and last record from a table/file?

Solution:

Step  1: Drag and drop ports from source qualifier to two rank transformations.

Step  2: Create a reusable sequence generator having start value 1 and connect the next
value to both rank transformations.
Step 3: Set the rank properties as follows (see the sketch after these steps).

In Rank1

In Rank2
Step  4: Make two instances of the target.

Step  5: Connect the output port to target.
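A hedged sketch of the two rank settings (property names paraphrased), assuming the reusable sequence generator starts at 1:

-- Rank1 (returns the last record): rank on NEXTVAL, Top/Bottom = Top, Number of ranks = 1
-- Rank2 (returns the first record): rank on NEXTVAL, Top/Bottom = Bottom, Number of ranks = 1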
