
Master Informatica Questions and Answer Set
Version 2.5
The one-stop master manual of Informatica interview questions and answers

DWBIConcepts.com

www.dwbiconcepts.com Community of DWBI Professionals

Copyright Notice
Informatica Master Question and Answer Set is copyright DWBIConcepts, 2013.

All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of the information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions. Neither is any liability assumed for damages resulting from the use of the information contained herein.

Trademarks
All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. DWBIConcepts cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

Warning and disclaimer


Every effort has been made to make this book as complete and as accurate as possible, but no warranty of fitness is implied. The information is provided on an "as is" basis. The author and the publisher shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book.


How this book should be used


This book contains questions and answers pertaining to Informatica PowerCenter and allied tools, as commonly asked in job interviews. As such, the book is written for candidates who are preparing for job interviews. It is suggested that the candidate start preparing from this material at least one week in advance, so that s/he can finish reading the entire content before appearing for the interview. In case the candidate is stuck with any question or answer, is not clear on something, or has a doubt, s/he can interact with the experts on the DWBIConcepts forum.
To help the readers, certain questions have been tagged as shown below:
Common / Frequently Asked Questions
Harder Questions
Additional Information

Table of Contents

Copyright Notice
Trademarks
Warning and disclaimer
How this book should be used

1. Aggregator Transformation
2. Expression Transformation
3. Filter Transformation
4. Joiner Transformation
5. Lookup Transformation
6. Normalizer Transformation
7. Rank Transformation
8. Router Transformation
9. Sequence Generator Transformation
10. Stored Procedure Transformation
11. Sorter Transformation
12. Union Transformation
13. Update Strategy Transformation
14. Java Transformation
15. Source Qualifier Transformation
16. Miscellaneous
17. Mapping
18. Mapplet
19. Session
20. Workflow
21. Administration
22. Command Line Arguments
23. Metadata Repository
24. Repository Manager
25. Scenario Questions
26. Performance Tuning

Topic Matrix:

Serial Number   Topics                    Questions
1               Aggregator                17
2               Expression                10
3               Filter                    2
4               Joiner                    12
5               Lookup                    20
6               Normalizer                4
7               Rank                      12
8               Router                    5
9               Sequence Generator        8
10              Stored Procedure          6
11              Sorter                    6
12              Union                     3
13              Update Strategy           10
14              Java                      2
15              Source Qualifier          12
16              Miscellaneous             20
17              Mapping                   12
18              Mapplet                   6
19              Session                   22
20              Workflow                  15
21              Administration            12
22              Command Line Arguments    3
23              Metadata Repository       5
24              Repository Manager        6
25              Scenario Questions        18
26              Performance Tuning        8

1. Aggregator Transformation
1. What is an Aggregator Transformation?

Answer:
An Aggregator is an Active, Connected transformation which performs aggregate calculations like AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM and VARIANCE.

2. How does an Expression Transformation differ from an Aggregator Transformation?

Answer:
An Expression Transformation performs calculations on a row-by-row basis, whereas an Aggregator Transformation performs calculations on groups of rows.

3. Does an Aggregator Transformation support only aggregate expressions?

Answer:
No. Apart from aggregate expressions, the Aggregator transformation supports non-aggregate expressions and conditional clauses.

4. Give one example for each of Conditional Aggregation, Non-Aggregate expression and Nested Aggregation.

Answer:
Use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
SUM( SALARY, JOB = 'CLERK' )
Use non-aggregate expressions in group by ports to modify or replace groups.
IIF( PRODUCT = 'Brown Bread', 'Bread', PRODUCT )
A nested aggregation expression can include one aggregate function within another aggregate function.
MAX( COUNT( PRODUCT ) )

5. How does Aggregator Transformation handle NULL values?

Answer:
By default, the Aggregator transformation treats null values as NULL in aggregate functions. But we can configure the Integration Service to treat null values in aggregate functions either as NULL or as zero.
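For illustration, assume an Aggregator receives the SALARY values 10, NULL and 20 for one group (a made-up example, not from the original text). With the default setting the NULL is treated as NULL and ignored by the aggregate functions, so SUM(SALARY) returns 30 and AVG(SALARY) returns 15. If the Integration Service is configured to treat nulls as zero, the NULL becomes 0, so SUM(SALARY) still returns 30 but AVG(SALARY) returns 10.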

6. What are the performance considerations when working with Aggregator Transformation?

Answer:
Filter the unnecessary data before aggregating it. Place a Filter transformation in the mapping before the Aggregator transformation to reduce unnecessary aggregation.
Improve performance by connecting only the necessary input/output ports to subsequent transformations, thereby reducing the size of the data cache.
Use Sorted Input, which reduces the amount of data cached and improves session performance.
Aggregator performance improves dramatically if the records are sorted before being passed to the Aggregator and the Sorted Input option under the Aggregator properties is checked. The record set should be sorted on the columns that are used in the Group By operation.
It is often a good idea to sort the record set at the database level, e.g. inside a Source Qualifier transformation, unless there is a chance that the already sorted records from the Source Qualifier can become unsorted again before reaching the Aggregator.

7. What are the uses of index and data cache?

Answer:
The group data is stored in the index files, whereas the row data is stored in the data files.

8. What differs when we choose Sorted Input for Aggregator Transformation?

Answer:
The Integration Service creates the index and data cache files in memory to process the Aggregator transformation. If the Integration Service requires more space than allocated for the index and data cache sizes in the transformation properties, it stores the overflow values in cache files, i.e. it pages to disk. One way to increase session performance is to increase the index and data cache sizes in the transformation properties.
But when we check Sorted Input, the Integration Service uses main memory to process the Aggregator transformation; it does not use cache files.

9. Under what conditions will selecting Sorted Input in Aggregator still not boost session performance?

Answer:
The Incremental Aggregation session option is enabled.
The aggregate expression contains nested aggregate functions.
The session property Treat Source Rows As is set to Data Driven.

10. Under what condition may selecting Sorted Input in Aggregator fail the session?

Answer:
If the input data is not sorted correctly, the session will fail.
Also, even if the input data is properly sorted, the session may fail if the sort order of the ports and the group by ports of the Aggregator are not in the same order.

11. Suppose we do not group by on any ports of the Aggregator. What will be the output?

Answer:
If we do not use an input port either in group by or in an aggregate expression, the Integration Service will return only the last row's value of that column for the input rows.
For example, if 100 rows come from the source, the Aggregator will output only the last record (the 100th record).

12. What is the expected value if the column in an Aggregator transformation is neither a group by nor an aggregate expression?

Answer:
The Integration Service produces one row for each group based on the group by ports. The columns which are neither part of the group by key nor an aggregate expression will return the corresponding value of the last record of the group received.
However, if we specifically use the FIRST function, the Integration Service returns the value of the specified first row of the group. So the default is the LAST function.

13. What is Incremental Aggregation?

Answer:
We can enable the session option Incremental Aggregation for a session that includes an Aggregator transformation. When the Integration Service performs incremental aggregation, it actually passes the changed source data through the mapping and uses the historical cache data to perform the aggregate calculations incrementally.

14. Sorted input for Aggregator transformation will improve the performance of the mapping. However, if sorted input is used for a nested aggregate expression or for incremental aggregation, then the mapping may result in session failure. Explain why.

Answer:
In case of a nested aggregation, there are multiple levels of sorting associated, as each aggregation function will require one sorting pass, and after the first level of aggregation the sort order of the group by column may get jumbled up; so before the second level of aggregation, Informatica must internally sort it again. However, if we have already indicated that the input is sorted, Informatica will not do this sorting, resulting in failure.
In incremental aggregation, the aggregate calculations are stored in a historical cache on the server. In this historical cache the data may not be in sorted order. If we give sorted input, the records come presorted for that particular run, but in the historical cache the data may not be in sorted order.

15. How can we delete duplicate records using Informatica Aggregator?

Answer:
One way to handle duplicate records in a source batch run is to use an Aggregator transformation and check the Group By checkbox on the ports having duplicate occurring data. Here we have the flexibility to select the last or the first of the duplicate column value records.
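As a minimal sketch (the port and table names here are made up for illustration), suppose a source feeds duplicate customer rows. In the Aggregator, check Group By on the key port and pick a representative value for the remaining ports:

CUST_ID   (Group By)
O_NAME = FIRST( NAME )
O_CITY = FIRST( CITY )

Replacing FIRST with LAST keeps the last duplicate instead of the first; either way, only one row per CUST_ID reaches the target.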

16. Scenario Implementation 1

Suppose in our Source Table we have data as given below:

Student Name    Subject Name        Marks
Sam             Maths               100
Tom             Maths               80
Sam             Physical Science    80
John            Maths               75
Sam             Life Science        70
John            Life Science        100
John            Physical Science    85
Tom             Life Science        100
Tom             Physical Science    85

We want to load our Target Table as:

Student Name    Maths    Life Science    Physical Science
Sam             100      70              80
John            75       100             85
Tom             80       100             85

Describe your approach.

Answer:
Here our scenario is to convert many rows to one row, and the transformation which will help us to achieve this is the Aggregator.
We will sort the source data based on STUDENT_NAME ascending followed by SUBJECT ascending, and then pass it to an Aggregator transformation.
Now, based on STUDENT_NAME in the Group By clause, the output subject columns are populated as:
MATHS: MAX( MARKS, SUBJECT = 'Maths' )
LIFE_SC: MAX( MARKS, SUBJECT = 'Life Science' )
PHY_SC: MAX( MARKS, SUBJECT = 'Physical Science' )


17. Scenario Implementation 2

Source:
100 XYZ AAA
100 XYZ BBB
100 XYZ CCC

The expected output data:
100 XYZ AAA BBB CCC

Which transformations are used for this?

Answer:
Use an Aggregator transformation together with variable ports.
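A minimal sketch of one way to build this, assuming the input is sorted on the first column (the port names below are illustrative, not from the original text). In an Expression transformation:

Port Name   Port Type   Expression
ID          I/O
NAME        I
V_CONCAT    V           IIF( ID = V_PREV_ID, V_CONCAT || ' ' || NAME, NAME )
V_PREV_ID   V           ID
O_CONCAT    O           V_CONCAT

Because variable ports are evaluated top to bottom, V_PREV_ID still holds the previous row's ID when V_CONCAT is computed. A downstream Aggregator that groups by ID then outputs one row per ID, and since O_CONCAT is neither a group by port nor an aggregate expression, it returns the last row's value of the group, which carries the fully concatenated string.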


2. Expression Transformation
1. What is an Expression Transform?

Answer:
Expression is a Passive, Connected transformation used to calculate values in a single row before writing to the target. We can use the Expression transformation to perform any non-aggregate calculation. We can also use the Expression transformation to test conditional statements before we output the results to target tables or other transformations.
For example, we might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers.
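As a simple illustration of those three examples (the port names are made up), typical output-port expressions might look like:

SALARY * 1.10
FIRSTNAME || ' ' || LASTNAME
TO_DECIMAL( SALARY_TEXT )

Each expression is evaluated once per row, which is exactly the row-by-row behaviour that distinguishes the Expression transformation from the Aggregator.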

2. How many types of ports are there in an Expression transform?

Answer:
There are three types of ports: INPUT, OUTPUT and VARIABLE.

3. What is the execution order of the ports in an expression?

Answer:
All ports are executed top to bottom in a serial physical ordering, but they are processed in the following groups:
All input ports are pushed values first.
Then all variable ports are executed (top to bottom physical ordering in the expression).
Last, all output expressions are executed to push values to the output ports.
You can use this to your advantage by placing lookups into variables and then using the variables "later" in the execution cycle.

4. Describe the approach for the requirement. Suppose the input is:

Col1    Col2
10      a
20      b
30      c
40
50      d

The desired output is:

Col1    Col2
10      a
20      a,b
30      a,b,c
40      a,b,c
50      a,b,c,d

Answer: Use an Expression transformation:

Port Name   Port Type   Expression
Col1        I/O
Col2        I
V_Seq       V           CUME(1)
V_Col2      V           IIF( V_Seq = 1, Col2, IIF( ISNULL(Col2), Prev_Col2, Prev_Col2 || ',' || Col2 ) )
Prev_Col2   V           V_Col2
Out_Col2    O           Prev_Col2

Keep in mind the string length of the variable and output ports.
The CUME function is used to calculate a cumulative amount based on the argument of the function. This means that if we call CUME with argument 1, e.g. CUME(1), then on the first call it will return 1, on the second call it will return 2, on the third call it will return 3, and so on. Since Informatica processes data row by row, this means that when the first row is processed CUME(1) returns 1; for the next row it returns 2, and so on.

5. How can we implement an aggregation operation without using an Aggregator Transformation in Informatica?

Answer:
We will use the very basic property of the Expression transformation, that at any point of time we can access the previous row's data as well as the currently processed data in an Expression transformation. A simple Sorter, an Expression and a Filter transformation are all we need to achieve aggregation at the Informatica level.
For a detailed understanding, visit the Aggregation without Aggregator article on DWBIConcepts.
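A minimal sketch of the idea for a per-group maximum, using hypothetical ports KEY and AMOUNT (this is an illustration, not the exact mapping from the referenced article). First sort by KEY ascending and AMOUNT descending in a Sorter. Then in an Expression transformation:

Port Name    Port Type   Expression
KEY          I/O
AMOUNT       I/O
V_NEW_GRP    V           IIF( ISNULL(V_PREV_KEY) OR KEY <> V_PREV_KEY, 1, 0 )
V_PREV_KEY   V           KEY
O_NEW_GRP    O           V_NEW_GRP

A Filter transformation with the condition O_NEW_GRP = 1 then keeps only the first row of each KEY, which, because of the descending sort on AMOUNT, carries the maximum AMOUNT for that group.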

6. Scenario Implementation 1

Source
Col1    Col2
A       W
B       R
C       E
A       R

Target
Col1    READ    WRITE   EXECUTE
A       1       1       0
B       1       0       1
C       0       0       1

In this scenario the source values in Col2 (W, R, E) mean read, write and execute.

Answer:
Take an Expression transformation followed by an Aggregator transformation.
In the Expression transformation:

Port Name   Port Type   Expression
Col1        I/O
Col2        I/O
Read        O           IIF( Col2 = 'R', 1, 0 )
Write       O           IIF( Col2 = 'W', 1, 0 )
Execute     O           IIF( Col2 = 'E', 1, 0 )

In the Aggregator transformation:

Port Name   Port Type   Expression
Col1        I/O         GROUP BY
Read        I/O         MAX( Read )
Write       I/O         MAX( Write )
Execute     I/O         MAX( Execute )

7. Scenario Implementation 2

Source data is like below:

Id      name1   name2
10      A       B
10      C       D
20      E       F

Desired target data is like below:

Id      name
10      AB
10      CD
20      EF

Answer:
Use an Expression transformation to concatenate both values as: name = name1 || name2

8. Scenario Implementation 3

Suppose we have a field in a source file named DATA. We need to mark those records having 9 characters such that the first 2 characters are alphabets, i.e. (A-Z), and the remaining 7 characters are alphanumeric, i.e. (A-Z) or (0-9), for the DATA field as output. The records which do not match the condition should be marked as 'Invalid'. How do we implement this?
E.g.

DATA            OUTPUT
AB345GH6756     AB345GH67
CD56789PJ       CD56789PJ
56CHJK97889     Invalid
DG//*67DF       Invalid

Answer:
Use the below logic in an output port of an Expression transformation in Informatica:
IIF( REG_MATCH( SUBSTR(DATA,1,2), '[[:alpha:]]{2}' ) = 1
AND REG_MATCH( SUBSTR(DATA,3,7), '[[:alnum:]]{7}' ) = 1, SUBSTR(DATA, 1, 9), 'Invalid' )

9. Scenario Implementation 4

How do we convert a Date field coming as data type string from a flat file?

Answer:
Use the date conversion functions:
IIF( IS_DATE( Column1 ) = 1, TO_DATE( Column1, 'YYYYMMDD' ), NULL )

In the above example, we have assumed that the format of the date field is YYYYMMDD. If the format is something else (e.g. YYYY-MM-DD), we need to specify that format string instead.

10. Scenario Implementation 5

Source:
Col1    Col2
1       B
2       C
3       D
4       E

Target:
Col1    Col2    Col3    Col4
1       B       2       C
3       D       4       E

Describe the approach to the above scenario, where the source's 1st record is loaded to target Col1, Col2, the 2nd record is loaded to Col3, Col4, the 3rd record again to Col1, Col2, and so on.

Answer:
Use an Expression transformation:

Port Name   Port Type   Expression
Col1        I
Col2        I
V_ID        V           1 - MOD( Col1, 2 )
V_Col1      V           IIF( MOD( Col1, 2 ) = 1, Col1, V_Col1 )
V_Col2      V           IIF( MOD( Col1, 2 ) = 1, Col2, V_Col2 )
O_ID        O           V_ID
O_Col1      O           V_Col1
O_Col2      O           V_Col2
O_Col3      O           Col1
O_Col4      O           Col2

The variable ports V_Col1 and V_Col2 capture the values of every odd source row and retain them while the following even row is processed, so on an even row the output ports carry the previous (odd) row in O_Col1, O_Col2 and the current (even) row in O_Col3, O_Col4.
Next, use a Filter transformation with the condition O_ID = 1.
Next, map O_Col1, O_Col2, O_Col3, O_Col4 to Col1, Col2, Col3, Col4 of the target respectively.

3. Filter Transformation
1. What is a Filter Transformation and why it is an Active one?

Answer:
A Filter transformation is an Active and Connected transformation that can filter rows in a mapping.
Only the rows that meet the filter condition pass through the Filter transformation to the next transformation in the pipeline. TRUE and FALSE are the implicit return values from any filter condition we set. If the filter condition evaluates to NULL, the row is assumed to be FALSE. The numeric equivalent of FALSE is zero (0), and any non-zero value is the equivalent of TRUE.
As an ACTIVE transformation, the Filter transformation may change the number of rows passed through it. A filter condition returns TRUE or FALSE for each row that passes through the transformation, depending on whether a row meets the specified condition. Only rows that return TRUE pass through this transformation. Discarded rows do not appear in the session log or reject files.

2. What is the difference between the Source Qualifier transformation's Source Filter option and the Filter transformation?

Answer:
The differences between the SQ Source Filter and the Filter transformation are:
The Source Qualifier transformation filters rows as they are read from the source, whereas the Filter transformation filters rows from within a mapping.
The Source Qualifier transformation can only filter rows from relational sources, whereas the Filter transformation filters rows coming from any type of source system at the mapping level.
The Source Qualifier limits the row set extracted from the source, whereas the Filter transformation limits the row set sent to a target.
The Source Qualifier reduces the number of rows used throughout the mapping and hence gives better performance; with a Filter transformation, to maximize session performance, include it as close to the sources in the mapping as possible so that unwanted data is filtered out early in the flow from sources to targets.
The filter condition in the Source Qualifier transformation uses only standard SQL, as it runs in the database, whereas the Filter transformation can define a condition using any statement or transformation function that returns either a TRUE or FALSE value.
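For instance (a hedged illustration with made-up column names), a Source Qualifier source filter is written as a plain SQL predicate, such as:

CUSTOMERS.CUSTOMER_ID > 1000

whereas a Filter transformation condition can use any transformation function that returns TRUE or FALSE, for example:

NOT ISNULL(CUSTOMER_ID) AND IIF( ISNULL(ORDER_AMOUNT), 0, ORDER_AMOUNT ) > 0

The first predicate runs inside the source database; the second is evaluated row by row by the Integration Service within the mapping.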


4. Joiner Transformation
1. What is a Joiner Transformation and why it is an Active one?

Answer:
A Joiner is an Active and Connected transformation used to join two source data streams coming from the same or heterogeneous databases or files.
The Joiner transformation joins sources with at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.
In the Joiner transformation, we must configure the transformation properties, namely the Join Condition, the Join Type and optionally the Sorted Input option, to improve Integration Service performance.
The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the join condition and the type of join selected, the Integration Service either adds the row to the result set or discards it. Because of this, the number of rows in the Joiner output may not be equal to the number of rows in the Joiner input. This is why the Joiner is considered an Active transformation.

2. State the limitations where we cannot use Joiner in the mapping pipeline.

Answer:
The Joiner transformation accepts input from most transformations. However, the following are the limitations:
A Joiner transformation cannot be used when either of the input pipelines contains an Update Strategy transformation.
A Joiner transformation cannot be used if we connect a Sequence Generator transformation directly before the Joiner transformation.

3. Out of the two input pipelines of a Joiner, which one will we set as the master pipeline?

Answer:
During a session run, the Integration Service compares each row of the master source against the detail source. The master and detail sources need to be configured for optimal performance.
When the Integration Service processes an unsorted Joiner transformation, it blocks the detail source while it caches rows from the master source. Once the Integration Service finishes reading and caching all master rows, it unblocks the detail source and reads the detail rows. This is why, if we designate the source containing fewer input rows as the master, the cache size will be smaller, thereby improving performance.
For a sorted Joiner transformation, use the source with fewer duplicate key values as the master source for optimal performance and disk storage. When the Integration Service processes a sorted Joiner transformation, it caches rows for one hundred keys at a time. If the master source contains many rows with the same key value, the Integration Service must cache more rows, and performance can be slowed.
Blocking logic is possible only if the master and detail inputs to the Joiner transformation originate from different sources. Otherwise, the Integration Service does not use blocking logic; instead, it stores more rows in the cache.

4. What are the different types of Joins available in Joiner Transformation?


Answer:
In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation is similar to an SQL join, except that data can originate from different types of sources.
The Joiner transformation supports the following types of joins:
Normal
Master Outer
Detail Outer
Full Outer
A normal or master outer join performs faster than a full outer or detail outer join.
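To make the four join types concrete, here is a small made-up illustration (not from the original text). Suppose the master source holds the keys {1, 2} and the detail source holds the keys {2, 3}, joined on that key. A normal join returns only key 2; a master outer join returns keys 2 and 3 (all detail rows plus the matching master rows); a detail outer join returns keys 1 and 2 (all master rows plus the matching detail rows); and a full outer join returns keys 1, 2 and 3, with NULLs filling the columns of the missing side.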


5. Define the various join types of the Joiner Transformation.

Answer:
In a normal join, the Integration Service discards all rows of data from the master and detail sources that do not match, based on the join condition.
A master outer join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
A full outer join keeps all rows of data from both the master and detail sources.

6. Describe the impact of the number of join conditions and the join order in a Joiner.

Answer:
We can define one or more conditions based on equality between the specified master and detail sources. Both ports in a condition must have the same data type. If we need to use two ports in the join condition with non-matching data types, we must convert the data types so that they match. The Designer validates data types in a join condition.
Additional ports in the join condition increase the time necessary to join two sources.
The order of the ports in the join condition can impact the performance of the Joiner transformation. If we use multiple ports in the join condition, the Integration Service compares the ports in the order we specify.
Only the equality operator is available in the Joiner join condition.

7. How does the Joiner transformation treat NULL value matching?

Answer:
The Joiner transformation does not match null values.
For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not consider them a match and does not join the two rows. To join rows with null values, replace the null input with default values in the Ports tab of the Joiner, and then join on the default values.
If a result set includes fields that do not contain data in either of the sources, the Joiner transformation populates the empty fields with null values. If we know that a field will return a NULL and we do not want to insert NULLs in the target, set a default value on the Ports tab for the corresponding port.
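A small hedged illustration of that workaround (the port name is made up): in an Expression transformation placed before each Joiner input, replace the nullable key with a sentinel value, e.g.

O_EMP_ID = IIF( ISNULL( EMP_ID ), -1, EMP_ID )

and join on O_EMP_ID. The same effect can be achieved by setting -1 as the default value of the port on the Joiner's Ports tab; either way, rows whose keys were NULL on both sides now carry the same sentinel and will match.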

8. When we configure the join condition, what guidelines do we need to follow to maintain the sort order?

Answer:
Suppose we configure Sorter transformations in the master and detail pipelines with the following sorted ports, in order: ITEM_NO, ITEM_NAME and PRICE. Then we must ensure that:
We use ITEM_NO in the first join condition.
If we add a second join condition, we must use ITEM_NAME.
If we want to use PRICE as a join condition apart from ITEM_NO, we must also use ITEM_NAME in the second join condition.
If we skip ITEM_NAME and join on ITEM_NO and PRICE, we will lose the input sort order and the Integration Service fails the session.

9. What are the transformations that cannot be placed between the sort origin and the Joiner transformation, so that we do not lose the input sort order?

Answer:
The best option is to place the Joiner transformation directly after the sort origin to maintain sorted data. However, do not place any of the following transformations between the sort origin and the Joiner transformation:
Custom
Unsorted Aggregator
Normalizer
Rank
Union transformation
XML Parser transformation
XML Generator transformation
Mapplet (if it contains any one of the above transformations)

10. What is the use of sorted input in the Joiner transformation?

Answer:
It is recommended to join sorted data when possible. We can improve session performance by configuring the Joiner transformation to use sorted input. When we configure the Joiner transformation to use sorted data, it improves performance by minimizing disk input and output. We see a great performance improvement when we work with large data sets.
For an unsorted Joiner transformation, designate the source with fewer rows as the master source.

11. Can we join two tables based on a join column having different data types?
For example, table 1 has EMPNO (string) and table 2 has EMPNUM (number).

Answer:
Yes, it is possible in this case. If we are using a Joiner, we should be able to do this explicit conversion in an Expression transformation before joining the tables.

12. Implementation Scenario 1 - A Joiner transformation is joining two tables S1 and S2. S1 has 10,000 rows and S2 has 1,000 rows. Which table will you set as the master for better performance of the Joiner transformation? Why?

Answer:
For optimal performance and disk storage, designate the master source as the source with the fewer rows. During a session, the Joiner transformation compares each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds up the join process.
Set table S2 as the master table, because the Informatica server has to keep the master table in the cache; with only 1,000 rows in the cache we get better performance than with 10,000 rows in the cache.


5. Lookup Transformation
1. What is a Lookup transform?

Answer:
The Lookup transformation is used to look up data in a flat file, relational table, view, or synonym. The Informatica server queries the lookup table based on the lookup ports in the transformation. It compares the Lookup transformation port values to the lookup table column values based on the lookup condition. The result is passed to other transformations and the target.
Uses:
Get a related value
Perform a calculation
Update slowly changing dimension tables

2. What are the differences between Connected and Unconnected Lookup?

Answer:
The differences are illustrated below:
A Connected Lookup participates in the data flow and receives input directly from the pipeline, whereas an Unconnected Lookup receives input values from the result of a :LKP expression in another transformation.
A Connected Lookup can use both dynamic and static cache, whereas an Unconnected Lookup cache cannot be dynamic.
A Connected Lookup can return more than one column value (output port), whereas an Unconnected Lookup can return only one column value, i.e. one output port (the return port).
A Connected Lookup caches all lookup columns, whereas an Unconnected Lookup caches only the lookup output ports used in the lookup conditions and the return port.
A Connected Lookup supports user-defined default values (i.e. the value to return when the lookup condition is not satisfied), whereas an Unconnected Lookup does not support user-defined default values.

3. What are the different lookup cache(s)?

Answer:
Informatica lookups can be cached or un-cached (no cache). A cached lookup can be either static or dynamic.
A static cache is one which does not modify the cache once it is built, and the data remains the same during the session run.
On the other hand, a dynamic cache is refreshed during the session run by inserting or updating the records in the cache based on the incoming source data.
By default, the Informatica cache is a static cache.
A lookup cache can also be divided as persistent or non-persistent, based on whether Informatica retains the cache even after the completion of the session run or deletes it.

4. Is Lookup an active or passive transformation?

Answer:
From Informatica 9x, the Lookup transformation can be configured as an "Active" transformation. See the DWBIConcepts article on how to configure a lookup as an active transformation.
However, in the earlier versions of Informatica, the Lookup is a passive transformation.

5. What is the difference between Static and Dynamic Lookup Cache?

Answer:
We can configure a Lookup transformation to cache the underlying lookup table. In case of a static or read-only lookup cache, the Integration Service caches the lookup table at the beginning of the session and does not update the lookup cache while it processes the Lookup transformation. Rows are not added dynamically to the cache.
In case of a dynamic lookup cache, the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target; it caches the rows as and when they are passed.
In case you are wondering why we need to make a lookup cache dynamic, read the DWBIConcepts article on dynamic lookup.
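A short hedged usage sketch (an illustration, not from the original text): a Lookup configured with a dynamic cache exposes a NewLookupRow output port, where 0 means the Integration Service did not change the cache for that row, 1 means the row was inserted into the cache, and 2 means it was updated. Downstream, a Router or Filter on this port can send NewLookupRow = 1 rows to the target as inserts and NewLookupRow = 2 rows as updates, keeping the target in step with the cache.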

6. What are the uses of index and data caches?


Answer:
The lookup condition values are stored in the index cache, whereas the record data returned from the lookup is stored in the data cache.

7. What is Persistent Lookup Cache?


Answer:
If the cache generated for a Lookup needs to be preserved for subsequent use, then a persistent cache is used. In that case Informatica does not delete the index and data cache files after the session run. It is useful only if the lookup table remains fairly constant.
Lookups are cached by default in Informatica. A lookup cache can be either non-persistent or persistent. The Integration Service saves or deletes the lookup cache files after a successful session run based on whether the lookup cache is marked as persistent or not.


8. What type of join does Lookup support?


Answer:
Lookup behaves just like a SQL LEFT OUTER JOIN between the source and the lookup table.

Answer:
Lookup means if the source input column value matches the lookup table comparison column value then it
will Return valid values from the lookup table else it will return NULL.

So the equivalent SQL query looks like below:SELECT EMP.*, DEPT.LOC


FROM EMP LEFT OUTER JOIN DEPT
ON EMP.DEPTNO = DEPT.DEPTNO
Hence Lookup is associated with the Source table as Left Outer Join.

10. Where and why do we use Unconnected Lookup instead of Connected Lookup?
Answer:
The best part of an unconnected lookup is that we can call the lookup based on some condition, and not for every row; i.e. only when a condition is met do we invoke the unconnected lookup from an Expression transformation. By this we may optimize the performance of a flow.
We may consider an unconnected lookup as a function in a procedural language: it takes multiple parameters as input, returns one value, and can be used repeatedly. In the same way, an unconnected lookup can be used in any scenario where we need the same lookup repeatedly, either in a single transformation or in multiple transformations.
With the unconnected lookup, we also get the performance benefit of not caching the same data multiple times, and it is a good coding practice.


11. How can we Identify Persistent Cache Files in Informatica Server?

Answer:

Cache files are generated in the cache directory of the Informatica server for transformations like Aggregator, Joiner, Lookup, Rank and Sorter.
Two types of cache files are generated, i.e. data and index files, the exception being the Sorter transformation.
Most importantly, Informatica automatically deletes all the generated .dat and .idx cache files after a session run is finished.
So the files that remain in the cache directory are basically the persistent cache files of Lookup transformations, the Aggregator cache files of Incremental Aggregation sessions, or cache files left behind when a session run did not complete successfully.
Informatica-generated cache files are named as:
PMAGG*.idx, PMAGG*.dat, PMJNR*.idx, PMJNR*.dat, PMLKP*.idx, PMLKP*.dat.
Often, while handling a big data cache, Informatica creates multiple index and data files due to paging and appends a number to the end of the files, e.g. PMAGG*.dat0, PMAGG*.idx0, PMAGG*.dat1, PMAGG*.idx1.

So if we have followed a particular naming convention for the Lookup persistent cache name, e.g. table_name_PC, or if the table names follow a convention like GDW_, we can use shell commands accordingly to identify the cache files on the server.
In this context you can revisit the Lookup Persistent Cache and Incremental Aggregation articles.

12. How to configure a Lookup on a flat file with header?

Answer:

When we create a Lookup transformation, we have the option to select the location of the lookup table from any of Source, Target, Source Qualifier, Import from Relational Table or Import from Flat File.
After selecting the flat file as the lookup from the desired location, the Edit Transformation tab of the lookup will show the flat file information, where we can choose between Delimited or Fixed Width and modify advanced properties like Column Delimiters, Code Page and, importantly, Number of Initial Rows to Skip.
Set Number of Initial Rows to Skip to 1 (to skip the header row). Set the lookup condition as required.
Apart from that, go to the Mapping tab of the corresponding session and select the Lookup transformation to configure the Lookup source file directory, the Lookup source filename and the Lookup source filetype, i.e. Direct or Indirect.

13. What is the difference between persistent cache and shared cache?
Answer:
Persistent cache is a type of Informatica lookup cache in which the cache file is retained on disk after the session run. We can configure the session to re-cache it if necessary. It should be used only when we are sure that the lookup table will not change between sessions, i.e. mostly when the mapping uses fairly static tables as lookups.


If the persistent cache is shared across mappings, we call it a shared (named) cache; we provide a name for this cache file in the Cache File Name Prefix property and use the same name in the different lookups where we want to reuse the cache. If the lookup table is used in more than one transformation or mapping, the cache built for the first lookup can then be used for the others, even across mappings.

Unshared cache: within a mapping, if the lookup table is used in more than one transformation, the cache built for the first lookup can be used for the others, but it cannot be used across mappings.

14. Describe how to return multiple port values from an unconnected lookup in Informatica.

Answer:
Informatica Unconnected Lookup supports only one return port by default.
So, as a workaround, we can write a Lookup SQL override with the required port values concatenated into a single string as the return port value. We then call the unconnected lookup from an Expression transformation and use separate output ports to retrieve the individual lookup values, applying the SUBSTR and INSTR functions to extract the column values from the concatenated return field.
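As a minimal sketch (assuming a hypothetical lookup table EMPLOYEE with columns EMP_ID, ENAME and DEPTNO, and '~' as an arbitrary delimiter), the lookup SQL override could look like:

SELECT ENAME || '~' || TO_CHAR(DEPTNO) AS RETURN_FIELD,
       EMP_ID
FROM   EMPLOYEE

In the calling Expression transformation, SUBSTR(RETURN_FIELD, 1, INSTR(RETURN_FIELD, '~') - 1) would then extract the first value and SUBSTR(RETURN_FIELD, INSTR(RETURN_FIELD, '~') + 1) the second.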

15. How to make the persistent lookup cache in sync with lookup table?
Answer:
To make the persistent cache in sync with the lookup table, simply enable the re-cache option (Re-cache from Lookup Source) of the Lookup transformation to rebuild the lookup cache from the lookup table again. While loading the target dimension table we can choose to make the lookup cache dynamic and re-cache persistent, so that once the dimension is loaded the persistent cache file is in sync and available during the fact table load.

16. If we use persistent cache for a dynamic lookup, will the cache file be updated or inserted
as required?
Answer:
Using a persistent cache does not prevent the dynamic lookup from doing inserts and updates to the cache file. The only difference is that the cache file gets a proper name assigned through the persistent named cache, so it can be reused later.

17. Is there anything wrong in sharing a persistent cache between static and dynamic lookup?
Answer:

Static & Dynamic lookup cannot share the same persistent cache.

18. What is the difference between the two update properties - Update else Insert, and Insert else Update - in dynamic lookup cache?

Answer:
In a dynamic cache:
Update else Insert: if the incoming record already exists in the lookup cache, the record is updated in the cache and in the target; otherwise it is inserted.
Insert else Update: if the incoming record does not exist in the lookup cache, the record is inserted in the cache and in the target; otherwise it is updated.
These options play a role in performance. If we know the nature of the source data we can set the option accordingly: if most of the source data is destined for insert, we select Insert else Update; otherwise we go for Update else Insert. Likewise, if the number of duplicate records coming from the source is large we go for Update else Insert, and if there are only a few potential duplicates in the source we go for Insert else Update, for better performance.


19. If the default value for the lookup return port is not set, what will be the output when the
lookup condition fails?


Answer:
NULL will be returned from the Lookup transformation when the lookup condition fails.

20. How can we ensure data is not duplicated in the target when the source has duplicate records, using lookup transformation?

Answer:
Using a dynamic lookup cache we can ensure duplicate records are not inserted into the target: build a dynamic lookup cache on the target table, associate the input ports with the lookup ports, and check the Insert Else Update option. This helps eliminate the duplicate records coming from the source and hence loads only unique records into the target.
For more details check the Dynamic Lookup Cache article.


6. Normalizer Transformation
1. What is a Normalizer transformation?

Answer:
The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a data flow when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation appears automatically, creating input and output ports for every column in the source.


2. Scenario Implementation 1
Suppose in our Source Table we have data as given below:

Student Name   Math   Life Science   Physical Science
Sam            100    70             80
John           75     100            85
Tom            80     100            85

We want to load our Target Table as:

Student Name   Subject Name       Marks
Sam            Math               100
Sam            Life Science       70
Sam            Physical Science   80
John           Math               75
John           Life Science       100
John           Physical Science   85
Tom            Math               80
Tom            Life Science       100
Tom            Physical Science   85

Describe your approach.

Answer:
Here, to convert the columns into rows, we have to use the Normalizer transformation followed by an Expression transformation to decode the column taken into consideration. For more details on how the mapping is built please visit Working with Normalizer.
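For comparison, the same columns-to-rows conversion could be expressed in SQL (assuming a hypothetical relational source table STUDENT_MARKS with the columns shown above) as:

SELECT STUDENT_NAME, 'Math' AS SUBJECT_NAME, MATH AS MARKS FROM STUDENT_MARKS
UNION ALL
SELECT STUDENT_NAME, 'Life Science', LIFE_SCIENCE FROM STUDENT_MARKS
UNION ALL
SELECT STUDENT_NAME, 'Physical Science', PHYSICAL_SCIENCE FROM STUDENT_MARKS

The Normalizer, with OCCURS set to 3, achieves the same pivot inside the mapping without pushing the work to the database.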

3. What are levels in Normalizer transformation?


Answer:
The VSAM Normalizer transformation is the Source Qualifier for a COBOL source definition. A COBOL source can contain multiple-occurring data (a group of columns of the same type) and multiple record types in the same file; levels are mostly used for that. The Normalizer tab defines the structure of the source data. A group of columns might define a record in a COBOL source, or it might define a group of multiple-occurring fields in the source.

The column level number identifies groups of columns in the data. Level numbers define a data hierarchy. Columns in a group have the same level number and display sequentially below a group-level column. A group-level column has a lower level number, and it contains no data.

4. What is the purpose of GCID and GK in a Normalizer transformation?

Answer:

Let's take an example. The source data is:

Name     FOOD   HOUSERENT   TRANSPORT
Saurav   1000   2000        500
Jenny    2000   2500        700

When we set the OCCURS property of the Normalizer to 3, the Normalizer creates 3 input ports to get data from the source. Say the 3 columns FOOD, HOUSERENT and TRANSPORT are connected to the 3 input ports of the Normalizer. Then GCID gets the values 1, 2 and 3 corresponding to the connected input columns FOOD, HOUSERENT and TRANSPORT, and the Normalizer generates 3 output rows for each single source row, one per input column value.
GK, on the other hand, keeps a sequence value starting from 1 up to the number of source records; it holds the sequence number of the source record being processed.

The table below helps to visualize the output data from the Normalizer with the GCID and GK fields:

Name     EXPENSEHEAD   GCID_EXPENSEHEAD   EXPENSE   GK_EXPENSEHEAD
Saurav   FOOD          1                  1000      1
Saurav   HOUSERENT     2                  2000      1
Saurav   TRANSPORT     3                  500       1
Jenny    FOOD          1                  2000      2
Jenny    HOUSERENT     2                  2500      2
Jenny    TRANSPORT     3                  700       2


7. Rank Transformation
1. What is a Rank Transform?

Answer:
Rank is an Active Connected transformation used to select a set of top or bottom values of data. It basically filters the required number of records from the top or from the bottom.

2. How does a Rank Transform differ from the Aggregator Transform functions MAX and MIN?

Answer:
Like the Aggregator transformation, the Rank transformation also groups information. However, the Rank transformation allows us to select a group of top or bottom values, not just one value as in the case of the Aggregator MAX and MIN functions.

3. How does a Rank Cache work?


Answer:
During a session, the Integration Service compares an input row with rows in the data cache. If the input row out-ranks a cached row, the Integration Service replaces the cached row with the input row. If we configure the Rank transformation to rank based on different groups, the Integration Service ranks incrementally for each group it finds. The Integration Service creates an index cache to store the group information and a data cache for the row data.

4. What is a RANK port and RANKINDEX?


Answer:

Rank port is an input/output port used to specify the column for which we want to rank the
source values. By default Informatica creates an output port RANKINDEX for each Rank transformation. It stores the ranking position for each row in a group.

5. How can you get ranks based on different groups?


Answer:
Rank transformation lets us group information. We can configure one of its input/output ports as a group by
port. For each unique value in the group port, the transformation creates a group of rows falling within the
rank definition (top or bottom, and a particular number in each rank).


6. What happens if two rank values match?


Answer:


If two rank values match, they receive the same value in the rank index and the transformation skips the
next value.

7. What are the restrictions of Rank Transformation?


Answer:


We can connect ports from only one transformation to the Rank transformation.
We can select the top or bottom rank.
We need to select the Number of records in each rank.
We can designate only one Rank port in a Rank transformation.

8. How does Rank transformation handle string values?


Answer:
Rank transformation can return the strings at the top or the bottom of a session sort order.
When the Integration Service runs in Unicode mode, it sorts character data in the session using
the selected sort order associated with the Code Page of Integration Service which may be
French, German, etc. When the Integration Service runs in ASCII mode, it ignores this setting and
uses a binary sort order to sort character data.

9. What is Dense Rank and does Informatica supports Dense Rank?

Answer:
When multiple rows share the same rank, the next rank in the sequence is not consecutive. DENSE RANK, on the other hand, assigns consecutive ranks.
Take the following example. Let's say we want to see the top 2 highest salaries of each department.

DEPTNO   SAL   RANK   DENSE_RANK
10       400   1      1
10       400   1      1
10       300   3      2
10       100   4      3
20       550   1      1
20       550   1      1
20       150   3      2
30       200   1      1

So the normal RANK generates a result set where a rank can be missed (here RANK = 2 is missing for department 10) due to the sharing of the same rank between multiple records. DENSE RANK, on the other hand, generates all the consecutive ranks.
Informatica's Rank transformation performs a simple RANK, not a DENSE RANK, so using the Informatica Rank transformation we may miss consecutive ranks.
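For reference, the same ranking can be produced directly in the database (assuming an EMP-like table with DEPTNO and SAL columns) using analytic functions:

SELECT DEPTNO, SAL,
       RANK()       OVER (PARTITION BY DEPTNO ORDER BY SAL DESC) AS RNK,
       DENSE_RANK() OVER (PARTITION BY DEPTNO ORDER BY SAL DESC) AS DENSE_RNK
FROM   EMP

Informatica's Rank transformation mimics only the RANK() behaviour, which is why a workaround is needed for DENSE_RANK, as shown in the next question.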

10. How do we achieve DENSE_RANK in Informatica?


Answer:
In order to achieve the DENSE RANK functionality in Informatica we use a combination of Sorter, Expression and Filter transformations. Based on the previous example data set, let's say we want to get the top 2 highest salaries of each department as per DENSE RANK.
Use a SORTER transformation to sort the data: DEPTNO ASC, SAL DESC.
After the Sorter, place an EXPRESSION transformation with the following ports:

PORT_NAME     TYPE   EXPRESSION
DEPT          I/O
SAL           I/O
V_COMP        V      IIF(DEPT <> V_DEPT_PREV, 1, IIF(DEPT = V_DEPT_PREV AND SAL <> V_SAL_PREV, V_COMP + 1, V_COMP))
RANK          O      V_COMP
V_DEPT_PREV   V      DEPT
V_SAL_PREV    V      SAL

Next use a FILTER transformation.


FILTER CONDITION: RANK < 3

11. Source table has 5 rows. Rank in rank transformation is set to 10. How many rows the
rank transformation will output?
Answer:
The Rank transformation will output all 5 rows. The configured number of ranks (10) is only an upper limit; since only 5 rows are available, all 5 are returned.

12. How will you load unique records into a target flat file when the source flat file has duplicate data?


Answer:


In the Rank transformation, group the records using the Group By port(s) and set the Number of Ranks to 1. The Rank transformation then returns one row from each group, which gives us unique records in the target.


8. Router Transformation
1. What is the difference between Router and Filter?
Answer:

Following differences can be noted:

Router transformation divides the incoming records into multiple groups based on some condition; such groups can be mutually inclusive (different groups may contain the same record). Filter transformation, on the other hand, restricts or blocks the incoming record set based on one given condition.

Router transformation itself does not block any record; if a certain record does not match any of the routing conditions, the record is routed to the default group. Filter transformation does not have a default group; if a record does not match the filter condition, the record is blocked (discarded).

Router acts like a CASE ... WHEN statement in SQL (or a switch() statement in C), whereas Filter acts like the WHERE clause in SQL.

In the Filter transformation the records are filtered based on the condition and the rejected rows are discarded. In the Router, multiple conditions can be placed and the rejected rows can still be directed to the default group.

2. What is the minimum number of groups we can declare in a Router transformation?

Answer:
We can define a minimum of one group condition in a Router transformation; it automatically creates another group, called the Default group, to pass those records that do not satisfy any of the defined group conditions.

3. Scenario Implementation 1
Loading Multiple Target Tables Based on Conditions- Suppose we have some serial numbers in a flat file
source. We want to load the serial numbers in two target files one containing the EVEN serial numbers and
the other file having the ODD ones.
Answer:
After the Source Qualifier place a Router Transformation. Create two Groups namely EVEN and ODD, with
filter conditions as:
MOD(SERIAL_NO,2)=0
MOD(SERIAL_NO,2)=1
Then output the two groups into two flat file targets.
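For comparison, the same even/odd split expressed in SQL (assuming the serial numbers were in a hypothetical relational table SERIAL_SRC) would rely on the same MOD logic:

SELECT SERIAL_NO,
       CASE WHEN MOD(SERIAL_NO, 2) = 0 THEN 'EVEN' ELSE 'ODD' END AS TARGET_GROUP
FROM   SERIAL_SRC

The Router groups simply evaluate these MOD conditions inside the mapping instead of in the database.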

4. Scenario Implementation 2

Suppose we have a source table and we want to load three target tables based on source rows such that first
row moves to first target table, second row in second target table, third row in third target table, fourth row
again in first target table so on and so forth. Describe your approach.
Answer:
We can clearly understand that we need a Router transformation to route or filter source data to the three
target tables. Now the question is what will be the filter conditions.


First of all we need an Expression transformation where we have all the source table columns, and along with them another input/output port, say SEQ_NUM, which gets a sequence number for each source row from the NEXTVAL port of a Sequence Generator (start value 0, increment by 1).
Now the filter conditions for the three Router groups will be:
MOD(SEQ_NUM,3)=1 connected to 1st target table
MOD(SEQ_NUM,3)=2 connected to 2nd target table
MOD(SEQ_NUM,3)=0 connected to 3rd target table


5. Scenario Implementation 3
How can we distribute and load n number of Source records equally into two target tables, so that each
have n/2 records?


Answer:
After Source Qualifier use an expression transformation.
In the expression transformation create a counter variable
V_COUNTER = V_COUNTER + 1 (Variable port)
O_COUNTER = V_COUNTER (o/p port)
This counter variable will get incremented by 1 for every new record which comes in.
Router Transformation:


Group_ODD: MOD(O_COUNTER, 2) = 1
Group_EVEN: MOD(O_COUNTER, 2) = 0
Half of the records (all odd-numbered records) will go to Group_ODD and the rest to Group_EVEN.
Finally, connect the two groups to the two target tables.


9. Sequence Generator Transformation


1. What is a Sequence Generator Transformation?

Answer:
A Sequence Generator is a Passive and Connected transformation that generates numeric values. It is used to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.
This transformation by default contains two OUTPUT ports only, namely CURRVAL and NEXTVAL. We cannot edit or delete these ports, nor can we add ports to this unique transformation. We can create approximately two billion unique numeric values, with the widest range from 1 to 2147483647.

2. Define the properties available in the Sequence Generator transformation in brief.


Answer:
Start Value - Start value of the generated sequence that we want the Integration Service to use if we use the Cycle option. If we select Cycle, the Integration Service cycles back to this value when it reaches the end value. Default is 0.

Increment By - Difference between two consecutive values from the NEXTVAL port. Default is 1.

End Value - Maximum value generated by the Sequence Generator. After reaching this value the session will fail if the Sequence Generator is not configured to cycle. Default is 2147483647.

Current Value - Current value of the sequence. Enter the value we want the Integration Service to use as the first value in the sequence. Default is 1.

Cycle - If selected, when the Integration Service reaches the configured end value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.

Number of Cached Values - Number of sequential values the Integration Service caches at a time. Default value for a standard Sequence Generator is 0. Default value for a reusable Sequence Generator is 1,000.

Reset - Restarts the sequence at the current value each time a session runs. This option is disabled for reusable Sequence Generator transformations.


3. Scenario Implementation 1
Suppose we have a source table populating two target tables. We connect the NEXTVAL port of the Sequence Generator to the surrogate keys of both the target tables.
Will the Surrogate keys in both the target tables be same? If not how can we flow the same sequence values
in both of them.

Answer:
When we connect the NEXTVAL output port of the Sequence Generator directly to the surrogate key columns of both target tables, the surrogate keys will not be the same. A block of sequence numbers is sent to one target table's surrogate key column, and the second target receives its block of sequence numbers from the Sequence Generator transformation only after the first target table has received its block.
Suppose we have 5 rows coming from the source; the targets will then have the sequence values TGT1 (1,2,3,4,5) and TGT2 (6,7,8,9,10). [Taking into consideration Start Value 0, Current Value 1 and Increment By 1.]
Now suppose the requirement is that we need the same surrogate keys in both targets. The easiest way to handle this is to put an Expression transformation between the Sequence Generator and the target tables. The Sequence Generator passes unique values to the Expression transformation, and the rows are then routed from the Expression transformation to both targets.

4. Scenario Implementation 2
Suppose we have 100 records coming from the source. Now for a target column population we used a Sequence generator.
Suppose the Current Value is 0 and End Value of Sequence generator is set to 80. What will happen?
Answer:
End Value is the maximum value the Sequence Generator will generate. After it reaches the End value the
session fails with the following error message:


TT_11009 Sequence Generator Transformation: Overflow error.
Failing of the session can be avoided if the Sequence Generator is configured to Cycle through the sequence, i.e. whenever the Integration Service reaches the configured End Value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.

5. What are the changes we observe when we promote a non-reusable Sequence Generator
to a reusable one? And what happens if we set the Number of Cached Values to 0 for a
reusable transformation?
Answer:
When we convert a non-reusable sequence generator to reusable one we observe that the Number of
Cached Values is set to 1000 by default.

When we try to set the Number of Cached Values property of a reusable Sequence Generator to 0 in the Transformation Developer, we encounter the following error message:
The number of cached values must be greater than zero for reusable sequence transformation.
Also, the Reset property is disabled for a reusable Sequence Generator.

6. How Sequence Generator in the mapping is handled when we migrate the mapping from
one environment to another?
Answer:
While promoting the Informatica Objects using Copy Folder Wizard we have the option to choose to retain
existing values or to replace them with values from the source folder.
Generally we retain the current values for the Sequence Generator transformation in the destination folder; otherwise we may end up having duplicate values in the sequence-generated column, which may result in session failure.

Find below the Informatica metadata query which gives the list of the current values of Sequence Generator transformations:

SELECT
OPB_SUBJECT.SUBJ_NAME AS "FOLDER NAME",
OPB_MAPPING.MAPPING_NAME AS "MAPPING NAME",
REP_WIDGET_INST.INSTANCE_NAME AS "SEQ NAME",
OPB_WIDGET_ATTR.ATTR_VALUE AS "CURRENT VALUE"
FROM REP_WIDGET_INST
INNER JOIN OPB_MAPPING ON
(REP_WIDGET_INST.MAPPING_ID = OPB_MAPPING.MAPPING_ID)
INNER JOIN OPB_WIDGET_ATTR ON
(REP_WIDGET_INST.WIDGET_TYPE = OPB_WIDGET_ATTR.WIDGET_TYPE AND
REP_WIDGET_INST.WIDGET_ID = OPB_WIDGET_ATTR.WIDGET_ID)
INNER JOIN OPB_SUBJECT ON
(OPB_MAPPING.SUBJECT_ID = OPB_SUBJECT.SUBJ_ID )
WHERE
REP_WIDGET_INST.WIDGET_TYPE_NAME like 'Sequence%'
AND OPB_WIDGET_ATTR.ATTR_ID = 4 --Current Value
ORDER BY OPB_MAPPING.MAPPING_NAME


7. Scenario Implementation 3

Consider we have two mappings that populate a single target table from two different source systems. Both mappings have a Sequence Generator transformation to generate the surrogate key in the target table. How can we ensure that the surrogate keys generated are consistent and do not produce duplicate values when data is populated from the two different mappings?

Answer:
We should use a Reusable Sequence Generator in both the mappings to generate the target surrogate keys.

8. How do I get a Sequence Generator to "pick up" where another "left off"?


Use an unconnected lookup on the sequence ID of the target table. Configure it to return the last (maximum) value, with an input port for the ID and the condition SEQ_ID >= input_ID. Then, in an Expression transformation, set up a variable port and connect a new self-resetting Sequence Generator to a new input port in the expression. The variable port's expression should read: IIF(v_seq = 0 OR ISNULL(v_seq), :LKP.lkp_sequence(1), v_seq). Then set up an output port and change its expression to read: v_seq + input_seq (from the resetting Sequence Generator). Thus you have just completed an "append" without a break in the sequence numbers.


10. Stored Procedure Transformation

1. What is a Stored Procedure Transformation?

Answer:
Stored Procedure is a Passive transformation used to execute stored procedures pre-built in the database through Informatica ETL. It can also be used to call functions that return calculated values.

2. How many types of Stored Procedure transformation are there?


Answer:

There are two types of Stored Procedure transformation based on calling, Connected and Unconnected. Based on the execution order they can be classified as Source Pre Load, Source Post Load,
Normal, Target Pre Load and Target Post Load.
Normal Stored Procedure transformation can be configured as both connected and unconnected whereas
Pre-Post Load Stored Procedures are unconnected ones.

3. How do we call an Unconnected Stored Procedure transformation?

Answer:

Answer:
The unconnected Stored Procedure transformation is called from expression transformation using the
:SP.<Stored_Procedure_Name>(Argument1, Argument2).
Conditional execution of a Stored Procedure is possible using Unconnected Stored Procedure unlike the connected one.


4. How do we set the Execution order of Pre-Post Load Stored Procedure?


Answer:
We set the execution order using the Stored Procedure Plan from the mapping property.

5. How do we set the Call Text for Stored Procedure transformation?


Answer:
Once we specify the Stored Procedure Type other than Normal, the Call Text Attribute in the Properties tab
gets enabled. Here we have to specify how the procedure has to be called along with arguments to be
passed. E.g. <Stored_Procedure_Name>(Argument1, Argument2).


6. How do we receive output/return parameters from Unconnected Stored Procedure?

Answer:
Configure the expression to send any input parameters and capture any output parameters or the return value. You must know whether the parameters shown in the Expression Editor are input or output parameters. You insert variables or port names between the parentheses in the exact order that they appear in the stored procedure itself. The datatypes of the ports and variables must match those of the parameters passed to the stored procedure.
For example, when you click the stored procedure, something similar to the following appears:
:SP.GET_NAME_FROM_ID()
This particular stored procedure requires an integer value as an input parameter and returns a string value
as an output parameter. How the output parameter or return value is captured depends on the number of
output parameters and whether the return value needs to be captured.
If the stored procedure returns a single output parameter or a return value (but not both), you should use
the reserved variable PROC_RESULT as the output variable. In the previous example, the expression would
appear as:
:SP.GET_NAME_FROM_ID(inID, PROC_RESULT)
InID can be either an input port for the transformation or a variable in the transformation. The value of
PROC_RESULT is applied to the output port for the expression.
If the stored procedure returns multiple output parameters, you must create variables for each output parameter. For example, if you created a variable called varOUTPUT1 for the first output parameter and use PROC_RESULT for the second, the expression would appear as:
:SP.GET_NAME_FROM_ID (inID, varOUTPUT1, PROC_RESULT)
The value of the second output port is applied to the output port for the expression, and the value of the
first output port is applied to varOUTPUT1. The output parameters are returned in the order they are declared in the stored procedure itself.
With all these expressions, the datatypes for the ports and variables must match the datatypes for the input/output variables and return value.


11. Sorter Transformation

1. What is a Sorter Transformation?

Answer:
Sorter is an Active Connected transformation used to sort data in ascending or descending order according to specified sort keys. The Sorter transformation contains only input/output ports.

2. Why is Sorter an Active Transformation?


Answer:
This is because we can select the distinct option in the sorter property. When the Sorter transformation is
configured to treat output rows as distinct, it assigns all ports as part of the sort key. The Integration Service discards duplicate rows compared during the sort operation. The number of Input
Rows will vary as compared with the Output rows and hence it is an Active transformation.

3. How does Sorter handle Case Sensitive sorting?


Answer:

The Case Sensitive property determines whether the Integration Service considers case when sorting data.
When we enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than
lowercase characters.

4. How does Sorter handle NULL values?


Answer:

We can configure the way the Sorter transformation treats null values. Enable the property Null Treated
Low if we want to treat null values as lower than any other value when it performs the sort operation. Disable this option if we want the Integration Service to treat null values as higher than any other value.

5. How does a Sorter Cache work?


Answer:
The Integration Service passes all incoming data into the Sorter Cache before Sorter transformation performs the sort operation.
The Integration Service uses the Sorter Cache Size property to determine the maximum amount


of memory it can allocate to perform the sort operation. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or
equal to the amount of available physical RAM on the Integration Service machine.

If the amount of incoming data is greater than the Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory.

6. How to delete duplicate records, or rather select distinct rows, for flat file sources?
Answer:


Since the source system is a flat file, we will not be able to select the Distinct option in the Source Qualifier, as it is disabled for flat file sources. Hence the approach is to use a Sorter transformation and check the Distinct option. When we select the Distinct option, all the columns are selected as sort keys, in ascending order by default.


12. Union Transformation

1. What is a Union Transformation?

Answer:
Union is an Active, Connected non-blocking multiple input group transformation used to merge data from
multiple pipelines or sources into one pipeline branch. Similar to the UNION ALL SQL statement, the Union
transformation does not remove duplicate rows.
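To illustrate the analogy, a sketch of the equivalent SQL (assuming two hypothetical source tables CUSTOMERS_US and CUSTOMERS_EU with matching columns) would be:

SELECT CUST_ID, CUST_NAME FROM CUSTOMERS_US
UNION ALL
SELECT CUST_ID, CUST_NAME FROM CUSTOMERS_EU

Like UNION ALL, the Union transformation keeps duplicates; a following Sorter or Aggregator would be needed to remove them.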

2. What are the restrictions of Union Transformation?


Answer:

All input groups and the output group must have matching ports. The precision, data type, and scale
must be identical across all groups.
We can create multiple input groups, but only one default output group.
The Union transformation does not remove duplicate rows.
We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union
transformation.
The Union transformation does not generate transactions.

3. How come Union transformation is active?

Answer:

Active transformations are those that may change the number or position of rows in the data
stream. Any transformation that splits or combines data streams or reduces, expands or sorts data is an active transformation because it cannot be guaranteed that when data passes through the
transformation the number of rows and their position in the data stream are always unchanged.
Union is an active transformation because it combines two or more data streams into one. Though the total
number of rows passing into the Union is the same as the total number of rows passing out of it, and the sequence of rows from any given input stream is preserved in the output, the positions of the rows are not
preserved, i.e. row number 1 from input stream 1 might not be row number 1 in the output stream. Union
does not even guarantee that the output is repeatable.
Also, for the Union transformation the number of output rows does not match the number of input rows of any individual input group. Consider two sources with 10 and 20 rows respectively; for each of these input groups we get 30 output rows. We could consider this like a Joiner of 10 and 20 rows with a Full Outer Join on no matching columns, which gives all the rows as output.
It is a debatable Topic as why UNION transformation is Active. Union Transformation is derived from
Multigroup External transformation. As Multigroup External transformation is Active, Union transformation
can be termed as active.


13. Update Strategy Transformation

1. What is Update Strategy transform?


Answer:


Update strategy defines the sources to be flagged for insert, update, delete, and reject at the targets.

2. What are Update Strategy Constants?


Answer:
DD_INSERT - 0
DD_UPDATE - 1
DD_DELETE - 2
DD_REJECT - 3


3. How can we update a record in target table without using Update strategy?

Answer:
A target table can also be updated without using an Update Strategy transformation. For this, we need to define the key in the target table at the Informatica level, and then we need to connect the key and the field we want to update in the mapping target. At the session level, we should set the target property to Update as Update and enable the Update check-box.
Let's assume we have a target table "Customer" with fields "Customer ID", "Customer Name" and "Customer Address". Suppose we want to update "Customer Address" without an Update Strategy. Then we
have to define "Customer ID" as primary key in Informatica level and we will have to connect Customer ID
and Customer Address fields in the mapping. If the session properties are set correctly as described above,
then the mapping will only update the customer address field for all matching customer IDs.

4. What is Data Driven?


Answer:
Update strategy defines the sources to be flagged for insert, update, delete, and reject at the targets.
Treat input rows as Data Driven: This is the default session property option selected while using an Update
Strategy transformation in a mapping.
The integration service follows the instructions coded in mapping to flag the rows for insert, update, delete
or reject.


5. What happens when DD_UPDATE is defined in update strategy and Treat source rows as
INSERT is selected in Session?
Answer:

If anything other than Data Driven is selected in the session (Treat Source Rows As), then the Update Strategy in the mapping is ignored.

6. What are the three areas where the rows can be flagged for particular treatment?

Answer:
In the mapping - Update Strategy transformation
In the session - Treat Source Rows As
In the session - Target Insert / Update / Delete options

7. By default operation code for any row in Informatica without being altered is INSERT.
Then state when do we need DD_INSERT?
Answer:
When we handle data insertion, updating, deletion and/or rejection in a single mapping, we use
Update Strategy transformation to flag the rows for Insert, Update, Delete or Reject. We flag it
by either providing the values 0, 1, 2, 3 respectively or by DD_INSERT, DD_UPDATE, DD_DELETE
or DD_REJECT in the Update Strategy transformation. By default the transform has the value '0'
and hence it performs insertion.
Suppose we want to perform insert or update target table in a single pipeline. Then we can write the below
expression in update strategy transformation to insert or update based on the incoming row.
IIF (LKP_EMPLOYEE_ID IS NULL, DD_INSERT, DD_UPDATE)
If we can use more than one pipeline, then it's not a problem; for the insert part we don't even need an Update Strategy transformation explicitly (DD_INSERT), we can map it straight away.

8. What is the difference between update strategy and following update options in target?
Update as Update, Update as Insert, Update else Insert. Even if we do not use an Update Strategy we can still update the target by setting, for example, Update as Update and treating source rows as Data Driven. So what's the difference here?
Answer:
The operations for the following options will be done in the Database Level.
Update as Update
Update as Insert
Update else Insert


It will issue a SELECT statement on the target table and compare with the source; accordingly, if the record already exists it will do an update, else it will insert. With the Update Strategy, on the other hand, the operations are decided at the Informatica level itself.

The Update Strategy also gives a conditional update option, wherein based on some condition we can update, insert or even reject the rows. Such conditional options are not available with target-based updates (where it will either update, or perform update-else-insert, based on the keys defined at the Informatica level).

9. What is the use of Forward Rejected Rows in a Mapping?

Answer:
If DD_REJECT is selected in the Update Strategy, then we need to select this option to forward the rejected rows and generate the Reject/Bad file.

10. Scenario Implementation 1


Suppose we have source employee table and we want to load employees who belong to department 10 to
Target 1, 20 to Target 2 and 30 to Target 3. Describe the approach without using FILTER or ROUTER Transformations.
Answer:


We will use three separate Update Strategy transformations before each of the target tables (T1, T2, T3),
and provide below condition in their expression editor:
UPD_T1: IIF (DEPTNO = 10, DD_INSERT, DD_REJECT)
UPD_T2: IIF (DEPTNO = 20, DD_INSERT, DD_REJECT)
UPD_T3: IIF (DEPTNO = 30, DD_INSERT, DD_REJECT)


14. Java Transformation

1. Scenario Implementation 1
Suppose we have the source data below, and we want to load the target such that each source row is repeated as many times as the value in Col2:

Source:

Col1   Col2
A      3
B      2
C      2

Target:

Col1   Col2
A      3
A      3
A      3
B      2
B      2
C      2
C      2

Answer:
Using the Java transformation in Informatica we can generate as many records as required. Here goes the Java code (placed on the On Input Row tab of an active Java transformation):

In_Col1 = Col1;
In_Col2 = Col2;
for (int i = 0; i < In_Col2; i++) {
    Out_Col1 = In_Col1;
    Out_Col2 = In_Col2;
    generateRow();
}

2. Scenario Implementation 2
How can I replace each character (e.g. A to Z) in a particular string with its ASCII value, leaving the digits unchanged?
E.g. Input string: AB123C1; Output string: 6566123671
Answer:
If the INPUT string is a fixed size of 9 characters, use the below code as the expression in an output port of an Informatica Expression transformation. Alternatively, you can wrap it in an Informatica user-defined function with the INPUT string as an argument:
IIF( IS_NUMBER( SUBSTR( INPUT, 1, 1 ) ) = 1, SUBSTR( INPUT, 1, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 1, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 2, 1 ) ) = 1, SUBSTR( INPUT, 2, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 2, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 3, 1 ) ) = 1, SUBSTR( INPUT, 3, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 3, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 4, 1 ) ) = 1, SUBSTR( INPUT, 4, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 4, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 5, 1 ) ) = 1, SUBSTR( INPUT, 5, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 5, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 6, 1 ) ) = 1, SUBSTR( INPUT, 6, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 6, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 7, 1 ) ) = 1, SUBSTR( INPUT, 7, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 7, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 8, 1 ) ) = 1, SUBSTR( INPUT, 8, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 8, 1 ) ) ) ) ||
IIF( IS_NUMBER( SUBSTR( INPUT, 9, 1 ) ) = 1, SUBSTR( INPUT, 9, 1 ),
TO_CHAR( ASCII( SUBSTR( INPUT, 9, 1 ) ) ) )

As per the requirement we want to convert just the characters in the input string to their ASCII equivalents, not the digits.
If the requirement were to convert a single character to its ASCII equivalent, the built-in ASCII function of Informatica would have been enough, e.g. ASCII(inp_chr). But since this is a string and we need the ASCII equivalent of each character in the string, i.e. we have to parse each character, the concept of a loop comes into the picture. So we can use the Informatica Java transformation instead.


Use Informatica Passive Java transformation:


We have the input column INPUT and the output port OUTPUT created in the Java transformation. On the Java Code tab of the Java transformation use the below Java code:

String inp = INPUT;
String ch;
String out = "";
for (int i = 0; i < inp.length(); i++) {
    ch = inp.substring(i, i + 1);
    char c = inp.charAt(i);
    if (!Character.isDigit(c)) {
        // replace a non-digit character with its ASCII (numeric) value
        int j = (int) c;
        out = out + j;
    } else {
        out = out + ch;
    }
}
OUTPUT = out;


15. Source Qualifier Transformation

1. What is a Source Qualifier? What are the tasks we can perform using a Source Qualifier
and why it is an ACTIVE transformation?

Answer:
A Source Qualifier is an Active and Connected transformation that reads the rows from a relational database or flat file source.
We can configure the SQ to join [both INNER as well as OUTER JOIN] data originating from the same source database.
We can use a source filter to reduce the number of rows the Integration Service queries.
We can specify a number of sorted ports and the Integration Service adds an ORDER BY clause to the default SQL query.
We can choose the Select Distinct option for relational databases and the Integration Service adds a SELECT DISTINCT clause to the default SQL query.
Also, we can write a Custom/User Defined SQL query which will override the default query in the Source Qualifier by changing the default settings of the transformation properties for relational databases.
Also, we have the option to write Pre as well as Post SQL statements to be executed before and after the Source Qualifier query in the source database.
Since the transformation provides us with the Select Distinct property, the Integration Service can add a SELECT DISTINCT clause to the default SQL query, which in turn affects the number of rows returned by the database to the Integration Service; hence it is an Active transformation.

2. What happens to a mapping if we alter the data types between the Source and its corresponding Source Qualifier?
Answer:
The Source Qualifier transformation displays the Informatica data types. The transformation data types determine how the source database binds data when the Integration Service reads it.
Now if we alter the data types in the Source Qualifier transformation or the data types in the Source definition and Source Qualifier transformation do not match, the Designer marks the mapping as invalid when
we save the mapping.

3. Suppose we have used the Select Distinct and the Number of Sorted Ports property in the
Source Qualifier and then we add Custom SQL Query. Explain what will happen.
Answer:
Whenever we add Custom SQL or SQL override query it overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation. Hence only the user
defined SQL Query will be fired in the database and all the other options will be ignored.


4. Describe the situations where we will use the Source Filter, Select Distinct and Number of
Sorted Ports properties of Source Qualifier transformation.

Answer:
The Source Filter option is used basically to reduce the number of rows the Integration Service queries, so as to improve performance.
The Select Distinct option is used when we want the Integration Service to select unique values from a source. Filtering out unnecessary data earlier in the data flow can improve performance.
The Number of Sorted Ports option is used when we want the source data to be in a sorted fashion, so as to use the same in following transformations like Aggregator or Joiner, which when configured for sorted input will improve performance.

5. What will happen if the SELECT list COLUMNS in the Custom override SQL Query and the
OUTPUT PORTS order in Source Qualifier transformation do not match?
Answer:
A mismatch or change in the order of the list of selected columns in the SQL query override of the Source Qualifier, relative to the connected transformation output ports, may produce unexpected values in the ports if the data types happen to match by chance, or else will lead to session failure.

6. What happens if in the Source Filter property of SQ transformation we include keyword


WHERE say, WHERE CUSTOMERS.CUSTOMER_ID > 1000.
Answer:
We use Source filter to reduce the number of source records. If we include the string WHERE in the source
filter, the Integration Service fails the session. In the above case, the correct syntax will be CUSTOMERS.CUSTOMER_ID > 1000
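For illustration, with the above source filter on a hypothetical CUSTOMERS source, the Integration Service would generate a default query roughly like:

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.CUSTOMER_NAME
FROM   CUSTOMERS
WHERE  CUSTOMERS.CUSTOMER_ID > 1000

i.e. the filter text is appended after a WHERE that the Integration Service itself adds, which is why including the keyword again fails the session.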

7. Describe the scenarios where we go for Joiner transformation instead of Source Qualifier
transformation.
Answer:
While joining Source Data of heterogeneous sources as well as to join flat files we will use the Joiner transformation. Use the Joiner transformation when we need to join the following types of sources:
Join data from different Relational Databases.
Join data from different Flat Files.
Join relational sources and flat files.


8. What is the maximum number we can use in Number of Sorted Ports for Sybase source
system?

Sybase supports a maximum of 16 columns in an ORDER BY clause. So if the source is Sybase, do not sort
more than 16 columns.

9. What is use of Source Qualifier in Informatica? Can we create a mapping without a source
qualifier?

Answer:
The Source Qualifier is used to convert the data types of the heterogeneous source objects supported by Informatica to native Informatica data types, after which Informatica processes the following objects in a mapping with consistent Informatica data types.
Also, for relational tables the Source Qualifier helps to join multiple tables from the same database and allows doing Pre or Post SQL operations.
We cannot create a mapping without a Source Qualifier; it is the first transformation in Informatica that is attached to the source table or source flat file instance.

10. Suppose we have two tables of same database type, residing in different Database instance. If a Database Link is available, how can we join the two tables using a Source
Qualifier in Informatica provided there are valid join columns.
Answer:

Source Qualifier Override:

SELECT e.empno, e.ename, s.salary, s.comm


FROM emp e, sal@dblinkname s
WHERE e.empno=s.empno
It is advisable to create a Public Synonym at Database for the remote tables so that we can avoid using the
syntax : TableName@DBLinkName

11. What is the meaning of output is deterministic property in source qualifier transformation?
Answer:
Output is deterministic means we are informing Informatica that the output does not change (for
the same input) across every session run. Why is this required? Consider the source is relational
and we have enabled the session for recovery. The session fails and we resume the session. In this


case if we have set the source as deterministic, then the session would have created a cache (on the disc) of
the source during normal run to be used for recovery. This saves time during recovery because we need not
issue the SQL command to the source database again.

If this is not set, then the source data cache is not created during the normal run and the SQL will be reissued during recovery. In some cases, if this property is not set you will not be able to enable recovery for the session.

12. Scenario Implementation 1


How to delete duplicate rows present in relational database using Informatica? Suppose we have duplicate
records in Source System and we want to load only the unique records in the Target System eliminating the
duplicate rows. What will be the approach?
Answer:


Assuming that the source system is a Relational Database, to eliminate duplicate records, we can check
the Distinct option of the Source Qualifier of the source table and load the target accordingly.


16. Miscellaneous

1. What are the new features of Informatica 9.x in developer level?

Answer:
From a developer's perspective, some of the new features in Informatica 9.x are as follows:
Now Lookup can be configured as an active transformation - it can return multiple rows on successful match.
Now you can write SQL override on un-cached lookup also. Previously you could do it only on
cached lookup.
You can control the size of your session log. In a real-time environment you can control the session
log file size or time.
Database deadlock resilience feature - this will ensure that your session does not immediately fail if
it encounters any database deadlock, it will now retry the operation again. You can configure number of retry attempts.
Cache can be updated based on a condition or expression.
New interface for admin console, now onwards called Informatica Administrator. (Create connection
objects, grant permission on database connections, deploy or configure deployment units from the
Informatica Administrator)
PowerCenter licensing now onwards based on the number of CPUs and repositories.

2. Name the transformations which convert one row to many rows, i.e. increase the input-to-output row count. Also, what is the name of its reverse transformation?

Answer:

Answer:
Normalizers as well as Router Transformations are two Active transformations which can increase the number of input rows to output rows.
Aggregator Transformation performs the reverse action of Normalizer transformation.


3. How many ways we can filter records?


Answer:

Source Qualifier
Filter transformation
Router transformation
Update strategy

4. What are the transformations that use cache for performance?


Answer:


Aggregator, Sorter, Lookups, Joiner and Rank transformations use cache.

5. What is the formula for calculation of Lookup/Rank/Aggregator index & data caches?

Answer:
As a generic guideline:
Index cache size = total number of rows * size of the columns used in the lookup/group-by/join condition
Data cache size = total number of rows * size of the connected output ports

The documented sizing formulas are:

Aggregator index cache = #Groups * ((sum of column sizes) + 7)
Aggregator data cache  = #Groups * ((sum of column sizes) + 7)

Rank index cache = #Groups * ((sum of column sizes) + 7)
Rank data cache  = #Groups * ((#Ranks * (sum of column sizes + 10)) + 20)

Lookup index cache = #Rows in lookup table * ((sum of column sizes) + 16)
Lookup data cache  = #Rows in lookup table * ((sum of column sizes) + 8)

Joiner index cache = #Master rows * ((sum of column sizes) + 16)
Joiner data cache  = #Master rows * ((sum of column sizes) + 8)
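For a rough worked example (hypothetical numbers): suppose a lookup table has 100,000 rows, the lookup condition column is 4 bytes wide and the connected output ports total 60 bytes per row. Then:

Lookup index cache = 100,000 * (4 + 16)  = 2,000,000 bytes (about 2 MB)
Lookup data cache  = 100,000 * (60 + 8) = 6,800,000 bytes (about 6.8 MB)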

6. What is the difference between Informatica PowerCenter, PowerExchange and PowerMart?
Answer:
PowerCenter:
PowerCenter can have many repositories.
It supports the Global Repository and networked local repositories.
PowerCenter can connect to all native legacy source systems such as Mainframe, ERP, CRM, EAI
(TIBCO, MSMQ, JMQ)
High Availability and Load sharing on multiple servers in the grid.
Informatica session-level partitioning is available.
Informatica Pushdown Optimizer is available.
PowerMart:
PowerMart supports only one repository.
PowerMart can connect to Relational and flat file sources.
PowerExchange:
PowerExchange Client and PowerExchange ODBC are PowerExchange interfaces to extract and load
data for a variety of data types on a variety of platforms relational, non-relational, and changed data
in batch-mode or real-time using PowerCenter.


The PowerExchange Client for PowerCenter is installed with PowerCenter and integrates
PowerExchange(Separate License for the required source system; Check Sources->Import from
PowerExchange) and PowerCenter to extract relational, non-relational, and changed data.

7. How do we handle the delimiter character when it appears as part of the data in a delimited source file?

Answer:
For delimited files, the delimiter is the separator that identifies the individual field values present in the file. So if a field value contains the delimiter character as part of the data, that value must either be enclosed within double or single quotes, or an escape character must precede the embedded delimiter so that it is treated as a regular character.
To handle such flat files in Informatica, use the following options while defining the file structure, as per the data file format:
1. Set Optional Quotes to Double or Single Quote. The column delimiters within the quote characters are then ignored.
2. Set the Escape Character used to escape the delimiter or quote character. An escape character preceding the delimiter character in an unquoted string, or preceding the quote character in a quoted string, makes it be treated as a regular character.

8. We have just received source files from UNIX. We want to stage that data to ETL process.
What are the points we need to look for?
Answer:
When a source flat file is loaded to a staging database table, generally we focus on the below items:
Define proper file-format for the input file (Delimited/Fixed-width), Code Page etc.
Header information having any Processing date to be checked with sysdate or some other business
logic.
Check the detail records count in the file with the information in the Trailer information if any.
Sum of any measure fields of detail records matches with Header/Trailer information if any.
In case of Indirect Loading we can add the filename and record number in file as part of columns in
the staging table.
Ultimately, everything depends on the business requirement.

9. What is the difference between Joiner and Lookup. Performance wise which one is better
to use.
Answer:
Joiner:



Only the = operator can be used in the join condition
Supports normal, full outer and master/detail (left/right) outer joins
Active transformation
No SQL override
Connected transformation only
No dynamic cache
Supports heterogeneous sources

Lookup:
=, <, <=, >, >=, != operators can be used in the lookup condition
Behaves like a left outer join on the lookup source
Earlier a passive transformation; from 9.x onwards it can be configured as active (it can return more than one record in case of multiple matches)
Supports SQL override
Connected or unconnected
Supports dynamic cache update
Relational or flat-file lookup source or target, including pipeline lookups

Selection between these two transformations is completely dependent on the project requirement. It is a debatable topic which of the two serves better in terms of performance.

10. What is B2B in Informatica? How can we use it in Informatica?

Answer:

B2B allows us to parse and read unstructured data such as PDF, Excel, HTML etc. It also has the capability to read binary data such as messages, EBCDIC files etc., and has a very large list of supported formats.
B2B Data Transformation Studio is the developer tool with which the parsing (reading) of the unstructured data is done. B2B mostly gives its output as an XML file.
B2B Data Transformation is integrated with Informatica PowerCenter using the Unstructured Data Transformation. This transformation can receive the output of B2B Data Transformation Studio and load it into any target supported by PowerCenter.

11. What is CDC, SCD and MD5 in Informatica?


Answer:
CDC - Changed Data Capture: the mechanism by which only the changed data is captured from the source system.
SCD - Slowly Changing Dimension: the way history is maintained in dimension tables.
MD5 - MD5 checksum function: it generates a 32-character hexadecimal checksum of its input, which can be used to decide the insert/update strategy for target records by comparing the checksum of the incoming row with the checksum stored against the existing target row.
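Within PowerCenter this comparison is typically done with the MD5() function in an Expression transformation. Purely as an illustration of the idea, here is a minimal SQL sketch (table and column names are hypothetical, and it assumes a database such as PostgreSQL that provides an md5() function):

SELECT s.customer_id,
       CASE
         WHEN t.customer_id IS NULL                          THEN 'INSERT'
         WHEN md5(s.name || '|' || s.city) <> t.md5_checksum THEN 'UPDATE'
         ELSE 'IGNORE'
       END AS load_flag
FROM   stg_customer s
LEFT JOIN dim_customer t ON t.customer_id = s.customer_id;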


12. How can we implement an SCD Type2 mapping without using a lookup transformation?

Answer:
The entire implementation will be the same as that using a Lookup. The only thing is that we need to replace the Lookup transformation with a Joiner transformation. In the Joiner transformation the source table will be used as the Master and the target table as the Detail. The join condition will be the same as the lookup condition, and the join type will be Detail Outer Join.
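As a rough SQL illustration of what that Joiner achieves (hypothetical names), each source row is classified as new or changed before the SCD Type 2 insert/expire logic is applied:

SELECT s.cust_id,
       CASE
         WHEN t.cust_id IS NULL THEN 'NEW'        -- no current target row: insert
         WHEN s.city <> t.city  THEN 'CHANGED'    -- attribute changed: expire old row, insert new version
         ELSE 'UNCHANGED'
       END AS scd_action
FROM   src_customer s
LEFT OUTER JOIN tgt_dim_customer t
       ON t.cust_id = s.cust_id AND t.current_flag = 'Y';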

13. How does Joiner and Lookup transformation treat NULL value matching?
Answer:

A NULL value is not equal to another NULL value in a Joiner transformation, whereas the Lookup transformation matches NULL values.

14. Does Microsoft SQL Server support bulk loading? If yes, what happens when you specify bulk mode and Data Driven for a SQL Server target?

Answer:


Yes MS SQL Server supports Bulk Loading. But if we select Treat Source Rows as Data Driven with the Target
Load Type as Bulk then the session will fail. We have to select Normal Load with Data Driven source records.

15. How can you utilize COM components in Informatica?


Answer:


By writing C++, VB or VC++ code and invoking it through an External Procedure transformation.

16. What is SQL transformation in Informatica?


Answer:
A SQL transformation can process SQL queries midstream in an Informatica pipeline. It supports most DDL, DML, DCL and TCL statements.
For quick reference, the following are some important notes. We can configure the SQL transformation in two modes, which makes it active or passive:
Query mode (active) - the transformation executes the SQL query against the database connection defined for the transformation.
Script mode (passive) - the transformation calls external SQL scripts, whose names are passed in on the input rows, and executes them.

Query mode can be configured to handle Static SQL Query (i.e. the SQL query is the same with bind
variables) or Dynamic SQL Query (i.e. different query statements for each input row).
In case of a dynamic query, substituting the entire SQL statement through the query port is called a Full Query, while substituting only a portion of the query statement is called a Partial Query.
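For illustration, a minimal sketch of a static query in query mode (hypothetical table and port names; query mode binds input ports using the ?port_name? notation):

SELECT emp_name, salary
FROM   emp
WHERE  dept_id = ?in_dept_id?

Here in_dept_id is an input port of the SQL transformation, and the selected columns are returned through its output ports.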


We can configure the SQL transformation to connect to a database with a Static Connection (i.e. selecting a particular connection object) or Dynamic Connection (i.e. based on the logic it will dynamically select the connection object to connect to a database).
Also we can pass the entire database connection information (i.e. username,password, connectstring,
codepage) called Full Database Connection.

17. What is an XML Source Qualifier?

The XML source qualifier represents the data elements that the Informatica server reads when it runs a session with XML sources.

18. What is the metadata extensions tab in Informatica?

Answer:
PowerCenter allows end users and partners to extend the metadata stored in the repository by associating information with individual repository objects; that is why it is called a metadata extension.
For example, when we create a mapping, we can store information such as the mapping functionality, business user information or CR information along with it. Similarly, for a session we can store schedule information or the contact person in case of failure. We associate this information with repository metadata using metadata extensions.
metadata extensions.
When we create reusable metadata extensions for a repository object using the Repository Manager, the
metadata extension becomes part of the properties of that type of object. For example, we can create a reusable metadata extension for source definition called SourceCreator. When we create or edit any source
definition in the Designer, the SourceCreator extension appears on the Metadata Extensions tab. Anyone who creates or edits a source can enter the name of the person that created the source into this field.
PowerCenter Client applications can contain the following types of metadata extensions:
Vendor-defined - Third-party application vendors create vendor-defined metadata extensions. We can view and change the values of vendor-defined metadata extensions, but we cannot create, delete or redefine them.
User-defined - We create user-defined metadata extensions using PowerCenter. We can create, edit, delete and view user-defined metadata extensions, and we can also change their values.
All metadata extensions exist within a domain. We see the domains when we create, edit, or view metadata
extensions. Vendor-defined metadata extensions exist within a particular vendor domain. If we use third-party applications or other Informatica products, we may see domains such as Ariba or PowerExchange for Siebel. We cannot edit vendor-defined domains or change the metadata extensions in them.

User-defined metadata extensions exist within the User Defined Metadata Domain. When we create metadata extensions for repository objects, we add them to this domain.

Both vendor-defined and user-defined metadata extensions can exist for the following repository objects: source definitions, target definitions, transformations, mappings, mapplets, sessions, tasks, workflows and worklets.

19. Describe some of the ETL Best Practices


Answer:

A lot of best practices may be applicable to a certain tool and pointless for another. At a very high level, and in a tool-independent way:

Naming conventions for ETL objects


Naming conventions for Database objects
Parameterization of connections (so that things are easy for moving from 1 environment to other)
Maintaining of ETL job log - ideally automated maintenance through logging of job run
Handling of rejected records (and logging)
Data reconciliation
Meta data management- e.g. - maintaining Meta data columns in tables (Use of Audit columns e.g.
load date/ load user/ batch id etc.)
Error reporting
ETL job Performance evaluation
Following generic coding standards
Documentation
Decomposing complex logic in multiple ETL stages - load balancing (pushdown optimization wherever applicable) etc.
Removal of unwanted ports from different transformations used in a mapping
Using Shortcuts for source, target and lookups
Using mapplet, worklet as and when required
Write some comments for every transformation
Use the DECODE function rather than nested IIF (if-then-else) expressions
Make sure that sorted data is passed into Aggregator transformations
If the target table is having indexes, loading data into such tables will decrease the performance; in
such situations, use pre SQL to drop the index before loading the data into target tables and once
the data is loaded then, re-create the index using post SQL.
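As an illustration of the pre-SQL / post-SQL point above (hypothetical index and table names; the exact syntax varies by database):

-- Pre SQL on the target
DROP INDEX idx_sales_fact_cust;

-- Post SQL on the target
CREATE INDEX idx_sales_fact_cust ON sales_fact (customer_id);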

20. Is there a scope of cloud computing in data warehousing technology?

Answer:
This is not only possible; in fact, this is the way to go for many of the providers of modern-day BI tools. There are certain advantages and benefits of using cloud computing for Business Intelligence applications, and this is a big topic of discussion today. I will quickly touch upon a few points that substantiate the need for Cloud BI, and in the future I will try to make a more comprehensive article post on this website with more details. First, if you look at the current state of BI, these are the typical characteristics:

High infrastructure requirement, leading to high upfront investment
High development cost (needs special talent) as well as high maintenance cost
Unpredictable workload (data volume) and skewed business growth pattern

All these lead to the issues of longer cycle time and limited adoption of BI solutions. Now a cloud platform, as opposed to a typical in-house software platform, is basically an alternative delivery method for the software service. When you deliver the software, platform or infrastructure (as a service) through the cloud, you can instantly start to get the following benefits:

Lower entry cost


Lower maintenance cost (pay as you use)
Faster deployment
Reduced risk
Lower TCO (total cost of ownership)
Multiple deployment model etc. etc.

Moreover, small and medium enterprises (SMEs) can easily adapt to this model given the typical constraints of a small business. Companies like Pentaho are already in with their products in the SaaS (software as a service) model of cloud computing. But cloud models like SaaS have some typical problems (e.g. no flexibility of design, security concerns etc.).


As opposed to the SaaS model, we have another cloud model called PaaS (Platform as a Service), which has the benefit of design flexibility. PaaS is very suitable for custom applications and even enterprise-level BI applications. This kind of cloud service is being offered by almost everyone in the BI market: BusinessObjects, SAS, Microsoft Azure (see http://en.wikipedia.org/wiki/SQL_Azure), Vertica, Greenplum etc.


17. Mapping

1. Scenario Implementation 1

Suppose we have a source port called ename with data type varchar(20) and the corresponding target port ename also varchar(20). The data type is now altered to varchar(50) in both the source and target database. Describe the changes required to modify the mapping.

Answer:
Reimport the source and target definitions. Next open the mapping, right-click on the source port ename and use the "Propagate Attribute" option. This option allows us to change the properties of one port across multiple transformations without manually modifying the port in each and every transformation. We can choose the direction of propagation (forward / backward / both) and can also select the attributes to propagate, e.g. data type, scale, precision etc.

2. What are mapping parameters and variables?

Answer:
A mapping parameter is a user-definable constant that takes up a value before running a session. It can be used in Source Qualifier SQL overrides, Expression transformations etc.

A mapping variable is also defined similar to the parameter except that the value of the variable is subjected
to change. It picks up the value in the following order.
From the Session parameter file
As stored in the repository object in the previous run
As defined in the initial values in the designer
Data type Default values

3. Which types of variables or parameters can be declared in a parameter file? Can $, $$ and $$$ all be declared there or not?


Answer:
There is a difference between variable and parameter.
Variable, as the name suggests, is like a variable value which can change within a session run.
Parameters are fixed and their values don't change during session run.
$ - session-level parameters and variables, which can be declared in parameter files.
$$ - mapping-level parameters and variables, which can be declared in parameter files.
$$$ - built-in Informatica system variables, which cannot be declared in parameter files; e.g. $$$SessStartTime, which stays constant throughout the mapping and cannot be changed.
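For illustration, a minimal parameter file sketch (folder, workflow, session and parameter names are hypothetical):

[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$DBConnection_SRC=Ora_Sales_Src
$InputFile_Orders=/data/in/orders.dat
$$Load_Date=2013-01-31

Here $DBConnection_SRC and $InputFile_Orders are session-level parameters and $$Load_Date is a mapping-level parameter.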


Read this article to get a detailed understanding: http://www.dwbiconcepts.com/etl/14-etlinformatica/74-stop-hardcoding-follow-parameterization-technique.html

4. What are the default values for variables?


Answer:
String = Null
Number = 0
Date = 1/1/1753

5. What does the first column of the bad file (rejected rows) indicate?

Answer:
First column - row indicator (0, 1, 2, 3)
Second column - column indicator (D, O, N, T)

6. Out of 100,000 source rows some rows get discarded at the target. How will you trace them and where do they get loaded?

Answer:

Rejected records are loaded into bad files. It has record indicator and column indicator.
Record indicator identified by (0-insert,1-update,2-delete,3-reject) and
Column indicator identified by (D-valid,O-overflow,N-null,T-truncated).
Normally data may get rejected for different reasons, such as transformation logic errors or database constraint violations.

7. What is Reject loading?


Answer:
During a session, the Informatica server creates a reject file for each target instance in the mapping. If the
writer or the target rejects data, the Informatica server writes the rejected row into reject file. The reject file
and session log contain information that helps you determine the cause of the reject. You can correct reject
files and load them to relational targets using the Informatica reject load utility. The reject loader also creates another reject file for the data that the writer or target reject during the reject loading.
Reject Loading
During a session, the server creates a reject file for each target instance in the mapping. If the writer or the target rejects data, the server writes the rejected rows into the reject file. You can correct those rejected


data and re-load them to relational targets, using the reject loading utility. (You cannot load rejected data into a flat file target) Each time, you run a session, the server appends a rejected data to the reject file.
Locating the BadFiles
$PMBadFileDir / Filename.bad


When you run a partitioned session, the server creates a separate reject file for each partition.
Reading Rejected data
Ex: 3,D,1,D,D,0,D,1094345609,D,0,0.00
To help us find the reason for the rejection, there are two main things to read: the row indicator and the column indicators.

Row indicator - the row indicator tells what the writer was asked to do with the row of data.

Row indicator  Meaning  Rejected by
0              Insert   Writer or target
1              Update   Writer or target
2              Delete   Writer or target
3              Reject   Writer

If a row indicator is 3, the writer rejected the row because an update strategy expression marked it for reject.

Column indicator - a column indicator appears after every column of data and defines the type of the data preceding it.

Column indicator  Meaning     Writer treats as
D                 Valid data  Good data. The target accepts it unless a database error occurs, such as a duplicate key.
O                 Overflow    Bad data.
N                 Null        Bad data.
T                 Truncated   Bad data.

NOTE: NULL columns appear in the reject file with consecutive commas marking their position.

Correcting Reject File


Use the reject file and the session log to determine the cause of the rejected data. Keep in mind that correcting the reject file does not necessarily correct the source of the rejection; correct the mapping and the target database to eliminate the rejections when you run the session again. Trying to correct target-rejected rows before correcting writer-rejected rows is not recommended, since they may contain misleading column indicators. For example, a series of N indicators might lead you to believe the target database does not accept NULL values, so you decide to change those NULL values to zero. However, if those rows also had a 3 in the row indicator column, the rows were rejected by the writer because of an update strategy expression, not because of a target database restriction. If you then load the corrected file to the target, the writer will again reject those rows, and they will contain inaccurate 0 values in place of NULL values.


8. Why Informatica writer thread may reject a record?


Answer:
Data overflowed column constraints
An update strategy expression marked the row for reject

9. Why can the target database reject a record?

Answer:
Data contains a NULL column where the target column does not allow NULLs
Database errors, such as key constraint violations

10. Describe various steps for loading reject file?


Answer:
After correcting the rejected data, rename the rejected file to reject_file.in
The reject loader uses the data movement mode configured for the server. It also uses the code page of the server/OS. Hence do not change these in the middle of reject loading.
Use the reject loader utility Pmrejldr pmserver.cfg [folder name] [session name]

11. Variable v1 has values set as 5 in designer (default), 10 in parameter file, and 15 in repository. While running session which value Informatica will read?
Answer:

Informatica will read the value 10 from the parameter file, because the parameter file takes the highest precedence, followed by the value saved in the repository from the previous run and then the initial value defined in the Designer.
12. What are shortcuts? Where it can be used? What are the advantages?
Answer:
There are two types of shortcuts: local and global. A local shortcut is used within a local repository and a global shortcut points to an object in the global repository. The advantage is reusing an object without creating multiple copies of it. For example, if a source definition needs to be used in 10 mappings in 10 different folders, instead of creating 10 copies of the source you create 10 shortcuts to the one definition.


13. Can we have an Informatica mapping with two pipelines, where one flow is having a
Transaction Control transformation and another not. Explain why?

Answer:
No, it is not possible. Whenever we have a Transaction Control transformation in a mapping, the session
commit type is User Defined. Whereas for a pipeline without the Transaction Control transform, the session
expects the commit type to be either Source based or Target based.
Hence we cannot have both the pipelines in a single mapping; rather we have to develop single mappings for
each of the pipelines.

14. How can we implement Reverse Pivoting using Informatica transformations?


Answer:

Pivoting can be done using a Normalizer transformation. For reverse pivoting we will need to use an Aggregator transformation, as below.

From:
Col1  Col2
A     10
B     20

To:
A    B
10   20

This can be done using one Expression transformation and one Aggregator transformation.
In the Expression transformation, create two output ports, o_col_a and o_col_b:
o_col_a = IIF (Col1 = 'A', Col2, 0)
o_col_b = IIF (Col1 = 'B', Col2, 0)
Next, in the Aggregator transformation, take MAX() of o_col_a and o_col_b and map them to the target columns A and B. (We may need to take SUM() instead of MAX() if we have multiple A or B rows.)
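For clarity, the same reverse pivot expressed as a SQL sketch (the table name is hypothetical):

SELECT MAX(CASE WHEN Col1 = 'A' THEN Col2 END) AS A,
       MAX(CASE WHEN Col1 = 'B' THEN Col2 END) AS B
FROM   src_table;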

15. Is it possible to update a Target table without any key column in target?
Answer:


Yes it is possible to update the target table either by defining keys at Informatica level in Warehouse
designer or by using Update Override.
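For illustration, a minimal Target Update Override sketch (hypothetical table and port names; the :TU. prefix refers to ports of the target definition):

UPDATE tgt_customer
SET    city     = :TU.city,
       upd_date = :TU.upd_date
WHERE  customer_code = :TU.customer_code

With this override, the customer_code column drives the update even if no key is defined on the target table.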


18. Mapplet

1. What is a Mapplet?
Answer:
A Mapplet is a reusable object created in the Mapplet Designer which contains a set of transformations and lets us reuse that transformation logic in multiple mappings. A Mapplet can contain as many transformations as we need. Like a reusable transformation, when we use a mapplet in a mapping we use an instance of the mapplet, and any change made to the mapplet in the Mapplet Designer is inherited by all instances of the mapplet.

2. What is the difference between Reusable transformation and Mapplet?


Answer:
Any Informatica Transformation created in the Transformation Developer or a non-reusable promoted to reusable transformation from the mapping designer which can be used in multiple
mappings is known as Reusable Transformation. When we add a reusable transformation to a
mapping, we actually add an instance of the transformation. Since the instance of a reusable transformation
is a pointer to that transformation, when we change the transformation in the Transformation Developer, its
instances reflect these changes.


3. What are the transformations that are not supported in Mapplet?


Answer:
Normalizer
Cobol sources
XML sources
XML Source Qualifier
Target definitions
Pre- and Post- session Stored Procedures
Other Mapplet


4. Is it possible to convert reusable transformation to a non-reusable one?


Answer:
Reusable transformations are created in the Transformation Developer.
Another way is to promote a non-reusable transformation in a Mapping/Mapplet to reusable one.


Note that converting a non-reusable transformation into a reusable transformation is not reversible.

However, we can use a reusable transformation as a non-reusable one in any mapping or mapplet by dragging the selected reusable transformation from the Repository Navigator and pressing the Ctrl key just before dropping the object in the Mapplet/Mapping Designer. The same applies when creating a non-reusable session from a reusable one in the Worklet/Workflow Designer.

5. What is the use of Mapplet & Worklet in project?


Answer:
Mapplets and worklets allow you to create reusable objects and thus make your Informatica code reusable.

A mapplet can be created in the PowerCenter Designer and reused in mappings. A worklet can be created in the Workflow Manager and reused in workflows.

Just like a procedure or function in a procedural language, we can build a mapplet or worklet to encapsulate a piece of business logic, which can then be used again and again in different mappings and workflows.

6. Is it possible to have a mapplet within a mapplet and a worklet within a worklet?

Answer:

Informatica does not support a mapplet within another mapplet, but it does support a worklet within another worklet.


19. Session

1. What is Session and Batches?

SESSION - A Session is a set of instructions that tells the Informatica Server / Integration Service, how and
when to move data from Sources to Targets. After creating the session, we can use either the server manager or the command line program pmcmd to start or stop the session.
BATCHES - A batch provides a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:


SEQUENTIAL - runs the sessions one after the other.
CONCURRENT - runs the sessions at the same time.

2. What are various session tracing levels?


Answer:

Normal - the default. Logs initialization and status information, errors encountered and skipped rows due to transformation errors; summarizes session results but not at the row level.
Terse - logs initialization information, error messages and notification of rejected data.
Verbose Initialization - in addition to normal tracing, logs additional initialization details, the names of the index and data files used and detailed transformation statistics.
Verbose Data - in addition to verbose initialization tracing, records row-level logs.

3. Can we copy a session to new folder or new repository?


Answer:
Yes we can copy session to new folder or repository, provided the corresponding Mapping is already in the
folder or repository.

4. Is it possible to store all the Informatica session log information in a database table?
Normally the session log is stored as a binary compression .bin file in SessLogs directory.
Can we store the same information in database tables for future analysis?


Answer:
It is not possible to store all the session log information in some table. Along with error related information we may get some other session related information from metadata repository tables like
REP_SESS_LOG.

To capture error data, we can configure the session as below. Go to Session -> Config Object -> Error Handling section and give the settings:

Error Log Type: Relational Database.
Error Log DB Connection: the database connection where we want to store the error tables.
Error Log Table Name Prefix: prefix for the error tables. By default, Informatica creates 4 different error tables; if we provide a prefix here the error tables will be created with that prefix in the database.
Log Row Data: this option is used to log the data at the point where the error happened.
Log Source Row Data: captures the source data for the error record.
Data Column Delimiter: error data is stored in a single column of the database table; we can specify the delimiter for the source data here.

List of error tables created by Informatica:

PMERR_DATA - stores data and metadata about a transformation row error and its corresponding source row.
PMERR_MSG - stores metadata about an error and the error message.
PMERR_SESS - stores metadata about the session.
PMERR_TRANS - stores metadata about the source and transformation ports, such as name and data type, when a transformation error occurs.

The above tables are specifically used to store information about exception (error) records, e.g. records in the reject file. We can use this as the base of an error handling strategy. But this does not contain all the information present in the session log, such as performance details (thread busy percentage) or details of the transformations invoked in the session. We can also check the contents of the REP_SESS_LOG view under the Informatica repository schema; however, that too does not contain all the information.

5. Can we call a shell script from session properties?

Answer:
The Integration Service can execute shell commands at the beginning or at the end of the session. The Workflow Manager provides the following types of shell commands for each Session task:
Pre-session command
Post-session success command
Post-session failure command
Use any valid UNIX command or shell script for UNIX nodes, or any valid DOS or batch file for Windows
nodes. Configure the session to run the pre- or post-session shell commands.


6. Can we change the Source and Target table names in Session level?
Answer:

Yes, we can change the source and target table names at the session level. Go to the session and navigate to the Mapping tab. Select the source or target to be changed: for a target, mention the new table name in the Target Table Name attribute; for a source, change the Source Table Name attribute.
Another suitable method is to parameterize the source and target table names. We can then run the same mapping concurrently using different parameter files, after enabling concurrent run mode at the workflow level. See the parameterization reference given earlier for more details.

7. How do we write the flat file column names (a header record) in the target?

Answer:

There are two options available in the session properties to take care of this requirement. Go to the Mapping tab, Target properties, and choose the Header Options attribute as either Output Field Names or Use header command output.
Option 1 (Output Field Names) creates the output file with a header record whose column names are the same as the target transformation port names.
Option 2 (Use header command output) lets us provide our own command to generate the header record text; an echo command can be used, for example:

echo '"Employee ID"|"Department ID"'

It is recommended to use the second option as it gives more flexibility in writing the column names.

8. What are the ERROR tables present in Informatica?


Answer:
PMERR_DATA- Stores data and metadata about a transformation row error and its corresponding
source row.
PMERR_MSG- Stores metadata about an error and the error message.
PMERR_SESS- Stores metadata about the session.
PMERR_TRANS- Stores metadata about the source and transformation ports, such as name and data
type, when a transformation error occurs.

9. What are the alternate ways to stop a session without using STOP ON ERRORS option
set to 1 in session properties?
Answer:
We can also use the ABORT() or ERROR() functions in an Expression transformation to stop the execution of a session based on some user-defined condition (ERROR() increments the transformation error count for the row, so it works together with the Stop on errors threshold).


10. Suppose a session fails after loading of 10,000 records in the target. How can we load the
records from 10,001 when we run the session next time?

If we configure the Session for Normal load rather than Bulk load & by using Recovery Strategy
in the Session Properties & selecting the Option Resume from last Check point, then we can
run the Session from the last Commit Interval.
In this case if we specify the Commit Interval as 10,000 & the Integration Service issues a commit after loading 10,000 records then you can load the records from 10,001.


If 9999 rows were loaded and the session fails and Integration Service did not issue any commit as the Commit Interval in this case is 10,000 then we cannot perform Recovery. In this case truncate the Target Table &
Restart the session.


11. Define the types of Commit intervals apart from user defined?
Answer:

The different commit interval types are:

Target-based commit - the Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.

Source-based commit - the Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.

12. Suppose session is configured with commit interval of 10,000 rows and source has 50,000
rows explain the commit points for source based commit & target based commit. Assume
appropriate value wherever required?
Answer:
Target-based commit (depends on when the writer buffer block fills after the commit interval is reached; here assuming the buffer first fills at 7,500 rows and then roughly every 7,500 rows): commits at approximately 15,000, 22,500, 30,000, 40,000 and 50,000 rows.
Source-based commit (does not depend on rows held in the buffer): commits at 10,000, 20,000, 30,000, 40,000 and 50,000 rows.


13. How to capture performance statistics of individual transformation in the mapping and
explain some important statistics that can be captured?
Answer:

Enable the session property "Collect performance data" to capture performance counters for each transformation; these can be viewed in the Workflow Monitor. Important statistics include input/output row counts per transformation, errorrows, and the readfromdisk/writetodisk counters for cached transformations (which indicate that a cache has overflowed to disk). The Verbose Data tracing level can additionally be used to record row-level details in the session log.

14. How can we parameterize success or failure email list?


Answer:

We can parameterize the email user list and modify the values in the parameter file, using the built-in variables $PMSuccessEmailUser and $PMFailureEmailUser.
Also, we can use the pmrep command to update the email task:


updateemailaddr
-d <folder_name>
-s <session_name>
-u <success_email_address>
-f <failure_email_address>

15. Is it possible that a session failed but still the workflow status is showing success?
Answer:


If the workflow completes successfully it will show the execution status of success irrespective of whether
any session within the workflow failed or not. The workflow success status has nothing to do with session
failure. If and only if we set the session general option in the workflow designer Fail Parent if this task fails,
then only the workflow status will display as failed on session failure.

16. What is Busy Percentage?


Answer:
Duration of time the thread was occupied compared to total run time of the mapping.
So lets say, we have one writer thread - this thread is internally responsible for writing data to the target table/ file. Now if our mapping runs for 100 seconds but the time taken by the mapping to write the data to
the target is only 20 seconds (because other time it was busy in reading/ transforming the data), then busy
percentage of the writer thread is 20%



17. Can we write a PL/SQL block in pre and post session or in target query override?
Answer:
Yes, we can. Remember always to put a backslash (\) before any semicolon (;) used inside the PL/SQL block, so that the Integration Service does not treat the semicolon as a statement terminator.

18. Whenever a session runs, does the data get overwritten in a flat file target? Is it possible to keep the existing data and add the new data to the target file?

Answer:
Normally with every session run the target file data will be overwritten, except if we select the Append if Exists option (available from 8.x onwards) for the target session property, which will append the new data to the existing data in the flat file target.
19. Can we use the same session to load a target table in different databases having same
target definition?
Answer:
Yes we can use the same session to load same target definition in different databases with the help of the
Parameterization; i.e. using different parameter files with different values for the parameterized Target Connection object $DBConnection_TGT and Owner/Schema name Table Name Prefix with
$Param_Tgt_Tablename. To run the single workflow with the session, to load two different database target
tables we can consider using Concurrent workflow Instances with different parameter files.
Even we can load two instance of the same target connected in the same pipeline. At the session level use
different relational connection object created for different Databases.

20. How do you remove the cache files after the transformation?
Answer:
After the session completes, the DTM releases the cache memory and deletes the cache files. If persistent cache or incremental aggregation is used, the cache files are saved.

21. Why doesn't a running session QUIT when Oracle or Sybase return fatal errors?
Answer:
The session will only QUIT when its threshold: "Stop on errors" is set to 1. Otherwise the session will continue to run.


22. If we have written a source override query in source qualifier in mapping level but have
modified the query in session level SQL override then how integration service behaves.

Answer:
The Informatica Integration Service treats the session-level query as final during the session run. If the two queries are different, the Integration Service will use the session-level query for execution and ignore the mapping-level query.


20. Workflow

1. What is the difference between STOP and ABORT options in Workflow?

Answer:
When we issue the STOP command on an executing session task, the Integration Service stops reading data from the source. It continues processing, writing and committing the data already read to the targets. If the Integration Service cannot finish processing and committing the data, we can then issue the ABORT command.
In contrast, the ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.
We can stop or abort tasks and worklets within a workflow from the Workflow Monitor, from a Control task in the workflow, or from a Command task using the pmcmd stop or abort commands. We can also call the ABORT() function at the mapping level.
When we stop or abort a task, the Integration Service stops processing the task and any other tasks in the path of the stopped or aborted task. The Integration Service, however, continues processing concurrent tasks in the workflow. If the Integration Service cannot stop the task, we can abort the task.
The Integration Service aborts any workflow if the Repository Service process shuts down.

2. Running an Informatica workflow continuously - how do we run a workflow continuously until a certain condition is met?

Answer:
We can schedule a workflow to run continuously. A continuous workflow starts as soon as the Integration Service initializes. If we schedule a real-time session to run as a continuous workflow, the Integration Service starts the next run of the workflow as soon as it finishes the first. When the workflow stops, it restarts immediately.

Alternatively, for a normal batch scenario we can create a conditional, continuous workflow as below.

Suppose wf_Bus contains the business session that we want to run continuously until a certain condition is met before it stops - maybe the presence of a file or a particular value of a workflow variable.
So modify the workflow as Start-Task followed by Decision Task which evaluates a condition to be TRUE or
FALSE. Based on this condition the workflow will run or stop.
Next use the Link Task to link the business session for $Decision.Condition=TRUE.
For the other part use a Command Task for $Decision.Condition=FALSE.
In the command task create a command to call a dummy workflow using pmcmd functionality. e.g.
"C:\Informatica\PowerCenter8.6.0\server\bin\pmcmd.exe" startworkflow -sv
IS_info_repo8x -d Domain_hp -u info_repo8x -p info_repo8x -f WorkFolder
wf_dummy
Next create the dummy workflow name it as wf_dummy. Place a Command Task after the Start Task.


Within the command task put the pmcmd command as


"C:\Informatica\PowerCenter8.6.0\server\bin\pmcmd.exe" startworkflow -sv
IS_info_repo8x -d Domain_sauravhp -u info_repo8x -p info_repo8x -f
WorkFolder wf_Bus

In this way we can manage to run a workflow continuously; the basic concept is to use two workflows and make them call each other.

3. How do we send emails from Informatica after the successful completion of one session? The email should contain the job name and the session start and end times in the message body.

Answer:
The first thing is to have a "mail" utility configured on the Informatica server (UNIX/Windows).
After that, we use the Informatica Email task. We can create an Email task and call it at the session level as the On Success Email. Here we can use Informatica pre-built email variables such as the mapping name (%m) and session start time (%b) in the message body.

4. Scenario Implementation 1

How do we pass a value calculated in a mapping variable to the email message? The email will be sent in HTML format with a predefined message in which one value will be populated from a mapping variable. Suppose the predefined message is:
<html> <body>
The last transaction service ID is: <informatica_variable>
</body> </html>
In the place of <informatica_variable>, the value of the mapping variable at the end of the session will go.
Answer:
We cannot use a mapping variable in Workflow or Session level. It is local to a mapping. Instead, we have to
use a Workflow variable for this purpose. But, we cannot pass the value of the Mapping Variable to the
Workflow variable directly from your mapping.
1) Write the calculated value in some Flat File using your mapping say "value.txt".
2) Create a shell script say "mail.sh" to send the 2nd mail. Read the value from the "value.txt" into a variable
in "mail.sh". Use this variable in the body of the mail.
3) Create a Cmd task in the WF level. Call this "mail.sh" in that Cmd task.
4) Place this Cmd task after your actual session in the workflow and link it on the session's success.

5. How can we send two separate emails after a successful session run?
Answer:


The problem is that we cannot call two Email tasks from one session's On Success Email setting.
So for the second email we can create another Email task following the session in the workflow, and link them with a link condition of $session_name.Status = SUCCEEDED.

6. What is Cold Start in Informatica?

Answer:
In general terms, a cold start means to start a program from the very beginning, without being able to continue the processing that was occurring previously when the system was interrupted.

With respect to Informatica, we can resume a stopped or failed real-time session. To resume a session, we
must restart or recover the session. The Integration Service can recover a session automatically if you enabled the session for automatic task recovery. When you restart a session, the Integration Service resumes
the session based on the real-time source. Depending on the real-time source, it restarts the session with or
without recovery.
We can restart a task or workflow in cold start mode. When you restart a task or workflow in cold start
mode, the Integration Service discards the recovery information and restarts the task or workflow.
For e.g. if a workflow failed in between and we don't want to recover data because we manually did all clean
up of data in the impacted target tables. If workflow recovery is enabled then we can opt for a cold start
which will skip recovery task. Cold start will remove all recover data if any stored when session failed.
When we restart a stopped or failed task or workflow that has recovery enabled in cold start mode,
the Integration Service discards the recovery information and restarts the task or workflow.
Cold Start Task, Cold Start Workflow or Cold Start Workflow from Task commands can be executed
from the Workflow Manager, Workflow Monitor, or pmcmd command line programs.
If we restart a session in cold start mode, targets may receive duplicate rows.
So avoid cold start and restart the session with recovery to prevent data duplication.
So if recovery is not enabled in a session, then there is no difference between cold start and restart.


7. Scenario Implementation 2

Email - I have a list of 10 people to email after a session failure. Can we edit the list of emails dynamically, i.e. can we add or delete email IDs without touching the mapping?
Answer:
We can parameterize the email user list and modify the values in parameter file. Use $PMSuccessEmailUser,
$PMFailureEmailUser. Also you can use pmrep command to update the email task:
updateemailaddr -d <folder_name> -s <session_name> -u <success_email_address> -f <failure_email_address>
You can create a distribution list and use that DL in the session failure cmd. What so ever emails will be listed
in the DL will receive the mail. Later on you can add/remove the emails in the DL depending upon your requirement.


8. We know there are 3 options for Session recovery strategy - Restart task, Fail task and
continue running the workflow, Resume from last checkpoint whenever a session fails.
How do we restart a workflow automatically without any manual intervention in the
event of session failure?

Select Automatically recover terminated tasks option in workflow properties. Also we can specify the maximum number of auto attempts in the workflow property Maximum automatic recovery attempts.

9. What is the difference between real-time and continuous workflows?


Answer:

A real-time workflow is a workflow triggered by a real-time source, such as an XML message, whereas a continuous workflow is any workflow which runs continuously, e.g. implemented using two workflows and command-line (pmcmd) calls to invoke each other.
11. Scenario Implementation 3


Suppose we have two workflows workflow 1 (wf1) having two sessions (s1, s2) and workflow 2 (wf2) having
three sessions (s3, s4, s5) in the same folder, like below
wf1: s1, s2
wf2: s3, s4, s5

How can we run s1 first, then s3, after that s2, next s4 and s5, without using a pmcmd command or a UNIX script?

Answer:
Use a Command task or a post-session command to create a touch file, and use an Event Wait task to wait for that file (Filewatch Name). The combination of Command tasks and Event Wait tasks solves the problem.

WF1 -----> S1 ------> CMD1 -----> EW2 ------> S2 -------> CMD3
WF2 -----> EW1 ---> S3 ---------> CMD2 -----> EW3 ----> S4 ------> S5

So run both the workflows, session s1 starts and after successful execution calls command task cmd1. cmd1
generates a touch file say s3.txt
After that the execution passes to event wait ew2. Immediately event wait ew1 will start to process session
s3 after the file s3.txt was generated. Next after success of session s3 it will pass the control to command
task cmd2 which in turn will generate a touch file say s2.txt and passes the control to event wait task ew3.
Immediately at the same time the event wait ew2 gets started after receiving the event wait file s2.txt and
passes the control to session s2. After completion of session s2 it triggers command task cmd3 which in turn
generates a wait file s4.txt and the workflow wf1 ends. On the other hand the event wait ew3 gets triggered
with wait file s4.txt in place and calls the session s4 which in turn after success triggers the last session s5
and the workflow wf2 completes.


12. How do we send a session failure mail with the workflow or session log as attachment?

Answer:
Design an Informatica Email task to send the email communication in the event of session failure, and use the email variable %g to attach the corresponding session log.
Email variables:
%g - attaches the session log.
%a<filename> - attaches any file; the absolute path needs to be given inside the angle brackets.

13. Explain deadlock in Informatica and how do we resolve it?


Answer:
At the database level, a deadlock normally occurs when two concurrent user sessions try to apply DML commands to the same rows of a table. Say, for example, the below query is executed by user1 in session1:
update emp set deptno=20 where deptno=10;


Before user1 commits the transaction, if user2 from session2 executes the same query as below, it causes a deadlock error.

update emp set deptno=30 where deptno=10;

In Informatica, a deadlock normally occurs when two sessions are updating or deleting records from the same table in parallel (parallel inserts are not a problem). One option to avoid the deadlock is to identify those sessions and make them run sequentially. Another option is to make use of the session-level properties such as the deadlock retry limit and the deadlock recovery option.


14. Scenario Implementation 4


Busy Percentage is given by (runtime-idle time) * 100 / runtime.
If a thread has 0 idle time, it has a high busy percentage. Does that mean we need to tune that thread component?
In other words, should we tune the thread whose busy percentage (BP) is higher, or the one having more idle time?
Answer:
3 persons are asked to run 1 mile each. Each one of them is allotted 20 minutes of time. First person completes 1 mile in 5 minutes and stands idle other 15 minutes of his allotted time. The 2nd person completes it
in 10 minute and sits idle the rest 10 minute. The last one takes all 20 minutes and idle for 0 minutes. Who is
the worst performer?
Isn't it the last person who had no idle time? It's the same for a thread with 0 idle time.


15. How can we pass a value from one workflow to another?

Answer:
Pass the workflow variable value to a mapping parameter/variable through the session's pre-session variable assignment. Then develop a mapping that generates a parameter file containing the desired value as a workflow variable, which can be passed to the next workflow using this parameter file.


Alternatively, develop the mapping to store the value in a flat file or database table. Then create another mapping to read that value in the next workflow, passing it from the session to the workflow level through the post-session variable assignment if required.


21. Administration

1. What is Load Manager?


Answer:

The Load Manager performs the following tasks:

Manages session and batch scheduling.


Locks the session and reads session properties.
Reads the parameter file.
Expand the server and session variables and parameters.
Verify permissions and privileges.
Validate source and target code pages.
Create the session log file.
Create the Data Transformation Manager which executes the session.

2. What is DTM process? How many threads it creates to process data, explain each thread
in brief?


Answer:
After the Load Manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage the threads that carry out the session tasks. The DTM allocates process memory for the session and divides it into buffers; this is also known as buffer memory. It creates the main thread, called the master thread, which creates and manages all other threads. If we partition a session, the DTM creates a set of threads for each partition to allow concurrent processing. When the Informatica server writes messages to the session log it includes the thread type and thread ID. Following are the types of threads that the DTM creates:

MASTER THREAD - Main thread of the DTM process. Creates and manages all other threads.
MAPPING THREAD - One Thread to Each Session. Fetches Session and Mapping Information.
Pre and Post Session Thread - One Thread Each To Perform Pre and Post Session Operations.
READER THREAD - One Thread for Each Partition for Each Source Pipeline.
WRITER THREAD - One Thread for Each Partition If Target Exist in the Source pipeline Write to the
Target.
TRANSFORMATION THREAD - One or More Transformation Thread For Each Partition.

3. Can you create a folder within designer?


Answer:


It is not possible; folders are created from the Repository Manager, not the Designer.

4. How do you take care of security using a repository manager?

Using repository privileges, folder permission and locking.


Repository privileges(Session operator, Use designer, Browse repository, Create session and batches,
Administer repository, administer server, super user)
Folder permission(owner, groups, users)
Locking(Read, Write, Execute, Fetch, Save)

5. What are the different uses of the Repository Manager?

Answer:
The Repository Manager is used to create the repository, which contains the metadata that Informatica uses to transform data from source to target. It is also used to create Informatica users and folders and to copy, back up and restore the repository.

6. What are the two modes of data movement in the Informatica Server?


Answer:
The data movement mode determines whether the Informatica Server processes single-byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.
Unicode - the IS allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters).
ASCII - the IS holds all data in a single byte.
The IS data movement mode can be changed in the Informatica Server configuration parameters. This comes into effect once you restart the Informatica Server.

7. What is Code Page used for?


Answer:
A code page contains the encoding to specify characters in a set of one or more languages. An encoding is the assignment of a number to a character in the character set. A code page is used to identify characters that might be in different languages. If you are importing Japanese data into a mapping, then you must select the Japanese code page for the source data.


8. What is Code Page Compatibility?


Answer:
Compatibility between code pages is used for accurate data movement when the Informatica Server runs in the Unicode data movement mode. If the code pages are identical, there will not be any data loss. One code page can be a subset or a superset of another. For accurate data movement, the target code page must be a superset of the source code page.
Superset - A code page is a superset of another code page when it contains all the characters encoded in the other code page, plus additional characters not contained in the other code page.
Subset - A code page is a subset of another code page when all characters in the code page are also encoded in the other code page.


9. What is the default block buffer size?


Answer: 64K

10. What is the default LM shared memory size?


Answer: 2MB

11. Define Server Concepts with respect to memory buffers

Answer:
The Informatica server uses three system resources: CPU, shared memory and buffer memory. It uses shared memory, buffer memory and cache memory for session information and to move data between session threads.
LM Shared Memory - The Load Manager uses both process and shared memory. The LM keeps the server list of sessions and batches, and the schedule queue, in process memory. Once a session starts, the LM uses shared memory to store session details for the duration of the session run or session schedule. This shared memory appears as the configurable parameter (LMSharedMemory) and the server allots 2,000,000 bytes by default. This allows you to schedule or run approximately 10 sessions at one time.
DTM Buffer Memory - The DTM process allocates buffer memory to the session based on the DTM buffer pool size setting in the session properties. By default, it allocates 12,000,000 bytes of memory to the session. The DTM divides the memory into buffer blocks as configured in the buffer block size setting (default: 64,000 bytes per block).


12. What are the two programs that communicate with the Informatica Server?

Answer:
Informatica provides the Server Manager and pmcmd programs to communicate with the Informatica Server:
Server Manager - A client application used to create and manage sessions and batches, and to monitor and stop the Informatica Server. You can use information provided through the Server Manager to troubleshoot sessions and improve session performance.


pmcmd - A command-line program that allows you to start and stop sessions and batches, stop the
Informatica Server, and verify if the Informatica Server is running.


22. Command Line Arguments

1. What are pmcmd commands?

Answer:
pmcmd is a command-line program used to communicate with the Informatica server. It does not replace the Server Manager, since there are many tasks that you can perform only with the Server Manager. Some of the operations you can perform using pmcmd are: start, stop and abort sessions and batches.

2. What are pmrep commands?


Answer:
You can use pmrep to create or delete repository users and groups. You can also use pmrep to modify repository privileges assigned to users and groups.

3. How do we start and stop a session from the pmcmd command line?

Answer:

Use the following syntax to ping the Informatica Server on a UNIX system:
pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}]
[hostname:]portno
Use the following syntax to start a session or batch on a UNIX system:

pmcmd start {user_name | %user_env_var} {password | %password_env_var}
[hostname:]portno [folder_name:]{session_name | batch_name}
[:pf=param_file] session_flag wait_flag
Use the following syntax to stop a session or batch on a UNIX system:


pmcmd stop {user_name | %user_env_var} {password | %password_env_var}
[hostname:]portno [folder_name:]{session_name | batch_name} session_flag
Use the following syntax to stop the Informatica Server on a UNIX system:
pmcmd stopserver {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno


23. Metadata Repository

1. Is there any metadata query to find the list of Informatica folder names and workflow names
which were migrated in a particular quarter?

Answer:
The below SQL will give you the list of folders, workflows and their last saved date.
SELECT W.SUBJECT_AREA FOLDER_NAME, W.WORKFLOW_NAME, W.WORKFLOW_LAST_SAVED
FROM REP_WORKFLOWS W
ORDER BY TO_DATE (W.WORKFLOW_LAST_SAVED, 'MM/DD/YYYY HH24:MI:SS') DESC
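
To restrict the result to a particular quarter, a date filter can be added. A sketch, assuming the same REP_WORKFLOWS view and date format shown above (the quarter boundaries below are only illustrative):

SELECT W.SUBJECT_AREA FOLDER_NAME, W.WORKFLOW_NAME, W.WORKFLOW_LAST_SAVED
FROM REP_WORKFLOWS W
WHERE TO_DATE (W.WORKFLOW_LAST_SAVED, 'MM/DD/YYYY HH24:MI:SS')
      BETWEEN TO_DATE ('01/01/2013', 'MM/DD/YYYY')
          AND TO_DATE ('04/01/2013', 'MM/DD/YYYY')
ORDER BY TO_DATE (W.WORKFLOW_LAST_SAVED, 'MM/DD/YYYY HH24:MI:SS') DESC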

2. How can I run Metadata Queries in Informatica PowerCenter?

Answer:
Informatica metadata is stored in a database repository. This can be the same database where we have our source / staging / target tables, or it may be a completely different database (which is generally the case). We can execute user-defined metadata queries only on this database.
We may need to ask the Informatica administrator for the database login credentials. We need to have a read-access username/password for the database. After that we can connect to the database and run the metadata queries.
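
As a quick check that the read access works, any simple query against a repository view can be run first; a minimal sketch using the REP_WORKFLOWS view referenced earlier in this chapter:

SELECT COUNT(*) FROM REP_WORKFLOWS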

3. Write a metadata query to identify the sessions having truncate option enabled
Answer:


select
task_name,
'Truncate Target Table' ATTR,
decode(attr_value,1,'Yes','No') Value
from OPB_EXTN_ATTR OEA,
REP_ALL_TASKS RAT
where
OEA.SESSION_ID=rat.TASK_ID
and attr_id=9

4. Where can I find a history / metrics of the load sessions that have occurred in
Informatica?
Answer:


The tables which house this information are OPB_LOAD_SESSION, OPB_SESSION_LOG, and OPB_SESS_TARG_LOG. OPB_LOAD_SESSION contains the single session entries, OPB_SESSION_LOG contains a historical log of all session runs that have taken place, and OPB_SESS_TARG_LOG keeps track of the errors and of the target tables which have been loaded. Keep in mind that these tables are tied together by Session_ID. If a session is deleted from OPB_LOAD_SESSION, its history is not necessarily deleted from OPB_SESSION_LOG, nor from OPB_SESS_TARG_LOG. Unfortunately this leaves unidentified session IDs in these tables. However, when you join them together, you can get the start and complete times for each session.
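
A minimal join sketch over these tables, using Session_ID as the common key as described above (the exact column lists vary by PowerCenter version, so this is only an illustration):

SELECT LS.SESSION_ID, SL.*, TL.*
FROM OPB_LOAD_SESSION LS
JOIN OPB_SESSION_LOG SL ON SL.SESSION_ID = LS.SESSION_ID
LEFT JOIN OPB_SESS_TARG_LOG TL ON TL.SESSION_ID = LS.SESSION_ID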

5. How to extract the workflow monitor record information from the Informatica metadata repository?

Answer:

SELECT DISTINCT
  FOLDER_NAME, WORKFLOW_NAME, SESSION_NAME,
  START_DATE, START_TIME, END_DATE, END_TIME, DURATION "DURATION IN DD:HH:MI:SS",
  SOURCE_ROWS, TARGET_ROWS, REJECTED_ROWS, REJECTED_STATUS, STATUS, FAILED_REASON
FROM
( SELECT
    t.SUBJECT_AREA FOLDER_NAME, t.WORKFLOW_NAME, t.SESSION_NAME,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.ACTUAL_START,'DD-MON-YYYY')) START_DATE,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.ACTUAL_START,'HH24:MI:SS AM')) START_TIME,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.SESSION_TIMESTAMP,'DD-MON-YYYY')) END_DATE,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, TO_CHAR(t.SESSION_TIMESTAMP,'HH24:MI:SS PM')) END_TIME,
    DECODE(t.RUN_STATUS_CODE, 2,NULL,
      TRUNC((((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60)/24) || ':'
      || (TRUNC(((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60) - 24*(TRUNC((((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60)/24))) || ':'
      || (TRUNC((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60) - 60*(TRUNC(((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)/60))) || ':'
      || (TRUNC(86400*(SESSION_TIMESTAMP-ACTUAL_START)) - 60*(TRUNC((86400*(SESSION_TIMESTAMP-ACTUAL_START))/60)))) DURATION,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, t.SUCCESSFUL_SOURCE_ROWS) SOURCE_ROWS,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, t.SUCCESSFUL_ROWS) TARGET_ROWS,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, t.FAILED_ROWS) REJECTED_ROWS,
    DECODE(t.RUN_STATUS_CODE, 2,NULL, CASE WHEN t.SUCCESSFUL_SOURCE_ROWS <> t.SUCCESSFUL_ROWS
                                           THEN 'VALIDATE THE MISMATCH' END) REJECTED_STATUS,
    DECODE(t.RUN_STATUS_CODE, 1,'Succeeded', 2,'Disabled', 3,'Failed', 4,'Stopped', 5,'Aborted',
           6,'Running', 7,'Suspending', 8,'Suspended', 9,'Stopping', 10,'Aborting', 11,'Waiting',
           15,'Terminated') AS STATUS,
    REPLACE(REPLACE(t.FIRST_ERROR_MSG,CHR(10),' '),'No errors encountered.','') AS FAILED_REASON,
    RANK() OVER (PARTITION BY session_name ORDER BY t.SESSION_TIMESTAMP DESC) rnk
  FROM REP_SESS_LOG t
  WHERE t.SUBJECT_AREA = '<<informatica_folder_name>>'
) sess_run
WHERE sess_run.rnk = 1
ORDER BY START_DATE, START_TIME

Don't forget to put the Informatica folder name in the SUBJECT_AREA filter above. Also, we might need to make some other small adjustments to better suit your purpose / Informatica version.


24. Repository Manager

1. Describe the steps for export and import?


Answer:
Open the folder which contains the mapping.
Check Out the mapping to be exported.
Click Repository-->Export Objects and save it in your local drive.
Open the folder in which you want to export the mapping.
Click Repository-->Import Objects and select mapping xml file and Click import.
Once the mapping is imported to the new folder just save it and Check In.

2. What are the various methods of code migration? Which is the best way of deployment?

Answer:
In the PowerCenter Client, we can export repository objects to an XML file and then import repository objects from the XML file. Use the following client applications to export and import repository objects:
Repository Manager: You can export and import both Designer and Workflow Manager objects.
Designer: You can export and import Designer objects.
Workflow Manager: You can export and import Workflow Manager objects.
pmrep: You can export and import both Designer and Workflow Manager objects. You might use pmrep to automate exporting objects on a daily or weekly basis.

The best way is, arguably, the XML export and import, as it is very easy. But again it all depends upon the requirement; if we want to migrate some workflows with dependent objects in one shot, then the suggested way is XML export and import.

If we need to migrate only some small objects (say some Designer or Workflow Manager objects) then we can go for copying through the Repository Manager, or through the Designer (for Designer objects) or the Workflow Manager (for Workflow Manager objects) itself. But for this we have to be connected to both repositories while copying.
Sometimes we may need to migrate an entire project and want to have a complete log of the deployment; then we can go for creating a Deployment Group using the Deployment Wizard.
We might use pmrep to automate exporting objects on a daily or weekly basis. To use this command, we must create a control file with all the specifications that the Copy Wizard requires. The control file is an XML file defined by the depcntl.dtd file. A deployment control file is an XML file that you use with the DeployFolder and DeployDeploymentGroup pmrep commands to deploy a folder or deployment group.
We can create a deployment control file manually to provide parameters for deployment, or we can create it with the Copy Wizard. If we create the deployment control file manually, it must conform to the depcntl.dtd file that is installed with the PowerCenter Client. We include the location of the depcntl.dtd file in the deployment control file.
One good thing is that we can roll back a deployment to purge the deployed versions from the target repository or folder. When we roll back a deployment, we roll back all the objects in a deployment group that were deployed at a specific date and time. We cannot roll back part of a deployment.

3. What are the various options for ETL code migration?

Answer:
There are a couple of options available for code migration. If you have a versioned repository, as the first step check in all the workflows and dependent objects. Then we have a couple of different ways to achieve the migration.
Option 1. Export the workflow from Repository Manager using the Export Object option to export it as XML, and then import it into QA using the Repository Manager Import Object option.
Option 2. If your Dev and QA are in the same repository, you can just use the drag and drop option. For this, open both the Dev and QA folders in Repository Manager and just drag the objects from Dev to QA.
Option 3. You can create a Deployment Group using Repository Manager, attach all the workflows you need to migrate to the deployment group, and then migrate this deployment group.
Option 4. You also have the option to migrate the entire folder.

When can we use these options:
Option 1. We can use this option when the number of workflows to migrate is small. If you do not have an Informatica versioned repository, these exported XMLs can also be used to keep your versions.
Option 2. When you have a small number of workflows to migrate you can use this option.
Option 3. When a large number of objects are migrated together. It keeps the list of objects migrated as a group, and in case a rollback is required it is easy with this approach.
Option 4. Mostly used when you migrate a project for the first time to QA with a large number of workflows.

4. What is labeling in Informatica?

Answer:
We can see the label concept in many places, for example in our mailbox, where we sometimes group mails at different levels, such as marking some mails as personal.
In Informatica, a Label is a global object that you can associate with any versioned object or group of versioned objects in a repository. You may want to apply labels to versioned objects to achieve the following results:
- Track versioned objects during development.
- Improve query results.
- Associate groups of objects for deployment.
- Associate groups of objects for import and export.

For example, you might apply a label to sources, targets, mappings, and sessions associated with a workflow so that you can deploy the workflow to another repository without breaking any dependency.
You can apply the label to multiple versions of an object, or you can specify that the label applies to only one version of the object. You can create and modify labels in the Label Browser. From the Repository Manager, click Versioning > Labels to browse for a label.

Informatica version control is nothing but a team-based development methodology where we create copies of the actual objects to track modifications using the check-in and check-out options.

5. Suppose we have Informatica version control in place. Can we revert an object back to the state of two versions earlier?

Answer:
From the Version History of the object, open the required version of the object in the Workspace.
Next export the XML metadata of the object.
Next check out the object.
Then import the metadata exported earlier.
Save and check in the object.

6. What do we mean by Team based development in Informatica?


Answer:
Team based development is nothing but version control for the metadata objects.
If we have the team-based development option, we can enable version control for the repository. A versioned repository stores multiple versions of an object. Each version is a separate object with unique properties. A PowerCenter version control feature allows us to efficiently develop, test, and deploy metadata into
production.


During development, we can perform the following change management tasks to create and manage multiple versions of objects in the repository:
Check out and check in versioned objects.
Compare objects.


Track changes to an object.


Delete or purge a version.


Use global objects such as queries, deployment groups, and labels to group versioned objects.


25. Scenario Questions

1. Suppose we have ten source flat files of same structure. How can we load all the files in
target database in a single batch run using a single mapping?

Answer:
After we create a mapping to load data into the target database from the source flat file definition, next we move on to the session property of the Source Qualifier.
To load a set of source files we need to create a file, say final.txt, containing the source flat file names (ten files in our case) and set the Source filetype option to Indirect. Next point to this flat file final.txt, fully qualified with the Source file directory and Source filename.

2. Suppose we have two Source Qualifier transformations SQ1 and SQ2 connected to Target
tables TGT1 and TGT2 respectively. How do you ensure TGT2 is loaded after TGT1?

Answer:
If we have multiple Source Qualifier transformations connected to multiple targets, we can designate the order in which the Integration Service loads data into the targets.
In the Mapping Designer, we need to configure the Target Load Plan based on the Source Qualifier transformations in a mapping to specify the required loading order. The Target Load Plan defines the order in which the Informatica server loads the data into the targets. This is to avoid integrity constraint violations.


3. Suppose we have a Source Qualifier transformation that populates two target tables. How
do you ensure TGT2 is loaded after TGT1?

Answer:
In the Workflow Manager, we can configure Constraint based load ordering for a session. The Integration Service orders the target load on a row-by-row basis. For every row generated by an active source, the Integration Service loads the corresponding transformed row first to the primary key table, then to the foreign key table.
Hence if we have one Source Qualifier transformation that provides data for multiple target tables having primary and foreign key relationships, we will go for Constraint based load ordering.

4. Suppose we have the EMP table as our source. In the target we want to view those employees whose salary is greater than or equal to the average salary for their department. Describe your mapping approach.
Answer:


To start with the mapping we need the following transformations:
After the Source Qualifier of the EMP table, place a Sorter transformation. Sort based on the DEPTNO port.
Next we place a sorted Aggregator transformation. Here we will find out the AVERAGE SALARY for each (GROUP BY) DEPTNO.
When we perform this aggregation, we lose the data for individual employees.
To maintain the employee data, we must pass a branch of the pipeline to the Aggregator transformation and pass a branch with the same sorted source data to the Joiner transformation to maintain the original data. When we join both branches of the pipeline, we join the aggregated data with the original data.
So next we need a sorted Joiner transformation to join the sorted aggregated data with the original data, based on DEPTNO. Here we will be taking the aggregated pipeline as the Master and the original dataflow as the Detail pipeline.
After that we need a Filter transformation to filter out the employees having salary less than the average salary for their department.
Filter Condition: SAL >= AVG_SAL


Finally we place the Target table instance.
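
For comparison only (not part of the mapping), the expected target rows can be expressed with a correlated SQL query against the EMP table:

SELECT E.*
FROM EMP E
WHERE E.SAL >= (SELECT AVG(A.SAL) FROM EMP A WHERE A.DEPTNO = E.DEPTNO)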

5. How can we perform changed data capture based on a load sequence number (integer) column present in the source table?

Answer:
Create a mapping variable of integer data type with Aggregation type as MAX. Set the value of this mapping variable in any of these transformations: Expression, Filter, Router or Update Strategy.
Use the SETMAXVARIABLE( $$Variable, load_seq_column ) function. This function will assign the MAX sequence number of that particular load to the variable $$Variable.
This function executes only if a row is marked as insert. SETMAXVARIABLE ignores all other row types and the current value remains unchanged. The function sets the current value of a mapping variable to the higher of two values - the current value of the variable or the value from the source column - for each record. At the end of a successful session, the Integration Service saves the final current value to the repository.
When used with a session that contains multiple partitions, the Integration Service generates different current values for each partition. At the end of the session, it saves the highest current value across all partitions to the repository. Unless overridden, it uses the saved value as the initial value of the variable for the next session run.
Now, since the max sequence number for the previous load is captured in this mapping variable and saved in the repository, we can use this variable as a filter in the Source Qualifier query. The next time we run the workflow, it will only extract those records having a load sequence number greater than this sequence number.
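
A sketch of the resulting Source Qualifier source filter, assuming a hypothetical source column LOAD_SEQ_NO and a mapping variable named $$MAX_LOAD_SEQ:

LOAD_SEQ_NO > $$MAX_LOAD_SEQ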


6. Scenario Implementation 1
In my mapping I have 3 tables that we are joining.
In the source query we want to filter the data based on a value that is stored in one of our target tables. Is there a way of pulling that one particular value from that target table and using it in the filter in the Source Qualifier? Basically the value is a load sequence number that gets incremented with each session run. So when the session runs again we only pull records that are greater than that load sequence number.

Answer:
There are different options to solve the problem.

Option 1: Assumption - Source and target tables cannot be accessed using a single DB connection and the "load sequence number" is modified by the current process.
In this case you can use a mapping variable in the mapping and set the value of the mapping variable to the highest/current value using the SETMAXVARIABLE function. This value will be stored in the Informatica repository and the same value can be used in the Source Qualifier filter for the next session run. If the workflow fails, the value of the mapping variable will not get incremented.
Steps:
Define a mapping variable with Aggregation type as MAX.
Use the SETMAXVARIABLE($$Variable, "current load sequence number") function to store the value into the repository.
Use the variable $$Variable in the Source Qualifier filter.
We can provide a default value for the variable and change the value during code migration to set the starting value.

Option 2: Assumption - Source and target tables cannot be accessed using a single DB connection and the "load sequence number" is modified by a different process.
In this case you can create a mapping parameter and pass the value as a parameter.
Steps:
Create a workflow to get the latest "load sequence number" and create a parameter file.
This workflow will write a flat file which will contain the parameter value, e.g.
[wf_DAILY_INCR_LOAD]
$$Variable=100
In the actual mapping, define a mapping parameter $$Variable and use $$Variable in the Source Qualifier.
Each time, you need to run the workflow which creates the parameter file before your actual workflow is run.

Option 3: Assumption - Source and target tables can be accessed using a single DB connection.
If both your source and target tables are accessible using a single DB connection, we can write the filter to get the latest data in the Source Qualifier itself, joining all the tables.
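
A sketch of such a Source Qualifier SQL override for Option 3, with hypothetical table and column names (SRC_MAIN, SRC_LKP, TGT_CONTROL, LOAD_SEQ_NO):

SELECT M.*, L.*
FROM SRC_MAIN M
JOIN SRC_LKP L ON L.KEY_COL = M.KEY_COL
WHERE M.LOAD_SEQ_NO > (SELECT MAX(T.LOAD_SEQ_NO) FROM TGT_CONTROL T)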


7. How can we load x records (a user-defined number of records) out of N records from the source
dynamically, without using Filter and Sequence Generator transformations?

Answer:
Take a mapping parameter, say $$CNT, to pass the number of records we want to load dynamically, by changing it in the parameter file each time before the session run.
Next, after the Source Qualifier use an Expression transformation and create one output port, say CNTR, with the value CUME(1).
Next use an Update Strategy with the condition IIF ($$CNT >= CNTR, DD_INSERT, DD_REJECT).

8. Suppose we have n number of rows in the Source and we have two target tables. How
can we load n/2, i.e. the first half of the source data, into one target and the remaining half into
the other target?

Answer:

Below are the implementation steps in Informatica:

First place the Source table and its corresponding Source Qualifier in the mapping.
Next split the data into two flows; one going to an Expression transformation with all the ports, and the other flow with any one column to an Aggregator transformation.
In the Aggregator add a numeric output port, say CNT, with the expression COUNT(1) and do not group by on any input port.
Propagate this output column CNT to an Expression transformation. Next in this Expression transformation create another numeric output port, JN, with the expression value 1.
Now let us go back to the first Expression transformation having all the source columns. Introduce a Sequence Generator transformation with the RESET attribute property enabled and propagate the NEXTVAL port to the Expression transformation. Next also add one more numeric output port, JN, with the expression value 1.
Now take a Joiner transformation and check the property Sorted Input.
Now bring in all the columns from the Expression transformation next to the Source Qualifier. The other flow to the Joiner is from the Expression with the two columns CNT and JN. The join condition is based on the JN ports.
Next, after the Joiner place a Router transformation. Create one group, say FST, with the condition NEXTVAL < (CNT/2).
Next introduce the two target tables. Propagate the columns of the FST group of the Router to the first target, and propagate the columns of the Default group of the Router transformation to the second target.

Alternatively:
Use an Expression transformation with an output port ROWNUM with the expression CUME(1).
Next use a Router with 2 groups having the below conditions:
MOD( ROWNUM, 2 ) = 0
MOD( ROWNUM, 2 ) = 1
Connect the two groups to the corresponding target instances.


9. Suppose we have a flat file which has a header record with file creation date, and detailed data records. Describe the approach to load the 'file creation date' column along
with each and every detailed record.

Answer:
We can use the below shell command as a pre-session command to write the header information into another flat file:
head -1 Source_File.dat > header.txt
Next use this flat file header.txt as a Lookup in the mapping.
Create an output port in an Expression transformation with the value 'H', or whatever tag in the source data file identifies the header record.
Use this as the Lookup condition and get the file creation date as the return field, and populate it in your target table.

10. Scenario Implementation 2

Suppose we have the below two tables. What will be the output if we select Table 1 as Source and use Joiner and Lookup transformations on Table 2 based on column ID?

Table 1       Table 2
ID            ID  Name
10            10  A
              10  B
              10  C

Answer:
When we use a Joiner transformation as an Inner Join on column ID, we will get 3 rows as output.
When we use a Passive Lookup transformation we will get 1 row as output. In this case of multiple lookup matches, the lookup will return either the first or the last value, as configured in the "on multiple matches" property of the transformation.
When we use an Active Lookup transformation we will get 3 rows as output, as an active lookup returns all the matching values on multiple lookup matches.

11. Suppose we have a flat file which contains just a numeric value. We need to populate this value in one column of the target table for every source record. How can we achieve this?

Answer:
Use an Expression transformation and create a decimal output port, say DUMMY, with a very high number, along with the other I/O ports from the source table.
Say, DUMMY = 99999999999 [Note - use a number value that can never appear in the lookup flat file.]
Now use a Lookup transformation based on the source flat file. Say, the column name in the lookup is VALUE.
Map DUMMY from the Expression to the Lookup and use the lookup condition
DUMMY != VALUE
Next use the VALUE column of the Lookup to populate the target column.

12. How will you load a source flat file into a staging table when the file name is not fixed? The file name is like sales_2013_02_22.txt, i.e. the date is appended at the end as a part of the file name.

Answer:
The generic file name is like sales_YYYY_MM_DD.txt.
One option is to rename the file in a pre-session task. We will use an OS level command to rename the file to a fixed name. We will next set the Informatica source filename to this fixed name and load the file.
E.g. in Unix:
$> mv sales_*.txt sales.txt
Another option is to use Indirect Loading with a fixed file name. The content of the file will contain the actual filename to be processed.
E.g. in Unix:
$> ls sales_*.txt > sales.txt

13. Solve the below scenario using Informatica and Database SQL.

Source
PRODUCT_ID  PRODUCT_NAME  PRODUCT_PRICE
10          Lux           100
10          Dove          200
20          Cinthol       400
20          Dettol        500
30          Fiama         600

Target
PRODUCT_ID  PRODUCT_NAME  PRODUCT_PRICE  SUM_PRODUCT_PRICE
10          Lux           100            300
10          Dove          200            300
20          Cinthol       400            900
20          Dettol        500            900
30          Fiama         600            600

Answer:
Using Informatica:
In one pipeline, calculate SUM (PRODUCT_PRICE) GROUP BY PRODUCT_ID using an Aggregator transformation.
In the other flow bring all the data normally, then join the first flow with the second using an Informatica Joiner transformation, using join column PRODUCT_ID and join type inner join.

Using SQL:

SELECT M.*, N.SUM_PRODUCT_PRICE
FROM SOURCE M,
     (SELECT SUM (PRODUCT_PRICE) SUM_PRODUCT_PRICE, PRODUCT_ID
      FROM SOURCE
      GROUP BY PRODUCT_ID) N
WHERE M.PRODUCT_ID = N.PRODUCT_ID
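
On databases that support analytic functions, the same output can also be produced without a self-join; an alternative sketch:

SELECT PRODUCT_ID, PRODUCT_NAME, PRODUCT_PRICE,
       SUM (PRODUCT_PRICE) OVER (PARTITION BY PRODUCT_ID) SUM_PRODUCT_PRICE
FROM SOURCE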

14. Suppose we have a column in the source with values as below:

EMPNO  ENAME   SAL
1      Tom     100
2      Jack    200
3      Peter   150
4      Donald  230
999    TEST    999
6      Eric    300

If we encounter EMPNO = 999, then the whole record set should not be loaded into the target table. Describe the approach.

Answer:
From the Source create two flows:
1: Source -> Expression -> Sorter
2: Source -> Filter -> Expression -> Sorter
1.1 In the Expression create an output field dummy_M as 'X'.
1.2 Sort on dummy.
2.1 In the Filter set the filter condition as EMPNO = 999.
2.2 In the Expression create an output field dummy_D as 'X'.
2.3 Sort on dummy.

3. Next use a Joiner transformation:
Set the first flow as Master and the second flow as Detail.
Set the join condition as dummy_M = dummy_D.
Set the join type as Detail Outer Join.
Use Sorted Input.

4. Next use a Filter transformation:
Set the filter condition as dummy_D IS NULL.
And finally your Target.
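
For comparison only, the same rule written as a database-side check, assuming the source table is called EMP: the query returns rows only when no EMPNO = 999 record exists.

SELECT *
FROM EMP
WHERE NOT EXISTS (SELECT 1 FROM EMP WHERE EMPNO = 999)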


15. Can we pass the value of a mapping variable between 2 pipelines under the same mapping? If not how can we achieve this?

The alternative method to solve this scenario is as below:


1. Split the pipelines into two different mappings say map1 and map2.
2. Create a mapping variable say var1 in map1 and set the value of the variable using SETVARIABLE ()
function. Next our goal is to pass the value of var1 at the end of the successful session run to map2.
3. Create a mapping variable say var2 in map2 and use this in the mapping where ever the value of the
variable from the first mapping var1 is required.
4. Create the workflow with a workflow variable say "wfvar".
5. Create two Non-Reusable sessions say ses1,ses2 for map1, map2 respectively.
6. In the Post-session success variable assignment of ses1 assign the value of mapping variable var1 to
workflow variable wfvar.
7. In the Pre-session variable assignment of ses2 assign the value of workflow variable wfvar to the mapping variable var2.

DWBIConcepts

We cannot pass the value of an Informatica variable between 2 pipelines in a same mapping. Mapping variables are values that can change between sessions. The Integration Service saves the latest value of a mapping variable to the repository only at the end of each successful session run. Now in case we have two pipelines under same mapping- The mapping will have a single session and the value of the mapping variable will
be saved to the repository only when this session succeeds, that means when both the pipeline execution
completes.

DWBIConcepts

Answer:

16. Scenario Implementation 3


Suppose we have a huge (size in GB) flat file as source. The flat file contains 22 columns, out of which 4 columns are considered as key columns: CUST_SRC_ID, PRODUCT_ID, FF_ID, SNM_ID.
There is one more column in the flat file relevant to the discussion, DATE_ID, which stores the date in YYYY-MM-DD format.


Now the requirement is to choose all the unique records from the flat file based on the uniqueness of the above mentioned keys. If there is any duplicate record, then we must select the record for which the DATE_ID column contains the latest value. So suppose we get the following records in the flat file:
CUST_SRC_ID  PRODUCT_ID  FF_ID  SNM_ID  DATE_ID     OTHER COLUMNS
123          P1          F1     S1      2013-01-02  X, Y, Z
123          P1          F1     S1      2013-01-06  P, Q, R
123          P1          F1     S1      2013-01-02  S, T, U

The flat file contains duplicate records based on the above 4 columns (that is, the records are not entirely duplicated; some values may be different in some other columns).

In the above case we want the following row in the target:

CUST_SRC_ID  PRODUCT_ID  FF_ID  SNM_ID  DATE_ID     OTHER COLUMNS
123          P1          F1     S1      2013-01-06  P, Q, R


How can we achieve this in a single mapping?


Answer:
Use a Sorter transformation after the Source Qualifier. The sorting keys will be in the below order:
CUST_SRC_ID - Ascending
PRODUCT_ID - Ascending
FF_ID - Ascending
SNM_ID - Ascending
DATE_ID - Descending
Next use an Expression transformation and create 3 variable ports and an output port in the below order:
V_Keys = CUST_SRC_ID || PRODUCT_ID || FF_ID || SNM_ID
V_FLAG = IIF (V_Keys != V_Keys_PREV, 1, 0)
V_Keys_PREV = V_Keys
O_FLAG = V_FLAG (output port)
Now use a Filter transformation with the filter condition:
O_FLAG = 1
After sorting the data, for every group based on the unique keys the first record will have the latest date, because we have sorted on DATE_ID descending. Using this expression logic, for every group the 1st record (with the latest date) will have an O_FLAG value of 1 and the rest 0. We filter out those unwanted duplicate records using the Filter transformation.
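
For readers more comfortable with SQL, the equivalent de-duplication logic can be sketched with an analytic function (illustrative only, since the actual source here is a flat file; SRC_STG is a hypothetical staging table):

SELECT *
FROM (SELECT S.*,
             ROW_NUMBER() OVER (PARTITION BY CUST_SRC_ID, PRODUCT_ID, FF_ID, SNM_ID
                                ORDER BY DATE_ID DESC) RN
      FROM SRC_STG S)
WHERE RN = 1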

17. Scenario Implementation 4


I have a flat file with just one column as given below:
C1
L1
C2
L2
C3
L3
where data starting with C denotes the company name and data starting with L depicts the location of the company.
We have to load this data into the target table (using Informatica) as:
C1, L1
C2, L2
C3, L3

Answer:
This is what I would do to achieve this requirement:
1. After the SQ, in an Expression transformation generate (this is tricky; use variable port logic):
a unique sequence number for each group,
a unique number for each record within the group,
and a duplicate of the column.
After the Expression the output will be as below:
Col1, Col2, Col3, Col4
1,1,C1,C1
1,2,L1,L1
2,1,C2,C2
2,2,L2,L2
3,1,C3,C3
3,2,L3,L3
2. Add an Aggregator with:
group by on the first column,
Agg expression MAX(Col3, Col2 = 1),
Agg expression MAX(Col3, Col2 = 2).


18. Implement a slowly changing dimension of Type 2 which will load the current record into the Current table and the old data into a Log table.
Answer:
Use a Joiner transformation to join the Source and the Current table with a Full Outer Join.
Next use an Expression transformation to mark the rows which are new or old, and correspondingly assign values like 0 or 1 in a new output port.
Pass all the columns to a Router transformation and route based on the new port created.
If 0, use an Update Strategy transformation with DD_INSERT to insert into the Current table.
If 1, use an Update Strategy transformation with DD_UPDATE to update the Current table.
Also, for the rows flagged 1, populate the old data from the Current table into the Log table.
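
A rough SQL analogue of the change-detection step, with hypothetical names SRC, CURRENT_TBL and KEY_COL (the actual implementation uses the Joiner, Expression and Router as described above):

SELECT CASE
         WHEN C.KEY_COL IS NULL THEN 'NEW'          -- only in source: insert into Current
         WHEN S.KEY_COL IS NULL THEN 'CURRENT ONLY' -- only in Current: no action
         ELSE 'CHANGED'                             -- in both: update Current, copy old row to Log
       END AS ROW_FLAG,
       S.*, C.*
FROM SRC S
FULL OUTER JOIN CURRENT_TBL C ON C.KEY_COL = S.KEY_COL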


26. Performance Tuning

1. Which one is faster: Connected or Unconnected Lookup?

Answer:
There can be some very specific situations where an unconnected lookup may add some performance benefit to the total execution.
If you are calling the unconnected lookup based on some condition (e.g. calling it from an Expression transformation only when some specific condition is met, as opposed to a connected lookup which will be called anyway) then you might save some calls to the unconnected lookup, thereby marginally improving the performance.
The improvement will be more apparent if your data volume is really huge. Keep the Pre-build Lookup Cache option set to Always disallowed for the lookup, so that you can ensure that the lookup is not even cached if it is not being called, although this technique has other disadvantages; check
http://www.dwbiconcepts.com/etl/14-etl-informatica/46-tuning-informatica-lookup.html , especially the points under the following subheadings:
- Effect of choosing connected OR Unconnected Lookup, and
- WHEN TO set Pre-build Lookup Cache OPTION (AND WHEN NOT TO)

2. How can we improve the performance of the Informatica Normalizer transformation?


Answer:
As such there is no way to improve the performance of any session by using the Normalizer. The Normalizer is a transformation used to pivot or normalize datasets and has nothing to do with performance. In fact, the Normalizer does not impact performance much (apart from taking a little more memory).

3. How to improve the Session performance?


Answer:
Run concurrent sessions.
Partition sessions (PowerCenter).
Tune parameters - DTM buffer pool size, buffer block size, index cache size, data cache size, commit interval, tracing level (Normal, Terse, Verbose Initialization, Verbose Data).
The session has memory to hold 83 sources and targets; if there are more, the DTM buffer size can be increased.
The Informatica server uses the index and data caches for the Aggregator, Rank, Lookup and Joiner transformations. The server stores the transformed data from the above transformations in the data cache before returning it to the data flow, and stores group information for those transformations in the index cache. If the allocated data or index cache is not large enough to store the data, the server stores the data in a temporary disk file as it processes the session data. Each time the server pages to disk, performance slows. This can be seen from the counters. Since the data cache is generally larger than the index cache, it has to be allocated more memory than the index cache.
Remove the staging area.


Turn off session recovery.


Reduce error tracing

4. How do you identify the bottlenecks in Mappings?

Answer:
Bottlenecks can occur in:

Targets - The most common performance bottleneck occurs when the Informatica server writes to a target database. You can identify a target bottleneck by configuring the session to write to a flat file target. If the session performance increases significantly when you write to a flat file, you have a target bottleneck.
Solution:
Drop or disable indexes or constraints.
Perform bulk load (ignores the database log).
Increase the commit interval (recovery is compromised).
Tune the database for RBS, dynamic extension etc.

Sources - Set a Filter transformation after each SQ and see that the records are not passed through. If the time taken is the same then there is a problem. You can also identify the source problem with a Read Test Session - copy the mapping with only the sources and Source Qualifiers, remove all transformations and connect to a file target. If the performance is the same then there is a source bottleneck.
Using a database query - Copy the read query directly from the log. Execute the query against the source database with a query tool. If the time it takes to execute the query and the time to fetch the first row are significantly different, then the query can be modified using optimizer hints.
Solution:
Optimize queries using hints.
Use indexes wherever possible.

Mapping - If both source and target are OK then the problem could be in the mapping. Add a Filter transformation before the target and if the time is the same then there is a problem. (OR) Look at the performance monitor in the session's property sheet and view the counters.
Solutions:
High error rows and rows in the lookup cache indicate a mapping bottleneck.
Optimize Single Pass Reading.
Optimize the Lookup transformation:
o Caching the lookup table: When caching is enabled the Informatica server caches the lookup table and queries the cache during the session. When this option is not enabled the server queries the lookup table on a row-by-row basis. Static, Dynamic, Shared, Un-shared and Persistent cache.
o Optimizing the lookup condition: Whenever multiple conditions are placed, the condition with the equality sign should take precedence.
o Indexing the lookup table: The cached lookup table should be indexed on the ORDER BY columns; the session log contains the ORDER BY statement. For an un-cached lookup, since the server issues a SELECT statement for each row passing into the Lookup transformation, it is better to index the lookup table on the columns in the condition.
Optimize the Filter transformation: You can improve the efficiency by filtering early in the data flow. Instead of using a Filter transformation halfway through the mapping to remove a sizable amount of data, use a Source Qualifier filter to remove those same rows at the source. If it is not possible to move the filter into the SQ, move the Filter transformation as close to the Source Qualifier as possible to remove unnecessary data early in the data flow.
Optimize the Aggregator transformation:
o Group by simpler columns, preferably numeric columns.
o Use sorted input. Sorted input decreases the use of aggregate caches. The server assumes all input data are sorted, and as it reads it performs aggregate calculations.
o Use incremental aggregation in the session property sheet.
Optimize the Sequence Generator transformation:
o Try creating a reusable Sequence Generator transformation and use it in multiple mappings.
o The Number of Cached Values property determines the number of values the Informatica server caches at one time.
Optimize the Expression transformation:
o Factor out common logic.
o Minimize aggregate function calls.
o Replace common sub-expressions with local variables.
o Use operators instead of functions.

Sessions - If you do not have a source, target, or mapping bottleneck, you may have a session bottleneck. You can identify a session bottleneck by using the performance details. The Informatica server creates performance details when you enable Collect Performance Data on the General tab of the session properties. Performance details display information about each Source Qualifier, target definition, and individual transformation. All transformations have some basic counters that indicate the number of input rows, output rows, and error rows. Any value other than zero in the readfromdisk and writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low BufferInput_efficiency and BufferOutput_efficiency counters also indicate a session bottleneck. Small cache size, low buffer memory, and small commit intervals can cause session bottlenecks.

System (Networks)

5. How do you handle performance issues in Informatica? Where can you monitor the performance?

Answer:
There are several aspects to performance handling. Some of them are:
Source tuning
Target tuning
Repository tuning
Session performance tuning
Incremental change identification on the source side
Software, hardware (use multiple servers) and network tuning
Bulk loading
Use the appropriate transformation.

To monitor this:
Set the performance detail criteria.
Enable performance monitoring.
Monitor the session at runtime and/or check the performance monitor file.

6. What are performance counters?

Answer:
The performance details provide counters that help you understand the session and mapping efficiency. Each Source Qualifier, target definition, and individual transformation appears in the performance details, along with counters that display performance information about each transformation.
Understanding Performance Counters:
All transformations have some basic counters that indicate the number of input rows, output rows, and error rows. Source Qualifiers, Normalizers, and targets have additional counters that indicate the efficiency of data moving into and out of buffers. You can use these counters to locate performance bottlenecks. Some transformations have counters specific to their functionality. For example, each Lookup transformation has a counter that indicates the number of rows stored in the lookup cache. When you read performance details, the first column displays the transformation name as it appears in the mapping, the second column contains the counter name, and the third column holds the resulting number or efficiency percentage. When you partition a source, the Informatica Server generates one set of counters for each partition. The following performance counters illustrate two partitions for an Expression transformation:

Transformation   Counter                  Value
EXPTRANS [1]     Expression_input rows    8
                 Expression_output rows   8
EXPTRANS [2]     Expression_input rows    16
                 Expression_output rows   16

Note: When you partition a session, the number of aggregate or rank input rows may be different from the number of output rows from the previous transformation.

7. How can we increase Session Performance?


Answer:
Minimum log (Terse).
Partitioning source data.
Performing ETL for each partition, in parallel (for this, multiple CPUs are needed).
Adding indexes.
Changing the commit level.
Using a Filter transformation to remove unwanted data movement.
Increasing buffer memory, when there is a large volume of data.
Multiple lookups can reduce the performance. Verify the largest lookup table and tune the expressions.
At session level, the causes are small cache size, low buffer memory and small commit interval.
At system level:
WIN NT/2000 - use the Task Manager.
UNIX - vmstat, iostat.

Hierarchy of optimization:
Target
Source
Mapping
Session
System

Optimizing Target Databases:
Drop indexes/constraints.
Increase checkpoint intervals.
Use bulk loading/external loading.
Turn off recovery.
Increase database network packet size.

Source level:
Optimize the query (for example, the use of GROUP BY).
Use conditional filters.
Connect to the RDBMS using the IPC protocol.

Mapping:
Optimize data type conversions.
Eliminate transformation errors.
Optimize transformations/expressions.

Session:
Run concurrent batches.
Partition sessions.
Reduce error tracing.
Tune session parameters.

System:
Improve network speed.
Use multiple PM servers on separate systems.
Reduce paging.

8. Scenario Implementation 1

What would be the best approach to update a huge table (more than 200 million records) using Informatica? The table does not contain any primary key. However there are a few indexes defined on it. The target table is partitioned. On the other hand the source table contains only a few records (less than a thousand) that will go to the target and update the same. Is there any better approach than just doing it by an Update Strategy transformation?

Answer:
Since the target busy percentage is 99.99% it is very clear that the bottleneck is on the target, so we need to tweak the target. There are a couple of options:
1. Since the target table is partitioned on time_id, you need to include it in the WHERE clause of the SQL fired by Informatica. For that you can define the time_id column as a primary key in the target definition. With this, your update query will have the time_id in the WHERE clause.
2. With the Informatica update strategy, an update SQL is fired for every row which is marked for update by the update strategy. To avoid multiple update statements you can INSERT all the records which are meant to be UPDATEs into a temporary table, and then use a correlated SQL to update the records in the actual table (the 200M table). This query can be fired as a post-session SQL. Please see the sample SQL:

UPDATE TGT_TABLE U
SET (U.COLUMNS_LIST /*Column list to be updated*/) =
    (SELECT I.COLUMNS_LIST /*Column list to be updated*/
     FROM UPD_TABLE I
     WHERE I.KEYS = U.KEYS AND I.TIME_ID = U.TIME_ID)
WHERE EXISTS (SELECT 1 FROM UPD_TABLE I WHERE I.KEYS = U.KEYS AND I.TIME_ID = U.TIME_ID)

TGT_TABLE - the actual table with 200M records. UPD_TABLE - the table with records meant for UPDATE (1K records). We need to make sure that the indexes are up to date and stats are collected. Since this is more to do with DB performance, you may need the help of a DBA as well to check the DB throughput, SQL cost etc. Hope this will help you.
