377.informatica - What Are The Main Issues While Working With Flat Files As Source and As Targets ?
We need to specify the correct path in the session and mention whether the file is 'direct' or 'indirect'. Keep
the file in the exact path that you have specified in the session.
-regards
rasmi
=======================================
1. We cannot use SQL override. We have to use transformations for all our requirements.
2. Testing the flat files is a very tedious job.
3. The file format (source/target definition) should match exactly with the format of the data file. Most of the time erroneous
results come when the data file layout is not in sync with the actual file.
(i) Your data file may be fixed width but the definition is delimited ----> truncated data.
(ii) Your data file as well as the definition is delimited, but you specify a wrong delimiter: (a) a delimiter other than the one present in
the actual file, or (b) a delimiter that occurs as a character in some field of the file --> wrong data again.
(iii) Not specifying the NULL character properly may result in wrong data.
(iv) There are other settings/attributes while creating the file definition about which one should be very careful.
4. If you miss the link to any column of the target then all the data will be placed in the wrong fields. The
missed column won't exist in the target data file.
332.Informatica - Explain about Informatica server process that how it works relates to mapping variables?
Informatica primarily uses the Load Manager and the Data Transformation Manager (DTM) to perform extraction, transformation and
loading. The Load Manager reads parameters and variables related to the session, mapping and server, and passes the mapping
parameter and variable information to the DTM. The DTM uses this information to perform the data movement from source
to target.
=======================================
The PowerCenter Server holds two different values for a mapping variable during a session run:
- Start value of a mapping variable
- Current value of a mapping variable
Start Value
The start value is the value of the variable at the start of the session. The start value could be a value defined in the
parameter file for the variable, a value saved in the repository from the previous run of the session, a user-defined initial
value for the variable, or the default value based on the variable datatype.
The PowerCenter Server looks for the start value in the following order:
1. Value in parameter file
2. Value saved in the repository
3. Initial value
4. Default value
Current Value
The current value is the value of the variable as the session progresses. When a session starts, the current value of a
variable is the same as the start value. As the session progresses, the PowerCenter Server calculates the current value
using a variable function that you set for the variable. Unlike the start value of a mapping variable, the current value can
change: the PowerCenter Server evaluates the current value of the variable as each row passes through the mapping.
=======================================
First the Load Manager starts the session; it performs verifications and validations about variables and manages post-
session tasks such as mail. Then it creates the DTM process.
The DTM in turn creates a master thread, which creates the remaining threads.
The master thread creates the
read thread,
write thread,
transformation thread,
pre- and post-session threads, etc.
Finally the DTM hands over to the Load Manager after writing into the target.
331.Informatica - write a query to retrieve the latest records from the table sorted by version(scd).
You can write a query using an inline view: compare the previous version with the new highest version,
and then you can get your result.
=======================================
Hi Sunil,
Can you please explain your answer in somewhat more detail?
=======================================
Hi
Assume you put a surrogate key in the target (Dept table), say p_key, and the
version field, dno field and loc field are there.
Then:
select a.p_key, a.dno, a.loc, a.version from t_dept a
where a.version = (select max(b.version) from t_dept b where a.dno = b.dno)
If you write this query in the lookup, it retrieves the latest (max)
version from the target in the lookup. In this way performance increases.
=======================================
select * from (
  select Acct.*, rank() over (partition by ch_key_id order by version desc) as rnk
  from Acct
) where rnk = 1
=======================================
select business_key, max(version) from tablename group by business_key
You can configure the following information in the Partitions view on the Mapping tab:
- Add and delete partition points.
- Enter a description for each partition.
- Specify the partition type at each partition point.
- Add a partition key and key ranges for certain partition types.
=======================================
By default, when we create the session, the Workflow Manager creates pass-through partition points at Source Qualifier
transformations and target instances.
283.Informatica - hi, how do we validate all the mappings in the repository at once?
You cannot validate all the mappings in one go. But you can validate all the mappings in a folder in one go and
continue the process for all the folders.
For doing this, log on to the Repository Manager. Open the folder, then the mappings sub-folder, then select all or some of
the mappings (by pressing the Shift or Control key; Ctrl+A does not work) and then right-click and validate.
=======================================
Yes. We can validate all mappings using the Repo Manager.
Whenever any source data is changed, we need to capture it in the target system also. This can be done basically in 3 ways:
The target record is completely replaced with the new record (Type 1).
Complete changes can be captured as different records and stored in the target table (Type 2).
Only the last change and the present data are captured (Type 3).
CDC can be done generally by using a timestamp or version key, for example as sketched below.
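As an illustration (a minimal sketch, not from the original answer; the src_customer table, last_update_ts column and etl_load_control table are hypothetical names), changed rows since the previous load could be picked up with a timestamp filter and then routed to the Type 1/2/3 handling:
select cust_id, cust_name, cust_addr, last_update_ts
from src_customer
where last_update_ts > (select max(load_ts)
                        from etl_load_control
                        where tgt_table = 'DIM_CUSTOMER')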
228.Informatica - what is the repository agent?
The Repository Agent is a multi-threaded process that fetches, inserts and updates metadata in the repository database
tables. The Repository Agent uses object locking to ensure the consistency of metadata in the repository.
=======================================
The Repository Server uses a process called the Repository Agent to access the tables of the repository database. The
Repository Server uses multiple Repository Agent processes to manage multiple repositories on different machines on the
network, using native drivers.
=======================================
The name itself says it: the agent is a mediator between the repository server and the repository database tables.
Simply put, the repository agent is the process that speaks with the repository.
224.Informatica - what are the transformations that restrict the partitioning of sessions?
Advanced External Procedure transformation and External Procedure transformation:
these transformations contain a check box on the Properties tab to allow
partitioning.
*Aggregator Transformation:
if you use sorted ports, you cannot partition the associated source.
*Joiner Transformation:
you cannot partition the master source for a Joiner transformation.
*Normalizer Transformation
*XML targets.
=======================================
1) Source definition
2) Sequence Generator
3) Unconnected Transformation
4) XML Target definition
213.Informatica - How do you create single lookup transformation using multiple tables?
Write an override SQL query. Adjust the ports as per the SQL query.
=======================================
No, it is not possible to create a single lookup on multiple tables, because a lookup is created upon a single
table.
=======================================
For a connected lookup transformation: 1> create the lookup transformation; 2> go for Skip; 3> manually enter the
port names that you want to look up; 4> connect with the input ports from the source table; 5> give the condition; 6> go
for Generate SQL, then modify it according to your requirement and validate; it will work.
=======================================
We can just create a view using the two tables and then take that view as the lookup table.
=======================================
If you want single lookup values to be used in multiple target tables, this can be done!
For this we can use an unconnected lookup and can collect the values from the source table into any target table,
depending upon the business rule.
184.Informatica - what is the difference between constraint based load ordering and target load plan
Constraint based load ordering
example:
Table 1---Master
Table 2---Detail
If the data in table1 is dependent on the data in table2 then table2 should be loaded first. In such cases, to control the
load order of the tables we need some conditional loading, which is nothing but constraint based loading.
In Informatica this feature is implemented by just one check box at the session level.
Target load order comes in the Designer property. Click the Mappings tab in the Designer and then Target Load Plan. It will show
all the target load groups in the particular mapping. You specify the order there and the server will load to the targets
accordingly.
A target load group is a set of source, source qualifier, transformations and target.
Whereas constraint based loading is a session property. Here the multiple targets must be generated from one source
qualifier. The target tables must possess primary/foreign key relationships, so that the server loads according to the key
relations irrespective of the target load order plan.
=======================================
If you have only one source and it is loading into multiple targets, you have to use constraint based loading. But the
target tables should have key relationships between them.
If you have multiple source qualifiers that have to be loaded into multiple targets, you have to use target load order.
Constraint based loading: if your mapping contains a single pipeline (flow) with more than one target (and the target tables
have a master-child relationship), you need to use constraint based loading at the session level.
Target load plan: if your mapping contains multiple pipelines (flows), you specify the execution order one by one (for example,
pipeline 1 needs to execute first, then pipeline 2, then pipeline 3); this is purely based on pipeline dependency.
139.Informatica - what are cost based and rule based approaches and the difference
Cost based and rule based approaches are the optimization techniques used in relation to databases, where
we need to optimize a SQL query.
Basically Oracle provides two types of optimizers (indeed 3, but we use only these two techniques, because the third has
some disadvantages).
Whenever you process any SQL query in Oracle, what the Oracle engine internally does is read the query and decide
which will be the best possible way of executing the query. In this process Oracle follows these optimization
techniques.
1. Cost based optimizer (CBO): if a SQL query can be executed in 2 different ways (say it has path 1 and path 2 for the
same query), what the CBO does is calculate the cost of each path, analyse for which path the
cost of execution is less, and then execute that path, so that it can optimize the query execution.
2. Rule based optimizer (RBO): this basically follows the rules which are needed for executing a query.
So, depending on the number of rules which are to be applied, the optimizer runs the query.
If the table you are trying to query is already analysed, then Oracle will go with the CBO.
If the table is not analysed, Oracle follows the RBO.
For the first time, if the table is not analysed, Oracle will go with a full table scan.
If you want to look up data on multiple tables at a time, you can do one thing: join the tables that you want, then look up on that
joined table. Informatica provides lookups on joined tables; hats off to Informatica.
=======================================
You can do it.
When you create the lookup transformation, Informatica asks for a table name; you can choose Source, Target, Import, or Skip. Click
Skip and then use the SQL override property in the Properties tab to join the two tables for the lookup.
Alternatively, join the two sources by using a Joiner transformation and then apply a lookup on the resulting table.
=======================================
Whatever my friends have answered earlier is correct. To be more specific:
if the two tables are relational, then you can use the SQL lookup override option to join the two tables in the lookup
properties. You cannot join a flat file and a relational table.
e.g.: the lookup default query will be select the lookup-table column_names from lookup_table. You can now continue this query:
add the column_names of the 2nd table with the qualifier, and a where clause. If you want to use an order by, then use -- at the
end of the order by.
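For instance (a minimal sketch; the emp/dept tables and their columns are only illustrative), the lookup SQL override could look like the following, with the trailing -- used to comment out the ORDER BY that the server appends:
select e.empno, e.ename, e.deptno, d.dname
from emp e, dept d
where e.deptno = d.deptno
order by e.empno --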
120.Informatica - How to retrieve the records from a rejected file. Explain with syntax or example
There is one utility called the Reject Loader with which we can find the rejected records and can refine and reload the rejected
records.
=======================================
Yes. Every time you run the session one reject file will be created, and all the rejected rows will be there in the reject file. You can
modify the records, correct the things in the records, and load them to the target directly from the reject file
using the Reject Loader.
=======================================
Can you explain how to load rejected rows through Informatica?
=======================================
During the execution of a workflow, all the rejected rows will be stored in bad files (under the directory where your
Informatica server is installed, e.g. C:\Program Files\Informatica PowerCenter 7.1\Server). These bad files can be imported
as a flat file source, and then through a direct mapping we can load these files in the desired format.
98.Informatica - can we modify the data in flat file?
=======================================
Let's not discuss manually modifying the data of the flat file.
Let's assume that the target is a flat file. I want to update the data in the flat file target based on the input source rows.
Like we use the update strategy / target properties in the case of relational targets for updates, do we have any options in the
session or mapping to perform a similar task for a flat file target?
I have heard about the append option in INFA 8.x. This may be helpful for incremental loads into the flat file.
But this is not a workaround for updating the rows.
=======================================
You can modify the flat file using shell scripting in Unix (awk, grep, sed).
97.Informatica - how to get the first 100 rows from the flat file into the target?
Please check this one:
task ----->(link) session (Workflow Manager)
Double-click on the link and type $$source success rows (a parameter in session variables) = 100.
It should automatically stop the session.
82.Informatica - If I make any modifications to my table in the back end, does it reflect in the Informatica warehouse or
mapping?
Informatica is not at all concerned with the back-end database. It
displays all the information that is stored in the repository. If you want to reflect back-end changes on the Informatica
screens, you have to import from the back end into Informatica again through a valid connection, and you have to replace the existing
definitions with the imported ones.
=======================================
Yes, it will be reflected once you refresh the mapping again.
=======================================
It does matter if you have a SQL override - say in the SQ, or in a Lookup where you override the default SQL. Then if you make a
change to the underlying table in the database that makes the override SQL incorrect for the modified table, the session
will fail.
If you change a table - say, rename a column that is in the SQL override statement - then the session will fail.
But if you added a column to the underlying table after the last column, then the SQL statement in the override will still be
valid. If you make a change to the size of columns, the SQL will still be valid, although you may get truncation of data if the
database column has a larger size (more characters) than the SQ or subsequent transformation.
17.Informatica - What are the mapping parameters and mapping
variables?
17 A mapping parameter represents a constant value that you can define before running a session. A mapping parameter
retains the same value throughout the entire session.
When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet. Then you define the value of the
parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The
Informatica server saves the value of a mapping variable to the repository at the end of the session run and uses that value
the next time
you run the session.
21.Informatica - What is the aggregate cache in the Aggregator
transformation?
21 The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session
that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the
transformation. If the Informatica server requires more space, it stores overflow values in cache files.
26.Informatica - What are the Joiner caches?
26 When a Joiner transformation occurs in a session, the
Informatica Server reads all the records from the master source and builds index
and data caches based on the master rows.
After building the caches, the Joiner transformation reads records from the detail
source and performs the joins.
30.Informatica - Differences between connected and unconnected
lookup?
32.Informatica - What are the types of lookup caches?
32 Persistent cache: you can save the lookup cache files and reuse
them the next time the Informatica server processes a lookup transformation
configured to use the cache.
Recache from database: if the persistent cache is not synchronized with the
lookup table, you can configure the lookup transformation to rebuild the lookup
cache.
Static cache: you can configure a static, or read-only, cache for any lookup table. By
default the Informatica server creates a static cache. It caches the lookup table and
lookup values in the cache for each row that comes into the transformation. When
the lookup condition is true, the Informatica server does not update the cache
while it processes the lookup transformation.
Dynamic cache: if you want to cache the target table and insert new rows into the
cache and the target, you can create a lookup transformation to use a dynamic cache.
The Informatica server dynamically inserts data into the target table.
Shared cache: you can share the lookup cache between multiple transformations. You can
share an unnamed cache between transformations in the same mapping.
36.Informatica - What is the Rankindex in Ranktransformation?
36 The Designer automatically creates a RANKINDEX port for
each Rank transformation. The Informatica Server uses the Rank Index port to
store the ranking position for each record in a group. For example, if you create a
Rank transformation that ranks the top 5 salespersons for each quarter, the rank
index numbers the salespeople from 1 to 5:
Incremental Aggregation
Using this, you apply captured changes in the source to aggregate calculation in a session. If the source
changes only incrementally and you can capture changes, you can configure the session to process only
those changes
This allows the server to update the target incrementally, rather than forcing it to process the entire source
and recalculate the same calculations each time you run the session.
Steps:
- The first time you run a session with incremental aggregation enabled, the server processes the entire source.
- At the end of the session, the server stores the aggregate data from that session run in two files, the index file
and the data file. The server creates the files in a local directory.
- The second time you run the session, use only the changes in the source as the source data for the session. The
server then performs the following actions:
(1) For each input record, the session checks the historical information in the index file for a corresponding
group, then:
If it finds a corresponding group,
the server performs the aggregate operation incrementally, using the aggregate data for that
group, and saves the incremental changes.
Else,
the server creates a new group and saves the record data.
(2) When writing to the target, the server applies the changes to the existing target. It also
saves the modified aggregate data in the index/data files to be used as historical data the next time you run
the session.
Each subsequent time you run the session with incremental aggregation, you use only the incremental source changes
in the session.
If the source changes significantly, and you want the server to continue saving the aggregate data for the future
incremental changes, configure the server to overwrite the existing aggregate data with new aggregate data.
You can capture incremental changes. You might do this by filtering source data by timestamp, for example as sketched below.
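A minimal sketch of such a filter (the orders table, order_ts column and $$LastRunTime mapping variable are hypothetical names), usable for example as a Source Qualifier source filter or SQL override:
select order_id, cust_id, amount, order_ts
from orders
where order_ts > to_date('$$LastRunTime', 'YYYY-MM-DD HH24:MI:SS')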
SESSION LOGS
Information that resides in a session log:
- Session initialization
- Other information
By default, the server generates log files based on the server code page.
Thread Identifier
Ex: CMN_1039
Reader and writer thread codes have 3 digits and transformation codes have 4 digits. The numbers following a
thread name indicate the following:
(a) Target load order group number
(b) Source pipeline number
(c) Partition number
(d) Aggregate/ Rank boundary number
BR -
CMN -
DBGR - Related to debugger
EP - External Procedure
LM - Load Manager
TM - DTM
REP - Repository
WRT - Writer
Load Summary
(a) Inserted
(b) Updated
(c) Deleted
(d) Rejected
Statistics details
(a) Requested rows shows the no of rows the writer actually received for the specified operation
(b) Applied rows shows the number of rows the writer successfully applied to the target (Without Error)
(c) Rejected rows show the no of rows the writer could not apply to the target
(d) Affected rows shows the no of rows affected by the specified operation
Detailed transformation statistics
The server reports the following details for each transformation in the mapping
(a) Name of Transformation
(b) No of I/P rows and name of the Input source
(c) No of O/P rows and name of the output target
(d) No of rows dropped
Tracing Levels
Normal - Initialization and status information, errors encountered, transformation errors, rows skipped,
summarized session details (not at the level of individual rows).
Terse -
Verbose Init - In addition to normal tracing: names of index and data files used, and detailed transformation
statistics.
Verbose Data - In addition to Verbose Init: each row that passes into the mapping, with detailed transformation statistics.
NOTE
When you enter tracing level in the session property sheet, you override tracing levels configured for
transformations in the mapping.
Non-Fatal
Fatal
Others
Usages of ABORT function in mapping logic, to abort a session when the server encounters a transformation
error.
Stopping the server using pmcmd (or) Server Manager
Performing Recovery
-
When the server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row id
of the last row committed to the target database. The server then reads all sources again and starts
processing from the next row id.
By default, Perform Recovery is disabled in the setup. Hence it won't make entries in the OPB_SRVR_RECOVERY
table.
The recovery session moves through the states of a normal session schedule: waiting to run, initializing,
running, completed and failed. If the initial recovery fails, you can run recovery as many times as needed.
The normal reject loading process can also be done in session recovery process.
Un recoverable Sessions
Under certain circumstances, when a session does not complete, you need to truncate the target and run the
session from the beginning.
Commit Intervals
A commit interval is the interval at which the server commits data to relational targets during a session.
(a) Target based commit
- The server commits data based on the number of target rows and the key constraints on the target table. The commit
point also depends on the buffer block size and the commit interval.
During a session, the server continues to fill the writer buffer after it reaches the commit interval. When the
buffer block is full, the Informatica server issues a commit command. As a result, the amount of data
committed at the commit point generally exceeds the commit interval.
The server commits data to each target based on primary/foreign key constraints.
(b) Source based commit
- The server commits data based on the number of source rows. The commit point is the commit interval you
configure in the session properties.
During a session, the server commits data to the target based on the number of rows from an active source
in a single pipeline. The rows are referred to as source rows.
A pipeline consists of a source qualifier and all the transformations and targets that receive data from
source qualifier.
Although the Filter, Router and Update Strategy transformations are active transformations, the server does
not use them as active sources in a source based commit session.
When a server runs a session, it identifies the active source for each pipeline in the mapping. The server
generates a commit row from the active source at every commit interval.
When each target in the pipeline receives the commit rows the server performs the commit.
Reject Loading
During a session, the server creates a reject file for each target instance in the mapping. If the writer or the target
rejects data, the server writes the rejected row into the reject file.
You can correct the rejected data and re-load it to relational targets using the reject loading utility. (You
cannot load rejected data into a flat file target.)
Each time you run a session, the server appends rejected data to the reject file.
Locating the BadFiles
$PMBadFileDir
Filename.bad
When you run a partitioned session, the server creates a separate reject file for each partition.
Reading Rejected data
Ex:
3,D,1,D,D,0,D,1094345609,D,0,0.00
To help us in finding the reason for rejecting, there are two main things.
(a) Row indicator
The row indicator tells the writer what to do with the row of wrong data.
Row Indicator   Meaning   Rejected By
0               Insert    Writer or target
1               Update    Writer or target
2               Delete    Writer or target
3               Reject    Writer
If a row indicator is 3, the writer rejected the row because an update strategy expression marked it for reject.
(b) Column indicator
A column indicator follows the first column of data, and another column indicator follows every subsequent
column of data; it defines the type of data preceding it.
Column Indicator   Meaning     Writer Treats As
D                  Valid data  Good data
O                  Overflow    Bad data
N                  Null        Bad data
T                  Truncated   Bad data
NOTE
NULL columns appear in the reject file with commas marking their column.
For example, a series of N indicators might lead you to believe the target database does not accept NULL values,
so you decide to change those NULL values to zero.
However, if those rows also had a 3 in the row indicator column, the row was rejected by the writer because of an
update strategy expression, not because of a target database restriction.
If you try to load the corrected file to the target, the writer will again reject those rows, and they will contain inaccurate 0
values in place of the NULL values.
After correcting the rejected data, rename the rejected file to reject_file.in.
The reject loader uses the data movement mode configured for the server. It also uses the code page of the
server/OS. Hence do not change these in the middle of the reject loading.
Other points
The server does not perform the following options when using the reject loader:
(a)
(b)
(c)
(d)
FTP targets
(e)
External Loading
External Loading
You can configure a session to use Sybase IQ, Teradata and Oracle external loaders to load session target files
into the respective databases.
The External Loader option can increase session performance, since these databases can load information directly
from files faster than they can run SQL commands to insert the same data into the database.
Method:
When a session uses an External Loader, the session creates a control file and a target flat file. The control file contains
information about the target flat file, such as the data format and loading instructions for the External Loader. The control
file has an extension of *.ctl and you can view the file in $PMTargetFileDir.
For using an External Loader:
The following must be done:
-
Configure the session to write to a target flat file local to the server.
Choose an external loader connection for each target file in session property sheet.
Disable constraints
Performance issues
o The server can use multiple External Loaders within one session (e.g. you have a session with two
target files, one with the Oracle External Loader and another with the Sybase External Loader).
Other Information:
-
The External Loader performance depends upon the platform of the server.
The server writes External Loader initialization and completion messages in the session log. However,
details about EL performance are generated in the EL log, which is stored in the same target directory.
If the session contains errors, the server continues the EL process. If the session fails, the server loads
partial target data using the EL.
The EL creates a reject file for data rejected by the database. The reject file has an extension of *.ldr
reject.
You can load corrected data from this file using the database reject loader, not the Informatica reject
load utility (for the EL reject file only).
Configuring EL in session
-
Caches
-
The server creates index and data caches in memory for the Aggregator, Rank, Joiner and Lookup transformations in a
mapping.
The server stores key values in the index caches and output values in the data caches; if the server requires more
memory, it stores overflow values in cache files.
When the session completes, the server releases cache memory, and in most circumstances it deletes
the cache files.
Index cache - stores group values, as configured in the group-by ports.
Data cache - stores calculations (for example, rank information).
Column overhead includes a null indicator, and row overhead can include row to key information.
Steps:
- First, add the total column size in the cache to the row overhead.
- Multiply the result by the number of groups (or rows) in the cache; this gives the minimum cache requirement.
Location:
- By default, the server stores the index and data files in the directory $PMCacheDir.
- The server names the index files PMAGG*.idx and the data files PMAGG*.dat. If the size exceeds 2 GB, you may find
multiple index and data files in the directory. The server appends a number to the end of the
filename (PMAGG*.id*1, id*2, etc.).
Aggregator Caches
- When the server runs a session with an Aggregator transformation, it stores data in memory until it completes
the aggregation.
- When you partition a source, the server creates one memory cache and one disk cache for each partition. It routes
data from one partition to another based on the group key values of the
transformation.
Index cache:
#Groups ((Σ column size) + 7)
Aggregate data cache:
#Groups ((Σ column size) + 7)
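As a rough worked example (assuming the formulas above, with hypothetical numbers): for 1,000 groups and a total group-by/aggregate column size of 100 bytes, the minimum index cache would be about 1,000 x (100 + 7) = 107,000 bytes, and the data cache would be of the same order.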
Rank Cache
- When the server runs a session with a Rank transformation, it compares an input row with the rows
in the data cache. If the input row out-ranks a stored row, the Informatica server replaces the stored row with the
input row.
- If the Rank transformation is configured to rank across multiple groups, the server ranks incrementally for
each group it finds.
Index cache:
#Groups ((Σ column size) + 7)
Rank data cache:
#Group [(#Ranks * (Σ column size + 10)) + 20]
Joiner Cache:
- When the server runs a session with a Joiner transformation, it reads all rows from the master source and builds
memory caches based on the master rows.
- After building these caches, the server reads rows from the detail source and performs the joins.
- The server creates the index cache as it reads the master source into the data cache. The server uses the
index cache to test the join condition. When it finds a match, it retrieves row values from the data cache.
- To improve Joiner performance, the server aligns all data for the Joiner cache on an eight byte boundary.
Index cache:
#Master rows [(Σ column size) + 16]
Joiner data cache:
#Master rows [(Σ column size) + 8]
Lookup cache:
- When the server runs a lookup transformation, the server builds a cache in memory when it processes the first
row of data in the transformation.
- The server builds the cache and queries it for each row that enters the transformation.
- If you partition the source pipeline, the server allocates the configured amount of memory for each partition.
If two lookup transformations share the cache, the server does not allocate additional memory for the
second lookup transformation.
- The server creates the index and data cache files in the lookup cache directory and uses the server code page
to create the files.
Index cache:
#Rows in lookup table [(Σ column size) + 16]
Lookup data cache:
#Rows in lookup table [(Σ column size) + 8]
Mapplets
When the server runs a session using a mapplet, it expands the mapplet. The server then runs the session as it
would any other session, passing data through each transformation in the mapplet.
If you use a reusable transformation in a mapplet, changes to it can invalidate the mapplet and every mapping
using the mapplet.
You can create a non-reusable instance of a reusable transformation.
Mapplet Objects:
(a) Input transformation
(b) Source qualifier
(c) Transformations
(d) Output transformation
Not supported in mapplets: Joiner, Normalizer, Target definitions.
Types of Mapplets:
(a) Active mapplets
(b) Passive mapplets
Copied mapplets are not instances of the original mapplets. If you make changes to the original, the copy does not inherit
your changes.
You can use a single mapplet more than once in a mapping.
Ports
Default value for an input port: NULL
Default value for an output port: ERROR
Session Parameters
These parameters represent values you might want to change between sessions, such as a DB connection or source file.
We can use a session parameter in a session property sheet, then define the parameters in a session parameter file.
The user-defined session parameters are:
(a) DB connection
(b) Source file
(c) Target file
(d) Reject file
Description:
Use session parameter to make sessions more flexible. For example, you have the same type of transactional data
written to two different databases, and you use the database connections TransDB1 and TransDB2 to connect to the
databases. You want to use the same mapping for both tables.
Instead of creating two sessions for the same mapping, you can create a database connection parameter, like
$DBConnectionSource, and use it as the source database connection for the session.
When you create a parameter file for the session, you set $DBConnectionSource to TransDB1 and run the session.
After it completes set the value to TransDB2 and run the session again.
NOTE:
You can use several parameter together to make session management easier.
Session parameters do not have default values; when the server cannot find a value for a session parameter, it fails to
initialize the session.
Session Parameter File
- In the parameter file, we can specify the folder and session name, then list the parameters and variables used in the
session and assign each a value:
Mapping parameters
Mapping variables
Session parameters
You can include parameter and variable information for more than one session in a single parameter file by
creating separate sections for each session within the parameter file.
You can override the parameter file for sessions contained in a batch by using a batch parameter file. A
batch parameter file has the same format as a session parameter file. A small sketch of such a file is shown below.
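A minimal sketch of a session parameter file (the folder, session, connection and parameter names are hypothetical, and the exact section-heading format depends on the PowerCenter version):
[MyFolder.s_m_load_customers]
$DBConnectionSource=TransDB1
$InputFile_customers=/ftp_data/webrep/SrcFiles/customers.txt
$$LastRunTime=2004-01-01 00:00:00
$$CountryFilter=US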
Locale
(a) System locale - system default
(b) Input locale
Mapping Parameters and Variables
(a) Before session: after saving the mapping, we can run some initial tests.
(b) After session:
Metadata Reporter:
-
Web based application that allows you to run reports against repository metadata.
Reports include executed sessions, lookup table dependencies, mappings and source/target schemas.
Repository
Types of Repository
(a) Global Repository
a. This is the hub of the domain. Use the global repository to store common objects that multiple developers can use
through shortcuts. These may include operational or application source definitions, reusable
transformations, mapplets and mappings.
(b) Local Repository
a. A local repository is any repository within a domain that is not the global repository. Use the local repository for
development.
(c) Standard Repository
a. A repository that functions individually, unrelated and unconnected to any other repository.
NOTE:
- Once you create a global repository, you cannot change it to a local repository.
Batches
- Provide a way to group sessions for either serial or parallel execution by the server.
Nesting Batches
Each batch can contain any number of session/batches. We can nest batches several levels deep, defining batches
within batches
Nested batches are useful when you want to control a complex series of sessions that must run sequentially or
concurrently
Scheduling
When you place sessions in a batch, the batch schedule overrides the session schedules by default. However, we
can configure a batched session to run on its own schedule by selecting the Use Absolute Time session option.
Server Behavior
The server configured to run a batch overrides the servers configured to run the sessions within the batch. If you have
multiple servers, all sessions within a batch run on the Informatica server that runs the batch.
The server marks a batch as failed if one of its sessions is configured to run if Previous completes and that
previous session fails.
Sequential Batch
If you have sessions with a dependent source/target relationship, you can place them in a sequential batch, so that
the Informatica server can run them in consecutive order.
There are two ways of running sessions under this category:
(a) Run the session only if the previous one completes successfully
(b) Always run the session (this is the default)
Concurrent Batch
In this mode, the server starts all of the sessions within the batch at the same time.
Concurrent batches take advantage of the resources of the Informatica server, reducing the time it takes to run the
sessions separately or in a sequential batch.
Concurrent batch in a Sequential batch
If you have concurrent batches with source-target dependencies that benefit from running those batches in a
particular order, just like sessions, place them into a sequential batch.
If the session you want to stop is part of a batch, you must stop the batch.
When you issue the stop command, the server stops reading data. It continues processing and writing data
and committing data to targets.
If the server cannot finish processing and committing data, you can issue the ABORT command. It is similar
to the stop command, except that it has a 60 second timeout. If the server cannot finish processing and committing
data within 60 seconds, it kills the DTM process and terminates the session.
Recovery:
-
After a session is stopped/aborted, the session results can be recovered. When recovery is
performed, the session continues from the point at which it stopped.
If you do not recover the session, the server runs the entire session the next time.
Hence, after stopping/aborting, you may need to manually delete targets before the session runs again.
NOTE:
ABORT command and ABORT function, both are different.
When can a Session Fail
-
Session exceeds the maximum no of sessions the server can run concurrently
Server cannot obtain an execute lock for the session (the session is already locked)
Server encounter Transformation row errors (Ex: NULL value in non-null fields)
Performance considerations:
- Performing ETL for each partition, in parallel (for this, multiple CPUs are needed).
- Adding indexes.
- Multiple lookups can reduce the performance. Verify the largest lookup table and tune the expressions.
- At session level, the causes are a small cache size, low buffer memory and a small commit interval.
- At system level:
Hierarchy of optimization:
- Target
- Source
- Mapping
- Session
- System
Source level -
Mapping level -
Session level - concurrent batches, partition sessions.
System level - reduce paging.
Session Process
The Informatica server uses both process memory and system shared memory to perform the ETL process.
It runs as a daemon on UNIX and as a service on Windows NT.
The following processes are used to run a session:
(a) Load Manager process: starts the session,
locks the session,
verifies permissions/privileges.
(b) DTM process:
The primary purpose of the DTM is to create and manage threads that carry out the session tasks.
The DTM allocates process memory for the session and divides it into buffers. This is known as buffer
memory. The default memory allocation is 12,000,000 bytes. It creates the main thread, which is called the master
thread; this manages all other threads.
Various threads and their functions:
Master thread -
Mapping thread -
Reader thread -
Writer thread -
Transformation thread -
Note:
When you run a session, the threads for a partitioned source execute concurrently. The threads use buffers
to move/transform data.
Read lock. Created when you open a repository object in a folder for which you do not have write permission.
Also created when you open an object with an existing write lock.
Write lock. Created when you create or edit a repository object in a folder for which you have write permission.
Execute lock. Created when you start a session or batch, or when the Informatica Server starts a scheduled
session or batch.
Fetch lock. Created when the repository reads information about repository objects from the database.
Save lock. Created when you save information to the repository.
Q: What happens in a database when a cached LOOKUP object is created (during a session)?
The session generates a select statement with an Order By clause. Any time this is issued, the databases like Oracle
and Sybase will select (read) all the data from the table, in to the temporary database/space. Then the data will be
sorted, and read in chunks back to Informatica server. This means, that hot-spot contention for a cached lookup will
NOT be the table it just read from. It will be the TEMP area in the database, particularly if the TEMP area is being
utilized for other things. Also - once the cache is created, it is not re-read until the next running session re-creates it.
Q: Can you explain how "constraint based load ordering" works? (27 Jan 2000)
Constraint based load ordering in PowerMart / PowerCenter works like this: it controls the order in which the target
tables are committed to a relational database. It is of no use when sending information to a flat file. To construct the
proper constraint order: links between the TARGET tables in Informatica need to be constructed. Simply turning on
"constraint based load ordering" has no effect on the operation itself. Informatica does NOT read constraints from the
database when this switch is turned on. Again, to take advantage of this switch, you must construct primary / foreign
key relationships in the TARGET TABLES in the designer of Informatica. Creating primary / foreign key relationships is
difficult - you are only allowed to link a single port (field) to a single table as a primary / foreign key.
What is the method of loading 5 flat files of having same structure to a single target and which transformations
will you use?
This can be handled by using the file list in Informatica. If we have 5 files
in different locations on the server and we need to load them into a single target
table, in the session properties we need to change the file type to Indirect.
(Choose Direct if the source file contains the source data. Choose Indirect if the
source file contains a list of files.
When you select Indirect, the PowerCenter Server finds the file list, then reads
each listed file when it executes the session.)
I take a notepad (text file), give the following paths and filenames in it, and save this notepad as
emp_source.txt in the directory /ftp_data/webrep/:
/ftp_data/webrep/SrcFiles/abc.txt
/ftp_data/webrep/bcd.txt
/ftp_data/webrep/srcfilesforsessions/xyz.txt
/ftp_data/webrep/SrcFiles/uvw.txt
/ftp_data/webrep/pqr.txt
In the session properties I give /ftp_data/webrep/ as the
directory path, the file name as emp_source.txt, and the file type as Indirect.
Other methods to Improve Performance
Optimizing the Target Database
If your session writes to a flat file target, you can optimize session performance by writing to a flat file target that is local
to the Informatica Server.
If your session writes to a relational target, consider performing the following tasks to increase performance:
Drop indexes and key constraints.
Increase checkpoint intervals.
Use bulk loading.
Use external loading.
Turn off recovery.
Increase database network packet size.
Optimize Oracle target databases.
When you write to Oracle target databases, the database uses rollback segments during loads. Make sure that
the database stores rollback segments in appropriate tablespaces, preferably on different disks. The rollback segments
should also have appropriate storage clauses.
You can optimize the Oracle target database by tuning the Oracle redo log. The Oracle database uses the redo
log to log loading operations. Make sure that redo log size and buffer size are optimal. You can view redo log properties
in the init.ora file.
If your Oracle instance is local to the Informatica Server, you can optimize performance by using IPC protocol to
connect to the Oracle database. You can set up Oracle database connection in listener.ora and tnsnames.ora.
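As a rough sketch of such an IPC entry in tnsnames.ora (the alias ORCL_IPC and the key/service name ORCL are hypothetical; the exact entries depend on your Oracle setup, and listener.ora needs a matching IPC address):
ORCL_IPC =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = IPC)(KEY = ORCL))
    (CONNECT_DATA = (SERVICE_NAME = ORCL))
  )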
Improving Performance at mapping level
Optimizing Datatype Conversions
Forcing the Informatica Server to make unnecessary datatype conversions slows performance.
For example, if your mapping moves data from an Integer column to a Decimal column, then back to an Integer column,
the unnecessary datatype conversion slows performance. Where possible, eliminate unnecessary datatype conversions
from mappings.
Some datatype conversions can improve system performance. Use integer values in place of other datatypes when
performing comparisons using Lookup and Filter transformations.
For example, many databases store U.S. zip code information as a Char or Varchar datatype. If you convert your zip
code data to an Integer datatype, the lookup database stores the zip code 94303-1234 as 943031234. This helps
increase the speed of the lookup comparisons based on zip code.
Optimizing Lookup Transformations
If a mapping contains a Lookup transformation, you can optimize the lookup. Some of the things you can do to increase
performance include caching the lookup table, optimizing the lookup condition, or indexing the lookup table.
Caching Lookups
If a mapping contains Lookup transformations, you might want to enable lookup caching. In general, you want to cache
lookup tables that need less than 300MB.
When you enable caching, the Informatica Server caches the lookup table and queries the lookup cache during the
session. When this option is not enabled, the Informatica Server queries the lookup table on a row-by-row basis. You
can increase performance using a shared or persistent cache:
Shared cache. You can share the lookup cache between multiple transformations. You can share an unnamed cache
between transformations in the same mapping. You can share a named cache between transformations in the same or
different mappings.
Persistent cache. If you want to save and reuse the cache files, you can configure the transformation to use a
persistent cache. Use this feature when you know the lookup table does not change between session runs. Using a
persistent cache can improve performance because the Informatica Server builds the memory cache from the cache
files instead of from the database.
Reducing the Number of Cached Rows
Use the Lookup SQL Override option to add a WHERE clause to the default SQL statement. This allows you to reduce
the number of rows included in the cache.
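For instance (a minimal sketch; the customers table, status column and value are hypothetical), the lookup SQL override could be limited like this so that only the rows of interest are cached:
select cust_id, cust_name, cust_status
from customers
where cust_status = 'ACTIVE'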
Optimizing the Lookup Condition
If you include more than one lookup condition, place the conditions with an equal sign first to optimize lookup
performance.
Indexing the Lookup Table
The Informatica Server needs to query, sort, and compare values in the lookup condition columns. The index needs to
include every column used in a lookup condition. You can improve performance for both cached and uncached lookups:
Cached lookups. You can improve performance by indexing the columns in the lookup ORDER BY. The session log
contains the ORDER BY statement.
Uncached lookups. Because the Informatica Server issues a SELECT statement for each row passing into the Lookup
transformation, you can improve performance by indexing the columns in the lookup condition.
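For example (a sketch; the index, table and column names are hypothetical), you might index the lookup condition column in the database like this:
create index idx_dim_customer_lkp on dim_customer (customer_id);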
Improving Performance at Repository level
Tuning Repository Performance
The PowerMart and PowerCenter repository has more than 80 tables and almost all tables use one or more indexes to
speed up queries. Most databases keep and use column distribution statistics to determine which index to use to
execute SQL queries optimally. Database servers do not update these statistics continuously.
In frequently-used repositories, these statistics can become outdated very quickly and SQL query optimizers may
choose a less than optimal query plan. In large repositories, the impact of choosing a sub-optimal query plan can affect
performance drastically. Over time, the repository becomes slower and slower.
To optimize SQL queries, you might update these statistics regularly. The frequency of updating statistics depends on
how heavily the repository is used. Updating statistics is done table by table. The database administrator can create
scripts to automate the task.
You can use the following information to generate scripts to update distribution statistics.
Note: All PowerMart/PowerCenter repository tables and index names begin with OPB_.
Oracle Database
You can generate scripts to update distribution statistics for an Oracle repository.
To generate scripts for an Oracle repository:
1. Run the following queries:
select 'analyze table ', table_name, ' compute statistics;' from user_tables where table_name like 'OPB_%'
select 'analyze index ', INDEX_NAME, ' compute statistics;' from user_indexes where INDEX_NAME like
'OPB_%'
This produces an output like the following:
'ANALYZETABLE'   TABLE_NAME          'COMPUTESTATISTICS;'
--------------   -----------------   --------------------
analyze table    OPB_ANALYZE_DEP     compute statistics;
analyze table    OPB_ATTR            compute statistics;
analyze table    OPB_BATCH_OBJECT    compute statistics;
2. Save the output to a file.
3. Edit the file and remove the header information.
4. Run this as an SQL script. This updates repository table statistics.
Microsoft SQL Server
You can generate scripts to update distribution statistics for a Microsoft SQL Server repository.
To generate scripts for a Microsoft SQL Server repository:
1. Run the following query:
select 'update statistics ', name from sysobjects where name like 'OPB_%'
This produces an output like the following:
name
------------------ ------------------
update statistics  OPB_ANALYZE_DEP
update statistics  OPB_ATTR
update statistics  OPB_BATCH_OBJECT
2. Save the output to a file.
3. Edit the file and remove the header information.
Headers are like the following:
name
------------------ ------------------
4. Add a 'go' at the end of the file.
5. Run this as a SQL script. This updates repository table statistics.
Session setting      Default Value              Minimum Suggested Value   Maximum Suggested Value
DTM Buffer Pool      12,000,000 bytes [12 MB]   6,000,000 bytes           128,000,000 bytes
Buffer block size    64,000 bytes [64 KB]       4,000 bytes               128,000 bytes
Index cache          1,000,000 bytes            1,000,000 bytes           12,000,000 bytes
Data cache           2,000,000 bytes            2,000,000 bytes           24,000,000 bytes
Commit interval      10,000 rows                N/A                       N/A
Decimal arithmetic   Disabled                   N/A                       N/A
Tracing Level        Normal                     Terse                     N/A
How to correct and load the rejected files when the session completes
During a session, the Informatica Server creates a reject file for each target instance in the mapping. If the writer or the
target rejects data, the Informatica Server writes the rejected row into the reject file. By default, the Informatica Server
creates reject files in the $PMBadFileDir server variable directory.
The reject file and session log contain information that helps you determine the cause of the reject. You can correct
reject files and load them to relational targets using the Informatica reject loader utility. The reject loader also creates
another reject file for the data that the writer or target reject during the reject loading.
Complete the following tasks to load reject data into the target:
NOTE: You cannot load rejected data into a flat file target
After you locate a reject file, you can read it using a text editor that supports the reject file code page.
Reject files contain rows of data rejected by the writer or the target database. Though the Informatica Server writes the
entire row in the reject file, the problem generally centers on one column within the row. To help you determine which
column caused the row to be rejected, the Informatica Server adds row and column indicators to give you more
information about each column:
Row indicator. The first column in each row of the reject file is the row indicator. The numeric indicator tells
whether the row was marked for insert, update, delete, or reject.
Column indicator. Column indicators appear after every column of data. The alphabetical character indicators
tell whether the data was valid, overflow, null, or truncated.
The following sample reject file shows the row and column indicators:
3,D,1,D,,D,0,D,1094945255,D,0.00,D,-0.00,D
0,D,1,D,April,D,1997,D,1,D,-1364.22,D,-1364.22,D
0,D,1,D,April,D,2000,D,1,D,2560974.96,D,2560974.96,D
3,D,1,D,April,D,2000,D,0,D,0.00,D,0.00,D
0,D,1,D,August,D,1997,D,2,D,2283.76,D,4567.53,D
0,D,3,D,December,D,1999,D,1,D,273825.03,D,273825.03,D
0,D,1,D,September,D,1997,D,1,D,0.00,D,0.00,D
Row Indicators
The first column in the reject file is the row indicator. The number listed as the row indicator tells the writer what to do
with the row of data.
Table 15-1 describes the row indicators in a reject file:
Table 15-1. Row Indicators in Reject File
Row Indicator   Meaning   Rejected By
0               Insert    Writer or target
1               Update    Writer or target
2               Delete    Writer or target
3               Reject    Writer
If a row indicator is 3, the writer rejected the row because an update strategy expression marked it for reject.
If a row indicator is 0, 1, or 2, either the writer or the target database rejected the row. To narrow down the reason why
rows marked 0, 1, or 2 were rejected, review the column indicators and consult the session log.
Column Indicators
After the row indicator is a column indicator, followed by the first column of data, and another column indicator. Column
indicators appear after every column of data and define the type of the data preceding it.
Column Indicator - Type of data - Writer Treats As
D - Valid data. - Good data.
O - Overflow. Numeric data exceeded the specified precision or scale for the column. - Bad data, if you configured the mapping target to reject overflow or truncated data.
T - Truncated. String data exceeded a specified precision for the column, so the Informatica Server truncated it. - Bad data, if you configured the mapping target to reject overflow or truncated data.
After you correct the target data in each of the reject files, append .in to each reject file you want to load into the
target database. For example, after you correct the reject file, t_AvgSales_1.bad, you can rename it
t_AvgSales_1.bad.in.
After you correct the reject file and rename it to reject_file.in, you can use the reject loader to send those files through
the writer to the target database.
Use the reject loader utility from the command line to load rejected files into target tables. The syntax for reject loading
differs on UNIX and Windows NT/2000 platforms.
Use the following syntax for UNIX:
pmrejldr pmserver.cfg [folder_name:]session_name
Use the following syntax for Windows NT/2000:
pmrejldr [folder_name:]session_name
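For example, on UNIX (a sketch; the folder and session names are hypothetical):
pmrejldr pmserver.cfg myfolder:s_m_load_customers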
Recovering Sessions
If you stop a session or if an error causes a session to stop, refer to the session and error logs to determine the cause
of failure. Correct the errors, and then complete the session. The method you use to complete the session depends on
the properties of the mapping, session, and Informatica Server configuration.
Use one of the following methods to complete the session:
Run the session again if the Informatica Server has not issued a commit.
Truncate the target tables and run the session again if the session is not recoverable.
Consider performing recovery if the Informatica Server has issued at least one commit.
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row
ID of the last row committed to the target database. The Informatica Server then reads all sources again and starts
processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails,
when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001. The
commit point may be different for source- and target-based commits.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica
Server setup before you run a session so the Informatica Server can create and/or write entries in the
OPB_SRVR_RECOVERY table.
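The recovery behaviour described above amounts to re-reading the source and bypassing everything at or below the last committed row ID. The following Python sketch only simulates that idea and is not Informatica code; the row IDs and the load callback are illustrative.

    # Simulation of the recovery behaviour described above: skip every source
    # row whose ID is <= the last committed row ID, then resume loading.
    def recover_load(source_rows, last_committed_row_id, load):
        """source_rows yields (row_id, row); load writes one row to the target."""
        for row_id, row in source_rows:
            if row_id <= last_committed_row_id:
                continue  # committed before the failure; bypass it
            load(row)

    # Example: 10,000 rows committed before the failure -> resume at row 10,001.
    rows = ((i, {"id": i}) for i in range(1, 20001))
    loaded = []
    recover_load(rows, last_committed_row_id=10000, load=loaded.append)
    print(len(loaded), loaded[0]["id"])  # 10000 10001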
Causes for Session Failure
Reader errors. Errors encountered by the Informatica Server while reading the source database or source files.
Reader threshold errors can include alignment errors while running a session in Unicode mode.
Writer errors. Errors encountered by the Informatica Server while writing to the target database or target files.
Writer threshold errors can include key constraint violations, loading nulls into a not null field, and database
trigger responses.
Transformation errors. Errors encountered by the Informatica Server while transforming data. Transformation
threshold errors can include conversion errors, and any condition set up as an ERROR, such as null input.
Fatal Error
A fatal error occurs when the Informatica Server cannot access the source, target, or repository. This can include loss of
connection or target database errors, such as lack of database space to load data. If a fatal error occurs in a session that uses a Normalizer or Sequence Generator transformation, the Informatica Server cannot update the sequence values in the repository.
What is target load order?
You specify the target load order based on source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica Server loads data into the targets.
Can we use aggregator/active transformation after update strategy transformation?
You can use an Aggregator after an Update Strategy transformation. The problem is that once you perform the update strategy, say you have flagged some rows for delete, those rows still flow into the downstream Aggregator; if you use the SUM function, the rows flagged for delete are subtracted from the aggregate result (see the sketch below).
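As a plain-Python illustration of the effect described above (this only mimics the behaviour stated in the answer; it is not Informatica code):

    # Rows flagged for delete by the Update Strategy are subtracted when a
    # downstream Aggregator computes SUM, per the behaviour described above.
    rows = [("insert", 10), ("insert", 20), ("delete", 30)]
    total = sum(v if flag == "insert" else -v for flag, v in rows)
    print(total)  # 0, not the 60 you might expect from adding every value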
How can we join the tables if the tables have no primary and foreign key relation and no matching port to join?
Without a common column or a common data type, we can join two sources using dummy ports:
1. Add a dummy port to each source.
2. In an Expression transformation, assign the constant '1' to each dummy port.
3. Use a Joiner transformation to join the two sources on the dummy ports (set the join condition accordingly).
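Conceptually, giving both sources the same constant dummy value and joining on it produces the cross product of the two row sets. A rough Python equivalent of that idea (not an Informatica mapping; all names are illustrative):

    # Rough equivalent of the dummy-port join: give every row on both sides the
    # same constant key, then join on it, which yields the cross product.
    def add_dummy_port(rows, value=1):
        return [dict(row, dummy=value) for row in rows]

    def join_on_dummy(left, right):
        return [dict(l, **{f"r_{k}": v for k, v in r.items() if k != "dummy"})
                for l in left for r in right if l["dummy"] == r["dummy"]]

    src_a = add_dummy_port([{"name": "A"}, {"name": "B"}])
    src_b = add_dummy_port([{"qty": 10}, {"qty": 20}])
    print(join_on_dummy(src_a, src_b))  # 2 x 2 = 4 joined rows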
In which circumstances does the Informatica server create reject files?
- When it encounters DD_Reject in an Update Strategy transformation.
- When a row violates a database constraint.
- When a field in a row is truncated or overflowed.
When do we use a dynamic cache and when do we use a static cache in connected and unconnected lookup transformations?
We use a dynamic cache only for a connected lookup. We use a dynamic cache to check whether the record already exists in the target table or not, and depending on that we insert, update, or delete the records using an Update Strategy. Static cache is the default cache in both connected and unconnected lookups. If you select a static cache on the lookup table, Informatica will not update the cache and the rows in the cache remain constant. We use this to check the results and also to update slowly changing records.
How to get two targets, T1 containing distinct values and T2 containing duplicate values, from one source S1?
Load T1 through a transformation that removes duplicates, such as a Sorter with the Distinct option or an Aggregator grouping on all ports, and load T2 directly from the source so that it keeps the duplicate values.
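As a plain-Python illustration of the split (the mapping itself would use the transformations described above; here T2 receives the extra occurrences of repeated values):

    # One pass over the source: first occurrences go to T1 (distinct values),
    # repeated occurrences go to T2 (duplicates). Illustrative only.
    def split_distinct_and_duplicates(source_rows):
        seen = set()
        t1_distinct, t2_duplicates = [], []
        for row in source_rows:
            if row in seen:
                t2_duplicates.append(row)  # already seen -> duplicate occurrence
            else:
                seen.add(row)
                t1_distinct.append(row)    # first occurrence -> distinct target
        return t1_distinct, t2_duplicates

    t1, t2 = split_distinct_and_duplicates(["A", "B", "A", "C", "B", "A"])
    print(t1)  # ['A', 'B', 'C']
    print(t2)  # ['A', 'B', 'A']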
How to delete duplicate rows in flat file sources? Is there any option in Informatica?
Use a Sorter transformation; it has a "Distinct" option, make use of it.
Why did you use Update Strategy in your application?
Update Strategy is used to drive the data to be inserted, updated, or deleted depending on some condition. You can do this at the session level too, but there you cannot define any condition. For example, if you want to do both an update and an insert in one mapping, you create two flows and make one insert and one update depending on some condition. Refer to Update Strategy in the Transformation Guide for more information.
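As a rough illustration of the insert-or-update flow mentioned above, the row-level flagging logic is analogous to an Update Strategy expression such as IIF(ISNULL(existing_key), DD_INSERT, DD_UPDATE). The Python below only simulates that decision; the lookup of existing keys and all names are illustrative, not Informatica code.

    # Simulated row flagging: insert if the key is new, update if it already
    # exists in the target.
    DD_INSERT, DD_UPDATE = "insert", "update"

    def flag_rows(source_rows, existing_keys):
        flagged = []
        for row in source_rows:
            flag = DD_INSERT if row["id"] not in existing_keys else DD_UPDATE
            flagged.append((flag, row))
        return flagged

    existing = {1, 2}  # keys already present in the target
    incoming = [{"id": 1, "v": "x"}, {"id": 3, "v": "y"}]
    print(flag_rows(incoming, existing))  # id 1 -> update, id 3 -> insert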
What are the options in the target session of the Update Strategy transformation?
Update as Update:
Update records from the source update the matching rows in the target (the default).
Update as Insert:
This option specifies that all the update records from the source are flagged as inserts in the target. In other words, instead of updating the records in the target, they are inserted as new records.
Update else Insert:
This option enables Informatica to flag the records either for update if they already exist, or for insert if they are new records from the source.
What are the different types of Type 2 dimension mapping?
Type 2:
1. Version number
2. Flag
3. Date (effective date range)
What are the basic needs to join two sources in a source qualifier?
Two sources should have a primary and foreign key relationship.
Two sources should have matching data types.
What are the different options used to configure the sequential batches?
Two options:
- Run the session only if the previous session completes successfully.
- Always run the session.
What are conformed dimensions?
A data warehouse must provide consistent information for queries requesting
similar information. One method to maintain consistency is to create dimension
tables that are shared (and therefore conformed), and used by all applications
and data marts (dimensional models) in the data warehouse. Candidates for
shared or conformed dimensions include customers, time, products, and
geographical dimensions, such as the store dimension.
What are conformed facts?
Fact conformation means that if two facts exist in two separate locations, then
they must have the same name and definition. As examples, revenue and profit
are each facts that must be conformed. By conforming a fact, then all business
processes agree on one common definition for the revenue and profit measures.
Then, revenue and profit, even when taken from separate fact tables, can be
mathematically combined.
Establishing conformity
Developing a set of shared, conformed dimensions is a significant challenge. Any
dimensions that are common across the business processes must represent the
dimension information in the same way. That is, it must be conformed. Each
business process will typically have its own schema that contains a fact table,
several conforming dimension tables, and dimension tables unique to the
specific business function. The same is true for facts.
Degenerate dimensions
Before we discuss degenerate dimensions in detail, it is important to understand
the following:
A fact table may consist of the following data:
- Foreign keys to dimension tables
- Facts, which may be:
  - Additive
  - Semi-additive
  - Non-additive
  - Pseudo facts (such as 1 and 0 in the case of attendance tracking)
  - Textual facts (rarely the case)
  - Derived facts
  - Year-to-date facts
- Degenerate dimensions (one or more)
What is a degenerate dimension?
A degenerate dimension sounds a bit strange, but it is a dimension without
attributes. It is a transaction-based number which resides in the fact table. There
may be more than one degenerate dimension inside a fact table.
Identifying garbage dimensions
A garbage dimension is a dimension that consists of low-cardinality columns
such as codes, indicators, and status flags. The garbage dimension is also
referred to as a junk dimension. The attributes in a garbage dimension are not
related to any hierarchy.
Non-additive facts
Non-additive facts are facts which cannot be added meaningfully across any dimension. Examples include:
- Textual facts: adding textual facts does not result in any number; however, counting textual facts may result in a sensible number.
- Per-unit prices: adding unit prices does not produce any meaningful number.
- Percentages and ratios: adding percentages or ratios does not produce a meaningful result.
- Measures of intensity: measures of intensity, such as room temperature, cannot be added meaningfully across dimensions.
- Averages: adding averages does not produce a meaningful result.
Semi-additive facts
Semi-additive facts are facts which can be summarized across some dimensions but not others. Examples of semi-additive facts include the following:
- Account balances
- Quantity on hand
For example, adding an account balance across the different days of January results in an incorrect balance figure. However, if we average the account balance across each day of the month to find the daily average balance, the result is valid.
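A quick numeric check of the account-balance example, with made-up figures:

    # Made-up daily balances for one account over three days of January.
    daily_balances = [100.0, 120.0, 80.0]
    print(sum(daily_balances))                        # 300.0 -- not a real balance
    print(sum(daily_balances) / len(daily_balances))  # 100.0 -- valid daily average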
event-based fact tables
Event fact tables are tables that record events. For example, event fact tables are used to record events such as Web
page clicks and employee or student attendance. Events, such as a Web user clicking on a Web page of a Web site, do
not always result in facts. In other words, millions of such Web page click
events do not always result in sales. If we are interested in handling such event-based scenarios where there are no
facts, we use event fact tables, which either consist of pseudo facts or contain no facts at all (factless fact tables).
From a conceptual perspective, the event-based fact tables capture the
many-to-many relationships between the dimension tables.
Q. What type of repositories can be created using Informatica Repository Manager?
A. Informatica PowerCenter includes the following types of repositories:
Standalone Repository : A repository that functions individually and is unrelated to any other repository.
Global Repository : A centralized repository in a domain. This repository can contain objects shared across the repositories in the domain. The objects are shared through global shortcuts.
Local Repository : A local repository is within a domain and is not a global repository. A local repository can connect to a global repository using global shortcuts and can use objects in its shared folders.
Versioned Repository : This can be either a local or a global repository, but it allows version control for the repository. A versioned repository can store multiple copies, or versions, of an object. This feature allows you to efficiently develop, test, and deploy metadata into the production environment.
When using a dynamic lookup with a WHERE clause in the SQL override, make sure that you add a filter before the lookup. The filter should remove rows which do not satisfy the WHERE clause.
Reason
During dynamic lookups, while inserting records into the cache the WHERE clause is not evaluated; only the join condition is evaluated. So the lookup cache and the lookup table fall out of sync: records satisfying only the join condition are inserted into the lookup cache. It is better to put a filter before the lookup that applies the WHERE clause condition, so that the cache contains only records satisfying both the join condition and the WHERE clause.
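A rough simulation of that advice in plain Python (not an Informatica mapping; the predicate, key, and data are illustrative): apply the same condition as the SQL override's WHERE clause before rows reach the dynamic lookup, so every row inserted into the cache also satisfies it.

    # The upstream filter applies the same predicate as the lookup's WHERE
    # clause, so rows inserted into the dynamic cache also satisfy it.
    def where_clause(row):
        return row["status"] == "ACTIVE"  # illustrative predicate

    def dynamic_lookup_insert(cache, row, key="id"):
        if row[key] not in cache:
            cache[row[key]] = row         # new row goes into the cache

    cache = {}
    source = [{"id": 1, "status": "ACTIVE"}, {"id": 2, "status": "CLOSED"}]
    for row in filter(where_clause, source):  # the filter before the lookup
        dynamic_lookup_insert(cache, row)
    print(cache)  # only the ACTIVE row was cached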
1. Difference between Filter and Router?
Filter
You can filter rows in a mapping with the Filter transformation. You pass all the rows from a source transformation through the Filter transformation, and then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.
Router
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
As an active transformation, the Router transformation may change the number of rows passed through it. In a Router we can have multiple conditions.
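A plain-Python analogy of the difference (illustrative only, not Informatica code): a filter keeps rows that satisfy one condition and drops the rest, while a router evaluates several conditions and can send rows that match none of them to a default group.

    # Filter: one condition, non-matching rows are dropped.
    def filter_rows(rows, condition):
        return [r for r in rows if condition(r)]

    # Router: several named conditions; rows matching none go to DEFAULT.
    def route_rows(rows, conditions):
        groups = {name: [] for name in conditions}
        groups["DEFAULT"] = []
        for r in rows:
            matched = False
            for name, predicate in conditions.items():
                if predicate(r):
                    groups[name].append(r)
                    matched = True
            if not matched:
                groups["DEFAULT"].append(r)
        return groups

    rows = [{"amount": 5}, {"amount": 50}, {"amount": 500}]
    print(filter_rows(rows, lambda r: r["amount"] > 100))
    print(route_rows(rows, {"small": lambda r: r["amount"] < 10,
                            "large": lambda r: r["amount"] > 100}))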
Why is it said that a dynamic cache cannot be used with a flat file lookup in Informatica?
Nothing gets updated in a dynamic cache other than the cache itself. What happens in the file is a matter of what your mapping does to it, not the cache.
A lookup (dynamic or otherwise) is loaded from a source. The source can be anything you have defined in your environment: flat file, table, whatever.
The difference between a dynamic and a static cache is that in a dynamic cache one of the columns in the source must be identified as the primary key (separate from the lookup key), and it must be numeric. The cache uses the values in that column to figure out what the new key should be when you insert a new row into the cache.
If your flat file does not have such a column, you cannot use it in a dynamic lookup.
Enable Test Load
You can configure the Integration Service to perform a test load. With a test load, the Integration Service reads and transforms data without writing to targets. The Integration Service generates all session files and performs all pre- and post-session functions, as if running the full session.
The Integration Service writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the Integration Service does not write data to the targets.
Number of Rows to Test
Enter the number of source rows you want to test in the Number of Rows to Test field.
You cannot perform a test load on sessions that use XML sources.
Note: You can perform a test load when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.