
Optimization of Lookup Transformation

DECLARATION

I hereby declare that this document is based on my personal experiences and/or the
experiences of my project members. To the best of my knowledge, this document
does not contain any material that infringes the copyrights of any other individual or
organization, including the customers of Infosys.

Balaji Subramanian
MHRSINFA Project
Date: 07th December, 2004

ACKNOWLEDGEMENTS

I take immense pleasure in expressing my gratitude to my Project Manager,
Mr. Lakshmi Narasimhan CR, under whose leadership we are executing the project
MHRSINFA.

Loads of thanks to my Module Leader, Mr. Tak Amit Ashok, who guided me in
completing this BOK.

I also take this opportunity to thank our teammates who provided many of the inputs
to create this BOK.

Thank You,

Balaji Subramanian

mailto: balaji_subramanian@infosys.com

INDEX

DECLARATION
ACKNOWLEDGEMENTS
1. LOOKUP TRANSFORMATION OVERVIEW
2. LOOKUP PROPERTIES
3. LOOKUP CACHE
4. LOOKUP TRANSFORMATION TIPS

1. LOOKUP TRANSFORMATION OVERVIEW

The Lookup transformation is used to look up data in a relational table, view, or
synonym. We can import a lookup definition from any relational database to which
both the Informatica Client and Server can connect. Lookup is a passive
transformation. We can use multiple Lookup transformations in a mapping.

The Informatica Server queries the lookup table based on the lookup ports in the
transformation. It compares Lookup transformation port values to lookup table
column values based on the lookup condition. We can configure the Lookup
transformation to perform different types of lookups. We can configure the
transformation to be connected or unconnected, cached or uncached.

Connected and Unconnected Lookups


Connected and unconnected lookup transformations receive input and send output in
different ways. We can configure a connected Lookup transformation to receive input
directly from the mapping pipeline, or we can configure an unconnected Lookup
transformation to receive input from the result of an expression in another
transformation.

  A connected lookup receives input values directly from the pipeline,
whereas an unconnected lookup receives input values from the result of a :LKP
expression in another transformation.
  A connected lookup can return multiple columns from the same row,
whereas an unconnected lookup can return only one column from each row.
  An unconnected Lookup transformation can be called more than once in a
mapping.
  If we need to return only one column from the table, an unconnected
Lookup transformation is usually the better choice, because we can decide
whether or not to look up a given row by wrapping the :LKP call in an IIF
condition in the Expression transformation that calls the lookup.
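
The conditional call described in the last point can be illustrated outside Informatica. The sketch below is a Python analogy, not Informatica expression syntax: the hypothetical function `lookup_dept_name` stands in for an unconnected lookup that returns one value, and the `if` test plays the role of the IIF condition. The table contents and all names are invented for illustration.

```python
# Hypothetical lookup table: department id -> department name.
DEPT_TABLE = {10: "Finance", 20: "HR", 30: "IT"}

lookup_calls = 0  # counts how often the lookup is actually invoked


def lookup_dept_name(dept_id):
    """Stand-in for an unconnected lookup: one input, one return value."""
    global lookup_calls
    lookup_calls += 1
    return DEPT_TABLE.get(dept_id)  # None when there is no match


def resolve_dept_name(row):
    """Call the lookup only when the row does not already carry a name."""
    if row["dept_name"] is None:
        return lookup_dept_name(row["dept_id"])
    return row["dept_name"]


rows = [
    {"dept_id": 10, "dept_name": None},
    {"dept_id": 20, "dept_name": "Human Resources"},
    {"dept_id": 99, "dept_name": None},
]
resolved = [resolve_dept_name(r) for r in rows]
```

Only the two rows without a name trigger the lookup; the row that already has a value skips it, which is the saving a conditional call is meant to provide.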

CACHED OR UNCACHED

We can configure a Lookup transformation to cache the lookup table. The Informatica
Server builds a cache in memory when it processes the first row of data in a cached
Lookup transformation. It allocates memory for the cache based on the amount we
configure in the transformation or session properties. The Informatica Server stores
condition values in the index cache and output values in the data cache. The
Informatica Server queries the cache for each row that enters the transformation.

2. LOOKUP PROPERTIES

Properties for the Lookup transformation identify the database source, how the
Informatica Server processes the transformation, and how it handles caching and
multiple matches.

The following are some of the lookup transformation properties.

LOOKUP SQL OVERRIDE

  Overrides the default SQL statement used to query the lookup table.
  Specifies the SQL statement that we want the Informatica Server to use
for querying lookup values.
  Use only with the lookup cache enabled.
  Enter only the SELECT, FROM, and WHERE clauses when entering the
SQL override.

LOOKUP CACHING ENABLED

Indicates whether the Informatica Server caches lookup values during the session.
When we enable lookup caching, the Informatica Server queries the lookup table
once, caches the values, and looks up values in the cache during the session. This
can improve session performance.

When we disable caching, each time a row passes into the transformation, the
Informatica Server issues a select statement to the lookup table for lookup values.
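
The difference between the two modes can be sketched with a small experiment. The example below is an analogy that uses Python's built-in `sqlite3` module rather than the Informatica Server; the `customer` table, its rows, and the function names are assumptions made up for illustration. It counts how many SELECT statements each approach issues for the same source rows.

```python
import sqlite3

# In-memory database standing in for the lookup table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meena")],
)

source_ids = [1, 2, 2, 3, 1, 4]  # lookup keys arriving from the source


def uncached_lookup(ids):
    """Issue one SELECT per source row, as with caching disabled."""
    queries = 0
    results = []
    for cust_id in ids:
        row = conn.execute(
            "SELECT cust_name FROM customer WHERE cust_id = ?", (cust_id,)
        ).fetchone()
        queries += 1
        results.append(row[0] if row else None)
    return results, queries


def cached_lookup(ids):
    """Issue one SELECT to build the cache, then look up in memory."""
    cache = dict(conn.execute("SELECT cust_id, cust_name FROM customer"))
    queries = 1
    return [cache.get(cust_id) for cust_id in ids], queries


uncached_results, uncached_queries = uncached_lookup(source_ids)
cached_results, cached_queries = cached_lookup(source_ids)
```

Both approaches return the same values; the uncached version issues one query per source row, while the cached version issues a single query regardless of how many rows pass through.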

LOOKUP DATA CACHE SIZE

Indicates the maximum size the Informatica Server allocates to the data cache in
memory. If the Informatica Server cannot allocate the configured amount of memory
when initializing the session, it fails the session. When the Informatica Server cannot
store all the data cache data in memory, it pages to disk as necessary.
  The Lookup Data Cache Size is 2,000,000 bytes by default. The minimum
size is 1,024 bytes. Use only with the lookup cache enabled.
  For optimal performance, the data cache size should be about
(total size of all fields) * (number of rows), so that the data cache stays
in memory instead of paging to disk.

LOOKUP INDEX CACHE SIZE

Indicates the maximum size the Informatica Server allocates to the index cache in
memory. If the Informatica Server cannot allocate the configured amount of memory
when initializing the session, it fails the session. When the Informatica Server cannot
store all the index cache data in memory, it pages to disk as necessary.
  The Lookup Index Cache Size is 1,000,000 bytes by default. The
minimum size is 1,024 bytes. Use only with the lookup cache enabled.
  For optimal performance, the index cache size should be about
(total size of condition fields) * (number of rows), so that the index cache
stays in memory instead of paging to disk.
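
The two sizing rules above are simple arithmetic. The sketch below applies them to a hypothetical lookup table; the field widths, the row count, and the function name are assumptions, and the result is only a rough estimate because it ignores any per-row overhead the server may add.

```python
# Default sizes quoted in this document, in bytes.
DEFAULT_INDEX_CACHE = 1_000_000
DEFAULT_DATA_CACHE = 2_000_000


def estimate_cache_sizes(condition_field_bytes, all_field_bytes, row_count):
    """Apply the rules of thumb: field widths (in bytes) times row count."""
    index_cache = sum(condition_field_bytes) * row_count
    data_cache = sum(all_field_bytes) * row_count
    return index_cache, data_cache


# Hypothetical lookup: one 8-byte key, plus 50-byte and 20-byte columns,
# over 100,000 rows.
index_bytes, data_bytes = estimate_cache_sizes([8], [8, 50, 20], 100_000)

index_fits_default = index_bytes <= DEFAULT_INDEX_CACHE
data_fits_default = data_bytes <= DEFAULT_DATA_CACHE
```

In this example the index cache estimate (800,000 bytes) fits within the default, while the data cache estimate (7,800,000 bytes) does not, so the data cache size would need to be raised to avoid paging.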

3. LOOKUP CACHE

The Informatica Server creates the cache files by default in the $PMCacheDir. If the
data does not fit in the memory cache, the Informatica Server stores the overflow
values in the cache files. When the session completes, the Informatica Server
releases cache memory and deletes the cache files unless we configure the Lookup
transformation to use a persistent cache.

When configuring a lookup cache, we can specify any of the following options:

PERSISTENT CACHE

If we want to save and reuse the cache files, we can configure the transformation to
use a persistent cache. Use a persistent cache when the lookup table does not
change between session runs. The first time the Informatica Server runs a session
using a persistent lookup cache, it saves the cache files to disk instead of deleting
them. The next time the Informatica Server runs the session, it builds the memory
cache from the cache files.

RECACHE FROM DATABASE

If the persistent cache is not synchronized with the lookup table, we can configure
the Lookup transformation to rebuild the lookup cache.
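
The persistent cache and recache behaviour described above can be sketched as follows. This is an analogy in plain Python, not the Informatica cache file format: it saves a dictionary with `pickle`, and the file name `lkp_customer.cache`, the `recache` flag, and the function names are hypothetical.

```python
import os
import pickle
import tempfile


def load_lookup_cache(cache_path, read_lookup_table, recache=False):
    """Return (cache, origin), where origin is 'file' or 'table'."""
    if not recache and os.path.exists(cache_path):
        # A saved cache exists: rebuild the memory cache from the file.
        with open(cache_path, "rb") as f:
            return pickle.load(f), "file"
    # First run, or an explicit recache: read the table and save the cache.
    cache = dict(read_lookup_table())
    with open(cache_path, "wb") as f:
        pickle.dump(cache, f)
    return cache, "table"


def read_lookup_table():
    """Hypothetical stand-in for querying the lookup table."""
    return [(1, "Asha"), (2, "Ravi")]


cache_path = os.path.join(tempfile.mkdtemp(), "lkp_customer.cache")

first_cache, first_origin = load_lookup_cache(cache_path, read_lookup_table)
second_cache, second_origin = load_lookup_cache(cache_path, read_lookup_table)
third_cache, third_origin = load_lookup_cache(
    cache_path, read_lookup_table, recache=True
)
```

The first run reads the table and saves the file, the second run loads the saved file without touching the table, and the third run forces a rebuild from the table, as one would when the saved cache is out of step with the lookup table.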

STATIC CACHE

We can configure a static, or read-only, cache for any lookup table. By default, the
Informatica Server creates a static cache. It caches the lookup table and looks up
values in the cache for each row that comes into the transformation. When the
lookup condition is true, the Informatica Server returns a value from the lookup
cache. The Informatica Server does not update the cache while it processes the
Lookup transformation.

DYNAMIC CACHE

We can configure the Lookup transformation to use a dynamic cache when the target
table is also the lookup table. When we use a dynamic cache, the Informatica Server
updates the lookup cache as it passes rows to the target.
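
The practical difference between a static and a dynamic cache shows up when the source contains a new key more than once. The sketch below is a simplified Python analogy that handles inserts only; the function names and data are hypothetical and do not reflect how the server implements its cache.

```python
def process_with_static_cache(cache, source_rows):
    """Read-only cache: rows not found are sent to the target as inserts."""
    return [(key, value) for key, value in source_rows if key not in cache]


def process_with_dynamic_cache(cache, source_rows):
    """Cache is updated as rows pass to the target, so repeats are caught."""
    target_inserts = []
    for key, value in source_rows:
        if key not in cache:
            cache[key] = value  # keep the cache in step with the target
            target_inserts.append((key, value))
    return target_inserts


source_rows = [(1, "Asha"), (2, "Ravi"), (2, "Ravi"), (3, "Meena")]

static_inserts = process_with_static_cache({1: "Asha"}, source_rows)
dynamic_inserts = process_with_dynamic_cache({1: "Asha"}, source_rows)
```

With the static cache, key 2 is treated as new both times it appears, so the target would receive a duplicate insert; with the dynamic cache, the second occurrence is found in the cache and skipped.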

SHARED CACHE

We can configure multiple Lookup transformations to share a single lookup cache.
The Informatica Server builds the cache when it processes the first Lookup
transformation. It uses the same cache to perform lookups for subsequent Lookup
transformations that share the cache.

We can share an unnamed cache between transformations in the same mapping. We
can share a named cache between transformations in the same or different
mappings.

4. LOOKUP TRANSFORMATION TIPS

  If we include more than one lookup condition, place the conditions with
an equal sign first to optimize lookup performance.

  We can increase performance by overriding the default ORDER BY
statement with an ORDER BY statement with fewer columns. By default, the
Informatica Server generates an ORDER BY statement for a cached lookup
that contains all lookup ports. To increase performance, we can suppress the
default ORDER BY statement and enter an override ORDER BY with fewer
columns.

  For optimal performance, the default buffer block size should be
(total size of all fields) * (20 to 100).

  Add an index to the columns used in a lookup condition.


  Cache small lookup tables. The result of the lookup query and processing
is the same, whether or not we cache the lookup table.

  Use a persistent lookup cache for static lookup tables. If the lookup
table does not change between sessions, configure the Lookup transformation
to use a persistent lookup cache. The Informatica Server then saves and
reuses cache files from session to session, eliminating the time required to
read the lookup table.

  When the source is large, cache lookup table columns for lookup tables
of 500,000 rows or fewer.

  Cache lookup tables only if the number of lookup calls is more than
10-20% of the lookup table rows.

  For small lookup tables (fewer than 5,000 rows), cache if there are more
than 5-10 lookup calls.
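
The three caching rules of thumb above can be combined into one decision helper. The sketch below is one possible reading, not an official formula: the function name and parameters are hypothetical, it uses the lower end of each quoted range (10% and 5 calls), and it assumes that tables above 500,000 rows are left uncached, which the tips imply but do not state outright.

```python
def should_cache(
    lookup_rows,
    lookup_calls,
    call_ratio=0.10,          # lower end of the 10-20% guideline
    small_table_rows=5_000,   # "small" lookup table threshold
    small_table_calls=5,      # lower end of the 5-10 calls guideline
    max_cached_rows=500_000,  # upper size guideline for cached tables
):
    """Return True when the rules of thumb suggest caching the lookup."""
    if lookup_rows < small_table_rows:
        # Small tables: worth caching after only a handful of calls.
        return lookup_calls > small_table_calls
    if lookup_rows > max_cached_rows:
        # Assumption: very large tables are not cached.
        return False
    # Otherwise cache only when calls exceed the given share of table rows.
    return lookup_calls > call_ratio * lookup_rows


decisions = {
    "small table, 3 calls": should_cache(1_000, 3),
    "small table, 50 calls": should_cache(1_000, 50),
    "100k rows, 5k calls": should_cache(100_000, 5_000),
    "100k rows, 20k calls": should_cache(100_000, 20_000),
    "2M rows, 1M calls": should_cache(2_000_000, 1_000_000),
}
```

The thresholds are exposed as parameters so that they can be tuned to the upper ends of the ranges, or to values measured on a particular project.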
