
Informatica Performance Tuning
Performance Tuning Methodology
• It is an iterative process – Take measurements → Analyze → Make one adjustment → Take measurements

• Quit after the point of diminishing returns

• Overall plan
Establish benchmark → Optimize memory → Isolate bottleneck → Eliminate bottleneck → Take advantage of underutilized CPU and memory
The Tuning Environment
• Hardware (CPU bandwidth, RAM, disk space,
etc.) should be similar to production
• Database configuration should be similar to
production
• Data volume and characteristics should be similar to
production
• Challenge: production data is constantly
changing
Optimal tuning may be data dependent
Estimate “average” behavior
Estimate “worst case” behavior
Preliminary steps
• Eliminate transformation errors & data rejects
• Override tracing level to terse or normal
• Source row logging requires the reader to hold onto buffers until the data is written to the target – EVEN IF THERE ARE NO ERRORS
Identify the bottleneck
• Target
• Source
• Transformations
• Mapping/Session
Thread Statistics in Session Log
Before Tuning
Thread [READER_1_2_1] created for [the read stage] of partition point [SQ_NON_SEARCH] has completed.
Total Run Time = [1467.493398] secs
Total Idle Time = [1367.666442] secs
Busy Percentage = [6.802549]
Thread [TRANSF_1_2_1] created for [the transformation stage] of partition point [SQ_NON_SEARCH] has completed.
Total Run Time = [1379.375692] secs
Total Idle Time = [464.380539] secs
Busy Percentage = [66.334006]
Thread work time breakdown:
Union: 0.145138 percent
EXP_NON_SEARCH: 99.854862 percent

After Tuning
Thread [READER_1_3_1] created for [the read stage] of partition point [SQ_NON_SEARCH] has completed.
Total Run Time = [461.219082] secs
Total Idle Time = [412.240083] secs
Busy Percentage = [10.619465]
Thread [TRANSF_1_3_1] created for [the transformation stage] of partition point [SQ_NON_SEARCH] has completed.
Total Run Time = [421.306325] secs
Total Idle Time = [212.582643] secs
Busy Percentage = [49.542024]
Thread work time breakdown:
Union: 44.549763 percent
AGG: 16.587678 percent
LKP_NON_SRCH: 38.862559 percent
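As a cross-check, the busy percentage is simply the share of run time a thread was not idle:
Busy% = (Total Run Time − Total Idle Time) / Total Run Time × 100
For the reader thread before tuning: (1467.49 − 1367.67) / 1467.49 × 100 ≈ 6.80, which matches the logged 6.802549. The breakdowns also show the work that was concentrated in EXP_NON_SEARCH before tuning is spread across the Union, Aggregator and Lookup afterwards.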
Collect performance data
Performance Counters in WF Monitor
Target Bottleneck
[Diagram: the four pipeline stages – Reader Thread (first stage), Transformation Thread (second stage), Transformation Thread (third stage), Writer Thread (fourth stage) – with their Busy%. The writer thread shows Busy% = 95 against Busy% = 15 for the preceding transformation stage, indicating a target bottleneck.]
Other Methods of Bottleneck Isolation
• Write to flat file
If significantly faster than the relational target – Target Bottleneck
• Place a FALSE Filter right after the Source Qualifier
If significantly faster – Transformation Bottleneck
• If target & transformation bottlenecks are ruled out – Source Bottleneck
• Use a read test mapping to confirm a Source Bottleneck
Remove the transformations and check whether session performance is the same.
Target Optimization
• Target Optimization often involves non-Informatica
components
• Drop Indexes and Constraints (see the sketch after this list)
Use pre/post SQL to drop and rebuild
Use pre/post-load stored procedures
• Use constraint-based loading only when necessary
• Use Bulk Loading
Informatica bypasses the database log
Target cannot perform rollback
Weigh importance of performance over recovery
• Use External Loader
Similar to bulk loader, but the DB reads from a flat file
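A minimal sketch of the pre-/post-session SQL approach, assuming an Oracle-style relational target; the table and index names (T_SALES, IDX_SALES_CUST) are hypothetical:

-- Pre-session SQL: drop the index so the load is not slowed by index maintenance
DROP INDEX IDX_SALES_CUST;
-- Post-session SQL: rebuild the index once the load completes
CREATE INDEX IDX_SALES_CUST ON T_SALES (CUSTOMER_ID);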
Source Bottlenecks
• Source optimization often involves non-Informatica
components
• Generated SQL is available in the session log (see the sketch after this list)
Execute it directly against the DB
Update statistics on the DB
Use the tuned SELECT as the SQL override
• Set the Line Sequential Buffer Length session
property to correspond with the record size
• Avoid reading same data more than once
• Filter at source if possible (reduce data set)
• Minimize connected outputs from the source qualifier
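A minimal sketch of a tuned SELECT used as the SQ override, with hypothetical table and column names (ORDERS); only the needed columns are read and the filter is applied at the source:

-- Select only the connected ports and push the filter to the database
SELECT ORDER_ID, CUSTOMER_ID, ORDER_AMT
FROM ORDERS
WHERE ORDER_STATUS = 'SHIPPED'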
Tuning Mapping Design
Basic guidelines
• Consistency - Naming conventions, Descriptions,
environments, documentation
• Modularity – Modular design, common error
handling, reprocessing
• Reusability – shortcuts, mapplets
• Scalability – caching, queries, partitioning, reduce
data set (reduce ports and rows).
• Simplicity – multiple simple processes, simple
queries, staging table.
Sources and Targets
• Use shortcuts
• Extract only what is necessary
• Limit reads on source
• Distinguish between similar sources and targets
• When updating, update only non-key columns on the target.
Source Qualifier
• Use the default query when possible.
• SQL override
Pros – utilizes the database optimizer, can accommodate complex queries
Cons – impacts database resources, cannot utilize partitioning, cannot utilize the pushdown optimization option, and the transformation logic is lost from metadata searches
• SQ can be used as a lookup (see the sketch below).
Tip – put a copy of the override query in the description to avoid losing it when pressing 'Generate SQL Query'
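A minimal sketch of an SQ override that does a lookup's work as a source-side join, assuming both tables sit in the same relational source; the names (ORDERS, CUSTOMERS) are hypothetical:

-- The join returns the "lookup" column directly, avoiding a Lookup transformation and its cache
SELECT O.ORDER_ID, O.ORDER_AMT, C.CUSTOMER_NAME
FROM ORDERS O
JOIN CUSTOMERS C ON C.CUSTOMER_ID = O.CUSTOMER_ID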
Transformations
• Calculate once, use many times
• Filter as early as possible
• Reduce data for transformations with caches.
• Avoid data type conversions – they are expensive
• Reduce coding outside Informatica
• Don't enable high precision for decimal data unless it is needed.
Expressions
• Functions are more expensive than operators
Use || instead of CONCAT()
• Use variable ports to factor out common logic
• Simplify nested functions when possible
Try DECODE instead of IIF (see the sketch below)
• Provide comments in the expression editor.
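As an illustration (Informatica expression syntax; the port name and values are hypothetical), a nested IIF and the flatter DECODE that replaces it:

Nested IIF: IIF(STATUS = 'A', 'Active', IIF(STATUS = 'I', 'Inactive', 'Unknown'))
DECODE:     DECODE(STATUS, 'A', 'Active', 'I', 'Inactive', 'Unknown')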
Filters
• Consider SQ as a filter to limit rows within relational
sources
• Filter close to source
• Replace multiple Filters with a Router
Aggregators
• Use sorted input
• Limit connected input/output ports
• Filter data before aggregating
• Use as early as possible
Joiners
• Perform joins in the SQ when possible (relational sources)
• Perform normal joins
• Join sorted input
• Designate the source with fewer rows as the master
Lookups
• Relational lookups should return only the ports that meet the condition
• Use an unconnected lookup when only one port needs to be returned
• Use a SQL override in the lookup (comment out the generated ORDER BY – see the sketch below)
• Replace large lookup tables with joins in the SQ when possible.
• Use the SQ as the lookup table.
• Use a persistent cache to save lookup cache files for re-use.
• Use the cache calculator in the session for large data volumes.
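A minimal sketch of a lookup SQL override, with a hypothetical CUSTOMERS table. The Integration Service appends its own ORDER BY on all lookup ports when it builds the cache; supplying a narrower ORDER BY and ending the override with '--' comments the generated clause out:

SELECT CUSTOMER_ID, CUSTOMER_NAME
FROM CUSTOMERS
ORDER BY CUSTOMER_ID --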
Anatomy of a session
Memory optimization
Reader Bottleneck
Transformer Bottleneck
Writer Bottleneck
Tuning the DTM Buffer
• Buffer block size
Recommendation: at least 100 rows / block
Compute based on largest source or target row size
Typically not a significant bottleneck unless below 10
rows/buffer
• Number of blocks
Minimum of 2 blocks required for each source, target
and XML group
(number of blocks) = 0.9 × (DTM buffer size) / (buffer block size)
Contd.
• Determine the minimum DTM buffer size (see the worked example below)
(DTM buffer size) = (buffer block size) × (minimum number of blocks) / 0.9
• Increase by a multiple of the block size
• If performance does not improve, return to previous
setting
• There is no “formula” for optimal DTM buffer size
• Auto setting may be adequate for some sessions
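A worked example of the sizing formula with hypothetical numbers: a 64 KB buffer block size and a mapping with two sources and one target (2 blocks each, so a minimum of 6 blocks):

(DTM buffer size) = 64 KB × 6 / 0.9 ≈ 427 KB

From that floor, increase the DTM buffer in multiples of the block size and re-measure.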
Transformation Caches
• Temporary storage area for certain transformations
• Except for Sorter, each is divided into a Data & Index
Cache
• The size of each transformation cache is tunable
• If runtime cache requirement > setting, overflow
written to disk
• The default setting for each cache is Auto
Tuning the Transformation Caches
• If a cache setting is too small, DTM writes overflow to
disk
• Determine if transformation caches are overflowing:
Watch the cache directory on the file
system while the session runs
Use the session performance counters
• Options to tune:
• Increase the maximum memory allowed for Auto
transformation cache sizes
• Set the cache sizes for individual transformations
manually
Performance Counters
Tuning the Transformation Caches
• Non-zero counts for the readfromdisk and writetodisk counters indicate sub-optimal settings for the transformation index or data caches
• This may indicate the need to tune transformation
caches manually
• Any manual setting allocates memory outside of
previously set maximum
• Cache Calculators provide guidance in manual tuning
of transformation caches
Aggregator Caches
• Unsorted Input
• Must read all input before releasing any output
rows
• Index cache contains group keys
• Data cache contains non-group-by ports
• Sorted Input
• Releases output rows as each input group is processed
• Does not require a data or index cache (both = 0)
• May run much faster than unsorted BUT you must consider the expense of sorting (see the sketch below)
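One way to provide the sorted input (a sketch; the SALES table and ports are hypothetical) is to push the sort into the SQ override, with the ORDER BY matching the Aggregator's group-by ports in order:

SELECT REGION, PRODUCT, SALE_AMT
FROM SALES
ORDER BY REGION, PRODUCT  -- must match the group-by ports of the Aggregator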
Joiner Caches: Unsorted Input
Lookup Caches
• To cache or not to cache?
• Large number of invocations – cache
• Large lookup table – don't cache
• Flat file lookup is always cached
• Data cache
• Only connected output ports included in data cache
• For unconnected lookup, only “return” port
included in data cache
• Index cache size
• Only lookup keys included in index cache
Lookup Caches
• Lookup Transformation – Fine-tuning the Cache
• SQL override
• Persistent cache (if the lookup data is static)
• Optimize the sort
• Default – lookup keys, then connected output ports, in port order
• Can be commented out or overridden in the SQL override
• Indexing strategy on the table may impact performance
• The 'Use Any Value' property suppresses the sort
• Can build lookup caches concurrently
• May improve session performance when there is significant activity
upstream from the lookup & the lookup cache is large
• This option applies to the individual session
Performance Tuning features
Pushdown optimization
• Push transformation logic to the source or target database.
• Executes SQL against the source or target database instead of processing the transformation logic within the Integration Service
Recommendations:
• Use when there is a large mismatch between the processing power of the Informatica server and the database server.
• Some transformations can never be 'pushed down' because they may have multiple connections (Joiner, Lookup, Union, target).
• Connection properties must be identical (connect string, code page, connection environment SQL, transaction environment SQL).
Performance Tuning features
Pipeline partitioning
• Improves session performance by creating threads to move
data down the pipeline.
• The data is moved in pipeline stages defined by partition
points; stages run in parallel.
• By default, there is a partition point at the SQ, Target,
Aggregator and Rank transformations.
• Cannot add a partition point at certain transformations – Sequence Generator, unconnected Lookup, and the source definition.
• Partition types – pass through, key range, round robin, hash
auto keys, hash user keys, database.
Partition Recommendations
• Make sure you have ample CPU bandwidth and memory.
• Make sure you have gone through other optimization
techniques.
• Add one partition at a time and monitor – if CPU usage is
closer to 100%, don’t add any more.
• Multiply the DTM buffer size by the number of partitions.
• Multiply the transformation cache sizes for Aggregators, Ranks, Joiners & Sorters by the number of partitions (see the example after this list).
• Partition the source data evenly.
• If you have more than one partition, add partition points where data needs to be redistributed – at an Aggregator, Rank, or Sorter where data must be grouped, or where data is distributed unevenly.
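For example (hypothetical numbers): with a 24 MB DTM buffer, a 30 MB Aggregator cache (index + data) and 4 partitions, budget roughly 24 × 4 = 96 MB for the DTM buffer and 30 × 4 = 120 MB for the Aggregator cache.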
64 bit vs. 32 bit OS
• Take advantage of large memory support in 64-bit
• Cache based transformations like Sorter, Lookup, Aggregator,
Joiner, and XML Target can address larger blocks of memory

How to increase Informatica Server performance:


• Check hard disks on related machines (slow disk access can
slow down session performance)
• Improve the network speed.
• Check CPUs on related machines.
• Configure physical memory for the Informatica Server to
minimize disk I/O.
• Optimize database configuration
Contd.
• Staging areas. If you use a staging area, you force the
Informatica Server to perform multiple passes on your data.
Where possible, remove staging areas to improve
performance
• You can run multiple Informatica Servers on separate systems
against the same repository. Distributing the session load to
separate Informatica Server systems increases performance
Maximum Memory Allocation Example
• Parameters
• 64 Bit OS
• Total system memory: 32 GB
• Maximum allowed for transformation caches: 5 GB
or 10%
• DTM Buffer: 24 MB
• One transformation manually configured
Index Cache: 10 MB
Data Cache: 20 MB
• All other transformations set to Auto
Maximum Memory Allocation Example
• Result
• 10% of 32 GB = 3.2 GB < 5 GB, so the maximum allowed for transformation caches = 3.2 GB = 3200 MB
• Manually configured transformation uses 30 MB
• DTM Buffer uses 24 MB
• 3200 + 30 + 24 = 3254 MB
• Note that 3254 MB represents an upper limit; cached transformations may use less than the 3200 MB maximum
Summary
This presentation showed you how to:
• Approach the performance tuning challenge
• Create a performance tuning test environment
• Identify Bottlenecks
• Test for CPU (thread) utilization
• Tune mappings and transformations
• Test and adjust memory and cache usage
