Informatica Map/Session Tuning

Covers basic, intermediate, and advanced tuning practices.


(by: Dan Linstedt)
Table of Contents
Basic Guidelines
Intermediate Guidelines
Advanced Guidelines
INFORMATICA BASIC TUNING GUIDELINES
The following points are high-level issues on where to go to perform "tuning" in
Informatica's products. These are NOT permanent instructions, nor are they the
end-all solution. Just some items which (if tuned first) might make a difference.
The level of skill available for certain items will cause the results to vary.

To 'test' performance throughput it is generally recommended that the source set
of data produce about 200,000 rows to process. Beyond this - the performance
problems / issues may lie in the database - partitioning tables, dropping /
re-creating indexes, striping raid arrays, etc... Without such a large set of
results to deal with, your average timings will be skewed by other users on the
database, processes on the server, or network traffic. This seems to be an ideal
test size set for producing mostly accurate averages.
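
As a rough illustration of averaging timings over repeated runs, the sketch
below computes rows-per-second throughput for each run and the mean across
runs. The function names and the sample timings are hypothetical; they are not
taken from Informatica or from any measured session.

```python
from statistics import mean

def throughput(rows, seconds):
    """Rows processed per second for one test run."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return rows / seconds

def average_throughput(rows, run_seconds):
    """Mean rows/second across several runs of the same source set."""
    return mean(throughput(rows, s) for s in run_seconds)

# Hypothetical elapsed times (seconds) for three runs over 200,000 rows.
runs = [100.0, 125.0, 160.0]
avg = average_throughput(200_000, runs)
print(f"average throughput: {avg:.1f} rows/sec")
```

Repeating the same run several times and comparing the individual figures with
the mean gives a quick sense of how much other activity on the server is
skewing any single measurement.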

Try tuning your maps with these steps first. Then move to tuning the session,
iterate this sequence until you are happy, or cannot achieve better performance
by continued efforts. If the performance is still not acceptable, then the
architecture must be tuned (which can mean changes to what maps are
created). In this case, you can contact us - we tune the architecture and the
whole system from top to bottom.

KEEP THIS IN MIND: In order to achieve optimal performance, it's always a good
idea to strike a balance between the tools, the database, and the hardware
resources. Allow each to do what they do best. Varying the architecture can
make a huge difference in speed and optimization possibilities.
1. Utilize a database (like Oracle / Sybase / Informix / DB2 etc...) for
significant data handling operations (such as sorts, groups, aggregates). In
other words, staging tables can be a huge benefit to parallelism of operations.
Parallel design - simply defined by mathematics - nearly always cuts your
execution time. Staging tables have many benefits. Please see the staging table
discussion in the methodologies section for full details.
2. Localize. Localize all target tables on to the SAME instance of Oracle (same
SID), or same instance of Sybase. Try not to use Synonyms (remote database
links) for anything (including: lookups, stored procedures, target tables,
sources, functions, privileges, etc...). Utilizing remote links will most
certainly slow things down. For Sybase users, remote mounting of databases can
definitely be a hindrance to performance.
3. If you can - localize all target tables, stored procedures, functions, views,
sequences in the SOURCE database. Again, try not to connect across
synonyms. Synonyms (remote database tables) could potentially affect
performance by as much as a factor of 3 times or more.
4. Remove external registered modules. Perform pre-processing / post-processing
utilizing PERL, SED, AWK, GREP instead. The Application Programmers Interface
(API) which calls externals is inherently slow (as of: 1/1/2000). Hopefully
Informatica will speed this up in the future. The external module which
exhibits speed problems is the regular expression module (Unix: Sun Solaris
E450, 4 CPU's, 2 GIGS RAM, Oracle 8i and Informatica). It broke speed from
1500+ rows per second without the module - to 486 rows per second with the
module. No other sessions were running. (This was a SPECIFIC case - with a
SPECIFIC map - it's not like this for all maps.)
5. Remember that Informatica suggests that each session takes roughly 1 to
1 1/2 CPU's. In keeping with this - Informatica plays well with RDBMS engines
on the same machine, but does NOT get along (performance wise) with ANY
other engine (reporting engine, java engine, OLAP engine, java virtual machine,
etc...).
6. Remove any database based sequence generators. This requires a wrapper
function / stored procedure call. Utilizing these stored procedures has caused
performance to drop by a factor of 3 times. This slowness is not easily
debugged - it can only be spotted in the Write Throughput column. Copy the map,
replace the stored proc call with an internal sequence generator for a test
run - this is how fast you COULD run your map. If you must use a database
generated sequence number, then follow the instructions for the staging table
usage. If you're dealing with GIG's or Terabytes of information - this should
save you lots of hours tuning. IF YOU MUST - have a shared sequence generator,
then build a staging table from the flat file, add a SEQUENCE ID column, and
call a POST TARGET LOAD stored procedure to populate that column. Place the
post target load procedure in to the flat file to staging table load map. A
single call to inside the database, followed by a batch operation to assign
sequences is the fastest method for utilizing shared sequence generators.
7. TURN OFF VERBOSE LOGGING. The session log has a tremendous impact
on the overall performance of the map. Force over-ride in the session, setting
it to NORMAL logging mode. Unfortunately the logging mechanism is not
"parallel" in the internal core, it is embedded directly in to the operations.
8. Turn off 'collect performance statistics'. This also has an impact - although
minimal at times - it writes a series of performance data to the performance
log. Removing this operation reduces reliance on the flat file operations.
However, it may be necessary to have this turned on DURING your tuning
exercise. It can reveal a lot about the speed of the reader, and writer threads.
9. If your source is a flat file - utilize a staging table (see the staging
table slides in the presentations section of this web site). This way - you can
also use SQL*Loader, BCP, or some other database Bulk-Load utility. Place basic
logic in the source load map, remove all potential lookups from the code. At
this point - if your reader is slow, then check two things: 1) if you have an
item in your registry or configuration file which sets the "ThrottleReader" to
a specific maximum number of blocks, it will limit your read throughput (this
only needs to be set if the sessions have demonstrated problems with
constraint based loads); 2) move the flat file to local internal disk (if at
all possible). Try not to read a file across the network, or from a RAID
device. Most RAID arrays are fast, but Informatica seems to top out, where
internal disk continues to be much faster. Here - a link will NOT work to
increase speed - it must be the full file itself - stored locally.
10. Try to eliminate the use of non-cached lookups. By issuing a non-cached
lookup, your performance will be impacted significantly. Particularly if the
lookup table is also a "growing" or "updated" target table - this generally
means the indexes are changing during operation, and the optimizer loses track
of the index statistics. Again - utilize staging tables if possible. In
utilizing staging tables, views in the database can be built which join the
data together; or Informatica's joiner object can be used to join data
together - either one will help dramatically increase speed.
11. Separate complex maps - try to break the maps out in to logical threaded
sections of processing. Re-arrange the architecture if necessary to allow for
parallel processing. There may be more smaller components doing individual
tasks, however the throughput will be proportionate to the degree of
parallelism that is applied. A discussion on HOW to perform this task is posted
on the methodologies page, please see this discussion for further details.
12. BALANCE. Balance between Informatica and the power of SQL and the
database. Try to utilize the DBMS for what it was built for:
reading/writing/sorting/grouping/filtering data en-masse. Use Informatica for
the more complex logic, outside joins, data integration, multiple source feeds,
etc... The balancing act is difficult without DBA knowledge. In order to
achieve a balance, you must be able to recognize what operations are best in
the database, and which ones are best in Informatica. This does not detract
from the use of the ETL tool, rather it enhances it - it's a MUST if you are
performance tuning for high-volume throughput.
13. TUNE the DATABASE. Don't be afraid to estimate: small, medium, large, and
extra large source data set sizes (in terms of: numbers of rows, average number
of bytes per row), expected throughput for each, turnaround time for load, is
it a trickle feed? Give this information to your DBA's and ask them to tune the
database for "worst case". Help them assess which tables are expected to be
high read/high write, which operations will sort (order by), etc... Moving
disks, assigning the right table to the right disk space could make all the
difference. Utilize a PERL script to generate "fake" data for small, medium,
large, and extra large data sets. Run each of these through your mappings - in
this manner, the DBA can watch or monitor throughput as a real load size occurs.
14. Be sure there is enough SWAP, and TEMP space on your PMSERVER
machine. Not having enough disk space could potentially slow down your entire
server during processing (in an exponential fashion). Sometimes this means
watching the disk space while your session runs. Otherwise you may not get a
good picture of the space available during operation. Particularly if your maps
contain aggregates, or lookups that flow to disk Cache directory - or if you
have a JOINER object with heterogeneous sources.
15. Place some good server load monitoring tools on your PMServer in
development - watch it closely to understand how the resources are being
utilized, and where the hot spots are. Try to follow the recommendations - it
may mean upgrading the hardware to achieve throughput. Look in to EMC's disk
storage array - while expensive, it appears to be extremely fast, I've heard
(but not verified) that it has improved performance in some cases by up to 50%.
16. SESSION SETTINGS. In the session, there is only so much tuning you can
do. Balancing the throughput is important - by turning on "Collect Performance
Statistics" you can get a good feel for what needs to be set in the session -
or what needs to be changed in the database. Read the performance section
carefully in the Informatica manuals. Basically what you should try to achieve
is: OPTIMAL READ, OPTIMAL THROUGHPUT, OPTIMAL WRITE. Over-tuning one of
these three pieces can result in ultimately slowing down your session. For
example: your write throughput is governed by your read and transformation
speed, likewise, your read throughput is governed by your transformation and
write speed. The best method to tune a problematic map, is to break it in to
components for testing: 1) Read Throughput, tune for the reader, see what the
settings are, send the write output to a flat file for less contention - Check
the "ThrottleReader" setting (which is not configured by default), increase the
Default Buffer Size by 64k each shot - ignore the warning above 128k.
If the Reader still appears to increase during the session, then stabilize
(after a few thousand rows), then try increasing the Shared Session Memory from
12MB to 24MB. If the reader still stabilizes, then you have a slow source, slow
lookups, or your CACHE directory is not on internal disk. If the reader's
throughput continues to climb above where it stabilized, make note of the
session settings. Check the Performance Statistics to make sure the writer
throughput is NOT the bottleneck - you are attempting to tune the reader here,
and don't want the writer threads to slow you down. Change the map target back
to the database targets - run the session again. This time, make note of how
much the reader slows down, its optimal performance was reached with a flat
file(s). This time - slow targets are the cause. NOTE: if your reader session
to flat file just doesn't ever "get fast", then you've got some basic map
tuning to do. Try to merge expression objects, set your lookups to unconnected
(for re-use if possible), check your Index and Data cache settings if you have
aggregation, or lookups being performed. Etc... If you have a slow writer,
change the map to a single target table at a time - see which target is causing
the "slowness" and tune it. Make copies of the original map, and break down the
copies. Once the "slower" of the N targets is discovered, talk to your DBA
about partitioning the table, updating statistics, removing indexes during
load, etc... There are many database things you can do here.
17. Remove all other "applications" on the PMServer. Except for the database /
staging database or Data Warehouse itself. PMServer plays well with RDBMS
(relational database management system) - but doesn't play well with
application servers, particularly JAVA Virtual Machines, Web Servers, Security
Servers, application, and Report servers. All of these items should be broken
out to other machines. This is critical to improving performance on the
PMServer machine.
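
Item 6 of the basic guidelines describes loading a flat file in to a staging
table and then assigning sequence numbers in one batch operation inside the
database. A minimal sketch of that idea follows, using Python's built-in
sqlite3 module as a stand-in for the real DBMS. The table name stg_customer,
its columns, and the starting value are hypothetical; in practice this step
would be a post target load stored procedure in Oracle or Sybase.

```python
import sqlite3

def assign_sequences(conn, start):
    """Assign consecutive sequence_id values with one batch UPDATE."""
    # The correlated subquery counts rows staged at or before each row,
    # giving 1..N in load order; `start` offsets that into the sequence.
    # (Fine for a sketch; a real DBMS would use its own sequence object.)
    conn.execute(
        """
        UPDATE stg_customer
           SET sequence_id = ? - 1 + (
               SELECT COUNT(*) FROM stg_customer AS s
                WHERE s.rowid <= stg_customer.rowid)
         WHERE sequence_id IS NULL
        """,
        (start,),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_customer (name TEXT, sequence_id INTEGER)")
conn.executemany("INSERT INTO stg_customer (name) VALUES (?)",
                 [("alice",), ("bob",), ("carol",)])
assign_sequences(conn, start=1000)
rows = conn.execute(
    "SELECT name, sequence_id FROM stg_customer ORDER BY sequence_id"
).fetchall()
print(rows)
```

The point of the sketch is the shape of the work: rows are staged without a
sequence, and a single set-based statement fills the column afterwards, instead
of one stored procedure call per row.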
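
Item 13 of the basic guidelines suggests generating "fake" data for small,
medium, large, and extra large data sets. The text mentions a PERL script; the
sketch below swaps in Python for the same purpose. The size tiers, the column
layout, and the pipe delimiter are assumptions chosen for illustration.

```python
import csv
import io
import random

# Hypothetical row counts per size tier; adjust to your own estimates.
SIZES = {"small": 1_000, "medium": 100_000,
         "large": 1_000_000, "xlarge": 10_000_000}

def write_fake_rows(out, n_rows, seed=0):
    """Write n_rows of pipe-delimited fake records to a file-like object."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    for i in range(1, n_rows + 1):
        writer.writerow([i, f"cust_{i:08d}", rng.randint(1, 500),
                         f"{rng.uniform(0, 9999):.2f}"])
    return n_rows

# Small in-memory demonstration; a real run would write to a file on disk.
buf = io.StringIO()
written = write_fake_rows(buf, 5)
lines = buf.getvalue().splitlines()
print(lines[0])
```

To produce a flat file for one tier, open a file with open(path, "w",
newline="") and pass it with the matching entry from SIZES in place of the
in-memory buffer.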
INFORMATICA INTERMEDIATE TUNING GUIDELINES
The following numbered items are for intermediate level tuning. After going
through all the pieces above, and still having trouble, these are some things to
look for. These are items within a map which make a difference in performance
(we've done extensive performance testing of Informatica to be able to show
these effects). Keep in mind - at this level, the performance isn't affected
unless there are more than 1 Million rows (average size: 2.5 GIG of data).
ALL items are Informatica MAP items, and Informatica Objects - none are outside
the map. Also remember, this applies to PowerMart/PowerCenter (4.5x, 4.6x, /
1.5x, 1.6x) - other versions have NOT been tested. The order of these items is
not relevant to speed. Each one has its own impact on the overall performance.
Again, throughput is also gauged by the number of objects constructed within a
map/maplet.
Sometimes it's better to sacrifice a little readability, for a little speed.
It's the old paradigm, weighing readability and maintainability (true
modularity) against raw speed. Make sure the client agrees with the approach,
or that the data sets are large enough to warrant this type of tuning. BE
AWARE: The following tuning tips range from "minor" cleanup to "last resort"
types of things - only when data sets get very large, should these items be
addressed, otherwise, start with the BASIC tuning list above, then work your
way in to these suggestions.

To understand the intermediate section, you'll need to review the memory usage
diagrams (also available on this web site).
1. Filter Expressions - try to evaluate them in a port expression. Try to create
the filter (true/false) answer inside a port expression upstream. Complex
filter expressions slow down the mapping. Again, expressions/conditions operate
fastest in an Expression Object with an output port for the result. Turns out -
the longer the expression, or the more complex - the more severe the speed
degradation. Place the actual expression (complex or not) in an EXPRESSION
OBJECT upstream from the filter. Compute a single numerical flag: 1 for true, 0
for false as an output port. Pump this in to the filter - you should see the
maximum performance ability with this configuration.
2. Remove all "DEFAULT" value expressions where possible. Having a default
value - even the "ERROR(xxx)" command slows down the session. It causes an
unnecessary evaluation of values for every data element in the map. The only
time you want to use "DEFAULT value" is when you have to provide a default
value for a specific port. There is another method: placing a variable with an
IIF(xxxx, DEFAULT VALUE, xxxx) condition within an expression. This will
always be faster (if assigned to an output port) than a default value.
3. Variable Ports are "slower" than Output Expressions. Whenever possible,
use output expressions instead of variable ports. The variables are good for
"static - and state driven" but do slow down the processing time - as they are
allocated/reallocated each pass of a row through the expression object.
4. Datatype conversion - perform it in a port expression. Simply mapping a
string to an integer, or an integer to a string will perform the conversion,
however it will be slower than creating an output port with an expression like:
to_integer(xxxx) and mapping an integer to an integer. It's because PMServer is
left to decide if the conversion can be done mid-stream which seems to slow
things down.
5. Unused Ports. Surprisingly, unused output ports have no effect on
performance. This is a good thing. However in general it is good practice to
remove any unused ports in the mapping, including variables. Unfortunately -
there is no "quick" method for identifying unused ports.
6. String Functions. String functions definitely have an impact on performance.
Particularly those that change the length of a string (substring, ltrim, rtrim,
etc...). These functions slow the map down considerably, the operations behind
each string function are expensive (de-allocate, and re-allocate memory within
a READER block in the session). String functions are a necessary and important
part of ETL, we do not recommend removing their use completely, only try to
limit them to necessary operations. One of the ways we advocate tuning these,
is to use "varchar/varchar2" data types in your database sources, or to use
delimited strings in source flat files (as much as possible). This will help
reduce the need for "trimming" input. If your sources are in a database,
perform the LTRIM/RTRIM functions on the data coming in from a database SQL
statement, this will be much faster than operationally performing it mid-stream.
7. IIF Conditionals are costly. When possible - arrange the logic to minimize
the use of IIF conditionals. This is not particular to Informatica, it is
costly in ANY programming language. It introduces "decisions" within the tool,
it also introduces multiple code paths across the logic (thus increasing
complexity). Therefore - when possible, avoid utilizing an IIF conditional -
again, the only possibility here might be (for example) an ORACLE DECODE
function applied to a SQL source.
8. Sequence Generators slow down mappings. Unfortunately there is no "fast" and
easy way to create sequence generators. The cost is not that high for using a
sequence generator inside of Informatica, particularly if you are caching
values (cache at around 2000 - seems to be the sweet spot). However - if at all
avoidable, this is one "card" up a sleeve that can be played. If you don't
absolutely need the sequence number in the map for calculation reasons, and you
are utilizing Oracle, then let SQL*Loader create the sequence generator for all
Insert Rows. If you're using Sybase, don't specify the Identity column as a
target - let the Sybase Server generate the column. Also - try to avoid
"reusable" sequence generators - they tend to slow the session down further,
even with cached values.
9. Test Expressions slow down sessions. Expressions such as: IS_SPACES tend to
slow down the mappings, this is a data validation expression which has to run
through the entire string to determine if it is spaces, much the same as
IS_NUMBER has to validate an entire string. These expressions (if at all
avoidable) should be removed in cases where it is not necessary to "test" prior
to conversion. Be aware however, that direct conversion without testing
(conversion of an invalid value) will kill the transformation. If you
absolutely need a test expression for numerics, try this:
IIF(<field> * 1 >= 0,<field>,NULL) - preferably you don't care if it's zero. An
alpha in this expression should return a NULL to the computation. Yes - the IIF
condition is slightly faster than the IS_NUMBER - because IS_NUMBER parses the
entire string, where the multiplication operator is the actual speed gain.
10. Reduce Number of OBJECTS in a map. Frequently, the idea of these tools is to
make the "data translation map" as easy as possible. All too often, that means
creating "an" (1) expression for each throughput/translation (taking it to an
extreme of course). Each object adds computational overhead to the session and
timings may suffer. Sometimes if performance is an issue / goal, you can
integrate several expressions in to one expression object, thus reducing the
"object" overhead. In doing so - you could speed up the map.
11. Update Expressions - Session set to Update Else Insert. If you have this
switch turned on - it will definitely slow the session down - Informatica
performs 2 operations for each row: update (w/PK), then if it returns ZERO rows
updated, performs an insert. The way to speed this up is to "know" ahead of
time if you need to issue a DD_UPDATE or DD_INSERT inside the mapping, then
tell the update strategy what to do. After which you can change the session
setting to: INSERT and UPDATE AS UPDATE or UPDATE AS INSERT.
12. Multiple Targets are too slow. Frequently maps are generated with multiple
targets, and sometimes multiple sources. This (despite first appearances) can
really burn up time. If the architecture permits change, and the users support
re-work, then try to change the architecture -> 1 map per target is the general
rule of thumb. Once reaching one map per target, the tuning gets easier.
Sometimes it helps to reduce it to 1 source and 1 target per map. But - if the
architecture allows more modularization 1 map per target usually does the
trick. Going further, you could break it up: 1 map per target per operation
(such as insert vs update). In doing this, it will provide a few more cards to
the deck with which you can "tune" the session, as well as the target table
itself. Going this route also introduces parallel operations. For further info
on this topic, see my architecture presentations on Staging Tables, and 3rd
normal form architecture (Corporate Data Warehouse Slides).
13. Slow Sources - Flat Files. If you've got slow sources, and these sources are
flat files, you can look at some of the following possibilities. If the sources
reside on a different machine, and you've opened a named pipe to get them
across the network - then you've opened (potentially) a can of worms. You've
introduced the network speed as a variable on the speed of the flat file
source. Try to compress the source file, FTP PUT it on the local machine (local
to PMServer), decompress it, then utilize it as a source. If you're reaching
across the network to a relational table - and the session is pulling many many
rows (over 10,000) then the source system itself may be slow. You may be better
off using a source system extract program to dump it to file first, then follow
the above instructions. However, there is something your SA's and Network Ops
folks could do (if necessary) - this is covered in detail in the advanced
section. They could backbone the two servers together with a dedicated network
line (no hubs, routers, or other items in between the two machines). At the
very least, they could put the two machines on the same sub-net. Now, if your
file is local to PMServer but is still slow, examine the location of the file
(which device is it on). If it's not on an INTERNAL DISK then it will be slower
than if it were on an internal disk (C drive for you folks on NT). This doesn't
mean a unix file LINK exists locally, and the file is remote - it means the
actual file is local.
14. Too Many Aggregators. If your map has more than 1 aggregator, chances are
the session will run very very slowly - unless the CACHE directory is extremely
fast, and your drive seek/access times are very fast. Even still, placing
aggregators end-to-end in mappings will slow the session down by factors of at
least 2. This is because of all the I/O activity being a bottleneck in
Informatica. What needs to be known here is that Informatica's products: PM /
PC up through 4.7x are NOT built for parallel processing. In other words, the
internal core doesn't put the aggregators on threads, nor does it put the I/O
on threads - therefore being a single strung process it becomes easy for a part
of the session/map to become a "blocked" process by I/O factors. For I/O
contention and resource monitoring, please see the database/datawarehouse
tuning guide.
15. Maplets containing Aggregators. Maplets are a good source for replicating
data logic. But just because an aggregator is in a maplet doesn't mean it won't
affect the mapping. The reason maplets don't affect speed of the mappings, is
they are treated as a part of the mapping once the session starts - in other
words, if you have an aggregator in a maplet, followed by another aggregator in
a mapping you will still have the problem mentioned above in #14. Reduce the
number of aggregators in the entire mapping (including maplets) to 1 if
possible. If necessary, split the map up in to several different maps, use
intermediate tables in the database if required to achieve processing goals.
16. Eliminate "too many lookups". What happens and why? Well - with too many
lookups, your cache is eaten in memory - particularly on the 1.6 / 4.6
products. The end result is there is no memory left for the sessions to run in.
The DTM reader/writer/transformer threads are not left with enough memory to be
able to run efficiently. PC 1.7, PM 4.7 solve some of these problems by caching
some of these lookups out to disk when the cache is full. But you still end up
with contention - in this case, with too many lookups, you're trading in Memory
Contention for Disk Contention. The memory contention might be worse than the
disk contention, because the system OS ends up thrashing (swapping in and out
of TEMP/SWAP disk space) with small block sizes to try to locate "find" your
lookup row, and as the row goes from lookup to lookup, the swapping / thrashing
gets worse.
17. Lookups & Aggregators Fight. The lookups and the aggregators fight for
memory space as discussed above. Each requires Index Cache, and Data Cache
and they "share" the same HEAP segments inside the core. See Memory Layout
document for more information. Particularly in the 4.6 / 1.6 products and
prior - these memory areas become critical, and when dealing with many many
rows - the session is almost certain to cause the server to "thrash" memory in
and out of the OS Swap space. If possible, separate the maps - perform the
lookups in the first section of the maps, position the data in an intermediate
target table - then a second map reads the target table and performs the
aggregation (also provides the option for a group by to be done within the
database)... Another speed improvement...
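
Item 11 of the intermediate guidelines recommends knowing ahead of time whether
each row needs an insert or an update, rather than letting the session attempt
an update and fall back to an insert. One way to picture that decision is the
sketch below, which splits incoming rows by whether their key already exists in
the target. The key set, the row layout, and the function name are
hypothetical; in a real mapping the key check would typically be a cached
lookup or a join against the target keys, feeding an update strategy.

```python
def classify_rows(rows, existing_keys, key="id"):
    """Split rows into inserts and updates with one set lookup per row."""
    inserts, updates = [], []
    seen = set(existing_keys)
    for row in rows:
        if row[key] in seen:
            updates.append(row)   # this row would be flagged DD_UPDATE
        else:
            inserts.append(row)   # this row would be flagged DD_INSERT
            seen.add(row[key])    # a later duplicate key becomes an update
    return inserts, updates

incoming = [{"id": 1, "v": "a"}, {"id": 7, "v": "b"}, {"id": 7, "v": "c"}]
inserts, updates = classify_rows(incoming, existing_keys={1, 2, 3})
print(len(inserts), "inserts,", len(updates), "updates")
```

Each row is touched once and routed to exactly one operation, which is the
saving the item describes compared with trying an update first on every row.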

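Item 1 of the intermediate guidelines advises computing the filter condition
upstream as a single numeric flag (1 for true, 0 for false) and letting the
filter test only that flag. The sketch below mirrors that split in Python. The
field names and the condition itself are made up for illustration and are not
Informatica expression syntax.

```python
def keep_flag(row):
    """Upstream 'expression' step: reduce a complex condition to 1 or 0."""
    complex_condition = (
        row["status"] == "ACTIVE"
        and row["amount"] > 0
        and row["region"] in ("EAST", "WEST")
    )
    return 1 if complex_condition else 0

def filter_rows(rows):
    """Downstream 'filter' step: tests only the precomputed flag."""
    flagged = ((row, keep_flag(row)) for row in rows)
    return [row for row, flag in flagged if flag == 1]

sample = [
    {"status": "ACTIVE", "amount": 10, "region": "EAST"},
    {"status": "ACTIVE", "amount": 0, "region": "WEST"},
    {"status": "CLOSED", "amount": 5, "region": "EAST"},
]
kept = filter_rows(sample)
print(len(kept), "row(s) passed the filter")
```
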
INFORMATICA ADVANCED TUNING GUIDELINES
The following numbered items are for advanced level tuning. Please proceed
cautiously, one step at a time. Do not attempt to follow these guidelines if you
haven't already made it through all the basic and intermediate guidelines first.
These guidelines may require a level of expertise which involves System
Administrators, Database Administrators, and Network Operations folks. Please
be patient. The most important aspect of advanced tuning is to be able to
pinpoint specific bottlenecks, then have the funding to address them.
As usual - these advanced tuning guidelines come last, and are pointed at
suggestions for the system. There are other advanced tuning guidelines
available for Data Warehousing Tuning. You can refer to those for questions
surrounding your hardware / software resources.
1. Break the mappings out. 1 per target. If necessary, 1 per source per target.
Why does this work? Well - eliminating multiple targets in a single mapping can
greatly increase speed... Basically it's like this: one session per map/target.
Each session establishes its own database connection. Because of the unique
database connection, the DBMS server can now handle the insert/update/delete
requests in parallel against multiple targets. It also helps to allow each
session to be specified for its intended purpose (no longer mixing a data
driven session with INSERTS only to a single target). Each session can then be
placed in to a batch marked "CONCURRENT" if preferences allow. Once this is
done, parallelism of mappings and sessions becomes obvious. A study of parallel
processing has shown again and again, that the operations can be completed
sometimes in half the time of their original counterparts merely by streaming
them at the same time. With multiple targets in the same mapping, you're
telling a single database connection to handle multiply diverse database
statements - sometimes hitting this target, other times hitting that target.
Think - in this situation it's extremely difficult for Informatica (or any
other tool for that matter) to build BULK operations... even though "bulk" is
specified in the session. Remember that "BULK" means this is your preference,
and that the tool will revert to NORMAL load if it can't provide a BULK
operation on a series of consecutive rows. Obviously, data driven then forces
the tool down several other layers of internal code before the data actually
can reach the database.
2. Develop maplets for complex business logic. It appears as if Maplets do NOT
cause any performance hindrance by themselves. Extensive use of maplets means
better, more manageable business logic. The maplets allow you to better break
the mappings out.
3. Keep the mappings as simple as possible. Bury complex logic (if you must) in
to a maplet. If you can avoid complex logic all together - then that would be
the key. The old rule of thumb applies here (common sense): the straighter the
path between two points, the shorter the distance... Translated as: the shorter
the distance between the source qualifier and the target - the faster the data
loads.
4. Remember the TIMING is affected by READER/TRANSFORMER/WRITER threads.
With complex mappings, don't forget that each ELEMENT (field) must be weighed -
in this light a firm understanding of how to read performance statistics
generated by Informatica becomes important. In other words - if the reader is
slow, then the rest of the threads suffer, if the writer is slow, same effect.
A pipe is only as big as its smallest diameter.... A chain is only as strong as
its weakest link. Sorry for the metaphors, but it should make sense.
5. Change Network Packet Size (for Sybase, MS-SQL Server & Oracle users).
Maximum network packet size is a Database Wide Setting, which is usually
defaulted at 512 bytes or 1024 bytes. Setting the maximum database packet size
doesn't necessarily hurt any of the other users, it does however allow the
Informatica database setting to make use of the larger packet sizes - thus
transfer more data in a single packet faster. The typical 'best' settings are
between 10k and 20k.
In Oracle: you'll need to adjust the Listener.ORA and TNSNames.ORA files.
Include the parameters: SDU, and TDU. SDU = Session Data Unit (session layer
data buffer size, in bytes), TDU = Transport Data Unit (transport layer data
buffer size, in bytes). The SDU and TDU should be set equally. See the
Informatica FAQ page for more information on setting these up.
6. Change to IPC Database Connection for Local Oracle Database. If PMServer
and Oracle are running on the same server, use an IPC connection instead of a
TCP/IP connection. Change the protocol in the TNSNames.ORA and
Listener.ORA files, and restart the listener on the server. Be careful - this
protocol can only be used locally, however the speed increases from using Inter
Process Communication can be between 2x and 6x. IPC is utilized by Oracle, but
is defined as a Unix System 5 standard specification. You can find more
information on IPC by reading about it in Unix System 5 manuals.
7. Change Database Priorities for the PMServer Database User. Prioritizing the
database login that any of the connections use (setup in Server Manager) can
assist in changing the priority given to the Informatica executing tasks. These
tasks when logged in to the database then can over-ride others. Sizing memory
for these tasks (in shared global areas, and server settings) must be done if
priorities are to be changed. If BCP or SQL*Loader or some other bulk-load
facility is utilized, these priorities must also be set. This can greatly
improve performance. Again, it's only suggested as a last resort method, and
doesn't substitute for tuning the database, or the mapping processes. It should
only be utilized when all other methods have been exhausted (tuned). Keep in
mind that this should only be relegated to the production machines, and only in
certain instances where the Load cycle that Informatica is utilizing is NOT
impeding other users.
8. Change the Unix User Priority. In order to gain speed, the Informatica Unix
User must be given a higher priority. The Unix SA should understand what it
takes to rank the Unix logins, and grant priorities to particular tasks. Or -
simply have the pmserver executed under a super user (SU) command, this will
take care of reprioritizing Informatica's core process. This should only be
used as a last resort - once all other tuning avenues have been exhausted, or
if you have a dedicated Unix machine on which Informatica is running.
9. Try not to load across the network. If at all possible, try to co-locate the
PMServer executable with a local database. Not having the database local means:
1) the repository is across the network (slow), 2) the sources / targets are
across the network, also potentially slow. If you have to load across the
network, at least try to localize the repository on a database instance on the
same machine as the server. The other thing is: try to co-locate the two machines
(pmserver and target database server) on the same sub-net, even the same hub if
possible. This eliminates unnecessary routing of packets all over the network.
Having a localized database also allows you to set up a target table locally,
which you can then "dump" following a load, ftp to the target server, and
bulk-load in to the target table. This works extremely well for situations where
append or complete refresh is taking place.
10. Set Session Shared Memory Settings between 12MB and 24MB. Typically
I've seen folks attempt to assign a session large heaps of memory (in hopes it
will increase speed). All it tends to do is slow down the processing. See the
memory layout document for further information on how this affects Informatica
and its memory handling, and why simply giving it more memory doesn't
necessarily provide speed.
11. Set Shared Buffer Block Size around 128k. Again, something that's covered in
the memory layout document. This seems to be a "sweet spot" for handling
blocks of rows inside the Informatica process.
12. MEMORY SETTINGS: The settings above are for an average configured
machine; any machine with less than 10 GIGs of RAM should abide by the above
settings. If you've got 12+ GIGs, and you're running only 1 to 3 sessions
concurrently, go ahead and specify the Session Shared Memory size at 1 or 2
GIGs. Keep in mind that the Shared Buffer Block Size should be set in relative
size to the Shared Memory Setting. If you set a Shared Mem to 124 MB, set the
Buffer Block Size to 12MB; keep them in relative sizes. If you don't, the result
will be more memory "handling" going on in the background, so less actual work
will be done by Informatica. Also, this holds true for the simpler mappings. The
more complex the mapping, the less likely you are to see a gain by increasing
either buffer block size or shared memory settings, because Informatica
potentially has to process cells (ports/fields/values) inside of a huge memory
block, thus resulting in a potential re-allocation of the whole block.
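One way to look at the "relative sizes" advice is to count how many buffer blocks each combination yields. The sketch below assumes, as a simplification that is not taken from Informatica documentation, that the session shared memory is simply carved into equal-sized blocks.

```python
KB = 1024
MB = 1024 * KB


def buffer_blocks(shared_memory_bytes, block_size_bytes):
    """How many whole buffer blocks fit in the session shared memory."""
    return shared_memory_bytes // block_size_bytes


# The three combinations mentioned in items 10 to 12.
settings = [
    ("12MB shared, 128k blocks", 12 * MB, 128 * KB),
    ("24MB shared, 128k blocks", 24 * MB, 128 * KB),
    ("124MB shared, 12MB blocks", 124 * MB, 12 * MB),
]
for label, shared, block in settings:
    print(f"{label}: {buffer_blocks(shared, block)} blocks")
```

Running it shows how sharply the block count changes between the default-style settings and the large-memory example, which is worth keeping in mind when scaling one setting without the other.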
13. Use SNAPSHOTS with your Database. If you have dedicated lines (DS3/T1,
etc.) between servers, use a snapshot or Advanced Replication to get data out of
the source systems and in to a staging table (duplicate of the source). Then
schedule the snapshot before running processes. The RDBMS servers are built
for this kind of data transfer, and have optimizations built in to the core to
transfer data incrementally, or as a whole refresh. It may be to your advantage,
particularly if your sources contain 13 million+ rows. Place Informatica
processes to read from the snapshot; at that point you can index any way you
like, and increase the throughput speed without affecting the source systems.
Yes, snapshots only work if your sources are homogeneous to your targets (on
the same type of system).
14. INCREASE THE DISK SPEED. One of the most common fallacies is that a
Data Warehouse RDBMS needs only 2 controllers and 13 disks to survive. This
is fine if you're running less than 5 million rows total through your system, or
your load window exceeds 5 hours. I recommend at least 4 to 6 controllers, and
at least 50 disks, set on a Raid 0+1 array, spinning at 7200 RPM or better. If
it's necessary, plunk the money down and go get an EMC device. You should see a
significant increase in performance after installing or upgrading to such a
configuration.
15. Switch to Raid 0+1. Raid Level 5 is great for redundancy, horrible for Data
Warehouse performance, particularly on bulk loads. Raid 0+1 is the preferred
method for data warehouses out there, and most folks find that the replication is
just as safe as a Raid 5, particularly since the hardware is now nearly all
hot-swappable, and the software to manage this has improved greatly.
16. Upgrade your Hardware. On your production box, if you want Gigabytes per
second throughput, or you want to create 10 indexes in 4 hours on 34 million
rows, then add CPU power, RAM, and the disk modifications discussed above. A
4 CPU machine just won't cut the mustard today for this size of operation. I
recommend a minimum of 8 CPUs as a starter box, and increase to 12 as
necessary. Again, this is for huge Data Warehousing systems: GIGs per
hour/MB per hour. A box with 4 CPUs is great for development, or for smaller
systems (totalling less than 5 million rows in the warehouse). However, keep in
mind that Bus Speed is also a huge factor here. I've heard of a 4 CPU Dec-Alpha
system outperforming a 6 CPU system... So what's the bottom line? Disk RPMs,
Bus Speed, RAM, and # of CPUs. I'd say potentially in that order. Both Oracle
and Sybase perform extremely well when given 6+ CPUs and 8 or 12 GIGs of
RAM, set up on an EMC device at 7200 RPM with a minimum of 4 controllers.


Sorting performance issues
You can improve Aggregator transformation performance by using the
Sorted Input option. When the Sorted Input option is selected, the
Informatica Server assumes all data is sorted by group. As the
Informatica Server reads rows for a group, it performs aggregate
calculations as it reads. When necessary, it stores group information in
memory. To use the Sorted Input option, you must pass sorted data to
the Aggregator transformation. You can gain added performance with
sorted ports when you partition the session.
When Sorted Input is not selected, the Informatica Server performs
aggregate calculations as it reads. However, since data is not sorted,
the Informatica Server stores data for each group until it reads the
entire source to ensure all aggregate calculations are accurate.
For example, one Aggregator has the STORE_ID and ITEM Group By
ports, with the Sorted Input option selected. When you pass the
following data through the Aggregator, the Informatica Server performs
an aggregation for the three records in the 101/battery group as soon
as it finds the new group, 201/battery.
STORE_ID  ITEM       QTY  PRICE
101       'battery'  3    2.99
101       'battery'  1    3.19
101       'battery'  2    2.59
201       'battery'  4    1.59
201       'battery'  1    1.99
If you use the Sorted Input option and do not presort data correctly,
the session fails.
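The difference between the two modes can be sketched outside Informatica. The example below is an illustration in plain Python, not Informatica's implementation: with sorted input, a streaming aggregator only needs to hold one group at a time and can emit a result as soon as the group key changes. The rows are the ones from the example above, and summing QTY is an assumed aggregate chosen for the demonstration.

```python
from itertools import groupby
from operator import itemgetter

# (STORE_ID, ITEM, QTY, PRICE), already sorted by the group-by ports.
rows = [
    (101, "battery", 3, 2.99),
    (101, "battery", 1, 3.19),
    (101, "battery", 2, 2.59),
    (201, "battery", 4, 1.59),
    (201, "battery", 1, 1.99),
]


def aggregate_sorted(rows):
    """Yield (STORE_ID, ITEM, total QTY) as soon as each group ends."""
    # groupby only compares neighbouring rows, so one group is held at a time.
    for key, group in groupby(rows, key=itemgetter(0, 1)):
        yield key + (sum(r[2] for r in group),)


totals = list(aggregate_sorted(rows))
print(totals)  # [(101, 'battery', 6), (201, 'battery', 5)]
```

If the same rows arrived unsorted, this approach would emit a separate partial total every time a key reappeared, which is why unsorted input forces the aggregator to keep every group until the whole source has been read.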
Sorted Input Conditions
Do not use the Sorted Input option if any of the following conditions are
true.
- The aggregate expression uses nested aggregate functions.
- The session uses incremental aggregation.
- Input data is data-driven. You choose to treat source data as data driven in
  the session properties, or the Update Strategy transformation appears
  before the Aggregator transformation in the mapping.
- The mapping is upgraded from PowerMart 3.5.
If you use the Sorted Input option under these circumstances, the
Informatica Server reverts to default aggregate behavior, reading all
values before performing aggregate calculations.
Pre-Sorting Data
To use the Sorted Input option, you pass sorted data through the
Aggregator.
Data must be sorted as follows.
- By the Aggregator group by ports, in the order they appear in the
  Aggregator transformation.
- Using the same sort order configured for the session.
  If data is not in strict ascending or descending order based on
  the session sort order, the Informatica Server fails the session.
  For example, if you configure a session to use a French sort
  order, data passing into the Aggregator transformation must be
  sorted using the French sort order.
If the session uses file sources, you can use an external utility to sort
file data before starting the session. If the session uses relational
sources, you can use the Number of Sorted Ports option in the Source
Qualifier transformation to sort group by columns in the source
database. Group By columns must be in the exact same order in both
the Aggregator and Source Qualifier transformations.
For details on sorting data in the Source Qualifier, see Sorted Ports.
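Because a badly sorted input fails the session, it may be worth checking file data before the run. The helper below is a hypothetical pre-flight check, not an Informatica utility. It uses Python's default comparison, which is an assumption: it will not match a session configured with a language-specific sort order such as French.

```python
def first_unsorted_row(rows, key_indexes, descending=False):
    """Return the index of the first row that breaks the sort order, or None."""
    previous = None
    for i, row in enumerate(rows):
        # Build the key from the group-by columns, in group-by order.
        key = tuple(row[k] for k in key_indexes)
        if previous is not None:
            out_of_order = key > previous if descending else key < previous
            if out_of_order:
                return i
        previous = key
    return None


good = [(101, "battery"), (101, "battery"), (201, "battery")]
bad = [(101, "battery"), (201, "battery"), (101, "battery")]
print(first_unsorted_row(good, (0, 1)))  # None
print(first_unsorted_row(bad, (0, 1)))   # 2
```

Rows with equal keys are accepted, since they belong to the same group. The column indexes must be passed in the same order as the group by ports.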
Indexes
Make sure indexes are in place and tables have been analyzed.
You might be able to use index hints in the Source Qualifier.
