Version History:

Version  Date         By   Changes
0.1      28/December  RN   First draft
0.2      29/December  RN   Added UNIX portion (section 5)
0.3      07/Jan       RN   Merged section 2.2/2.3 from development standards.
0.       12/Oct       MF   Added Pre GO-LIVE Checks, Oracle Tuning.
5.0      28/Feb/2006  CHA  Added 4.22.4 section to the existing 4.0 version
0.6      07-Mar-2006  NBL  Additions to Section 4.22.4. Added section 6.5. All the above pertaining to performance tuning done by Paragon. TOC updated.
0.7

Contributors:

Name  Role  Location  Remarks

Approval:

Name            Role  Location
Peter Reinbold
Ming Fung

Reference Documents:

Name                     Author  Version  Remarks  Date
Unix system (OS) tuning
Informatica

CONTENTS

1 DOCUMENT DESCRIPTION
2 DOCUMENT ORGANISATION
3 INFORMATICA PC PRIMARY GUIDELINES
3.1 DATABASE UTILISATION
3.2 LOCALISATION
3.3 REMOVAL OF DATABASE DRIVEN SEQUENCE GENERATORS
3.4 SWITCH OFF THE "COLLECT PERFORMANCE STATISTICS"
3.5 SWITCH OFF THE VERBOSE LOGGING
3.6 UTILISE STAGING
3.7 ELIMINATE NON-CACHED LOOKUPS
3.8 TUNE THE DATABASE
3.9 AVAILABILITY OF SWAP & TEMP SPACE ON PMSERVER
3.10 SESSION SETTINGS
3.11 REMOVE ALL OTHER APPLICATIONS ON PMSERVER
3.12 REMOVAL EXTERNAL REGISTERED MODULES
4 INFORMATICA PC ADVANCED GUIDELINES
4.1 FILTER EXPRESSIONS
4.2 REMOVE DEFAULT'S
4.3 OUTPUT PORT INSTEAD OF VARIABLE PORT
4.4 DATATYPE CONVERSION
4.5 STRING FUNCTIONS
4.6 IIF CONDITIONS CAVEAT
4.7 EXPRESSIONS
4.8 UPDATE EXPRESSIONS FOR SESSION
4.9 MULTIPLE TARGETS / SOURCES ARE TOO SLOW
4.10 AGGREGATOR
4.11 JOINER
4.12 LOOKUPS
4.12.1 Lookups & Aggregators Fight.
4.13 MAPPLETS FOR COMPLEX LOGIC
4.14 DATABASE IPC SETTINGS & PRIORITIES
4.15 LOADING
4.16 MEMORY SETTINGS
4.17 REDUCE NUMBER OF OBJECTS IN A MAP
4.18 SLOW SOURCES - FLAT FILES
4.19 BREAK THE MAPPINGS OUT
4.19.1 Keep the mappings as simple as possible
4.20 READER/TRANSFORMER/WRITER THREADS AFFECT THE TIMING
4.21 SORTING - PERFORMANCE ISSUES
4.21.1 Sorted Input Conditions
4.21.2 Pre-Sorting Data
4.22 WORKFLOW MANAGER
4.22.1 Monitoring and Running a Session
4.22.2 Informatica suggests that each session takes roughly 1 to 1 1/2 CPU's.
4.22.3 Place some good server load monitoring tools on the PM Server in development
4.22.4 Parallel sessions/worklets.
4.23 CHANGE DATABASE PRIORITIES FOR THE PMSERVER DATABASE USER
4.24 CHANGE THE UNIX USER PRIORITY

4.25 TRY NOT TO LOAD ACROSS THE NETWORK
4.26 BALANCE BETWEEN INFORMATICA AND THE POWER OF SQL AND THE DATABASE
5 PERFORMANCE TUNING THE UNIX OS
5.1 PROCESS CHECK
5.2 IDENTIFYING & RESOLVING MEMORY PROBLEM
5.3 IDENTIFYING AND RESOLVING DISK I/O ISSUES
5.4 IDENTIFYING AND RESOLVING CPU OVERLOAD ISSUES
5.5 IDENTIFYING AND RESOLVING NETWORK ISSUES
6 IDENTIFYING ORACLE PERFORMANCE ISSUES
6.1 CHECKING PROBLEM PROCESSES.
6.2 GETTING AN EXPLAIN PLAN
6.3 CHECK STATS FOR AN OBJECT
6.4 PROCEDURE TO BE FOLLOWED PRIOR TO AN INTERFACE GOLIVE/POST GO-LIVE
6.5 LOAD METHODOLOGY FOR STAGING TABLES WITH INDEXES
6.6 INVESTIGATING PERFORMANCE ISSUES USING THE DATA DICTIONARY VIEWS.
6.6.1 V$sysstat and V$waitstat
6.6.2 Buffer Cache Hit Ratio Should be above 85%
6.6.3 Library Cache Hit Ratio
6.6.4 Dictionary Cache Hit Ratio should be less than 15%
6.6.5 Shared pool
6.6.6 Recursive/Total Calls.
6.6.7 Short/Total Table Scans
6.6.8 Redo Activity
6.6.9 Table Contention
6.6.10 CPU Parse Overhead
6.6.11 Latches
6.6.12 Rollback Segment Contention

1 Document Description
This document describes practices that the ETL development team can follow to get the best out of Informatica PowerCenter (ETL). It concentrates mainly on optimising the performance of the core ETL. For the ETL to achieve optimal performance, it is imperative to strike a good balance between hardware, OS, RDBMS and Informatica PowerCenter 7.1.1. The document can be used as a reference by the development team and the administration team.
2 Document Organisation
This document is divided into the following parts:
o Primary guidelines - Necessary for ETL to perform optimally; the fundamental approach for ETL design with Informatica PC 7.1.1
o Advanced guidelines - Guidelines that can be applied case by case, depending on the problem scenario / environment
o Optimising Unix system - Performance tuning the OS (Unix/Linux system)

3 Informatica PC Primary guidelines

3.1 Database utilisation
Utilise the database for significant data handling operations. Staging tables can be a real benefit for parallelism in operations and reduce processing time by a significant amount.

3.2 Localisation
Try to localise the relational objects as far as possible. Try not to use synonyms for remote databases. Using remote links for data processing and loading certainly slows things down.

3.3 Removal of Database driven Sequence Generators
Using database-oriented sequence generators proves to be a costly decision. It requires a wrapper function / stored procedure call, and utilising these stored procedures has caused performance to drop by a factor of 3. The slowness is not easily debugged: it can only be spotted in the Write Throughput column. To measure the cost, copy the map and replace the stored procedure call with an internal sequence generator for a test run; this is how fast you COULD run your map.

IF YOU MUST have a shared (database generated) sequence generator, then build a staging table from the flat file, add a SEQUENCE ID (SEQ_ID) column, and call a POST TARGET LOAD stored procedure to populate that column. Place the post target load procedure in the flat file to staging table load map. A single call inside the database, followed by a batch operation to assign sequences, is the fastest method for utilising shared sequence generators. If we are dealing with gigabytes or terabytes of information, this should save many hours of tuning.
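The staging-table approach in 3.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this project: Python's built-in sqlite3 module stands in for the target database, and the table stg_customer, the column seq_id and the function assign_sequence_ids are hypothetical names. The point it shows is that sequence values are assigned by one set-based statement after the load, instead of by a stored procedure call for every row.

```python
import sqlite3


def assign_sequence_ids(conn, start_value):
    """Populate seq_id for all unsequenced staging rows in one batch statement.

    Stands in for the POST TARGET LOAD stored procedure (hypothetical).
    Returns the number of rows that received a sequence value.
    Simplification: relies on SQLite's rowid being contiguous from 1;
    a real database would draw the values from its own sequence object.
    """
    cur = conn.execute(
        "UPDATE stg_customer SET seq_id = rowid + ? - 1 WHERE seq_id IS NULL",
        (start_value,),
    )
    conn.commit()
    return cur.rowcount


# Load the "flat file" rows into the staging table without sequence values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_customer (name TEXT, seq_id INTEGER)")
conn.executemany(
    "INSERT INTO stg_customer (name) VALUES (?)",
    [("alice",), ("bob",), ("carol",)],
)

# One call after the load assigns every sequence value in a single batch.
assigned = assign_sequence_ids(conn, start_value=1000)
rows = conn.execute(
    "SELECT name, seq_id FROM stg_customer ORDER BY seq_id"
).fetchall()
print(assigned, rows)
```

In a real load the UPDATE would live inside the database as the post target load procedure, so the session makes a single call rather than one call per row.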
3.4 Switch off the "Collect performance statistics"
This option has an impact, though it is minimal; switching it off reduces reliance on flat file operations. However, it may be useful to have this option switched ON during a tuning exercise.

3.5 Switch off the verbose logging
The session log has a tremendous impact on the overall performance of a session. Override the session log to NORMAL logging mode. In Informatica the logging mechanism is not parallel; it is embedded into the operations. Normal logging also prevents the Informatica metadata tables from growing. It is also a good idea to perform some automated housekeeping that truncates the logs in the Informatica metadata at regular intervals.

3.6 Utilise staging
If the source is a flat file, utilise a staging table. This way you can use SQL Loader or a BulkLoad utility. Keep the basic logic in the source load map and eliminate all lookups from the code. At this juncture, if the reader is slow, check the following:
o If there is an item in the configuration file which sets a value to throttle the reader, it will limit the read throughput.

o Move the flat file to local disk; don't read from the network or from RAID.

3.7 Eliminate non-cached lookups
Using non-cached lookups hampers performance significantly, especially if the lookup table is a "growing" or "updated" target table. In that case the indexes change during the operation and the optimiser loses track of the index and its statistics. If possible use a staging table; this also allows a joiner to be used, which can increase performance to a large extent.

3.8 Tune the database
Estimate small, medium and large source data set sizes in terms of number of rows and average bytes per row. Also estimate the throughput for each and the turnaround time for the load. The DBA should be provided with this information, along with the tables that are expected to be high read / write. The DBA should assign the right table to the right disk space, which can make a difference.

3.9 Availability of SWAP & TEMP space on PMSERVER
Having too little disk space for SWAP and TEMP could slow down the performance of the entire server. To monitor this, watch the disk space while sessions are running. Without monitoring it is difficult to assess the reason, especially if the mapping contains Aggregators, lookups that use disk cache, or a Joiner with heterogeneous sources.

3.10 Session Settings

A major chunk of tuning can be done in the session. Switching on "Collect performance statistics" shows which parameters need to be set at session level, or at least what has to be changed in the database. Basically one should try to achieve OPTIMAL READ, OPTIMAL THROUGHPUT and OPTIMAL WRITE. Over-tuning one of these pieces can ultimately slow down the session.

The Index Cache and Data Cache are dynamically allocated first. As soon as the session is initialised, the memory for the data and index caches is set up. Their sizes depend upon the session settings. The Reader DTM is also based on a dynamic allocation algorithm; it uses the available memory in chunks. The size of a chunk is determined by the session setting "Default Buffer block size".

Read the session throughput, then tune for the reader: see what the settings are, and send the write output to a flat file for less contention. Check the Throttle reader setting; increase the default buffer size in steps of 64K. If the reader throughput still appears to increase during the session and then stabilise, try increasing Shared Session Memory from 12 MB to 24 MB.

Check the writer throughput performance statistics to make sure there is NO writer bottleneck. If you have a slow writer, change the map to a single target table at a time to see which target is causing the slowness and tune it.

NOTE: if the reader session to flat file just doesn't ever "get fast", then we have some basic map tuning to do. Try to merge expression objects, set the lookups to unconnected (for re-use if possible), and check the Index and Data cache settings if aggregations or lookups are being performed.

If we have a slow writer, change the map to a single target table at a time and see which target is causing the "slowness", then tune it. Make copies of the original map, and break down the copies. Once the slowest of the N targets is discovered, talk to the DBA about partitioning the table, updating statistics, removing indexes during load, etc. There are many database things you can do here.

Remember that the TIMING is affected by the READER/TRANSFORMER/WRITER threads. With complex mappings, don't forget that each ELEMENT (field) must be weighed; in this light a firm understanding of how to read the performance statistics generated by Informatica becomes important. In other words, if the reader is slow, then the rest of the threads suffer; if the writer is slow, same effect. A pipe is only as big as its smallest diameter. A chain is only as strong as its weakest link.

3.11 Remove all other applications on PMServer
Except for database staging: PMServer plays well with an RDBMS and its engine, but doesn't play well with application servers, in particular Java Virtual Machines, Web Servers, Security Servers, applications and Report Servers. All of these items should be moved to other machines; this is critical to improving performance on the PMServer machine.

3.12 Removal external registered modules
As far as possible, avoid APIs which call external objects, as this has proven to be slow. External modules might exhibit speed problems; instead try pre-processing / post-processing with SED, AWK or GREP.
4 Informatica PC Advanced guidelines

4.1 Filter Expressions:
Create the filter (TRUE / FALSE) inside a port expression upstream. Complex filter expressions slow down the mapping, whereas the same logic evaluates faster in an Expression transformation with an output port for the result. Place the expression in an EXPRESSION transformation upstream from the filter, compute a single numerical flag (1 for TRUE, 0 for FALSE) as an output port, and push this flag into the filter. This will have a positive impact on performance.

Use the Filter transformation early in the mapping. To maximise session performance, keep the Filter transformation as close as possible to the sources in the mapping. Rather than passing rows that you plan to discard through the mapping, filter out unwanted data early in the flow of data from sources to targets.

Use the Source Qualifier to filter. The Source Qualifier transformation provides an alternate way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows when they are read from a source. The main difference is that the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Since a source qualifier reduces the number of rows used throughout the mapping, it provides better performance.

4.2 Remove Default's:
Having a default value, including "ERROR", slows down the session. It causes unnecessary evaluation of values for every data element in the mapping. The best way to allot a default value is to have a variable in the expression which returns the expected value on the condition. This is faster than assigning a default value.

4.3 Output port instead of Variable port
Variables are good for static and state-driven logic, but they slow down processing because they are allocated each time a row passes through the expression object. Try to use an output port instead of a variable port.

4.4 Datatype conversion:
Avoid implicit datatype conversion caused by connecting an Integer port to a String port or vice versa. Instead use a function that converts the data explicitly; this saves PMServer from having to decide on the datatype conversion at run time.

4.5 String Functions:
String functions such as LTRIM and RTRIM are costly, because they involve allocating and re-allocating memory within the READER thread. Where string operations on the data are unavoidable, consider the following. Use varchar/varchar2 datatypes in database sources; if the source is a file, make it a delimited one. Try to apply LTRIM/RTRIM to the data in the SQL that reads it from the database; this is much faster than performing it in the ETL.

4.6 IIF Conditions caveat
As far as possible, design logic that avoids IIF, as IIF conditions are costly in any language. IIF creates multiple-path logic inside the application and uses the decision to navigate, which might affect performance. Another option is to use Oracle DECODE in the source qualifier.

4.7 Expressions
Expressions like IS_SPACES, IS_NUMBER etc. affect performance, as they are data validation expressions that have to scan the entire string to determine the result.
Try to avoid using these expressions unless there is an absolute requirement for them.

4.8 Update Expressions for session:
If the session option Update Else Insert is ON, performance will definitely slow down, because Informatica has to perform two operations for each row: an update on the primary key and then, if that returns 0 rows, an insert. As an alternative, an Update Strategy can be used, where rows are marked with DD_UPDATE or DD_INSERT inside the mapping. In this case the session settings can be INSERT & UPDATE AS UPDATE or UPDATE AS INSERT.

4.9 Multiple targets / sources are too slow:
Mappings with multiple targets can sometimes eat up performance. If the architecture permits, make one map per target. If the sources come from different FTP locations and are flat files, the ideal choice is to FTP the files to the ETL server and then process them.

4.10 Aggregator
If the mapping contains more than one aggregator, the session will run slowly unless the cache directory is fast and the disk drive access speed is high. Placing the aggregator towards the end might be another option; however this also brings down performance, as all the I/O activity becomes a bottleneck in Informatica.

Mapplets are a good means of replicating data logic, but if a mapplet contains an aggregator, the performance of the mapping that contains the mapplet is still affected. Reduce the number of aggregators in the entire mapping to 1 if possible; otherwise split the mapping into several mappings to break down the logic.

Sorted input to the aggregator increases performance to a large extent; however, if sorted input is enabled and the data passed to the aggregator is not sorted, the session will fail.

Set the cache size to the amount calculated with the formulae below.
Index size = (sum of column size in group-by ports + 17) X number of groups
Data size = (sum of column size of output ports + 7) X number of groups

4.11 Joiner
Perform joins in a database. Performing a join in a database is faster than performing a join in the session. Use one of the following options:
o Create a pre-session stored procedure to join the tables in a database.
o Use the Source Qualifier transformation to perform the join.

For optimal performance and disk storage, designate as the master source the source with the lower number of rows. With a smaller master source, the data cache is smaller and the search time is shorter.

Set the cache size to the amount calculated with the formulae below.
Index size = (sum of master column size in join condition + 16) X number of rows in master table
Data size = (sum of master column size NOT in join condition but on output ports + 8) X number of rows in master table

4.12 Lookups
When caching is enabled, the PowerCenter Server caches the lookup table and queries the lookup cache during the session. When this option is not enabled, the PowerCenter Server queries the lookup table on a row-by-row basis. Eliminate too many lookups.
The more lookups there are, the less memory the DTM reader/writer/transform threads are left with to run efficiently. With too many lookups one needs to trade memory contention for disk contention. The memory contention might be worse than the disk contention, as the OS ends up swapping in and out of TEMP/SWAP disk space with small block sizes to try to locate the lookup row, and as the row goes from lookup to lookup, the swapping becomes worse.

Both lookups and aggregators require memory space, and each of them requires an Index cache and a Data cache; they share the same heap segments. Hence, care should be taken when designing a mapping that consumes this memory.

Where a lookup uses more than one lookup condition, place the conditions with an equal sign first in order to optimise lookup performance.

In the case of a cached lookup, an ORDER BY clause is issued in the SQL statement used to create the cache. Columns used in the ORDER BY clause should be indexed. The session log will contain the ORDER BY statement.

Tips on Caches:

Cache small lookup tables

Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.

Use a persistent lookup cache for static lookup tables
If the lookup table does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Informatica Server then saves and reuses cache files from session to session, eliminating the time required to read the lookup table.

Override the ORDER BY statement for cached lookups
By default, the Informatica Server generates an ORDER BY statement for a cached lookup that contains all lookup ports. To increase performance, you can suppress the default ORDER BY statement and enter an override ORDER BY with fewer columns.

Place conditions with an equality operator (=) first
If a Lookup transformation specifies several conditions, you can improve lookup performance by placing all the conditions that use the equality operator first in the list of conditions that appear under the Condition tab.

Consider the following for calculating caches for a lookup:

Attribute            Method
Minimum Index Cache  200 * [(sum of column size) + 16] over all condition ports
Maximum Index Cache  # rows in lookup table * [(sum of column size) + 16] * 2 over all condition ports
Minimum Data Cache   # rows in lookup table * [(sum of column size) + 8] over all output ports (not condition ports)
Maximum Data Cache   2 * minimum data cache

4.12.1 Lookups & Aggregators Fight.
The lookups and the aggregators fight for memory space, as discussed above. Each requires an Index Cache and a Data Cache, and they "share" the same HEAP segments inside the core.
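To make the shared-memory pressure concrete, the cache formulae from sections 4.10 and 4.12 can be added up for one mapping. The sketch below is only an estimate under assumptions: it treats the formula results as bytes (the formulae do not state a unit), and the function names, port sizes, row counts and the 100 MB budget are all hypothetical.

```python
def aggregator_cache_bytes(group_by_sizes, output_sizes, num_groups):
    """Index and data cache estimate for an Aggregator (section 4.10 formulae)."""
    index = (sum(group_by_sizes) + 17) * num_groups
    data = (sum(output_sizes) + 7) * num_groups
    return index, data


def lookup_cache_bytes(condition_sizes, output_sizes, num_rows):
    """Maximum index cache and minimum data cache for a Lookup (section 4.12 table)."""
    max_index = num_rows * (sum(condition_sizes) + 16) * 2
    min_data = num_rows * (sum(output_sizes) + 8)
    return max_index, min_data


def fits_in_budget(cache_sizes, budget_bytes):
    """Sum all (index, data) pairs and compare the total with a memory budget."""
    total = sum(index + data for index, data in cache_sizes)
    return total, total <= budget_bytes


# Hypothetical mapping: one lookup on a 1,000,000-row table (one 10-byte
# condition port, two output ports) and one aggregator with 50,000 groups.
lookup = lookup_cache_bytes([10], [30, 20], num_rows=1_000_000)
aggregator = aggregator_cache_bytes([4, 20], [8, 8], num_groups=50_000)

# Assumed budget of 100 MB for all caches of the session.
total, ok = fits_in_budget([lookup, aggregator], budget_bytes=100 * 1024 * 1024)
print(lookup, aggregator, total, ok)
```

With these made-up sizes the lookup alone needs about 110 MB, so the combined total exceeds the assumed budget; this is the situation in which separating the lookup and the aggregation into different maps is worth considering.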
Particularly in the 4.6 / 1.6 products and prior, these memory areas become critical, and when dealing with many rows the session is almost certain to cause the server to "thrash" memory in and out of the OS swap space. If possible, separate the maps: perform the lookups in the first map and land the data in an intermediate target table, then have a second map read that table and perform the aggregation. This also provides the option of doing the group by within the database, which is another speed improvement.

4.13 Mapplets for complex Logic
It is a good idea to break complex logic into mapplets. This makes the mapping easier to manage and is an efficient way of breaking down the business logic. Always remember: the shorter the distance between source and target, the better the performance. With complex mappings the READER, WRITER and TRANSFORM threads affect the timing, i.e. if the reader is slow, the rest of the threads suffer; similarly, if the writer is slow, the effect is the same.

4.14 Database IPC settings & Priorities

If PMServer and the Oracle instance are running on the same server, use an IPC connection instead of a TCP/IP connection. Change the protocol in the TNSNames.ORA and Listener.ORA files, and restart the listener on the server. This can only be used locally; where it applies, the speed increases by between 2x and 5x.

Another option is to prioritise the database login that Informatica uses to execute its tasks. When these tasks log in to the database they then take precedence over others. This is particularly helpful for performance when the bulk loader or SQL Loader is used.

4.15 Loading
Make sure indexes and constraints are removed before loading into relational targets; they can be re-created as soon as the load is completed. This helps to boost performance in bulk data loads.

The smaller the commit interval, the longer the session takes to complete. Set an appropriate commit interval; anything above 50K is good.

Partitioning the data while loading is another wise option. Informatica provides the following partition types:
o Key Range
o Hash Key
o Round Robin
o Pass Through

When partitioning individual transformations it is advisable to go for the following:
o Aggregator Cache - Use hash auto key
o Lookup Cache - Use hash auto key partition type, equality condition
o Sorter Cache - Use hash auto key, pass-through or key range

4.16 Memory Settings
Session Shared Memory Size controls the total amount of memory used to buffer rows internally by the reader and writer. Set session shared memory between 12MB and 25MB; remember that increasing shared memory beyond this does not guarantee an increase in performance, rather it decreases performance.

Buffer Block Size controls the size of the blocks that move in the pipeline. Set the shared buffer block size to around 128K; Informatica uses this for handling blocks of rows.

If the server has more than 12 GB of RAM, shared memory can be increased to between 1 and 2 GB.
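One simple way to see how the two settings relate is to count how many buffer blocks fit into the shared memory. The sketch below is plain arithmetic on the values quoted above, not a description of how the server actually allocates memory; the function name buffer_blocks is hypothetical.

```python
KB = 1024
MB = 1024 * KB


def buffer_blocks(shared_memory_bytes, block_size_bytes):
    """Number of whole buffer blocks that fit into the session shared memory."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    return shared_memory_bytes // block_size_bytes


# Values quoted in this section: 12 MB to 24 MB of shared memory, 128K blocks.
blocks_12mb = buffer_blocks(12 * MB, 128 * KB)
blocks_24mb = buffer_blocks(24 * MB, 128 * KB)

# Raising only the block size (here to an assumed 1 MB) leaves few blocks.
blocks_large_block = buffer_blocks(12 * MB, 1 * MB)
print(blocks_12mb, blocks_24mb, blocks_large_block)
```

The last line illustrates why the block size should not be raised on its own: with the shared memory unchanged, larger blocks mean fewer blocks available to move rows through the pipeline.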
Also, the shared buffer block size should be set relative to the shared memory setting.

4.17 Reduce Number of OBJECTS in a map
Frequently, the idea of these tools is to make the "data translation map" as easy as possible. All too often, that means creating one expression for each throughput/translation (taking it to an extreme, of course). Each object adds computational overhead to the session, and timings may suffer. If performance is an issue or a goal, integrate several expressions into one expression object, thus reducing the "object" overhead. Doing so could speed up the map.

4.18 Slow Sources - Flat Files
If we have slow sources, and these sources are flat files, we can look at some of the following possibilities.

If the sources reside on a different machine, and we have opened a named pipe to get them across the network, then we have (potentially) opened a can of worms: we have introduced the network speed as a variable in the speed of the flat file source. Try to compress the source file, FTP PUT it on the local machine (local to PMServer), decompress it, and then utilise it as a source.

If you are reaching across the network to a relational table, and the session is pulling many rows (over 10,000), then the source system itself may be slow. We may be better off using a source system extract program to dump the data to a file first, and then following the instructions above. There is also something your SAs and Network Ops folks could do if necessary (this is covered in detail in the advanced section): they could backbone the two servers together with a dedicated network line (no hubs, routers, or other items in between the two machines). At the very least, they could put the two machines on the same sub-net.

If the file is local to PMServer but is still slow, examine the location of the file (which device it is on). If it is not on an INTERNAL DISK, it will be slower than if it were on an internal disk (the C drive for those on NT). "Local" does not mean that a UNIX file LINK exists locally while the file is remote; it means the actual file is local.

4.19 Break the mappings out

One per target. If necessary, one per source per target. Why does this work? Eliminating multiple targets in a single mapping can greatly increase speed. Basically it is like this: one session per map/target. Each session establishes its own database connection. Because of the unique database connection, the DBMS server can now handle the insert/update/delete requests in parallel against multiple targets. It also allows each session to be specified for its intended purpose (no longer mixing a data driven session with INSERTS only to a single target).

Each session can then be placed into a batch marked "CONCURRENT" if preferences allow. Once this is done, the parallelism of mappings and sessions becomes obvious. Studies of parallel processing have shown again and again that operations can sometimes be completed in half the time of their original counterparts merely by streaming them at the same time.

With multiple targets in the same mapping, a single database connection has to handle multiple diverse database statements, sometimes hitting this target, other times hitting that target. In this situation it is extremely difficult for Informatica (or any other tool for that matter) to build BULK operations, even though "bulk" is specified in the session. Remember that "BULK" means this is your preference, and that the tool will revert to NORMAL load if it can't provide a BULK operation on a series of consecutive rows. Obviously, data driven then forces the tool down several other layers of internal code before the data can actually reach the database.

4.19.1 Keep the mappings as simple as possible
Bury complex logic (if you must) in a mapplet. If you can avoid complex logic altogether, that would be the key. The old rule of thumb applies here (common sense): the straighter the path between two points, the shorter the distance. Translated: the shorter the distance between the source qualifier and the target, the faster the data loads.

4.20 READER/TRANSFORMER/WRITER threads affect the TIMING
With complex mappings, don't forget that each ELEMENT (field) must be weighed; in this light a firm understanding of how to read the performance statistics generated by Informatica becomes important. In other words, if the reader is slow, then the rest of the threads suffer; if the writer is slow, same effect. A pipe is only as big as its smallest diameter. A chain is only as strong as its weakest link.

4.21 Sorting - performance issues
We can improve Aggregator transformation performance by using the Sorted Input option. When the Sorted Input option is selected, the Informatica Server assumes all data is sorted by group. As the Informatica Server reads rows for a group, it performs aggregate calculations as it reads. When necessary, it stores group information in memory. To use the Sorted Input option, we must pass sorted data to the Aggregator transformation. We can gain added performance with sorted ports when we partition the session.

When Sorted Input is not selected, the Informatica Server performs aggregate calculations as it reads. However, since the data is not sorted, the Informatica Server stores data for each group until it reads the entire source, to ensure all aggregate calculations are accurate.

For example, one Aggregator has the STORE_ID and ITEM Group By ports, with the Sorted Input option selected. When you pass the following data through the Aggregator, the Informatica Server performs an aggregation for the three records in the 101/battery group as soon as it finds the new group, 201/battery:

STORE_ID  ITEM       QTY  PRICE
101       'battery'  3    2.99
101       'battery'  1    3.19
101       'battery'  2    2.59
201       'battery'  4    1.59
201       'battery'  1    1.99

If you use the Sorted Input option and do not presort data correctly, the session fails.
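The behaviour described above can be illustrated with a small streaming aggregation. This is a sketch of the general technique, not of Informatica's internals: the function aggregate_sorted is hypothetical, it assumes ascending sort order, and it sums QTY per group because the example above does not say which aggregate is computed.

```python
def aggregate_sorted(rows):
    """Yield (store_id, item, total_qty) per group, reading sorted rows once.

    Only the running total of the current group is held in memory. A group
    is emitted as soon as the next group starts. Input that is not sorted
    by (store_id, item) raises ValueError, mirroring the failed session.
    """
    current_key = None
    total_qty = 0
    for store_id, item, qty, _price in rows:
        key = (store_id, item)
        if current_key is not None and key < current_key:
            raise ValueError(f"input not sorted: {key} after {current_key}")
        if key != current_key:
            if current_key is not None:
                yield (*current_key, total_qty)
            current_key, total_qty = key, 0
        total_qty += qty
    if current_key is not None:
        yield (*current_key, total_qty)


# The five rows from the example above, already sorted by STORE_ID and ITEM.
rows = [
    (101, "battery", 3, 2.99),
    (101, "battery", 1, 3.19),
    (101, "battery", 2, 2.59),
    (201, "battery", 4, 1.59),
    (201, "battery", 1, 1.99),
]
result = list(aggregate_sorted(rows))
print(result)
```

Without sorted input, the equivalent code would need a dictionary holding a running total for every group until the last row has been read, which is the extra memory use described for the unsorted case.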
4.21.1 Sorted Input Conditions
Do not use the Sorted Input option if any of the following conditions are true:
• The aggregate expression uses nested aggregate functions.
• The session uses incremental aggregation.
• Input data is data-driven. You choose to treat source data as data driven in the session properties, or the Update Strategy transformation appears before the Aggregator transformation in the mapping.

If we use the Sorted Input option under these circumstances, the Informatica Server reverts to default aggregate behaviour, reading all values before performing aggregate calculations.

4.21.2 Pre-Sorting Data
To use the Sorted Input option, you pass sorted data through the Aggregator. Data must be sorted as follows:

• By the Aggregator group by ports, in the order they appear in the Aggregator transformation.
• Using the same sort order configured for the session.

If data is not in strict ascending or descending order based on the session sort order, the Informatica Server fails the session. For example, if you configure a session to use a French sort order, data passing into the Aggregator transformation must be sorted using the French sort order.

If the session uses file sources, you can use an external utility to sort file data before starting the session. If the session uses relational sources, we can use the Number of Sorted Ports option in the Source Qualifier transformation to sort group by columns in the source database. Group By columns must be in the exact same order in both the Aggregator and Source Qualifier transformations.

4.22 Workflow Manager
The Informatica Server moves data from sources to targets based on workflow and mapping metadata stored in a repository. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. The Informatica Server runs workflow tasks according to the conditional links connecting the tasks.
• Monitor, add, edit, delete Informatica Server info in the repository
• Stop the Informatica Server
• Configure database, external loader, and FTP connections
• Manage sessions and batches - create, edit, delete, copy/move within a folder, start/stop, abort sessions, view session logs, details, and session performance details.

[Figure: data flow between the Source, the Informatica Server and the Target. Labels: Source Data, Transformed data, Instructions from Metadata, Repository.]

4.22.1 Monitoring and Running a Session:
When the Informatica Server runs a Session task, the Workflow Monitor creates session details that provide load statistics for each target in the mapping. We can view session details while the session runs or after the session completes. Create a worklet to reuse a set of workflow logic in several workflows. Use the Worklet Designer to create and edit worklets.

4.22.2 Informatica suggests that each session takes roughly 1 to 1 1/2 CPUs.
In keeping with this, Informatica plays well with RDBMS engines on the same machine, but does NOT get along (performance-wise) with ANY other engine (reporting engine, Java engine, OLAP engine, Java virtual machine, etc.).

4.22.3 Place some good server load monitoring tools on the PM Server in development
Watch it closely to understand how the resources are being utilized, and where the hot spots are. Try to follow the recommendations; it may mean upgrading the hardware to achieve throughput. Look into EMC's disk storage array: while expensive, it appears to be extremely fast, and it may improve performance in some cases by up to 50%.

4.22.4 Parallel sessions/worklets.
In a workflow, depending on the business logic, configure the session/worklet tasks to run in parallel instead of sequentially. This will reduce the workflow process time to a large extent. However, creating too many parallel tasks may decrease performance.
For example, this logic has been implemented for the Acs_London workflow. Previously the balances, postings and contracts worklets were arranged sequentially. The postings worklet has no dependency on any other worklet, so it was re-arranged to run in parallel with the balances worklet. The balances worklet had three sequential staging sessions; these three tasks were re-arranged to run in parallel. The execution time for the balances staging sessions is now reduced to 40%.
Earlier, the ICON worklet was designed with sequential tasks which create 6 output files. Since there were no dependencies between the files, this worklet was re-arranged to load in parallel. This reduced the time to 30%. Funding inter-company tasks were moved to another worklet.
Paragon workflow parallelism: out of the 17 sessions loading Paragon data into staging tables, all of which were scheduled to run sequentially, 2 sessions (corporate load and cp mapping load) were set to run sequentially and the remaining 15 sessions were changed to run in parallel. Due to the parallelism and other changes at the database level (described in detail under section 6.5), performance improved by 100%.
There is no fixed limit on the number of parallel tasks; the achievable performance depends on the available Informatica server resources.

4.23 Change Database Priorities for the PMServer Database User

Prioritizing the database login that any of the connections use (set up in Server Manager) can assist in changing the priority given to the tasks Informatica executes. These tasks, once logged in to the database, can then override others. Sizing memory for these tasks (in shared global areas, and server settings) must be done if priorities are to be changed. If BCP, SQL*Loader or some other bulk-load facility is utilized, these priorities must also be set. This can greatly improve performance. Again, it is only suggested as a last-resort method, and does not substitute for tuning the database or the mapping processes. It should only be utilized when all other methods have been exhausted (tuned). Keep in mind that this should be restricted to the production machines, and only to certain instances where the load cycle that Informatica is utilizing is NOT impeding other users.

4.24 Change the Unix User Priority
In order to gain speed, the Informatica UNIX user must be given a higher priority. The UNIX SA should understand what it takes to rank the UNIX logins, and grant priorities to particular tasks. Alternatively, simply have pmserver executed under a super user (su) command; this will take care of reprioritizing Informatica's core process. This should only be used as a last resort, once all other tuning avenues have been exhausted, or if we have a dedicated UNIX machine on which Informatica is running.

4.25 Try not to load across the network
If at all possible, try to co-locate the PMServer executable with a local database. Not having the database local means: 1) the repository is across the network (slow), 2) the sources/targets are across the network, also potentially slow. If we have to load across the network, at least try to localize the repository on a database instance on the same machine as the server. Also try to co-locate the two machines (pmserver and target database server) on the same sub-net, even the same hub if possible. This eliminates unnecessary routing of packets all over the network. Having a localized database also allows us to set up a target table locally, which you can then "dump" following a load, ftp to the target server, and bulk-load into the target table. This works extremely well for situations where append or complete refresh is taking place.
4.26 Balance between Informatica and the power of SQL and the database
Try to utilize the DBMS for what it was built for: reading/writing/sorting/grouping/filtering data en masse. Use Informatica for the more complex logic, outer joins, data integration, multiple source feeds, etc. The balancing act is difficult without DBA knowledge. In order to achieve a balance, we must be able to recognize which operations are best done in the database, and which are best done in Informatica. This does not detract from the use of the ETL tool; rather, it enhances it. It is a MUST if you are performance tuning for high-volume throughput.

5 Performance tuning the UNIX OS

5.1 Process check
Run ps -axu to check for the following items:
o Are there any processes waiting for disk access or for paging? If so, check the I/O and memory subsystems.
o Which processes are using most of the CPU, and which are using most of the memory? This may help you distribute the workload better.

5.2 Identifying & resolving memory problems
Run vmstat -S 5 to confirm memory problems and check for the following:
o Are page-outs occurring consistently? If so, you are short of memory.
o Are there a high number of address translation faults? (System V only) This suggests a memory shortage.
o Are swap-outs occurring consistently? If so, you are extremely short of memory. Occasional swap-outs are normal; BSD systems swap out inactive jobs. Long bursts of swap-outs mean that active jobs are probably falling victim and indicate extreme memory shortage.
If you don't have vmstat -S, look at the w and de fields of vmstat. These should ALWAYS be zero.
If memory seems to be the bottleneck of the system, try the following remedial steps:
o Reduce the size of the buffer cache, if your system has one, by decreasing BUFPAGES. The buffer cache is not used in System V.4 and SunOS 4.X systems. Making the buffer cache smaller will hurt disk I/O performance.
o If you have statically allocated STREAMS buffers, reduce the number of large (2048- and 4096-byte) buffers. This may reduce network performance, but netstat -m should give you an idea of how many buffers you really need.
o Reduce the size of your kernel's tables. This may limit the system's capacity (number of files, number of processes, etc.).
o Try running jobs requiring a lot of memory at night. This may not help the memory problems, but you may not care about them as much.
o Try running jobs requiring a lot of memory in a batch queue. If only one memory-intensive job is running at a time, your system may perform satisfactorily.
o Try to limit the time spent running sendmail, which is a memory hog.
o If you don't see any significant improvement, add more memory.

5.3 Identifying and Resolving Disk I/O Issues
Use iostat to check I/O load and utilization, as well as CPU load. iostat can be used to monitor the I/O load on the disks of the UNIX server, and permits monitoring the load on specific disks.
Take notice of how fairly disk activity is distributed among the system disks. If it is not evenly distributed, are the most active disks also the fastest disks? The following might help to rectify I/O problems:
o Reorganize your file systems and disks to distribute I/O activity as evenly as possible.
o Using symbolic links helps to keep the directory structure the same throughout while still moving the data files that are causing I/O contention.
o Use your fastest disk drive and controller for your root file system; this will almost certainly have the heaviest activity. Alternatively, if single-file throughput is important, put performance-critical files into one file system and use the fastest drive for that file system.
o Put performance-critical files on a file system with a large block size: 16KB or 32KB (BSD).
o Increase the size of the buffer cache by increasing BUFPAGES (BSD). This may hurt your system's memory performance.
o Rebuild your file systems periodically to eliminate fragmentation (backup, build a new file system, and restore).
o If you are using NFS and using remote files, look at your network situation. You don't have local disk I/O problems.
o Check memory statistics again by running vmstat 5 (sar -rwpg). If your system is paging or swapping consistently, you have memory problems; fix the memory problem first. Swapping makes performance worse.
If the system has a disk capacity problem and is constantly running out of disk space, try the following actions:
o Write a find script that detects old core dumps, editor backup and auto-save files, and other trash, and deletes it automatically. Run the script through cron.
o Use a smaller block size on file systems that are mostly small files (e.g., source code files, object modules, and small data files).

5.4 Identifying and Resolving CPU Overload Issues
Use sar -u to check for CPU loading. This provides the %usr (user), %sys (system), %wio (waiting on I/O), and %idle (% of idle time). A target goal should be %usr + %sys = 80 and %wio = 10, leaving %idle at 10. If %wio is higher, the disk and I/O contention should be investigated to eliminate the I/O bottleneck on the UNIX server. If the system shows a heavy load of %sys, and %usr has a high %idle, this is indicative of memory contention and swapping/paging problems. In this case, it is necessary to make memory changes to reduce the load on the system server.
When you run iostat 5 as above, also observe the CPU idle time. Is the idle time always 0, without letup? It is good for the CPU to be busy, but if it is busy 100 percent of the time, work must be piling up somewhere. This points to CPU overload.
o Eliminate unnecessary daemon processes. rwhod and routed are particularly likely to be performance problems, but any savings will help.
o Get users to run jobs at night with any queuing system that is available. You may not care if the CPU (or the memory or I/O system) is overloaded at night, provided the work is done in the morning.
o Using nice to lower the priority of CPU-bound jobs will improve interactive performance.
Also, using nice to raise the priority of CPU-bound jobs will expedite them but will hurt interactive performance. In general though, using nice is really only a temporary solution. If your workload grows, it will soon become insufficient. Consider upgrading your system, replacing it, or buying another system to share the load.

5.5 Identifying and Resolving Network Issues
One can suspect problems with network capacity or with data integrity if users experience slow performance when they are using rlogin or when they are accessing files via NFS.
Look at netstat -i. If the number of collisions is large, suspect an overloaded network. If the number of input or output errors is large, suspect hardware problems. A large number of input errors indicates problems somewhere on the network. A large number of output errors suggests problems with your system and its interface to the network.
If collisions and network hardware are not a problem, figure out which system appears to be slow. Use spray to send a large burst of packets to the slow system. If the number of dropped packets is large, the remote system most likely cannot respond to incoming data fast enough. Look to see if there are CPU, memory or disk I/O problems on the remote system. If not, the system may just not be able to tolerate heavy network workloads. Try to reorganize the network so that this system isn't a file server.
A large number of dropped packets may also indicate data corruption. Run netstat -s on the remote system, then spray the remote system from the local system and run netstat -s again. If the increase in UDP socket full drops (as indicated by netstat) is equal to or greater than the number of dropped packets that spray reports, the remote system is a slow network server. If the increase in socket full drops is less than the number of dropped packets, look for network errors.
Run nfsstat and look at the client RPC data. If the retrans field is more than 5 percent of calls, the network or an NFS server is overloaded. If timeout is high, at least one NFS server is overloaded, the network may be faulty, or one or more servers may have crashed. If badxid is roughly equal to timeout, at least one NFS server is overloaded. If timeout and retrans are high, but badxid is low, some part of the network between the NFS client and server is overloaded and dropping packets.
Try to prevent users from running I/O-intensive programs across the network. The grep utility is a good example of an I/O-intensive program. Instead, have users log into the remote system to do their work.
Reorganize the computers and disks on your network so that as many users as possible can do as much work as possible on a local system. Use systems with good network performance as file servers.
If you are short of STREAMS data buffers and are running SunOS 4.0 or System V.3 (or earlier), reconfigure the kernel with more buffers.

6 Identifying Oracle performance Issues
When reviewing Oracle performance issues on the box be aware of the following:
• If there is a difference in performance between UAT and PRD, do not assume that the two environments are configured the same. Bear in mind, though, that UAT is the DR environment for PRD, so the configuration is NOT likely to be different.
• Always start by extracting the slow SQL being executed and get an explain plan in UAT and PRD. It is likely the explain plans won't be the same, in which case drill down to ask why. Start simple with Oracle 101 analysis.
• Always consider what else is running on the environment. It may not be the slow SQL which is the problem; you may have another process holding onto a resource. This is especially true when you get inconsistent performance from a process.
• Concurrency/parallelism is NOT always the solution to all your problems; in fact, if you don't have idle CPU it causes more problems. Use concurrency/parallelism sensibly and with caution.
• Keep in mind there is no magic switch when tuning slow Oracle jobs. The most important weapon we have is visibility and information on what is being executed at a point in time.
• 70% of Oracle performance issues are related to the SQL being executed.
• Inconsistency between environments can be attributed to STATS; this should be one of the first areas to be reviewed.
• Consult the DBA, but only after you've followed the necessary steps above. The DBA team can set up the stats pack report and drill down into further detail if it is established that it is not an application problem, or where we don't have enough visibility/information.

6.1 Checking Problem Processes.

There are a number of ways to monitor and check for problem Oracle processes.
• Use the monitoring in tools like TOAD to review Oracle jobs in real time.
• Request that the DBA set up/switch on STATS PACK and give you a point-in-time report. The STATS PACK report can give you the worst queries running at a point in time.
Whilst a problem process is running we should consider using TOAD (or a tool like it) to review what this process is doing. This can be done via the Kill/Trace Session tool in TOAD's DBA menu. Using this tool we can review exactly what the process is executing and also review statistics based on physical reads, connect time, user, block gets, block changes and consistent gets. Consider taking the SQL and getting a plan.
Using this tool we can do a very simple analysis of the process by taking the SPID in the Kill/Trace Session tool and linking it back to the PID in the "top" output on the server's UNIX command line. For example, an Oracle process with SPID 22301 in TOAD can be linked to PID 22301 in top, where it is using 12.1% of a CPU. If we see a process approaching 100% of a CPU, this process clearly needs to be tuned.
Be careful when using "top" on the UNIX command line as it can be expensive; quit out of it when it is no longer needed.
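The CPU check described above can also be scripted when top is considered too expensive to leave running. The following is only a minimal sketch, assuming the output of `ps -eo pid,pcpu,comm` has been captured as text; the function name `busy_processes`, the 90% threshold and the sample rows (apart from PID 22301 at 12.1%) are hypothetical and not part of the procedure above.

```python
def busy_processes(ps_output, threshold=90.0):
    """Return (pid, cpu%) pairs at or above threshold.

    ps_output is text in the layout produced by 'ps -eo pid,pcpu,comm':
    one header line, then PID, %CPU and command per row.
    """
    flagged = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # ignore malformed rows
        pid, pcpu = int(fields[0]), float(fields[1])
        if pcpu >= threshold:
            flagged.append((pid, pcpu))
    return flagged


# Hypothetical sample; PID 22301 mirrors the example in the text.
SAMPLE = """\
  PID %CPU COMMAND
22301 12.1 oracle
22417 98.7 oracle
  311  0.0 cron
"""

print(busy_processes(SAMPLE))
```

Any PID flagged this way can then be matched against the SPID column in the Kill/Trace Session tool to find the SQL being executed.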

6.2 Getting an explain plan

The first thing that should be done when reviewing any problem process is to get the explain plan; this will give you a clue as to what the problem may or may not be.
• Compare the explain plan between an environment where the process runs quickly and an environment where it does not. If the plans are different between environments we could be looking at a stats problem, missing indexes or different data volumes.
• For expensive plans use Oracle 101 techniques: check that we have stats that reflect the contents of an object and that we have indexes that reflect the WHERE clause.
• Keep in mind a "Full Table Scan" is NOT always bad, especially if we return 50% of the rows when using an index. In the case of nested joins the smaller table is always fully scanned; in the case of hash joins both tables are usually fully scanned.
You can use FEED_USER to get an explain plan; ensure you use the table PLAN_TABLE owned by SYS. A tool like SQL*Navigator can be used for this.

6.3 Check Stats for an object

Stale or out-of-date statistics can cause havoc in an Oracle database. Here are some telltale signs:
• Indexes exist but are never used.
• Performance differs between environments.
• Performance for the same environment degrades or is inconsistent.

A very simple sanity check can be done using the following queries for objects associated with the problem process. Look at the NUM_ROWS value initially.
SELECT * FROM DBA_TABLES WHERE TABLE_NAME='<table>';
SELECT * FROM DBA_TAB_PARTITIONS WHERE TABLE_NAME='<table>';
Keep in mind that statistics in the Oracle data dictionary must be fairly accurate. They do not have to be exact, BUT the differences should NEVER be an order of magnitude, as Oracle decides whether or not to use an index using information like the data volumes in an object. The following can be done to ensure good statistics:

• Gather Stats, scheduled to capture an accurate snapshot of volumes in the database.
• Dynamic Sampling in the process, preferable for objects that are not accessed by multiple jobs at the same time.
• Export/Import of statistics, used to capture and reuse good statistics at a point in time, with Gather Stats potentially not used for these objects.

6.4 Procedure to be followed prior to an interface go-live/post go-live

These steps should be followed prior to an interface go-live, using the information/tools (or similar tools) above:
• Explain plans for processes to be reviewed.
• CPU consumption reviewed using top or tools in section 5.
• I/O reviewed using tools in section 5.
• Time taken for major queries/functions/procedures.
• If Dynamic Sampling/Runtime Gather Stats is not used:
o Ensure the scheduled gather stats job gives accurate stats post population of tables.
o Consider exporting/importing the good stats from UAT to PRD, and keeping this static.

These steps should be followed post interface go-live for the first few days of processing, using the information/tools (or similar tools) above:
• Explain plans for processes to be reviewed.
• CPU consumption reviewed using top or tools in section 5.
• I/O reviewed using tools in section 5.
• Check statistics for objects used (see 6.3).

6.5 Load methodology for staging tables with indexes
Follow the guidelines below for loading any staging table:
• Indexes on staging table: check whether the staging table to be loaded has any indexes.
• Usage of staging tables by other processes: ascertain whether any concurrent processes running at the time the staging table is loaded query the staging table.
If the staging table has indexes and no concurrent processes query the staging table, then a performance improvement can be achieved by dropping all the indexes on the staging table before loading the data and re-creating the indexes after completion of the data load.
Paragon case study: in Paragon, 17 staging tables which had indexes (but were not queried by any concurrent processes) were being loaded with the indexes in place. After changing the workflow to drop all the indexes on the staging tables, load the data and re-create the indexes, the performance gain was of the order of 80%.

6.6 Investigating Performance Issues Using the Data Dictionary Views.

Oracle maintains good statistics on the state of the database, and these can be a good indicator of which problem queries exist, which waits exist and also the current setup of the Oracle database.

6.6.1 V$sysstat and V$waitstat

V$SYSSTAT stores instance-wide statistics on resource usage, cumulative since the instance was started. Below is a list of useful stats that can be taken from V$SYSSTAT.
SELECT * FROM V$sysstat where name in
('parse count (hard)', 'db block changes', 'execute count', 'CPU used by this session',
'logons current', 'logons cumulative', 'parse count (total)', 'parse time cpu',
'parse time elapsed', 'physical reads', 'physical writes', 'redo log space requests',
'redo size', 'session logical reads', 'sorts (memory)', 'sorts (disk)', 'sorts (rows)',
'table fetch by rowid', 'table scan rows gotten', 'table scan blocks gotten',
'user commits', 'user rollbacks' )
V$WAITSTAT stores a summary of all buffer waits since instance startup. It is useful for breaking down the waits by class if you see a large number of buffer busy waits on the system. The following are possible reasons for waits:
• Undo segment header: not enough rollback segments
• Data segment header/freelist: freelist contention
• Data block:
o Large number of CR clones for the buffer
o Range scans on indexes with large number of deletions
o Full table scans on tables with large number of deleted rows
o Blocks with high concurrency

6.6.2 Buffer Cache Hit Ratio Should be above 85%

Example
SELECT NAME, VALUE FROM V$SYSSTAT WHERE NAME IN ('session logical reads','physical reads', 'physical reads direct','physical reads direct (lob)','db block gets','consistent gets');
Hit Ratio = 1 - ((physical reads - physical reads direct - physical reads direct (lob)) / (db block gets + consistent gets - physical reads direct - physical reads direct (lob)))
SELECT 1 - (40436054 - 2384700 - 0) / (786683547 + 5145590004 - 2384700 - 0) FROM DUAL

Interpreting and Using the Buffer Cache Advisory Statistics
There are many factors to examine before considering whether to increase or decrease the buffer cache size. For example, you should examine V$DB_CACHE_ADVICE data and the buffer cache hit ratio. A low cache hit ratio does not imply that increasing the size of the cache would be beneficial for performance. A good cache hit ratio could wrongly indicate that the cache is adequately sized for the workload. To interpret the buffer cache hit ratio, you should consider the following:
• Repeated scanning of the same large table or index can artificially inflate a poor cache hit ratio. Examine frequently executed SQL statements with a large number of buffer gets, to ensure that the execution plan for such SQL statements is optimal. If possible, avoid repeated scanning of frequently accessed data by performing all of the processing in a single pass or by optimizing the SQL statement.
• If possible, avoid requerying the same data, by caching frequently accessed data in the client program or middle tier.
• Blocks encountered during a long full table scan are not put at the head of the list of least recently used (LRU) blocks. Therefore, the blocks are aged out faster than blocks read when performing indexed lookups or small table scans. Thus, poor hit ratios when valid large full table scans are occurring should also be considered when interpreting the buffer cache data.
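As a worked check of the hit ratio formula in this section, the arithmetic can be reproduced outside the database. This is a minimal sketch only: the function name `buffer_cache_hit_ratio` is hypothetical, and the inputs are the V$SYSSTAT figures quoted in the example (physical reads 40436054, physical reads direct 2384700, physical reads direct (lob) 0, db block gets 786683547, consistent gets 5145590004).

```python
def buffer_cache_hit_ratio(physical_reads, direct_reads, direct_reads_lob,
                           db_block_gets, consistent_gets):
    """Hit ratio = 1 - (reads that missed the cache / reads that used the cache path)."""
    # Direct reads bypass the buffer cache, so they are removed from both sides.
    misses = physical_reads - direct_reads - direct_reads_lob
    requests = db_block_gets + consistent_gets - direct_reads - direct_reads_lob
    return 1 - misses / requests


# Figures taken from the example in the text.
ratio = buffer_cache_hit_ratio(40436054, 2384700, 0, 786683547, 5145590004)
print(f"hit ratio: {ratio:.4f}")
print("meets 85% guideline:", ratio >= 0.85)
```

With these figures the ratio comes out at roughly 0.99, comfortably above the 85% guideline, although, as noted above, a good ratio alone does not prove the cache is well sized.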

6.6.3 Library Cache Hit Ratio

SELECT sum(pinhits) / sum(pins) "Hit Ratio", sum(reloads) / sum(pins) "Reload percent"
FROM v$librarycache
WHERE namespace in ('SQL AREA', 'TABLE/PROCEDURE', 'BODY', 'TRIGGER');
The hit ratio should be at least 85% (i.e. 0.85). The reload percent should be very low, 2% (i.e. 0.02) or less. If this is not the case, increase the initialisation parameter SHARED_POOL_SIZE. Although less likely, the init.ora parameter OPEN_CURSORS may also need to be increased.

6.6.4 Dictionary Cache Hit Ratio (miss ratio should be less than 15%)

Hit ratio (%):
select sum(gets-getmisses)*100/sum(gets) from v$rowcache
Miss ratio (%):
select (sum(getmisses)/sum(gets))*100 from v$rowcache
The dictionary cache hit ratio is a measure of the proportion of requests for information from the data dictionary (the collection of database tables and views containing reference information about the database, its structures, and its users) that are satisfied from the cache. On instance startup, the data dictionary cache contains no data, so any SQL statement issued is likely to result in cache misses. As more data is read into the cache, the likelihood of cache misses should decrease. Eventually the database should reach a "steady state" in which the most frequently used dictionary data is in the cache. The dictionary cache resides within the Shared Pool, part of the SGA, so increasing the shared pool size should improve the dictionary cache hit ratio.

6.6.5 Shared pool

select * from V$SHARED_POOL_ADVICE
V$SHARED_POOL_ADVICE displays information about estimated parse time savings in the shared pool for different sizes. The sizes range from 50% to 200% of the current shared pool size, in equal intervals. The value of the interval depends on the current size of the shared pool.

Table 24-22 V$SHARED_POOL_ADVICE View
Column                          Datatype  Description
SHARED_POOL_SIZE_FOR_ESTIMATE   NUMBER    Shared pool size for the estimate (in megabytes)
SHARED_POOL_SIZE_FACTOR         NUMBER    Size factor with respect to the current shared pool size
ESTD_LC_SIZE                    NUMBER    Estimated memory in use by the library cache (in megabytes)
ESTD_LC_MEMORY_OBJECTS          NUMBER    Estimated number of library cache memory objects in the shared pool of the specified size
ESTD_LC_TIME_SAVED              NUMBER    Estimated elapsed parse time saved (in seconds), owing to library cache memory objects being found in a shared pool of the specified size
ESTD_LC_TIME_SAVED_FACTOR       NUMBER    Estimated parse time saved factor with respect to the current shared pool size
ESTD_LC_MEMORY_OBJECT_HITS      NUMBER    Estimated number of times a library cache memory object was found in a shared pool of the specified size

6.6.5.1 Shared Pool Free
select round((sum(decode(name,'free memory',bytes,0))/sum(bytes))*100,2) from v$sgastat
The percentage of the shared pool not currently in use. If a large proportion of the shared pool is always free, it is likely that the size of the shared pool can be reduced. Low free values are not a cause for concern unless other factors also indicate problems, e.g. a poor dictionary cache hit ratio or a large proportion of reloads occurring.

6.6.5.2 Shared Pool Reload
select round(sum(reloads)/sum(pins)*100,2) from v$librarycache where namespace in ('SQL AREA','TABLE/PROCEDURE','BODY','TRIGGER')
This is similar to a Library Cache Miss Ratio, but is specific to SQL and PL/SQL blocks. Shared pool reloads occur when Oracle has to implicitly reparse SQL or PL/SQL at the point when it attempts to execute it. A larger shared pool will reduce the number of times that code needs to be reloaded. Also, ensuring that similar pieces of SQL are written identically will increase sharing of code.
To take advantage of additional memory available for shared SQL areas, you may also need to increase the number of cursors permitted for a session. You can do this by increasing the value of the initialization parameter OPEN_CURSORS.

6.6.6 Recursive/Total Calls.

select round((rcv.value/(rcv.value+usr.value))*100,2) from v$sysstat rcv, v$sysstat usr where rcv.name='recursive calls' and usr.name='user calls'
A high ratio can be caused by:
• Dynamic extension of tables due to poor sizing
• Growing and shrinking of rollback segments due to unsuitable OPTIMAL settings
• Large amounts of sort to disk resulting in creation and deletion of temporary segments
• Data dictionary misses
• Complex triggers, integrity constraints, procedures, functions and/or packages
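To make the recursive/total calls query concrete, the same percentage can be computed by hand from the two V$SYSSTAT values. This is a minimal sketch; the function name `recursive_call_pct` and the sample counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def recursive_call_pct(recursive_calls, user_calls):
    """Share of all calls that are recursive, as a percentage rounded to 2 places."""
    total = recursive_calls + user_calls
    if total == 0:
        return 0.0  # freshly started instance: nothing to report yet
    return round(recursive_calls / total * 100, 2)


# Hypothetical values for 'recursive calls' and 'user calls'.
print(recursive_call_pct(250, 750))
```

Here 250 recursive calls out of 1000 total calls gives 25%; tracking the figure over time is likely to be more informative than any single reading.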

6.6.7 Short/Total Table Scans

select round((shrt.value/(shrt.value+lng.value))*100,2) from v$sysstat shrt, v$sysstat lng where shrt.name='table scans (short tables)' and lng.name='table scans (long tables)'
This is the proportion of full table scans which are occurring on short tables. Short tables may be scanned by Oracle when this is quicker than using an index. Full table scans of long tables are generally bad for overall performance. Low figures for this ratio may indicate a lack of indexes on large tables, or poorly written SQL which fails to use existing indexes or is returning a large percentage of the table.

6.6.8 Redo Activity

6.6.8.1 Redo Space Wait Ratio
select round((req.value/wrt.value)*100,2) from v$sysstat req, v$sysstat wrt where req.name= 'redo log space requests' and wrt.name= 'redo writes'
A redo space wait occurs when there is insufficient space in the redo buffer for a transaction to write redo information. It is an indication that the redo buffer is too small given the rate of transactions occurring in relation to the rate at which the log writer is writing data to the redo logs.

6.6.8.2 Redo Log Allocation Contention
There are two latches: the redo allocation latch and the redo copy latches.

6.6.8.2.1 The Redo Allocation Latch
select round(greatest(
(sum(decode(ln.name,'redo allocation',misses,0))/ greatest(sum(decode(ln.name,'redo allocation',gets,0)),1)),
(sum(decode(ln.name,'redo allocation',immediate_misses,0))/ greatest(sum(decode(ln.name,'redo allocation',immediate_gets,0)) +sum(decode(ln.name,'redo allocation',immediate_misses,0)),1))
)*100,2)
from v$latch l,v$latchname ln where l.latch#=ln.latch#
The redo allocation latch controls the allocation of space for redo entries in the redo log buffer. To allocate space in the buffer, an Oracle user process must obtain the redo allocation latch. Since there is only one redo allocation latch, only one user process can allocate space in the buffer at a time. The single redo allocation latch enforces the sequential nature of the entries in the buffer.
After allocating space for a redo entry, the user process may copy the entry into the buffer. This is called "copying on the redo allocation latch". A process may only copy on the redo allocation latch if the redo entry is smaller than a threshold size. The maximum size of a redo entry that can be copied on the redo allocation latch is specified by the initialization parameter LOG_SMALL_ENTRY_MAX_SIZE.
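The latch query above takes the worse of two miss rates: willing-to-wait misses over gets, and immediate misses over immediate attempts, with each denominator floored at 1 to avoid division by zero. The sketch below mirrors that arithmetic; the function name `latch_miss_pct` and the sample counter values are hypothetical.

```python
def latch_miss_pct(gets, misses, immediate_gets, immediate_misses):
    """Worst-case latch miss percentage, mirroring the greatest(...) in the query."""
    # Willing-to-wait requests: misses relative to gets (denominator at least 1).
    willing = misses / max(gets, 1)
    # Immediate requests: misses relative to all immediate attempts.
    immediate = immediate_misses / max(immediate_gets + immediate_misses, 1)
    return round(max(willing, immediate) * 100, 2)


# Hypothetical V$LATCH counters for the 'redo allocation' latch.
print(latch_miss_pct(gets=10000, misses=50, immediate_gets=900, immediate_misses=100))
```

In this sample the willing-to-wait miss rate is 0.5% but the immediate miss rate is 10%, so the function reports the larger figure, as the query does.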

6.6.8.2.2 Redo Copy Latches
select round(greatest(
(sum(decode(ln.name,'redo copy',misses,0))/ greatest(sum(decode(ln.name,'redo copy',gets,0)),1)),
(sum(decode(ln.name,'redo copy',immediate_misses,0))/ greatest(sum(decode(ln.name,'redo copy',immediate_gets,0)) +sum(decode(ln.name,'redo copy',immediate_misses,0)),1))
)*100,2)
from v$latch l,v$latchname ln where l.latch#=ln.latch#
The user process first obtains the copy latch. Then it obtains the allocation latch, performs allocation, and releases the allocation latch. Next the process performs the copy under the copy latch, and releases the copy latch. The allocation latch is thus held for only a very short period of time, as the user process does not try to obtain the copy latch while holding the allocation latch.
If the redo entry is too large to copy on the redo allocation latch, the user process must obtain a redo copy latch before copying the entry into the buffer. While holding a redo copy latch, the user process copies the redo entry into its allocated space in the buffer and then releases the redo copy latch.
With multiple CPUs the redo log buffer can have multiple redo copy latches. These allow multiple processes to copy entries to the redo log buffer concurrently. The number of redo copy latches is determined by the parameter LOG_SIMULTANEOUS_COPIES.

6.6.9 Table Contention

Two figures indicate how well table storage is working. Both are averaged across all tables in use, so a poor figure may mean either that one table is seriously at fault or that many tables have low-level problems.

6.6.9.1 Chained Fetch Ratio

    select round((cont.value/(scn.value+rid.value))*100,2)
    from v$sysstat cont, v$sysstat scn, v$sysstat rid
    where cont.name = 'table fetch continued row'
    and scn.name = 'table scan rows gotten'
    and rid.name = 'table fetch by rowid'

This is the proportion of all rows fetched which resulted in a chained row continuation. Such a continuation means that the data for the row is spread across more than one block, which can occur in either of two ways:

Row Migration: This occurs when an update to a row cannot fit within the current block. In this case, the data for the row is migrated to a new block, leaving a pointer to the new location in the original block.

Row Chaining: This occurs when a row cannot fit into a single data block, e.g. because it has large fields or many fields. In this case, the row is spread over two or more blocks.

6.6.9.2 Free List Contention

    select round((fl.waits/(dbg.value+cg.value))*100,2)
    from (select sum(count) waits from v$waitstat where class = 'free list') fl,
         v$sysstat dbg, v$sysstat cg
    where dbg.name = 'db block gets'
    and cg.name = 'consistent gets'

Free list contention occurs when more than one process is attempting to insert data into a given table. The table header structure maintains one or more lists of blocks which have free space for insertion. If more processes are attempting to insert than there are free lists, some will have to wait for access to a free list.
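The two table contention figures are simple percentages of counters. As a rough illustration, the sketch below reproduces them in Python. The function names and all sample values are hypothetical; the inputs are assumed to be the v$sysstat and v$waitstat values named in the queries above.

```python
def chained_fetch_pct(continued_rows, scan_rows, rowid_rows):
    """Percentage of fetched rows that needed a chained-row continuation.

    continued_rows: 'table fetch continued row'
    scan_rows:      'table scan rows gotten'
    rowid_rows:     'table fetch by rowid'
    """
    fetched = scan_rows + rowid_rows
    if fetched == 0:
        return 0.0  # no rows fetched yet, nothing to report
    return round(continued_rows / fetched * 100, 2)


def free_list_contention_pct(free_list_waits, db_block_gets, consistent_gets):
    """Free list waits as a percentage of logical reads."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return 0.0
    return round(free_list_waits / logical_reads * 100, 2)


# Hypothetical counters, for illustration only.
print(chained_fetch_pct(1_200, 900_000, 300_000))
print(free_list_contention_pct(150, 40_000, 260_000))
```

Because both figures are instance-wide averages, a low percentage here does not rule out a single badly affected table.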

6.6.10 CPU Parse Overhead

    select round((prs.value/(prs.value+exe.value))*100,2)
    from v$sysstat prs, v$sysstat exe
    where prs.name = 'parse count (hard)'
    and exe.name = 'execute count'

The CPU parse overhead is the proportion of database CPU time being spent in parsing SQL and PL/SQL code. The query estimates it as the ratio of hard parses to the sum of hard parses and executions. High values of this figure indicate either that a large amount of once-only code is being used by the database or that the shared SQL area is too small.

6.6.11 Latches

Latches are simple, low-level serialization mechanisms that protect shared data structures in the SGA. When attempting to get a latch, a process may or may not be willing to wait, hence two figures are reported. See also the redo log allocation latches in section 6.6.8.2.

6.6.11.1 Willing to Wait Latch Gets

    select round(((sum(gets) - sum(misses)) / sum(gets))*100,2)
    from v$latch

A process that is willing to wait for a latch will sleep and retry until it obtains the latch. Optimum = High.

6.6.11.2 Immediate Latch Gets

    select round(((sum(immediate_gets) - sum(immediate_misses)) / sum(immediate_gets))*100,2)
    from v$latch

A process that makes an immediate request does not wait for the latch: if the latch is not available, the attempt fails at once and the process continues without it. Optimum = High.

6.6.12 Rollback Segment Contention

    select sum(waits)/sum(gets)*100
    from v$rollstat

This figure indicates how often a process had to wait to get access to a rollback segment. To improve the figure, increase the number of rollback segments available.
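The rollback segment figure is summed over all segments before dividing, so a busy segment outweighs an idle one. The following Python sketch shows the calculation on hypothetical per-segment (gets, waits) pairs such as might be read from v$rollstat; the function name and the sample data are illustrative assumptions only.

```python
def rollback_wait_pct(segments):
    """Return sum(waits) / sum(gets) * 100 over all rollback segments.

    segments: iterable of (gets, waits) pairs, one pair per segment.
    """
    segments = list(segments)
    total_gets = sum(gets for gets, _ in segments)
    total_waits = sum(waits for _, waits in segments)
    if total_gets == 0:
        return 0.0  # no header gets recorded yet
    return total_waits / total_gets * 100


# Hypothetical per-segment counters, for illustration only.
sample = [(10_000, 5), (30_000, 35)]
print(round(rollback_wait_pct(sample), 2))
```

Listing the pairs per segment before summing also shows whether the waits are concentrated on one segment or spread across all of them.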
