PowerCenter 7 Architecture and Performance Tuning: Erwin Dral
Architecture and
Performance Tuning
Erwin Dral
Sales Consultant
1
Agenda
° PowerCenter Architecture
° Performance tuning step-by-step
° Eliminating Common bottlenecks
2
PowerCenter Architecture:
Engine-based & Metadata-driven
[Architecture diagram; labels recovered from the slide]
° Client Tools (Windows): Designer, Workflow Manager, Workflow Monitor, Repository Manager, Reporter
° Metadata Exchange: Erwin, Designer 2000, Power Designer, CWM
° Repository: Repository Server and Repository Agent, connected over TCP/IP; repository database accessed natively or via ODBC
° Heterogeneous Sources: Oracle, MS SQL Server, Sybase, Informix, DB2 UDB, ODBC, Flat File, MainFrame (VSAM/COBOL Copybook), XML, ERP, SAS, RealTime, Remote Files, PowerConnect
° Heterogeneous Targets: Oracle (API, SQL*Loader), MS SQL Server (BCP), Sybase (IQ Load), Informix, DB2 UDB (Autoloader), Teradata (fload, tpump, mpump), ODBC, Flat File, MainFrame, XML, ERP, SAS, RealTime, Remote Files, PowerConnect
° Server (UNIX, Windows): Reader, DTM, Writer, with Buffers between them
° Key: Data flow, Metadata flow
3
Introducing PowerExchange
On-Demand Data Access through Changed Data Capture
[Diagram; labels recovered from the slide]
° Platforms: Mainframe; AS/400, HP3000; Relational; File Formats, EAI
° Access modes: Real-time, Change, Batch, Bulk
4
PowerCenter Environment
° Performance is determined by
THE SLOWEST COMPONENT (the bottleneck)
− You usually need to monitor performance in several places
− You usually need to monitor outside PowerCenter as well
5
Server Architecture - Memory
6
Server Architecture - Memory
[Diagram: Reader, Transformation Engine, Writer]
7
Server memory runtime
° Example
8
Server Architecture - Memory
° DTM Buffer Pool Size controls the total amount of memory used
to buffer rows internally by the reader and writer
− This sets the total number of blocks available
− The optimal value is about 25MB
− With a 64K block size this gives 25,000,000 / 64,000 ≈ 390 blocks
(decimal units: 1K = 1,000 bytes)
° Buffer Block Size controls the size of the blocks that move in
the pipeline
− Optimum size depends on the row size being processed
− 64KB ≈ 64 rows of 1KB
− 128KB ≈ 128 rows of 1KB
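The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration, not PowerCenter code: the function names `blocks_in_pool` and `rows_per_block` are hypothetical, and the sketch assumes the decimal units the slide appears to use (1KB = 1,000 bytes, 1MB = 1,000,000 bytes).

```python
def blocks_in_pool(pool_bytes: int, block_bytes: int) -> int:
    """Whole buffer blocks that fit in a buffer pool of the given size."""
    if pool_bytes <= 0 or block_bytes <= 0:
        raise ValueError("sizes must be positive")
    return pool_bytes // block_bytes


def rows_per_block(block_bytes: int, row_bytes: int) -> int:
    """Whole rows of a given size that fit in one buffer block."""
    if block_bytes <= 0 or row_bytes <= 0:
        raise ValueError("sizes must be positive")
    return block_bytes // row_bytes


if __name__ == "__main__":
    # Assumed decimal units, matching the slide's 25M/64K = 390 example.
    print(blocks_in_pool(25_000_000, 64_000))   # 390
    print(rows_per_block(64_000, 1_000))        # 64
    print(rows_per_block(128_000, 1_000))       # 128
```

Doubling the block size halves the number of blocks available in the same pool, so the two settings are best considered together.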
9
Server Architecture – DTM Parameters
10
Server Architecture - Threads
[Diagram: the DTM process contains a Master Thread, a Mapping Thread, Reader Threads, Transformation Threads and Writer Threads; Rank and Aggregator are also shown. Legend: Process, Memory, Threads]
11
Performance tuning step-by-step
Tuning cycle (steps shown in the diagram):
3. Determine bottleneck
4. Make ONE change
5. Run sessions
HINTS:
• Write down a log of every step
• If all resources are used 100%, buy more
• If the change doesn’t help, UNDO
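The cycle and hints above can be sketched as a small loop. This is a hedged illustration only: `tune` and `run_session` are hypothetical names, and `run_session` stands for whatever procedure you use to run the session with a given set of changes and measure its rows per second. Nothing here calls a PowerCenter API.

```python
from typing import Callable, List, Tuple


def tune(
    baseline_rps: float,
    candidates: List[str],
    run_session: Callable[[List[str]], float],
) -> Tuple[List[str], float, List[Tuple[str, float, str]]]:
    """Try ONE change at a time; keep it only if throughput improves."""
    kept: List[str] = []          # changes that helped and stay applied
    best = baseline_rps           # best rows/sec seen so far
    log: List[Tuple[str, float, str]] = []   # written log of every step
    for change in candidates:
        rps = run_session(kept + [change])   # run with exactly one new change
        if rps > best:
            kept.append(change)
            best = rps
            log.append((change, rps, "kept"))
        else:
            # The change did not help: undo it by not carrying it forward.
            log.append((change, rps, "undone"))
    return kept, best, log


if __name__ == "__main__":
    # Made-up numbers purely to exercise the loop.
    def fake_run(active: List[str]) -> float:
        gains = {"change A": -50.0, "change B": 300.0}
        return 1000.0 + sum(gains[c] for c in active)

    for entry in tune(1000.0, ["change A", "change B"], fake_run)[2]:
        print(entry)
```

Testing each candidate against the best result so far, rather than against the original baseline, keeps the effect of every single change visible in the log.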
12
2. Measuring Performance Internal to Informatica
13
Measuring Performance - Internal
14
Measuring Performance - Internal
Example
[Screenshot callouts: Session Name, Start/End Times, Applied Rows]
17
Measuring Performance - Internal
Tips:
° Calculated rows per second are not the same as “Write
Throughput”
° For multiple targets use sum of rows loaded for targets which
are similar in row size
° For multiple partitions use the sum of rows loaded for all
partitions
° Monitor background processes external to Informatica that will
have an effect between test runs
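As a minimal sketch of the calculation these tips describe, the rows-per-second figure can be derived from the applied rows and the start and end times, summing the rows of all partitions (or of similarly sized targets). The function name `rows_per_second` and the sample values are assumptions for illustration; the inputs would be read manually from the session details.

```python
from datetime import datetime
from typing import Iterable


def rows_per_second(applied_rows: Iterable[int],
                    start: datetime,
                    end: datetime) -> float:
    """Total applied rows (all partitions/targets) divided by elapsed seconds."""
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        raise ValueError("end time must be after start time")
    return sum(applied_rows) / elapsed


if __name__ == "__main__":
    # Made-up example: two partitions, five-minute run.
    start = datetime(2004, 1, 1, 10, 0, 0)
    end = datetime(2004, 1, 1, 10, 5, 0)
    print(rows_per_second([150_000, 150_000], start, end))  # 1000.0
```

Because this figure uses wall-clock elapsed time, it includes session start-up and shutdown, which is one reason it differs from the reported “Write Throughput”.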
18
Establishing Baselines Internal to Informatica
19
Establishing Baselines - Internal
° Each component in a production environment contributes to the
overall session performance
° Performance is limited by the slowest component
[Diagram: LAN/WAN, DBMS, OS, PowerCenter]
20
Establishing Baselines – Read Throughput Mapping
° Read Throughput Mapping – Use a database table to
flat file mapping to establish a typical read rate
Columns recorded: Session Name, Rows Loaded, Rows Failed, Start Time, End Time, Elapsed Time, Rows Per Sec
21
Establishing Baselines - Historical
22
2. Measure Performance
23
3. Determine bottleneck
24
3. Determine Target Bottlenecks
25
3. Determine Source or Mapping Bottlenecks
[Figure: Original and Modified mappings]
26
6. Make ONE change
° Very case-specific; here are some common bottlenecks:
− Target
− Source
− Mapping
− Session
− System
27
6. Eliminate Target Bottlenecks
28
6. Eliminate Target Bottlenecks
29
3. Eliminate Source Bottlenecks
30
3. Eliminate Mapping Bottlenecks
31
6. Optimize Expression Performance
32
6. Optimize Lookup Performance
33
6. Session Optimizing
° Set the DTM Buffer Pool Size and Buffer Block Size
− Large row sizes may require a larger buffer block size
− Default buffer pool is 12,000,000 bytes = 12 MB;
recommended is 24 MB
° Buffer Block Size controls the size of the blocks that
move in the pipeline
− Buffer block size should hold about 100 rows
− 64 KB (64,000 bytes) ≈ 64 rows of 1 KB
− 128 KB (128,000 bytes) ≈ 128 rows of 1 KB
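The "about 100 rows per block" guideline can be turned into a rough sizing helper. This is a sketch under stated assumptions: `suggest_block_size` is a hypothetical name, and rounding up to multiples of 64,000 bytes is a choice made here for illustration, not a PowerCenter requirement.

```python
def suggest_block_size(row_bytes: int,
                       target_rows: int = 100,
                       step: int = 64_000) -> int:
    """Smallest multiple of `step` bytes that holds at least `target_rows` rows."""
    if row_bytes <= 0 or target_rows <= 0 or step <= 0:
        raise ValueError("arguments must be positive")
    needed = row_bytes * target_rows
    multiples = -(-needed // step)   # ceiling division
    return multiples * step


if __name__ == "__main__":
    print(suggest_block_size(500))     # 64000
    print(suggest_block_size(1_000))   # 128000
    print(suggest_block_size(2_000))   # 256000
```

For 1 KB rows this lands on the 128 KB setting listed above, which holds about 128 rows.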
34
6. Session Memory Settings
° Set cache memory larger than the size of the cache file on disk
° Set the server variable directories (Badfiles, Cache, SessLogs, etc.)
to point to high-performance disk arrays
° Reduce transformation errors (& error logging)
35
For those that are still on PowerCenter 5 …
PowerCenter 6 Performance highlights
36
For those that are still on PowerCenter 6 …
PowerCenter 7 Performance highlights
° Block DTM
− Enables moving/transforming a block of rows at a time
at each transformation
− Accelerates ALL sessions with:
° Mapping bottleneck AND
° (Lots of transformations OR Lots of string ports)
37
Performance tuning step-by-step
Tuning cycle (steps shown in the diagram):
3. Determine bottleneck
4. Make ONE change
5. Run sessions
HINTS:
• Write down a log of every step
• If all resources are used 100%, buy more
• If the change doesn’t help, UNDO
38