
IBM Oracle Solutions

Advanced Technical Support - Americas

Oracle on AIX – Configuration & Tuning

R. Ballough, IBM Advanced Technical Support

Oracle on AIX Workshop © 2006 IBM Corporation



Legal information
The information in this presentation is provided by IBM on an "AS IS"
basis without any warranty, guarantee or assurance of any kind. IBM also
does not provide any warranty, guarantee or assurance that the
information in this paper is free from errors or omissions. Information is
believed to be accurate as of the date of publication. You should check
with the appropriate vendor to obtain current product information.

Any proposed use of claims in this presentation outside of the United
States must be reviewed by local IBM country counsel prior to such use.

IBM,^ , and pSeries are trademarks or registered trademarks of the
International Business Machines Corporation.

Oracle and Oracle9i are trademarks or registered trademarks of Oracle
Corporation.

All other products or company names are used for identification
purposes only, and may be trademarks of their respective owners.


Agenda

Basic AIX Configuration/Tuning for Oracle


– Memory
– CPU
– I/O
– Network
– Miscellaneous

RAC Configuration
RAC Tuning


AIX Configuration for Oracle “starting points”

• The suggestions presented here are considered to be basic
  configuration “starting points” for general Oracle workloads
• Customer workloads will vary
• Ongoing performance monitoring and tuning is recommended to ensure
  that the configuration is optimal for the particular workload
  characteristics


Oracle Server Architecture – Memory Structures

[Figure: Oracle server architecture. The System Global Area (SGA) holds the shared pool, database buffer cache, and redo log buffer; the Program Global Area (PGA) holds the private SQL area. Processes shown: User, D000, PMON, SMON, DBWR, LGWR, CHKP, ARC0. Files shown: database files, control files, online redo logs, archive logs.]


Memory Tools - Virtual Memory Manager (VMM)


• The AIX “vmo” command displays and/or updates several parameters
  that influence the way AIX manages physical memory
  – The “-a” option displays current parameter settings
      vmo -a
  – The “-o” option is used to change parameter values
      vmo -o minfree=1440
  – The “-p” option is used to make changes persist across a reboot
      vmo -p -o minfree=1440

A number of the default “vmo” settings are not optimized for
database workloads and should be modified for Oracle environments

General Memory Tuning

[Figure: “Memory Use testsys 8/2/2005”. Chart of %comp and %file memory use on a 0-100 scale, from 8:03 to 8:35.]

VMM Tuning Pre AIX 5.2 ML4


MINPERM% – minimum % of real memory for fs buffer cache
  15-20%: JFS or JFS2 filesystems without DIO or CIO
  5%:     Raw logical volumes
          JFS or JFS2 with DIO or CIO
          GPFS

MAXPERM%, MAXCLIENT% – maximum % of real memory for fs buffer cache
  40-60%: JFS or JFS2 filesystems without DIO or CIO
  <= 20%: Raw logical volumes
          JFS or JFS2 with DIO or CIO
          GPFS
  • Never more than 20 GB prior to AIX 5.3
  • To start, set to the vmtune "numperm" value
  • Reduce until the vmstat scanned (sr) to freed (fr) ratio is 4:1


VMM Tuning – AIX 5.2ML4+

MINPERM% = 5%

MAXPERM%, MAXCLIENT% = 80% or higher
  Make this a threshold which is > (100% - % computational memory)

LRU_FILE_REPAGE = 0
LRU_POLL_INTERVAL = 10 ms

LRU_FILE_REPAGE=0 is a “hint” to lrud to ignore repage rates when
determining what to page out, effectively favoring paging out file
pages (filesystem buffer cache) rather than computational pages

LRU_POLL_INTERVAL indicates the time period after which lrud
pauses so that interrupts can be serviced. The default value of “0” means
no preemption.

STRICT_MAXPERM = 0 (default)
STRICT_MAXCLIENT = 1 (default)


Understanding Memory Pools

• Memory pools are configured at boot time based on the physical hardware
  configuration
• Number can be seen with ‘vmstat -v | grep pools’
• Size can only be seen using KDB
• LRUD operates per memory pool
• SGA is allocated equally from memory pools
• If the free list has been depleted in a memory pool, LRUD will scan/reclaim
  memory in that pool
• Consider implementing one of the following:
  – Setting memory_affinity=0 ignores the physical hardware configuration and
    allocates evenly sized memory pools
  – AIX 5.3 ML3 with APAR IY69237 modifies VMM to consider all free frames in
    the system, regardless of memory pool, before reclaiming memory


VMM Page Stealing Thresholds


The following define thresholds for the VMM page stealing process (lrud):
• minfree
  Set minfree = (120 x # logical CPUs) / # mem pools
  Consider increasing if the vmstat “fre” column frequently approaches zero or if
  “vmstat -s” shows significant “free frame waits”
• maxfree
  Set maxfree = minfree + (MAX(maxpgahead, j2_maxPageReadAhead) x # logical CPUs)
  / # mem pools

Example:
• For a 6-way LPAR with SMT enabled (12 logical CPUs, one memory pool),
  maxpgahead=8 and j2_maxPageReadAhead=8:
  minfree = 1440 = 120 x 6 x 2
  maxfree = 1536 = 1440 + (max(8,8) x 6 x 2)

    vmo -p -o minfree=1440 -o maxfree=1536
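
The minfree/maxfree starting-point formulas above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `vmm_free_thresholds` and its argument names are hypothetical, integer division is an assumption for cases that do not divide evenly, and the example assumes a single memory pool, which is what the slide's arithmetic implies.

```python
def vmm_free_thresholds(logical_cpus, mem_pools, maxpgahead, j2_max_page_read_ahead):
    """Return (minfree, maxfree) using the starting-point formulas from the slide."""
    # minfree = 120 x logical CPUs / memory pools
    minfree = (120 * logical_cpus) // mem_pools
    # maxfree = minfree + (largest read-ahead x logical CPUs) / memory pools
    read_ahead = max(maxpgahead, j2_max_page_read_ahead)
    maxfree = minfree + (read_ahead * logical_cpus) // mem_pools
    return minfree, maxfree


# 6-way LPAR with SMT enabled: 6 x 2 = 12 logical CPUs (one memory pool assumed)
minfree, maxfree = vmm_free_thresholds(12, 1, 8, 8)
print(f"vmo -p -o minfree={minfree} -o maxfree={maxfree}")
```

With the slide's inputs this reproduces minfree=1440 and maxfree=1536. The actual pool count should be taken from ‘vmstat -v’ rather than assumed.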


General AIX memory tuning


No requirements for allocating SGA

The SGA may be pinned on AIX, but this IS NOT RECOMMENDED; there are better
ways of keeping the SGA resident in memory

Do not overcommit real memory!
  The server should be configured with enough physical memory to satisfy memory
  requirements

Paging space
• With AIX demand paging, paging space does not have to be large
• ½ memory + 4GB

Monitor paging activity:
• vmstat -s
• sar -r
• nmon

Resolve paging issues:
• Reduce filesystem cache size (MAXPERM, MAXCLIENT)
• Reduce Oracle SGA or PGA (9i or later) size
• Add physical memory
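
As a quick illustration of the paging space rule of thumb above (½ memory + 4GB), the following sketch computes the starting size for a given amount of real memory. The function name `paging_space_gb` is hypothetical, and the result is only the slide's starting point, not a sizing guarantee.

```python
def paging_space_gb(real_memory_gb):
    """Paging space starting point: half of real memory plus 4 GB."""
    return real_memory_gb / 2 + 4


for mem in (16, 64):
    print(f"{mem} GB real memory -> {paging_space_gb(mem):.0f} GB paging space")
```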


Changing SGA size

• Oracle 9i or 10g
  – The SGA can be dynamically resized, up to the upper bound set by the
    parameter SGA_MAX_SIZE.
    • SQL> alter system set db_cache_size=2048m scope=both;
  – SGA_TARGET (10g)
    • DB_CACHE_SIZE, SHARED_POOL_SIZE, etc.
  – PGA_AGGREGATE_TARGET can be dynamically resized
  – SGA_TARGET and PGA_AGGREGATE_TARGET are not hard limits

Something to keep in mind for use of DLPAR with memory...


Determining SGA size


• Statspack:
  SGA Memory Summary for DB: test01  Instance: test01  Snaps: 1046 -1047

  SGA regions                       Size in Bytes
  ------------------------------ ----------------
  Database Buffers                 16,928,210,944
  Fixed Size                              768,448
  Redo Buffers                          2,371,584
  Variable Size                     1,241,513,984
                                 ----------------
  sum                              18,172,864,960

• SQL*Plus:
  SQL> show sga
  SQL> show parameters


Agenda

Basic AIX Configuration/Tuning for Oracle


– Memory
– CPU
– I/O
– Network
– Miscellaneous


Oracle parameters based on # CPUs


– DB_WRITER_PROCESSES
– Degree of Parallelism
  - table level
  - query level
  - PARALLEL_MAX_SERVERS or PARALLEL_AUTOMATIC_TUNING (CPU_COUNT *
    PARALLEL_THREADS_PER_CPU)
– CPU_COUNT
– FAST_START_PARALLEL_ROLLBACK – should be using UNDO instead
– CBO – execution plan may be affected; check explain plan


CPU Considerations
Use SMT with AIX 5.3/POWER5 environments

Micropartitioning considerations
- Virtual CPUs <= physical processors in shared pool
- CAPPED
  - Virtual CPUs should be the nearest integer >= capping limit
- UNCAPPED
  - Virtual CPUs should be set to the max peak demand requirement

DLPAR considerations
• Oracle 9i
  – Oracle CPU count does not recognize a change in # CPUs
  – AIX scheduler can still use the added CPUs
• Oracle 10g
  – Oracle CPU count recognizes a change in # CPUs


Agenda

Basic AIX Configuration/Tuning for Oracle


– Memory
– CPU
– I/O
– Network
– Miscellaneous


Oracle Server Architecture - Files


[Figure: Oracle server architecture with the files highlighted: database files, control files, online redo logs, and archive logs, shown alongside the SGA (shared pool, database buffer cache, redo log buffer), the PGA (private SQL area), and the processes User, D000, PMON, SMON, DBWR, LGWR, CHKP, and ARC0.]


Flashback database design


Options for storing Oracle data files

• Filesystems
  – Single-instance: JFS, JFS2, Veritas VxFS
  – Clustered: GPFS, Veritas CFS
• Raw
• Automatic Storage Management (ASM) – new in 10g


Data Layout for Optimal I/O Performance

Stripe and mirror everything (SAME) approach:
• Goal is to balance I/O activity across all disks, loops, adapters, etc.
• Avoid/eliminate I/O hotspots
• Manual file-by-file data placement is time consuming, resource intensive and iterative

Use RAID-5 or RAID-10 to create striped LUNs (hdisks)

Create AIX Volume Group(s) (VG) w/ LUNs from multiple
arrays, striping on the front end as well for maximum
distribution
• Physical Partition Spreading (mklv -e x) -or-
• Large Grained LVM striping (>= 1MB stripe size)

http://www-1.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100319


Data Layout cont’d…


Stripe using Logical Volume (LV) or Physical Partition (PP) striping

• LV Striping
  – Oracle recommends a stripe width that is a multiple of
      db_block_size * db_file_multiblock_read_count
      (usually around 1 MB)
  – Valid LV strip sizes:
      AIX 5.2: 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1 MB
      AIX 5.3: AIX 5.2 strip sizes + 2M, 4M, 16M, 32M, 64M, 128M
  – Use AIX Logical Volume 0 offset (9i Release 2 or later)
    • Use Scalable Volume Groups (VGs), or use “mklv -T O” with Big VGs
    • Requires AIX APAR IY36656 and Oracle patch (bug 2620053)

• PP Striping
  – Use minimum Physical Partition (PP) size (mklv -t, -s parms)
  – Spread AIX Logical Volume (LV) PPs across multiple hdisks in the VG
    (mklv -e x)
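
The LV striping guidance above can be illustrated with a small helper that picks a strip size from the lists on the slide. This is a sketch under stated assumptions: the function name `pick_strip_size` is hypothetical, and it treats the slide's "stripe width" as the LV strip size, choosing the smallest listed size that is a whole multiple of db_block_size * db_file_multiblock_read_count.

```python
KB = 1024
MB = 1024 * KB

# Valid LV strip sizes as listed on the slide
AIX52_STRIP_SIZES = [4 * KB, 8 * KB, 16 * KB, 32 * KB, 64 * KB,
                     128 * KB, 256 * KB, 512 * KB, 1 * MB]
AIX53_STRIP_SIZES = AIX52_STRIP_SIZES + [2 * MB, 4 * MB, 16 * MB,
                                         32 * MB, 64 * MB, 128 * MB]


def pick_strip_size(db_block_size, multiblock_read_count, valid_sizes):
    """Return the smallest listed strip size that is a multiple of the
    largest Oracle read, or None if no listed size qualifies."""
    io_size = db_block_size * multiblock_read_count
    for size in sorted(valid_sizes):
        if size % io_size == 0:
            return size
    return None


# 8 KB blocks with db_file_multiblock_read_count=128 give a 1 MB read
print(pick_strip_size(8192, 128, AIX52_STRIP_SIZES) // KB, "KB")
```

A `None` result (for example 16 KB blocks with a read count of 128 on AIX 5.2) means the largest read exceeds every listed strip size, so either the read count or the AIX level would need to change.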

Single Instance Environments - Filesystems


Filesystems
• JFS – no longer being enhanced
  Better for lots of small file creates & deletes
• JFS2 – generally the preferred single-instance filesystem
  Better for large files/filesystems

Mount options:
  Buffer Caching (default) – stage data in fs buffer cache
  Direct I/O (DIO) – no caching on reads
  Concurrent I/O (CIO) – DIO + no write lock (JFS2 only)
  Release Behind Read (RBR) – during sequential reads, memory pages are
  released after pages are copied to internal buffers
  Release Behind Write (RBW) – during sequential writes, memory pages are
  released after pages are written to disk

In 9i, DIO and CIO must be specified at the filesystem level

In 10g, Oracle issues o_cio and o_dio calls as appropriate


Single-instance environments
Cached vs. non-Cached (Direct) I/O
File system caching tends to benefit heavily sequential workloads with low
write content. To enable caching for JFS/JFS2:
• Use default filesystem mount options
• Set Oracle filesystemio_options=ASYNCH

DIO tends to benefit heavily random access workloads and CIO tends to
benefit heavy update workloads. To disable JFS, JFS2 caching:
• In 9i, set filesystemio_options=SETALL and use the dio or cio
  mount option
• In 10g, set filesystemio_options=SETALL

When using DIO/CIO, the fs buffer cache isn’t used. Consider the
following db changes:
• Increase db_cache_size
• Increase db_file_multiblock_read_count

Read Metalink Note #272520.1

Single-instance environments - Oracle Database File Access


Database Files (DBF)
• I/O size is db_block_size or db_block_size * db_file_multiblock_read_count
• Use CIO, or no mount options for extremely sequential I/O
• If block size is >= 4096, use a filesystem block size of 4096, else use 2048

Redo Log/Control Files
• I/O size is always a multiple of 512 bytes
• Use CIO or DIO and set filesystem block size to 512

Archive Log Files
• Do not use CIO or DIO
• ‘rbrw’ mount option can be advantageous

Oracle Binaries
• Do not use CIO or DIO


Single-instance environments - I/O Tuning (ioo)


READ-AHEAD (only applicable to JFS/JFS2 with caching enabled)
  MINPGAHEAD (JFS) or j2_minPageReadAhead (JFS2)
    Default: 2
    Starting value: MAX(2, DB_BLOCK_SIZE / 4096)
  MAXPGAHEAD (JFS) or j2_maxPageReadAhead (JFS2)
    Default: 8 (JFS), 128 (JFS2)
    Set equal to (or a multiple of) the size of the largest Oracle I/O request
    DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT

Number of buffer structures per filesystem:
  NUMFSBUFS (JFS):
    Default: 196, Starting Value: 568
    Monitor with “vmstat -v”, increase if the value of “filesystem I/Os blocked
    with no fsbuf” is increasing
  j2_dynamicBufferPreallocation (JFS2):
    Default: 16
    Monitor with “vmstat -v”, increase if the value of “external pager filesystem
    I/Os blocked with no fsbuf” is increasing
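
The read-ahead starting values above can be worked out as follows. This sketch assumes the read-ahead tunables are expressed in 4 KB pages (consistent with the slide's DB_BLOCK_SIZE / 4096) and that the "size of largest Oracle I/O request" is converted to pages the same way; the function name `read_ahead_start_values` is hypothetical.

```python
PAGE_SIZE = 4096  # bytes per page, matching the slide's DB_BLOCK_SIZE / 4096


def read_ahead_start_values(db_block_size, multiblock_read_count):
    """Return (min_read_ahead, max_read_ahead) in 4 KB pages."""
    # Starting value for minpgahead / j2_minPageReadAhead
    min_read_ahead = max(2, db_block_size // PAGE_SIZE)
    # Largest Oracle I/O request, converted from bytes to pages
    largest_io_bytes = db_block_size * multiblock_read_count
    max_read_ahead = largest_io_bytes // PAGE_SIZE
    return min_read_ahead, max_read_ahead


# 8 KB blocks, db_file_multiblock_read_count=16 -> 128 KB largest read
print(read_ahead_start_values(8192, 16))
```

These are starting points only; the slide's advice to monitor with “vmstat -v” still applies after any change.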


Asynchronous I/O for filesystem environments


AIX parameters
  minservers = 10 * # CPUs
  maxservers = (10 * # disks) / # CPUs
  maxreqs = a multiple of 4096 > 4 * # disks * queue_depth
  “enable” at system restart
  Typical settings: minservers=100, maxservers=200, maxreqs=16384

Oracle parameters
  disk_asynch_io = TRUE
  filesystemio_options = {ASYNCH | SETALL}
  db_writer_processes = default

Monitor usage:
  Watch alert.log for errors:
    Warning “lio_listio returned EAGAIN”
  Monitor from AIX:
    “pstat -a | grep aios”
    Use the “-A” option for NMON
    iostat -Aq (new in AIX 5.3)
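
The AIX asynchronous I/O formulas above can be sketched as a small calculator. The function name `aio_start_values`, the use of integer division, and the example inputs (8 CPUs, 100 disks, queue depth 20) are illustrative assumptions, not values from the slide.

```python
def aio_start_values(cpus, disks, queue_depth):
    """Return starting values for the AIX AIO settings named on the slide."""
    minservers = 10 * cpus
    maxservers = (10 * disks) // cpus
    # Smallest multiple of 4096 strictly greater than 4 * disks * queue_depth
    lower_bound = 4 * disks * queue_depth
    maxreqs = (lower_bound // 4096 + 1) * 4096
    return {"minservers": minservers, "maxservers": maxservers, "maxreqs": maxreqs}


print(aio_start_values(cpus=8, disks=100, queue_depth=20))
```

Compare the result with the typical settings quoted on the slide before applying anything.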


ASM configurations

AIX parameters
– Async I/O needs to be enabled, but default values may be used

ASM instance parameters
– ASM_POWER_LIMIT=1
  Makes ASM rebalancing a low-priority operation. May be changed
  dynamically. It is common to set this value to 0, then increase to a
  higher value during maintenance windows
– PROCESSES = 25 + 15n, where n = # of instances using ASM

DB instance parameters
– disk_asynch_io=TRUE
– filesystemio_options=ASYNCH
– Increase Processes by 16
– Increase Large_Pool by 600k
– Increase Shared_Pool by [(1M per 100GB of usable space) + 2M]
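
The two ASM sizing formulas above can be expressed as a short sketch. The function names `asm_processes` and `shared_pool_increase_mb` are hypothetical, and treating "1M per 100GB" as proportional for partial multiples of 100 GB is an assumption.

```python
def asm_processes(db_instances):
    """PROCESSES for the ASM instance: 25 + 15n, n = instances using ASM."""
    return 25 + 15 * db_instances


def shared_pool_increase_mb(usable_space_gb):
    """Shared pool increase for a DB instance: 1 MB per 100 GB usable, plus 2 MB."""
    return usable_space_gb / 100 + 2


print(asm_processes(2))              # two database instances using ASM
print(shared_pool_increase_mb(500))  # 500 GB of usable space
```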


Agenda

Basic AIX Configuration/Tuning for Oracle


– Memory
– CPU
– I/O
– Network
– Miscellaneous
RAC Configuration
RAC Tuning


Network Parameters – all environments

Set sb_max >= 1 MB (1048576) (generally ok by default)

Set tcp_sendspace = 262144

Set tcp_recvspace = 262144

Set rfc1323=1

Also confirm these are set properly at network interface level


Miscellaneous parameters

• /etc/security/limits
  Set to “-1” for everything except core for the Oracle user
• sys0 attribute maxuproc >= 4096
• Environment variables:
  AIXTHREAD_SCOPE=S
  NUM_SPAREVP=1 (AIX 5.1 only)
• Use a 64-bit kernel


Agenda

Basic AIX Configuration/Tuning for Oracle


– Memory
– CPU
– I/O
– Network
– Miscellaneous

RAC Configuration/Tuning


RAC Architecture Summary


[Figure: RAC architecture. Users and a centralized management console connect to clustered database servers that share a cache over a low-latency interconnect (VIA or proprietary) through a high-speed switch or interconnect. The servers reach a shared disk subsystem through a storage area network (hub or switch fabric).]

• Full Cache Fusion
  – Cache-to-cache data shipping
  – Shared cache eliminates slow I/O
  – Enhanced IPC
• Allows flexible and transparent deployment

Drive and Exploit Industry Advances in Clustering


Logical Shared Disk Architecture


[Figure: Node 1, Node 2, and Node 3 joined by a network interconnect, all attached to a shared disk.]

• Each RAC node requires access to all shared disk:
  Physical shared disk
    typically Fibre Channel or SAN attached
  Best practices:
    1) Use 2 or more HBAs with multipathing software for load
       balancing and path failover
    2) Plan for fabric redundancy


Storage Software Options for Oracle RAC:


• GPFS (http://publib.boulder.ibm.com/clresctr/library/gpfs_aix_faq.html)
    See Metalink note 302806.1 for details
    10gR2: GPFS 2.3.0.3+
    10gR1: GPFS 2.1, 2.2, 2.3.0.1+
    9i: GPFS 2.1, 2.2, 2.3.0.1+
    Link above includes references to supported storage platforms

• HACMP Raw Logical Volumes
    10g: 5.1, 5.2 – NOTE: HACMP 5.3 not certified with 10g
    9i: 4.4.x, 4.5, 5.1, 5.2, 5.3

• Automatic Storage Management (10g only)
    Check with the storage vendor for support/recommended configuration
    Check with internal operations groups for management implications,
    e.g., OS-level backups, scripts to manage disk space, etc.
    Required for Standard Edition RAC

• VERITAS SFRAC (http://support.veritas.com)

• Other notes:
    ORACLE_HOME directories must be local unless using GPFS or SFRAC.
    10g REQUIRES separate LUNs for Voting Disk and OCR partitions unless a clustered filesystem or HACMP is used
    Use separate directories for CRS, ASM, ORACLE_HOME

Information subject to change, and additional requirements may apply.

Oracle Options for Data Storage


                     RAW    GPFS   ASM
Database Files        x      x      x
Redo Log Files        x      x      x
Control Files         x      x      x
Archive Log Files     x      x      x
Oracle Binaries              x
OCR                   x      x
Voting Disk           x      x

Clustering Software Requirements for RAC on pSeries:

• Oracle 9i RAC:
    HACMP – 5.3 recommended
    Veritas SFRAC 4.0
    Other:
      PSSP 3.5 is required for use of the SP switch and can replace HACMP as
      clusterware

• Oracle 10g RAC:
    Oracle CRS provides basic clusterware capability
    HACMP required for raw logical volume environments only – 5.2 is currently
    the most recent certified version
    Veritas SFRAC may be used if desired
    Other:
      PSSP 3.5 is required for use of the SP switch, but cannot function as
      clusterware

Latest requirements can be found at http://metalink.oracle.com
Check note # 282036.1 for detail


Oracle 10g R2: Clusterware

• Provides the same base functions as 10g R1 CRS
• Extended to provide protection for 3rd-party application components
  – Employs Application Virtual IPs for transparent network
    connectivity
  – May require Clustered File System support for application
    configuration files
  – Applications do not necessarily have to reside on the DB node


Public Interface Requirements for RAC on pSeries:

• Public interface
    At least one network interface for client traffic
    May use virtual Ethernet if sufficient bandwidth – watch for 10.2.0.1 bug
    10g requires one virtual IP address for client traffic for each node

    Validate that the application uses OCI calls and can take advantage of TAF
      JDBC Thin does not support use of OCI
      TAF supports failure/reconnect of failed connections
      Optional select statement failover

    OR
      Hardware load balancer/sprayer
      Software load balancer
      DNS CNAMES


RAC Interconnect Configuration

[Figure: Node 1, Node 2, and Node 3 connected to a public network and to the RAC interconnect, which consists of a primary and a backup private network.]

• Gigabit Ethernet satisfactory for most applications
    IBM HPS offers lower latency/higher bandwidth
    IP over InfiniBand 10.2.0.1+
    UDP network tuning required for optimum performance
    Switched networks REQUIRED – crossover cables not supported
    Virtual Ethernet not yet supported

• Oracle9i: Oracle “Fault Tolerant IP” (FT-IP) feature provides interconnect
  network fault tolerance

• Oracle10g: EtherChannel or 802.3ad Link Aggregation AIX options replace Oracle FT-IP code
    May also be used with the 9i CLUSTER_INTERCONNECTS parameter


Private Interconnect Requirements for RAC on pSeries:


Private interconnect
  Oracle 9i RAC:
    HACMP/RAW solutions:
      cluster interconnect: minimum of 1, recommend 2 Gigabit Ethernet
    HACMP/GPFS solutions:
      cluster interconnect: same as above
      additional: private subnet required for GPFS traffic; can be 10/100 Ethernet
    VERITAS SFRAC:
      cluster interconnect: recommended 3 Ethernet interfaces, minimum 2

  Oracle 10g RAC:
    HACMP/RAW solutions:
      cluster interconnect: minimum of 1, recommend 2 Gigabit Ethernet
    GPFS solutions:
      cluster interconnect: same as above
      additional: GPFS traffic can share with the cluster interconnect!
    ASM:
      cluster interconnect: same as above


RAC Interconnect: 9i R2 (Default Configuration)

• Private networks (normally 2) defined to HACMP
• At instance startup, Oracle searches the HACMP network
  topology to identify up to 3 eligible networks for
  interconnect use
  – Normally, the 2 private networks, plus 1 additional public one
• Networks are operated in High Availability “failover”
  (primary/backup/backup) mode via Oracle-provided “Fault
  Tolerant IP” code
• Oracle “Fault Tolerant IP” code registers DB connections
  with HACMP Event Manager (haemd) to assist with
  network failover


RAC Interconnect: 9i R2 (Default Configuration)


[Figure: Node 1 and Node 2, each running Oracle with FT-IP on AIX/HACMP, with interfaces en0 through en3. Private networks with addresses 10.0.0.1 through 10.0.0.4 connect the nodes through Switch1 and Switch2; a separate public network is also shown.]

• Fault tolerant
• Primary/backup
• No bandwidth aggregation


10g RAC Interconnect Configuration

[Figure: Node 1, Node 2, and Node 3 connected to a public network and to the RAC interconnect, which consists of a primary and a backup private network.]

Best practice: Use 2 Gigabit Ethernet interfaces for the interconnect

Oracle9i “Fault Tolerant IP” feature not available in 10g
EtherChannel and 802.3ad Link Aggregation recommended

• Network interface names must be the same on all nodes – e.g., if en0 is
  a cluster interconnect, it must be a cluster interconnect on all nodes.


RAC Interconnect Configuration: 10g

• HACMP Network Topology no longer used
• Oracle “Fault Tolerant IP” code has been removed
• Public and private (RAC interconnect) network names are
  specified at cluster configuration time
  – Public interface names (e.g. en0) MUST be the same on all nodes
  – Private interface names (e.g. en6) SHOULD be the same on all
    nodes
  – Private networks should be non-routable
• Fault tolerance provided by either:
  – Logical network with primary/backup adapters (9iR2 Alternative 2)
  – EtherChannel or 802.3ad Link Aggregation
    Requires protocol-capable switches
    Supports bandwidth aggregation and load balancing


RAC Interconnect: 10g (Bandwidth Aggregation/Backup)


[Figure: Node 1 and Node 2, each running Oracle over AIX/TCP-IP. Each node has a fault-tolerant logical interface en6 (10.0.0.5 on Node 1, 10.0.0.6 on Node 2), load balanced or active/standby, connected through private Switch1 and Switch2. Interfaces en0 through en4 and a separate public network are also shown.]

• Fault tolerant
• Load balanced
• Bandwidth aggregation


Additional Network Parameters for RAC:


• Set udp_sendspace = (db_block_size * db_file_multiblock_read_count) + 4k
  (not less than 65536)
• Set udp_recvspace = 10 * udp_sendspace
  – Must be < sb_max
  – Increase if buffer overflows occur
• ipqmaxlen=512 for GPFS environments
• Use Jumbo Frames if supported at the switch layer
• Time synchronization – use the “-x” flag with xntpd

Examples:
    no -a | grep udp_sendspace
    no -p -o udp_sendspace=65536
    netstat -s | grep "socket buffer overflows"
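
The UDP buffer formulas above can be sketched as follows. The function names `udp_buffer_sizes` and `fits_under_sb_max` are hypothetical, and "4k" is taken to mean 4096 bytes, which is an assumption.

```python
def udp_buffer_sizes(db_block_size, multiblock_read_count):
    """Return (udp_sendspace, udp_recvspace) in bytes per the slide's formulas."""
    # db_block_size * db_file_multiblock_read_count + 4k, never below 65536
    sendspace = max(65536, db_block_size * multiblock_read_count + 4096)
    recvspace = 10 * sendspace
    return sendspace, recvspace


def fits_under_sb_max(recvspace, sb_max):
    """udp_recvspace must be strictly less than sb_max."""
    return recvspace < sb_max


send, recv = udp_buffer_sizes(8192, 16)
print(f"no -p -o udp_sendspace={send}")
print(f"no -p -o udp_recvspace={recv}")
print("sb_max of 1048576 is large enough:", fits_under_sb_max(recv, 1048576))
```

For 8 KB blocks and a read count of 16, the computed udp_recvspace exceeds an sb_max of 1 MB, so sb_max would have to be raised first to satisfy the "Must be < sb_max" constraint.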


GPFS tunables
• See Metalink note 302806.1
Async I/O:
• Oracle parameter filesystemio_options is ignored
• Set Oracle parameter disk_asynch_io=TRUE
• worker1Threads = GPFS asynch I/O
• prefetchThreads = exactly what the name says
• worker1Threads + prefetchThreads <= 550
• Usually set prefetchThreads=default (64) and
  worker1Threads=550-prefetchThreads
• Set aio maxrequests = (worker1Threads / # CPUs) + 10
Other settings:
• GPFS block size is configurable; most will use 512KB-1MB
• pagepool – GPFS fs buffer cache, not used for RAC but may be for
  binaries. Default=64M
    mmchconfig pagepool=100M
• maxMBpS = maximum I/O that GPFS can submit per second.
  Default=150 MB/s; should be set to the approximate capacity of the I/O subsystem
• ipqmaxlen=512
    no -r -o ipqmaxlen=512
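
The GPFS thread arithmetic above can be illustrated with a short sketch. The function name `gpfs_thread_settings`, the integer division, and the 8-CPU example are assumptions; the second value is the setting the slide refers to as "aio maxrequests".

```python
GPFS_THREAD_LIMIT = 550  # worker1Threads + prefetchThreads must not exceed this


def gpfs_thread_settings(cpus, prefetch_threads=64):
    """Return (worker1_threads, aio_setting) following the slide's rules."""
    # Give prefetchThreads its default and the remainder to worker1Threads
    worker1_threads = GPFS_THREAD_LIMIT - prefetch_threads
    # (worker1Threads / # CPUs) + 10
    aio_setting = worker1_threads // cpus + 10
    return worker1_threads, aio_setting


print(gpfs_thread_settings(cpus=8))
```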


GPFS & Oracle – best practices

• Have < 10 filesystems
• Because of different usage patterns, use separate filesystems for binaries
  and Oracle data files
• Use local filesystems for CRS HOME
• When I/O to GPFS filesystems outside of Oracle instance I/O is expected, it
  can be desirable to increase the pagepool parameter
• OCR, vote placement is recommended on raw LUNs, rather than as GPFS files


Scalability: OLTP Environments


• Scale-out tends to be good for OLTP environments when:
  – There is low to moderate update activity
  – The workload is relatively uniform and predictable
  – The application is well designed and there are minimal lock/latch or
    serialization-related contention issues
  – A functional partitioning strategy is used to direct users to a limited number of
    nodes
    However, functional partitioning may reduce load balancing effectiveness
• Industry benchmark proof points are limited (as of 11/1/05):
  – 1 out of 178 TPC-C benchmarks used RAC (16-node HP Integrity rx5670)
    Relatively poor scale-out vs. non-RAC result on same hardware
    No demonstrable $/tpmC advantage vs. IBM p595 non-RAC results
  – Limited number of SAP, Oracle E-Business Suite, PeopleSoft, other results
• Most customer RAC environments are 2 or 3 nodes

A deployment involving a small number (2 or 3) of large nodes carries
significantly less business risk than one involving many (4+) small nodes


Scalability: DSS Environments


• Scale-out tends to be good for Decision Support environments when:
  – There are minimal data update or DDL operations during peak shift
  – Data is predominantly in large, partitioned tables which have even key/data
    distribution
  – Data has a low “locality of reference” (low buffer cache hit %)
  – The databases are used predominantly for large queries with Parallel Query
    Option (PQO) and PARALLEL_AUTOMATIC_TUNING=TRUE
  – Query response time is not critical

• Oracle focus on RAC-based TPC-H benchmarks (as of 11/1/05):
  – 8 of 17 clustered results (all DB) were with Oracle RAC
  – 8 of 23 Oracle DB results were with Oracle RAC

• Small number of known Oracle accounts > 1TB and/or involving more
  than 6 nodes

If you want to scale out beyond 4 nodes, plan to do comprehensive
stress testing before production deployment


Oracle Documentation
• Oracle Reference Manuals:
  http://otn.oracle.com/documentation/index.html

• Oracle Whitepapers:
  http://otn.oracle.com/products/database/clustering/RACWhitepapers.html

• Oracle (Metalink) Certification Info:
  http://otn.oracle.com/support/metalink/index.html

• Oracle Database 10g Release 2 Automatic Storage Management
  Overview and Technical Best Practices
  http://www.oracle.com/technology/products/database/asm/pdf/asm_10gr2_bptwp_sept05.pdf

• Oracle Metalink:
  http://metalink.oracle.com
  – 282036.1

• IBM Redbooks: (www.ibm.com/redbooks)

• Techdocs – Technical Sales Library:
  http://www.ibm.com/support/techdocs


GPFS Documentation

• GPFS for AIX home page:
  www.ibm.com/servers/eserver/pseries/software/sp/gpfs.html

• GPFS for AIX Library:
  www.ibm.com/servers/eserver/pseries/library/gpfs.html
  – Concepts, Planning and Installation Guide (GA22-7453)
  – Administration and Programming Reference (SA22-7452)
  – Problem Determination Guide (GA22-7434)

• Clusters Library:
  http://www.ibm.com/servers/eserver/clusters/library/
  – GPFS for AIX Frequently Asked Questions

• Oracle Metalink:
  http://metalink.oracle.com
  – 302806.1: IBM General Parallel File System (GPFS) and Oracle RAC on AIX
    5L and IBM eServer pSeries


QUESTIONS
ANSWERS
