Oracle Real Application Clusters (RAC)
Comprehensive Concepts Overview, Insight, Recommendations, Best Practices and a whole lot more

A “BrainSurface” Presentation www.brainsurface.com

Disclaimer
The views/content in this document are those of the author and do not necessarily reflect those of Oracle Corporation and/or its affiliates/subsidiaries. The material in this document is for informational purposes only and is published with no guarantee or warranty, express or implied.

About “Tariq Farooq”
Tariq Farooq – www.brainsurface.com
Founder of BrainSurface | The After-Oracle ConnectSpace
Next-Generation Social Networking for the Oracle, Java & MySQL Communities
Oracle ACE – Oracle Technologist for 16+ years
Oracle Certified Expert – Real Application Clusters
Oracle Certified Professional – DBA – 8i, 9i, 10g, 11g & 10g Apps DBA
Oracle Certified Professional – Internet Application Developer 2, 6i, 9i
Oracle Certified Professional – E-Business Suite 11i
Author, Speaker, Blogger, Forumizer & Community Organizer

What is Clustering? – Synopsis & Overview
A Cluster is a collection of networked computers working together to form a single logical machine.
Having redundant nodes within a cluster provides continuous availability of computing services, as well as more processing power to balance the computing load amongst several nodes instead of a single expensive powerful machine.
Cluster nodes are typically connected to each other through a fast network connection on a local area network within a small geographical site such as a building or a cluster of buildings. This is the norm; however, there are exceptions to this rule, with Extended Distance Clusters existing over WANs as well.
Clustering is closely related to, and often loosely equated with, Massively Parallel Processing (MPP).

What is Oracle RAC? – Synopsis & Overview
Real Application Clusters (RAC) is the Oracle trade name for its database server cluster product and provides load-balancing, scalability and automatic failover protection in case of server failure (the key word here being server failure, not site failure; protection against site failure is provided by Oracle's DataGuard product).
Essentially, Oracle RAC is a parallelization product that provides Load-Balancing, Scalability, Elasticity and High-Availability by keeping the Oracle database server product available and running across a set of multiple server nodes accessing a common database on Shared Storage.

What is Oracle RAC? – Synopsis & Overview
Real Application Clusters (RAC) was released in 2001 with Oracle 9i.
The previous/primitive version of Oracle RAC was known as Oracle Parallel Server (OPS).
Oracle was the first vendor to run a Parallel Server product at the database level, with Oracle Database Server 6.2 for the DEC VAX platform.
Allows multiple servers to run Oracle RDBMS Server Instances that concurrently read/write to a single “Clustered” database on Shared Storage.
Oracle RAC employs the Shared-Everything approach.

What is Oracle RAC? – Synopsis & Overview
Oracle 9i RAC required 3rd-Party Clusterware (e.g. Veritas SFRAC, Sun Clusters, HP TruCluster, IBM HACMP etc.) on all major Unix OS brands; Linux and Windows were the exceptions.
Oracle Cluster Ready Services (CRS) was provided by Oracle for Linux and Windows.
Oracle CRS was renamed to Oracle Clusterware in 10gR2.
3rd Party Clusterware is no longer required OR recommended by Oracle with 10g and up.
With 11gR2, Grid Infrastructure combines Clusterware and ASM in a single unified Oracle HOME.

What is Oracle RAC? – Synopsis & Overview
Oracle RAC is at the most complex end of the Oracle RDBMS family spectrum and needs sophisticated management tools such as Oracle Enterprise Manager (OEM) Grid Control.
Oracle RAC 10gR2 supports up to 100 nodes.
Enables Dynamic Provisioning for Grid/Cloud Computing.

What is Oracle RAC? – Synopsis & Overview
Oracle RAC Provides:
  Scalability
  Load Balancing / Workload Distribution
  Elasticity
  Fault Tolerance
Oracle RAC is used primarily for Load-Balancing.
Oracle RAC is used secondarily for Fail-Over.

What is Oracle RAC? – Synopsis & Overview
Oracle RAC Instances have their own separate/distinct System Global Areas (SGAs) and set of background processes.
Oracle RAC Instances have their own separate/distinct Redo Log files and Alert/Trace Log files.
Redo Log files can be read by all instances (for example, during instance recovery) but each is written to only by the instance that owns it.

Oracle RAC Architecture Overview
A typical Oracle RAC cluster comprises the following components:
  Single RAC Database (comprising shared Control and Data Files) on shared Storage accessed by Multiple Instances.
  Multiple Instances running on Multiple Nodes.
  Nodes typically built from Low Cost/Commodity Hardware.
Data, Control and INIT files are shared across Instances.
Every RAC Instance has its own Redo Log files and Undo Segments/Tablespaces.
Cache Fusion/Synchronization enables concurrent/simultaneous transaction-processing between all Instances using the Private Cluster Interconnect.

Oracle RAC Architecture Overview
Figure/Diagram from Oracle Documentation

Maximum Availability Architecture – Overview
Figure/Diagram from Oracle Documentation

Oracle RAC with DataGuard Architecture – Overview
Figure/Diagram from Oracle Documentation

Oracle RAC: Vertical Scalability vs. Horizontal Scalability
Vertical Scalability (scaling up within a single server) has well-known limits that prevent an application/database from growing beyond a certain threshold.
Oracle RAC is the only viable solution for scaling out the Oracle RDBMS server product horizontally to support a Very Large User Base (VLUB).

SMP vs. RAC (MPP)

Figure/Diagram from Oracle Documentation

Oracle RAC: Application Scalability
With the advent of Oracle RAC 9i (and up) Cache Fusion:
Applications typically scale out-of-the-box with zero/minimal tuning.
Nodes can be added/removed in HOT MODE, with zero database downtime, to provide elasticity and scalability.
Database Files residing on a Shared Disk Cluster File System provide a uniform, fast and read-consistent image to the end-user.

Oracle RAC: Application Scalability

Quote from Oracle Documentation

Oracle RAC: Application Scalability
Applications that behave correctly/normally in a Single-Server/Single-Instance Oracle database typically scale just fine on an Oracle RAC database without code changes.
Cache Fusion is the driving technology behind Oracle RAC that enables Applications to scale out on multiple servers/instances.

Oracle RAC: Instance Failure: Failover/Switchover Times
Instance Failover Time: Within Seconds to a few Minutes.
Application Switch-Over Time: Within Seconds to a few Minutes.
Overall Downtime for the subset of users affected by the downed instance: Within Seconds to a few Minutes.

What is Oracle Clusterware? – Synopsis & Overview
Oracle Clusterware is the layer between the OS and the Database.
Manages multiple resources within the Cluster and presents the clustered database as a single logical machine to the end-user.

What is Oracle Clusterware? – Synopsis & Overview
Oracle Clusterware provides:
  Global Resource Management
  Group Services
  Node Membership
  High-Availability Functions

What is Oracle Clusterware? – Synopsis & Overview
Unified Cohesive Solution comprising a 3-Tiered Architecture:
  Cluster Synchronization Service (CSS)
  Cluster Ready Service (CRS)
  Event Manager (EVM)

What is Oracle Clusterware? – Synopsis & Overview
Physical Components:
  Oracle Cluster Registry (OCR)
  Voting Disk

Oracle Clusterware Architecture – Synopsis & Overview
Figure/Diagram from Oracle Documentation

Oracle Clusterware – Daemons/Processes – Overview
CRS daemon: Cluster resources management.
Cluster Synchronization Services Daemon (CSSD): Node membership.
Event Manager (EVM): Event publishing.
Oracle Notification Services (ONS): used by EVM.
RACGOPROCD: Cluster Monitoring.

Oracle Clusterware Framework – Synopsis & Overview
Framework: Scripting Interface.
Script: Agent – Start, Check, Stop an application.
C-based API – Used for managing resources.
Can be used to restart and relocate an application in case of node failure.

Cache Fusion – Synopsis & Overview
Cache Fusion, a mechanism within Oracle RAC, employs a Shared Cache Architecture that fuses the in-memory data buffer caches across all nodes into a single logical read-consistent buffer cache available to all instances.
DB Blocks are transferred in-memory from instance cache to instance cache over the Cluster InterConnect when requested, after the proper locking procedures are applied.
Cache Fusion is very fast because disk writes are eliminated when other instances request blocks for updates.
DB Blocks are synchronized, NOT mirrored = Faster performance.

Cache Fusion – Synopsis & Overview
Some useful Dynamic Performance Views for monitoring Cache Fusion:
  gv$cache_transfer
  gv$class_cache_transfer
  gv$file_cache_transfer
  gv$temp_cache_transfer

Cache Fusion – Synopsis & Overview
Global Cache Service (GCS) is used for FAST instance-to-instance block buffer transfer and establishes/implements Cache Coherency = Never more than 3 hops.
Global Enqueue Service (GES), previously known as the Distributed Lock Manager (DLM), is used for block buffer locking.
Global Resource Directory (GRD) is used for keeping track of Block Buffer Location/Mode/Role information.

Cache Fusion – Synopsis & Overview
Global Cache Services (GCS) Waits = Cross-Instance Block transfer Waits = Measure of Data Block Transfer Efficiency.

Cache Fusion Architecture Overview
Figure/Diagram from Oracle Documentation

Oracle Cluster Registry (OCR) – Synopsis & Overview
OCR is the central repository for storing any/all information about all clusterware resources.
OCR is automatically backed up every 4 hours in the $ORACLE_CRS_HOME\cdata directory.
Recommendation: OCR should be mirrored for Redundancy/High-Availability (up to two copies).

Voting Disk – Synopsis & Overview
Synonymous with Quorum Disk.
Used for managing Cluster Members.
Used for preventing the Split Brain scenario.
Voting Disks can be dynamically added in 11g.
Recommendation: Voting Disks should be mirrored for Redundancy/High-Availability (up to 3 copies).
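The role of the voting disk in preventing split brain can be sketched with a simple majority rule: a node stays in the cluster only while it can access more than half of the configured voting disks. The function name and arguments below are hypothetical and only illustrate the rule; they are not part of any Oracle API.

```python
def has_quorum(accessible_disks: int, total_disks: int) -> bool:
    """Return True if a node can reach a strict majority of voting disks.

    Hypothetical helper for illustration only; the real decision is made
    inside Oracle Clusterware (CSS), not by user code.
    """
    if total_disks <= 0 or not 0 <= accessible_disks <= total_disks:
        raise ValueError("disk counts must satisfy 0 <= accessible <= total, total > 0")
    # Strict majority: with 3 disks a node needs 2; with 2 disks it needs both.
    return accessible_disks > total_disks // 2
```

Under this rule an odd number of voting disks (such as the 3 copies recommended above) tolerates the loss of one disk, whereas two disks tolerate none.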

Node Evictions: Overview
Network Heartbeat (MissCount in Seconds): Node unable to send a network Heartbeat for MissCount seconds = Node Eviction.
Disk Heartbeat (I/O Timeout in Seconds): Disk Heartbeat not updated within the I/O Timeout = Node Eviction.
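The two eviction triggers can be expressed as a small decision function. This is a minimal sketch: the class, function and parameter names are hypothetical, and the timeout values in the usage note are illustrative rather than recommended settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatStatus:
    """Seconds elapsed since each heartbeat was last seen for a node."""
    seconds_since_network_heartbeat: float
    seconds_since_disk_heartbeat: float


def eviction_reason(status: HeartbeatStatus,
                    misscount_s: float,
                    disk_timeout_s: float) -> Optional[str]:
    """Return why a node would be evicted, or None if it is healthy."""
    if status.seconds_since_network_heartbeat > misscount_s:
        return "network heartbeat missed for longer than MissCount"
    if status.seconds_since_disk_heartbeat > disk_timeout_s:
        return "disk heartbeat not updated within the I/O timeout"
    return None
```

For example, `eviction_reason(HeartbeatStatus(45.0, 1.0), misscount_s=30, disk_timeout_s=200)` reports a network heartbeat eviction, while a node that is current on both heartbeats returns `None`.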

Cluster InterConnect – Synopsis & Overview
Dedicated “Private” Network for Oracle RAC.
Used for block-transfers amongst instances to enable Cache Fusion.
Dedicated >= GigaBit Ethernet Switch required for the Private Cluster Interconnect.
Same Interconnect should be used for both Clusterware and Database.
Recommendation: Test Cluster InterConnect latency/bandwidth with 3rd Party Tools.
Recommendation: Don’t use Cascading Switches.

Cluster InterConnect – Synopsis & Overview
Recommendation: GigaBit Ethernet Adapters should be teamed/bonded together to provide higher bandwidth/fault-tolerance.
Recommendation: Use Jumbo Frames in Gigabit Ethernet.
Recommendation: Full Duplex Mode.
Recommendation: Enable Flow Control.
Recommendation: Enable Rapid Spanning Tree Protocol (RSTP).
Recommendation: Disable Unicast Storm Control.
Recommendation: Tune NIC Ring Buffers.

Cluster InterConnect – Synopsis & Overview
Recommendation: Turn on Port Fast.
Recommendation: UDP: Send/Receive Buffers – Max Setting.
UDP is recommended over TCP because of lower latency.
MTU size should be identical on all adapters on all nodes.
NIC should be on the fastest PCI bus.
Highest-Top-Bit-Rate recommended over Auto-Negotiate.
Duplex Mode recommended.
Socket Receive Buffer.
TCP Settings.
Flow Control Settings.
Network Interrupts for CPU.

Cluster InterConnect – Synopsis & Overview
Bandwidth required per second
  = (Message received per second + Blocks received per second + PQ message received per second) / Maximum Network transmit capacity
  = (M + B + P) / 85000
where:
  M (Message received per second) = No. of GES messages + No. of GCS messages
  B (Blocks received per second) = DB Block Size * (No. of CR Blocks received + No. of Current Blocks received) / MTU Size
  P (PQ message received per second) = (PQ Message Size * No. of PX Remote Messages Received) / MTU Size
Formula from Oracle Documentation
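As a worked sketch, the slide's formula can be read as (M + B + P) / 85000, with M the GES plus GCS messages, B the CR plus Current blocks scaled by DB Block Size / MTU Size, and P the PX remote messages scaled by PQ Message Size / MTU Size. That reading, the function name, and the parameter names are assumptions made for illustration; the 85000 divisor is taken from the slide as its stated maximum network transmit capacity.

```python
def interconnect_utilization(ges_msgs: float, gcs_msgs: float,
                             cr_blocks: float, current_blocks: float,
                             px_remote_msgs: float,
                             db_block_size: int, pq_msg_size: int,
                             mtu_size: int,
                             max_transmit_capacity: float = 85000.0) -> float:
    """Fraction of interconnect capacity used, per the slide's formula.

    All counts are per-second rates; sizes are in bytes.
    """
    if mtu_size <= 0 or max_transmit_capacity <= 0:
        raise ValueError("mtu_size and max_transmit_capacity must be positive")
    m = ges_msgs + gcs_msgs                                      # messages/s
    b = db_block_size * (cr_blocks + current_blocks) / mtu_size  # block packets/s
    p = pq_msg_size * px_remote_msgs / mtu_size                  # PQ packets/s
    return (m + b + p) / max_transmit_capacity
```

A result close to or above 1.0 would suggest the interconnect is saturated under this model; the per-second counts themselves would have to come from the instance statistics.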

Virtual IP (VIP) – Synopsis & Overview
Virtual Connection over Public Interface.
Each node must have its own virtual IP (VIP), which is a unique and unused IP address within the same network subnet.
One active VIP per node.
VIP must be a DNS known address.
VIP is stored in the OCR.
Upon Node/Instance Failure, EVM generates an event: Oracle Clusterware transfers the VIP address to a surviving node.

Virtual IP (VIP) – Synopsis & Overview
Fast Failover: Database clients no longer have to deal with TCP/IP timeouts.
Recommendation: Use VIP for database connections, NOT the physical IP address.
Application VIP (10gR2 and up): Node-Independent connections.
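To make the VIP recommendation concrete, here is a minimal sketch that builds an Oracle Net connect descriptor listing each node's VIP hostname with client-side load balancing and connect-time failover enabled, plus a basic TAF clause. The function name, the VIP hostnames, and the service name in the usage note are hypothetical placeholders.

```python
def build_connect_descriptor(vip_hosts, service_name, port=1521):
    """Build a connect descriptor that addresses nodes by their VIPs.

    vip_hosts: iterable of VIP hostnames (hypothetical, one per node).
    service_name: the database service to connect to (hypothetical).
    """
    hosts = list(vip_hosts)
    if not hosts:
        raise ValueError("at least one VIP host is required")
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in hosts
    )
    return (
        "(DESCRIPTION="
        f"(ADDRESS_LIST=(LOAD_BALANCE=on)(FAILOVER=on){addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})"
        "(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC))))"
    )
```

Calling `build_connect_descriptor(["rac1-vip", "rac2-vip"], "oltp_svc")` yields a single descriptor string that could be placed in a tnsnames.ora entry or passed directly to a client driver.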

Shared Storage – Synopsis & Overview
Comprises Control Files, Data Files, Redo Log files and Undo Files.
Cluster File System Required for Shared Storage.
Oracle RAC supports the following types of Shared Storage:
  Automatic Storage Management (ASM)
  Oracle Cluster File System (OCFS)
  RAW Devices

Shared Storage – Synopsis & Overview
Automatic Storage Management (ASM) goes hand-in-hand with Oracle RAC and is highly recommended by Oracle Corp. for RAC deployments.
Recommendation: Enable Multi-Pathing (Active-Active IO Paths) between servers and SAN storage.

Shared Storage – Synopsis & Overview
Typical Thresholds for Disk I/O (average waits above these values warrant investigation):
  Log File Parallel Write > 3 ms
  DB File Scattered Read > 30 ms
  DB File Sequential Read > 25 ms
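The thresholds above lend themselves to a simple automated check. The sketch below assumes the average wait times per event have already been collected (for example from an AWR or Statspack report) into a dictionary; the function name and the input format are hypothetical.

```python
# Thresholds in milliseconds, taken from the slide above.
THRESHOLDS_MS = {
    "log file parallel write": 3.0,
    "db file scattered read": 30.0,
    "db file sequential read": 25.0,
}


def slow_io_events(avg_wait_ms):
    """Return the wait events whose average wait exceeds its threshold.

    avg_wait_ms: dict mapping wait event name -> average wait in ms.
    Events without a threshold in THRESHOLDS_MS are ignored.
    """
    return sorted(
        event
        for event, wait_ms in avg_wait_ms.items()
        if event in THRESHOLDS_MS and wait_ms > THRESHOLDS_MS[event]
    )
```

For instance, an input with a 5 ms average for `log file parallel write` and a 10 ms average for `db file sequential read` would flag only the redo write event.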

Automatic Storage Management (ASM) – Synopsis & Overview
Recommendation: Use ASM with Oracle RAC.
Built for Oracle by Oracle.
Cluster File System.
Volume manager.
Features:
  Elasticity = Cloud Computing.
  HOT Mode: Add/remove disks online.
  Striping of data across disks.
  Mirroring.
  Load balancing.
  Eliminate/Significantly-Minimize I/O Performance Tuning.

Automatic Storage Management (ASM) – Synopsis & Overview
ASM instance going down on one node = Surviving cluster nodes still keep running.
Recommendation: Each ASM Diskgroup should contain disks with similar characteristics: Avoid uneven architecture within each ASM disk group.

Automatic Storage Management (ASM) – HOT Data Migration
Figure/Diagram from Oracle Documentation

Workload Management – Synopsis & Overview
Virtualization/Grid Computing/Abstraction.
Segregation of jobs into various workloads depending on their individual/common characteristics.
Parallelization: Segregate large jobs into smaller units and execute them in parallel = A lot better performance.
Define Services to accomplish Workload Management.

Workload Management – Synopsis & Overview
Figure/Diagram from Oracle Documentation

Services – Synopsis & Overview
Introduced in 1999 with the release of Oracle 8i.
Workload Management: Services.
Services split up the workload into different classes.
Services decouple the hard-coded mapping between a connection request and a RAC instance.
Enable Automatic Fail-over/Recovery of applications.
Up to 100 services can be created in 10gR2.

Services – Synopsis & Overview
Services are a must for using the Load Balancing Advisory and Runtime Connection Load Balancing.
Each Service can run on multiple Instances.
Each Instance can have multiple Services.
Each Service is comprised of:
  Thresholds & Priorities.
  Job Classes to be managed.
  Priority of Execution.
  Date Range of Job Class.

Services – Synopsis & Overview
Can be created/managed in 10g by the following:
  Oracle Enterprise Manager (OEM) Grid Control.
  Database Configuration Assistant (DBCA).
  SRVCTL Command-Line Utility.
  DBMS_SERVICE PL/SQL database package in SQL*Plus.
Services can be relocated across nodes.
Integrated with Oracle Database Resource Manager (DRM).
Automatic Workload Repository (AWR) collects/contains/measures full-blown statistics about Services.

Services – Synopsis & Overview
Service Level Goals (3 Options):
  NONE
  SERVICE_TIME – Used for non-uniform completion times e.g. eCommerce Application.
  THROUGHPUT – Uniform completion times e.g. Stock-Trading System.
Distributed Transaction Processing (DTP) supported in 10gR2 to provide tight coupling with single instance.

Services – Synopsis & Overview
Connection Load-Balancing Goal:
  CLB_GOAL_SHORT: Used for short duration connections.
  CLB_GOAL_LONG: Default Value: Used for longer duration connections.
TAF Policy options:
  NONE
  BASIC
  PRECONNECT

Services – Synopsis & Overview
Some useful DBA & Dynamic Performance Views:
  DBA_SERVICES
  V$SESSION
  V$SERV_MOD_ACT_STATS
  V$ACTIVE_SESSION_HISTORY
  V$ACTIVE_SERVICES
  V$SERVICE_WAIT_CLASSES
  V$SERVICEMETRIC
  V$SERVICEMETRIC_HISTORY
  V$SERVICE_STATS
  V$SERVICE_EVENTS

Oracle Database Resource Manager – Synopsis & Overview
Features:
  Kill/prevent runaway queries automatically.
  CPU resource distribution for user classes.
  Degree of parallelism control.
  Limit the undo space for user classes.
  Limit session idle time.

Oracle Database Resource Manager – Synopsis & Overview
Consumer Groups: Grouping of user sessions into groups.
Attributes: User, Service, Module, Action.
Resource Manager Plans: Plan Allocation: CPU Allocation.
Set priority level: Low, High.
Pre-configured rules.

Fast Application Notification (FAN) – Synopsis & Overview
Fast Application Notification (FAN) events are used for notifying applications of cluster component failures, configuration changes and failure recovery for applications using RAC.
FAN includes functionality for dynamically starting and stopping applications and other related events.
Enables Fail-over/Recovery of Applications in case of cluster component failure, thereby providing High-Availability & Scalability.

Fast Application Notification (FAN) – Synopsis & Overview
Typical FAN Events are published for the following:
  Downed Services.
  Dead/Crashed/Slow nodes.
  Cluster Configuration Changes.
  TCP/IP Time-outs.
  Connection Time-outs.
  Non-Load Balancing of applications upon restart/scale-out of services.

Fast Application Notification (FAN) – Synopsis & Overview
FAN enables the following:
  Users are directed to Available Instances.
  Disruption of service is minimized/mitigated.
  Server-side callouts to notify Administrative Personnel, log events and create Problem-Tickets.
3 types of FAN events:
  Load Balancing Events
  Node Events
  Service Events

Fast Application Notification (FAN) – Synopsis & Overview
Applications can subscribe to FAN events using the Oracle Notification Service (ONS) API and the Oracle Call Interface (OCI).
FAN is integrated with the following Oracle Clients (Handled Automatically: Don’t require any code changes):
  Oracle Call Interface (OCI)
  Oracle JDBC
  Oracle ODP.NET

Fast Application Notification (FAN) – Synopsis & Overview
Server-side Callouts: Used to run an Event-Handling Shell Script/Compiled Executable in the CRS HOME/racg/usrco directory.
Executed Asynchronously upon the occurrence of a condition e.g. FAN posts an event to ONS about a change in state such as Startup/Stopping of an Instance, Database or Service.

Fast Application Notification (FAN) – Synopsis & Overview
Server-side Callouts – Can be used for:
  Logging of events e.g. generation of log file for event.
  Paging to Administrative Personnel e.g. Email, Paging etc.
  Starting/Stopping of Remote Daemons/Processes.

Fast Application Notification (FAN) – Synopsis & Overview
Figure/Diagram from Oracle Documentation

Oracle Notification Service (ONS) Overview
Publisher/Subscriber model for messaging.
Utilized by FAN to publish HA/Load-Balancing events.

Load Balancing Advisory – Synopsis & Overview
Used for Load-distribution amongst the various instances.
The Load Balancing Advisory analyzes the service/work-load level of the nodes within a RAC cluster and sends a Fast Application Notification (FAN) event to the application so that requests are sent to the instance offering the best service at the time of the request.
Detects and avoids sending jobs to slow and hung nodes.
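One way to picture how an application might act on advisory data is to split incoming requests in proportion to a per-instance percentage. This is a toy model under stated assumptions: the percentages, instance names and function name are hypothetical, and real connection pools receive the advisory through FAN events rather than a dictionary.

```python
import math


def distribute_requests(advisory_percent, total_requests):
    """Split total_requests across instances in proportion to advisory weights.

    advisory_percent: dict mapping instance name -> weight (e.g. percent).
    Uses the largest-remainder method so the counts always sum to the total.
    """
    total_weight = sum(advisory_percent.values())
    if total_weight <= 0:
        raise ValueError("at least one instance must have a positive weight")
    exact = {inst: total_requests * w / total_weight
             for inst, w in advisory_percent.items()}
    counts = {inst: math.floor(share) for inst, share in exact.items()}
    leftover = total_requests - sum(counts.values())
    # Hand the remaining requests to the largest fractional shares first.
    by_fraction = sorted(exact, key=lambda inst: exact[inst] - counts[inst],
                         reverse=True)
    for inst in by_fraction[:leftover]:
        counts[inst] += 1
    return counts
```

An instance reported with a weight of 0 (for example, a slow or hung node) receives no requests in this model, which mirrors the avoidance behavior described above.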

Load Balancing Advisory – Synopsis & Overview
Integrated with Automatic Workload Repository (AWR) & Connection Pools. V$SERVICE & V$SERVICEMETRIC_HISTORY updated every hour.

Load Balancing Advisory – Synopsis & Overview

Figure/Diagram from Oracle Documentation

Runtime Connection Load Balancing – Architecture Overview

Figure/Diagram from Oracle Documentation

Various types of Load Balancing in Oracle – Overview
Automatic Workload Management = Services.
Connection Load Balancing: Oracle NET Services:
  Client Side Load Balancing.
  Server Side Load Balancing.
Connection Pooling: Use Fast Connection Failover + Runtime Load Balancing.
Load Balancing: Parallel Execution.

Parallel Query – Synopsis & Overview
Parallel Query Options:
  Standard parallel query: Utilizes all available resources in cluster.
  Restricted parallel query: Processing limited to specific assigned nodes in cluster. Achieved by:
    Services
    Parallel Instance Group

Parallel Query – Synopsis & Overview
Figure/Diagram from Oracle Documentation

Oracle RAC Software: Storage & Organization – Overview
Figure/Diagram from Oracle Documentation

Oracle RAC: Shared HOME(s) vs. Non-Shared HOME(s) – Synopsis & Overview
Shared Oracle HOME: One copy of the Oracle HOME shared by all nodes within the cluster on a Shared File System.
  OS needs to be cross-node compatible.
  Shared Oracle HOME(s) cannot be used for rolling upgrades.
Non-Shared Oracle HOME: Each node has its own set of Oracle HOMEs mutually exclusive of other nodes.
Recommendation: Use Non-Shared Oracle HOME(s).

Rolling Upgrades – Overview
Figure/Diagram from Oracle Documentation

Oracle RAC – Administration & Management
Three levels of Management/Administration:
  Cluster-Level Administration
  Database-Level Administration
  Instance-Level Administration

Oracle RAC – Administration & Management
GUI: Oracle Enterprise Manager (OEM) Database/Grid Control
GUI: Database Configuration Assistant (DBCA)
GUI: Virtual Internet Protocol Configuration Assistant (VIPCA)
Command Line: Cluster Verification Utility (CVU)
Command Line: Oracle Interface Configuration Tool (OIFCFG)
SRVCTL
SQL*Plus
Oracle Clusterware Command-Line Interface

Oracle RAC – Administration & Management: Command-Line Utilities
  CRS_STAT
  CRSCTL
  CRS_STOP
  CRS_START
  SRVCTL

Cluster Verification Utility (CVU) – Synopsis & Overview
Cluster Verification Utility (CVU/CLUVFY) is a very useful tool for performing Pre and Post component-level checks at various stages of the Oracle RAC Install/Patch/Update process, in addition to various other system-level checks at all major stages of the deployment cycle.
Two scripts are provided for running CVU:
  Runcluvfy (runcluvfy.sh – Unix / runcluvfy.bat – Windows)
  Cluvfy (cluvfy.sh – Unix / cluvfy.bat – Windows)

Cluster Verification Utility (CVU) – Synopsis & Overview
Figure/Diagram from Oracle Documentation

Oracle RAC over Commodity Hardware
Oracle RAC works on low-cost commodity hardware to lower the cost of ownership and produce a high-availability parallelized Grid/Cloud computing architecture environment.
Oracle, along with Dell, EMC and Intel, launched Project MegaGrid in 2004 to demonstrate the cost-effectiveness, reliability and functionality of Grid Computing Infrastructure, employing Oracle Real Application Clusters on inexpensive commodity hardware as an economic and powerful alternative to the conventional SMP computing paradigm.

Exadata & Oracle RAC
Oracle Exadata – The world’s fastest series of database machines:
  Is based on and comes preconfigured with Oracle Real Application Clusters (RAC)
  No Single Point of Failure
  Infiniband, PCI Flash & Flash Cache = The Ultimate Database Consolidation Platform
  Concurrent Query/Updates at the same time with Multi-Version Read Consistency
  Beneficial for Trickle-Feed Data Loads
  Extreme Performance, Compression, Intelligence/Compute Capability at the Storage tier
  Load up to 5TB/hour
  Replace distributed systems with a consolidated system, utilizing the same Oracle Skillset that you currently have
  Get up to 10x query performance on mixed workloads
  5TB of Flash Cache = 56 Flash PCI Cards per Exadata Rack
  25GB/sec IO bandwidth, up to 50 GB/sec with FLASH
  50GB/sec Flash Bandwidth = Hypersonic Speeds!
Figure/Diagram from Oracle Documentation

Deploying Oracle RAC: A balanced system approach
Figure/Diagram from Oracle Documentation

Deploying Oracle RAC: A balanced system approach
Few large OR Many small: Either approach is fine/feasible from a scalability perspective.
Recommendation: Cluster nodes with uniform/even performance properties.
Recommendation: Avoid uneven architecture within Oracle RAC.

Deploying Oracle RAC: Destructive Testing
Figure/Diagram from Oracle Documentation

Oracle RAC: Project Deployment Phases/Goals
Deploy/Implement a TEST Cluster.
Test the System Infrastructure-As-a-Whole to identify/correct any Integration Flaws.
QA Regression/Stress Testing.
QA Beta Testing.
Database worst-case scenario testing for possible long-running/slow queries that pose a bottleneck/domino threat to the system.

Oracle RAC: Project Deployment Phases/Goals
Expansive Real Application Testing (Where possible > 10g).
RAW Database Load Testing.
Simulate Peak Performance load on the database, collect statistics and analyze results.
Comprehensive and Repetitive Rehearsal of deployment to Production.

Deploying Applications on Oracle RAC – Recommendations & Best Practices
  Automatic Storage Management (ASM)
  Workload Management: Services
  SPFILEs on Shared Storage
  Automatic Undo Management
  Automatic Segment Space Management
  Automatic Database Diagnostic Monitor (ADDM) in conjunction with Automatic Workload Repository (AWR)

General Recommendations for a healthier/faster RAC system
Keep Batch & OLTP processes on separate instances.
Use Automatic Segment Space Management (ASSM).
Remove PCTUSED, FREELISTS and FREELIST GROUPS.
Use Reverse-Key and Hash-Partitioned indexes.
Increase the Cache for High-Usage Sequences and change them to NOORDER.
Commit Sizes should be reduced for faster operations.
LMS processes <= No. of CPUs.
Fewer LMS processes with high utilization = More Efficiency.
Use Network Time Protocol (NTP) to synchronize the time on all nodes within an Oracle RAC cluster.
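Why Reverse-Key indexes help in RAC can be shown with a toy example: sequentially generated keys share their leading bytes and so land in the same index leaf block, which every instance then contends for, while reversing the bytes spreads them out. The sketch below uses a plain big-endian integer encoding as a stand-in, which is an assumption for illustration; Oracle's internal key format differs.

```python
def reverse_key(key: int, width: int = 4) -> bytes:
    """Return the byte-reversed form of a key (toy big-endian encoding)."""
    return key.to_bytes(width, "big")[::-1]


def leading_bytes(keys, reverse=False, width=4):
    """Collect the distinct leading bytes of a set of keys.

    Many distinct leading bytes suggest inserts are spread over many
    index leaf blocks; a single value suggests a right-hand hot spot.
    """
    encode = reverse_key if reverse else (lambda k, w=width: k.to_bytes(w, "big"))
    return {encode(k, width)[0] for k in keys}
```

With four consecutive sequence values, the unreversed keys all share one leading byte, whereas the reversed keys each start with a different byte. The trade-off is that range scans over a reverse-key index are no longer efficient.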

General Recommendations for a healthier/faster RAC system
High-DML tables should have fewer rows per block: ALTER TABLE MINIMIZE RECORDS_PER_BLOCK.
Tune INITRANS and FREELISTS to mitigate block contention.
Partition High-Usage Database Segments to minimize resource contention.
Move as much PL/SQL code as possible from client to Server-Side e.g. Server-side Stored Procedures/Packages.
Global Dynamic Performance Views = GV$ Prefix.

High Availability: Redundancy is Crucial
High-Availability requires redundant components to be built in to every layer of the Infrastructure Stack to eliminate SPOFs (Single Points of Failure).
Geographical Redundancy: Multiple Data Centers at geographically distant locations.
Oracle DataGuard provides protection for Site-Level failure.
Multiple Power/Air-Conditioning/Other Critical Resource units within each Data Center site.

High Availability: Redundancy is Crucial
Software Redundancy at each Data Center site:
  Multiple Web Servers.
  Multiple Application/File/BI/Batch/Transaction/Interface/Reporting/Services/Support/Other/Management/Monitoring Servers.
  Oracle RAC: Multiple Oracle Database Instances.
Storage Redundancy at each Data Center site:
  Multiple SAN(s).
  RAID groups within each SAN.

High Availability: Redundancy is Crucial
Hardware Redundancy at each Data Center site:
  Multiple Global Traffic Managers (GTM).
  Multiple Local Traffic Managers (LTM).
  Multiple Servers at every tier-level.
  Multiple Network Interface Cards (NICs) within each server.
  Multiple HBAs in each server.
  Multiple Network Switches/VLANs.
  Multiple Fibre/Infiniband Switches between the SAN and servers.
  Multiple Storage Area Networks (SANs).
  Multiple Power, Cooling, Hard Drives, CPUs etc. within each server.

Attaining High Availability: It's not JUST the technology!
Processes and People along with Technology are crucial in implementing and achieving HA.
The Right Mindset: HA is expensive to acquire, deploy and maintain but pays off in the medium/long run.
Comprehensive Knowledge Transfer (CKT): Every new technology comes with a learning curve.

Attaining High Availability: It's not JUST the technology!
Processes/Management need to be in place: Problems, Incidents, Change Control, Risk, Downtime Planning, Releases etc.
SLOs and SLAs must be negotiated, agreed upon, honored & monitored.
Elaborate and Exhaustive Testing must be performed at all levels of the Infrastructure Stack.

Monitoring Oracle RAC
Establish Baselines.
Workload Monitoring: Peak/Average/Various-times-of-day.
Resource Monitoring: Network, IO, CPU, Memory, Transactions.
Interconnect Monitoring: Latency, Efficiency.
Consumption > 70% CPU = Add another node.
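The 70% CPU rule of thumb can be turned into a small capacity check. The sketch below assumes per-node CPU utilization samples are already available (for example from VMSTAT or OS Watcher) and interprets the rule as applying to the cluster-wide average, which is an assumption; the function names and input format are hypothetical.

```python
def nodes_over_threshold(cpu_percent_by_node, threshold=70.0):
    """Return the names of nodes whose CPU utilization exceeds the threshold."""
    return sorted(node for node, pct in cpu_percent_by_node.items()
                  if pct > threshold)


def should_add_node(cpu_percent_by_node, threshold=70.0):
    """Suggest scale-out when average CPU across the cluster exceeds the threshold."""
    if not cpu_percent_by_node:
        raise ValueError("no CPU samples supplied")
    average = sum(cpu_percent_by_node.values()) / len(cpu_percent_by_node)
    return average > threshold
```

A single hot node with an otherwise idle cluster would show up in `nodes_over_threshold` without triggering `should_add_node`, which points toward rebalancing services rather than adding hardware.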

Monitoring Oracle RAC
3 levels of monitoring:
  OS
  Application
  Database
Compare/Match statistics/metrics reported by Oracle with statistics/metrics reported by OS/3rd-Party tools.

Monitoring Oracle RAC
Monitoring Tools:
  VMSTAT
  IOSTAT
  NETSTAT
  OS Watcher

Monitoring Oracle RAC
Database Monitoring:
  AWR
  ADDM
  Statspack
  ASH
OEM Grid Control: Enables/Facilitates all of the above in an intuitive easy-to-use GUI.

Troubleshooting Oracle RAC
Split-Brain: A cluster’s worst nightmare.
Node Eviction.
IO Fencing.
STONITH: Shoot The Other Node In The Head.
(RACDIAG.SQL): Script to Collect RAC Diagnostic Information [MetaLink ID 135714.1]
RACDDT 2.0.5 User Guide [MetaLink ID 360926.1]

Troubleshooting Oracle RAC
Log files:
  Resource specific logs
  Cluster Network Communication logs
  CRS alert logs
  CSS logs
  CRS logs
  EVM logs
  OPMN logs
  SRVM logs
  Listener Logs
Trace files:
  BDUMP
  UDUMP
  CDUMP

Troubleshooting Oracle RAC
Master Note for Real Application Clusters (RAC) Oracle Clusterware and Oracle Grid Infrastructure [MetaLink ID 1096952.1].
RAC: Frequently Asked Questions [MetaLink ID 220970.1].
Data Gathering for Instance Evictions in a RAC environment (ORA-29740) [MetaLink ID 412884.1].
Resolving Instance Evictions on Windows Platforms [MetaLink ID 297498.1].
OS Watcher (OSW) for various Unix flavors (Tru64, AIX, HP-UX, Solaris, Linux).
Perfmon for Windows family of OS.

Summary
To summarize, Oracle RAC is a mature, stable and robust clustering version of Oracle's database server product. It provides fault-tolerance against a single-point-of-server-failure, load-balancing and scalability, and is a lower-cost alternative to Mainframe-like SMP (Symmetric Multi-Processing) models of computing. It is used by government entities, corporations and organizations across the planet to provide continuous service.
Learn more about Oracle RAC at Oracle's RAC homepage: http://www.oracle.com/technology/products/database/clustering/index.html
