Demystifying Oracle RAC Internals
Barb Lundhild RAC Product Management
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Answer most common questions about Oracle Clusterware and Oracle RAC
• Architecture
• Oracle Clusterware – Group Membership
• Oracle Cluster Registry
• The Interconnect
• The Public Network and the Virtual IP (VIP)
• Oracle RAC Startup/Shutdown
• Advanced Features of Oracle RAC
• Appendix
RAC Architecture
[Figure: nodes 1 to n on the public network, each running a VIP, Service, Listener, instance, ASM, Oracle Clusterware, and the Operating System; the nodes are joined by the cluster interconnect and attached to shared storage (labelled "Managed by ASM" and "RAW Devices") holding the redo / archive logs of all instances, the database / control files, and the OCR and voting disks]

What does Clusterware provide?
[Figure: Oracle Clusterware on top of the Operating System, providing VIP, Event Management, High Availability Framework, Process Monitor, and Group Membership]
Oracle Clusterware Architecture
[Figure: Oracle Clusterware components VIP, EVM, RACG, CRS, OPROC, and CSS on top of the Operating System]
Oracle Clusterware Group Membership and Heartbeats
• Cluster needs to know who is a member at all times
• Oracle Clusterware has 2 heartbeats:
  • Network heartbeat: if a node does not send a heartbeat for MissCount (time in seconds), then the node is evicted from the cluster
  • Disk heartbeat: if the disk heartbeat is not updated within the I/O timeout, then the node is evicted from the cluster

Oracle Clusterware Split Brain Resolution
• When the interconnect breaks – keeps the largest cluster possible up, other nodes will be evicted; in a 2 node cluster the lowest number node remains
• IO Fencing similar to the Stonith algorithm
• Voting disk is used to detect network problems that could lead to a split-brain
• Final arbiter of the status of configured nodes, either up or down, and delivers eviction notices
• Recommended to have at least 3 voting disks
• Standard NFS support for 3rd voting disk on Linux, AIX, or Solaris
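The heartbeat and split-brain rules above can be illustrated with a short sketch. It is not how Oracle Clusterware is implemented: the function names, the representation of sub-clusters as sets of node numbers, and the tie-break for clusters larger than two nodes are assumptions made for illustration only.

```python
from typing import Dict, List, Set

MISSCOUNT_SECONDS = 30  # default MissCount quoted in the slides


def nodes_missing_heartbeat(last_seen: Dict[int, float], now: float,
                            misscount: float = MISSCOUNT_SECONDS) -> Set[int]:
    """Return the node numbers whose last network heartbeat is older than misscount."""
    return {node for node, seen in last_seen.items() if now - seen > misscount}


def surviving_subcluster(subclusters: List[Set[int]]) -> Set[int]:
    """Pick the sub-cluster that stays up after the interconnect breaks.

    The largest group wins. On a tie (for example a 2-node cluster split 1/1)
    the group containing the lowest node number wins. Applying that tie-break
    to more than two nodes is an assumption extrapolated from the 2-node rule.
    """
    return max(subclusters, key=lambda group: (len(group), -min(group)))
```

For example, `surviving_subcluster([{1}, {2}])` keeps node 1, and every node outside the returned set would be evicted.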
IT IS NOT SUPPORTED TO REDUCE MISSCOUNT BELOW THE DEFAULT (30s)

Oracle Cluster Registry
Oracle Cluster Registry (OCR)
• A repository containing the definition of the configuration of the cluster and status of resources managed by the cluster
• Required file(s) for Oracle Clusterware
• Initialized during install of Oracle Clusterware
• Location defined in Registry on Windows or OCR.LOC on Linux and Unix
• Mirrored by Oracle Clusterware or externally (RAID)
• Supports both automatic (every 4 hours) and manual (new in 11.1) backups
  • ocrconfig -manualbackup
• Tools to manage OCR
  • OCRCONFIG – command line tool to manage backups, import, export, restore, repair, and replace
    • Make sure you have a good backup before changing the cluster configuration!
  • OCRCHECK – checks integrity and displays the version of the OCR's block format, total space available, used space, and the OCR locations that you have configured
  • OCRDUMP – view the OCR contents by writing OCR content to a file or stdout in a readable format
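The advice to take a good backup before changing the cluster configuration can be scripted. The sketch below is one possible wrapper, not an Oracle-supplied tool: `backup_then_change`, `run_checked`, and the injectable `runner` are hypothetical names, and it assumes `ocrcheck` and `ocrconfig` are on the PATH of a suitably privileged user on an 11.1 or later cluster node.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def backup_then_change(change_cmd: Sequence[str],
                       runner: Runner = run_checked) -> List[List[str]]:
    """Check the OCR, take a manual backup, and only then run change_cmd."""
    steps = [
        ["ocrcheck"],                    # integrity and space report
        ["ocrconfig", "-manualbackup"],  # on-demand backup (new in 11.1)
        list(change_cmd),                # the actual configuration change
    ]
    for step in steps:
        runner(step)  # an exception here stops before later steps run
    return steps
```

Because the runner raises on the first failing step, a failed integrity check or backup prevents the change from being attempted.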
Interconnect Failure Protection and Scalability

Private Interconnect
[Figure: nodes 1 to n on the public network, each running a VIP, Service, Listener, instance, ASM, Oracle Clusterware, and the Operating System, all connected to the cluster interconnect through Switch 1 and Switch 2]
The Interconnect
• Interconnect is typically a standard GigE network
• IP over IB is supported
• Network should use a private dedicated non-routable switch or VLAN
• A crossover cable is not supported as an interconnect
• For high availability and scalability use an OS based solution to combine multiple physical links into a single logical link
• Same technology can be applied to the public network
• Only the logical link should be provided to Oracle Clusterware and therefore Oracle RAC

Public Network and VIP Failure Protection
Why Oracle RAC has a VIP?
• Protects database clients from long TCP/IP timeouts (can be >10 minutes)
• During normal operation, works the same as hostname
• During failure, it removes network timeout from connection request time; client fails immediately to next address in the list

sales.us.acme.com =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = on)(FAILOVER = ON)
      (ADDRESS = (PROTOCOL = tcp)(HOST = sales1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = tcp)(HOST = sales2-vip)(PORT = 1521)))
    (CONNECT_DATA = (SERVICE_NAME = sales.us.acme.com)))

Oracle RAC VIP: The Details
• One for each node in cluster
• Required for Oracle Clusterware installation
• IP and network name should not currently be in use
• Should be registered in DNS and must be on the same subnet as public IP address
• Configuration managed by VIPCA and SRVCTL
• Note that netmask defaults to 255.255.255.0, rather than defaulting to netmask of underlying physical interface
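The effect of FAILOVER=ON with VIP addresses can be pictured as a loop over the address list. The sketch below is a plain TCP illustration and not Oracle Net's implementation; `connect_first_available` and `tcp_connect` are hypothetical names, and the random ordering that LOAD_BALANCE=on adds is left out.

```python
import socket
from typing import Any, Callable, Optional, Sequence, Tuple

Address = Tuple[str, int]


def tcp_connect(address: Address, timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection; raises OSError if it is refused or times out."""
    return socket.create_connection(address, timeout=timeout)


def connect_first_available(addresses: Sequence[Address],
                            connect: Callable[[Address], Any] = tcp_connect
                            ) -> Tuple[Address, Any]:
    """Try each address in order and return the first one that accepts."""
    last_error: Optional[OSError] = None
    for address in addresses:
        try:
            return address, connect(address)
        except OSError as exc:  # refused or unreachable: move to the next address
            last_error = exc
    raise ConnectionError(f"no address in the list is reachable: {last_error}")
```

A VIP that has relocated after a node failure answers at once with an error instead of leaving the client to wait for a TCP timeout, so the loop reaches the next address quickly.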
Oracle RAC VIP is DIFFERENT
• Only accepts connections when on its home node
• Failure on home node: relocates to another node in the cluster only to send an error back to the client (it will not be in the listener so connections are not accepted!)
• You will only have one active RAC VIP per node (there may be others who have relocated due to failure!)
• Independent of number of databases running in cluster

Oracle RAC VIP
[root@pmrac1 root]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:12:79:D8:90:93
          inet addr:144.…  Bcast:144.…  Mask:255.…
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5070815 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3064435 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:509963813 (486.3 Mb)  TX bytes:3621223517 (3453.4 Mb)
          Interrupt:25
eth0:1    Link encap:Ethernet  HWaddr 00:12:79:D8:90:93
          inet addr:144.…  Bcast:…  Mask:255.…
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5762695 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5679252 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3400642002 (3243.1 Mb)  TX bytes:3166774792 (3020.0 Mb)
          Interrupt:25
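The requirement that the VIP be on the same subnet as the public IP address, and the effect of the VIP netmask defaulting to 255.255.255.0, can be checked before running VIPCA. This is a minimal sketch using Python's standard `ipaddress` module; `same_subnet` is a hypothetical helper and the addresses in the usage note are documentation-range examples, not values from the slides.

```python
import ipaddress


def same_subnet(public_ip: str, vip: str, netmask: str) -> bool:
    """Return True if vip lies in the subnet defined by public_ip and netmask."""
    # strict=False lets a host address (not only a network address) define the subnet
    network = ipaddress.ip_network(f"{public_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(vip) in network
```

With a public address of 192.0.2.10, a candidate VIP of 192.0.3.30 passes the check under a 255.255.252.0 mask but fails under the 255.255.255.0 default, which is the kind of mismatch the netmask note warns about.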
Listener.ora

SID_LIST_LISTENER_PMRAC1 =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /u01/oracle/product/10gR2/asm)
      (PROGRAM = extproc)
    )
  )
LISTENER_PMRAC1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1))
      (ADDRESS = (PROTOCOL = TCP)(HOST = pmrac1-vip)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 144.25.214.45)(PORT = 1521)(IP = FIRST))
    )
  )

Use the VIP in the Address List
Automatically completed by DBCA
• init.ora
    remote_listener = listeners_sales
    local_listener = listeners_sales1
• tnsnames.ora in RAC ORACLE_HOME
    LISTENERS_SALES =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = node3-vip)(PORT = 1521))
      )
    LISTENERS_SALES1 =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
      )
Application VIPs
• New resource as of Oracle RAC 10g Release 2
• Created as functional VIPs which can be used to connect to an application regardless of the node it is running on
• VIP is a dependent resource of the user registered application
• There can be many VIPs, one per User Application

Creating an Application VIP
• The usrvip script must run as root
• The default permissions need to be changed after registration
  • As root… crs_setperm ApplicationVIP1 -o root
• Allow oracle user to execute this script
  • As root… crs_setperm ApplicationVIP1 -u user:oracle:r-x
• Start the VIP
  • As oracle… crs_start ApplicationVIP1
Oracle RAC Startup/Shutdown

Node Startup Sequence
[Figure: the stack on one node, listed top to bottom: VIP1, Service, Listener, Instance 1, ASM, Oracle Clusterware, Operating System]
Oracle Dependencies
[Figure: two-node RAC diagram showing VIP, Service, Listener, instance, ASM, Oracle Clusterware, and Operating System on each node, with the public network, the cluster interconnect, and shared storage holding redo / archive logs, database / control files, and OCR and voting disks]

Oracle Dependencies Prior to 10.2.0.3
[Figure: the same two-node RAC diagram]
Advanced Features of RAC
High Availability and Load Balancing for Applications

Services
• Application workloads can be defined as Services
  • Individually managed and controlled
  • Assigned to instances during normal startup
  • On instance failure, automatic re-assignment
  • Service performance individually tracked
  • Finer grained control with Resource Manager
  • Integrated with other Oracle tools / facilities (e.g. Scheduler, Streams)
• Managed by Oracle Clusterware
• Several services created and managed by database server

Many features discussed do not apply to the default database service
Cluster Managed Services
• Service has a set of resources defined to Oracle Clusterware
• Oracle Clusterware manages start/stop/re-locate based on definition
• Define Preferred (normal operations) and Available (if failure occurs) instances
• Dependent on Instance and VIP
• Manage using Enterprise Manager
• SRVCTL CLI for Cluster configuration
• DBMS_SERVICE PL/SQL package

What is FAN?
• Fast Application Notification (FAN) is a RAC notification mechanism
• FAN HA Events: Notification of Up/Down for service, instance & node
• Load Balancing Advisory Events: Advise clients of current load for service and where to send connection requests
• Enable it, and Forget it
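The Preferred and Available instance lists of a cluster managed service can be read as a simple placement rule. The sketch below is a simplified model for illustration, not the Oracle Clusterware placement algorithm; `place_service` and the instance names are hypothetical.

```python
from typing import List, Sequence, Set


def place_service(preferred: Sequence[str], available: Sequence[str],
                  up: Set[str]) -> List[str]:
    """Return the instances that should run the service.

    Preferred instances that are up run the service. Each preferred instance
    that is down is replaced by the next available instance that is up.
    """
    running = [inst for inst in preferred if inst in up]
    failed_count = len(preferred) - len(running)
    spares = [inst for inst in available if inst in up and inst not in running]
    return running + spares[:failed_count]
```

With `preferred=["sales1", "sales2"]` and `available=["sales3"]`, losing `sales2` moves that share of the service to `sales3`, while normal operation leaves `sales3` unused.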
Oracle Notification Service (ONS)
• Publish/Subscribe Messaging System
• Allows both local and remote consumption
• Used by Fast Application Notification (FAN) to publish HA Events and Load Balancing Events
• Used by FAN clients to subscribe to events
• Automatically installed and configured by the installation of Oracle Clusterware
• DO NOT TURN OFF – Required by Oracle Clusterware and RAC

FAN Clients
• HA Events: JDBC Implicit Connection Cache, OCI, ODP.NET Connection Pools, Server Side Callouts, Listener, CMAN
• Load Balancing Advisory Events: JDBC Implicit Connection Cache, ODP.NET Connection Pools, Listener, CMAN
• New with 11.1.0.7: Universal Connection Pool for Java
Fast Connection Failover
• Fast and reliable high availability for connections in an Oracle Real Application Clusters 10g environment
• Enable it and forget it
• Application can make it transparent to the user by trapping the SQL Exception and retrying
• Supported by Oracle JDBC, OCI, and ODP.NET

Load Balancing Advisory
• Load Balancing Advisory is an advisory for balancing work across RAC instances
• Load balances at the transaction level (not connections!)
• Directs work to where services are executing well and resources are available
• Adjusts distribution for different power nodes, different priority and shape workloads, changing demand
• Stops sending work to slow, hung, failed nodes early
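Trapping the exception and retrying, as the Fast Connection Failover slide suggests, might look like the sketch below. It is a generic retry helper under stated assumptions, not part of any Oracle driver: `retry_on_connection_error` is a hypothetical name, and Python's built-in `ConnectionError` stands in for whatever exception the driver in use raises when an instance goes down.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_on_connection_error(
        work: Callable[[], T],
        retriable: Tuple[Type[BaseException], ...] = (ConnectionError,),
        attempts: int = 3,
        delay_seconds: float = 0.0) -> T:
    """Call work() and retry it when it fails with a retriable exception."""
    for attempt in range(1, attempts + 1):
        try:
            return work()
        except retriable:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")
```

A retry like this is only safe when `work` can be repeated without side effects, for example when it obtains a fresh connection from the pool and replays a transaction that was never committed.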
Runtime Connection Load Balancing
• When the application does “getConnection”, the connection given is the one that will provide the best service
• Supported by Oracle JDBC, OCI, and ODP.NET connection pools
• Policy defined by setting GOAL on Service
• Need to have Oracle Net Services Connection Load Balancing

Leverage Temporal Connection Affinity
New with 11.1.0.7
[Figure: a Web Client obtains a connection from the Pool, which holds a Connection Affinity Context ("Connect to me") and routes to Instance1, Instance2, or Instance3 of the RAC Database]
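One way to picture Runtime Connection Load Balancing is a weighted choice driven by the advisory. The sketch below assumes the advisory is available as a mapping from instance name to a percentage of work; that representation, the function name `pick_instance`, and the use of weighted random selection are illustrative assumptions rather than the behaviour of any Oracle connection pool.

```python
import random
from typing import Dict, Optional


def pick_instance(advisory: Dict[str, float],
                  rng: Optional[random.Random] = None) -> str:
    """Pick an instance with probability proportional to its advisory percentage."""
    rng = rng or random.Random()
    # Instances advised at 0% (slow, hung, or failed) receive no new work
    candidates = {name: pct for name, pct in advisory.items() if pct > 0}
    if not candidates:
        raise RuntimeError("no instance is currently accepting work")
    names = list(candidates)
    weights = [candidates[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Calling `pick_instance({"inst1": 60, "inst2": 40, "inst3": 0})` repeatedly sends roughly 60% of requests to `inst1` and none to `inst3`.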
Leverage XA Connection Affinity
New with 11.1.0.7
• DB 11g fixes the correctness problem. XA Affinity adds Performance and Scalability.
• Eliminates current single DTP service limitation for XA/RAC
• XA affinity is the ability to automatically localize a global transaction to a single RAC instance
• Scope is the life of a global transaction
  • First connection request for a global transaction uses Runtime Connection Load Balancing (RCLB)
  • Subsequent requests use affinity and are routed to the same RAC instance where XA first started

Q & A
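The routing rule for XA affinity can be sketched as a small lookup keyed by global transaction. `XaAffinityRouter`, its methods, and the string transaction identifiers are hypothetical; the picker passed in stands for whatever Runtime Connection Load Balancing decision the pool makes on the first request.

```python
from typing import Callable, Dict


class XaAffinityRouter:
    """Route every request of a global transaction to one RAC instance."""

    def __init__(self, pick_instance: Callable[[], str]) -> None:
        self._pick_instance = pick_instance  # load-balancing choice for first requests
        self._instance_by_xid: Dict[str, str] = {}

    def route(self, xid: str) -> str:
        """Return the instance for xid, choosing one on the first request."""
        if xid not in self._instance_by_xid:
            self._instance_by_xid[xid] = self._pick_instance()
        return self._instance_by_xid[xid]

    def end(self, xid: str) -> None:
        """Forget the affinity once the global transaction has finished."""
        self._instance_by_xid.pop(xid, None)
```

Calling `end` when the transaction completes keeps the affinity scoped to the life of the global transaction, so a later transaction is load balanced afresh.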
Appendix

For More Information
http://search.oracle.com (search for REAL APPLICATION CLUSTERS)
or otn.oracle.com/rac
Useful Metalink Notes
• Note 342082.1 “How to Change Subnet Masks for VIPs”
• Note 294430.1 “CSS Timeout Computation in RAC 10g”
• Note 284752.1 “10g RAC: Steps To Increase CSS Misscount, Reboottime and Disktimeout”
• Note 291962.1 “Setting Up Bonding in SLES 9”
• Note 291958.1 “Setting Up Bonding in Suse SLES8”
• Note 298891.1 “Configuring Linux for the Oracle 10g VIP using bonding”
• Note 283107.1 “Configuring Solaris IP Multipathing (IPMP) for the Oracle 10g VIP”

OTN.ORACLE.COM/RAC
• Workload Management with Oracle Real Application Clusters (FAN, FCF, Load Balancing)
• Using standard NFS to support a third voting disk on a stretch cluster configuration on Linux
• Using Oracle Clusterware to Protect 3rd Party Applications
• New: otn.oracle.com/clusterware
• RAC Sample Code Page http://www.oracle.com/technology/sample_code/products/rac/index.html