RAC: Frequently Asked Questions [ID 220970.1]

Modified 08-MAR-2010 Type FAQ Status PUBLISHED

Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.1 to 11.2.0.1 - Release: 9.2 to 11.2

Purpose
Frequently Asked Questions for Real Application Clusters and Grid Infrastructure.

Questions and Answers
Grid Computing / RAC
General RAC
• What is Cache Fusion and how does this affect applications?
• Is it difficult to transition (migrate) from Single Instance to Oracle RAC?
• What are the dependencies between OCFS and ASM in Oracle Database 10g?
• Do we have to have Oracle Database on all nodes?
• Is rcp and/or rsh required for normal Oracle RAC operation?
• What software is necessary for Oracle RAC? Does it have a separate installation CD to order?
• What Application Design considerations should I be aware of when moving to Oracle RAC?
• What kind of HW components do you recommend for the interconnect?
• Where can I find a list of supported solutions to ensure NIC availability / redundancy (for the interconnect) per platform?
• I have changed my spfile with alter system set parameter_name =.... scope=spfile. The spfile is on ASM storage and the database will not start.
• What combinations of Oracle Clusterware, Oracle RAC and ASM versions can I use?
• I had a 3 node Oracle RAC. One of the nodes had to be completely rebuilt as a result of a problem. As there are no backups, what is the proper procedure to remove the 3rd node from the cluster so it can be added back in?
• Does Weblogic (WLS) support Services, FAN/FCF, and the Load Balancing Advisory (LBA) with Oracle RAC?
• Are jumbo frames supported for the RAC interconnect?
• Is a relink required for the clusterware home after an OS upgrade?
• Are Sun Logical Domains (ldoms) supported with RAC?
• What is Standard Edition Oracle RAC?

• How do I determine which node in the cluster is the "Master" node?
• Where do I find Oracle Clusterware binaries and ASM binaries with Oracle Database 11g Release 2?
• If my OCR and Voting Disks are in ASM, can I shutdown the ASM instance?
• I have the 11.2 Grid Infrastructure installed and now I want to install an earlier version of Oracle Database (11.1 or 10.2), is this supported?
• What is SCAN?
• I get an error with DBCA from 10.2 or 11.1 after I have installed the 11.2 Grid Infrastructure?
• I get the following error starting my Oracle RAC database, what do I do? WARNING: No cluster interconnect has been specified.
• Are block devices supported for OCR, Voting Disks, ASM devices?
• Can I use iSCSI storage with my Oracle RAC cluster?
• What would you recommend to customer, Oracle Clusterware or Vendor Clusterware (I.E. HP Service Guard, HACMP, Sun Cluster, Veritas etc.) with Oracle Real Application Clusters?
• When configuring the NIC cards and switch for a GigE Interconnect should it be set to FULL or Half duplex in Oracle RAC?
• Can I use Oracle RAC in a distributed transaction processing environment?
• How can a NAS storage vendor certify their storage solution for Oracle RAC?
• Can I run Oracle 9i RAC and Oracle RAC 10g in the same cluster?
• Is Infiniband supported for the Oracle RAC interconnect?
• What storage is supported with Standard Edition Oracle RAC?
• My customer has an XA Application with a Oracle RAC Database, can I do Load Balancing across the Oracle RAC instances?
• What are the restrictions on the SID with an Oracle RAC database? Is it limited to 5 characters?
• Does Oracle Clusterware or Oracle Real Application Clusters support heterogeneous platforms?
• I want to use rconfig to convert a single instance to Oracle RAC but I am using raw devices in Oracle RAC. Does rconfig support RAW?
• How many NICs do I need to implement Oracle RAC?
• Can we designate the place of archive logs on both ASM disk and regular file system, when we use SE RAC?
• Can my customer use Veritas Agents to manage their Oracle RAC database on Unix with SFRAC installed?
• Can I run more than one clustered database on a single Oracle RAC cluster?
• Is there a need to renice LMS processes in Oracle RAC 10g Release 2?
• Can I run Oracle RAC 10g with Oracle RAC 11g?

• Can I have multiple public networks accessing my Oracle RAC?
• I could not get the user equivalence check to work on my Solaris 10 server when trying to install 10.2.0.1 Oracle Clusterware. The install ran fine without issue. << Message: Result: User equivalence check failed for user "oracle". >>
• Is it supported to install Oracle Clusterware and Oracle RAC as different users?
• Why does the NOAC attribute need to be set on NFS mounted RAC Binaries?
• We are using Transparent Data Encryption (TDE). We create a wallet on node 1 and copy to nodes 2 & 3. Open the wallet and we are able to select encrypted data on all three nodes. Now, we want to REKEY the MASTER KEY. What do we have to do?
• How do I check for network problems on my interconnect?
• Are there any issues for the interconnect when sharing the same switch as the public network by using VLAN to separate the network?
• The Veritas installation document on page 219 asks for setting LD_LIBRARY_PATH_64. Should I remove this?
• Why does netca always create the listener which listens to public ip and not VIP only?
• Does changing uid or gid of the Oracle User affect Oracle Clusterware?
• Can we output the backupset onto regular file system directly (not onto flash recovery area) using RMAN command, when we use SE RAC?
• Should the SCSI-3 reservation bit be set for our Oracle Clusterware only installation?
• Is it a good idea to add anti-virus software to my RAC cluster?
• How do I use DBCA in silent mode to set up RAC and ASM?
• Can RMAN backup Oracle Real Application Cluster databases?
• I am receiving an ORA-29740 error. What should I do?
• Is Oracle Application Server integrated with FAN and FCF?
• How do I configure FCF with BPEL so I can use Oracle RAC 10g in the backend?
• Where can I find more information on cluster_interconnects?
• How does OCR mirror work? What happens if my OCR is lost/corrupt?
• If I change my cluster configuration, do I need to update the ONS configuration on my middle tier?
• Can I change my SCAN after I have completed my Grid Infrastructure 11g Release 2 install?
• Why do we have a Virtual IP (VIP) in Oracle RAC 10g or 11g? Why does it just return a dead connection when its primary node fails?
• What do the VIP resources do once they detect a node has failed/gone down? Are the VIPs automatically acquired, and published, or is manual intervention required? Are VIPs mandatory?

RAC Assistance High Availability

• I am using shared services with the following set in init.ora: SQL> show parameters dispatchers=(protocol=TCP)(listener=listeners_nl01)(con=500)(serv=oltp). I stopped my service with srvctl stop service but it is still registered with the listener and accepting connections. Is this expected?
• After executing DBMS_SERVICE.START_SERVICE, the service resource remains in an OFFLINE status when I display the cluster resource status. Is that expected behaviour?
• I have a 2 node Oracle RAC cluster, if I pull the interconnect on node 1 to simulate failure, why does node 2 reboot?
• Can our Oracle RAC 10g VIP fail over from NIC to NIC as well as from node to node?
• Is there a way to provide or configure HA for the interconnect using Infiniband on AIX?
• With three primary load balancing options (client-side connect-time LB, server-side connect-time LB, and the runtime connection load balancing) is it fair to say Runtime Connection Load Balancing is the only option to leverage FAN up/down events?
• What is Server-side Transparent Application Failover (TAF) and how do I use it?
• What is CLB_GOAL and how should I set it?
• What does the Virtual IP service do? I understand it is for failover but do we need a separate network card? Can we use the existing private/public cards? What would happen if we used the public ip?
• How can a customer mask the change in their clustered database configuration from their client or application? (I.E. So I do not have to change the connection string when I add a node to the Oracle RAC database)
• Is it possible to use SRVCTL start database with a user account other than oracle (that is other than the owner of the oracle software)?
• Does Oracle support rolling upgrades in a cluster?
• What are my options for load balancing with Oracle RAC? Why do I get an uneven number of connections on my instances?
• What should I do to make my Oracle RAC deployment highly available?
• If I use Services with Oracle RAC, do I still need to set up Load Balancing?

High Availability - FAN/FCF

• I want to configure a secure environment for ONS so have added a Wallet however I am seeing errors (SSL handshake failed) after adding the wallet?
• Do I need to install the ONS on all my mid-tier servers in order to enable JDBC Fast Connection Failover (FCF)?
• Will FAN/FCF work with the default database service?
• Will FAN work with SQLPlus?
• What do I do if I am getting handshake failed messages in my ONS.LOG file every minute?
• Why am I seeing the following warnings in my listener.log for my RAC 10g environment? WARNING: Subscription for node down event still pending
• Can I use the 10.2 JDBC driver with 10.1 database for FCF?
• What clients provide integration with FAN through FCF?
• Can I use TAF and FAN/FCF?
• How does the datasource properties initialLimit, minLimit, and maxLimit affect Fast Connection Failover processing with JDBC?
• Will FAN/OCI work with Instant Client?
• What type of callbacks are supported with OCI when using FAN/FCF?
• Does FCF for OCI react to FAN HA UP events?
• Can I use FAN/OCI with Pro*C?
• Do I have to link my OCI application with a thread library? Why?

Scalability

• I am seeing the wait events 'ges remote message', 'gcs remote message', and/or 'gcs for action'. What should I do about these?
• What are the changes in memory requirements from moving from single instance to RAC?
• Will adding a new instance to my Oracle RAC database (new node to the cluster) allow me to scale the workload?
• How do I change my Veritas SF RAC installation to use UDP instead of LLT?
• Can I have different servers in my Oracle RAC? Can they be from different vendors? Can they be different sizes?
• What do I do if I see GC CR BLOCK LOST in my top 5 Timed Events in my AWR Report?
• A customer is currently using RAC in a 2 node environment. How should one review the ability to scale out to 4, 6, 8 or even more nodes? What should the requirements of a scale out test?
• What is the Load Balancing Advisory?
• How do I enable the load balancing advisory?
• What are my options for setting the Load Balancing Advisory GOAL on a Service?
• How can I validate the scalability of my shared storage? (Tightly related to RAC / Application scalability)
• How many nodes are supported in a RAC Database?
• How do I measure the bandwidth utilization of my NIC or my interconnect?
• Does Database blocksize or tablespace blocksize affect how the data is passed across the interconnect?
• What is Runtime Connection Load Balancing?

Manageability

• How should I deal with space management? Do I need to set free lists and free list groups?
• I was installing Oracle 9i RAC and my Oracle files did not get copied to the remote node(s). What went wrong?
• What is the purpose of the gsd service in Oracle 9i RAC?
• How do I stop the GSD?
• How do I identify which node was used to install the cluster software and/or database software?
• Are the Oracle Clusterware bundle patches cumulative, do they conflict with one another?
• How do I determine whether or not an OneOff patch is "rolling upgradeable"?
• Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however sqlplus can start it on both nodes? What is the problem?
• When I look at ALL_SERVICES view in my database I see services I did not create, what are they for?
• I have 2 clusters named "crs" (the default), how do I get Grid Control to recognize them as targets?
• I found in 10.2 that the EM "Convert to Cluster Database" wizard would always fall over on the last step where it runs emca and needs to log into the new cluster database as dbsnmp to create the cluster database targets etc. I changed the password for the dbsnmp account to be dbsnmp (same as username) and it worked OK. Is this a known issue?
• If using plsql native code, the plsql_native_library_dir needs to be defined. In an Oracle RAC environment, must the directory be in the shared storage?
• I have added a second network to my cluster, can I load balance my users across this network?
• Does Oracle RAC work with NTP (Network Time Protocol)?
• If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need Oracle Clusterware to run Oracle RAC 10g or Oracle RAC 11g?
• How is Oracle Enterprise Manager integrated with the Oracle RAC 11g Release 2 stack?
• What storage option should I use for Oracle RAC on Linux? ASM / OCFS / Raw Devices / Block Devices / Ext3?
• What are the implications of using srvctl disable for an instance in my Oracle RAC cluster? I want to have it available to start if I need it but at this time do not want to run this extra instance for this database.
• What is the Cluster Verification Utility (cluvfy)?
• What versions of the database can I use the cluster verification utility (cluvfy) with?

Platform Specific

• How many nodes can be had in an HP-UX/Solaris/AIX/Windows/Linux cluster?
• Is crossover cable supported as an interconnect with RAC on any platform?

• Is it possible to run Oracle RAC on logical partitions (i.e. LPARs) or virtual separate servers?
• How do I check Oracle RAC certification?
• What is Oracle's position with respect to supporting RAC on Polyserve CFS?
• Can the Oracle Database Configuration Assistant (DBCA) be used to create a database with Veritas DBE / AC 3.5?
• Is Veritas Storage Foundation supported with Oracle RAC?
• Is Oracle Database on VMware supported? Is Oracle RAC on VMware supported?

Platform Specific - Linux

• My customer is about to install 10202 clusterware on new Linux machines. He is getting "No ORACM running" error when run rootpre.sh and exited? Should he worry about this message?
• How to configure bonding on Suse SLES8.
• How to configure bonding on Suse SLES9.
• After installing patchset 9013 and patch_2313680 on Linux, the startup was very slow
• Is there a cluster file system (CFS) available for Linux?
• Oracle Clusterware fails to start after a reboot due to permissions on raw devices reverting to default values. How do I fix this?
• Is OCFS2 certified with Oracle RAC 10g?
• How do I configure my RAC Cluster to use the RDS Infiniband?
• Is the hangcheck timer still needed with Oracle RAC 10g and 11g?
• Customer did not load the hangcheck-timer before installing RAC. Can the customer just load the hangcheck-timer?
• How to reorder or rename logical network interface (NIC) names in Linux
• Are Red Hat GFS and GULM certified for DLM?
• How do I configure raw devices in order to install Oracle Clusterware 10g on RHEL5 or OEL5?
• Can different releases of Oracle RAC be installed and run on the same physical Linux cluster?
• Is 3rd Party Clusterware supported on Linux such as Veritas or Redhat?
• A customer installed 10g Release 2 on Linux RH4 Update 2, 2.6.9-22.ELsmp #1 SMP x86_64 GNU/Linux, and got the error Error in invoking target 'all_no_orcl'. Customer ignored the error and the install succeeded without any other errors and oracle apparently worked fine. What should they do?

Platform Specific - Solaris

• Does Oracle Support Oracle RAC with Solaris 10 Containers (aka Zones)?
• In Solaris 10, do we need Sun Cluster to provide redundancy for the interconnect and multiple switches?

• Can I configure IPMP in Active/Active to increase bandwidth of my interconnect?
• Does Sun Solaris have a multipathing solution?

Platform Specific - HP-UX

• Can I configure HP's Autoport aggregation for NIC Bonding after the install? (i.e. not present beforehand)
• Is HMP supported with Oracle RAC 10g or Oracle RAC 11g on all HP platforms?

Platform Specific - Windows

• Does the Oracle Cluster File System (OCFS) support network access through NFS or Windows Network Shares?
• My customer wants to understand what type of disk caching they can use with their Windows RAC Cluster, the install guide tells them to disable disk caching?
• How do I verify that Host Bus Adapter Node Local Caching has been disabled for the disks I will be using in my RAC cluster?
• My customer has a failsafe cluster installed, what are the benefits of moving their system to RAC?
• When running Oracle RAC on Windows 2003, what is the recommended OS level?
• Can I run my Oracle 9i RAC and Oracle RAC 10g on the same Windows cluster?
• When using MS VSS on Windows with Oracle RAC, do I need to run the VSS on each node where I have an Oracle RAC instance?
• What do I do when I get an ORA-01031 error logging into the ASM instance?
• The OracleCRService does not start with my windows Oracle RAC implementation, what do I do?

Platform Specific - IBM AIX

• Is VIO supported with RAC on IBM AIX?
• Is HACMP needed for RAC on AIX 5.2 using GPFS file system?
• Do I need HACMP/GPFS to store my OCR/Voting file on a shared device?

Platform Specific - IBM-z/OS (Mainframe)

• Can I run Oracle RAC 10g on my IBM Mainframe Sysplex environment (z/OS)?

Other Applications & RAC

• Can I use Oracle Clusterware for failover of the SAP Enqueue and VIP services when running SAP in a RAC environment?
• Are Oracle Applications certified with RAC?

Diagnosability

• What are the cdmp directories in the background_dump_dest used for?
• How do I gather all relevant Oracle and OS log/trace files in an Oracle RAC cluster to provide to Support?

EBusiness Suite with RAC

• What is the optimal migration path to be used while migrating the E-Business suite to Oracle RAC?
• Is the Oracle E-Business Suite (Oracle Applications) certified against RAC?
• Can I use TAF with e-Business in a RAC environment?
• How to configure concurrent manager in a RAC environment?
• Should functional partitioning be used with Oracle Applications?
• Which e-Business version is preferable?
• Can I use Automatic Undo Management with Oracle Applications?
• Is Server Side Load Balancing supported/recommended/proven technology in Oracle EBusiness Suite?

Clustered File Systems

• What are the maximum number of nodes under OCFS on Linux?
• What files can I put on Linux OCFS?
• Where can I find documentation on OCFS?
• What are the Best Practices for using a clustered file system with Oracle RAC?
• Can I use a cluster file system for OCR, Voting Disk, Binaries as well as database files?
• Can I use OCFS with SE Oracle RAC?
• Is Sun QFS supported with Oracle RAC? What about Sun GFS?
• Is Red Hat GFS (Global File System) certified by Oracle for use with Oracle Real Application Clusters?
• Is Linux OCFS2 (OCFS version 2) supported with Oracle RAC?
• What is the maximum number of nodes I can have in my cluster if I am using OCFS2?

Oracle Clusterware

• When does the Oracle node VIP fail over to another node and subsequently return to its home node?
• How do I protect the OCR and Voting in case of media failure?
• How do I use multiple network interfaces to provide High Availability and/or Load Balancing for my interconnect with Oracle Clusterware?
• Can the Network Interface Card (NIC) device names be different on the nodes in a cluster, for both public and private?
• Can I run a 10.1.0.x database with Oracle Clusterware 10.2?
• What do I do, I have a corrupt OCR and no valid backup?
• Is it supported to rerun root.sh from the Oracle Clusterware installation?
• When the customer runs the command 'onsctl start', they receive the message "Unable to open libhasgen10.so". Any idea why the message "unable to open libhasgen10.so"?
• Voting Files stored in ASM - How many disks per disk group do I need?
• OCR stored in ASM - What happens, if my ASM instance fails on a node?

• With Oracle Clusterware 10g, how do you backup the OCR?
• I am trying to install Oracle Clusterware (10.2) and when I run the OUI, at the Specify Cluster Configuration screen, the Add, Edit and Remove buttons are grayed out. Nothing comes up in the cluster nodes either. Why?
• With GNS, do ALL public addresses have to be DHCP managed (public IP, public VIP, public SCAN VIP)?
• I have a 2-node RAC running. I notice that it is always node2 that is evicted when I test private network failure scenario by disconnecting the private network cable. Doesn't matter whether it is node1's or node2's private network cable that is disconnected, it is always the node2 that is evicted. What happens in a 3-nodes RAC cluster if node1's cable is disconnected?
• Can I use Oracle Clusterware to provide cold failover of my single instance Oracle Databases?
• What are the licensing rules for Oracle Clusterware? Can I run it without RAC?
• In the course of failure testing in an extended RAC environment we find entries in the cssd logfile which indicate actions like 'diskShortTimeout set to (value)' and 'diskLongTimeout set to (value)'. Can anyone please explain the meaning of these two timeouts in addition to disktimeout?
• Is it possible to use ASM for the OCR and voting disk?
• I am trying to move my voting disks from one diskgroup to another and getting the error "crsctl replace votedisk – not permitted between ASM Disk Groups." Why?
• I made a mistake when I created the VIP during the install of Oracle Clusterware, can I change the VIP?
• How should I test the failure of the public network (IE Oracle VIP failover) in my Oracle RAC environment?
• What is the voting disk used for?
• Can I configure a firewall (iptables) on the cluster interconnect?
• Can I change the public hostname in my Oracle Database 10g Cluster using Oracle Clusterware?
• Does the hostname have to match the public name or can it be anything else?
• What happens if I lose my voting disk(s)?
• I am installing Oracle Clusterware with a 3rd party vendor clusterware however in the "Specify Cluster Configuration Page", Oracle Clusterware installer doesn't show the existing nodes. Why?
• Can I run the fixup script generated by the 11.2 OUI or CVU on a running system?
• What should the permissions be set to for the voting disk and ocr when doing an Oracle RAC Install?
• How is the Oracle Cluster Registry (OCR) stored when I use ASM?
• How to move the OCR location?

• Why is the home for Oracle Clusterware not recommended to be subdirectory of the Oracle base directory?
• How do I put my application under the control of Oracle Clusterware to achieve higher availability?
• Is it supported to allow 3rd Party Clusterware to manage Oracle resources (instances, listeners, etc) and turn off Oracle Clusterware management of these?
• What is the High Availability API?
• Is it a requirement to have the public interface linked to ETH0 or does it only need to be on a ETH lower than the private interface?: - public on ETH1 - private on ETH2
• How do I restore OCR from a backup? On Windows, can I use ocopy?
• During Oracle Clusterware installation, I am asked to define a private node name, and then on the next screen asked to define which interfaces should be used as private and public interfaces. What information is required to answer these questions?
• Can I change the name of my cluster after I have created it when I am using Oracle Clusterware?
• Which processes access the OCR?
• Why does Oracle Clusterware use an additional 'heartbeat' via the voting disk, when other cluster software products do not?
• Why does Oracle still use the voting disks when other cluster software is present?
• Customer is hitting bug 4462367 with an error message saying low open file descriptor, how do I work around this until the fix is released with the Oracle Clusterware Bundle for 10.2.0.3 or 10.2.0.4 is released?
• How do I identify the voting file location?
• How much I/O activity should the voting disk have?
• Does Oracle Clusterware have to be the same or higher release than all instances running on the cluster?
• Can I use Oracle Clusterware to monitor my EM Agent?
• My customer has noticed tons of log files generated under $CRS_HOME/log//client, is there any automated way we can setup through Oracle Clusterware to prevent/minimize/remove those aggressively generated files?
• What are the IP requirements for the private interconnect?
• Can I set up failover of the VIP to another card in the same machine or what do I do if I have different network interfaces on different nodes in my cluster (I.E. eth0 on node1,2 and eth1 on node 3,4)?
• How to Restore a Lost Voting Disk used by Oracle Clusterware 10g
• How can I register the listener with Oracle Clusterware in RAC 10g Release 2?
• How is the voting disk used by Oracle Clusterware?
• Does Oracle Clusterware support application vips?

Stretched/Extended RAC - No Sub Category

• Can a customer use SE RAC to implement an "Extended RAC Cluster"?
• What is the maximum distance between nodes in an extended RAC environment?
• What are the network requirements for an extended RAC cluster?
• Can I use ASM as mechanism to mirror the data in an Extended RAC cluster?
• How should voting disks be implemented in an extended cluster environment?
• Can I use standard NFS for the third site voting disk?
• Can I use ASM to mirror Oracle data in an extended RAC environment?

Cluster Verification Utility (CVU) - No Sub Category

• Why is validateUserEquiv failing during install (or cluvfy run)?
• How do I turn on tracing?
• Can I check if the storage is shared among the nodes?
• When I run 10.2 CLUVFY on a system where RAC 10g Release 1 is running I get following output: Package existence check failed for "SUNWscucm:3.1". Package existence check failed for "SUNWudlmr:3.1". Package existence check failed for "SUNWudlm:3.1". Package existence check failed for "ORCLudlm:Dev_Release_06/11/04,_64bit_3.3.4.8_reentrant". Package existence check failed for "SUNWscr:3.1". Package existence check failed for "SUNWscu:3.1". Checking this Solaris system I don't see those packages installed. Can I continue my install?
• What are the default values for the command line arguments?
• How do I check the Oracle Clusterware stack and other sub-components of it?
• Is there a way to verify that the Oracle Clusterware is working properly before proceeding with RAC install?
• At what point cluvfy is usable? Can I use cluvfy before installing Oracle Clusterware?
• What is CVU? What are its objectives and features?
• What is a stage?
• What is a component?
• What is nodelist?
• Do I have to be root to use CVU?
• What about discovery? Does CVU discover installed components?
• How do I report a (or tons of) bug?
• What are the requirements for CVU?
• What is 'cvuqdisk' rpm? Why should I install this rpm?
• How do I install 'cvuqdisk' package?

• How do I know about cluvfy commands? The usage text of cluvfy does not show individual commands.
• Do I have to type the nodelist every time for the CVU commands? Is there any shortcut?
• How do I get detail output of a check?
• How do I check network or node connectivity related issues?
• How do I check whether OCFS is properly configured?
• How do I check user accounts and administrative permissions related issues?
• How do I check minimal system requirements on the nodes?
• Is there a way to compare nodes?
• Why the peer comparison with -refnode says passed when the group or user does not exist?
• Where can I find the CVU trace files?
• Why cluvfy reports "unknown" on a particular node?
• What are the known issues with this release?

RAC One Node - No Sub Category

• How is Oracle RAC One Node licensed and priced?
• If a current customer has an Enterprise License Agreement (ELA), are they entitled to use Oracle RAC One Node?
• Is Oracle RAC One Node supported with 3rd party clusterware and/or 3rd party CFS?
• How does RAC One Node compare with traditional cold fail over solutions like HP Serviceguard, IBM HACMP, Sun Cluster and Symantec, and Veritas Cluster Server?
• How does RAC One Node compare with a single instance Oracle Database protected with Oracle Clusterware?
• What is Oracle Real Application Clusters One Node (RAC One Node)?
• If I add or remove nodes from the cluster, how do I inform RAC One Node?
• Is RAC One Node supported with database versions prior to 11.2?
• How do I get Oracle Real Application Clusters One Node (Oracle RAC One Node)?
• Where do I find the documentation for RAC One Node?
• Does Enterprise Manager Support RAC One Node?
• How does RAC One Node compare with database DR products like DataGuard or Golden Gate?
• How do I install the command line tools for RAC One Node?
• Are we certifying applications specifically for RAC One Node?
• Does RAC One Node make sense in a stretch cluster environment?
• How does RAC One Node compare with virtualization solutions like VMware?
• Can I use Oracle RAC One Node for Standard Edition Oracle RAC?
• What is RAC One Node Omotion?

Grid Infrastructure - No Sub Category

• If the root.sh script fails on a node during the install of the Grid Infrastructure with Oracle Database 11g Release 2, can I re-run it?
• I am only using 1 of 3 SCAN IP addresses, why?
• How do I explain for a customer who is concerned about the phrase in the following doc? Oracle® Clusterware Administration and Deployment Guide 11g Release 2 (11.2) E10717-04 2-27 "If Oracle ASM fails, then OCR is not accessible on the node on which Oracle ASM failed, but the cluster remains operational. The entire cluster only fails if the Oracle ASM instance on the OCR master node fails, if the majority of the OCR locations are in Oracle ASM, and if there is an OCR read or write access, then the crsd stops and the node becomes inoperative."
• Is it recommended that we put the OCR/Voting disk on ASM disk and, if so, is it preferable to create a separate disk group for them?
• Is the GNS recommended for most Oracle RAC installations?

Cluster Health Monitor (IPD/OS) - No Sub Category

• What is Cluster Health Monitor (IPD/OS)?
• What OS does Cluster Health Monitor (IPD/OS) support?

Answers

What is Cache Fusion and how does this affect applications?
Cache Fusion is a new parallel database architecture for exploiting clustered computers to achieve scalability of all types of applications. Cache Fusion is a shared cache architecture that uses high speed low latency interconnects available today on clustered systems to maintain database cache coherency. Database blocks are shipped across the interconnect to the node where access to the data is needed. This is accomplished transparently to the application and users of the system. As Cache Fusion uses at most a 3 point protocol, it easily scales to clusters with a large number of nodes.
For more information about Cache Fusion, see Note: 139436.1 Understanding 9i Real Application Clusters Cache Fusion.

Is it difficult to transition (migrate) from Single Instance to Oracle RAC?
If the cluster and the cluster software are not present, these components must be installed and configured. The Oracle RAC option must be added using the Oracle Universal Installer, which requires that the existing DB instance be shut down. There are no changes necessary on the user data within the database. However, a shortage of freelists and freelist groups can cause contention with header blocks of tables and indexes as multiple instances vie for the same block. This may cause a performance problem and require data partitioning. However, the need for these changes should be rare.
Recommendation: apply automatic segment space management to perform these changes automatically. The free space management will replace the freelists and freelist groups and is better.
The database requires one Redo thread and one Undo tablespace for each instance, which are easily added with SQL commands or with Enterprise Manager tools. NOTE: With Oracle RAC 11g Release 2, you do not need to pre-create redo threads or undo tablespaces if you are using Oracle Managed Files (EG ASM).
Datafiles will need to be moved to shared storage, such as a clustered file system (CFS), so that all nodes can access them. Oracle recommends the use of Automatic Storage Management (ASM). Also, the MAXINSTANCES parameter in the control file must be greater than or equal to the number of instances you will start in the cluster.
For more detailed information, please see Migrating from single-instance to RAC in the Oracle Documentation.
With Oracle Database 10g Release 2, the $ORACLE_HOME/bin/rconfig tool can be used to convert a Single Instance database to RAC. This tool takes an xml input file and converts the Single Instance database whose information is provided in the xml. You can run this tool in "verify only" mode prior to performing the actual conversion. This is documented in the Oracle RAC Admin book and a sample xml can be found in $ORACLE_HOME/assistants/rconfig/sampleXMLs/ConvertToRAC.xml. This tool only supports databases using a clustered file system or ASM. You cannot use it with raw devices. Grid Control 10g Release 2 provides an easy to use wizard to perform this function.
Oracle Enterprise Manager includes workflows to assist with migrations (I.E. Migrating to ASM, Creating Standby, Converting Standby to RAC etc). The migration is automated in Enterprise Manager Grid Control 10.2.0.5.

What are the dependencies between OCFS and ASM in Oracle Database 10g?
In an Oracle RAC 10g environment, there is no dependency between Automatic Storage Management (ASM) and Oracle Cluster File System (OCFS). OCFS is not required if you are using Automatic Storage Management (ASM) for database files. You can use OCFS on Windows (Version 2 on Linux) for files that ASM does not handle: binaries (shared Oracle home), trace files, etc. Alternatively, you could place these files on local file systems even though it's not as convenient given the multiple locations.
If you do not want to use ASM for your database files, you can still use OCFS for database files in Oracle Database 10g.
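As an illustration of the migration point that each instance needs its own redo thread and undo tablespace, the following Python sketch only prints the SQL a DBA might review before running it for an additional instance. It is not taken from the FAQ: the database name `orcl`, the `UNDOTBS<n>` tablespace names, the group numbers and the sizes are hypothetical placeholders, and it assumes Oracle Managed Files are in use so that no file names have to be given. Per the note in the answer, with Oracle RAC 11g Release 2 and Oracle Managed Files this pre-creation step is not needed.

```python
"""Print SQL for preparing an additional RAC instance (illustrative sketch).

Assumptions (not from the FAQ): Oracle Managed Files is configured, and the
database name, undo tablespace names, group numbers and sizes are placeholders.
"""


def statements_for_instance(db_name: str, thread: int, first_group: int,
                            groups: int = 2, redo_size: str = "50M",
                            undo_size: str = "200M") -> list[str]:
    """Return the SQL statements for one additional instance (thread >= 2)."""
    if thread < 2:
        raise ValueError("thread 1 already belongs to the original instance")
    sid = f"{db_name}{thread}"      # hypothetical SID naming, e.g. orcl2
    undo = f"UNDOTBS{thread}"       # hypothetical undo tablespace name
    # One redo thread per instance; each thread needs at least two log groups.
    stmts = [
        f"ALTER DATABASE ADD LOGFILE THREAD {thread} "
        f"GROUP {first_group + i} SIZE {redo_size};"
        for i in range(groups)
    ]
    stmts.append(f"ALTER DATABASE ENABLE PUBLIC THREAD {thread};")
    # One undo tablespace per instance.
    stmts.append(f"CREATE UNDO TABLESPACE {undo} DATAFILE SIZE {undo_size};")
    # Instance-specific parameters, stored in the shared spfile.
    stmts += [
        f"ALTER SYSTEM SET instance_number={thread} SCOPE=SPFILE SID='{sid}';",
        f"ALTER SYSTEM SET thread={thread} SCOPE=SPFILE SID='{sid}';",
        f"ALTER SYSTEM SET undo_tablespace='{undo}' SCOPE=SPFILE SID='{sid}';",
    ]
    return stmts


if __name__ == "__main__":
    # Example: prepare a second instance whose redo groups start at group 3.
    for stmt in statements_for_instance("orcl", thread=2, first_group=3):
        print(stmt)
```

Generating the statements rather than executing them keeps the sketch side-effect free; the output would still need to be checked against the actual redo group numbering and storage layout of the database being converted.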
Datafiles will need to be moved to either a clustered file system (CFS) so that all nodes can access them. etc. Migrating to ASM. Oracle recommends the use of Automatic Storage Management (ASM) Also. Oracle Enterprise Manager includes workflows to assiste with migrations.E. Do we have to have Oracle Database on all nodes? . Creating Standby. (I. You cannot use it with raw devices. The free space management will replace the freelists and freelist groups and is better. This may cause a performance problem and require data partitioning. there is no dependency between Automatic Storage Management (ASM) and Oracle Cluster File System (OCFS). NOTE: With ORacle RAC 11g Release 2.2. Grid Control 10g Release 2 provides a easy to use wizard to perform this function.

Each node of a cluster that is being used for a clustered database will typically have the database and Oracle RAC software loaded on it, but not actual datafiles (these need to be available via shared disk). For example, if you wish to run Oracle RAC on 2 nodes of a 4-node cluster, you would need to install the clusterware on all nodes and Oracle RAC on 2 nodes, and it would only need to be licensed on the two nodes running the Oracle RAC database. Note that using a clustered file system or NAS storage can provide a configuration that does not necessarily require the Oracle binaries to be installed on all nodes.

With Oracle RAC 11g Release 2, if you are using policy managed databases, then you should have the Oracle RAC binaries accessible on all nodes in the cluster.

Is rcp and/or rsh required for normal Oracle RAC operation?

"rcp" and "rsh" are not required for normal Oracle RAC operation. However, "rsh" and "rcp" should be enabled for Oracle RAC and patchset installation. In future releases, ssh will be used for these operations. Note: Oracle Enterprise Manager uses rsh.

What software is necessary for Oracle RAC? Does it have a separate installation CD to order?

Oracle Real Application Clusters is an option of Oracle Database and therefore part of the Oracle Database CD.

With Oracle 9i, Oracle 9i RAC is part of Oracle9i Enterprise Edition. If you install 9i EE onto a cluster, and the Oracle Universal Installer (OUI) recognizes the cluster, you will be provided the option of installing RAC. Most UNIX platforms require an OSD installation for the necessary clusterware. For Intel platforms (Linux and Windows), Oracle provides the OSD software within the Oracle9i Enterprise Edition release.

With Oracle Database 10g, Oracle RAC is an option of EE and available as part of SE. Oracle provides Oracle Clusterware on its own CD included in the database CD pack.

With Oracle Database 11g Release 2, Oracle Clusterware and Automatic Storage Management are installed as a single set of binaries called the grid infrastructure. The media for the grid infrastructure is on a separate CD or under the grid directory. For standalone servers, Automatic Storage Management and Oracle Restart are installed as the grid infrastructure for a standalone server, which is installed from the same media.

Please check the certification matrix (Note 184875.1) or with the appropriate platform vendor for more information.

What Application Design considerations should I be aware of when moving to Oracle RAC?

The general principle is that fundamentally no different design and coding practices are required for RAC; however, application flaws in execution or design have a higher impact in RAC. The performance and scalability in RAC will be more sensitive to bad plans or bad schema design. Serializing contention makes applications less scalable. If your customer uses standard SQL and schema tuning, it solves > 80% of performance problems.

Some of the scalability pitfalls they should look for are:
* Serializing contention on a small set of data/index blocks
  --> monotonically increasing key
  --> frequent updates of small cached tables
  --> segment without automatic segment space management (ASSM) or Free List Group (FLG)
* Full table scans
  --> Optimization for full scans in 11g can save CPU and latency
* Frequent invalidation and parsing of cursors
  --> Requires data dictionary lookups and synchronizations
* Concurrent DDL (e.g. truncate/drop)

Look for:
* Indexes with right-growing characteristics
  --> Use reverse key indexes
  --> Eliminate indexes which are not needed
* Frequent updates and reads of "small" tables
  --> "small" = fits into a single buffer cache
  --> Use sparse blocks (PCTFREE 99) to reduce serialization
* SQL which scans large amounts of data
  --> Perhaps more efficient when parallelized
  --> Direct reads do not need to be globally synchronized (hence less CPU for global cache)

What kind of HW components do you recommend for the interconnect?

The general recommendation for the interconnect is to provide the highest bandwidth interconnect, together with the lowest latency protocol that is available for a given platform. In practice, Gigabit Ethernet with UDP has proven sufficient in every case it has been implemented, and tends to be the lowest common denominator across platforms.

Where can I find a list of supported solutions to ensure NIC availability / redundancy (for the interconnect) per platform?

IBM AIX - available solutions:
* Etherchannel (OS based)
* HACMP based network failover solution

More information: Note: 296856.1

HP HP/UX - available solutions:
* APA - Auto Port Aggregation (OS based)
* MC/Serviceguard based network failover solution
* Combination of both solutions
More information: Note: 296874.1 and Auto Port Aggregation (APA) Support Guide

Sun Solaris - available solutions:
* Sun Trunking (OS based)
* Sun IPMP (OS based)
* Sun Cluster based network failover solution (IPMP based)
More information: Note: 283107.1 - IPMP in general. When IPMP is used for the interconnect: Note: 368464.1

Linux - available solutions:
* Bonding
More information: Note: 298891.1

Windows - available solutions:
* Teaming
On Windows, teaming solutions to ensure NIC availability are usually part of the network card driver. Thus, they depend on the network card used. Please contact the respective hardware vendor for more information.

I have changed my spfile with alter system set parameter_name =.... scope=spfile. The spfile is on ASM storage and the database will not start.

How to recover:

In $ORACLE_HOME/dbs:

    . oraenv <instance_name>
    sqlplus "/ as sysdba"

    startup nomount
    create pfile='recoversp' from spfile
    /
    shutdown immediate
    quit

Now edit the newly created pfile to change the parameter to something sensible. Then:

    sqlplus "/ as sysdba"
    startup pfile='recoversp'     (or whatever you called it in step one)
    create spfile='+DATA/GASM/spfileGASM.ora' from pfile='recoversp'
    /
    shutdown immediate
    startup
    quit

N.B. The name of the spfile is in your original init(instance_name).ora, so adjust the create spfile command to suit.

What combinations of Oracle Clusterware, Oracle RAC and ASM versions can I use?

See Note:337737.1 for a detailed support matrix. Basically the Clusterware version must be at least the highest release of ASM or Oracle RAC. ASM must be at least 10.1.0.3 to work with a 10.2 database.

Note: With Oracle Database 11g Release 2, you must upgrade Oracle Clusterware and ASM to 11g Release 2 at the same time.

I had a 3 node Oracle RAC. One of the nodes had to be completely rebuilt as a result of a problem. As there are no backups, what is the proper procedure to remove the 3rd node from the cluster so it can be added back in?

Follow the documentation for removing a node, but you can skip all the steps in the node-removal doc that need to be run on the node being removed, like steps 4, 6 and 7 (see Chapter 10 of the Oracle RAC Admin and Deployment Guide). Make sure that you remove any database instances that were configured on the failed node with srvctl, and listener resources also, otherwise rootdeletenode.sh will have trouble removing the nodeapps.

Just running rootdeletenode.sh isn't really enough, because you need to update the installer inventory as well, otherwise you won't be able to add back the node using addNode.sh. And if you don't remove the instances and listeners you'll also have problems adding the node and instance back again.

Does Weblogic (WLS) support Services, FAN/FCF, and the Load Balancing Advisory (LBA) with Oracle RAC?

Currently the integration is incomplete; however, it is being actively worked upon. The recommendation from Oracle Fusion Middleware is to use WLS Multi pools with Oracle RAC.

Are jumbo frames supported for the RAC interconnect?

Yes. For details see Note:341788.1 Cluster Interconnect and Jumbo Frames.

Is a relink required for the clusterware home after an OS upgrade?

In 10g and 11.1, Oracle Clusterware binaries cannot be relinked. So, to answer your question: no, there is no need to relink Oracle Clusterware binaries in 10g and 11.1.

However, in 11.2, there are some executables in the Grid home that can and should be re-linked after an OS upgrade. The procedure to do this is:

    # cd Grid_home/crs/install
    # perl rootcrs.pl -unlock

As the grid infrastructure for a cluster owner:

    $ export ORACLE_HOME=Grid_home
    $ Grid_home/bin/relink

As root again:

    # cd Grid_home/crs/install
    # perl rootcrs.pl -patch

Are Sun Logical Domains (ldoms) supported with RAC?

Sun Logical Domains (ldoms) are supported with Oracle Database (both single instance and RAC). Check certify for the latest information.
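The version rule from the answer on Clusterware, RAC and ASM combinations (Clusterware at least as high as the highest ASM or RAC release, ASM at least 10.1.0.3 for a 10.2 database) can be sketched as a simple comparison. This is only an illustration: the function names are hypothetical, release strings are assumed to be dotted numbers such as 10.2.0.4, and Note:337737.1 remains the authoritative support matrix.

```python
def parse_release(text, width=5):
    """Turn a dotted release string such as '10.2.0.4' into a padded tuple of ints."""
    parts = [int(piece) for piece in text.split(".")]
    # Pad with zeros so that '11.2' and '11.2.0.0' compare as equal.
    return tuple(parts + [0] * (width - len(parts)))


def clusterware_is_sufficient(clusterware, asm, rac):
    """True when Clusterware is at least the highest release of ASM or RAC."""
    highest_other = max(parse_release(asm), parse_release(rac))
    return parse_release(clusterware) >= highest_other


def asm_ok_for_10_2_database(asm):
    """True when ASM meets the 10.1.0.3 minimum quoted for a 10.2 database."""
    return parse_release(asm) >= parse_release("10.1.0.3")


if __name__ == "__main__":
    print(clusterware_is_sufficient("11.1.0.7", "10.2.0.4", "11.1.0.6"))
    print(asm_ok_for_10_2_database("10.1.0.2"))
```

The padding step matters because a shorter tuple would otherwise sort below a longer one with the same leading numbers.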

What is Standard Edition Oracle RAC?

As of Oracle Database 10g, a customer who has purchased Standard Edition is allowed to use the Oracle RAC option within the limitations of Standard Edition (SE). For licensing restrictions you should read the Oracle Database License Doc. At a high level this means that you can have a max of 4 sockets in the cluster, and you must use ASM for all database files. As of Oracle Database 11g Release 2, ASM includes ACFS (a cluster file system). ASM Cluster File System (ACFS) or a local OS file system must be used to store all non-database files, including Oracle Home, application and system files, and user files.

NOTE: 3rd party clusterware and clustered file systems (other than ASM) are not supported. This includes OCFS and OCFS2.

Here is the text from the appropriate footnote in the Price List (as of Jan 2010, please check the price list for any changes):

Oracle Database Standard Edition can only be licensed on servers that have a maximum capacity of 4 sockets. If licensing by Named User Plus, the minimum is 5 Named User Plus licenses. Oracle Database Standard Edition, when used with Oracle Real Application Clusters, may only be licensed on a single cluster of servers supporting up to a total maximum capacity of 4 sockets.

NOTE: This means that the server capacity must meet the restriction even if the sockets are empty, since they count towards capacity.

How do I determine which node in the cluster is the "Master" node?

For the cluster synchronization service (CSS), the master can be found by searching ORACLE_HOME/log//cssd/ocssd.log, where ORACLE_HOME is the Oracle home for Oracle Clusterware (this is the Grid Infrastructure home in Oracle Database 11g Release 2).

For the master of an enqueue resource with Oracle RAC, you can select from v$ges_resource. There should be a master_node column.

Where do I find Oracle Clusterware binaries and ASM binaries with Oracle Database 11g Release 2?

With Oracle Database 11g Release 2, the binaries for Oracle Clusterware and Automatic Storage Management (ASM) are distributed in a single set of binaries called the grid infrastructure. To install the grid infrastructure, go to the grid directory on your 11g Release 2 media and run the Oracle Universal Installer. Choose the Grid Infrastructure for a Cluster. If you are installing ASM for a single instance of Oracle Database on a Standalone Server, choose the Grid Infrastructure for a Standalone Server. This installation includes Oracle Restart.
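The Standard Edition socket rule quoted from the price list footnote can be expressed as a small check. This is a rough sketch, not a licensing tool: the function name and the input format (one socket capacity figure per server) are assumptions made for illustration, and the price list and license document remain the reference.

```python
SE_RAC_MAX_SOCKETS = 4  # total cluster capacity quoted in the footnote above


def se_rac_socket_capacity_ok(socket_capacity_per_server):
    """Check the summed socket capacity of all servers in the cluster.

    Each entry is the number of sockets a server can hold, populated or
    empty, because empty sockets still count towards capacity.
    """
    return sum(socket_capacity_per_server) <= SE_RAC_MAX_SOCKETS


if __name__ == "__main__":
    # Two servers with two sockets each fit; a third such server does not.
    print(se_rac_socket_capacity_ok([2, 2]))
    print(se_rac_socket_capacity_ok([2, 2, 2]))
```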

If my OCR and Voting Disks are in ASM, can I shutdown the ASM instance?

No. You will have to stop the clusterware on that node, using either crsctl stop cluster or crsctl stop crs.

I have the 11.2 Grid Infrastructure installed and now I want to install an earlier version of Oracle Database (11.1 or 10.2), is this supported?

Yes, however you need to "pin" the nodes in the cluster before trying to create a database using an earlier version of Oracle Database (i.e. not 11.2). The command to pin a node is crsctl pin css -n nodename. You should also apply the patch for Bug 8288940 to make DBCA work in an 11.2 cluster.

What is SCAN?

Single Client Access Name (SCAN) is a single name that allows client connections to connect to any database in an Oracle cluster independently of which node in the cluster the database (or service) is currently running. The SCAN should be used in all client connection strings and does not change when you add/remove nodes from the cluster. SCAN allows clients to use EZConnect or this JDBC URL:

    sqlplus system/manager@sales1-scan:1521/oltp
    jdbc:oracle:thin:@sales1-scan:1521/oltp

The SCAN is defined as a single name resolving to 3 IP addresses in either the cluster's GNS or your corporate DNS.

I get an error with DBCA from 10.2 or 11.1 after I have installed the 11.2 Grid Infrastructure?

You will need to apply the patch for Bug 8288940 to your database home in order for it to recognize ASM running from the new grid infrastructure home. Also make sure you have "pinned" the nodes:

    crsctl pin css -n nodename

I get the following error starting my Oracle RAC database, what do I do?

    WARNING: No cluster interconnect has been specified.

It simply means you do not have the cluster_interconnects parameter set and nothing was set in the OCR, so the private interconnect is picked at random by the database, hence the warning.

You can either set the cluster_interconnects parameter in the init.ora to the private interconnect IP, or use oifcfg getif and setif (type oifcfg without any arguments for a help message):

    $ oifcfg getif
    eth0 138.2.236.0 global public
    eth2 138.2.238.0 global cluster_interconnect

Note that if the hardware is not identical you'll have to provide each node with its own correct value; if it's identical hardware you can use the -global switch.

Are block devices supported for OCR, Voting Disks, ASM devices?

Block Devices are only supported on Linux. For Unix platforms, the directio semantics are not applicable (or rather not implemented) for the block devices on these platforms.

Note: The Oracle Database 10g OUI does not support block devices, however Oracle Clusterware and ASM do.

With Oracle RAC 11g Release 2, the Oracle Universal Installer and the Configuration Assistants do not support raw or block devices. The Command Line Interfaces still support raw/block. Note: The desupport of raw/block devices is scheduled for Oracle Database 12g.

Can I use iSCSI storage with my Oracle RAC cluster?

For iSCSI, Oracle has made the statement that, as a block protocol, this technology does not require validation for single instance database. There are many early adopter customers of iSCSI running Oracle9i and Oracle Database 10g. As for Oracle RAC, Oracle has chosen to validate the iSCSI technology (not each vendor's targets) for the 10g platforms. This has been completed for Linux and Windows. For Windows we have tested up to 4 nodes. Any Windows iSCSI products that are supported by the host and storage device are supported by Oracle. No vendor-specific information will be posted on Certify.

We don't support NAS devices for Windows, however some NAS devices (e.g. NetApp) can also present themselves as iSCSI devices. If this is the case then a customer can use this iSCSI device with Windows as long as the iSCSI device vendor supports Windows as an initiator OS.

What would you recommend to customer, Oracle Clusterware or Vendor Clusterware (i.e. HP Service Guard, HACMP, Sun Cluster, Veritas etc.) with Oracle Real Application Clusters?

You will be installing and using Oracle Clusterware whether or not you use the Vendor Clusterware. Vendor clusterware is only required with Oracle 9i RAC. Oracle Clusterware provides a complete clustering solution and is required for Oracle RAC or Automatic Storage Management (including ACFS). Check the certification matrix in MyOracleSupport for details of certified vendor clusterware.
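For the "No cluster interconnect has been specified" warning, it can help to check programmatically whether oifcfg getif reports an interface of type cluster_interconnect. The sketch below parses text in the four-column layout shown in the example output earlier in this FAQ (interface, subnet, scope, type). The function names are hypothetical, and the parser assumes exactly that layout; it does not run oifcfg itself.

```python
SAMPLE_GETIF = """eth0 138.2.236.0 global public
eth2 138.2.238.0 global cluster_interconnect
"""


def parse_oifcfg_getif(output):
    """Parse 'oifcfg getif' text into a list of dicts, one per interface line."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        name, subnet, scope, role = fields
        entries.append(
            {"interface": name, "subnet": subnet, "scope": scope, "role": role}
        )
    return entries


def interconnect_interfaces(output):
    """Return the interface names registered as cluster_interconnect."""
    return [
        entry["interface"]
        for entry in parse_oifcfg_getif(output)
        if entry["role"] == "cluster_interconnect"
    ]


if __name__ == "__main__":
    print(interconnect_interfaces(SAMPLE_GETIF))
```

An empty result would correspond to the situation described above, where nothing is set in the OCR and the database picks the interconnect itself.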

When configuring the NIC cards and switch for a GigE Interconnect should it be set to FULL or Half duplex in Oracle RAC?

You must use Full Duplex for all network communication. Half Duplex means you can only either send OR receive at a time.

Can I use Oracle RAC in a distributed transaction processing environment?

YES. Best practice is to have all tightly coupled branches of a distributed transaction running on an Oracle RAC database run on the same instance. Between transactions and between services, transactions can be load balanced across all of the database instances.

Prior to Oracle RAC 11g, you must use services to manage DTP environments. By defining the DTP property of a service, the service is guaranteed to run on one instance at a time in an Oracle RAC database. All global distributed transactions performed through the DTP service are ensured to have their tightly-coupled branches running on a single Oracle RAC instance.

Oracle RAC 11g provides transparent support for XA global transactions in an Oracle RAC environment and you do not need to use DTP services.

How can a NAS storage vendor certify their storage solution for Oracle RAC?

As of January 2007 the OSCP has been discontinued. Please refer to OTN for details on the Oracle RAC Technologies Matrix (storage being part of it).

Old Answer text: They should obtain an OCE test kit and complete the required Oracle RAC tests. They can submit the request for an OCE kit to ocesup_ie@oracle.com. The list of certified NAS vendors/solutions is posted on OTN under the OSCP program.

Can I run Oracle 9i RAC and Oracle RAC 10g in the same cluster?

YES. However, Oracle Clusterware (CRS) will not support an Oracle 9i RAC database, so you will have to leave the current configuration in place. You can install Oracle Clusterware and Oracle RAC 10g into the same cluster. On Windows and Linux, you must run the 9i Cluster Manager for the 9i Database and the Oracle Clusterware for the 10g Database. When you install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Both Oracle 9i RAC and Oracle RAC 10g will use the OCR. Do not restart the 9i gsd after you have installed Oracle Clusterware. With Oracle Clusterware 11g Release 2, the GSD resource will be disabled by default. You only need to enable this resource if you are running Oracle 9i RAC in the cluster.

Remember to check certify for details of what vendor clusterware can be run with Oracle Clusterware. For example on Solaris, your Oracle 9i RAC will be using Sun Cluster. You can install Oracle Clusterware and Oracle RAC 10g in the same cluster that is running Sun Cluster and Oracle 9i RAC.

Is Infiniband supported for the Oracle RAC interconnect?

IP over IB is supported, and RDS on Linux is supported with 10.2.0.3 forward. Qlogic (formerly SilverStorm) is the supported RDS vendor. Watch certify for updates. As other platforms adopt RDS, we will expand support. There are no plans to support uDAPL or ITAPI protocols. See Note: 751343.1 for more details.

What storage is supported with Standard Edition Oracle RAC?

As per the licensing documentation, you must use ASM for all database files with SE Oracle RAC. There is no support for CFS or NFS.

From Oracle Database 10g Release 2 Licensing Doc:

Oracle Standard Edition and Oracle Real Application Clusters (RAC): When used with Oracle Real Application Clusters in a clustered server environment, Oracle Database Standard Edition requires the use of Oracle Clusterware. Third-party clusterware management solutions are not supported. In addition, Automatic Storage Management (ASM) must be used to manage all database-related files, including datafiles, online logs, archive logs, control file, spfiles, and the flash recovery area. Third-party volume managers and file systems are not supported for this purpose.

My customer has an XA Application with an Oracle RAC Database, can I do Load Balancing across the Oracle RAC instances?

No, not in the traditional Oracle Net Services Load Balancing. We have written a document that explains the best practices for 9i, 10g Release 1 and 10g Release 2. With the Oracle Database 10g Services, life gets easier. To understand services, read the Oracle RAC Admin and Deployment Guide for 10g Release 2 Chapter 6.

With Oracle RAC 11g, Oracle provides transparent support for XA global transactions in an Oracle RAC environment which supports load balancing with Oracle Net Services across Oracle RAC instances.

What are the restrictions on the SID with an Oracle RAC database? Is it limited to 5 characters?

The SID prefix in 10g Release 1 and prior versions was restricted to five characters by install/config tools so that an ORACLE_SID of up to a maximum of 5+3=8 characters can be supported in an Oracle RAC environment. The SID prefix is relaxed up to 8 characters in 10g Release 2, see bug 4024251 for more information.

With Oracle RAC 11g Release 2, SIDs in Oracle RAC with Policy Managed databases are dynamically allocated by the system when the instance starts. This supports a dynamic grid infrastructure which allows the instance to start on any server in the cluster.

Does Oracle Clusterware or Oracle Real Application Clusters support heterogeneous platforms?

Oracle Clusterware and Oracle Real Application Clusters do not support heterogeneous platforms in the same cluster. Enterprise Manager Grid Control supports heterogeneous platforms. We do support machines of different speeds and size in the same cluster. All nodes must run the same operating system (i.e. they must be binary compatible). In an active datasharing environment, like Oracle RAC, we do not support machines having different chip architectures.

I want to use rconfig to convert a single instance to Oracle RAC but I am using raw devices in Oracle RAC. Does rconfig support RAW?

No, rconfig supports ASM and shared file system only.

How many NICs do I need to implement Oracle RAC?

At minimum you need 2: external (public) and interconnect (private). When storage for Oracle RAC is provided by Ethernet based networks (e.g. NAS/nfs or iSCSI), you will need a third interface for I/O, so a minimum of 3. Anything else will cause performance and stability problems under load. From an HA perspective, you want these to be redundant, thus needing a total of 6.

Can we designate the place of archive logs on both ASM disk and regular file system, when we use SE RAC?

Yes. Customers may want to create a standby database for their SE RAC database, so placing the archive logs additionally outside ASM is OK.
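The NIC arithmetic from the answer on how many NICs are needed (2 as a base, 3 with Ethernet based storage, doubled for redundancy) can be written out as a short function. This is only a sketch of that counting rule; the function and parameter names are made up for illustration.

```python
def minimum_nics(ethernet_storage=False, redundant=False):
    """Count NICs per node following the rule of thumb in the answer above."""
    count = 2  # one public (external) and one private (interconnect) interface
    if ethernet_storage:
        count += 1  # dedicated I/O interface for NAS/NFS or iSCSI storage
    if redundant:
        count *= 2  # duplicate every interface for high availability
    return count


if __name__ == "__main__":
    print(minimum_nics())
    print(minimum_nics(ethernet_storage=True, redundant=True))
```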

Can my customer use Veritas Agents to manage their Oracle RAC database on Unix with SFRAC installed?

For details on the support of SFRAC and Veritas Agents with RAC 10g, please see the notes "Oracle's Policy for Supporting Oracle RAC 10g (applies to Oracle RAC 11g too) with Symantec SFRAC on Unix" and "Using Oracle Clusterware with Vendor Clusterware FAQ".

Can I run more than one clustered database on a single Oracle RAC cluster?

You can run multiple databases in an Oracle RAC cluster, either one instance per node (with different databases having different subsets of nodes in a cluster), or multiple instances per node (all databases running across all nodes), or some combination in between.

Running multiple instances per node does cause memory and resource fragmentation, but this is no different from running multiple instances on a single node in a single instance environment, which is quite common. It does provide the flexibility of being able to share CPU on the node, but the Oracle Resource Manager will not currently limit resources between multiple instances on one node. You will need to use an OS level resource manager to do this.

Is there a need to renice LMS processes in Oracle RAC 10g Release 2?

LMS processes should be running in RT by default since 10.2, so there's NO need to renice them, or otherwise mess with them. Check with ps -efl:

    0 S spommere 31191 1 0 75 0 - 270857 - 10:01 ? 00:00:00 ora_lmon_appsu01
    0 S spommere 31193 1 5 75 0 - 271403 - 10:01 ? 00:00:07 ora_lmd0_appsu01
    0 S spommere 31195 1 0 58 - - 271396 - 10:01 ? 00:00:00 ora_lms0_appsu01
    0 S spommere 31199 1 0 58 - - 271396 - 10:01 ? 00:00:00 ora_lms1_appsu01

Look at the 7th column: if it is 75 or 76 then this is Time Share, 58 is Real Time.

You can also use chrt to check:

LMS (Real Time):

    $ chrt -p 31199
    pid 31199's current scheduling policy: SCHED_RR
    pid 31199's current scheduling priority: 1

LMD (Time Share):

    $ chrt -p 31193
    pid 31193's current scheduling policy: SCHED_OTHER
    pid 31193's current scheduling priority: 0

Can I run Oracle RAC 10g with Oracle RAC 11g?

Yes. The Oracle Clusterware should always run at the highest level. With Oracle Clusterware 11g, you can run both Oracle RAC 10g and Oracle RAC 11g databases. If you are using ASM for storage, you can use either Oracle Database 10g ASM or Oracle Database 11g ASM; however, to get the 11g features, you must be running Oracle Database 11g ASM. It is recommended to use Oracle Database 11g ASM.

Note: When you upgrade to 11g Release 2, you must upgrade both Oracle Clusterware and Automatic Storage Management to 11g Release 2. This will support Oracle Database 10g and Oracle Database 11g (both RAC and single instance).

Yes, you can run Oracle 9i RAC in the cluster as well. 9i RAC requires the clusterware that is certified with Oracle 9i RAC to be running in addition to Oracle Clusterware 11g.

Can I have multiple public networks accessing my Oracle RAC?

Yes, you can have multiple networks; however, with Oracle RAC 10g and Oracle RAC 11g, the cluster can only manage a single public network with a VIP and the database can only load balance across a single network. FAN will only work on the public network with the Oracle VIPs.

Oracle RAC 11g Release 2 supports multiple public networks. You must set the new init.ora parameter LISTENER_NETWORKS so users are load balanced across their network. Services are tied to networks, so users connecting with network 1 will use a different service than network 2. Each network will have its own VIP.

Is it supported to install Oracle Clusterware and Oracle RAC as different users?

Yes, Oracle Clusterware and Oracle RAC can be installed as different users. The Oracle Clusterware user and the Oracle RAC user must both have OINSTALL as their primary group. Every Database home can have a different OSDBA group with a different username.

I could not get the user equivalence check to work on my Solaris 10 server when trying to install 10.2.0.1 Oracle Clusterware. The install ran fine without issue.

    << Message: Result: User equivalence check failed for user "oracle". >>

Cluvfy and the OUI try to find SSH on Solaris at /usr/local/bin. The workaround is to create a softlink from /usr/bin/ssh to /usr/local/bin.

Note: User equivalence is required for installations (i.e. using OUI) and patching. DBCA, NETCA, and DBControl also require user equivalence.

Why does the NOAC attribute need to be set on NFS mounted RAC Binaries?

The noac attribute is required because the installer determines sharedness by creating a file and checking for that file's existence on the remote node. If the noac attribute is not enabled then this test will incorrectly fail. This will confuse the installer and opatch. Some other minor issues with the spfile in the default $ORACLE_HOME/dbs will definitely be affected.

We are using Transparent Data Encryption (TDE). We create a wallet on node 1 and copy to nodes 2 & 3. Open the wallet and we are able to select encrypted data on all three nodes. Now, we want to REKEY the MASTER KEY. What do we have to do?

After a re-key on node one, copy the wallet with the new master key to all other nodes, then run 'alter system set wallet close' on all other nodes, followed by 'alter system set wallet open identified by "password"' on all other nodes to load the (obfuscated) master key into each node's SGA.

Are there any issues for the interconnect when sharing the same switch as the public network by using VLAN to separate the network?

RAC and Clusterware deployment best practices recommend that the interconnect be deployed on a stand-alone, physically separate, dedicated switch. Many customers have consolidated these stand-alone switches into larger managed switches. A consequence of this consolidation is a merging of IP networks on a single shared switch, segmented by VLANs. There are caveats associated with such deployments.

RAC cache fusion exercises the IP network more rigorously than non-RAC Oracle databases. The latency and bandwidth requirements as well as availability requirements of the RAC/Clusterware interconnect IP network are more in line with high performance computing. Deploying the RAC/Clusterware interconnect on a shared switch, segmented VLAN may expose the interconnect links to congestion and instability in the larger IP network topology.

If deploying the interconnect on a VLAN, there should be a 1:1 mapping of VLAN to non-routable subnet and the VLAN should not span multiple VLANs (tagged) or multiple switches. Deployment concerns in this environment include Spanning Tree loops when the larger IP network topology changes, asymmetric routing that may cause packet flooding, and lack of fine grained monitoring of the VLAN/port.

How do I check for network problems on my interconnect?

1. Confirm that full duplex is set correctly for all interconnect links on all interfaces on both ends. Do not rely on auto negotiation.
2. ifconfig -a will give you an indication of collisions/errors/overruns and dropped packets.
3. netstat -s will give you a listing of receive packet discards, fragmentation and reassembly errors for IP and UDP.
4. Set the udp buffers correctly.
5. Check your cabling.

Note: If you are seeing issues with RAC, RAC uses UDP as the protocol. Oracle Clusterware uses TCP/IP.
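When checking the interconnect with netstat -s, the UDP counters are the ones of interest, since RAC uses UDP. The following sketch pulls the UDP section out of captured netstat -s text. The sample text and its layout (a "Udp:" header followed by indented "count description" lines) are an assumption modelled on common Linux output; other platforms print different layouts, so the parser would need adjusting there. The function name is hypothetical.

```python
import re

# Assumed sample in a Linux-like layout; real output differs by platform.
SAMPLE_NETSTAT = """Ip:
    1000 total packets received
    2 fragments dropped after timeout
Udp:
    500 packets received
    3 packet receive errors
    480 packets sent
Tcp:
    10 active connections openings
"""


def udp_counters(netstat_output):
    """Return {description: count} for the indented lines under 'Udp:'."""
    counters = {}
    in_udp = False
    for line in netstat_output.splitlines():
        if not line.startswith((" ", "\t")):
            # A non-indented line starts a new protocol section.
            in_udp = line.strip().lower() == "udp:"
            continue
        if in_udp:
            match = re.match(r"\s*(\d+)\s+(.*)", line)
            if match:
                counters[match.group(2).strip()] = int(match.group(1))
    return counters


if __name__ == "__main__":
    for description, count in udp_counters(SAMPLE_NETSTAT).items():
        print(count, description)
```

Comparing the error counters between two captures taken some minutes apart shows whether errors are still accumulating.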

Does changing uid or gid of the Oracle User affect Oracle Clusterware?
There are a lot of files in the Oracle Clusterware home and outside of the Oracle Clusterware home that are chgrp'ed to the appropriate groups for security and appropriate access. The filesystem records the uid (not the username), and so if you exchange the names, the files are now owned by the wrong group.

Is it a good idea to add anti-virus software to my RAC cluster?
Customers who choose to run anti-virus (AV) software on their database servers should be aware that AV software slightly reduces disk IO bandwidth, because most AV software checks disk writes/reads. Also, as the AV software runs, it will use CPU cycles that would normally be consumed by other server processes (e.g. your database instance). As such, databases will have faster performance when not using AV software. As some AV software is known to lock files while it scans them, it is a good idea to exclude the Oracle datafiles/controlfiles/logfiles from a regular AV scan.

Why does netca always create the listener which listens to public ip and not VIP only?
This is for backward compatibility with existing clients: consider a pre-10g to 10g server upgrade. If we made the upgraded listener listen only on the VIP, then clients that didn't upgrade would not be able to reach this listener anymore.

Should the SCSI-3 reservation bit be set for our Oracle Clusterware only installation?
If you are using only Oracle Clusterware (no Veritas CM), then you don't need to have SCSI-3 PGR enabled, since Oracle Clusterware does not require it for IO fencing. If the reservation is set, then you'll get inconsistent results. So ask your storage vendor to disable the reservation. Veritas RAC requires that the storage array support SCSI-3 PGR, since this is how Veritas handles IO fencing. This SCSI-3 PGR is set at the array level, for example EMC hypervolume level.

Can we output the backupset onto regular file system directly (not onto flash recovery area) using RMAN command, when we use SE RAC?
Yes. Customers might want to back up their database to offline storage, so this is also supported.

The Veritas installation document on page 219 asks for setting LD_LIBRARY_PATH_64. Should I remove this?
Yes. You do not need to set LD_LIBRARY_PATH for Oracle.

How do I use DBCA in silent mode to set up RAC and ASM?
If you already have an ASM instance/diskgroup then the following creates a RAC database on that diskgroup (run as the Oracle user):
$ORACLE_HOME/bin/dbca -silent -createDatabase -templateName General_Purpose.dbc -gdbName $SID -sid $SID -sysPassword $PASSWORD -systemPassword $PASSWORD -sysmanPassword $PASSWORD -dbsnmpPassword $PASSWORD -emConfiguration LOCAL -storageType ASM -diskGroupName $ASMGROUPNAME -datafileJarLocation $ORACLE_HOME/assistants/dbca/templates -nodeinfo $NODE1,$NODE2 -characterset WE8ISO8859P1 -obfuscatedPasswords false -sampleSchema false -oratabLocation /etc/oratab
The following will create an ASM instance and 1 diskgroup (run as the ASM/Oracle user):
$ORA_ASM_HOME/bin/dbca -silent -configureASM -gdbName NO -sid NO -emConfiguration NONE -diskList $ASM_DISKS -diskGroupName $ASMGROUPNAME -nodeinfo $NODE1,$NODE2 -obfuscatedPasswords false -oratabLocation /etc/oratab -asmSysPassword $PASSWORD -redundancy $ASMREDUNDANCY
where ASM_DISKS = '/dev/sda1,/dev/sdb1' and ASMREDUNDANCY='NORMAL'

Can RMAN backup Oracle Real Application Cluster databases?
Absolutely. RMAN can be configured to connect to all nodes within the cluster to parallelize the backup of the database files and archive logs. If files need to be restored, using set AUTOLOCATE ON alerts RMAN to search for backed up files and archive logs on all nodes.

I am receiving an ORA-29740 error. What should I do?
This error can occur when problems are detected on the cluster:
Error: ORA-29740 (ORA-29740)
Text: evicted by member %s, group incarnation %s
---------------------------------------------------------------------------
Cause: This member was evicted from the group by another member of the cluster database for one of several reasons, which may include a communications error in the cluster, failure to issue a heartbeat to the control file, etc.
Action: Check the trace files of other active instances in the cluster group for indications of errors that caused a reconfiguration.
For more information on troubleshooting this error, see the following note: Note: 219361.1 Troubleshooting ORA-29740 in a RAC Environment

Is Oracle Application Server integrated with FAN and FCF?
Yes. For detailed information on the integration with the various releases of Application Server 10g, see http://www.oracle.com/technology/tech/java/newsletter/articles/oc4j_data_sources/oc4j_ds.htm
To troubleshoot OPMN at the application server level, look at appendix A in Oracle® Process Manager and Notification Server Administrator's Guide. If you are using SSL, ensure the SSL enable attribute of ONS in the opmn.xml file has the same value, either true or false, for all OPMN servers in the Farm.

How do I configure FCF with BPEL so I can use Oracle RAC 10g in the backend?
Note: 372456.1 describes the procedure to set up BPEL with an Oracle RAC 10g Release 1 database.

Where can I find more information on cluster_interconnects?
Oracle 9i: Note: 183340.1
Oracle 10g & 11g: Note: 787420.1
Solaris IPMP specific: Note: 368464.1
Links to documentation: Oracle 9 to 11g: Note: 151051.1
As well as: Oracle Real Application Clusters Administration and Deployment Guide: 11g Release 1 and 10g Release 2

How does OCR mirror work? What happens if my OCR is lost/corrupt?
OCR is the Oracle Cluster Registry; it holds all the cluster related information such as instances and services. The OCR file format is binary and starting with 10.2 it is possible to mirror it. The file location(s) are recorded in /etc/oracle/ocr.loc, in the ocrconfig_loc and ocrmirrorconfig_loc variables.
Obviously if you only have one copy of the OCR and it is lost or corrupt then you must restore a recent backup; see the ocrconfig utility for details, specifically the -showbackup and -restore flags. Until a valid backup is restored the Oracle Clusterware will not start up due to the corrupt/missing OCR file.
The interesting discussion is what happens if you have the OCR mirrored and one of the copies gets corrupt? You would expect that everything will continue to work seamlessly. Well.. Almost.. The real answer depends on when the corruption takes place.
If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and the Oracle Clusterware will continue to function without interruptions, despite the corrupt copy. The DBA is advised to repair the hardware/software problem that prevents OCR from accessing the device as soon as possible; alternatively, the DBA can replace the failed device with another healthy device using the ocrconfig utility with the -replace flag.
If however the corruption happens while the Oracle Clusterware stack is down, then it will not be possible to start it up until the failed device becomes online again or some administrative action using the ocrconfig utility with the -overwrite flag is taken. When the Clusterware attempts to start you will see messages similar to:
total id sets (1), 1st set (1669906634,1958222370), 2nd set (0,0) my votes (1), total votes (2)
2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprioini:disk 0 (/dev/raw/raw1) doesn't have enough votes (1,2)
2006-07-12 10:53:54.301: [OCRRAW][1210108256]proprseterror: Error in accessing physical storage [26]
This is because the software can't determine which OCR copy is the valid one. In the above example one of the OCR mirrors was lost while the Oracle Clusterware was down. There are 3 ways to fix this failure:
a) Fix whatever problem (hardware/software?) prevents OCR from accessing the device.
b) Issue "ocrconfig -overwrite" on any one of the nodes in the cluster. This command will overwrite the vote check built into OCR when it starts up. Basically, if the OCR device is configured with a mirror, OCR assigns each device one vote. The rule is to have more than 50% of the total votes (quorum) in order to safely make sure the available devices contain the latest data. In 2-way mirroring, the total vote count is 2 so it requires 2 votes to achieve the quorum. In the example above there aren't enough votes to start if only one device with one vote is available. (In the earlier example, when the device goes down while OCR is running, OCR assigns 2 votes to the surviving device, and that is why this surviving device, now with two votes, can start after the cluster is down.) See warning below.
c) This method is not recommended to be performed by customers. It is possible to manually modify ocr.loc to delete the failed device and restart the cluster. OCR won't do the vote check if the mirror is not configured. See warning below.
EXTREME CAUTION should be exercised if choosing option b or c above since data loss can occur if the wrong file is manipulated; please contact Oracle Support for assistance before proceeding.

If I change my cluster configuration, do I need to update the ONS configuration on my middle tier?
For the best availability and to ensure the application receives all FAN events, yes, you should update the configuration. To a certain degree, ONS will discover nodes. ONS runs on each node in the cluster and is aware of all other nodes in the cluster. As long as ONS on the middle tier can find at least one node in the cluster when it starts, it will find the rest of the nodes. In the case where the only node up is the new node in the cluster when the middle tier starts, the middle tier will not find the cluster.

Can I change my SCAN after I have completed my Grid Infrastructure 11g Release 2 install?
Yes. Use srvctl to modify the SCAN. Check srvctl command help (-h) for details on the command. If you have changed the definition of your SCAN in the DNS, then you will need to remove and add the SCAN. Best Practice is to have the SCAN resolve to 3 IP addresses. This will result in 3 SCAN VIPs and 3 SCAN_LISTENERS in your cluster to load balance connections to databases running in the cluster. Each SCAN_LISTENER listens on one of the SCAN VIP addresses. The SCAN VIPs and SCAN_LISTENERs should be running on different nodes in the cluster.

Why do we have a Virtual IP (VIP) in Oracle RAC 10g or 11g? Why does it just return a dead connection when its primary node fails?
The goal is application availability. When a node fails, the VIP associated with it is automatically failed over to some other node. When this occurs, the following things happen.
(1) VIP detects public network failure which generates a FAN event.
(2) The new node re-arps the world indicating a new MAC address for the IP.
(3) Connected clients subscribing to FAN immediately receive ORA-3113 error or equivalent. Those not subscribing to FAN will eventually time out.
(4) New connection requests rapidly traverse the tnsnames.ora address list skipping over the dead nodes, instead of having to wait on TCP-IP timeouts.
Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs and FAN. The easiest way to use FAN is to use an integrated client with Fast Connection Failover (FCF) such as JDBC, OCI, or ODP.NET.

What do the VIP resources do once they detect a node has failed/gone down? Are the VIPs automatically acquired, and published, or is manual intervention required? Are VIPs mandatory?
With Oracle RAC 10g or higher, each node requires a VIP. With Oracle RAC 11g Release 2, 3 additional SCAN vips are required for the cluster. When a node fails, the VIP associated with the failed node is automatically failed over to one of the other nodes in the cluster. When this occurs, two things happen:
1. The new node re-arps the world indicating a new MAC address for this IP address. For directly connected clients, this usually causes them to see errors on their connections to the old address.
2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.
In the case of existing SQL connections, errors will typically be in the form of ORA-3113 errors, while a new connection using an address list will select the next entry in the list. Without using VIPs, clients connected to a node that died will often wait for a TCP/IP timeout period before getting an error. This can be as long as 10 minutes or more. As a result, you don't really have a good HA solution without using VIPs.
With Oracle RAC 11g Release 2, you can delegate the management of the VIPs to the cluster. If you do this, the Grid Naming Service (part of the Oracle Clusterware) will automatically allocate and manage all VIPs in the cluster. This requires a DHCP service on the public network.

If I use Services with Oracle RAC, do I still need to set up Load Balancing ?
Yes. Services allow you granular definition of workload and the DBA can dynamically define which instances provide the service. Connection Load Balancing (provided by Oracle Net Services) still needs to be set up to allow the user connections to be balanced across all instances providing a service. With Oracle RAC 10g Release 2 or higher, set the CLB_GOAL on the service to define the type of load balancing you want: SHORT for short lived connections (IE connection pool) or LONG (default) for applications that have connections active for long periods (IE Oracle Forms application).

How can a customer mask the change in their clustered database configuration from their client or application? (I.E. So I do not have to change the connection string when I add a node to the Oracle RAC database)
The combination of Server Side load balancing and Services allows you to easily mask cluster database configuration changes. As long as all instances register with all listeners (use the LOCAL_LISTENER and REMOTE_LISTENER parameters), server side load balancing will allow clients to connect to the service on currently available instances at connect time. The load balancing advisory (setting a goal on the service) will give advice as to how many connections to send to each instance currently providing a service. When a service is enabled on an instance, as long as the instance registers with the listeners, the clients can start getting connections to the service and the load balancing advisory will include that instance in its advice. With Oracle RAC 11g Release 2, the Single Client Access Name (SCAN) provides a single
name to be put in the client connection string (as the address). Clients using SCAN never have to change even if the cluster configuration changes such as adding nodes.

After executing DBMS_SERVICE.START_SERVICE, the service resource remains in an OFFLINE status when I display the cluster resource status. Is that expected behaviour ?
YES, this is expected behaviour. Unfortunately, DBMS_SERVICE.START_SERVICE does not update the clusterware until 11g Release 2. You should use srvctl start service -d dbname, then you should see it come online.
Note: With Oracle RAC 11g Release 2, the cluster resource for a Service contains the values for all the attributes of a service. Oracle Clusterware will update the database with its values when it starts a service. In order to save modifications across restarts, all service modifications should be made with srvctl (or Oracle Enterprise Manager).

Does Oracle support rolling upgrades in a cluster?
This answer is for clusters running the Oracle stack. If 3rd party vendor clusterware is included, you need to check with the vendor about their support of a rolling upgrade.
By a rolling upgrade, we mean upgrading software (Oracle Database, Oracle Clusterware, ASM or the OS itself) while the cluster is operational by shutting down a node, upgrading the software on that node, and then reintegrating it into the cluster, and so forth one node at a time until all the nodes in the cluster are at the new software level.
For the Oracle Database software, it is possible only for certain single patches that are marked as rolling upgrade compatible. Most Bundle patches and Critical Patch Updates (CPU) are rolling upgradeable. Patchsets and DB version (10g to 11g) changes are not supported in a rolling fashion; one reason that this may be impossible is that across major releases, there may be incompatible versions of the system tablespace, for example. To upgrade these in a rolling fashion one will need to use a logical standby with Oracle Database 10g or 11g; see Note: 300479.1 for details. Read the MAA Best Practice on Rolling Database Upgrades using Data Guard SQL Apply or, with Oracle RAC 11g, Rolling Database Upgrades for Physical Standby Databases using Transient Logical Standby 11g.
The Oracle Clusterware software always fully supports rolling upgrades, while the ASM software is rolling upgradeable at version 11.1.0.6 and beyond.
For Oracle Database 11g Release 2, Oracle Clusterware and ASM binaries are combined into a single ORACLE_HOME called the grid infrastructure home. This home fully supports rolling upgrades for patches, bundles, patchsets and releases. (If you are upgrading ASM from Oracle Database 10g to 11g Release 2, you will not be able to upgrade ASM in a rolling fashion.)
The Oracle Clusterware and Oracle Real Application Clusters both support rolling upgrades of the OS software when the version of the Oracle Database is certified on both releases of the OS (and the OS is the same, no Linux and Windows or AIX and Solaris, or 32 and 64 bit etc.). This can apply a patch to the operating system, a patchset (such as EL4u4 to EL4u6) or a release (EL4 to EL5). Stay within a 24 hour upgrade window and fully test this path, as it's not possible for Oracle to test all these different paths and combinations.

What are my options for load balancing with Oracle RAC? Why do I get an uneven number of connections on my instances?
All the types of load balancing available currently (9i-10g) occur at connect time. This means that it is very important how one balances connections and what these connections do on a long term basis. Since establishing connections can be very expensive for your application, it is good programming practice to connect once and stay connected. This means one needs to be careful as to what option one uses. Oracle Net Services provides load balancing or you can use external methods such as hardware based or clusterware solutions.
The following options exist prior to Oracle RAC 10g Release 2 (for 10g Release 2 see Load Balancing Advisory):
Random
Either client side load balancing or hardware based methods will randomize the connections to the instances. On the negative side this method is unaware of load on the connections or even if they are up, meaning they might cause waits on TCP/IP timeouts.
Load Based
Server side load balancing (by the listener) redirects connections by default depending on the RunQ length of each of the instances. This is great for short lived connections. Terrible for persistent connections or login storms. Do not use this method for connections from connection pools or application servers.
Session Based
Server side load balancing can also be used to balance the number of connections to each instance. Session count balancing is the method used when you set a listener parameter, prefer_least_loaded_node_listener-name=off. Note listener name is the actual name of the listener which is different on each node in your cluster and by default is listener_nodename. Session based load balancing takes into account the number of sessions connected to each node and then distributes the connections to balance the number of sessions across the different nodes.

What do I do if I am getting handshake failed messages in my ONS.LOG file every minute?
For Example: The client gets this error message in Production in the ons.log file every minute or so:
06/11/10 10:11:14 [2] Connection 0,129.86.186.58,6200 SSL handshake failed
06/11/10 10:11:14 [2] Handshake for 0,129.86.186.58,6200: nz error = 29049 interval = 0 (180 max)
These annoying messages in ons.log are telling you that you have a configuration mismatch for
ONS somewhere in the farm. Oracle RAC has its own ONS server for which SSL is disabled by default. You must either enable SSL for Oracle RAC ONS, or disable it for OID ONS(OPMN). You need to create a wallet for each Oracle RAC ONS server, or copy one of the wallets from OPMN on the OID instances. In ons.conf you need to specify the wallet file and password:
walletfile=
walletpassword=
ONS only uses SSL between servers, and so ONS clients will not be affected. You specify the wallet password when you create the wallet. If you copy a wallet from an OPMN instance, then use the same password configured in opmn.xml. If there is no wallet password configured in opmn.xml, then you don't need to specify a wallet password in ons.conf either.

What should I do to make my Oracle RAC deployment highly available?
Customers often deploy Oracle Real Application Clusters (RAC) to provide a highly available infrastructure for their mission critical applications. Oracle RAC removes the server as a single point of failure. Load balancing your workload across many servers along with fast recovery from failures means that the loss of any one server should have little or no impact on the end user of the application. The level of impact to the end user depends on how well the application has been written to mask failure. If an outage occurs on an Oracle RAC instance, the ideal situation would be for the failover time + transaction response time to be less than the maximum acceptable response time. Oracle RAC has many features that customers can take advantage of to mask failures from the end user; however, it requires more work than just installing Oracle RAC.
To the application user, the availability metric that means the most is the response time for their transaction. This is the end-to-end response time, which means all layers must be available and performing to a defined standard for the agreed times.
If you are deploying Oracle RAC and require high availability, you must make the entire infrastructure of the application highly available. This requires detailed planning to ensure there are no single points of failure throughout the infrastructure. Oracle Clusterware is constantly monitoring any process that is under its control, which includes all the Oracle software such as the Oracle instance, listener, etc. Oracle Clusterware has been programmed to recover from failures which occur for the Oracle processes. In order to do its monitoring and recovery, various system activities happen on a regular basis such as user authentication, sudo, and hostname resolution. In order for the cluster to be highly available, it must be able to perform these activities at all times. For example, if you choose to use the Lightweight Directory Access Protocol (LDAP) for authentication, then you must make the LDAP server highly available as well as the network connecting the users, application, database and LDAP server. If the database is up but the users cannot connect to the database because the LDAP server is not accessible, then the entire system is down in the eyes of your users. When using external authentication such as LDAP or NIS (Network Information Service), a public network failure will cause failures within the cluster. Oracle recommends that the hostname, vip, and interconnect are defined in the /etc/hosts file on all nodes in the cluster.
During the testing of the Oracle RAC implementation, you should include a destructive testing phase. This is a systematic set of tests of your configuration to ensure that 1) you know what to expect if the failure occurs and how to recover from it and 2) that the system behaves as expected
during the failure. This is a good time to review operating procedures and document recovery procedures. Destructive testing should include tests such as node failure, instance failure, public network failure, interconnect failures, storage failure, storage network failure, voting disk failure, loss of an OCR, and loss of ASM.
Using features of Oracle Real Application Clusters and Oracle Clients including Fast Application Notification (FAN), Fast Connection Failover (FCF), Oracle Net Service Connection Load Balancing, and the Load Balancing Advisory, applications can mask most failures and provide a very highly available application. For details on implementing best practices, see the MAA document Client Failover Best Practices for Highly Available Oracle Databases and the Oracle RAC Administration and Deployment Guide.

I have a 2 node Oracle RAC cluster, if I pull the interconnect on node 1 to simulate failure, why does node 2 reboot?
When Oracle Clusterware recognizes a problem on the interconnect, it will try to keep the largest sub-cluster running. However in a 2 node cluster, we can only keep one node up, so the first node that joined the cluster will be the node that stays up and Oracle Clusterware will reboot the other node, even if you pulled the cable from the node that stayed up. In the case above, if node 1 was the first node to join the cluster, even if you pull the interconnect cable from node 1, node 2 will be rebooted.

Is there a way to provide or configure HA for the interconnect using Infiniband on AIX ?
The HA support will be with VIPA configured over two separate IB interfaces. The two interfaces can either be two ports on one adapter (not ideal HA) or two ports from different adapters. This VIPA configuration is different from the "AIX Etherchannel" configuration. "AIX Etherchannel" is not supported with Infiniband.

Can our Oracle RAC 10g VIP fail over from NIC to NIC as well as from node to node ?
Yes, the Oracle RAC 10g VIP implementation is capable of failing over within a node from NIC to NIC and back if the failed NIC is back online again, and also we fail over between nodes. The NIC to NIC failover is fully redundant if redundant switches are installed.

I am using shared servers with the following set in init.ora
SQL> show parameters dispatchers=(protocol=TCP)(listener=listeners_nl01)(con=500)(serv=oltp)
I stopped my service with srvctl stop service but it is still registered with the listener and accepting connections. Is this expected?
YES. This is by design of dispatchers, which are part of Oracle Net Services. If you specify the service attribute of the dispatchers init.ora parameter, the service specified cannot be managed by the dba.

Is it possible to use SRVCTL start database with a user account other than oracle (that is, other than the owner of the oracle software)?
YES. When you create a RAC db as a user different than the home/software owner (oracle) user, the db creation assistant would set the correct permissions/ACLs on the CRS resources that control the db/instances etc., assuming that you had set up group membership for this user to the dba group of the home (find it using oracle_home/bin/osdbagrp) and also part of the crs home owner's primary group (usually oinstall), and there was group write permission on the oracle_home.

With three primary load balancing options (client-side connect-time LB, server-side connect-time LB, and the runtime connection load balancing), is it fair to say Runtime Connection Load Balancing is the only option to leverage FAN up/down events?
No. The listener is a subscriber to all FAN events (both from the load balancing advisory and the HA events). Therefore server side connection load balancing leverages FAN HA events as well as load balancing advisory events. With the Oracle JDBC driver 10g Release 2, if you enable Fast Connection Failover, you also enable Runtime Connection Load Balancing (one knob for both).

What is Server-side Transparent Application Failover (TAF) and how do I use it?
Oracle Database 10g Release 2 introduces server-side TAF when using services. After you create a service, you can use the dbms_service.modify_service pl/sql procedure to define the TAF policy for the service. Only the basic method is supported. Note this is different than the TAF policy (traditional client TAF) that is supported by srvctl and the EM Services page. If your service has a server side TAF policy defined, then you do not have to encode TAF on the client connection string. If the instance where a client is connected fails, then the connection will be failed over to another instance in the cluster that is supporting the service. All restrictions of TAF still apply.
NOTE: both the client and server must be 10.2 and aq_ha_notifications must be set to true for the service.
Sample code to modify service:
execute dbms_service.modify_service (service_name => 'gl.us.oracle.com' -
, aq_ha_notifications => true -
, failover_method => dbms_service.failover_method_basic -
, failover_type => dbms_service.failover_type_select -
, failover_retries => 180 -
, failover_delay => 5 -
, clb_goal => dbms_service.clb_goal_long);
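As a rough illustration of what the failover_retries and failover_delay values in the sample imply, the Python sketch below multiplies them to estimate how long a session may keep retrying. It assumes failover_delay is expressed in seconds and that every retry waits the full delay; the function name is hypothetical and the model ignores the time spent on each connection attempt.

```python
def taf_retry_window_seconds(failover_retries: int, failover_delay: int) -> int:
    """Estimate the total wait between retries (assumed simple model)."""
    if failover_retries < 0 or failover_delay < 0:
        raise ValueError("retries and delay must be non-negative")
    # Each retry is assumed to be preceded by one full delay.
    return failover_retries * failover_delay


# Values taken from the sample modify_service call above.
window = taf_retry_window_seconds(180, 5)
print(f"{window} seconds (~{window // 60} minutes)")
```

With the sample values this gives a retry window of roughly 15 minutes, which is worth comparing against how long the application can tolerate waiting.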

What is CLB_GOAL and how should I set it?
CLB_GOAL is the connection load balancing goal for a service. There are 2 options, CLB_GOAL_SHORT and CLB_GOAL_LONG (default). Long is for applications that have long-lived connections. This is typical for connection pools and SQL*Forms sessions. Short is for applications that have short-lived connections. The GOAL for a service can be set with EM or DBMS_SERVICE. Note: You must still configure load balancing with Oracle Net Services.

What does the Virtual IP service do? I understand it is for failover but do we need a separate network card? Can we use the existing private/public cards? What would happen if we used the public ip?
The 10g Virtual IP Address (VIP) exists on every RAC node for public network communication. All client communication should use the VIPs in their TNS connection descriptions. The TNS ADDRESS_LIST entry should direct clients to VIPs rather than using hostnames. During normal runtime, the behaviour is the same as with hostnames; however, when the node goes down or is shut down, the VIP is hosted elsewhere on the cluster and does not accept connection requests. This results in a silent TCP/IP error and the client fails immediately to the next TNS address. If the network interface fails within the node, the VIP can be configured to use alternate interfaces in the same node. The VIP must use the public interface cards. There is no requirement to purchase additional public interface cards (unless you want to take advantage of within-node card failover).
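The address-list behaviour described above can be pictured with a small Python sketch. It is only a model, not Oracle Net code: the VIP names are hypothetical, and the try_connect callable stands in for a real connection attempt that is refused immediately when a VIP has been relocated.

```python
from typing import Callable, Iterable, Optional


def first_reachable(addresses: Iterable[str],
                    try_connect: Callable[[str], bool]) -> Optional[str]:
    """Walk the address list in order and return the first address that accepts."""
    for address in addresses:
        if try_connect(address):
            return address
    return None  # every address refused the connection


# Hypothetical VIPs; node1's VIP has moved to another node and refuses connections.
address_list = ["node1-vip:1521", "node2-vip:1521", "node3-vip:1521"]
refusing = {"node1-vip:1521"}

chosen = first_reachable(address_list, lambda a: a not in refusing)
print(chosen)  # node2-vip:1521
```

The point of the model is that a refusal is immediate, so the client moves on to the next entry without waiting for a TCP/IP timeout.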

I want to configure a secure environment for ONS so have added a Wallet however I am seeing errors (SSL handshake failed) after adding the wallet?
Remember that if you enable SSL for one instance of ONS, you must enable SSL for all instances with ONS (including any AS instances running OPMN). The error message in this case showed that SSL is enabled for the local ONS server, but the SSL handshake is failing when another ONS or OPMN server attempts to connect to it, indicating that the remote server does not have SSL enabled (or has an incompatible wallet configured).

Do I need to install the ONS on all my mid-tier servers in order to enable JDBC Fast Connection Failover (FCF)?
With 10g Release 1, the middle tier must have ONS running (started by the same user as the application). ONS is not included on the Client CD; however, it is part of the Oracle Database 10g CD. With 10g Release 2 or later, you do not need to install ONS on the middle tier. The JDBC driver allows the use of remote ONS (i.e. it uses the ONS running in the RAC cluster). Just use the datasource parameter ods.setONSConfiguration("nodes=racnode1:4200,racnode2:4200");
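When the node list comes from configuration rather than being typed by hand, a small helper can assemble the nodes= string and catch malformed entries early. The Python sketch below is illustrative only: build_ons_configuration is a hypothetical helper, and the host names and port 4200 are simply the values from the example above.

```python
from typing import List, Tuple


def build_ons_configuration(nodes: List[Tuple[str, int]]) -> str:
    """Build a 'nodes=host:port,host:port' string for remote ONS."""
    if not nodes:
        raise ValueError("at least one ONS node is required")
    parts = []
    for host, port in nodes:
        host = host.strip()
        if not host:
            raise ValueError("empty host name")
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port for {host}: {port}")
        parts.append(f"{host}:{port}")
    return "nodes=" + ",".join(parts)


config = build_ons_configuration([("racnode1", 4200), ("racnode2", 4200)])
print(config)  # nodes=racnode1:4200,racnode2:4200
```

The resulting string is what would be passed to the datasource's ONS configuration setting.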

Will FAN/FCF work with the default database service?
No. If you want the advanced features of RAC provided by FAN and FCF, then create a cluster managed service for your application. Use the Clustered Managed Services Page in Enterprise Manager DBControl to do this.

Will FAN work with SQLPlus?
Yes. With Oracle RAC 11g, you can specify the -F (FAILOVER) option. This enables SQL*Plus to interact with the OCI failover mode in a Real Application Clusters (RAC) environment. In this mode a service or instance failure is transparently handled, with transaction status messages if applicable.

Why am I seeing the following warnings in my listener.log for my RAC 10g environment? WARNING: Subscription for node down event still pending
This message indicates that the listener was not able to subscribe to the ONS events which it uses to do the connection load balancing. This is most likely due to starting the listener using lsnrctl from the database home. When you start the listener using lsnrctl, make sure you have set the environment variable ORACLE_CONFIG_HOME = {Oracle Clusterware HOME}. Also set it in racgwrap in $ORACLE_HOME/bin for the database.

Can I use the 10.2 JDBC driver with 10.1 database for FCF?
Yes. With the patch for Bug 5657975 for 10.2.0.3, the 10.2 JDBC driver will work with a 10.1 database. The fix will be part of the 10.2.0.4 patchset. If you do not have the patch, then to use FCF, use the 10.2 JDBC driver with a 10.2 database. If the database is 10.1, use the 10.1 JDBC driver.
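The combinations described above can be written down as a small lookup. The Python sketch below is only a restatement of this answer, not an official support matrix: the function name is hypothetical, versions are given as "10.1" or "10.2", and any combination the answer does not mention raises an error instead of guessing.

```python
def fcf_combination_ok(driver: str, database: str,
                       has_bug_5657975_fix: bool = False) -> bool:
    """Return True if the answer above describes the pairing as usable for FCF."""
    if driver == database and driver in ("10.1", "10.2"):
        return True  # matching driver and database releases
    if driver == "10.2" and database == "10.1":
        # Works only with the patch for Bug 5657975 (or a patchset containing it).
        return has_bug_5657975_fix
    raise ValueError(f"combination not covered here: driver {driver}, database {database}")


print(fcf_combination_ok("10.2", "10.1"))                            # False
print(fcf_combination_ok("10.2", "10.1", has_bug_5657975_fix=True))  # True
```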

What clients provide integration with FAN through FCF?
With Oracle Database 10g Release 1, JDBC clients (both thick and thin driver) are integrated with FAN by providing FCF. With Oracle Database 10g Release 2, we have added ODP.NET and OCI. Other applications can integrate with FAN by using the API to subscribe to the FAN events. Note: If you are using a 3rd party application server, then you can only use FCF if you use the Oracle driver and, except for OCI, its connection pool. If you are using the connection pool of the 3rd Party Application Server, then you do not get FCF. Your customer can subscribe directly to FAN events; however, that is a development project for the customer. See the white paper Workload Management with Oracle RAC 10g on OTN.

Can I use TAF and FAN/FCF?
With Oracle Database 10g Release 1, NO. With Oracle Database 10g Release 2, the answer is YES for OCI and ODP.NET, and it is recommended. For JDBC, you should not use TAF and FCF together, even with the Thick JDBC driver.

How do the datasource properties initialLimit, minLimit, and maxLimit affect Fast Connection Failover processing with JDBC?
The initialLimit property on the Implicit Connection Cache is effective only when the cache is first created. For example, if initialLimit is set to 10, you will have 10 connections pre-created and available when the connection cache is first created. Do not confuse minLimit with initialLimit. The current behavior is that after a DOWN event, once the affected connections are cleaned up, it is possible for the number of connections in the cache to be lower than minLimit.
An UP event is processed both (a) when a new instance joins, and (b) when an instance goes down and then comes back UP. This has no relevance to initialLimit, or even minLimit. When an UP event comes into the JDBC Implicit Connection Cache, we will create some new connections. Assuming you have your listener load balancing set up properly, those connections should go to the instance that was just started. When your application gets a connection from the pool, it will be given an idle connection. If you are running 10.2 and have the load balancing advisory turned on for the service, we will allocate the session based on the defined goal to provide the best service level.
maxLimit, when set, defines the upper boundary limit for the connection cache. By default, maxLimit is unbounded and your database sets the limit.
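The interaction of the three limits can be pictured with a small toy model. This is only a sketch under stated assumptions: the class `ToyConnectionCache`, its methods, and the round-robin spreading of the initial connections are hypothetical and are not the Oracle JDBC Implicit Connection Cache API. It shows that initialLimit matters only at creation, that a DOWN event can leave the cache below minLimit, and that an UP event adds connections up to maxLimit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyConnectionCache:
    """Toy model of the limits described above (not Oracle's implementation)."""

    initial_limit: int = 0
    min_limit: int = 0
    max_limit: Optional[int] = None  # None models the unbounded default
    connections: Dict[str, int] = field(default_factory=dict)  # instance -> count

    def create(self, instances: List[str]) -> None:
        # initialLimit only applies here, when the cache is first created.
        self.connections = {name: 0 for name in instances}
        for i in range(self.initial_limit):
            self.connections[instances[i % len(instances)]] += 1

    def total(self) -> int:
        return sum(self.connections.values())

    def on_down(self, instance: str) -> None:
        # Affected connections are cleaned up and nothing refills the cache,
        # so total() may drop below min_limit.
        self.connections[instance] = 0

    def on_up(self, instance: str, wanted: int) -> None:
        # An UP event creates some new connections, never exceeding max_limit.
        if self.max_limit is not None:
            wanted = max(0, min(wanted, self.max_limit - self.total()))
        self.connections[instance] = self.connections.get(instance, 0) + wanted


cache = ToyConnectionCache(initial_limit=10, min_limit=8, max_limit=12)
cache.create(["inst1", "inst2"])
cache.on_down("inst2")
print(cache.total(), cache.total() < cache.min_limit)
cache.on_up("inst2", wanted=10)
print(cache.total())
```

In this run the cache holds 5 connections after the DOWN event, which is below the minLimit of 8, and the UP event brings it back only as far as the maxLimit of 12.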

Will FAN/OCI work with Instant Client?
Yes, FAN/OCI will work with Instant Client. Both client and server must be Oracle Database 10g Release 2.

What type of callbacks are supported with OCI when using FAN/FCF?
There are two separate callbacks supported. The HA Events (FAN) callback is called when an event occurs. When a down event occurs, for example, you can clean up a custom connection pool, i.e. purge stale connections. When the failover occurs, the TAF callback is invoked. At failover time you can customize the newly created database session. Both FAN and TAF are client-side callbacks. FAN also has a separate server side callout that should not be confused with the OCI client callback.

Does FCF for OCI react to FAN HA UP events?
OCI does not perform any implicit actions on an up event; however, if an HA event callback is present, it is invoked. You can take any required action at that time.

Can I use FAN/OCI with Pro*C?
Since Pro*C (sqllib) is built on top of OCI, it should support HA events. You need to precompile the application with the option EVENTS=TRUE and make sure you link the application with a thread library. The database connection must use a Service that has been enabled for AQ events. Use dbms_service.modify_service to enable the service for events (aq_ha_notifications => true) or use the EM Cluster Database Services page.

Do I have to link my OCI application with a thread library? Why?
YES, you must link the application to a threads library. This is required because the AQ notifications occur asynchronously, over an implicitly spawned thread.

I am seeing the wait events 'ges remote message', 'gcs remote message', and/or 'gcs for action'. What should I do about these?
These are idle wait events and can be safely ignored. The 'ges remote message' might show up in a 9.0.1 statspack report as one of the top wait events. To have this wait event not show up, you can add this event to the PERFSTAT.STATS$IDLE_EVENT table so that it is not listed in Statspack reports.

What are the changes in memory requirements from moving from single instance to RAC?
If you are keeping the workload requirements per instance the same, then about 10% more buffer cache and 15% more shared pool is needed. The additional memory requirement is due to data structures for coherency management. The values are heuristic and are mostly upper bounds.

Actual resource usage can be monitored by querying the current and maximum columns for the gcs resource/locks and ges resource/locks entries in V$RESOURCE_LIMIT.
But in general, please take into consideration that memory requirements per instance are reduced when the same user population is distributed over multiple nodes. In this case:
Assuming the same user population
N = number of nodes
M = buffer cache for a single system
then
(M / N) + ((M / N) * 0.10) [ + extra memory to compensate for failed-over users ]
Thus, for example, with M=2G & N=2 & no extra memory for failed-over users:
= (2G / 2) + ((2G / 2) * 0.10)
= 1G + 100M

Will adding a new instance to my Oracle RAC database (new node to the cluster) allow me to scale the workload?
YES! Oracle RAC allows you to dynamically scale out your workload by adding another node to the cluster. You must remember that adding more work to the database means that in addition to the CPU and Memory that the new node brings, you will have to ensure that your I/O subsystem can support the additional I/O requirements. In an Oracle RAC environment, you need to look at the total I/O across all instances in the cluster.

How do I change my Veritas SF RAC installation to use UDP instead of LLT?
Using UDP with Veritas Clusterware and Oracle RAC 10g seems to require an exception from Veritas so this may be something you should check with them. To make it easier for customers to convert their LLT environments to UDP, Oracle has created Patch 6846006 on 10.2.0.3 which contains the libraries that were overwritten by the Veritas installation (IE those mentioned above). Converting from specialized protocols to UDP requires a relink after the Oracle libraries have been restored. This needs a complete cluster shutdown and cannot be accomplished in a rolling fashion.
NOTE: Oracle RAC 11g will not support LLT for interconnect.

Can I have different servers in my Oracle RAC? Can they be from different vendors? Can they be different sizes?
Oracle Real Application Clusters (RAC) requires all the nodes to run the same Operating System binary in a cluster (IE All nodes must be Windows 2008 or all nodes must be OEL 4). All nodes must be the same architecture (I.E. All nodes must be either 32 bit or all nodes must be 64 bit or all nodes must be HP-UX PARISC since you cannot mix PARISC with Itanium).
Oracle RAC does support a cluster with nodes that have different hardware configurations. An example is a cluster with 3 nodes with 4 CPUs and another node with 6 CPUs. This can easily occur when adding a new node after the cluster has been in production for a while. For this type of configuration, customers must consider some additional features to get the optimal cluster performance. The servers used in the cluster can be from different vendors; this is fully supported as long as they run the same binaries. Since many customers implement Oracle RAC for high availability, you must make sure that your hardware vendor will support the configuration. If you have a failure, will you get support for the hardware configuration?
The installation of Oracle Clusterware expects the network interface to be the same name on all nodes in the cluster. If you are using different hardware, you may need to work with your operating system vendor to make sure the network interface names are the same name on all nodes (IE eth0). Customers implementing uneven cluster configurations need to consider how they will balance the workload across the cluster. Some customers have chosen to manually assign different workloads to different nodes. This can be done using database services; however, it is often difficult to predict workloads and the system cannot dynamically react to changes in workload. Changes to workload require the DBA to modify the service. You will also need to consider how you will survive failures in the cluster. Will the service levels be maintained if the larger node in the cluster fails? Especially in a small cluster, the impact of losing a node could impact the ability to continue processing the application workload.
The impact of the different sized nodes depends on how much difference there is in the size. If there is a large difference between the nodes in terms of memory and CPU size, then the "bigger" nodes will attract more load, obviously, and in the case of failure the "smaller" node(s) will become overwhelmed. In such a case, static routing of workload via services, e.g. batch and certain services, which can be suspended/stopped if the large node fails and the cluster has significantly reduced capacity, may be advisable. The general recommendation is that the nodes should be sized in such a way that the aggregated peak load of the large node(s) can be absorbed by the smaller node(s), i.e. the smaller node should have sufficient capacity to run the essential services alone. Another option is to add another small node to the cluster on demand in case the large one fails.
It should also be noted, especially if there is a large difference between the sizes of the nodes, that the small nodes can slow down the larger node. This could be critical if the smaller node is very busy and must serve data to the large node.
To help balance workload across a cluster, Oracle RAC 10g Release 2 and above provides the Load Balancing Advisory (LBA). The load balancing advisory runs in an Oracle RAC database and monitors the work executed by the service on all instances where the service is active in the cluster. The LBA provides recommendations to the subscribed clients about the state of the service and where the client should direct connection requests. Setting the GOAL on the service activates the load balancing advisory. Clients that can utilize the load balancing advisory are Oracle JDBC Implicit Connection Cache, Oracle Universal Connection Pool for Java, Oracle Call Interface Session Pool, ODP.NET Connection Pool, and Oracle Net Services Connection Manager. The Oracle Listener also uses the Load Balancing Advisory if the CLB_GOAL parameter is set to SHORT (recommended Best Practice if using an integrated Oracle Client mentioned here). If CLB_GOAL is set to LONG (default), the Listener will load balance the number of sessions for the service across the instances where the service is available. See the Oracle Real Application Clusters Administration and Deployment Guide for details on implementing services and the various parameter settings.

What do I do if I see GC CR BLOCK LOST in my top 5 Timed Events in my AWR Report?
You should never see this or BLOCK RETRY events. This is most likely due to a fault in your interconnect network. Work with your system administrator and/or network administrator to find the fault.
Check netstat -s
Ip:
84884742 total packets received
1201 fragments dropped after timeout
3384 packet reassembles failed
You do not want to see fragments dropped or packet reassemblies failed.
ifconfig -a:
eth0 Link encap:Ethernet HWaddr 00:0B:DB:4B:A2:04
inet addr:130.35.25.110 Bcast:130.35.27.255 Mask:255.255.252.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:21721236 errors:135 dropped:0 overruns:0 frame:95
TX packets:273120 errors:0 dropped:0 overruns:0 carrier:0
You do not want to see a high number of errors.

A customer is currently using RAC in a 2 node environment. How should one review the ability to scale out to 4, 6, 8 or even more nodes? What should the requirements of a scale out test be?
Once a customer is using RAC on a two node cluster and wants to see how far they can actually scale it, the following are some handy tips to follow:
1. Ensure they are using a real enough workload that it does not have false bottlenecks.
2. Have tuned the application so it is reasonably scalable on their current RAC environment.
3. Make sure you are measuring a valid scalability measure. This should either be doing very large batch jobs quicker (via parallelism) or being able to support a greater number of short transactions in a shorter time.
4. Actual scalability will vary for each application and its bottlenecks. Thus the request to do the above items. You would see similar scalability if scaling up on an SMP.
5. For failover, you should see what happens if you lose a node. If you have 2 nodes, you lose half your power and really get into trouble or have lots of extra capacity.
6. Measure that load balancing is working properly. Make sure you are using RCLB and a FAN aware connection pool.
7. Your customer should also test using DB Services.

8. Get familiar with EM GC to manage a cluster and help eliminate a lot of the complexity of many of the nodes.
9. Why stop at 6 nodes? A maximum of 3 way messaging ensures RAC can scale much, much further.

What is the Load Balancing Advisory?
To assist in the balancing of application workload across designated resources, Oracle Database 10g Release 2 provides the Load Balancing Advisory. This Advisory monitors the current workload activity across the cluster and, for each instance where a service is active, it provides a percentage value of how much of the total workload should be sent to this instance as well as a service quality flag. The feedback is provided as an entry in the Automatic Workload Repository and a FAN event is published. The easiest way for an application to take advantage of the load balancing advisory is to enable Runtime Connection Load Balancing with an integrated client.

How do I enable the load balancing advisory?
The load balancing advisory requires the use of services and Oracle Net connection load balancing.
To enable it on the server: set a goal (service_time or throughput) and set CLB_GOAL=SHORT on your service.
For the client, you must be using the connection pool.
For JDBC, enable the datasource parameter FastConnectionFailoverEnabled.
For ODP.NET, enable the datasource parameter Load Balancing=true.

What are my options for setting the Load Balancing Advisory GOAL on a Service?
The load balancing advisory is enabled by setting the GOAL on your service either through the PL/SQL DBMS_SERVICE package or the EM DBControl Clustered Database Services page. There are 3 options for GOAL:
None - Default setting, turn off advisory.
THROUGHPUT - Work requests are directed based on throughput. This should be used when the work in a service completes at homogeneous rates. An example is a trading system where work requests are similar lengths.
SERVICE_TIME - Work requests are directed based on response time. This should be used when the work in a service completes at various rates. An example is an internet shopping system where work requests are various lengths.
Note: If using GOAL, you should set CLB_GOAL=SHORT.
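To illustrate the percentage feedback mentioned in the Load Balancing Advisory answer above, here is a minimal sketch of how a client-side pool could split a batch of work requests according to per-instance percentages. It is an assumption-laden illustration: the function name `distribute_requests`, the instance names, and the percentages are hypothetical, and the actual FAN event format and the algorithm used by Oracle's integrated clients are not shown in this FAQ.

```python
from typing import Dict


def distribute_requests(percentages: Dict[str, float], n_requests: int) -> Dict[str, int]:
    """Split n_requests across instances in proportion to advisory percentages.

    Uses the largest-remainder method so the result always sums to n_requests.
    """
    total = sum(percentages.values())
    if total <= 0:
        raise ValueError("at least one instance must have a positive percentage")

    # Ideal (fractional) share for each instance.
    exact = {inst: n_requests * pct / total for inst, pct in percentages.items()}
    # Start from the whole-number part of each share.
    allocation = {inst: int(share) for inst, share in exact.items()}

    # Hand the remaining requests to the largest fractional parts.
    leftover = n_requests - sum(allocation.values())
    by_fraction = sorted(exact, key=lambda inst: exact[inst] - allocation[inst], reverse=True)
    for inst in by_fraction[:leftover]:
        allocation[inst] += 1
    return allocation


# Hypothetical advisory: inst1 should take 60% of the work, inst2 30%, inst3 10%.
print(distribute_requests({"inst1": 60, "inst2": 30, "inst3": 10}, 10))
```

A real pool would refresh the percentages each time a new advisory event arrives, which is what lets routing react to changing load.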

How can I validate the scalability of my shared storage? (Tightly related to RAC / Application scalability)
Storage vendors tend to focus their sales pitch mainly on the storage unit's capacity in Terabytes (1000 GB) or Petabytes (1000 TB); however, for RAC scalability it's critical to also look at the storage unit's ability to process I/O's per second (throughput) in a scalable fashion, specifically from multiple sources (nodes). If that criteria is not met, RAC / Application scalability most probably will suffer, as it partially depends on storage scalability as well as a solid and capable interconnect (for network traffic between nodes).
Storage vendors may sometimes discourage such testing, boasting about their amazing front or backend battery backed memory caches that "eliminate" all I/O bottlenecks. This is all great, and you should take advantage of such caches as much as possible; however, there is no substitute for a real world test. You may uncover that the HBA (Host Bus Adapter) firmware or the driver versions are outdated (before you claim poor RAC / Application scalability issues).
It is highly recommended to test this storage scalability early on so that expectations are set accordingly. On Linux there is a freely available tool released on OTN called ORION (Oracle I/O test tool) which simulates Oracle I/O. On other Unix platforms (as well as Linux) one can use IOzone; if a prebuilt binary is not available you should build from source. Make sure to use version 3.271 or later and, if testing raw/block devices, add the "-I" flag.
In a basic read test you will try to demonstrate that a certain IO throughput can be maintained as nodes are added. Try to simulate your database io patterns as much as possible, i.e. blocksize, number of simultaneous readers, rates, etc.
For example, on a 4 node cluster, from node 1 you measure 20MB/sec, then you start a read stream on node 2 and see another 20MB/sec while the first node shows no decrease. You then run another stream on node 3 and get another 20MB/sec. In the end you run 4 streams on 4 nodes, and get an aggregated 80MB/sec or close to that. This will prove that the shared storage is scalable. Obviously if you see poor scalability in this phase, that will be carried over and be observed or interpreted as poor RAC / Application scalability.
In many cases RAC / Application scalability is blamed for no real reason, that is, the underlying IO subsystem is not scalable.

How many nodes are supported in a RAC Database?
With 10g Release 2, we support 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database. Currently DBCA has a bug where it will not go beyond 63 instances. There is also a documentation bug for the max-instances parameter. With 10g Release 1 the Maximum is 63. In 9i it is platform specific due to the different clusterware support by vendors. See the platform specific FAQ for 9i.
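The basic read test described in the shared storage answer above comes down to comparing the aggregated throughput with the ideal of single-node throughput times the number of nodes. The following is a minimal sketch of that arithmetic; the function name `scaling_efficiency` and the second set of measurements are hypothetical, and the per-node numbers would come from a tool such as ORION or IOzone.

```python
from typing import List, Tuple


def scaling_efficiency(per_node_mb_s: List[float], single_node_mb_s: float) -> Tuple[float, float]:
    """Return (aggregate MB/sec, fraction of ideal linear scaling).

    per_node_mb_s    -- throughput measured on each node with all streams running
    single_node_mb_s -- throughput measured with one stream on one node
    """
    if not per_node_mb_s or single_node_mb_s <= 0:
        raise ValueError("need at least one node and a positive baseline")
    aggregate = sum(per_node_mb_s)
    ideal = single_node_mb_s * len(per_node_mb_s)
    return aggregate, aggregate / ideal


# The FAQ's example: 4 nodes, each sustaining 20MB/sec.
print(scaling_efficiency([20, 20, 20, 20], 20))

# Hypothetical poorly scaling storage: each added stream slows the others.
print(scaling_efficiency([20, 15, 12, 10], 20))
```

A fraction close to 1.0 corresponds to the "aggregated 80MB/sec or close to that" outcome; a much lower fraction points at the I/O subsystem rather than at RAC or the application.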

How do I measure the bandwidth utilization of my NIC or my interconnect?
One simple/quick and not very recommended way is to look at the output of "ifconfig eth0" and compare the values of "RX bytes" and "TX bytes" over time; this will show _average_ usage per period of time.
A more reliable, interactive way on Linux is to use the iptraf utility or the prebuilt rpms from redhat or Novell (SuSE); another option on Linux is Netperf. On other Unix platforms: "snoop -S -tr -s 64 -d hme0". AIX's topas can show that as well. Try to look for the peak (not average) usage and see if that is acceptably fast.
Remember that NIC bandwidth is measured in Mbps or Gbps (which is BITS per second) and output from the above utilities can sometimes come in BYTES per second, so for comparison, do the proper conversion (divide the bps value by 8 to get bytes/sec, or multiply the bytes value by 8 to get the bps value).
Additionally, you can't expect a network device to run at full capacity with 100% efficiency, due to concurrency, collisions and retransmits that happen more frequently as the utilization gets higher. If you are reaching high levels, consider a faster interconnect or NIC bonding (multiple NICs all servicing the same IP address).
Finally, the above is measuring bandwidth utilization (how much), not latency (how fast) of the interconnect; you may still be suffering from a high latency connection (slow link) even though there is plenty of bandwidth to spare. Most experts agree that low latency is by far more important than high bandwidth with respect to specifications of the private interconnect in RAC. Latency is best measured by the actual user of the network link (RAC in this case); review statspack for stats on latency. Also, in 10gR2 Grid Control you can view Global Cache Block Access Latency, and you can drill down to the Cluster Cache Coherency page to see the cluster cache coherency metrics for the entire cluster database.
Keep in mind that RAC is using the private interconnect like it was never used before, to synchronize memory regions (SGAs) of multiple nodes (remember, since 9i, entire data blocks are shipped across the interconnect). If the network is utilized at 50% bandwidth, this means that 50% of the time it is busy and not available to potential users. In this case delays (due to collisions and concurrency) will increase the latency even though the bandwidth might look "reasonable"; it's hiding the real issue.

Does Database blocksize or tablespace blocksize affect how the data is passed across the interconnect?
Oracle ships database block buffers, i.e. blocks in a tablespace configured for 16K will result in a 16K data buffer shipped, blocks residing in a tablespace with base block size (8K) will be shipped as base blocks and so on; the data buffers are broken down to packets of MTU sizes.
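The byte counter comparison and the bytes-to-bits conversion from the bandwidth utilization answer above can be sketched as follows. The function name `link_utilization` and the sample counter values are hypothetical, the two samples are assumed to be "RX bytes" / "TX bytes" readings taken a known number of seconds apart, and counter wrap-around is ignored. As the FAQ notes, this yields an average for the interval, not the peak.

```python
from typing import Dict


def link_utilization(rx_start: int, rx_end: int, tx_start: int, tx_end: int,
                     interval_seconds: float, link_bits_per_second: float) -> Dict[str, float]:
    """Average receive/transmit rate in bits per second and as a fraction of link speed."""
    if interval_seconds <= 0 or link_bits_per_second <= 0:
        raise ValueError("interval and link speed must be positive")
    # Counters are in BYTES; NIC speed is in BITS per second, so multiply by 8.
    rx_bps = (rx_end - rx_start) * 8 / interval_seconds
    tx_bps = (tx_end - tx_start) * 8 / interval_seconds
    return {
        "rx_bps": rx_bps,
        "tx_bps": tx_bps,
        "rx_utilization": rx_bps / link_bits_per_second,
        "tx_utilization": tx_bps / link_bits_per_second,
    }


# Hypothetical samples taken 60 seconds apart on a 1 Gbps interconnect NIC.
stats = link_utilization(rx_start=1_000_000_000, rx_end=1_750_000_000,
                         tx_start=500_000_000, tx_end=575_000_000,
                         interval_seconds=60, link_bits_per_second=1_000_000_000)
print(stats)
```

Sampling at short intervals and keeping the maximum gets closer to the peak usage the answer recommends looking for.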

What is Runtime Connection Load Balancing?
Runtime connection load balancing enables the connection pool to route incoming work requests to the available database connection that will provide it with the best service. This will provide the best service times globally, and routing responds fast to changing conditions in the system. Oracle has implemented runtime connection load balancing with ODP.NET and JDBC connection pools. Runtime Connection Load Balancing is tightly integrated with the automatic workload balancing features introduced with Oracle Database 10g, I.E. Services, Automatic Workload Repository, and the new Load Balancing Advisory.

How should I deal with space management? Do I need to set free lists and free list groups?
Manually setting free list groups is a complexity that is no longer required. We recommend using Automatic Segment Space Management rather than trying to manage space manually. Unless you are migrating from an earlier database version with OPS and have already built and tuned the necessary structures, Automatic Segment Space Management is the preferred approach. Automatic Segment Space Management is NOT the default; you need to set it.
For more information see: Note: 180608.1 Automatic Space Segment Management in RAC Environments

I was installing Oracle 9i RAC and my Oracle files did not get copied to the remote node(s). What went wrong?
First make sure the cluster is running and is available on all nodes. You should be able to see all nodes when running an 'lsnodes -v' command.
If lsnodes shows that all members of the cluster are available, then you may have an rcp/rsh problem on Unix or shares have not been configured on Windows.
You can test rcp/rsh on Unix by issuing the following from each node:
[node1]/tmp> touch test.tst
[node1]/tmp> rcp test.tst node2:/tmp
[node2]/tmp> touch test.tst
[node2]/tmp> rcp test.tst node1:/tmp
On Windows, ensure that each node has administrative access to all these directories within the Windows environment by running the following at the command prompt:
NET USE \\host_name\C$

Clustercheck.exe also checks for this.

How do I stop the GSD?
If you are on 9.0 on Unix you would issue:
$ ps -ef | grep jre
$ kill -9 <gsd process>
Stop the OracleGSDService on Windows.
Note: Make sure that this is the process in use by GSD
If you are on 9.2 you would issue:
$ gsdctl stop

Does Oracle RAC work with NTP (Network Time Protocol)?
YES! NTP and Oracle RAC are compatible; as a matter of fact, it is recommended to set up NTP in an Oracle RAC cluster, for Oracle 9i Database, Oracle Database 10g, and Oracle Database 11g Release 1.
With Oracle Database 11g Release 2, Oracle Clusterware includes the Cluster Time Synchronization Service (CTSS). On startup, Oracle Clusterware checks for an NTP configuration. If found, CTSS goes into Observer mode. This means it will monitor the clock synchronization and report in the Oracle Clusterware alert log if it finds a problem. If it does not find an NTP configuration, CTSS will be active. In active mode, CTSS synchronizes all the system clocks to the first node in the cluster.
From the Documentation: Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Installation Guide 10g Release 2 (10.2) for Linux B14203-05 page 2-21:
"Node Time Requirements
Before starting the installation, ensure that each member node of the cluster is set as closely as possible to the same date and time. Oracle strongly recommends using the Network Time Protocol feature of most operating systems for this purpose, with all nodes using the same reference Network Time Protocol server."
Each machine has a different clock frequency and as a result a slightly different time drift. NTP computes this time drift about every 15 minutes, and stores this information in a "drift" file. It then adjusts the system clock based on this known drift and also compares it to a given time-server the sys-admin sets up. This is the recommended approach.
Keep the following points in mind:
• Minor changes in time (in the seconds range) are harmless for Oracle RAC and the Oracle Clusterware. If you intend on making large time changes it is best to shut down the instances and the entire Oracle Clusterware stack on that node to avoid a false eviction, especially if you are using the Oracle RAC 10g low-brownout patches, which allow really low misscount settings.
• Backup/recovery aspects of large time changes are documented in Note: 77370.1. Basically you can't use RECOVER DATABASE UNTIL TIME to reach the second recovery point. It is possible to overcome with RECOVER DATABASE UNTIL CANCEL or UNTIL CHANGE. If you are doing complete recovery (most of the time) then this is not an issue since the Oracle recovery code uses SCN (System Change Numbers) to advance in the redo/archive logs. The SCN numbers never go back in time (unless a reset-logs operation is performed), but there is always an association of an SCN to a human readable timestamp (which may change forward or backwards), hence the issue with recovery until point in time vs. until SCN/Cancel.
• If DBMS_SCHEDULER is in use it will be affected by time changes, as it's using the actual clock rather than SCN.
• On platforms with OPROCD get the fix for bug: 5015469 "OPROCD REBOOTS NODE WHEN TIME IS SET BACK BY XNTPD"
• If NTP is not configured correctly (using -x flag), and diagwait is not set to 13 (Note: 559365.1), 10.2/11.1 RAC systems can be rebooted due to OPROCD during a leap second event; see Note: 759143.1.
• Daylight saving time adjustments do not affect the system clock, only the displayed time, hence have no impact on the Oracle software.
Apart from these issues, the Oracle RDBMS server is immune to time changes, i.e. they will not affect transaction/read consistency operations.
The Oracle Clusterware requires the use of the "-x" flag to the ntpd daemon to prevent the clock from going backwards (Enterprise Linux: see /etc/sysconfig/ntpd; Solaris: set "slewalways yes" in /etc/inet/ntp.conf)

If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need Oracle Clusterware to run Oracle RAC 10g or Oracle RAC 11g?
Yes. When certified, you can use Vendor clusterware; however, you must still install and use Oracle Clusterware for Oracle RAC. Best Practice is to leave Oracle Clusterware to manage Oracle RAC. For details see Note: 332257.1 and for Veritas SFRAC see Note: 397460.1.

How is Oracle Enterprise Manager integrated with the Oracle RAC 11g Release 2 stack?
Oracle Enterprise Manager (EM) is available in 2 versions: Oracle EM Grid Control and Oracle EM Database Control. Oracle EM Grid Control follows a different release cycle than the Oracle Database, while a new version of Oracle EM Database Control is available with every new database release.
At the time of writing, Oracle EM Grid Control is available in version 10.2.0.5. This version does not support new features of the Oracle Database 11g Release 2. The Oracle 11g Rel. 2 Database, however, can be managed with Oracle EM in the current version with some restrictions (no 11.2 feature support).
With Oracle Database and Grid Infrastructure 11g Release 2, Oracle EM Database Control is now able to manage the full Oracle RAC 11g Release 2 stack. This includes: Oracle RAC Databases, Oracle Clusterware, and Oracle Automatic Storage Management. The new feature that needs to be noted here is the full management of Oracle Clusterware 11g Release 2 with Oracle EM Database Control 11g Release 2.
For more information and details, see the publicly available Technical White Paper: The New Oracle Enterprise Manager Database Control 11g Release 2 - Now Managing Oracle Clusterware

What storage option should I use for Oracle RAC on Linux? ASM / OCFS / Raw Devices / Block Devices / Ext3 ?
The recommended way to manage large amounts of storage in an Oracle RAC environment is ASM (Automatic Storage Management). If you really need/want a clustered filesystem, then Oracle offers OCFS (Oracle Clustered File System); for 2.4 kernel (RHEL3/SLES8) use OCFS Version 1 and for 2.6 kernel (RHEL4/SLES9) use OCFS2. All these options are free to use and completely supported. ASM is bundled in the RDBMS software, and OCFS as well as ASMLib are freely downloadable from Oracle's OSS (Open Source Software) website.
EXT3 is out of the question, since its data structures are not cluster aware, that is, if you mount an ext3 filesystem from multiple nodes, it will quickly get corrupted.
Another option of course is NFS and iSCSI; both are outside the scope of this FAQ but included for completeness.
If for any reason the above options (ASM/OCFS) are not good enough and you insist on using 'raw devices' or 'block devices', here are the details on the two (This information is still very useful to know in the context of ASM and OCFS).
On Unix/Linux there are two types of devices:
block devices (/dev/sde9) are **BUFFERED** devices!! Unless you explicitly open them in O_DIRECT you will get buffered (linux buffer cache) IO.

it is recommended everyone should switch to using the block devices (meaning.1 root disk 162.1 root disk 8. in the case of a block device.2 (there are several 10.. and "b" for block devices.1 an RDBMS fix added the O_DIRECT flag to the open call (O_DIRECT flag tells the Linux kernel to bypass the Linux buffer cache and write directly to disk). Starting with Oracle 10. block devices on Unix do buffered IO by default (cached in linux buffer cache). or the more recommended way: ASM/ASMLib should be given the block devices directly (eg. NETCA) will not support raw/block devices. "c" stands for character device. that ment that a create datafile on '/dev/sde9' would succeed (need to set filesystemio_options=directio in init. you are recommended to plan a migration . you always get unbufferred IO. since it's just 5 files in most cases. 1 Jan 23 19:53 /dev/raw/raw1 brw-rw---. However.ora). hence no need to specify O_DIRECT on the file open call. This means that both voting disks and OCR files could live on block devices. due to OUI bug 5005148. since the IO's will not be immediately visible to other nodes. Also block devices carry persistent permissions across reboots. Since RAW devices are being phased out of Linux in the long term. and shortly after bug 4309443 was fixed (by adding the O_DIRECT flag on the OCR file open call) meaning that starting with 10. not such a big deal. this means that RAC can not operate on it (unless opened with O_DIRECT). Oracle Clusterware files (OCR and Voting Disk can be store in ASM and this is the Best Practice). However due to the desupport in the next release.character devices (/dev/raw/raw9) are *UN-BUFFERRED** devices!! no matter how you open them. The Oracle Universal Installer and the configuration assistants (IE DBCA. /dev/sde9). It is not possible to ask for a backport of this bug since it means a full re-release of 10g. Above is not a typo. All command line interfaces will support raw/block for this release. 
you can continue to use it and upgrading to 11g Release 2 will not change the location of any files. This enhancement was well received. You can access as many block devices as the system can support. there is still a need to configure raw devices for the voting or OCR files during installation of RAC. You may check if a device is block or character device by the first letter printed with the "ls -l" command: crw-rw---. one alternative if raw devices are not a good option is to use 11g Clusterware (with 10g RAC database). while with raw devices one would have to customize that after installation otherwise the Clusterware stack or database would fail to startup due to permission issues. pass these block devices to ASM or OCFS/2 or Oracle Clusterware) Note: With Oracle Database 11g Release 2. For the voting disk to be opened with O_DIRECT you need fix for bug 4466428 (5021707 is a duplicate).1 backports available) the Oracle OCR file could also access block devices directly. By using block devices you no longer have to live with the limitations of 255 raw devices per node. ASM or ASMlib can be given the raw devices (/dev/raw/raw2) as was done in the initial deployment of 10g Release 1. Therefore if you are using raw/block today. 112 Jan 23 14:51 /dev/sdh Above.

to a supported storage option. All files supported natively in ASM will not be supported in production with the ASM Cluster File System (ACFS).

What are the implications of using srvctl disable for an instance in my Oracle RAC cluster? I want to have it available to start if I need it but at this time do not want to run this extra instance for this database.

During node reboot, any disabled resources will not be started by the Clusterware, therefore this instance will not be restarted. It is recommended that you leave the vip, ons, gsd enabled on that node. For example, the VIP address for this node is present in the address list of database services, so a client connecting to these services will still reach some other database instance providing that service via listener redirection.

Just be aware that by disabling an instance on a node, all that means is that the instance itself is not starting. However, if the database was originally created with 3 instances, that means there are 3 threads of redo. So, while the instance itself is disabled, the redo thread is still enabled, and will occasionally cause log switches. The archived logs for this 'disabled' instance will still be needed in any potential database recovery scenario. So, if you are going to disable the instance through srvctl, you may also want to consider disabling the redo thread for that instance.

srvctl disable instance -d orcl -i orcl2
SQL> alter database disable public thread 2;

Do the reverse to enable the instance.

SQL> alter database enable public thread 2;
srvctl enable instance -d orcl -i orcl2

If using plsql native code, the plsql_native_library_dir needs to be defined. In an Oracle RAC environment, must the directory be in the shared storage?

In an Oracle RAC configuration, this parameter must be set in each instance. The instances are not required to have a shared file system. On each instance the plsql_native_library_dir can be set to point to an instance local directory. Alternately, if the Oracle RAC configuration supports a shared (cluster) file system, you can use a common directory (on the shared file system) for all instances. You can also check out the PL/SQL Native Compilation FAQ on OTN: www.oracle.com/technology/tech/pl_sql/htdocs/ncomp_faq.html

With Oracle RAC 11g Release 2, use ACFS (ASM Cluster file system).

What is the purpose of the gsd service in Oracle 9i RAC?

GSD is only needed for configuration/management of a cluster database. Once the database has been configured and is up, it can be safely stopped provided you don't run any 'srvctl or dbca or dbua' tools. In Oracle 9i RAC, the GSD doesn't write anywhere unless tracing was turned on, in which case traces go to stdout.

Once the database has been configured and started and you don't use 'srvctl or EM' to manage or 'dbca to extend/remove' or 'dbua to upgrade' this database, GSD can be stopped.

Note: With Oracle RAC 11g Release 2, the gsd resource is disabled by default. You will only need to enable the resource if you are running Oracle 9i RAC in the same cluster.

How do I identify which node was used to install the cluster software and/or database software?

You can find out which node by running the olsnodes command. The node which is returned first is the node from which the software was installed and from which patches should be installed.

Note: When applying patches in a rolling fashion, you are recommended to run the rolling scripts from the last node added to the cluster first and follow the list in reverse order.

Are the Oracle Clusterware bundle patches cumulative, do they conflict with one another?

Fix-wise, the Oracle Clusterware bundles are cumulative, that is, CRS bundle #3 fixes all the issues that bundle #2 did, and some additional ones; see Note:405820.1 for the complete list of bugs fixed in each bundle.

However, OPatch does not allow you to apply ANY patch if there are any overlapping libs or binaries between an already existing patch and the to-be-installed patch. If two patches touch a particular file, e.g. kcb.o, then the existing patch must be manually removed before the new one is applied.

So, bundle patches are cumulative, however they do conflict with one another due to the way OPatch allows patch application, hence the previous bundle must be manually removed before a new one is applied.

To check if any two patches conflict, invoke OPatch as per Note:458485.1 or using:

$ OPatch/opatch prereq CheckConflictAmongPatches -phbasefile patchlist

where patchlist is a text file containing all the patch numbers to be checked, separated by a newline.

I have added a second network to my cluster, can I load balance my users across this network?

Server side load balancing will only work on a single network which is configured as the public network with the Oracle VIPs. If you add a second network, with a second listener, do not add this new listener to the local_listener and remote_listener parameters. You can use client-side load balancing and failover for users connecting to this network, however you will be unable to use server-side load balancing or receive FAN events for this network.

Oracle RAC 11g Release 2 adds support for multiple public networks. Each network will have its own service. To enable load balancing use the LISTENER_NETWORKS parameter instead of LOCAL_LISTENER and REMOTE_LISTENER.

Srvctl cannot start instance, I get the following error PRKP-1001 CRS-0215, however sqlplus can start it on both nodes? What is the problem?

This could be many things but a popular issue is when you have a separate ASM Home and the listener is running out of this home (it was the first home installed). Srvctl needs a TNS_ADMIN alias to the network/admin directory in this home instead of using the default ORACLE_HOME/network/admin for the database you are trying to start. For srvctl to work you must run

srvctl setenv nodeapps -n node -T TNS_ADMIN=full path

on each node in the cluster. You cannot rely on a TNS_ADMIN environment variable. See Note 420977.1

Another cause is a non-existent spfile, see Note 732683.1

When I look at ALL_SERVICES view in my database I see services I did not create, what are they for?

You will always see a default database service that has the same name as your database. This service is available on all instances in the cluster. Connections will be load balanced across the instances. You will also see two services used by the database: SYS$BACKGROUND (for background processes) and SYS$USERS (users who connect via BEQ or without using a service_name). You may also see services that end with XDB, which are created for the XML DB feature; you will not be able to manage these services.

I have 2 clusters named "crs" (the default), how do I get Grid Control to recognize them as targets?

There are 2 options:
a) if the grid control agent install (which is a separate install) has already been done and has picked up the name of the cluster as it was configured as CRS, one can go to the EM console as

is, and for the second, manually delete and rediscover the target. When you rediscover the target, give whatever display name you like
b) Prior to performing the Grid Control agent install, just set the CLUSTER_NAME environment variable and run the install. This variable needs to be set only for that install session. No need to set it every time the agent starts.

I found in 10.2 that the EM "Convert to Cluster Database" wizard would always fall over on the last step where it runs emca and needs to log into the new cluster database as dbsnmp to create the cluster database targets etc. I changed the password for the dbsnmp account to be dbsnmp (same as username) and it worked OK. Is this a known issue?

The conversion to cluster happens successfully but the EM monitoring credentials for the converted database are not properly set due to this bug. This is resolved in the next patchset. In the interim, the user can set the monitoring password from the "monitoring configuration" screen for the RAC DB from the GC console and proceed. For now you can proceed with setting the password for the dbsnmp user to the same as that of the sys user. This issue has been fixed in the 10.2.0.3 database, and to get the complete functionality you will need the 10.2.0.2 Grid Control patch also, as the fix is spread between the two pieces of software.

How do I determine whether or not a OneOff patch is "rolling upgradeable"?

After you have downloaded a patch, you can go into the directory where you unpacked the patch:

> pwd
/ora/install/4933522

Then use the following OPatch command:

> opatch query -is_rolling
...
Query ...
Please enter the patch location: /ora/install/4933522
---------- Query starts ------------------
Patch ID: 4933522
....
Rolling Patch: True.
---------- Query ends -------------------

What is the Cluster Verification Utility (cluvfy)?

The Cluster Verification Utility (CVU) is a validation tool that you can use to check all the important components that need to be verified at different stages of deployment in a RAC

environment. The wide domain of deployment of CVU ranges from initial hardware setup through a fully operational cluster for RAC deployment and covers all the intermediate stages of installation and configuration of various components. Cluvfy does not take any corrective action following the failure of a verification task, does not enter into areas of performance tuning or monitoring, does not perform any cluster or RAC operation, and does not attempt to verify the internals of the cluster database or cluster elements.

What versions of the database can I use the cluster verification utility (cluvfy) with?

The cluster verification utility is released with Oracle Database 10g Release 2 but can also be used with Oracle Database 10g Release 1.

How many nodes can be had in an HP-UX/Solaris/AIX/Windows/Linux cluster?

The number of nodes supported is not limited by Oracle, but more generally by the clustering software/hardware in question.

When using solely Oracle Clusterware: 63 nodes (Oracle 9i or Oracle RAC 10g Release 1). With 10g Release 2, the maximum number of nodes is 100.

When using a third party clusterware:
Sun: 8
HP UX: 16
HP Tru64: 8
IBM AIX:
* 8 nodes for Physical Shared (CLVM) SSA disk
* 16 nodes for Physical Shared (CLVM) non-SSA disk
* 128 nodes for Virtual Shared Disk (VSD)
* 128 nodes for GPFS
* Subject to storage subsystem limitations
Veritas: 8-16 nodes (check w/ Veritas)

For 3rd party vendor clusterware, please check with the vendor.

Is crossover cable supported as an interconnect with RAC on any platform?

NO. CROSS OVER CABLES ARE NOT SUPPORTED. The requirement is to use a switch.

Detailed Reasons:
1) cross-cabling limits the expansion of RAC to two nodes
2) cross-cabling is unstable:
a) Some NIC cards do not work properly with it. They are not able to negotiate the DTE/DCE

clocking, and will thus not function. These NICs were made cheaper by assuming that the switch was going to have the clock. Unfortunately there is no way to know which NICs do not have that clock.
b) Media sense behaviour on various OS's (most notably Windows) will bring a NIC down when a cable is disconnected.
Either of these issues can lead to cluster instability and lead to ORA-29740 errors (node evictions).

Due to the benefits and stability provided by a switch, and their affordability ($200 for a simple 16 port GigE switch), and the expense and time related to dealing with issues when one does not exist, this is the only supported configuration.

From a purely technology point of view Oracle does not care if the customer uses cross over cable or router or switches to deliver a message. However, we know from experience that a lot of adapters misbehave when used in a crossover configuration and cause a lot of problems for RAC. Hence we have stated on Certify that we do not support crossover cables, to avoid false bugs and finger pointing amongst the various parties: Oracle, hardware vendors, OS vendors etc.

Is it possible to run Oracle RAC on logical partitions (i.e. LPARs) or virtual separate servers?

Yes, it is possible. Check Certify to understand the current details for the different hardware solutions.

High end servers can be partitioned into domains (partitions) of smaller sizes, each domain with its own CPU(s) and operating system. Each domain is effectively a virtual server. Oracle RAC can be run on a cluster comprised of domains. The benefits of using this are similar to a regular cluster: any domain failure will have little effect on other domains. Besides, the management of the cluster may be easier since there is only one physical server.

Note, however, that one E10K is still just one server, so there are single points of failure. Any failures, such as back plane failure, that crumble the entire server will shut down the virtual cluster. That is the tradeoff users have to make in how best to build a cluster database.

How do I check Oracle RAC certification?

See the following Metalink note: Note 184875.1

Please note that certifications for Oracle Real Application Clusters are performed against the Operating System and Clusterware versions. The corresponding system hardware is offered by System vendors and specialized Technology vendors. Some system vendors offer pre-installed, pre-configured Oracle RAC clusters. These are included below under the corresponding OS platform selection within the certification matrix.

What is Oracle's position with respect to supporting RAC on Polyserve CFS?

Please check the certification matrix available through Metalink for your specific release.

Is Veritas Storage Foundation supported with Oracle RAC?

Veritas certifies Veritas Storage Foundation for Oracle RAC with each release. Check Certify and the Veritas Support Matrix for the latest details.

Can the Oracle Database Configuration Assistant (DBCA) be used to create a database with Veritas DBE / AC 3.5?

DBCA can be used to create databases on raw devices in 9i RAC Release 1 and 9i Release 2. Standard database creation scripts using SQL commands will work with file system and raw.

DBCA cannot be used to create databases on file systems on Oracle 9i Release 1. The user can choose to set up a database on raw devices, and have DBCA output a script. The script can then be modified to use cluster file systems instead.

With Oracle 9i RAC Release 2 (Oracle 9.2), DBCA can be used to create databases on a cluster filesystem. If the ORACLE_HOME is stored on the cluster filesystem, the tool will work directly. If ORACLE_HOME is on local drives on each system, and the customer wishes to place database files onto a cluster file system, they must invoke DBCA as follows:

dbca -datafileDestination /oradata

where /oradata is on the CFS filesystem. See the 9iR2 README and bug 2300874 for more info.

Is Oracle Database on VMware supported? Is Oracle RAC on VMware supported?

Oracle Database support on VMware is outlined in Metalink Note 249212.1. Effectively, for most customers, this means they are not willing to run production Oracle databases on VMware. Regarding Oracle RAC - No, we do not support Oracle Software including Oracle RAC on VMWare. Aside from the support restrictions for the database on VMWare outlined in Metalink Note 249212.1, there are technical restrictions that prevent the certification of Oracle RAC in a VMWare environment.

After installing patchset 9013 and patch_2313680 on Linux, the startup was very slow

Please carefully read the following new information about configuring Oracle Cluster Management on Linux, provided as part of the patch README:

Three parameters affect the startup time:
soft_margin (defined at watchdog module load)
-m (watchdogd startup option)
WatchdogMarginWait (defined in nmcfg.ora).

WatchdogMarginWait is calculated using the formula:

WatchdogMarginWait = soft_margin(msec) + -m + 5000(msec).

[5000(msec) is hardcoded]

Note that soft_margin is measured in seconds; -m and WatchdogMarginWait are measured in milliseconds.

Based on benchmarking, it is recommended to set soft_margin between 10 and 20 seconds. Use the same value for -m (converted to milliseconds) as used for soft_margin. Here is an example:

soft_margin=10 -m=10000 WatchdogMarginWait = 10000+10000+5000=25000

If CPU utilization in your system is high and you experience unexpected node reboots, check the wdd.log file. If there are any 'ping came too late' messages, increase the value of the above parameters.

Is there a cluster file system (CFS) available for Linux?

Yes, ACFS (ASM Cluster File System with Oracle Database 11g Release 2) and OCFS (Oracle Cluster Filesystem) are available for Linux. The following Metalink note has information for obtaining the latest version of OCFS:
Note 238278.1 - How to find the current OCFS version for Linux

Oracle Clusterware fails to start after a reboot due to permissions on raw devices reverting to default values. How do I fix this?

After a successful installation of Oracle Clusterware, a simple reboot and Oracle Clusterware fails to start. This is because the permissions on the raw devices for the OCR and voting disks, e.g. /dev/raw/raw{x}, revert to their default values (root:disk) and are inaccessible to Oracle.

This change of behavior started with the 2.6 kernel: in RHEL4, OEL4, RHEL5, OEL5, SLES9 and SLES10. In RHEL3 the raw devices maintained their permissions across reboots so this symptom was not seen.

The way to fix this on RHEL4, OEL4 and SLES9 is to create /etc/udev/permissions.d/40-udev.permissions (you must choose a number that's lower than 50). You can do this by copying /etc/udev/permissions.d/50-udev.permissions, and removing the lines that are not needed (50-udev.permissions gets replaced with upgrades so you do not want to edit it directly; also a typo in 50-udev.permissions can render the system non-usable). Example permissions file:

# raw devices
raw/raw[1-2]:root:oinstall:0640
raw/raw[3-5]:oracle:oinstall:0660

Note that this applies to all raw device files; here just the voting and OCR devices were specified.
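The WatchdogMarginWait formula from the patch README quoted earlier in this answer mixes seconds and milliseconds, which is easy to get wrong. The following is a minimal Python sketch of the arithmetic only; the function names are hypothetical and are not part of any Oracle tool.

```python
# Sketch of the README arithmetic: soft_margin is given in seconds,
# -m and WatchdogMarginWait are in milliseconds, and 5000 ms is hardcoded.
# Function names are illustrative assumptions, not Oracle APIs.

HARDCODED_MS = 5000


def recommended_m_ms(soft_margin_s):
    """Return the -m value (ms) matching soft_margin (s), as the README advises."""
    return soft_margin_s * 1000


def watchdog_margin_wait_ms(soft_margin_s, m_ms):
    """WatchdogMarginWait = soft_margin(msec) + -m + 5000(msec)."""
    return soft_margin_s * 1000 + m_ms + HARDCODED_MS


if __name__ == "__main__":
    soft_margin = 10  # seconds
    m = recommended_m_ms(soft_margin)
    print("soft_margin=%d -m=%d WatchdogMarginWait=%d"
          % (soft_margin, m, watchdog_margin_wait_ms(soft_margin, m)))
```

With soft_margin=10 this reproduces the README example (10000+10000+5000=25000).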

On RHEL5, OEL5 and SLES10 a different file is used, /etc/udev/rules.d/99-raw.rules; notice that now the number must be (any number) higher than 50. Also the syntax of the rules is different than the permissions file, here's an example:

KERNEL=="raw[1-2]*", GROUP="oinstall", MODE="640"
KERNEL=="raw[3-5]*", OWNER="oracle", GROUP="oinstall", MODE="660"

This is explained in detail in Note: 414897.1.

How do I configure raw devices in order to install Oracle Clusterware 10g on RHEL5 or OEL5?

The raw devices OS support scripts like /etc/sysconfig/rawdevices are not shipped on RHEL5 or OEL5; this is because raw devices are being deprecated on Linux. This means that in order to install Oracle Clusterware 10g you'd have to manually bind the raw devices to the block devices for the OCR and voting disks so that the 10g installer will proceed without error. Refer to Note 465001.1 for exact details on how to do the above.

Oracle Clusterware 11g doesn't require this configuration since the installer can handle block devices directly.

Can different releases of Oracle RAC be installed and run on the same physical Linux cluster?

Yes - However Oracle Clusterware (CRS) will not support an Oracle 9i RAC database so you will have to leave the current configuration in place. You can install Oracle Clusterware and Oracle RAC 10g or 11g into the same cluster. On Windows and Linux, you must run the 9i Cluster Manager for the 9i Database and the Oracle Clusterware for the 10g Database. When you install Oracle Clusterware, your 9i srvconfig file will be converted to the OCR. Oracle 9i RAC, Oracle RAC 10g, and Oracle RAC 11g will use the OCR. Do not restart the 9i gsd after you have installed Oracle Clusterware. Remember to check Certify for details of what vendor clusterware can be run with Oracle Clusterware.

Oracle Clusterware must be the highest level (down to the patchset). I.e. Oracle Clusterware 11g Release 2 will support Oracle RAC 10g and Oracle RAC 11g databases. Oracle Clusterware 10g can only support Oracle RAC 10g databases.

Is 3rd Party Clusterware supported on Linux such as Veritas or Redhat?

No, Oracle RAC 10g and Oracle RAC 11g do not support 3rd Party clusterware on Linux. This means that if a cluster file system requires a 3rd party clusterware, the cluster file system is not supported.
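To make the syntax difference between the older permissions file and the udev rules file (both shown in the raw device answer above) more concrete, here is a small Python sketch that rewrites a permissions-file entry as a rules-file line. It is an illustration only: the function name is hypothetical, and unlike the example in this FAQ it always writes OWNER explicitly, even for root. Note 414897.1 remains the reference for the real procedure.

```python
# Illustrative converter (hypothetical helper, not an Oracle or udev tool):
#   permissions file entry:  raw/raw[3-5]:oracle:oinstall:0660
#   udev rules file line:    KERNEL=="raw[3-5]*", OWNER="oracle", GROUP="oinstall", MODE="660"


def permissions_entry_to_udev_rule(entry):
    """Convert 'device:owner:group:mode' into a udev rule line."""
    device, owner, group, mode = entry.strip().split(":")
    kernel = device.rsplit("/", 1)[-1]  # "raw/raw[3-5]" -> "raw[3-5]"
    mode = mode.lstrip("0") or "0"      # "0660" -> "660", as in the FAQ example
    return 'KERNEL=="%s*", OWNER="%s", GROUP="%s", MODE="%s"' % (
        kernel, owner, group, mode)


if __name__ == "__main__":
    for line in ("raw/raw[1-2]:root:oinstall:0640",
                 "raw/raw[3-5]:oracle:oinstall:0660"):
        print(permissions_entry_to_udev_rule(line))
```

The output is meant to be reviewed by hand before it is placed in a rules file; a typo in udev configuration can leave devices inaccessible.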

A customer installed 10g Release 2 on Linux RH4 Update 2, 2.6.9-22.ELsmp #1 SMP x86_64 GNU/Linux, and got the error Error in invoking target 'all_no_orcl'. Customer ignored the error and the install succeeded without any other errors and oracle apparently worked fine. What should they do?

Because of compatibility with their storage array (EMC DMX with Powerpath 4.5) they must use update 2. The Oracle install guide states that RH4 64 bits update 1 "or higher" should be used for 10g R2.

The binutils patch binutils-2.15.92.0.2-13.0.0.0.2.x86_64.rpm is needed to relink without error. Red Hat is aware of the bug. Customers should use the latest update (or at least update 3 to fix).

Is OCFS2 certified with Oracle RAC 10g?

Yes. See Certify to find out which platforms are currently certified.

How do I configure my RAC Cluster to use the RDS Infiniband?

The configuration takes place below Oracle. You need to talk to your Infiniband vendor. Check Certify for what is currently available as this will change as vendors adopt the technology. The database must be at least 10.2.0.3. If you want to switch a database running with IP over IB, you will need to relink Oracle.

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle

You can check your interconnect through the alert log at startup. Check for the string "cluster interconnect IPC version:Oracle RDS/IP (generic)" in the alert.log file.

See Note: 751343.1 for more details.

Is the hangcheck timer still needed with Oracle RAC 10g and 11g?

YES! The hangcheck-timer module monitors the Linux kernel for extended operating system hangs that could affect the reliability of the RAC node (I/O fencing) and cause database corruption. To verify the hangcheck-timer module is running on every node, as root user:

/sbin/lsmod | grep hangcheck

If the hangcheck-timer module is not listed, enter the following command as the root user:

9i: /sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180 hangcheck_reboot=1
10g & 11g: /sbin/insmod hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1
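The check-then-load logic for the hangcheck-timer module can be sketched in Python as follows. This is a hedged illustration rather than a supported script: the helper names and the dictionary keys are hypothetical, the lsmod text in the usage lines is made-up sample input, and the parameter values are simply the ones quoted in this FAQ.

```python
# Sketch only: decide from lsmod output whether hangcheck-timer is loaded and,
# if not, build the insmod command quoted in this FAQ for the given release.

HANGCHECK_PARAMS = {
    "9i": {"hangcheck_tick": 30, "hangcheck_margin": 180, "hangcheck_reboot": 1},
    "10g_11g": {"hangcheck_tick": 1, "hangcheck_margin": 10, "hangcheck_reboot": 1},
}


def hangcheck_loaded(lsmod_output):
    """Return True if any module name in the lsmod listing starts with 'hangcheck'."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("hangcheck"):
            return True
    return False


def insmod_command(release):
    """Build the insmod command line for '9i' or '10g_11g' (hypothetical keys)."""
    params = HANGCHECK_PARAMS[release]
    args = " ".join("%s=%d" % (name, value) for name, value in params.items())
    return "/sbin/insmod hangcheck-timer " + args


if __name__ == "__main__":
    sample = "Module                  Size  Used by\nocfs2                 380000  1\n"
    if not hangcheck_loaded(sample):
        print(insmod_command("10g_11g"))
```

The sketch only prints the command; running it (as root) is left to the administrator.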

To ensure the module is loaded every time the system reboots, verify that the local system startup file (/etc/rc.d/rc.local) contains the command above. For additional information please review the Oracle RAC Install and Configuration Guide (5-41) and note:726833.1.

Customer did not load the hangcheck-timer before installing RAC, can the customer just load the hangcheck-timer?

YES. The hangcheck timer is a kernel module that is shipped with the Linux kernel; all you have to do is load it as follows:

9i: /sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180 hangcheck_reboot=1
10g & 11g: /sbin/insmod hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1

No need to reboot the nodes. For more details see note:726833.1

How to reorder or rename logical network interface (NIC) names in Linux

Although this is rarely needed, since most hardware will detect the cards in the correct order on all nodes, if you still need to change/control the ordering, see external website, here is more help on writing UDEV rules.

Are Red Hat GFS and GULM certified for DLM?

Both are part of Red Hat RHEL 5. For Oracle Database 10g Release 2 on Linux x86 and Linux x86-64, it is certified on OEL5 and RHEL5 as per Certify. GFS is not certified yet, certification in progress by RedHat. OCFS2 is certified and it's the preferred choice for Oracle. ASM is the recommended storage for the database. Since GFS is part of the RHEL5 distribution and Oracle fully supports RHEL under the Unbreakable Linux Program, Oracle will support GFS as part of RHEL5 for customers buying the Unbreakable Linux Support. This only applies to RHEL5 and not to RHEL4, where GFS is distributed with an additional fee.

My customer is about to install 10202 clusterware on new Linux machines. He is getting a "No ORACM running" error when he runs rootpre.sh and it exited. Should he worry about this message?

It is an informational message. Generally for such scripts, you can issue echo "$?" to ensure that it returns a zero value. The message is basically saying it did not find an oracm. If the customer were installing 10g on an existing 9i cluster (which will have oracm) then this message

would have been serious. But since the customer is installing this on a fresh new box, they can continue the install.

How to configure bonding on Suse SLES8.

Please see note:291958.1

How to configure bonding on Suse SLES9.

Please see note:291962.1

Does Oracle Support Oracle RAC with Solaris 10 Containers (aka Zones)?

No, Oracle RAC is currently not supported with Solaris 10 Local Containers. You can use a Global container but remember 1 global container per system or per domain. So, in case your hardware is capable of being split up in domains, you may have more than 1 global container on the whole system (hardware), that is per domain.

In local containers, you cannot manipulate hardware in any way, shape or form. You can't plumb and unplumb network interfaces ... nothing ... even as the local container root user. You can only do this in the global container. CRS has to maintain the ability to manipulate hardware and this just is not going to happen in a local container. We rely on the uadmin command to quickly bring down a node if an urgent condition is detected. As I recall, you can't do this from the local container either.

The answer is the same if you are using Vendor Clusterware such as Veritas SF RAC or Sun Cluster.

In Solaris 10, do we need Sun Cluster to provide redundancy for the interconnect and multiple switches?

Link Aggregation (GLDv3) is bundled in the OS as of Solaris 10. IPMP is available for Solaris 10 and Solaris 9. Neither requires Sun Cluster to be installed.

For the interconnect and switch redundancy, as a best practice, avoid VLAN trunking across the switches. We can configure stand-alone redundant switches that do not require the VLAN to be trunked between them, nor the need for an inter-switch link (ISL). If the interconnect VLAN is trunked with other VLANs between the redundant switches, ensure that the interconnect VLAN is pruned from the trunk to avoid unnecessary traffic propagation through the corporate network.

For ease of configuration (e.g. fewer IP address requirements), use IPMP with link mode failure detection in primary/standby configuration. This will give you a single failover IP which you will define in the cluster_interconnects init.ora parameter. Remove any interfaces for the interconnect from the OCR using `oifcfg delif`. AND TEST THIS RIGOROUSLY.

For now, as Link Aggregation

(GLDv3) cannot span multiple switches from a single host, you will need to configure the switch redundancy and the host NICs with IPMP. When configuring IPMP for the interconnect with multiple switches available, configure IPMP as active/standby and *not* active/active. This is to avoid potential latencies in switch failure detection/failover which may impact the availability of the rdbms. Note, IPMP spreads/load balances outbound packets on the bonded interfaces, but inbound packets are received on a single interface. In an active/active configuration this makes send/receive problems difficult to diagnose.

Both Link Aggregation (GLDv3) and IPMP are core OS packages, SUNWcsu and SUNWcsr respectively, and do not require Sun Clusterware.

Does Sun Solaris have a multipathing solution?

Sun Solaris includes an inherent Multipathing tool: MPXIO - this is part of Solaris. You need to have the SanFoundation Kit installed (newest version). Please be aware that the machines are installed following the EIS-standard. This is a quality assurance standard introduced by Sun that mainly takes care that you always have the newest patches.

MPXIO is free of charge and comes with Solaris 8, 9, 10. BTW, if you have a Sun LVM, it would use this feature indirectly. Therefore, Sun confirmed that MPXIO will work with RAWs.

Can I configure IPMP in Active/Active to increase bandwidth of my interconnect?

For IPMP active/active configurations please follow the Sun doc instructions http://docs.sun.com/app/docs/doc/816-4554/6maoq027i?a=view

IPMP active/active is known to load balance on transmit but serialize on a single interface for receive. So you are likely not to get the throughput you might have expected. Unless you experience explicit bandwidth limitations that require active/active, it is a recommended best practice to configure for maximum availability, as described in webiv note 283107.1.

Please note too that debugging active/active interfaces at the network layer is cumbersome and time consuming. In an active/active configuration, if the switch side link fails, you are likely to lose both interconnect connections, whereas with active/standby you would fail over.

Can I configure HP's Autoport aggregation for NIC Bonding after the install? (i.e. not present beforehand)

You are able to add NIC bonding after the installation although this is more complicated than the other way round. There are several notes on webiv regarding this.

Note 276434.1 Modifying the VIP of a Cluster Node

Regarding the private interconnect, please use oifcfg delif / setif to modify this.

Is HMP supported with Oracle RAC 10g or Oracle RAC 11g on all HP platforms?

No. HP has desupported HMP with Oracle RAC 10g. See http://docs.hp.com/en/B625790056/ch01s01.html?jumpid=reg_R1002_USEN

Does the Oracle Cluster File System (OCFS) support network access through NFS or Windows Network Shares?

No, in the current release the Oracle Cluster File System (OCFS) is not supported for use by network access approaches like NFS or Windows Network Shares.

Can I run my Oracle 9i RAC and Oracle RAC 10g on the same Windows cluster?

Yes, but the Oracle 9i RAC database must have the 9i Cluster Manager and you must run Oracle Clusterware for the Oracle Database 10g. 9i Cluster Manager can coexist with Oracle Clusterware 10g.

Be sure to use the same 'cluster name' in the appropriate OUI field for both 9i and 10g when you install both together in the same cluster.

The OracleCMService9i service will remain intact during the Oracle Clusterware 10g install; as an Oracle 9i RAC database requires the 9i OracleCMService9i, it should be left running. The information for the 9i database will get migrated to the OCR during the Oracle Clusterware installation. Then, for future database management, you would use the 9i srvctl to manage the 9i database, and the 10g srvctl to manage any new 10g databases. Both srvctl commands will use the OCR. The same applies for Oracle RAC 11g.

When running Oracle RAC on Windows 2003, what is the recommended OS level?

It is strongly recommended to be at SP2 on Windows 2003. For details see the following Notes:
Note: 464683.1 - Unexplained Database Slowdown Seen on Windows 2003 Service Pack 1
Note: 454607.1 - New Partitions in Windows 2003 RAC Environments Not Visible on Remote Nodes

For details on Windows Bundle patches see:
Note: 342443.1 - 10.2.0.x Oracle Database and Networking Patches for Microsoft Platforms

When using MS VSS on Windows with Oracle RAC, do I need to run the VSS on each node where I have an Oracle RAC instance?

There is no need to run an Oracle VSS writer instance on each Oracle RAC node (even though it is installed and enabled by default on all nodes). And the documentation in Windows Platform Doc

for Oracle VSS writer is applicable to Oracle RAC also. The ability of the clustered file system to create a Windows Shadow copy is a MUST to back up an Oracle RAC database using Oracle VSS writer. The only other requirement is that all the archived logs generated by the database must be accessible on the node where the backup is initiated using Oracle VSS writer.

VSS coordinates the storage snapshot of db files - the VSS writer places the db in hot backup mode so that the VSS provider can initiate the snapshot. So, RMAN is not backing up anything in this case. When a VSS restore of a db is issued, the writer automatically invokes RMAN to perform needed recovery actions after the snapshot is restored by the provider - that is the real value add of the writer.

What do I do when I get an ORA-01031 error logging into the ASM instance?

This sounds like the ORA_DBA group on Node2 is empty, or else does not have the correct username in it. Double-check what user account you are using to log on to Node2 as (a 'set' command will show you the USERNAME and USERDOMAIN values) and then make sure that this account is part of ORA_DBA.

The other issue to check is that SQLNET.AUTHENTICATION_SERVICES=(NTS) is set in the SQLNET.ORA

The OracleCRService does not start with my windows Oracle RAC implementation, what do I do?

This could be caused by a few different things. It could be caused by a change to auto-negotiate instead of 100/full on the interconnect. Once set back to 100/full on all NICs as well as the network switch associated with the interconnect, the problem is resolved. This could also be:
- inability to access the shared disk housing your OCR
- permissions issue OR
- Bug:4537790 which introduced OPMD to begin with - which for reference sake was logged against 9.2.0.8 and is still relevant today in 10.2.0.3 times.

For OPMD, see Metalink Note 358156.1

If OracleCRService doesn't start, that's quite a different issue than, say, OracleCSService not starting - because due to dependencies, this is the last of the three Oracle Clusterware services that we expect to start.

How do I verify that Host Bus Adapter Node Local Caching has been disabled for the disks I will be using in my RAC cluster?

Disabling write caching is a standard practice when the volume managers/file systems are shared. Go to My computer -> Manage -> Storage -> Disk Management -> Disk Properties -> Policies and uncheck "Enable Write Caching on Disk". This will disable the write caching.

3rd party HBAs may have their own management tools to modify these settings. Just remember

that centralized, shared cache is generally OK. It's the node local cache that you need to turn off. How exactly you do this will vary from HBA vendor to HBA vendor.

My customer wants to understand what type of disk caching they can use with their Windows RAC Cluster, the install guide tells them to disable disk caching?
If the write cache identified is local to the node then that is bad for RAC. If the cache is visible to all nodes as a 'single cache', typically in the storage array, and is also 'battery backed' then that is OK.

My customer has a failsafe cluster installed, what are the benefits of moving their system to RAC?
Fail Safe development is continuing. Most work on the product will be around accommodating changes in the supported resources (new releases of RDBMS, AS, etc.) and the underlying Microsoft Cluster Services and Windows operating system.
A Fail Safe protected instance is an Active/Passive instance and, as such, does not benefit that much at all from adding more nodes to a cluster. Microsoft have a limit of nodes in an MSCS cluster (typically 8 nodes, but it does vary). RAC is active/active, so you get the dual benefits of increased scalability and availability every time you add a node to a cluster. We have a limit of 100 nodes in a RAC cluster (we don't use MSCS).
Your customer should really consider more than 2 nodes (because of aggregate computer power on node failure). If the choice is 2 nodes of 4 CPUs or 4 nodes of 2 CPUs, then I would go for the 2-CPU nodes.
Customers are using both Windows Itanium RAC and Windows X64 RAC. Windows X64 seems more popular. Keep in mind, though, that for Fail Safe, if the server is 64-Bit, regardless of flavor, Fail Safe Manager must be installed on a 32-Bit client, which will complicate things just a bit. There is no such restriction for RAC, as all management for RAC can be done via Grid Control or Database Control.
For EE RAC you can implement an 'extended cluster' where there is a distance between the nodes in the cluster (usually less than 20 KM).

Is VIO supported with RAC on IBM AIX?
VIO is supported on IBM AIX. Please check ** Certify Unix RAC Technology Matrix ** for the details.

Is HACMP needed for RAC on AIX 5.2 using GPFS file system?
The newest version of GPFS can be used without HACMP. If it is available for AIX 5.2 then you do not need HACMP.

Do I need HACMP/GPFS to store my OCR/Voting file on a shared device?
The prerequisites doc for AIX clearly says: "If you are not using HACMP, you must use a GPFS file system to store the Oracle CRS files" ==> this is a documentation bug and this will be fixed with 10.1.0.3
-----
On AIX it is important to set reserve_lock=no / reserve_policy=no_reserve in order to allow AIX to access the devices from more than one node simultaneously. Use the /dev/rhdisk devices (character special) for the crs and voting disk and change the attribute with the command
chdev -l hdiskn -a reserve_lock=no
(for ESS, EMC, HDS, CLARiiON, and MPIO-capable devices you have to do a chdev -l hdiskn -a reserve_policy=no_reserve)

Can I run Oracle RAC 10g on my IBM Mainframe Sysplex environment (z/OS)?
YES! There is no separate documentation for RAC on z/OS. What you would call "clusterware" is built in to the OS and the native file systems are global. IBM z/OS documentation explains how to set up a Sysplex Cluster; once the customer has done that it is trivial to set up a RAC database. The few steps involved are covered in Chapter 14 of the Oracle for z/OS System Admin Guide, which you can read here. There is also an Install Guide for Oracle on z/OS (here),

but I don't think there are any RAC-specific steps in the installation. By the way, RAC on z/OS does not use Oracle's clusterware (CSS/CRS/OCR).

Can I use Oracle Clusterware for failover of the SAP Enqueue and VIP services when running SAP in a RAC environment?
Oracle has created sapctl to do this and it is available for certain platforms. SAPCTL will be available for download on SAP Services Marketplace on AIX and Linux. For Solaris, it will not be available in 2007; use Veritas or Sun Cluster.

How do I gather all relevant Oracle and OS log/trace files in an Oracle RAC cluster to provide to Support?
Use RAC-DDT (RAC Diagnostic Data Tool). The User Guide is in Note: 301138.1. Quote from the User Guide:
RACDDT is a data collection tool designed and configured specifically for gathering diagnostic data related to Oracle's Real Application Cluster (RAC) technology. RACDDT is a set of scripts and configuration files that is run on one or more nodes of an Oracle RAC cluster. The main script is written in Perl, while a number of proxy scripts are written using Korn shell. RACDDT will run on all supported Unix and Linux platforms, but is not supported on any Windows platforms.
Newer versions of RDA (Remote Diagnostic Agent) have the RAC-DDT functionality, so going forward RDA is the tool of choice. The RDA User Guide is in Note: 314422.1

What are the cdmp directories in the background_dump_dest used for?
These directories are produced by the diagnosability daemon process (DIAG). DIAG is a database process which, as one of its tasks, performs cache dumping. The DIAG process dumps out tracing to file when it discovers the death of an essential process (foreground or background) in the local instance. A dump directory named something like cdmp_ is created in the bdump or background_dump_dest directory, and all the trace dump files DIAG creates are placed in this directory.

Are Oracle Applications certified with RAC?
For Siebel, PeopleSoft see http://realworld.us.oracle.com/isv/siebel.htm
Oracle 9i RAC (9.2) and Oracle RAC 10g (10.1) are certified with Oracle Applications E-Business Suite. See Note: 285267.1 for details.

What is the optimal migration path to be used while migrating the E-Business suite to Oracle RAC?
Following is the recommended and most optimal path to migrate your E-Business suite to an Oracle RAC environment:
1. Migrate the existing application to new hardware. (If applicable).
2. Use Clustered File System (ASM recommended) for all database files or migrate all database files to raw devices. (Use dd for Unix or ocopy for NT)
3. Install/upgrade to the latest available e-Business suite.
4. Ensure the database version is supported with Oracle RAC
5. In step 4, install Oracle RAC option and use Installer to perform install for all the nodes.
6. Clone Oracle Application code tree.
Reference Documents:
Oracle E-Business Suite Release 11i with 9i RAC: Installation and Configuration : Note: 279956.1
E-Business Suite 11i on RAC : Configuring Database Load balancing & Failover: Note: 294652.1
Oracle E-Business Suite 11i and Database - FAQ : Note: 285267.1

Is the Oracle E-Business Suite (Oracle Applications) certified against RAC?
Yes. (There is no separate certification required for RAC.)

Can I use TAF with e-Business in a RAC environment?
TAF itself does not work with e-Business suite due to Forms/TAF limitations, but you can configure the tns failover clause. On instance failure, when the user logs back into the system, their session will be directed to a surviving instance, and the user will be taken to the navigator tab. Their committed work will be available; any uncommitted work must be re-started.
We also recommend you configure the forms error URL to identify a fallback middle tier server for Forms processes, if no router is available to accomplish switching across servers.

How to configure concurrent manager in a RAC environment?
Large clients commonly put the concurrent manager on a separate server now (in the middle tier) to reduce the load on the database server. The concurrent manager programs can be tied to a specific middle tier (e.g., you can have CMs running on more than one middle tier box). It is advisable to use specialized CMs. CM middle tiers are set up to point to the appropriate database instance based on the product module being used.

Should functional partitioning be used with Oracle Applications?
We do not recommend functional partitioning unless throughput on your server architecture demands it. Cache fusion has been optimized to scale well with non-partitioned workload.
If your processing requirements are extreme and your testing proves you must partition your workload in order to reduce internode communications, you can use Profile Options to designate that sessions for certain applications Responsibilities are created on a specific middle tier server. That middle tier server would then be configured to connect to a specific database instance.
To determine the correct partitioning for your installation you would need to consider several factors like number of concurrent users, batch users, modules used, workload characteristics etc.

Which e-Business version is preferable?
Versions 11.5.5 onwards are certified with Oracle9i and hence with Oracle9i RAC. However we recommend the latest available version.

Can I use Automatic Undo Management with Oracle Applications?
Yes. In a RAC environment we highly recommend it.

Is Server Side Load Balancing supported/recommended/proven technology in Oracle E-Business Suite?
Yes. Customers are using it successfully today. It is recommended to set up both Client and Server side load balancing. Note that for the pieces coming from the 8.0.6 home (forms and ccm), connections are directed to a RAC instance based on the sequence in which it is listed in the TNS entry description list and may not get load balanced optimally. For Oracle RAC 10.2 or higher do not set PREFER_LEAST_LOADED_NODE = OFF in your listener.ora; please set the CLB_GOAL on the service.

What are the maximum number of nodes under OCFS on Linux?
Oracle 9iRAC on Linux, using OCFS for datafiles, can scale to a maximum of 32 nodes. According to the ** OCFS2 User Guide **, OCFS 2 can support up to 255 nodes.

What files can I put on Linux OCFS?
For optimal performance, you should only put the following files on Linux OCFS:

- Datafiles
- Control Files
- Redo Logs
- Archive Logs
- SPFILE

Oracle Clusterware files OCR and Voting Disk can be put on OCFS2, however Best Practice is to put them on raw or block devices.

Where can I find documentation on OCFS?
For Main Page >>> http://oss.oracle.com/projects/ocfs/
For User Manual >>> http://oss.oracle.com/projects/ocfs/documentation/
For OCFS Files >>> http://oss.oracle.com/projects/ocfs/files/supported/

What are the Best Practices for using a clustered file system with Oracle RAC? Can I use a cluster file system for OCR, Voting Disk, Binaries as well as database files?
Oracle Best Practice for using Cluster File Systems (CFS) with Oracle RAC
* Oracle Clusterware binaries should not be placed on a CFS as this reduces cluster functionality while CFS is recovering, and also limits the ability to perform rolling upgrades of Oracle Clusterware.
* Oracle Clusterware voting disks and the Oracle Cluster Registry (OCR) should not be placed on a CFS as the I/O freeze during CFS reconfiguration can lead to node eviction, or cause cluster management activities to fail (i.e. start, stop, or check of a resource).
* Oracle Database 10g binaries are supported on CFS for Oracle RAC 10g and for Oracle Database. The system should be configured to support multiple ORACLE_HOMEs in order to maintain the ability to perform a rolling patch application.
* Oracle Database 10g database files (e.g. datafiles, trace files, and archive log files) are supported on CFS.
Check Certify for certified cluster file systems.

Rolling Upgrades with Cluster File Systems in General
It is not recommended to use a cluster file system (CFS) for the Oracle Clusterware binaries. Oracle Clusterware supports in-place rolling upgrades. Using a shared Oracle Clusterware home results in a global outage during patch application and upgrades. A workaround is available to clone the Oracle Clusterware home for each upgrade. This is not common practice.
If a patch is marked for rolling upgrade, then it can be applied to an Oracle RAC database in a rolling fashion. Oracle supports rolling upgrades for Oracle Automatic Storage Management (ASM) after you have upgraded to Oracle Database 11g. When using a CFS for the database and ASM Oracle homes, the CFS should be configured to use context dependent links (CDSLs) or equivalent, and these should be configured to work in conjunction with rolling upgrades and downgrades. This includes updating the database and ASM homes in the OCR to

point to the current home. This is included in Metalink Note 444134.1

Can I use OCFS with SE Oracle RAC?
It is not supported to use OCFS with Standard Edition Oracle RAC. All database files must use ASM (redo logs, recovery area, datafiles, control files etc). You can not place binaries on OCFS as part of the SE Oracle RAC terms. We recommend that the binaries and trace files (non-ASM supported files) be replicated on all nodes. This is done automatically by install.

Is Sun QFS supported with Oracle RAC? What about Sun GFS?
From Certify; check there for the latest details.
Sun Cluster - Sun StorEdge QFS (9.2.0.5 and higher, 10g and 10gR2):
No restrictions on placement of files on QFS
Sun StorEdge QFS is supported for Oracle binary executables, database data files, archive logs, Oracle Cluster Registry (OCR), Oracle Cluster Ready Services voting disk and recovery area; all of these can be placed on QFS.
Solaris Volume Manager for Sun Cluster can be used for host-based mirroring
Supports up to 8 nodes

Is Red Hat GFS (Global File System) certified by Oracle for use with Oracle Real Application Clusters?
Sistina Cluster Filesystem is not part of the standard RedHat kernel and therefore is not certified by Oracle but falls under a kernel extension. This, however, does not mean that Oracle RAC is not certified with it. As a fact, Oracle RAC does not certify against a filesystem per se, but certifies against an operating system. If, as is the case with Sistina filesystem, the filesystem is certified with the operating system, this only means that Oracle does not provide direct support for, or fix, the filesystem in case of an error. The customer will have to contact the filesystem provider for support.

Is Linux OCFS2 (OCFS version 2) supported with Oracle RAC?
Yes. See Certify for details on which platforms are supported.

What is the maximum number of nodes I can have in my cluster if I am using OCFS2?

Theoretically you can have up to 255, however it has been tested with up to 16 nodes.

How do I protect the OCR and Voting in case of media failure?
In Oracle Database 10g Release 1 the OCR and Voting device are not mirrored within Oracle, hence both must be mirrored via a storage vendor method, like RAID 1.
Starting with Oracle Database 10g Release 2 Oracle Clusterware will multiplex the OCR and Voting Disk (two for the OCR and three for the Voting).
Please read Note: 279793.1 and Note: 268937.1 regarding backup and restore of a lost Voting/OCR

When does the Oracle node VIP fail over to another node and subsequently return to its home node?
The handling of the VIP with respect to a failover to another node and subsequent return to its home node differs depending on the Oracle Clusterware version. In general, one can distinguish between Oracle Clusterware 10g & 11g Release 1 and Oracle Clusterware 11g Release 2 behavior.

For Oracle Clusterware 10g & 11g Release 1 the VIP will fail over to another node either after a network or a node failure. However, the VIP will automatically return to its home node only after a node failure and a subsequent restart of the node. Since the network is not constantly monitored in this Oracle Clusterware version, there is no way that Oracle Clusterware can detect the recovery of the network and initiate an automatic return of the node VIP to its home node.

Exception: With Oracle Patch Set 10.2.0.3 a new behavior was introduced that allowed the node VIP to return to its home node after the network recovered. The required network check was part of the database instance check. However, this new check introduced quite some side effects and hence was disabled with subsequent bundle patches and the Oracle Patch Set 10.2.0.4

Starting with 10.2.0.4 and for Oracle Clusterware 11g Release 1 the default behavior is to avoid an automatic return of the node VIP to its home node after the network recovered. This behavior can be activated, if required, using the "ORA_RACG_VIP_FAILBACK" parameter. This parameter should only be used after reviewing support note 805969.1 (VIP does not relocate back to the original node starting from 10.2.0.4 and 11.1 even after the public network problem is resolved.)

With Oracle Clusterware 11g Release 2 the default behavior is to automatically initiate a return of the node VIP to its home node as soon as the network recovered after a failure. It needs to be noted that this behavior is not based on the parameter mentioned above and therefore does not induce the same side effects. Instead, a new network resource is used in Oracle Clusterware 11g Release 2, which monitors the network constantly, even after the network failed and the resource became "OFFLINE". This feature is called "OFFLINE resource monitoring" and is enabled by default for the network resource.
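The default VIP failback rules described in this answer can be summarized as a small decision function. This is only an illustrative sketch in Python: the release labels, the `Failure` enum and the function name are assumptions made for the example, the 10.2.0.3 exception is deliberately left out, and the 11g Release 2 node-failure case is assumed to behave like the network case.

```python
from enum import Enum


class Failure(Enum):
    NETWORK = "network"
    NODE = "node"


# Release labels are an assumption of this sketch, not Oracle identifiers.
PRE_11_2_RELEASES = {"10.1", "10.2", "11.1"}


def vip_returns_automatically(release, failure, failback_enabled=False):
    """Return True if the node VIP is expected to go back to its home node
    without manual intervention, following the default rules in the FAQ.

    failback_enabled models the ORA_RACG_VIP_FAILBACK parameter, which the
    FAQ says should only be used after reviewing note 805969.1.
    """
    if release == "11.2":
        # 11g Release 2 monitors the network resource constantly
        # ("OFFLINE resource monitoring"), so the VIP returns by default.
        return True
    if release not in PRE_11_2_RELEASES:
        raise ValueError(f"release not covered by this sketch: {release}")
    if failure is Failure.NODE:
        # 10g / 11g Release 1: the VIP returns after the node restarts.
        return True
    # 10g / 11g Release 1 network failure: no automatic return by default.
    return failback_enabled


print(vip_returns_automatically("10.2", Failure.NETWORK))
print(vip_returns_automatically("11.2", Failure.NETWORK))
```

The function only encodes the defaults quoted in the text; actual behavior on a given patch level should be confirmed against the referenced support notes.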

What do I do, I have a corrupt OCR and no valid backup?
Note: 428682.1 describes how to recreate your OCR/Voting Disk which you have accidentally deleted and cannot recover from backups

Can I run a 10.1.x database with Oracle Clusterware 10.2?
Yes. Oracle Clusterware 10.2 will support both 10.1 and 10.2 databases (and ASM too!). A detailed matrix is available in Note: 337737.1

How do I use multiple network interfaces to provide High Availability and/or Load Balancing for my interconnect with Oracle Clusterware?
This needs to be done externally to Oracle Clusterware, usually by some OS provided NIC bonding which gives Oracle Clusterware a single IP address for the interconnect but provides failover (High Availability) and/or load balancing across multiple NIC cards. These solutions are provided externally to Oracle at a much lower level than the Oracle Clusterware, hence Oracle supports using them. The solutions are OS dependent and therefore the best source of information is your OS vendor. However, there are several articles in Metalink on how to do this. For example for Sun Solaris search for IPMP (IP network MultiPathing).
Note: Customers should pay close attention to the bonding setup/configuration/features and ensure their objectives are met, since some solutions provide only failover, some only load balancing, and still others claim to provide both. As always, it's important to test your setup to ensure it does what it was designed to do.
When bonding with Network Interfaces that connect to separate switches (for redundancy) you must test if the NICs are configured for active/active mode. The most reliable configuration for this architecture is to configure the NICs for Active/Passive.

Can the Network Interface Card (NIC) device names be different on the nodes in a cluster, for both public and private?
All public NICs must have the same name on all nodes in the cluster
Similarly, all private NICs must also have the same names on all nodes
Do not mix NICs with different interface types (infiniband, ethernet, hyperfabric, etc.) for the same subnet/network.

Is it supported to rerun root.sh from the Oracle Clusterware installation?

For Oracle RAC 10g, rerunning root.sh after the initial successful install of the Oracle Clusterware is expressly discouraged and unsupported. We strongly recommend not doing it.
In the case where root.sh is failing to execute on an initial install (or a new node joining an existing cluster), it is OK to re-run root.sh after the cause of the failure is corrected (permissions, paths, etc.). In this case, please run rootdelete.sh to undo the local effects of root.sh before rerunning root.sh.

When the customer runs the command 'onsctl start' they receive the message "Unable to open libhasgen10.so". Any idea why the message "unable to open libhasgen10.so"?
Most likely you are trying to start ONS from ORACLE_HOME instead of the Oracle Clusterware (or Grid Infrastructure in 11.2) home. Please try to start it from the Oracle Clusterware home.

Voting Files stored in ASM - How many disks per disk group do I need?
If Voting Files are stored in ASM, the ASM disk group that hosts the Voting Files will place the appropriate number of Voting Files in accordance with the redundancy level. Once Voting Files are managed in ASM, a manual addition, deletion, or replacement of Voting Files will fail, since users are not allowed to manually manage Voting Files in ASM.
If the redundancy level of the disk group is set to "external", 1 Voting File is used.
If the redundancy level of the disk group is set to "normal", 3 Voting Files are used.
If the redundancy level of the disk group is set to "high", 5 Voting Files are used.
Note that Oracle Clusterware will store the disk within a disk group that holds the Voting Files. Oracle Clusterware does not rely on ASM to access the Voting Files.
In addition, note that there can be only one Voting File per failure group. In the above list of rules, it is assumed that each disk that is supposed to hold a Voting File resides in its own, dedicated failure group.
In other words, a disk group that is supposed to hold the above mentioned number of Voting Files needs to have the respective number of failure groups with at least one disk. (1 / 3 / 5 failure groups with at least one disk)
Consequently, a normal redundancy ASM disk group which is supposed to hold Voting Files requires 3 disks in separate failure groups, while a normal redundancy ASM disk group that is not used to store Voting Files requires only 2 disks in separate failure groups.

OCR stored in ASM - What happens, if my ASM instance fails on a node?
If an ASM instance fails on any node, the OCR becomes unavailable on this particular node, but the node remains operational.

If the (RAC) databases use ASM, too, they cannot access their data on this node anymore during the time the ASM instance is down. If a RAC database is used, access to the same data can be established from another node.
If the CRSD process running on the node affected by the ASM instance failure is the OCR writer, AND the majority of the OCR locations is stored in ASM, AND an IO is attempted on the OCR during the time the ASM instance is down on this node, THEN CRSD stops and becomes inoperable. Hence cluster management is affected on this particular node.
Under no circumstances will the failure of one ASM instance on one node affect the whole cluster.

Is it possible to use ASM for the OCR and voting disk?
Yes. As of Oracle Real Application Clusters 11g Release 2, the OCR and Voting Disks can be stored in ASM. This is the recommended best practice for this release.
For releases prior to 11g Release 2, the OCR and voting disk must be on RAW devices or CFS (cluster filesystem). RAW devices (or block devices on Linux) are the best practice for Oracle RAC 10g or Oracle RAC 11g Release 1.

With GNS, do ALL public addresses have to be DHCP managed (public IP, public VIP, public SCAN VIP)?
No. The choice to use DHCP for the hostname is outside Oracle. The Oracle Clusterware and Oracle RAC will work with both static and DHCP hostnames. When you use GNS, we will use DHCP for the VIPs, which includes node VIPs and SCAN VIPs.

I am trying to move my voting disks from one diskgroup to another and getting the error "crsctl replace votedisk – not permitted between ASM Disk Groups." Why?
You need to review the ASM and crsctl logs to see why the command is failing. To put your voting disks in ASM, you must have the diskgroup set up properly. There must be enough failure groups to support the redundancy of the voting disks as set by the redundancy on the disk group. E.g.: Normal redundancy, 3 failure groups are required. High redundancy, 5 failure groups. Note: by default each disk in a diskgroup is put in its own failure group. The compatible.asm attribute of the diskgroup must be set to 11.2 and you must be using the 11.2 version of Oracle Clusterware and ASM.
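The Voting File rules in this FAQ (1 / 3 / 5 Voting Files for external / normal / high redundancy, one Voting File per failure group) can be checked with a few lines of Python. This is a minimal sketch for planning purposes; the function and dictionary names are invented for the example and do not correspond to any Oracle tool.

```python
# Number of Voting Files that ASM places per disk group redundancy level,
# as listed in the FAQ. One Voting File is allowed per failure group.
VOTING_FILES_BY_REDUNDANCY = {"external": 1, "normal": 3, "high": 5}


def required_failure_groups(redundancy):
    """Failure groups (each with at least one disk) needed to hold Voting Files."""
    try:
        return VOTING_FILES_BY_REDUNDANCY[redundancy.lower()]
    except KeyError:
        raise ValueError(f"unknown redundancy level: {redundancy}") from None


def can_hold_voting_files(redundancy, failure_groups):
    """True if a disk group with this many failure groups can host the Voting Files."""
    return failure_groups >= required_failure_groups(redundancy)


# A normal redundancy disk group with only 2 failure groups is fine for
# ordinary data, but not for Voting Files.
print(can_hold_voting_files("normal", 2))
print(can_hold_voting_files("normal", 3))
```

A check like this only covers the failure group count; the compatible.asm attribute and the Clusterware/ASM version requirements mentioned in the text still have to be verified separately.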

Can I run the fixup script generated by the 11.2 OUI or CVU on a running system?
It depends on which problems were listed to be fixed. The fixup scripts can change system parameters, so you should not change system parameters while applications are running. However, if an earlier version of Oracle Database is already running on the system, there should not be any need to change the system parameters.

How is the Oracle Cluster Registry (OCR) stored when I use ASM?
The OCR is stored similar to how Oracle Database files are stored. The extents are spread across all the disks in the diskgroup and the redundancy (which is at the extent level) is based on the redundancy of the disk group. You can only have one OCR in a diskgroup. Best Practice for ASM is to have 2 diskgroups. Best Practice for OCR in ASM is to have a copy of the OCR in each diskgroup.

How to move the OCR location?
For Oracle RAC 10g Release 1
- stop the CRS stack on all nodes using "init.crs stop"
- Edit /var/opt/oracle/ocr.loc on all nodes and set up ocrconfig_loc=new OCR device
- Restore from one of the automatic physical backups using ocrconfig -restore.
- Run ocrcheck to verify.
- reboot to restart the CRS stack.
For Oracle RAC 10g Release 2 or later
Please use the OCR command to replace the OCR with the new location:
# ocrconfig -replace ocr /dev/newocr
# ocrconfig -replace ocrmirror /dev/newocrmirror
Manual editing of ocr.loc or equivalent is not recommended, and will not work.

What should the permissions be set to for the voting disk and ocr when doing an Oracle RAC Install?
The Oracle Real Application Clusters install guide is correct. It describes the PRE-INSTALL ownership/permission requirements for ocr and voting disk. This step is needed to make sure that the Oracle Clusterware install succeeds. Please don't use those values to determine what the ownership/permission should be POST INSTALL. The root script will change the ownership/permission of ocr and voting disk as part of install. The POST INSTALL permissions will end up being:
OCR - root:oinstall - 640
Voting Disk - oracle:oinstall - 644
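The post-install values quoted in this FAQ (OCR root:oinstall 640, Voting Disk oracle:oinstall 644) can be compared against what is actually on disk. The following Python sketch checks only the permission bits; the table and function names are made up for the example, and the device path you pass in is whatever your installation uses.

```python
import os
import stat

# Post-install owner, group and mode as listed in the FAQ.
EXPECTED_POST_INSTALL = {
    "ocr": ("root", "oinstall", 0o640),
    "voting_disk": ("oracle", "oinstall", 0o644),
}


def permission_bits(path):
    """Return the permission bits (for example 0o640) of a file or device."""
    return stat.S_IMODE(os.stat(path).st_mode)


def mode_matches(path, kind):
    """True if the path carries the post-install mode expected for 'ocr' or 'voting_disk'."""
    _owner, _group, expected_mode = EXPECTED_POST_INSTALL[kind]
    return permission_bits(path) == expected_mode


def describe(kind):
    """Human-readable form of the expected post-install setting."""
    owner, group, mode = EXPECTED_POST_INSTALL[kind]
    return f"{owner}:{group} {mode:o}"


print(describe("ocr"))
print(describe("voting_disk"))
```

Ownership is not verified here; on Unix it could be added with the standard `pwd` and `grp` modules, which is left out to keep the sketch short.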

With Oracle Clusterware 10g, how do you backup the OCR?
There is an automatic backup mechanism for OCR. The default location is: $ORA_CRS_HOME\cdata\"clustername"\
To display backups:
#ocrconfig -showbackup
To restore a backup:
#ocrconfig -restore
The automatic backup mechanism keeps up to about a week old copy. So, if you want to retain a backup copy for more than that, then you should copy that "backup" file to some other name.
Unfortunately with Oracle RAC 10g Release 1 there are a couple of bugs regarding backup file manipulation, and changing the default backup dir on Windows. These were fixed in 10.1.0.4. OCR backups on Windows are absent. The only file in the backup directory is temp.ocr, which would be the last backup. You can restore this most recent backup by using the command ocrconfig -restore temp.ocr
With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export -s online
and use the -import option to restore the contents back.
With Oracle RAC 11g Release 1, you can do a manual backup of the OCR with the command:
# ocrconfig -manualbackup

I am trying to install Oracle Clusterware (10.2) and when I run the OUI, at the Specify Cluster Configuration screen, the Add, Edit and Remove buttons are grayed out. Nothing comes up in the cluster nodes either. Why?
Check for 3rd Party Vendor clusterware (such as Sun Cluster or Veritas Cluster) that was not completely removed. I.e. look for the /opt/ORCLcluster directory; it should be removed.

What happens if I lose my voting disk(s)?
If you lose 1/2 or more of all of your voting disks, then nodes get evicted from the cluster, or nodes kick themselves out of the cluster. It doesn't threaten database corruption. For this reason, when using Oracle for the redundancy of your voting disks, Oracle recommends that customers use 3 or more voting disks in Oracle RAC 10g Release 2. Note: For best availability, the 3 voting files should be physically separate disks. It is recommended to use an odd number, as 4 disks will not be any more highly available than 3 disks. 1/2 of 3 is 1.5, rounded to 2. 1/2 of 4 is 2. Once we lose 2 disks, our cluster will fail with both 4 voting disks or 3 voting disks.
Alternatively you can use external redundancy, which means you are providing redundancy at the storage level using RAID.
Restoring corrupted voting disks is easy since there isn't any significant persistent data stored in the voting disk. See the Oracle Clusterware Admin and Deployment Guide for information on backup and restore of voting disks.
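The arithmetic behind the "odd number of voting disks" advice in this FAQ can be reproduced in a few lines of Python. This is just a sketch of the rule as stated in the text (losing half or more of the voting disks leads to eviction); the function names are invented for the illustration.

```python
def cluster_survives(total_disks, lost_disks):
    """True while fewer than half of the voting disks are lost."""
    if total_disks < 1 or not 0 <= lost_disks <= total_disks:
        raise ValueError("invalid voting disk counts")
    # "Lose 1/2 or more" means eviction, so survival needs lost < total / 2.
    return 2 * lost_disks < total_disks


def tolerated_failures(total_disks):
    """Largest number of voting disks that can be lost without eviction."""
    if total_disks < 1:
        raise ValueError("need at least one voting disk")
    return (total_disks - 1) // 2


for total in (3, 4, 5):
    print(total, "voting disks tolerate", tolerated_failures(total), "failure(s)")
```

With 3 and with 4 voting disks the second loss brings the cluster down, which is why the fourth disk adds no availability; a fifth disk is needed before two failures can be tolerated.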

I am installing Oracle Clusterware with a 3rd party vendor clusterware, however in the "Specify Cluster Configuration Page" the Oracle Clusterware installer doesn't show the existing nodes. Why?
This shows that Oracle Clusterware does not detect that the 3rd Party clusterware is installed. Make sure you have followed the installation instructions provided by the vendor for integrating with Oracle RAC. Make sure LD_LIBRARY_PATH is not set.
For example with Sun Cluster, make sure the libskgxn* files are in the /opt/ORCLcluster directory.
Check that lsnodes returns the correct list of nodes in the Sun Cluster.

I made a mistake when I created the VIP during the install of Oracle Clusterware, can I change the VIP?
Yes. The details of how to do this are described in Metalink Note 276434.1

How should I test the failure of the public network (i.e. Oracle VIP failover) in my Oracle RAC environment?
Prior to 10.2.0.3, it was possible to test VIP failover by simply running ifconfig <interface_name> down. The intended behaviour was that the VIP would fail over to another node. In 10.2.0.3 this is the same behaviour on Linux, however on other operating systems the VIP will NOT fail over; instead the interface will be plumbed again. To test VIP failover on platforms other than Linux, the switch can be turned off or the physical cable pulled. This is the best way to test. NOTE: if you have other DBs that share the same IPs then they will be affected. Your tests should simulate Production failures, which are generally Switch errors or interface errors.

What is the voting disk used for?
A voting disk is a backup communications mechanism that allows CSS daemons to negotiate which sub-cluster will survive. These voting disks keep a status of who is currently alive and count votes in case of a cluster reconfiguration. It works as follows:
a) Ensures that you cannot join the cluster if you cannot access the voting disk(s)
b) Leave the cluster if you cannot communicate with it (to ensure we do not have aberrant nodes)
c) Should multiple sub-clusters form, it will only allow one to continue. It prefers a greater number of nodes, and secondly the node with the lowest incarnation number.
d) Is kept redundant by Oracle in 10g Release 2 (you need to access a majority of existing voting

When vendor clusterware is present. when you have a choice. Does the hostname have to match the public name or can it be anything else? When there is no vendor clusterware. Can I change the public hostname in my Oracle Database 10g Cluster using Oracle Clusterware? Hostname changes are not supported in Oracle Clusterware (CRS). What happens in a 3-nodes RAC cluster if node1's cable is disconnected? The node with the lower node number will survive(The first node to join the cluster). Doesn't matter whether it is node1's or node2's private network cable that is disconnected. it'd be possible for them to wrap single-instance databases this way.disks) At most only one sub-cluster will continue and a split brain will be avoided. unless you want to perform a deletenode followed by a new addnode operation. I have a 2-node RAC running. See note: 554781. In case of 3 nodes. I notice that it is always node2 that is evicted when I test private network failure scenario by disconnecting the private network cable. Can I use Oracle Clusterware to provide cold failover of my single instance Oracle Databases? Oracle does not provide the necessary wrappers to fail over single-instance databases using Oracle Clusterware. Can I configure a firewall (iptables) on the cluster interconnect? Disable all firewalls on the cluster interconnect.1 for details. then the public node name must match the host name. So. only Oracle Clusterware. A sample can be found in the DEMOs that are distributed with Oracle Database 11g. it is always the node2 that is evicted.the sub cluster with the lower node number will survive. The hostname is used to store among other things the flag files and Oracle Clusterware stack will not start if hostname is changed. and the installer doesn't present an opportunity to change them. It's possible for customers to use Oracle Clusterware to wrap arbitrary applications. 4 nodes . it determines the public node names. . 
2 nodes will survive and the one you pulled the cable will go away. always choose the hostname.
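The eviction behaviour described for 2, 3 and 4 node clusters can be sketched in a few lines of Python. This is only an illustration: the rule "largest sub-cluster wins, ties go to the sub-cluster holding the lowest node number" is an assumption inferred from the three cases in the answer, and the function name surviving_subcluster is hypothetical, not part of any Oracle tool.

```python
def surviving_subcluster(subclusters):
    """Pick the sub-cluster expected to stay up after an interconnect split.

    Assumption (inferred from the FAQ's examples, not an Oracle algorithm):
    the larger sub-cluster wins, and a tie goes to the sub-cluster that
    contains the lowest node number.
    """
    return max(subclusters, key=lambda nodes: (len(nodes), -min(nodes)))


# 2-node cluster, cable pulled: node 1 survives, node 2 is evicted.
two_node = surviving_subcluster([{1}, {2}])
# 3-node cluster, node 1 isolated: nodes 2 and 3 survive.
three_node = surviving_subcluster([{1}, {2, 3}])
# 4-node cluster split evenly: the half containing node 1 survives.
four_node = surviving_subcluster([{3, 4}, {1, 2}])
print(two_node, three_node, four_node)
```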

What are the licensing rules for Oracle Clusterware? Can I run it without RAC?
Check the Oracle® Database Licensing Information 11g Release 1 (11.1) Part Number B28287-01. Look in the Special Use section under Oracle Database Editions.

During Oracle Clusterware installation, I am asked to define a private node name, and then on the next screen asked to define which interfaces should be used as private and public interfaces. What information is required to answer these questions?
The private names on the first screen determine which private interconnect will be used by CSS. Provide exactly one name that maps to a private IP address, or just the IP address itself. If a logical name is used, then the IP address this maps to can be changed subsequently, but if an IP address is specified CSS will always use that IP address. CSS cannot use multiple private interconnects for its communication hence only one name or IP address can be specified.
The private interconnect enforcement page determines which private interconnect will be used by the RAC instances. It's equivalent to setting the CLUSTER_INTERCONNECTS init.ora parameter, but is more convenient because it is a cluster-wide setting that does not have to be adjusted every time you add nodes or instances. RAC will use all of the interconnects listed as private in this screen, and they all have to be up, just as their IP addresses have to be when specified in the init.ora parameter. RAC does not fail over between cluster interconnects; if one is down then the instances using them won't start.

In the course of failure testing in an extended RAC environment we find entries in the cssd logfile which indicate actions like 'diskShortTimeout set to (value)' and 'diskLongTimeout set to (value)'. Can anyone please explain the meaning of these two timeouts in addition to disktimeout?
Having a short and long disktimeout, and no longer just one disktimeout, is due to patch for bug 4748797 (included in 10.2.0.2). The long disktimeout is 200 sec by default unless set differently via 'crsctl set css disktimeout', and applies to time outside a reconfiguration. The short disktimeout is in effect during a reconfiguration and is misscount-3s. The point is that we can tolerate a long disktimeout when all nodes are just running fine, but have to revert back to a short disktimeout if there's a reconfiguration.

Can I change the name of my cluster after I have created it when I am using Oracle Clusterware?

No, you must properly deinstall Oracle Clusterware and then re-install. To properly de-install Oracle Clusterware, you MUST follow the directions in the Installation Guide Chapter 10. This will ensure the ocr gets cleaned out.

Why does Oracle Clusterware use an additional 'heartbeat' via the voting disk, when other cluster software products do not?
Oracle uses this implementation because Oracle clusters always have access to a shared disk environment. This is different from classical clustering which assumes shared nothing architectures, and changes the decision of what strategies are optimal when compared to other environments. Oracle also supports a wide variety of storage types, instead of limiting it to a specific storage type (like SCSI), allowing the customer quite a lot of flexibility in configuration.

Why does Oracle still use the voting disks when other cluster software is present?
Voting disks are still used when 3rd party vendor clusterware is present, because vendor clusterware is not able to monitor/detect all failures that matter to Oracle Clusterware and the database. For example one known case is when the vendor clusterware is set to have its heartbeat go over a different network than RAC traffic. Continuing to use the voting disks allows CSS to resolve situations which would otherwise end up in cluster hangs.

Which processes access the OCR ?
Oracle Cluster Registry (OCR) is used to store the cluster configuration information among other things. OCR needs to be accessible from all nodes in the cluster. If OCR became inaccessible the CSS daemon would soon fail, and take down the node. PMON never needs to write to OCR. To confirm if OCR is accessible, try ocrcheck from your ORACLE_HOME and ORA_CRS_HOME.

Customer is hitting bug 4462367 with an error message saying low open file descriptor, how do I work around this until the fix is released with the Oracle Clusterware Bundle for 10.2.0.3 or 10.2.0.4 is released?
The fix for "low open file descriptor" problem is to increase the ulimit for Oracle Clusterware. Please be careful when you make this type of change and make a backup copy of the init.crsd before you start! To do this, you can modify the init.crsd as follows, while you wait for the patch:
1. Stop Oracle Clusterware on the node (crsctl stop crs)
2. copy the /etc/init.d/init.crsd
3. Modify the file changing:
# Allow the daemon to drop a diagnostic core file/
ulimit -c unlimited
ulimit -n unlimited

to
# Allow the daemon to drop a diagnostic core file/
ulimit -c unlimited
ulimit -n 65536
4. restart Oracle Clusterware in the node (crsctl start crs)

Can I use Oracle Clusterware to monitor my EM Agent?
Check out Chapter 3 of the EM advanced configuration guide, specifically the section on active passive configuration of agents. You should be able to model those to your requirements. There is nothing special about the commands, but you do need to follow the startup/shutdown sequence to avoid any discontinuity of monitoring. The agent does start a watchdog that monitors the health of the actual monitoring process. This is done automatically at agent start. Therefore you could use Oracle Clusterware but you should not need to.

My customer has noticed tons of log files generated under $CRS_HOME/log//client, is there any automated way we can setup through Oracle Clusterware to prevent/minimize/remove those aggressively generated files?
Check Note.5187351.8 You can either apply the patchset if it is available for your platform or have a cron job that removes these files until the patch is available.

Does Oracle Clusterware have to be the same or higher release than all instances running on the cluster?
Yes - Oracle Clusterware must be the same or a higher release with regards to the RDBMS or ASM Homes. Please refer to Note#337737.1

How do I identify the voting file location ?
Run the following command from /bin "crsctl query css votedisk"

How much I/O activity should the voting disk have?
Approximately 2 read + 1 write per second per node.
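The voting disk I/O figure quoted above scales linearly with cluster size, which the following small Python sketch spells out. The function name expected_voting_disk_io is hypothetical, and the result is only the rough estimate the FAQ gives (about 2 reads and 1 write per second per node), not a measured value.

```python
def expected_voting_disk_io(node_count):
    """Rough per-second voting disk I/O for a cluster of node_count nodes.

    Uses the FAQ's approximation of 2 reads + 1 write per second per node.
    Returns a (reads_per_second, writes_per_second) tuple.
    """
    if node_count < 1:
        raise ValueError("a cluster has at least one node")
    return 2 * node_count, 1 * node_count


reads, writes = expected_voting_disk_io(4)
print(f"4 nodes: about {reads} reads/s and {writes} writes/s")
```

Sustained activity far above this estimate would be worth a closer look, although the FAQ itself does not define a threshold.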

What are the IP requirements for the private interconnect?
The install guide states that the private IP address must satisfy the following requirements:
1. Must be separate from the public network
2. Must be accessible on the same network interface on each node
3. Must have a unique address on each node
4. Must be specified in the /etc/hosts file on each node
The Best Practices recommendation is to use the TCP/IP standard for non-routeable networks. Reserved address ranges for private (non-routed) use (see TCP/IP RFC 1918):
* 10.0.0.0 -> 10.255.255.255
* 172.16.0.0 -> 172.31.255.255
* 192.168.0.0 -> 192.168.255.255
Cluvfy will give you an error if you do not have your private interconnect in the ranges above. You should not ignore this error. If you are using an IP address in the range used for the public network for the private network interfaces, you are pretty much messing up the IP addressing, and possibly the routing tables, for the rest of the corporation. IP addresses are a sparse commodity, use them wisely. If you use them on a non-routable network, there is nothing to prevent someone else to go and use them in the normal corporate network, and then when those RAC nodes find out that there is another path to that address range (through RIP), they just might start sending traffic to those other IP addresses instead of the interconnect. This is just a bad idea.

Can I set up failover of the VIP to another card in the same machine or what do I do if I have different network interfaces on different nodes in my cluster (I.E. eth0 on node1,2 and eth1 on node 3,4)?
With srvctl, you can modify the nodeapp for the VIP to list the NICs it can use. Then VIP will try to start on eth0 interface and if it fails, try eth1 interface.
./srvctl modify nodeapps -n -A //eth0\|eth1
Note how the interfaces are a list separated by the '|' symbol and how you need to quote this with a '\' character or the Unix shell will interpret the character as a 'pipe'. So on a node called ukdh364 with a VIP address of ukdh364vip and we want a netmask (say) of 255.255.255.0 then we have:
./srvctl modify nodeapps -n ukdh364 -A ukdh364vip/255.255.255.0/eth0\|eth1
To check which interfaces are configured as public or private use oifcfg getif
example output:
eth0 138.2.238.0 global public
eth1 138.2.240.0 global public
eth2 138.2.236.0 global cluster_interconnect
An ifconfig on your machine will show what the hardware names for the interface cards installed.
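The RFC 1918 ranges listed for the private interconnect can be checked with Python's standard ipaddress module before cluvfy is run. This is a minimal sketch, not what cluvfy does internally; the helper name is_rfc1918 is hypothetical. The networks are spelled out explicitly because the module's own is_private attribute covers more ranges than the three in RFC 1918.

```python
import ipaddress

# The three RFC 1918 blocks, written in CIDR form:
# 10.0.0.0 - 10.255.255.255, 172.16.0.0 - 172.31.255.255,
# 192.168.0.0 - 192.168.255.255
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if address falls inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


for candidate in ("192.168.10.1", "172.31.255.255", "172.32.0.1", "138.2.236.0"):
    print(candidate, is_rfc1918(candidate))
```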

How is the voting disk used by Oracle Clusterware?
The voting disk is accessed exclusively by CSS (one of the Oracle Clusterware daemons). This is totally different from a database file. The database looks at the database files and interacts with the CSS daemon (at a significantly higher level conceptually than any notion of "voting disk"). "Non-synchronized access" (i.e. database corruption) is prevented by ensuring that the remote node is down before reassigning its locks. The voting disk, network, and the control file are used to determine when a remote node is down, in different, parallel, independent ways that allow each to provide additional protection compared to the other. The algorithms used for each of these three things are quite different.
As far as voting disks are concerned, a node must be able to access strictly more than half of the voting disks at any time. So if you want to be able to tolerate a failure of n voting disks, you must have at least 2n+1 configured (n=1 means 3 voting disks). You can configure up to 32 voting disks, providing protection against 15 simultaneous disk failures, however it's unlikely that any customer would have enough disk systems with statistically independent failure characteristics that such a configuration is meaningful. At any rate, configuring multiple voting disks increases the system's tolerance of disk failures (i.e. increases reliability).
Configuring a smaller number of voting disks on some kind of RAID system can allow a customer to use some other means of reliability than the CSS's multiple voting disk mechanisms. However there seem to be quite a few RAID systems that decide that 30-60 second (or 45 minutes in the case of veritas) IO latencies are acceptable. However we have to wait for at least the longest IO latency before we can declare a node dead and allow the database to reassign database blocks. So while using an independent RAID system for the voting disk may appear appealing, sometimes there are failover latency consequences.

How to Restore a Lost Voting Disk used by Oracle Clusterware 10g
Please read Note:279793.1 and for OCR Note:268937.1
As long as you can confirm via the CSS daemon logfile that it thinks the voting disk is bad, you can restore the voting disk from backup while the cluster is online. This is the backup that you took with dd (by the manual's request) after the most recent addnode, deletenode, or install operation. If by accident you restore a voting disk that the CSS daemon thinks is NOT bad, then the entire cluster will probably go down.
crsctl add css votedisk - adds a new voting disk
crsctl delete css votedisk - removes a voting disk
Note: the cluster has to be down. You can also restore the backup via dd when the cluster is down.

How can I register the listener with Oracle Clusterware in RAC 10g Release 2?
NetCA is the only tool that configures listener and you should be always using it. It will register the listener with Oracle Clusterware. There are no other supported alternatives.
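The voting disk arithmetic described above (strictly more than half must be accessible, so tolerating n failures needs 2n+1 disks) can be worked through in Python. The three function names are hypothetical helpers for illustration; none of them is a crsctl feature.

```python
def voting_disks_needed(tolerated_failures):
    """Minimum number of voting disks to survive the given number of failures."""
    return 2 * tolerated_failures + 1


def max_tolerated_failures(configured_disks):
    """Largest number of failed voting disks that still leaves a majority."""
    return (configured_disks - 1) // 2


def has_majority(accessible_disks, configured_disks):
    """True when a node can reach strictly more than half of the voting disks."""
    return 2 * accessible_disks > configured_disks


print(voting_disks_needed(1))        # n=1 means 3 voting disks
print(max_tolerated_failures(32))    # the 32-disk maximum tolerates 15 failures
print(has_majority(2, 3), has_majority(1, 2))
```

Note that an even number of disks buys nothing extra: 4 disks tolerate one failure, the same as 3.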

Does Oracle Clusterware support application vips?
Yes, with Oracle Database 10g Release 2, Oracle Clusterware now supports an "application" vip. This is to support putting applications under the control of Oracle Clusterware using the new high availability API and allow the user to use the same URL or connection string regardless of which node in the cluster the application is running on. The application vip is a new resource defined to Oracle Clusterware and is a functional vip. It is defined as a dependent resource to the application. There can be many vips defined, typically one per user application under the control of Oracle Clusterware. You must first create a profile (crs_profile), then register it with Oracle Clusterware (crs_register). The usrvip script must run as root.

Why is the home for Oracle Clusterware not recommended to be subdirectory of the Oracle base directory?
If anyone other than root has write permissions to the parent directories of the CRS home, then they can give themselves root escalations. This is a security issue. The CRS home itself is a mix of root and non-root permissions, as appropriate to the security requirements. Please follow the install docs about who is your primary group and what other groups you need to create and be a member of.

Is it supported to allow 3rd Party Clusterware to manage Oracle resources (instances, listeners, etc) and turn off Oracle Clusterware management of these?
In 10g we do not support using 3rd Party Clusterware for failover and restart of Oracle resources. Oracle Clusterware resources should not be disabled.

How do I put my application under the control of Oracle Clusterware to achieve higher availability?
First write a control agent. It must accept 3 different parameters:
start - The control agent should start the application.
check - The control agent should check the application.
stop - The control agent should stop the application.
Secondly you must create a profile for your application using crs_profile. Thirdly you must register your application as a resource with Oracle Clusterware (crs_register). See the RAC Admin and Deployment Guide for details.

What is the High Availability API?
An application-programming interface to allow processes to be put under the High Availability infrastructure that is part of the Oracle Clusterware distributed with Oracle Database 10g. A user written script defines how Oracle Clusterware should start, stop and relocate the process when the cluster node status changes. This extends the high availability services of the cluster to any

application running in the cluster. Oracle Database 10g Real Application Clusters (RAC) databases and associated Oracle processes (E.G. listener) are automatically managed by the clusterware.

How do I restore OCR from a backup? On Windows, can I use ocopy?
The only recommended way to restore an OCR from a backup is "ocrconfig -restore ". The ocopy command will not be able to perform the restore action for OCR.

Is it a requirement to have the public interface linked to ETH0 or does it only need to be on a ETH lower than the private interface?:
- public on ETH1
- private on ETH2
There is no requirement for interface name ordering. You could have
- public on ETH2
- private on ETH0
Just make sure you choose the correct public interface in VIPCA, and in the installer's interconnect classification screen.

Can a customer use SE RAC to implement an "Extended RAC Cluster" ?
YES. Effective with 11g Rel.1 the former restriction to have all nodes co-located in one room when using SE RAC has been lifted. Customers can now use SE RAC clusters in extended environments. However, other SE RAC restrictions still apply (e.g. compulsory usage of ASM, no third party cluster nor volume manager must be installed). Please, refer to the licensing documentation for more information.

What is the maximum distance between nodes in an extended RAC environment?
The high impact of latency creates practical limitations as to where this architecture can be deployed. While there is no fixed distance limitation, the additional latency on round trip on I/O and a one way cache fusion will have an effect on performance as distance increases. For example tests at 100km showed a 3-4 ms impact on I/O and 1 ms impact on cache fusion, thus the farther the distance, the greater the impact on performance. This architecture fits best where the 2 datacenters are relatively close (<~25km) and the impact is negligible. Most customers implement under this distance w/ only a handful above and the farthest known example is at 100km. Customers planning larger distances than those commonly implemented may want to estimate or measure the performance hit on their application before implementing. Do ensure a proper setup of SAN buffer credits to limit the impact of distance at the I/O layer.

What are the network requirements for an extended RAC cluster?

Necessary Connections: Interconnect, SAN, and IP Networking need to be kept on separate channels, each with required redundancy. Redundant connections must not share the same Dark Fiber (if used), switch, path, or even building entrances. Keep in mind that cables can be cut.
The SAN and Interconnect connections need to be on dedicated point-to-point connections. No WAN or Shared connection allowed. Traditional cables are limited to about 10 km if you are to avoid using repeaters. Dark Fiber networks allow the communication to occur without repeaters. Since latency is limited, Dark Fiber networks allow for a greater distance in separation between the nodes. The disadvantage of Dark Fiber networks is that they can cost hundreds of thousands of dollars, so generally they are only an option if they already exist between the two sites.
If direct connections are used (for short distances) this is generally done by just stringing long cables from a switch. If a DWDM or CWDM is used then these are directly connected via a dedicated switch on either side.
Note of caution: Do not do RAC Interconnect over a WAN. This is the same as doing it over the public network which is not supported and other uses of the network (i.e. large FTPs) can cause performance degradations or even node evictions.
For SAN networks make sure you are using SAN buffer credits if the distance is over 10km.
At the moment in Oracle 10g, if Oracle Clusterware is being used, we also require that a single subnet be setup for the public connections so we can fail over VIPs from one side to another.

How should voting disks be implemented in an extended cluster environment? Can I use standard NFS for the third site voting disk?
http://www.oracle.com/technology/products/database/clustering/pdf/thirdvoteonnfs.pdf
Standard NFS is only supported for the tie-breaking voting disk in an extended cluster environment. See platform and mount option restrictions at: http://www.oracle.com/technology/products/database/clustering/pdf/thirdvoteonnfs.pdf
Otherwise just as with database files, we only support voting files on certified NAS devices, with the appropriate mount options. Pls refer to Metalink Note 359515.1 for a full description of the required mount options. For a complete list of supported NAS vendors refer to OTN at: http://www.oracle.com/technology/deploy/availability/htdocs/vendors_nfs.html

Can I use ASM as mechanism to mirror the data in an Extended RAC cluster?
Yes, but it cannot replicate everything that needs replication. ASM works well to replicate any object you can put in ASM. But you cannot put the OCR or Voting Disk in ASM. In 10gR1 they can either be mirrored using a different mechanism (which could then be used instead of ASM) or the OCR needs to be restored from backup and the Voting Disk can be recreated. In the future we are looking at providing Oracle redundancy for both.

Can I use ASM to mirror Oracle data in an extended RAC environment?

This support is for 10gR2 onwards and has the following limitations:
1. As in any extended RAC environments, the additional latency induced by distance will affect I/O and cache fusion performance. This effect will vary by distance and the customer is responsible for ensuring that the impact attained in their environment is acceptable for their application.
2. OCR must be mirrored across both sites using Oracle provided mechanisms.
3. Voting Disk redundancy must exist across both sites, and at a 3rd site to act as an arbiter. This third site may be via a WAN.
4. Storage at each site must be setup as separate failure groups and use ASM mirroring, to ensure at least one copy of the data at each site.
5. Customer must have a separate and dedicated test cluster also in an extended configuration setup using the same software and hardware components (can be fewer or smaller nodes).
6. Customer must be aware that in 10gR2 ASM does not provide partial resilvering. Should a loss of connectivity between the sites occur, one of the failure groups will be marked invalid. When the site rejoins the cluster, the failure groups will need to be manually dropped and added.

Why is validateUserEquiv failing during install (or cluvfy run)?
SSH must be set up as per the pre-installation tasks. It is also necessary to have file permissions set as described below for features such as Public Key Authorization to work. If your permissions are not correct, public key authentication will fail, and will fallback to password authentication with no helpful message as to why. The following server configuration files and/or directories must be owned by the account owner or by root and GROUP and WORLD WRITE permission must be disabled.
$HOME
$HOME/.rhosts
$HOME/.shosts
$HOME/.ssh
$HOME/.ssh/authorized-keys
$HOME/.ssh/authorized-keys2 #Openssh specific for ssh2 protocol.
SSH (from OUI) will also fail if you have not connected to each machine in your cluster as per the note in the installation guide:
The first time you use SSH to connect to a node from a particular system, you may see a message similar to the following:
The authenticity of host 'node1 (140.87.152.153)' can't be established. RSA key fingerprint is 7z:ez:e7:f6:f4:f2:4f:8f:9z:79:85:62:20:90:92:z9. Are you sure you want to continue connecting (yes/no)?
Enter |yes| at the prompt to continue. You should not see this message again when you connect from this system to that node. Answering yes to this question causes an entry to be added to a "known-hosts" file in the .ssh directory which is why subsequent connection requests do not reask.
This is known to work on Solaris and Linux but may work on other platforms as well.
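The ownership and write-permission rules for the SSH files can be checked ahead of an install with a short Python script. This is a sketch under stated assumptions: permission_problems is a hypothetical helper, the caller supplies the list of paths, and it only mirrors the two rules given in the answer (owned by the account owner or root, no group or world write bit). It is not how OUI or cluvfy perform the check.

```python
import os
import stat
import tempfile


def permission_problems(paths, owner_uid):
    """Return (path, reason) pairs for entries that break the two rules.

    Paths that do not exist are skipped. owner_uid is the numeric uid of
    the account that should own the files; uid 0 (root) is also accepted.
    """
    problems = []
    for path in paths:
        if not os.path.exists(path):
            continue
        info = os.stat(path)
        if info.st_uid not in (owner_uid, 0):
            problems.append((path, "not owned by account owner or root"))
        if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            problems.append((path, "group or world writable"))
    return problems


# Demonstration on a throwaway file rather than a real home directory.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "demo_keys")
open(demo_file, "w").close()
os.chmod(demo_file, 0o664)  # group-writable: reported as a problem
print(permission_problems([demo_file], os.getuid()))
os.chmod(demo_file, 0o600)  # owner-only: passes
print(permission_problems([demo_file], os.getuid()))
```

For a real check, pass the home directory and the files listed in the answer, resolved with os.path.expanduser("~"), and run it as the installing user on every node.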

How do I turn on tracing?
Set the environment variable SRVM_TRACE to true. For example, in tcsh "setenv SRVM_TRACE true" will turn on tracing. It may also help to run cluvfy with the -verbose flag and capture the session with script:
$ script run.log
$ export SRVM_TRACE=TRUE
$ cluvfy -blah -verbose
$ exit
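When cluvfy is driven from a script, the same tracing setup can be prepared in Python. The sketch below only builds the argument list and environment; the helper name cluvfy_invocation is hypothetical, and the sample arguments ('comp ocr -n all') are taken from commands mentioned elsewhere in this FAQ. Whether cluvfy is on the PATH of the calling user is an assumption.

```python
import os


def cluvfy_invocation(args, trace=True, verbose=True):
    """Build (argv, env) for a cluvfy run with tracing and verbose output.

    Nothing is executed here; pass the result to subprocess.run on a host
    where cluvfy is installed.
    """
    env = dict(os.environ)
    if trace:
        env["SRVM_TRACE"] = "true"  # same variable the shell examples set
    argv = ["cluvfy", *args]
    if verbose:
        argv.append("-verbose")
    return argv, env


argv, env = cluvfy_invocation(["comp", "ocr", "-n", "all"])
print(" ".join(argv))
# On a cluster node: subprocess.run(argv, env=env, check=False)
```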

Can I check if the storage is shared among the nodes?
Yes, you can use 'comp ssa' command to check the sharedness of the storage. Please refer to the known issues section for the type of storage supported by cluvfy.

When I run 10.2 CLUVFY on a system where RAC 10g Release 1 is running I get following output: Package existence check failed for "SUNWscucm:3.1". Package existence check failed for "SUNWudlmr:3.1". Package existence check failed for "SUNWudlm:3.1". Package existence check failed for "ORCLudlm:Dev_Release_06/11/04,_64bit_3.3.4.8_reentrant". Package existence check failed for "SUNWscr:3.1". Package existence check failed for "SUNWscu:3.1". Checking this Solaris system I don't see those packages installed. Can I continue my install?
Note that cluvfy checks all possible prerequisites and tells you whether your system passes the check or not. You can then cross reference with the install guide to see if the checks that failed are required for your type of installation. In the above case, if you are not planning on using Sun Cluster, then you can continue the install. The checks that failed are the checks for Sun Cluster required packages and are not needed on your cluster. As long as everything else checks out successfully, you can continue.

What are the default values for the command line arguments?
Here are the default values and behavior for different stage and component commands:
For component nodecon: If no -i or -a argument is provided, then cluvfy will get into the discovery mode.
For component nodereach: If no -srcnode is provided, then the local node (node of invocation) will be used as the source node.
For components cfs, ocr, crs, space, clumgr: If no -n argument is provided, then the local node will be used.
For components sys and admprv: If no -n argument is provided, then the local node will be used. If no -osdba argument is provided, then 'dba' will be used. If no -orainv argument is provided, then 'oinstall' will be used.
For component peer: If no -osdba argument is provided, then 'dba' will be used. If no -orainv argument is provided, then 'oinstall' will be used.
For stage -post hwos: If no -s argument is provided, then cluvfy will get into the discovery mode.
For stage -pre clusvc: If no -c argument is provided, then cluvfy will skip OCR related checks. If no -q argument is provided, then cluvfy will skip voting disk related checks. If no -osdba argument is provided, then 'dba' will be used. If no -orainv argument is provided, then 'oinstall' will be used.
For stage -pre dbinst: If -cfs_oh flag is not specified, then cluvfy will assume Oracle home is not on a shared file system. If no -osdba argument is provided, then 'dba' will be used. If no -orainv argument is provided, then 'oinstall' will be used.

How do I check the Oracle Clusterware stack and other sub-components of it?
Cluvfy provides commands to check a particular sub-component of the CRS stack as well as the whole CRS stack. You can use the 'comp ocr' command to check the integrity of OCR. Similarly, you can use 'comp crs' and 'comp clumgr' commands to check integrity of crs and clustermanager sub-components. To check the entire CRS stack, run the stage command 'cluvfy stage -post crsinst'.

Is there a way to verify that the Oracle Clusterware is working properly before proceeding with RAC install?
Yes. You can use the post-check command for cluster services setup(-post clusvc) to verify CRS status. A more appropriate test would be to use the pre-check command for database installation(-pre dbinst). This will check whether the current state of the system is suitable for RAC install.

At what point cluvfy is usable? Can I use cluvfy before installing Oracle Clusterware?
You can run cluvfy at any time, even before CRS installation. In fact, cluvfy is designed to assist the user as soon as the hardware and OS are up. If you invoke a command which requires CRS or RAC on the local node, cluvfy will report an error if those required products are not yet installed.

What is CVU? What are its objectives and features?
CVU brings ease to RAC users by verifying all the important components that need to be verified at different stages in a RAC environment. The wide domain of deployment of CVU ranges from initial hardware setup through fully operational cluster for RAC deployment and covers all the intermediate stages of installation and configuration of various components. The command line tool is cluvfy. Cluvfy is a non-intrusive utility and will not adversely affect the system or operations stack.

What is a stage?
CVU supports the notion of Stage verification. It identifies all the important stages in RAC deployment and provides each stage with its own entry and exit criteria. The entry criteria for a stage define a specific set of verification tasks to be performed before initiating that stage. This pre-check saves the user from entering into a stage unless its pre-requisite conditions are met. The exit criteria for a stage define another specific set of verification tasks to be performed after completion of the stage. The post-check ensures that the activities for that stage have been completed successfully. It identifies any stage specific problem before it propagates to subsequent stages, where its root cause would be difficult to find. An example of a stage is "precheck of database installation", which checks whether the system meets the criteria for RAC install.

What is a component?
CVU supports the notion of Component verification. The verifications in this category are not associated with any specific stage. The user can verify the correctness of a specific cluster component. A component can range from a basic one, like free disk space, to a complex one like CRS Stack. The integrity check for CRS stack will transparently span over verification of multiple sub-components associated with CRS stack. This encapsulation of a set of tasks within a specific component verification should make things considerably easier for the user.

What is nodelist?
Nodelist is a comma separated list of hostnames without domain. Cluvfy will ignore any domain while processing the nodelist. If duplicate entities after removing the domain exist, cluvfy will eliminate the duplicate names while processing. Wherever supported, you can use '-n all' to check on all the cluster nodes. Check this for more information on nodelist and shortcuts.

Do I have to be root to use CVU?
No. CVU is intended for database and system administrators. CVU assumes the current user as oracle user.

What about discovery? Does CVU discover installed components?
At present, CVU discovery is limited to these components. CVU discovers available network interfaces if you do not specify any interface or IP address in its command line. For storage related verification, CVU discovers all the supported storage types if you do not specify a particular storage. CVU discovers CRS HOME if one is available.

How do I report a(or tons of) bug?
Please refer to the known issue/README files before filing a bug. If the issue is not covered in those documents, file a bug against product# 5, component: OPSM and sub-component: CLUVFY. Please provide the relevant log file while filing a bug.

What are the requirements for CVU?
CVU requires:
1. An area with at least 30MB for containing software bits on the invocation node.
2. Java 1.4.1 location on the invocation node.
3. A work directory with at least 25MB on all the nodes. CVU will attempt to copy the necessary bits as required to this location. Make sure the location exists on all nodes and it has write permission for CVU user. This dir is set through the CV_DESTLOC environment variable. If this variable does not exist, CVU will use "/tmp" as the work dir.
4. On RedHat Linux 3.0, an optional package 'cvuqdisk' is required on all the nodes. This assists CVU in finding scsi disks and helps CVU to perform storage checks on disks. Please refer to What is 'cvuqdisk' rpm? for detail. Note that, this package should be installed only on RedHat Linux 3.0 distribution.

What is 'cvuqdisk' rpm? Why should I install this rpm?
CVU requires root privilege to gather information about the scsi disks during discovery. A small binary uses the setuid mechanism to query disk information as root. Note that this process is purely a read-only process with no adverse impact on the system. To make this secured, this binary is packaged in the cvuqdisk rpm and need root privilege to install on a machine. If this package is installed on all the nodes, CVU will be able to perform discovery and shared storage accessibility checks for scsi disks. Otherwise, it complains about the missing package 'cvuqdisk'.

Note that, this package should be installed only on RedHat Linux 3.0 distribution. Discovery of scsi disks for RedHat Linux 2.1 is not supported.

How do I install 'cvuqdisk' package?
Here are the steps to install cvuqdisk package.
1. Become root user
2. Copy the rpm ( cvuqdisk-1.0.1-1.i386.rpm, current version is 1.0.1 ) to a local directory. You can find the rpm in Oracle's OTN site.
3. Set the environment variable to a group, who should own this binary. Typically it is the "dba" group.
export CVUQDISK_GRP=dba
4. Erase any existing package
rpm -e cvuqdisk
5. Install the rpm
rpm -iv cvuqdisk-1.0.1-1.i386.rpm

How do I know about cluvfy commands? The usage text of cluvfy does not show individual commands.
Cluvfy has context sensitive help built into it. Cluvfy shows the most appropriate usage text based on the cluvfy command line arguments. If you type 'cluvfy' on the command prompt, cluvfy displays the high level generic usage text, which talks about valid stage and component syntax. If you type 'cluvfy comp -list', cluvfy will show valid components with brief description on each of them. If you type 'cluvfy comp -help', cluvfy will show detail syntax for each of the valid components. Similarly, 'cluvfy stage -list' and 'cluvfy stage -help' will list valid stages and their syntax respectively. If you type an invalid command, cluvfy will show the appropriate usage for that particular command. For example, if you type 'cluvfy stage -pre dbinst', cluvfy will show the syntax for pre-check of dbinst stage.

Do I have to type the nodelist every time for the CVU commands? Is there any shortcut?
You do not have to type the nodelist every time for the CVU commands. Typing the nodelist for a large cluster is painful and error prone. Here are few short cuts. To provide all the nodes of the cluster, type '-n all'. Cluvfy will attempt to get the nodelist in the following order:
1. If a vendor clusterware is available, it will pick all the configured nodes from the vendor clusterware using lsnodes utility.
2. If CRS is installed, it will pick all the configured nodes from Oracle clusterware using olsnodes utility.
3. In none of the above, it will look for the CV_NODE_ALL environmental variable. If this variable is not defined, it will complain.
To provide a partial list (some of the nodes of the cluster) of nodes, you can set an environmental variable and use it in the CVU command. For example:
setenv MYNODES node1,node3,node5
cluvfy comp nodecon -n $MYNODES

How do I get detail output of a check?
Cluvfy supports a verbose feature. By default, cluvfy reports in non-verbose mode and just reports the summary of a test. To get detailed output of a check, use the flag '-verbose' in the

You can use the peer command with -refnode argument to compare those properties of other nodes against the reference node. Is there a way to compare nodes? You can use the peer comparison feature of cluvfy for this purpose.0. use '-o user_equiv' argument. This will produce detail output of individual checks and where applicable will show per-node result in a tabular fashion. You can run this command in verbose mode to find out the mappings between the interfaces. How do I check network or node connectivity related issues? Use component verifications commands like 'nodereach' or 'nodecon' for this purpose. To check the system requirement for RAC. the sharedness check for the file sytem is supported for OCFS version 1.command line. Provide the OCFS file system you want to check through the -f argument. The '-o db_inst' will check for permissions required for installing RAC and '-o db_config' will check for permissions required for creating a RAC database or modifying a RAC database configuration. Refer to the usage text for detail instruction and type of supported operations. How do I check whether OCFS is properly configured? You can use the component command 'cfs' to check this. To check the system requirement for CRS. How do I check user accounts and administrative permissions related issues? Use admprv component verification command.14 or higher. IP addresses and subnets. The command 'comp peer' will list the values of different nodes for several pre-selected properties. use '-p database' argument. You can check the connectivity among the nodes by specifying the interface name(s) through -i argument. Then cluvfy will try to verify the node connectivity per subnet. the '-o crs_inst' will verify whether the user has the correct permissions for installing CRS. type cluvfy comp -help on the command prompt. Note that. cluvfy will attempt to discover all the available interfaces and the corresponding IP address & subnet. 
How do I check minimal system requirements on the nodes? The component verification command sys is meant for that. Similarly. For detail syntax of these commands. use '-p crs' argument. To check whether the privilege is sufficient for user equivalence. If the 'cluvfy comp nodecon' command is invoked without -i. .
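
The reference-node comparison can be pictured as a plain equality check per property. The following is a minimal Python sketch of that idea only; it is not how cluvfy is implemented, and the function name, property names and values are all made up for illustration. A value of None stands for "does not exist on that node".

```python
from typing import Dict, Optional

# Hypothetical per-node property snapshot; None means "absent on that node".
NodeProps = Dict[str, Optional[object]]


def compare_to_refnode(ref: NodeProps,
                       others: Dict[str, NodeProps]) -> Dict[str, Dict[str, str]]:
    """Label every property of every node against the reference node.

    A property is 'matched' only when its value equals the reference value,
    otherwise it is 'mismatched' (a deviation from the reference node).
    """
    report = {}
    for node, props in others.items():
        report[node] = {
            name: "matched" if props.get(name) == ref_value else "mismatched"
            for name, ref_value in ref.items()
        }
    return report


# Illustrative data only: the group is absent everywhere, node3 has more memory.
ref = {"total_memory_mb": 4096, "group_dba": None}
others = {
    "node2": {"total_memory_mb": 4096, "group_dba": None},
    "node3": {"total_memory_mb": 8192, "group_dba": None},
}
report = compare_to_refnode(ref, others)
print(report)
```

Under this equality-only model a group that is missing on both the reference node and the other node counts as 'matched', and a node with more memory than the reference counts as 'mismatched'.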

Why the peer comparison with -refnode says passed when the group or user does not exist?
Peer comparison with the -refnode feature acts like a baseline feature. It compares the system properties of other nodes against the reference node. If the value does not match (is not equal to the reference node value), then it flags that as a deviation from the reference node. If a group or user does not exist on the reference node as well as on the other node, it will report this as 'matched' since there is no deviation from the reference node. Similarly, it will report 'mismatched' for a node with higher total memory than the reference node, for the above reason.

Where can I find the CVU trace files?
CVU log files can be found under the $CV_HOME/cv/log directory. The log files are automatically rotated and the latest log file has the name cvutrace.log.0. It is a good idea to clean up unwanted log files or archive them to reclaim disk space. Note that no trace files will be generated if tracing has not been turned on.

Why cluvfy reports "unknown" on a particular node?
Cluvfy reports unknown when it can not conclude for sure if the check passed or failed. A common cause of this type of reporting is a non-existent location set for the CV_DESTLOC variable. Please make sure the directory pointed to by this variable exists on all nodes and is writable by the user.

What are the known issues with this release?
1. Shared storage accessibility (ssa) check reports
Current release of cluvfy has the following limitations on Linux regarding the shared storage accessibility check.
a. Currently NAS storage (r/w, no attribute caching), OCFS (version 1.0.14 or higher) and scsi disks (if the cvuqdisk package is installed) are supported. Note that the 'cvuqdisk' package should be installed only on RedHat Linux 3.0 distribution. Discovery of scsi disks for RedHat Linux 2.1 is not supported.
b. For sharedness check on NAS, cluvfy requires the user to have write permission on the specified path. If the cluvfy user does not have write permission, cluvfy reports the path as not-shared.
2. What database version is supported by CVU?
Current CVU release supports only 10g RAC and CRS and is not backward compatible. In other words, CVU can not check or verify pre-10g products.
3. What Linux distributions are supported?
This release supports only RedHat 3.0 Update 2 and RedHat 2.1AS distributions. Note that the CVU distributions for RedHat 3.0 Update 2 and RedHat 2.1AS are different; they are not binary compatible. In other words, CVU bits for RedHat 3.0 and RedHat 2.1 are not the same.
4. The component check for node application (cluvfy comp nodeapp ...) command reports a node app creation error if the local CRS stack is down. This is a known issue and will be addressed shortly.
5. CVU does not recognize the disk bindings (e.g. /dev/raw/raw1) as valid storage paths or identifiers. Please use the underlying disk (e.g. /dev/sdm etc) for the storage path or identifiers.
6. Current version of CVU for RedHat 2.1 complains about the missing cvuqdisk package. This will be corrected in a future release. Users should ignore this error. Note that the 'cvuqdisk' package should be installed only on RedHat Linux 3.0 distribution. Discovery of scsi disks for RedHat Linux 2.1 is not supported.

How is Oracle RAC One Node licensed and priced?
Oracle RAC One Node is an option to the Oracle Database Enterprise Edition and is licensed based upon the number of CPUs in the server on which it is installed. Current list price is $10,000 per CPU (check price list). Unlike the Oracle RAC feature, Oracle RAC One Node is not available with the Oracle Standard Edition. Oracle RAC One Node licensing also includes the 10-day rule, allowing a database to relocate to another node for up to 10 days per year, without incurring additional licensing fees. This is most often used in the case of failover, or for planned maintenance and upgrading. Only one node in the cluster can be used for the 10-day rule.

If a current customer has an Enterprise License Agreement (ELA), are they entitled to use Oracle RAC One Node?
Yes, assuming the existing ELA/ULA includes Oracle RAC. The license guide states that all Oracle RAC option licenses (not SE RAC) include all the features of Oracle RAC One Node. Customers with existing RAC licenses or Oracle RAC ELAs can use those licenses as Oracle RAC One Node. This amounts to "burning" an Oracle RAC license for Oracle RAC One Node, which is expensive long term. Obviously, if the ELA/ULA does not include Oracle RAC, then they are not entitled to use Oracle RAC One Node.

Is Oracle RAC One Node supported with 3rd party clusterware and/or 3rd party CFS?
No. Oracle RAC One Node is only supported with version 11.2 (and above) of Oracle grid infrastructure.

How does RAC One Node compare with traditional cold fail over solutions like HP Serviceguard, IBM HACMP, Sun Cluster and Symantec, and Veritas Cluster Server?
RAC One Node is a better high availability solution than traditional cold fail over solutions.

RAC One Node operates in a cluster but only a single instance of the database is running on one node in the cluster. If that database instance has a problem, RAC One Node detects that and can attempt to restart the instance on that node. If the whole node fails, RAC One Node will detect that and will bring up that database instance on another node in the cluster. Unlike traditional cold failover solutions, Oracle Clusterware will send out notifications (FAN events) to clients to speed reconnection after failover. 3rd-party solutions may simply wait for potentially lengthy timeouts to expire.

RAC One Node goes beyond the traditional cold fail over functionality by offering administrators the ability to proactively migrate instances from one node in the cluster to another. For example, let's say you wanted to do an upgrade of the operating system on the node that the RAC One Node database is running on. The administrator would activate "OMotion," a new Oracle facility that would migrate the instance to another node in the cluster. Once the instance and all of the connections have migrated, the server can be shut down, upgraded and restarted. OMotion can then be invoked again to migrate the instance and the connections back to the now-upgraded node. This non-disruptive rolling upgrade and patching capability of RAC One Node exceeds the current functionality of the traditional cold fail over solutions.

Also, RAC One Node provides a load balancing capability that is attractive to DBAs and Sys Admins. For example, if you have two different database instances running on a RAC One Node server and it becomes apparent that the load against these two instances is impacting performance, the DBA can invoke OMotion and migrate one of the instances to another less-used node in the cluster. RAC One Node offers this load balancing capability, something that the traditional cold fail over solutions do not.

Lastly, many 3rd-party solutions do not support ASM storage. This can slow down failover and prevent consolidation of storage across multiple databases, increasing the management burden on the DBA.

The following table summarizes the differences between RAC One Node and 3rd-party fail over solutions:

Feature: Out of the box experience
RAC One Node: RAC One Node provides everything necessary to implement database failover.
EE plus 3rd Party Clusterware: 3rd-party fail over solutions require a separate install and a separate management infrastructure.

Feature: Single Vendor
RAC One Node: RAC One Node is 100% supported by Oracle.
EE plus 3rd Party Clusterware: EE is supported by Oracle, but the customer must rely on the 3rd-party to support their clusterware.

Feature: Fast failover
RAC One Node: RAC One Node supports FAN Events, to send notifications to clients after failovers and to speed re-connection.
EE plus 3rd Party Clusterware: 3rd-party fail over solutions rely on timeouts for clients to detect failover and initiate a reconnection. It could take several minutes for a client to detect there had been a failover.

Feature: Rolling DB patching, OS, Clusterware, ASM patching and upgrades
RAC One Node: RAC One Node can migrate a database from one server to another to enable online rolling patching. Most connections should migrate with no disruption.
EE plus 3rd Party Clusterware: 3rd-party solutions must be failed over from one node to another, which means all connections will be dropped and must reconnect. Some transactions will be dropped and must reconnect. Reconnection could take several minutes.

Feature: Workload Management
RAC One Node: RAC One Node can migrate a database from one server to another while online to enable load balancing of databases across servers in the cluster. Most connections should migrate with no disruption.
EE plus 3rd Party Clusterware: 3rd-party solutions must be failed over from one node to another, which means all connections will be dropped and must reconnect. Some transactions will be dropped and must reconnect. Reconnection could take several minutes.

Feature: Online scale out
RAC One Node: Online upgrade to multi-node RAC.
EE plus 3rd Party Clusterware: Complete reinstall including Oracle Grid Infrastructure is required.

Feature: Standardized tools and processes
RAC One Node: RAC and RAC One Node use the same tools, management interfaces, and processes.
EE plus 3rd Party Clusterware: EE and RAC use different tools, management interfaces, and processes. 3rd-party clusterware requires additional interfaces.

Feature: Storage virtualization
RAC One Node: RAC One Node supports use of ASM to virtualize and consolidate storage. Because it's shared across nodes, it eliminates the lengthy failover of volumes and file systems.
EE plus 3rd Party Clusterware: Traditional 3rd-party solutions rely on local file systems and volumes that must be failed over. Large volumes can take a long time to fail over. Dedicated storage is also more difficult to manage.

How does RAC One Node compare with a single instance Oracle Database protected with Oracle Clusterware?

Feature: Out of the box experience
RAC One Node: RAC One Node is a complete solution that provides everything necessary to implement a database protected from failures by a failover solution.
EE plus Oracle Clusterware: Using Oracle Clusterware to protect an EE database is possible by customizing some sample scripts we provide to work with EE. This requires custom script development by the customer, and they need to set up the environment and install the scripts manually.

Feature: Supportability
RAC One Node: RAC One Node is 100% supported.
EE plus Oracle Clusterware: While EE is 100% supported, the scripts customized by the customer are not supported by Oracle.

Feature: DB Control support
RAC One Node: RAC One Node fully supports failover of DB Control in a transparent manner.
EE plus Oracle Clusterware: DB Control must be reconfigured after a failover (unless the customer scripts are modified to support DB Control failover).

Feature: Rolling DB patching, OS, Clusterware, ASM patching and upgrades
RAC One Node: RAC One Node can online migrate a database from one server to another to enable online rolling patching. Most connections should migrate with no disruption.
EE plus Oracle Clusterware: EE must be failed over from one node to another, which means all connections will be dropped and must reconnect. Some transactions will be dropped and must reconnect. Reconnection could take several minutes.

Feature: Workload Management
RAC One Node: RAC One Node can online migrate a database from one server to another to enable load balancing of databases across servers in the cluster. Most connections should migrate with no disruption.
EE plus Oracle Clusterware: EE must be failed over from one node to another, which means all connections will be dropped and must reconnect. Some transactions will be dropped and must reconnect. Reconnection could take several minutes.

Feature: Online scale out
RAC One Node: Online upgrade to multi-node RAC.
EE plus Oracle Clusterware: Take DB outage and re-link to upgrade to multi-node RAC, restart DB.

Feature: Standardized tools and processes
RAC One Node: RAC and RAC One Node use the same tools, management interfaces, and processes.
EE plus Oracle Clusterware: EE and RAC use different tools, management interfaces, and processes.

What is Oracle Real Application Clusters One Node (RAC One Node)?
Oracle RAC One Node is an option available with Oracle Database 11g Release 2. Oracle RAC One Node is a single instance of Oracle RAC running on one node in a cluster. This option adds to the flexibility that Oracle offers for reducing costs via consolidation. It allows customers to more easily consolidate their less mission critical, single instance databases into a single cluster, with most of the high availability benefits provided by Oracle Real Application Clusters (automatic restart/failover, rolling patches, rolling OS and clusterware upgrades), and many of the benefits of server virtualization solutions like VMware. RAC One Node offers better high availability functionality than traditional cold failover cluster solutions because of a new Oracle technology, Omotion, which is able to intelligently relocate database instances and connections to other cluster nodes for high availability and system load balancing.

If I add or remove nodes from the cluster, how do I inform RAC One Node?
You must re-run raconeinit to update the candidate server list for each RAC One Node Database.

Is RAC One Node supported with database versions prior to 11.2?
No. RAC One Node requires at least version 11.2 of Oracle Grid Infrastructure, and the RAC One Node database must be at least 11.2. Earlier versions of the rdbms can coexist with 11.2 RAC One Node databases.

How do I get Oracle Real Application Clusters One Node (Oracle RAC One Node)?

Oracle RAC One Node is only available with Oracle Database 11g Release 2. Oracle Grid Infrastructure for 11g Release 2 must be installed as a prerequisite. Download and apply Patch 9004119 to your Oracle RAC 11g Release 2 home in order to obtain the code associated with RAC One Node (this patch was released after 11.2.0.1 was released and is only available for Linux). Support for other platforms will be added with 11.2.0.2. The documentation is the Oracle RAC One Node User Guide.

Where do I find the documentation for RAC One Node?
RAC One Node was released as a patch after the original GA release of Oracle Database 11g Release 2. RAC One Node documentation will be included in the next doc set refresh. Please refer to the Oracle RAC One Node User Guide.

How does RAC One Node compare with database DR products like DataGuard or Golden Gate?
The products are entirely complementary. RAC One Node is designed to protect a single database. It can be used for rolling database patches, OS upgrades/patches, and grid infrastructure (ASM/Clusterware) rolling upgrades and patches. This is less disruptive than switching to a database replica. Switching to a replica for patching, or for upgrading the OS or grid infrastructure, requires that you choose to run Active/Active (and deal with potential conflicts) or Active/Passive (and wait for work on the active primary database to drain before allowing work on the replica). You need to make sure replication supports all data types you are using. You need to make sure the replica can keep up with your load. You need to figure out how to re-point your clients to the replica (not an issue with RAC One Node because it's the same database, and we use VIPs). And lastly, RAC One Node allows a spare node to be used 10 days per year without licensing.
Our recommendation is to use RAC or RAC One Node to protect from local failures and to support rolling maintenance activities. Use Data Guard or replication technology for DR, data protection, and for rolling database upgrades. Both are required as part of a comprehensive HA solution.

Does Enterprise Manager Support RAC One Node?
Yes, you can use Enterprise Manager DB Console to manage RAC One Node databases. Note that in 11.2.0.1, when you run raconeinit, the instance name is changed, so you should configure EM DB Console after running raconeinit. After every instance relocation (Omotion) or failover, the EM DB Console will need to be reconfigured to see the new instance on the new node. This can be done using emca and is the same as adding any new DB to the configuration. In the future, EM will support RAC One Node databases out of the box, so EM will be able to detect when the instance is migrated or failed over to another node.

How do I install the command line tools for RAC One Node?
The command line tools are installed when you install the RAC One Node patch 9004119 on top of 11.2.0.1.

Are we certifying applications specifically for RAC One Node?
No. If the 3rd party application is certified for Oracle Database 11g Release 2 Enterprise Edition, it is certified for RAC One Node.

Does RAC One Node make sense in a stretch cluster environment?
Yes. However, remember that most stretch cluster implementations also implement separate storage arrays at both locations. So write latency is still an issue that must be considered since ASM is still writing blocks to both sites. Anything beyond a metro area configuration is likely to introduce too much latency for the application to meet performance SLAs.

How does RAC One Node compare with virtualization solutions like VMware?
RAC One Node offers greater benefits and performance than VMware in the following ways:
• Server Consolidation: VMware offers physical server consolidation but imposes a 10%+ processing overhead to enable this consolidation and have the hypervisor control access to the system's resources. RAC One Node enables both physical server consolidation and database consolidation without the additional overhead of a hypervisor-based solution like VMware.
• High Availability: VMware offers the ability to fail over a failed virtual machine; everything running in that vm must be restarted and connections re-established in the event of a virtual machine failure. VMware cannot detect a failed process within the vm, just a failed virtual machine. RAC One Node offers a finer-grained, more intelligent and less disruptive high availability model. RAC One Node can monitor the health of the database within a physical or virtual server. If it fails, RAC One Node will either restart it or migrate the database instance to another server. Oftentimes, database issues or problems will manifest themselves before the whole server or virtual machine is affected. RAC One Node will discover these problems much sooner than a VMware solution and take action to correct them. Also, RAC One Node allows database and OS patches or upgrades to be made without taking a complete database outage. RAC One Node can migrate the database instance to another server, patches or upgrades can be installed on the original server, and then RAC One Node will migrate the instance back. VMware offers a facility, Vmotion, that will do a memory-to-memory transfer from one virtual machine to another. This DOES NOT allow for any OS or other patches or upgrades to occur in a non-disruptive fashion (an outage must be taken). It does allow for the hardware to be dusted and vacuumed, however.

• Scalability: VMware allows you to "scale" on a single physical server by instantiating additional virtual machines, up to an 8-core limit per vm. RAC One Node allows online scaling by migrating a RAC One Node implementation from one server to another, more powerful server without taking a database outage. Additionally, RAC One Node allows further scaling by allowing the RAC One Node to be online upgraded to a full Real Application Clusters implementation by adding additional database instances to the cluster, thereby gaining almost unlimited scalability.
• Operational Flexibility and Standardization: VMware only works on x86-based servers. RAC One Node will be available for all of the platforms that Oracle Real Application Clusters supports, including Linux, Windows, Solaris, HP-UX, and AIX.

Can I use Oracle RAC One Node for Standard Edition Oracle RAC?
No. Oracle RAC One Node is only part of Oracle Database 11g Release 2 Enterprise Edition. It is not licensed or supported for use with any other editions.

What is RAC One Node Omotion?
Omotion is a utility that is distributed as part of Oracle RAC One Node. The Omotion utility allows you to move the Oracle RAC One Node instance from one node to another in the cluster. There are several reasons you may want to move the instance: the node is overloaded, so you need to balance the workload by moving the instance; or you need to do some operating system maintenance on the node but want to eliminate the outage for application users by moving the instance to another node in the cluster.

If the root.sh script fails on a node during the install of the Grid Infrastructure with Oracle Database 11g Release 2, can I re-run it?
Yes, however you should first fix the problem that caused it to fail.
Run /crs/install/rootcrs.pl -delete -force
Rerun root.sh

I am only using 1 of 3 SCAN IP addresses, why?
SCAN must be set up to round robin across 3 IP addresses; this requires you to define it in your DNS or the GNS. If you use /etc/hosts, you will only get 1 SCAN.
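
The difference between the two name sources can be illustrated with a toy model. The Python sketch below is a simplification under stated assumptions, not how Oracle Clusterware or any real resolver is implemented: it assumes a hosts-file lookup returns only the first matching entry, and that a DNS or GNS name with three addresses is answered in rotating order. The function and class names are invented, and the addresses are the sample SCAN addresses used in this answer.

```python
from collections import deque


def hosts_lookup(hosts_text, name):
    """Simplified hosts-file lookup: the first line that lists `name` wins."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


class RoundRobinName:
    """Toy stand-in for a DNS/GNS name whose addresses rotate on each query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = self._addresses[0]
        self._addresses.rotate(-1)  # the next query gets the next address
        return answer


HOSTS = """\
# SCAN addr
192.21.101.74 rac16-cluster.example.com rac16-cluster
192.21.101.75 rac16-cluster.example.com rac16-cluster
192.21.101.76 rac16-cluster.example.com rac16-cluster
"""

# Distinct addresses seen over six lookups with each name source.
from_hosts = {hosts_lookup(HOSTS, "rac16-cluster") for _ in range(6)}
scan_name = RoundRobinName(["192.21.101.74", "192.21.101.75", "192.21.101.76"])
from_dns = {scan_name.resolve() for _ in range(6)}

print(len(from_hosts), len(from_dns))  # prints "1 3"
```

In this model the hosts file always yields the same single address however many entries it contains, while the round-robin name spreads lookups over all three.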

If your /etc/hosts looks like this:
# SCAN addr
192.21.101.74 rac16-cluster.example.com rac16-cluster
192.21.101.75 rac16-cluster.example.com rac16-cluster
192.21.101.76 rac16-cluster.example.com rac16-cluster
You will only get 1 SCAN VIP and 1 SCAN LISTENER.
Also the REMOTE_LISTENER parameter must be set to SCAN:PORT (not to a TNSNAMES alias).

Is it recommended that we put the OCR/Voting disk on ASM disk and, if so, is it preferable to create a separate disk group for them?
With 11g Release 2, it is recommended to put the OCR and Voting Disks in ASM, using the same diskgroup you use for your data. For OCR, put an OCR in two different disk groups, for full coverage.

How do I explain for a customer who is concerned about the phrase in the following doc?
Oracle® Clusterware Administration and Deployment Guide 11g Release 2 (11.2) E10717-04 2-27
"If Oracle ASM fails, then OCR is not accessible on the node on which Oracle ASM failed, but the cluster remains operational. The entire cluster only fails if the Oracle ASM instance on the OCR master node fails, if the majority of the OCR locations are in Oracle ASM, and if there is an OCR read or write access, then the crsd stops and the node becomes inoperative."
This was a documentation bug and has been fixed. Here is the updated write up (posted in the online version).
"If an Oracle ASM instance fails on any node, then OCR becomes unavailable on that particular node. If the crsd process running on the node affected by the Oracle ASM instance failure is the OCR writer, the majority of the OCR locations are stored in Oracle ASM, and you attempt I/O on OCR during the time the Oracle ASM instance is down on this node, then crsd stops and becomes inoperable. Cluster management is now affected on this particular node. Under no circumstances will the failure of one Oracle ASM instance on one node affect the whole cluster."

Is the GNS recommended for most Oracle RAC installations?

The Grid Naming Service (GNS) is a part of the Grid Plug and Play feature of Oracle RAC 11g Release 2. It provides name resolution for the cluster. If you have a larger cluster (greater than 4-6 nodes) or a requirement to have a dynamic cluster (you expect to add or remove nodes in the cluster), then you should implement GNS. If you are implementing a small cluster of 4 nodes or less, you do not need to add GNS.
Note: Selecting GNS during install assumes that you have a DHCP server running on the public subnet where Oracle Clusterware can obtain IP addresses for the Node VIPs and the SCAN VIPs.

What is Cluster Health Monitor (IPD/OS)?
This tool (formerly known as Instantaneous Problem Detection tool) is designed to detect and analyze operating system (OS) and cluster resource related degradation and failures in order to bring more explanatory power to many issues that occur in clusters where Oracle Clusterware and Oracle RAC are running, such as node eviction. It tracks the OS resource consumption at each node, process, and device level continuously. It collects and analyzes the cluster-wide data. In real time mode, when thresholds are hit, an alert is shown to the operator. For root cause analysis, historical data can be replayed to understand what was happening at the time of failure.
For more information on Cluster Health Monitor (IPD/OS), see this publicly available Technical White Paper on OTN: Overview of Cluster Health Monitor (IPD/OS)

What OS does Cluster Health Monitor (IPD/OS) support?
Cluster Health Monitor (IPD/OS) is a standalone tool that should be installed on all clusters where you are using Oracle Real Application Clusters (RAC). It is independent of the Oracle Database or Oracle Clusterware version used. Cluster Health Monitor (IPD/OS) is currently supported on Linux (requires Linux Kernel version greater than or equal to 2.6.9) and Windows (requires at least Windows Server 2003 with service pack 2). It supports both 32-bit and 64-bit installations. The client installation requires the 32-bit Java SDK.

Related Products
• Oracle Database Products > Oracle Database > Oracle Database > Oracle Server - Enterprise Edition

Errors
CRS-215, ORA-1031, ORA-3113, ORA-29740, PRKP-1001, RFC-1918, 01031 ERROR, 3113 ERROR, 29740 ERROR, 273120 ERROR, 721236 ERROR
