Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireWire

http://www.oracle.com/technology/pub/articles/hunter_rac10gr2.html?_...

DBA: Linux

Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireWire
by Jeffrey Hunter

Learn how to set up and configure an Oracle RAC 10g Release 2 development cluster for less than US$1,800.
Updated December 2005

Contents
1. Introduction
2. Oracle RAC 10g Overview
3. Shared-Storage Overview
4. FireWire Technology
5. Hardware & Costs
6. Install the Linux Operating System
7. Network Configuration
8. Obtain & Install New Linux Kernel / FireWire Modules
9. Create "oracle" User and Directories
10. Create Partitions on the Shared FireWire Storage Device
11. Configure the Linux Servers for Oracle
12. Configure the hangcheck-timer Kernel Module
13. Configure RAC Nodes for Remote Access
14. All Startup Commands for Each RAC Node
15. Check RPM Packages for Oracle 10g Release 2
16. Install & Configure Oracle Cluster File System (OCFS2)
17. Install & Configure Automatic Storage Management (ASMLib 2.0)
18. Download Oracle 10g RAC Software
19. Install Oracle 10g Clusterware Software
20. Install Oracle 10g Database Software
21. Create TNS Listener Process
22. Install Oracle10g Companion CD Software
23. Create the Oracle Cluster Database
24. Verify TNS Networking Files
25. Create / Alter Tablespaces
26. Verify the RAC Cluster & Database Configuration
27. Starting / Stopping the Cluster
28. Transparent Application Failover - (TAF)
29. Conclusion
30. Acknowledgements

Downloads for this guide:
  CentOS Enterprise Linux 4.2 or Red Hat Enterprise Linux 4
  Oracle Cluster File System V2 - (1.0.4-1)
  Oracle Cluster File System V2 Tools - (1.0.4-1)
  Oracle Database 10g Release 2 EE, Clusterware, Companion CD - (10.2.0.1.0)
  Precompiled RHEL 4 Kernel - (2.6.9-11.0.0.10.3.EL)
  Precompiled RHEL 4 FireWire Modules - (2.6.9-11.0.0.10.3.EL)
  ASMLib 2.0 Library and Tools
  ASMLib 2.0 Driver - Single Processor / SMP

1. Introduction
One of the most efficient ways to become familiar with Oracle Real Application Clusters (RAC) 10g technology is to have access to an actual Oracle RAC 10g cluster. There's no better way to understand its benefits—including fault tolerance, security, load balancing, and scalability—than to experience them directly. Unfortunately, for many shops, the price of the hardware required for a typical production RAC configuration makes this goal impossible. A small two-node cluster can cost from US$10,000 to well over US$20,000. That cost would not even include the heart of a production RAC environment—typically a storage area network—which can start at US$8,000. For those who want to become familiar with Oracle RAC 10g without a major cash outlay, this guide provides a low-cost alternative to configuring an Oracle RAC 10g Release 2 system using commercial off-the-shelf components and downloadable software at an estimated cost of US$1,200 to US$1,800. The system involved comprises a dual-node cluster (each with a single processor) running Linux (CentOS 4.2 or Red Hat Enterprise Linux 4) with a shared disk storage based on IEEE1394 (FireWire) drive technology. (Of course, you could also consider building a virtual cluster on a VMware Virtual Machine, but the experience won't quite be the same!)


Please note that this is not the only way to build a low-cost Oracle RAC 10g system. I have seen other solutions that utilize an implementation based on SCSI rather than FireWire for shared storage. In most cases, SCSI will cost more than our FireWire solution: a typical SCSI card is priced around US$70 and an 80GB external SCSI drive will cost US$700-US$1,000. Keep in mind that some motherboards may already include built-in SCSI controllers.

It is important to note that this configuration should never be run in a production environment and that it is not supported by Oracle or any other vendor. In a production environment, fibre channel (the high-speed serial-transfer interface that can connect systems and storage devices in either point-to-point or switched topologies) is the technology of choice. FireWire offers a low-cost alternative to fibre channel for testing and development, but it is not ready for production.

The Oracle9i and Oracle 10g Release 1 guides used raw partitions for storing files on shared storage, but here we will make use of the Oracle Cluster File System Release 2 (OCFS2) and the Oracle Automatic Storage Management (ASM) feature. The two Linux servers will be configured as follows:

Oracle Database Files

RAC Node Name   Instance Name   Database Name   $ORACLE_BASE      File System / Volume Manager for DB Files
linux1          orcl1           orcl            /u01/app/oracle   ASM
linux2          orcl2           orcl            /u01/app/oracle   ASM

Oracle Clusterware Shared Files

File Type                 File Name                   Partition   Mount Point         File System
Oracle Cluster Registry   /u02/oradata/orcl/OCRFile   /dev/sda1   /u02/oradata/orcl   OCFS2
CRS Voting Disk           /u02/oradata/orcl/CSSFile   /dev/sda1   /u02/oradata/orcl   OCFS2

Note that with Oracle Database 10g Release 2 (10.2), Cluster Ready Services, or CRS, is now called Oracle Clusterware. The Oracle Clusterware software will be installed to /u01/app/oracle/product/crs on each of the nodes that make up the RAC cluster. However, the Clusterware software requires that two of its files—the Oracle Cluster Registry (OCR) file and the Voting Disk file—be shared with all nodes in the cluster. These two files will be installed on shared storage using OCFS2. It is possible (but not recommended by Oracle) to use RAW devices for these files; however, it is not possible to use ASM for these two Clusterware files. The Oracle Database 10g Release 2 software will be installed into a separate Oracle Home, namely /u01/app/oracle/product/10.2.0/db_1, on each of the nodes that make up the RAC cluster. All the Oracle physical database files (data, online redo logs, control files, archived redo logs), will be installed to different partitions of the shared drive being managed by ASM. (The Oracle database files can just as easily be stored on OCFS2. Using ASM, however, makes the article that much more interesting!) Note: This article is only designed to work as documented with absolutely no substitutions. If you are looking for an example that takes advantage of Oracle RAC 10g Release 1 with RHEL 3, click here. For the previously published Oracle9i RAC version of this guide, click here.
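The install locations named above can be collected in one place as environment variables. The following is only a minimal sketch: ORACLE_BASE and ORACLE_HOME are standard Oracle variable names, while ORA_CRS_HOME is a naming convention assumed here for the Clusterware home, not something this section defines.

```shell
#!/bin/sh
# Install locations taken from the text above.
# ORACLE_BASE and ORACLE_HOME are standard Oracle environment variables;
# ORA_CRS_HOME is an assumed convention for the Clusterware home.
ORACLE_BASE=/u01/app/oracle
ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
ORA_CRS_HOME=$ORACLE_BASE/product/crs
export ORACLE_BASE ORACLE_HOME ORA_CRS_HOME

echo "Database home:    $ORACLE_HOME"
echo "Clusterware home: $ORA_CRS_HOME"
```

Keeping both homes under the same base makes it easy to see that the Clusterware and database software are installed into separate directories on each node.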

2. Oracle RAC 10g Overview
Oracle RAC, introduced with Oracle9i, is the successor to Oracle Parallel Server (OPS). RAC allows multiple instances to access the same database (storage) simultaneously. It provides fault tolerance, load balancing, and performance benefits by allowing the system to scale out. At the same time, because all nodes access the same database, the failure of one instance will not cause the loss of access to the database.

At the heart of Oracle RAC is a shared disk subsystem. All nodes in the cluster must be able to access all of the data, redo log files, control files and parameter files for all nodes in the cluster. The data disks must be globally available to allow all nodes to access the database. Each instance has its own online redo log files (the control files are shared by all instances), but the other nodes must be able to access those redo logs in order to recover that instance in the event of a system failure.

One of the bigger differences between Oracle RAC and OPS is the presence of Cache Fusion technology. In OPS, a request for data between nodes required the data to be written to disk first, and then the requesting node could read that data. With Cache Fusion, data is passed along a high-speed interconnect using a sophisticated locking algorithm.

Not all clustering solutions use shared storage. Some vendors use an approach known as a federated cluster, in which data is spread across several machines rather than shared by all. With Oracle RAC 10g, however, multiple nodes use the same set of disks for storing data. With Oracle RAC, the data files, redo log files, control files, and archived log files reside on shared storage on raw-disk devices, a NAS, a SAN, ASM, or on a clustered file system. Oracle's approach to clustering leverages the collective processing power of all the nodes in the cluster and at the same time provides failover security.

For more background about Oracle RAC, visit the Oracle RAC Product Center on OTN.

3. Shared-Storage Overview
Fibre Channel is one of the most popular solutions for shared storage. As I mentioned previously, Fibre Channel is a high-speed serial-transfer interface used to connect systems and storage devices in either point-to-point or switched topologies. Protocols supported by Fibre Channel include SCSI and IP. Fibre Channel configurations can support as many as 127 nodes and have a throughput of up to 2.12 gigabits per second. Fibre Channel, however, is very expensive; the switch alone can start at US$1,000 and high-end drives can reach prices of US$300. Overall, a typical Fibre Channel setup (including cards for the servers) costs roughly US$8,000.


A less expensive alternative to Fibre Channel is SCSI. SCSI technology provides acceptable performance for shared storage, but for administrators and developers who are used to GPL-based Linux prices, even SCSI can come in over budget at around US$2,000 to US$5,000 for a two-node cluster. Another popular solution is the Sun NFS (Network File System) found on a NAS. It can be used for shared storage but only if you are using a network appliance or something similar. Specifically, you need servers that guarantee direct I/O over NFS, TCP as the transport protocol, and read/write block sizes of 32K.
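As a rough illustration of the last two NFS requirements (TCP transport and 32K read/write block sizes), the sketch below builds the corresponding mount options. The helper name nfs_opts, the server export nas1:/vol/oradata, and the mount point /mnt/oradata are placeholders invented for this example. Direct I/O is a property of the NFS server and the database configuration and is not expressed by these options.

```shell
#!/bin/sh
# Build NFS mount options for TCP transport and a given block size.
# nfs_opts is a hypothetical helper; nas1:/vol/oradata and /mnt/oradata
# below are placeholder names, not values from this guide.
nfs_opts() {
    block=$(( $1 * 1024 ))                 # block size: KB -> bytes
    echo "rw,tcp,rsize=${block},wsize=${block}"
}

# Print (rather than run) the resulting mount command.
echo "mount -t nfs -o $(nfs_opts 32) nas1:/vol/oradata /mnt/oradata"
```

The command is only printed so that the options can be reviewed before anything is mounted.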

4. FireWire Technology
Developed by Apple Computer and Texas Instruments, FireWire is a cross-platform implementation of a high-speed serial data bus. With its high bandwidth, long distances (up to 100 meters in length) and high-powered bus, FireWire is being used in applications such as digital video (DV), professional audio, hard drives, high-end digital still cameras and home entertainment devices. Today, FireWire operates at transfer rates of up to 800 megabits per second, while next-generation FireWire calls for a theoretical bit rate of 1,600 Mbps and then up to a staggering 3,200 Mbps. That's 3.2 gigabits per second. This speed will make FireWire indispensable for transferring massive data files and for even the most demanding video applications, such as working with uncompressed high-definition (HD) video or multiple standard-definition (SD) video streams.

The following chart shows speed comparisons of the various types of disk interfaces. For each interface, I provide the maximum transfer rates in kilobits (Kb), kilobytes (KB), megabits (Mb), megabytes (MB), and gigabits (Gb) per second. As you can see, the capabilities of IEEE1394 compare very favorably with other available disk interface technologies.

                                                         Speed (per second)
Disk Interface                                   Kb     KB       Mb      MB      Gb
Serial                                           115    14.375   0.115   0.014
Parallel (standard)                              920    115      0.92    0.115
USB 1.1                                                          12      1.5
Parallel (ECP/EPP)                                               24      3
SCSI-1                                                           40      5
SCSI-2 (Fast SCSI / Fast Narrow SCSI)                            80      10
ATA/100 (parallel)                                               100     12.5
IDE                                                              133.6   16.7
Fast Wide SCSI (Wide SCSI)                                       160     20
Ultra SCSI (SCSI-3 / Fast-20 / Ultra Narrow)                     160     20
Ultra IDE                                                        264     33
Wide Ultra SCSI (Fast Wide 20)                                   320     40
Ultra2 SCSI                                                      320     40
FireWire 400 - IEEE1394(a)                                       400     50
USB 2.0                                                          480     60
Wide Ultra2 SCSI                                                 640     80
Ultra3 SCSI                                                      640     80
FireWire 800 - IEEE1394(b)                                       800     100
Serial ATA - (SATA)                                              1200    150     1.2
Wide Ultra3 SCSI                                                 1280    160     1.28
Ultra160 SCSI                                                    1280    160     1.28
Ultra Serial ATA 1500                                            1500    187.5   1.5
Ultra320 SCSI                                                    2560    320     2.56
FC-AL Fibre Channel                                              3200    400     3.2
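The Mb and MB columns in the chart differ only by a factor of eight (eight bits per byte). A small sketch of that conversion follows; mbit_to_mbyte is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Convert a transfer rate in megabits per second (Mb/s) to megabytes per
# second (MB/s). One byte is eight bits, so the conversion divides by 8.
# mbit_to_mbyte is a hypothetical helper name, not part of any tool.
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 8 }'
}

# A few rows from the chart, for comparison.
for rate in 12 400 480 800 3200; do
    printf '%5s Mb/s = %s MB/s\n' "$rate" "$(mbit_to_mbyte "$rate")"
done
```

For FireWire 400, for example, this gives 400 / 8 = 50 MB per second, which matches the chart.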

5. Hardware & Costs
The hardware we will use to build our example Oracle RAC 10g environment comprises two Linux servers and components that you can purchase at any local computer store or over the Internet.

Server 1 - (linux1)

  Dimension 2400 Series (US$620)
    Intel Pentium 4 Processor at 2.80GHz
    1GB DDR SDRAM (at 333MHz)
    40GB 7200 RPM Internal Hard Drive
    Integrated Intel 3D AGP Graphics
    Integrated 10/100 Ethernet
    CDROM (48X Max Variable)
    3.5" Floppy
    No monitor (already had one)
    USB Mouse and Keyboard

  1 - Ethernet LAN Card (US$20)
    Linksys 10/100 Mbps - (Used for Interconnect to linux2)
    Each Linux server should contain two NIC adapters. The Dell Dimension includes an integrated 10/100 Ethernet adapter that will be used to connect to the public network. The second NIC adapter will be used for the private interconnect.

  1 - FireWire Card (US$30)
    SIIG, Inc. 3-Port 1394 I/O Card
    Cards with chipsets made by VIA or TI are known to work. In addition to the SIIG, Inc. 3-Port 1394 I/O Card, I have also successfully used the Belkin FireWire 3-Port 1394 PCI Card and StarTech 4 Port IEEE-1394 PCI Firewire Card I/O cards.

Server 2 - (linux2)

  Dimension 2400 Series (US$620)
    Intel Pentium 4 Processor at 2.80GHz
    1GB DDR SDRAM (at 333MHz)
    40GB 7200 RPM Internal Hard Drive
    Integrated Intel 3D AGP Graphics
    Integrated 10/100 Ethernet
    CDROM (48X Max Variable)
    3.5" Floppy
    No monitor (already had one)
    USB Mouse and Keyboard

  1 - Ethernet LAN Card (US$20)
    Linksys 10/100 Mbps - (Used for Interconnect to linux1)
    Each Linux server should contain two NIC adapters. The Dell Dimension includes an integrated 10/100 Ethernet adapter that will be used to connect to the public network. The second NIC adapter will be used for the private interconnect.

  1 - FireWire Card (US$30)
    SIIG, Inc. 3-Port 1394 I/O Card
    Cards with chipsets made by VIA or TI are known to work. In addition to the SIIG, Inc. 3-Port 1394 I/O Card, I have also successfully used the Belkin FireWire 3-Port 1394 PCI Card and StarTech 4 Port IEEE-1394 PCI Firewire Card I/O cards.

Miscellaneous Components

  FireWire Hard Drive (US$280)
    Maxtor OneTouch II 300GB USB 2.0 / IEEE 1394a External Hard Drive - (E01G300)
    Ensure that the FireWire drive that you purchase supports multiple logins. If the drive has a chipset that does not allow for concurrent access for more than one server, the disk and its partitions can only be seen by one server at a time. Disks with the Oxford 911 chipset are known to work. Here are the details about the disk that I purchased for this test:
      Vendor: Maxtor
      Model: OneTouch II
      Mfg. Part No. or KIT No.: E01G300
      Capacity: 300 GB
      Cache Buffer: 16 MB
      Spin Rate: 7200 RPM
      Interface Transfer Rate: 400 Mbits/s
      "Combo" Interface: IEEE 1394 / USB 2.0 and USB 1.1 compatible


The following is a list of FireWire drives (and enclosures) that contain the correct chipset, allow for multiple logins and should work with this article (no guarantees however):

  Maxtor OneTouch II 300GB USB 2.0 / IEEE 1394a External Hard Drive - (E01G300)
  Maxtor OneTouch II 250GB USB 2.0 / IEEE 1394a External Hard Drive - (E01G250)
  Maxtor OneTouch II 200GB USB 2.0 / IEEE 1394a External Hard Drive - (E01A200)
  LaCie Hard Drive, Design by F.A. Porsche 250GB, FireWire 400 - (300703U)
  LaCie Hard Drive, Design by F.A. Porsche 160GB, FireWire 400 - (300702U)
  LaCie Hard Drive, Design by F.A. Porsche 80GB, FireWire 400 - (300699U)
  Dual Link Drive Kit, FireWire Enclosure, ADS Technologies - (DLX185)
    Maxtor Ultra 200GB ATA-133 (Internal) Hard Drive
  Maxtor OneTouch 250GB USB 2.0 / IEEE 1394a External Hard Drive - (A01A250)
  Maxtor OneTouch 200GB USB 2.0 / IEEE 1394a External Hard Drive - (A01A200)

  1 - Extra FireWire Cable (US$20)
    Belkin 6-pin to 6-pin 1394 Cable

  1 - Ethernet hub or switch (US$25)
    Linksys EtherFast 10/100 5-port Ethernet Switch (Used for interconnect int-linux1 / int-linux2)

  4 - Network Cables
    Category 5e patch cable - (Connect linux1 to public network)                (US$5)
    Category 5e patch cable - (Connect linux2 to public network)                (US$5)
    Category 5e patch cable - (Connect linux1 to interconnect ethernet switch)  (US$5)
    Category 5e patch cable - (Connect linux2 to interconnect ethernet switch)  (US$5)

Total: US$1,685
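The total can be checked by adding up the individual prices listed above. The sketch below does that arithmetic; sum_costs is a hypothetical helper name, and the prices are copied from the hardware list.

```shell
#!/bin/sh
# Sum the component prices listed above (US$).
# sum_costs is a hypothetical helper name used only for this check.
sum_costs() {
    total=0
    for price in "$@"; do
        total=$((total + price))
    done
    echo "$total"
}

per_server=$(sum_costs 620 20 30)        # Dell server, NIC, FireWire card
shared=$(sum_costs 280 20 25 5 5 5 5)    # drive, FireWire cable, switch, 4 patch cables
echo "Per server: US\$$per_server"
echo "Shared:     US\$$shared"
echo "Total:      US\$$(sum_costs "$per_server" "$per_server" "$shared")"
```

Each server comes to US$670 and the shared components to US$345, so two servers plus the shared components give the US$1,685 quoted above.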

Note that the Maxtor OneTouch external drive does have two IEEE1394 (FireWire) ports, although it may not appear so at first glance. This is also true for the other external hard drives I have listed above. Also note that although you may be tempted to substitute the Ethernet switch (used for interconnect int-linux1/int-linux2) with a crossover CAT5 cable, I would not recommend this approach. I have found that when using a crossover CAT5 cable for the interconnect, whenever I took one of the PCs down the other PC would detect a "cable unplugged" error, and thus the Cache Fusion network would become unavailable. Now that we know the hardware that will be used in this example, let's take a conceptual look at what the environment looks like:


Figure 1 Architecture

As we start to go into the details of the installation, keep in mind that most tasks will need to be performed on both servers.

6. Install the Linux Operating System
This section provides a summary of the screens used to install the Linux operating system. This guide is designed to work with the Red Hat Enterprise Linux 4 AS/ES (RHEL4) operating environment. An alternative, and what I used for this article, is CentOS 4.2: a free and stable version of the RHEL4 operating environment.

For more detailed installation instructions, it is possible to use the manuals from Red Hat Linux. I would suggest, however, that the instructions I have provided below be used for this configuration.

Before installing the Linux operating system on both nodes, you should have the FireWire and two NIC interfaces (cards) installed.

Also, before starting the installation, ensure that the FireWire drive (our shared storage drive) is NOT connected to either of the two servers. You may also choose to connect both servers to the FireWire drive and simply turn the power off to the drive.

Download the following ISO images for CentOS 4.2:

  CentOS-4.2-i386-bin1of4.iso (618 MB)
  CentOS-4.2-i386-bin2of4.iso (635 MB)
  CentOS-4.2-i386-bin3of4.iso (639 MB)
  CentOS-4.2-i386-bin4of4.iso (217 MB)

After downloading and burning the CentOS images (ISO files) to CD, insert CentOS Disk #1 into the first server (linux1 in this example), power it on, and answer the installation screen prompts as noted below. After completing the Linux installation on the first node, perform the same Linux installation on the second node while substituting the node name linux2 for linux1 and the different IP addresses where appropriate.

Boot Screen
The first screen is the CentOS Enterprise Linux boot screen. At the boot: prompt, hit [Enter] to start the installation process.

Media Test
When asked to test the CD media, tab over to [Skip] and hit [Enter]. If there were any errors, the media burning software would have warned us. After several seconds, the installer should then detect the video card, monitor, and mouse. The installer then goes into GUI mode.

Welcome to CentOS Enterprise Linux
At the welcome screen, click [Next] to continue.

Language / Keyboard Selection
The next two screens prompt you for the Language and Keyboard settings. Make the appropriate selections for your configuration.

Installation Type
Choose the [Custom] option and click [Next] to continue.

Disk Partitioning Setup
Select [Automatically partition] and click [Next] to continue.

If there were a previous installation of Linux on this machine, the next screen will ask if you want to "remove" or "keep" old partitions. Select the option to [Remove all partitions on this system]. Also, ensure that the [hda] drive is selected for this installation. I also keep the checkbox [Review (and modify if needed) the partitions created] selected. Click [Next] to continue.

You will then be prompted with a dialog window asking if you really want to remove all partitions. Click [Yes] to acknowledge this warning.

Partitioning
The installer will then allow you to view (and modify if needed) the disk partitions it automatically selected. In almost all cases, the installer will choose 100MB for /boot, double the amount of RAM for swap, and the rest going to the root (/) partition. I like to have a minimum of 1GB for swap. For the purpose of this install, I will accept all automatically preferred sizes. (Including 2GB for swap since I have 1GB of RAM installed.)

Starting with RHEL 4, the installer will create the same disk configuration as just noted but will create them using the Logical Volume Manager (LVM). For example, it will partition the first hard drive (/dev/hda for my configuration) into two partitions: one for the /boot partition (/dev/hda1) and the remainder of the disk dedicated to an LVM volume group named VolGroup00 (/dev/hda2). The LVM Volume Group (VolGroup00) is then partitioned into two LVM partitions, one for the root filesystem (/) and another for swap. I basically check that it created at least 1GB of swap. Since I have 1GB of RAM installed, the installer created 2GB of swap. Saying that, I just accept the default disk layout.

Boot Loader Configuration
The installer will use the GRUB boot loader by default. To use the GRUB boot loader, accept all default values and click [Next] to continue.

Network Configuration
I made sure to install both NIC interfaces (cards) in each of the Linux machines before starting the operating system installation. This screen should have successfully detected each of the network devices.

First, make sure that each of the network devices is checked to [Active on boot]. The installer may choose to not activate eth1.

Second, [Edit] both eth0 and eth1 as follows. You may choose to use different IP addresses for both eth0 and eth1 and that is OK. If possible, try to put eth1 (the interconnect) on a different subnet than eth0 (the public network):

eth0:
  - Uncheck the option [Configure using DHCP]
  - Leave the [Activate on boot] checked
  - IP Address: 192.168.1.100
  - Netmask: 255.255.255.0

eth1:
  - Uncheck the option [Configure using DHCP]
  - Leave the [Activate on boot] checked
  - IP Address: 192.168.2.100
  - Netmask: 255.255.255.0

Continue by setting your hostname manually. I used "linux1" for the first node and "linux2" for the second one. Finish this dialog off by supplying your gateway and DNS servers.

Firewall
On this screen, make sure to select [No firewall] and click [Next] to continue. You may be prompted with a warning dialog about not setting the firewall. If this occurs, simply hit [Proceed] to continue.

Additional Language Support/Time Zone
The next two screens allow you to select additional language support and time zone information. In almost all cases, you can accept the defaults.

Set Root Password
Select a root password and click [Next] to continue.

Package Group Selection
Scroll down to the bottom of this screen and select [Everything] under the "Miscellaneous" section. Click [Next] to continue.

Please note that the installation of Oracle does not require all Linux packages to be installed. My decision to install all packages was for the sake of brevity. Please see Section 15 ("Check RPM Packages for Oracle 10g Release 2") for a more detailed look at the critical packages required for a successful Oracle installation.

Note that with some RHEL4 distributions, you will not get the "Package Group Selection" screen by default. There, you are asked to simply "Install default software packages" or "Customize software packages to be installed". Select the option to "Customize software packages to be installed" and click [Next] to continue. This will then bring up the "Package Group Selection" screen. Now, scroll down to the bottom of this screen and select [Everything] under the "Miscellaneous" section. Click [Next] to continue.

About to Install
This screen is basically a confirmation screen. Click [Next] to start the installation. During the installation process, you will be asked to switch disks to Disk #2, Disk #3, and then Disk #4. Click [Continue] to start the installation process.

Note that with CentOS 4.2, the installer will ask to switch to Disk #2, Disk #3, Disk #4, Disk #1, and then back to Disk #4.

Graphical Interface (X) Configuration
With most RHEL4 distributions (not the case with CentOS 4.2), when the installation is complete, the installer will attempt to detect your video hardware. Ensure that the installer has detected and selected the correct video hardware (graphics card and monitor) to properly use the X Windows server. You will continue with the X configuration in the next several screens.

Congratulations
And that's it. You have successfully installed CentOS Enterprise Linux on the first node (linux1). The installer will eject the CD from the CD-ROM drive. Take out the CD and click [Exit] to reboot the system.

When the system boots into Linux for the first time, it will prompt you with another Welcome screen. The following wizard allows you to configure the date and time, add any additional users, test the sound card, and install any additional CDs. The only screen I care about is the time and date (and if you are using CentOS 4.x, the monitor/display settings). As for the others, simply run through them as there is nothing additional that needs to be installed (at this point anyways!). If everything was successful, you should now be presented with the login screen.

Perform the same installation on the second node
After completing the Linux installation on the first node, repeat the above steps for the second node (linux2). When configuring the machine name and networking, be sure to configure the proper values. For my installation, this is what I configured for linux2:

First, make sure that each of the network devices is checked to [Active on boot]. The installer will choose not to activate eth1.

Second, [Edit] both eth0 and eth1 as follows:

eth0:
  - Uncheck the option [Configure using DHCP]
  - Leave the [Activate on boot] checked
  - IP Address: 192.168.1.101
  - Netmask: 255.255.255.0

eth1:
  - Uncheck the option [Configure using DHCP]
  - Leave the [Activate on boot] checked
  - IP Address: 192.168.2.101
  - Netmask: 255.255.255.0

Continue by setting your hostname manually. I used "linux2" for the second node. Finish this dialog off by supplying your gateway and DNS servers.

7. Network Configuration
Perform the following network configuration on all nodes in the cluster!

Note: Although we configured several of the network settings during the Linux installation, it is important to not skip this section as it contains critical steps that are required for the RAC environment.

Introduction to Network Settings
During the Linux O/S install you already configured the IP address and host name for each of the nodes. You now need to configure the /etc/hosts file as well as adjust several of the network settings for the interconnect. I also include instructions for enabling Telnet and FTP services.

Each node should have one static IP address for the public network and one static IP address for the private cluster interconnect. The private interconnect should only be used by Oracle to transfer Cluster Manager and Cache Fusion related data. Although it is possible to use the public network for the interconnect, this is not recommended as it may cause degraded database performance (reducing the amount of bandwidth for Cache Fusion and Cluster Manager traffic). For a production RAC implementation, the interconnect should be at least gigabit and only be used by Oracle.

Configuring Public and Private Network
In our two-node example, you need to configure the network on both nodes for access to the public network as well as their private interconnect.

The easiest way to configure network settings in RHEL4 is with the Network Configuration program. This application can be started from the command-line as the root user account as follows:

  # su
  # /usr/bin/system-config-network &

Do not use DHCP naming for the public IP address or the interconnects; you need static IP addresses!

Using the Network Configuration application, you need to configure both NIC devices as well as the /etc/hosts file. Both of these tasks can be completed using the Network Configuration GUI. Notice that the /etc/hosts settings are the same for both nodes.

Our example configuration will use the following settings:

Server 1 (linux1)

Device   IP Address      Subnet          Purpose
eth0     192.168.1.100   255.255.255.0   Connects linux1 to the public network
eth1     192.168.2.100   255.255.255.0   Connects linux1 (interconnect) to linux2 (int-linux2)

/etc/hosts

  127.0.0.1        localhost loopback

  # Public Network - (eth0)
  192.168.1.100    linux1
  192.168.1.101    linux2

  # Private Interconnect - (eth1)
  192.168.2.100    int-linux1
  192.168.2.101    int-linux2

  # Public Virtual IP (VIP) addresses for - (eth0)
  192.168.1.200    vip-linux1
  192.168.1.201    vip-linux2

Server 2 (linux2)

Device   IP Address      Subnet          Purpose
eth0     192.168.1.101   255.255.255.0   Connects linux2 to the public network
eth1     192.168.2.101   255.255.255.0   Connects linux2 (interconnect) to linux1 (int-linux1)

/etc/hosts

  127.0.0.1        localhost loopback

  # Public Network - (eth0)
  192.168.1.100    linux1
  192.168.1.101    linux2

  # Private Interconnect - (eth1)
  192.168.2.100    int-linux1
  192.168.2.101    int-linux2

  # Public Virtual IP (VIP) addresses for - (eth0)
  192.168.1.200    vip-linux1
  192.168.1.201    vip-linux2

Note that the virtual IP addresses only need to be defined in the /etc/hosts file (or your DNS) for both nodes. The public virtual IP addresses will be configured automatically by Oracle when you run the Oracle Universal Installer, which starts Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA). All virtual IP addresses will be activated when the srvctl start nodeapps -n <node_name> command is run. This is the Host Name/IP Address that will be configured in the client(s) tnsnames.ora file (more details later).

In the screenshots below, only node 1 (linux1) is shown. Be sure to make all the proper network settings to both nodes.
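Because the same six names must be present in /etc/hosts on both nodes, a quick scripted check can catch a missing entry before the Oracle installers run. The following is a minimal sketch, not part of the original procedure; check_hosts is a hypothetical helper name and the list of names simply mirrors the example entries above.

```shell
#!/bin/sh
# Check that every host name the cluster relies on appears in a hosts file.
# check_hosts is a hypothetical helper; the name list mirrors the example
# /etc/hosts entries above. Returns 0 when all names are present.
check_hosts() {
    hosts_file=$1
    missing=0
    for name in linux1 linux2 int-linux1 int-linux2 vip-linux1 vip-linux2; do
        # Strip comments, then look for the name in any column after the IP address.
        if ! sed 's/#.*//' "$hosts_file" |
             awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                               END { exit !found }'; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}

# Usage on each node:
#   check_hosts /etc/hosts && echo "all cluster names present"
```

The comparison is made on whole fields, so an entry for int-linux1 does not count as an entry for linux1.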

.oracle. Node 1 (linux1) Figure 3 Ethernet Device Screen. eth0 (linux1) 10 of 19 1/4/2006 8:44 AM .com/technology/pub/articles/hunter_rac10gr2..html?_. Figure 2 Network Configuration Screen.Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireWire http://www.

Figure 4: Ethernet Device Screen, eth1 (linux1)

Figure 5: Network Configuration Screen, /etc/hosts (linux1)

When the network is configured, you can use the ifconfig command to verify everything is working. The following example is from linux1:

    $ /sbin/ifconfig -a
    eth0      Link encap:Ethernet  HWaddr 00:0D:56:FC:39:EC
              inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
              inet6 addr: fe80::20d:56ff:fefc:39ec/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:835 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1983 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:705714 (689.1 KiB)  TX bytes:176892 (172.7 KiB)
              Interrupt:3

    eth1      Link encap:Ethernet  HWaddr 00:0C:41:E8:05:37
              inet addr:192.168.2.100  Bcast:192.168.2.255  Mask:255.255.255.0
              inet6 addr: fe80::20c:41ff:fee8:537/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 b)  TX bytes:546 (546.0 b)
              Interrupt:11 Base address:0xe400

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:5110 errors:0 dropped:0 overruns:0 frame:0
              TX packets:5110 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:8276758 (7.8 MiB)  TX bytes:8276758 (7.8 MiB)

    sit0      Link encap:IPv6-in-IPv4
              NOARP  MTU:1480  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

About Virtual IP

Why is there a Virtual IP (VIP) in 10g? Why does it just return a dead connection when its primary node fails?

It's all about availability of the application. When a node fails, the VIP associated with it is supposed to be automatically failed over to some other node. When this occurs, two things happen.

1. The new node re-arps the world indicating a new MAC address for the address. For directly connected clients, this usually causes them to see errors on their connections to the old address.

2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

This means that when the client issues SQL to the node that is now down, or traverses the address list while connecting, the client receives a TCP reset rather than waiting on a very long TCP/IP time-out (~10 minutes). In the case of SQL, this is ORA-3113. In the case of connect, the next address in tnsnames is used.

Going one step further is making use of Transparent Application Failover (TAF). With TAF successfully configured, it is possible to completely avoid ORA-3113 errors altogether! TAF will be discussed in more detail in Section 28 ("Transparent Application Failover - (TAF)").

Without using VIPs, clients connected to a node that died will often wait a 10-minute TCP timeout period before getting an error. As a result, you don't really have a good HA solution without using VIPs (Source - Metalink Note 220970.1).

Confirm the RAC Node Name is Not Listed in Loopback Address

Ensure that the node names (linux1 or linux2) are not included for the loopback address in the /etc/hosts file. If the machine name is listed in the loopback address entry as below:

    127.0.0.1 linux1 localhost.localdomain localhost

it will need to be removed as shown below:

    127.0.0.1 localhost.localdomain localhost

If the RAC node name is listed for the loopback address, you will receive the following error during the RAC installation:

    ORA-00603: ORACLE server session terminated by fatal error

or

    ORA-29702: error occurred in Cluster Group Service operation
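The loopback check described above can be scripted so it is easy to repeat on each node. What follows is only a minimal sketch, assuming the usual /etc/hosts layout (address first, then names); the function name node_on_loopback is hypothetical and not part of any Oracle or Linux tool.

```shell
#!/bin/sh
# node_on_loopback HOSTS_FILE NODE   (hypothetical helper)
# Exits 0 when NODE appears as a name on a 127.0.0.1 line of HOSTS_FILE.
node_on_loopback() {
    awk -v node="$2" '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # stop at a trailing comment
                if ($i == node) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Usage on a RAC node:
#   node_on_loopback /etc/hosts "$(uname -n)" && echo "remove node name from loopback entry"
```

A zero exit status means the entry still needs to be corrected before starting the RAC installation.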

Adjusting Network Settings

With Oracle 9.2.0.1 and later, Oracle makes use of UDP as the default protocol on Linux for inter-process communication (IPC), such as Cache Fusion and Cluster Manager buffer transfers between instances within the RAC cluster.

Oracle strongly suggests adjusting the default and maximum send buffer size (SO_SNDBUF socket option) to 256KB, and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256KB.

The receive buffers are used by TCP and UDP to hold received data until it is read by the application. For TCP, the receive buffer cannot overflow because the peer is not allowed to send data beyond the buffer size window. UDP has no such flow control, so datagrams will be discarded if they don't fit in the socket receive buffer, and a fast sender can overwhelm the receiver.

The default and maximum window size can be changed in the /proc file system without reboot:

    # su - root
    # sysctl -w net.core.rmem_default=262144
    net.core.rmem_default = 262144
    # sysctl -w net.core.wmem_default=262144
    net.core.wmem_default = 262144
    # sysctl -w net.core.rmem_max=262144
    net.core.rmem_max = 262144
    # sysctl -w net.core.wmem_max=262144
    net.core.wmem_max = 262144

The above commands made the changes to the already running OS. You should now make the above changes permanent (for each reboot) by adding the following lines to the /etc/sysctl.conf file for each node in your RAC cluster:

    # Default setting in bytes of the socket receive buffer
    net.core.rmem_default=262144

    # Default setting in bytes of the socket send buffer
    net.core.wmem_default=262144

    # Maximum socket receive buffer size which may be set by using
    # the SO_RCVBUF socket option
    net.core.rmem_max=262144

    # Maximum socket send buffer size which may be set by using
    # the SO_SNDBUF socket option
    net.core.wmem_max=262144
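Since the same four entries have to be present on every node, it can help to check the file rather than eyeball it. This is a rough sketch only: check_buffer_conf is a hypothetical helper name, and it assumes the plain key=value lines shown above.

```shell
#!/bin/sh
# check_buffer_conf FILE   (hypothetical helper)
# Reports, for each socket buffer key, whether FILE sets it to 262144 (256KB).
WANT=262144

check_buffer_conf() {
    conf=$1
    rc=0
    for key in net.core.rmem_default net.core.wmem_default \
               net.core.rmem_max net.core.wmem_max; do
        # Use the last assignment of the key; spaces around "=" are tolerated.
        val=$(awk -F= -v k="$key" '
            { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
            $1 == k { v = $2 }
            END { print v }' "$conf")
        if [ "$val" = "$WANT" ]; then
            echo "ok      $key=$val"
        else
            echo "MISSING $key (found '$val', want $WANT)"
            rc=1
        fi
    done
    return $rc
}

# Usage: check_buffer_conf /etc/sysctl.conf
```

The function returns non-zero if any of the four keys is absent or set to a different value.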

Enabling Telnet and FTP Services

Linux is configured to run the Telnet and FTP server, but by default, these services are disabled. To enable the telnet service, log in to the server as the root user account and run the following commands:

    # chkconfig telnet on
    # service xinetd reload
    Reloading configuration: [ OK ]

Starting with the Red Hat Enterprise Linux 3.0 release (and in CentOS), the FTP server (wu-ftpd) is no longer available with xinetd. It has been replaced with vsftp and can be started from /etc/init.d/vsftpd as in the following:

    # /etc/init.d/vsftpd start
    Starting vsftpd for vsftpd: [ OK ]

If you want the vsftpd service to start and stop when recycling (rebooting) the machine, you can create the following symbolic links:

    # ln -s /etc/init.d/vsftpd /etc/rc3.d/S56vsftpd
    # ln -s /etc/init.d/vsftpd /etc/rc4.d/S56vsftpd
    # ln -s /etc/init.d/vsftpd /etc/rc5.d/S56vsftpd

8. Obtain & Install New Linux Kernel / FireWire Modules

Perform the following kernel upgrade and FireWire modules install on all nodes in the cluster!

The next step is to obtain and install a new Linux kernel and the FireWire modules that support the use of IEEE1394 devices with multiple logins. This will require two separate downloads and installs: one for the new RHEL4 kernel and a second one that includes the supporting FireWire modules.

In a previous version of this guide, I included the steps to download a patched version of the Linux kernel (source code) and then compile it. Thanks to Oracle's Linux Projects Development Team, this is no longer a requirement. Oracle now provides a pre-compiled kernel for RHEL4 (which also works with CentOS!) that can simply be downloaded and installed. The instructions for downloading and installing the kernel and supporting FireWire modules are included in this section. Before going into the details of how to perform these actions, however, let's take a moment to discuss the changes that are required in the new kernel.

While FireWire drivers already exist for Linux, they often do not support shared storage. Typically when you log on to an OS, the OS associates the driver to a specific drive for that machine alone. This implementation simply will not work for our RAC configuration. The shared storage (our FireWire hard drive) needs to be accessed by more than one node. You need to enable the FireWire driver to provide nonexclusive access to the drive so that multiple servers (the nodes that comprise the cluster) will be able to access the same storage. This goal is accomplished by removing the bit mask that identifies the machine during login in the source code, resulting in nonexclusive access to the FireWire hard drive. All other nodes in the cluster log in to the same drive during their logon session, using the same modified driver, so they too have nonexclusive access to the drive.

The configuration in this guide is a dual-node cluster (each node with a single processor), each server running CentOS Enterprise Linux. Keep in mind that the process of installing the patched Linux kernel and supporting FireWire modules will need to be performed on both Linux nodes. CentOS Enterprise Linux 4.2 includes kernel 2.6.9-22.EL #1. We will need to download the OTN-supplied 2.6.9-11.0.0.10.3.EL #1 Linux kernel and the supporting FireWire modules from the following two URLs:

    RHEL4 Kernels
    FireWire Modules

Download one of the following files for the new RHEL 4 Kernel:

    kernel-2.6.9-11.0.0.10.3.EL.i686.rpm - (for single processor)
  or
    kernel-smp-2.6.9-11.0.0.10.3.EL.i686.rpm - (for multiple processors)

Download one of the following files for the supporting FireWire Modules:

    oracle-firewire-modules-2.6.9-11.0.0.10.3.EL-1286-1.i686.rpm - (for single processor)
  or
    oracle-firewire-modules-2.6.9-11.0.0.10.3.ELsmp-1286-1.i686.rpm - (for multiple processors)

Install the new RHEL 4 kernel, as root:

    # rpm -ivh --force kernel-2.6.9-11.0.0.10.3.EL.i686.rpm - (for single processor)
  or
    # rpm -ivh --force kernel-smp-2.6.9-11.0.0.10.3.EL.i686.rpm - (for multiple processors)

Installing the new kernel using RPM will also update your GRUB (or lilo) configuration with the appropriate stanza and default boot option. There is no need to modify your boot loader configuration after installing the new kernel.

Note: After installing the new kernel, do not proceed to install the supporting FireWire modules at this time! A reboot into the new kernel is required before the FireWire modules can be installed.

Reboot into the new Linux kernel:

At this point, the new RHEL4 kernel is installed. You now need to reboot into the new Linux kernel:

    # init 6

Install the supporting FireWire modules, as root:

After booting into the new RHEL 4 kernel, you need to install the supporting FireWire modules package by running either of the following:

    # rpm -ivh oracle-firewire-modules-2.6.9-11.0.0.10.3.EL-1286-1.i686.rpm - (for single processor)
  - OR -
    # rpm -ivh oracle-firewire-modules-2.6.9-11.0.0.10.3.ELsmp-1286-1.i686.rpm - (for multiple processors)

Add module options:

Add the following lines to /etc/modprobe.conf:

    options sbp2 exclusive_login=0

It is vital that the exclusive_login parameter of the Serial Bus Protocol module (sbp2) be set to zero to allow multiple hosts to log in to and access the FireWire disk concurrently.

Perform the above tasks on the second Linux server:

With the new RHEL4 kernel and supporting FireWire modules installed on the first Linux server, move on to the second Linux server and repeat the same tasks in this section on it.
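Because the module option has to be added on every node, and the steps are repeated by hand, an idempotent helper avoids ending up with duplicate lines. The following is a sketch under the assumption that the option is written exactly as shown above; ensure_sbp2_option is a hypothetical name.

```shell
#!/bin/sh
# ensure_sbp2_option FILE   (hypothetical helper)
# Appends "options sbp2 exclusive_login=0" to FILE unless that exact line exists.
ensure_sbp2_option() {
    conf=$1
    line='options sbp2 exclusive_login=0'
    if grep -qxF "$line" "$conf" 2>/dev/null; then
        echo "already present in $conf"
    else
        echo "$line" >> "$conf"
        echo "added to $conf"
    fi
}

# Usage (as root): ensure_sbp2_option /etc/modprobe.conf
```

Running it a second time leaves the file unchanged, which makes it safe to include in a per-node setup script.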

Connect FireWire drive to each machine and boot into the new kernel:

After performing the above tasks on both nodes in the cluster, power down both Linux machines:

    ===============================
    # hostname
    linux1
    # init 0
    ===============================
    # hostname
    linux2
    # init 0
    ===============================

After both machines are powered down, connect each of them to the back of the FireWire drive. Power on the FireWire drive. Finally, power on each Linux server and make sure to boot each machine into the new kernel.

Note: RHEL4 users will be prompted during the boot process on both nodes at the "Probing for New Hardware" section for your FireWire hard drive. Simply select the option to "Configure" the device and continue the boot process. If you are not prompted during the "Probing for New Hardware" section for the new FireWire drive, you will need to run the following commands and reboot the machine:

    # modprobe -r sbp2
    # modprobe -r sd_mod
    # modprobe -r ohci1394
    # modprobe ohci1394
    # modprobe sd_mod
    # modprobe sbp2
    # init 6

Loading the FireWire stack:

In most cases, the loading of the FireWire stack will already be configured in the /etc/rc.sysinit file. The commands that are contained within this file that are responsible for loading the FireWire stack are:

    # modprobe sbp2
    # modprobe ohci1394

In older versions of Red Hat, this was not the case and these commands would have to be manually run or put within a startup file. With Red Hat Enterprise Linux 3 and later, these commands are already put within the /etc/rc.sysinit file and run on each boot.

Check for SCSI Device:

After each machine has rebooted, the kernel should automatically detect the disk as a SCSI device (/dev/sdXX). This section will provide several commands that should be run on all nodes in the cluster to verify the FireWire drive was successfully detected and is being shared by all nodes in the cluster.

For this configuration, I was performing the above procedures on both nodes at the same time. When complete, I shut down both machines, started linux1 first, and then linux2. The following commands and results are from my linux2 machine. Again, make sure that you run the following commands on all nodes to ensure both machines can log in to the shared drive.

Let's first check to see that the FireWire adapter was successfully detected:

    # lspci
    00:00.0 Host bridge: Intel Corporation 82845G/GL[Brookdale-G]/GE/PE DRAM Controller/Host-Hub Interface (rev 01)
    00:02.0 VGA compatible controller: Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 01)
    00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
    00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
    00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
    00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
    00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 81)
    00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 01)
    00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 01)
    00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
    00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
    01:04.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)
    01:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
    01:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)

Second, let's check to see that the modules are loaded:

    # lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
    sd_mod                 17217  0
    sbp2                   23948  0
    scsi_mod              121293  2 sd_mod,sbp2
    ohci1394               35784  0
    ieee1394              298228  2 sbp2,ohci1394

Third, let's make sure the disk was detected and an entry was made by the kernel:

    # cat /proc/scsi/scsi
    Attached devices:
    Host: scsi0 Channel: 00 Id: 01 Lun: 00
      Vendor: Maxtor   Model: OneTouch II      Rev: 023g
      Type:   Direct-Access                    ANSI SCSI revision: 06

Now let's verify that the FireWire drive is accessible for multiple logins and shows a valid login:

    # dmesg | grep sbp2
    sbp2: $Rev: 1265 $ Ben Collins <bcollins@debian.org>
    ieee1394: sbp2: Maximum concurrent logins supported: 2
    ieee1394: sbp2: Number of active logins: 0
    ieee1394: sbp2: Logged into SBP-2 device

From the above output, you can see that the FireWire drive I have can support concurrent logins by up to 2 servers. It is vital that you have a drive where the chipset supports concurrent access for all nodes within the RAC cluster.

One other test I like to perform is to run a quick fdisk -l from each node in the cluster to verify that it is really being picked up by the OS. Your drive may show that the device does not contain a valid partition table, but this is OK at this point of the RAC configuration.

    # fdisk -l

    Disk /dev/hda: 40.0 GB, 40000000000 bytes
    255 heads, 63 sectors/track, 4863 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/hda1   *           1          13      104391   83  Linux
    /dev/hda2              14        4863    38957625   8e  Linux LVM

    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1               1       36483   293049666    c  W95 FAT32 (LBA)

Rescan SCSI bus no longer required:

In older versions of the kernel, I would need to run the rescan-scsi-bus.sh script in order to detect the FireWire drive. The purpose of this script was to create the SCSI entry for the node by using the following command:

    echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi

With RHEL3 and RHEL4, this step is no longer required and the disk should be detected automatically.

Troubleshooting SCSI Device Detection:

If you are having trouble with any of the procedures (above) in detecting the SCSI device, you can try the following:

    # modprobe -r sbp2
    # modprobe -r sd_mod
    # modprobe -r ohci1394
    # modprobe ohci1394
    # modprobe sd_mod
    # modprobe sbp2

You may also want to unplug any USB devices connected to the server. The system may not be able to recognize your FireWire drive if you have a USB device attached!
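The module check above is something you will likely repeat after every reboot on every node. A small filter over the lsmod output makes a missing module stand out. This is only a sketch: missing_modules is a hypothetical helper, and the module list is simply the one used in the egrep pattern above.

```shell
#!/bin/sh
# Modules the FireWire/SCSI stack in this guide relies on.
REQUIRED="ohci1394 sbp2 ieee1394 sd_mod scsi_mod"

# missing_modules   (hypothetical helper)
# Reads `lsmod` output on stdin and prints each required module that is absent.
missing_modules() {
    loaded=$(awk '{ print $1 }')
    for m in $REQUIRED; do
        echo "$loaded" | grep -qx "$m" || echo "$m"
    done
}

# Usage: lsmod | missing_modules
```

No output means all five modules are loaded; any name printed is a candidate for the modprobe sequence in the troubleshooting steps.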
9. Create "oracle" User and Directories (both nodes)

Perform the following tasks on all nodes in the cluster!

You will be using OCFS2 to store the files required to be shared for the Oracle Clusterware software. When using OCFS2, the UID of the UNIX user oracle and GID of the UNIX group dba should be identical on all machines in the cluster. If either the UID or GID are different, the files on the OCFS2 file system may show up as "unowned" or may even be owned by a different user. For this article, I will use 175 for the oracle UID and 115 for the dba GID.

Create Group and User for Oracle

Let's continue our example by creating the Unix dba group and oracle user account along with all appropriate directories.

    # mkdir -p /u01/app
    # groupadd -g 115 dba
    # useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" -p oracle oracle
    # chown -R oracle:dba /u01
    # passwd oracle
    # su - oracle

(The -p option of useradd expects an already-encrypted password, so it is the passwd command that follows which sets the usable password.)

Note: When you are setting the Oracle environment variables for each RAC node, be sure to assign each RAC node a unique Oracle SID! For this example, I used:

    linux1 : ORACLE_SID=orcl1
    linux2 : ORACLE_SID=orcl2

After creating the "oracle" UNIX userid on both nodes, ensure that the environment is set up correctly by using the following .bash_profile:

    # .bash_profile

    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
        . ~/.bashrc
    fi

    alias ls="ls -FA"

    # User specific environment and startup programs
    export ORACLE_BASE=/u01/app/oracle
    export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
    export ORA_CRS_HOME=$ORACLE_BASE/product/crs
    export ORACLE_PATH=$ORACLE_BASE/common/oracle/sql:.:$ORACLE_HOME/rdbms/admin

    # Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
    export ORACLE_SID=orcl1

    export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
    export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
    export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
    export ORACLE_TERM=xterm
    export TNS_ADMIN=$ORACLE_HOME/network/admin
    export ORA_NLS10=$ORACLE_HOME/nls/data
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
    export CLASSPATH=$ORACLE_HOME/JRE
    export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
    export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
    export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
    export THREADS_FLAG=native
    export TEMP=/tmp
    export TMPDIR=/tmp
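Since mismatched IDs only show up later as odd file ownership on the shared file system, it is worth confirming them on each node right after creating the account. A minimal sketch, using the standard id command; check_owner_ids is a hypothetical helper, and 175 and 115 are the values chosen in this article.

```shell
#!/bin/sh
# check_owner_ids USER WANT_UID WANT_GID   (hypothetical helper)
# Compares the numeric UID and primary GID of USER with the expected values.
check_owner_ids() {
    user=$1 want_uid=$2 want_gid=$3
    uid=$(id -u "$user" 2>/dev/null) || { echo "$user: no such user"; return 2; }
    gid=$(id -g "$user")
    if [ "$uid" = "$want_uid" ] && [ "$gid" = "$want_gid" ]; then
        echo "$user: uid=$uid gid=$gid (ok)"
    else
        echo "$user: uid=$uid gid=$gid (expected uid=$want_uid gid=$want_gid)"
        return 1
    fi
}

# Usage on each node: check_owner_ids oracle 175 115
```

The same command run on linux1 and linux2 should print identical lines.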

Create Mount Point for OCFS2 / Clusterware

Finally, create the mount point for the OCFS2 filesystem that will be used to store the two Oracle Clusterware shared files. These commands will need to be run as the "root" user account:

    $ su -
    # mkdir -p /u02/oradata/orcl
    # chown -R oracle:dba /u02

Ensure Adequate temp Space for OUI

Note: The Oracle Universal Installer (OUI) requires up to 400MB of free space in the /tmp directory.

You can check the available space in /tmp with the df command (for example, df -k /tmp). The following commands report the swap space configured on the node:

    # cat /proc/swaps
    Filename                         Type       Size     Used  Priority
    /dev/mapper/VolGroup00-LogVol01  partition  2031608  0     -1

  -OR-

    # cat /proc/meminfo | grep SwapTotal
    SwapTotal: 2031608 kB
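The /tmp requirement can be turned into a yes/no check. The sketch below assumes the POSIX output format of df -Pk, where the fourth column of the second line is the available space in kilobytes; free_kb and has_free_mb are hypothetical helper names.

```shell
#!/bin/sh
# free_kb DIR   (hypothetical helper)
# Prints the space available in the file system holding DIR, in KB.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_mb DIR MB   (hypothetical helper)
# Exits 0 when at least MB megabytes are available under DIR.
has_free_mb() {
    [ "$(free_kb "$1")" -ge $(( $2 * 1024 )) ]
}

# Usage: has_free_mb /tmp 400 || echo "not enough room in /tmp for the OUI"
```

If the check fails, the TEMP and TMPDIR variables can be pointed at another file system for the duration of the install.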

If for some reason you do not have enough space in /tmp, you can temporarily create space in another file system and point your TEMP and TMPDIR to it for the duration of the install. Here are the steps to do this:

    # su -
    # mkdir /<AnotherFilesystem>/tmp
    # chown root.root /<AnotherFilesystem>/tmp
    # chmod 1777 /<AnotherFilesystem>/tmp
    # export TEMP=/<AnotherFilesystem>/tmp     # used by Oracle
    # export TMPDIR=/<AnotherFilesystem>/tmp   # used by Linux programs
                                               # like the linker "ld"

When the installation of Oracle is complete, you can remove the temporary directory using the following:

    # su -
    # rmdir /<AnotherFilesystem>/tmp
    # unset TEMP
    # unset TMPDIR

10. Create Partitions on the Shared FireWire Storage Device

Create the following partitions on only one node in the cluster!

The next step is to create the required partitions on the FireWire (shared) drive. As I mentioned previously, you will use OCFS2 to store the two files to be shared for Oracle's Clusterware software. You will then create three ASM volumes: two for all physical database files (data/index files, online redo log files, control files, SPFILE, and archived redo log files) and one for the Flash Recovery Area.

The following table lists the individual partitions that will be created on the FireWire (shared) drive and what files will be contained on them.

Oracle Shared Drive Configuration

    File System Type  Partition  Size   Mount Point        ASM Diskgroup Name    File Types
    OCFS2             /dev/sda1  1GB    /u02/oradata/orcl                        Oracle Cluster Registry File - (~100MB)
                                                                                 CRS Voting Disk - (~20MB)
    ASM               /dev/sda2  50GB   ORCL:VOL1          +ORCL_DATA1           Oracle Database Files
    ASM               /dev/sda3  50GB   ORCL:VOL2          +ORCL_DATA1           Oracle Database Files
    ASM               /dev/sda4  100GB  ORCL:VOL3          +FLASH_RECOVERY_AREA  Oracle Flash Recovery Area
    Total                        201GB

Create All Partitions on FireWire Shared Storage

As shown in the table above, we will be creating four partitions: one for Oracle's Clusterware shared files and the other three for ASM (to store all Oracle database files and the Flash Recovery Area).

For this configuration, my FireWire drive shows up as the SCSI device /dev/sda. The fdisk command is used for creating (and removing) partitions. Before creating the new partitions, it is important to remove any existing partitions (if they exist) on the FireWire drive:

    # fdisk /dev/sda
    Command (m for help): p

    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1               1       36483   293049666    c  W95 FAT32 (LBA)

    Command (m for help): d
    Selected partition 1

    Command (m for help): p

    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 1
    First cylinder (1-36483, default 1): 1
    Last cylinder or +size or +sizeM or +sizeK (1-36483, default 36483): +1G

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 2
    First cylinder (124-36483, default 124): 124
    Last cylinder or +size or +sizeM or +sizeK (124-36483, default 36483): +50G

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 3
    First cylinder (6204-36483, default 6204): 6204
    Last cylinder or +size or +sizeM or +sizeK (6204-36483, default 36483): +50G

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Selected partition 4
    First cylinder (12284-36483, default 12284): 12284
    Last cylinder or +size or +sizeM or +sizeK (12284-36483, default 36483): +100G

    Command (m for help): p

    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1               1         123      987966   83  Linux
    /dev/sda2             124        6203    48837600   83  Linux
    /dev/sda3            6204       12283    48837600   83  Linux
    /dev/sda4           12284       24442    97667167+  83  Linux

    Command (m for help): w
    The partition table has been altered!

    Calling ioctl() to re-read partition table.
    Syncing disks.

After creating all required partitions, you should now inform the kernel of the partition changes using the following syntax as the root user account:

    # partprobe

    # fdisk -l /dev/sda

    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1               1         123      987966   83  Linux
    /dev/sda2             124        6203    48837600   83  Linux
    /dev/sda3            6204       12283    48837600   83  Linux
    /dev/sda4           12284       24442    97667167+  83  Linux

(Note: The FireWire drive and partitions created will be exposed as a SCSI device.)
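The Blocks column in the fdisk listing can be reproduced from the cylinder ranges, which is a handy sanity check when your drive has a different geometry. The sketch below assumes the geometry reported above (16065 sectors of 512 bytes per cylinder, 63 sectors per track) and that a partition starting at cylinder 1 skips the first track, which is consistent with the numbers shown; part_blocks is a hypothetical helper.

```shell
#!/bin/sh
SECTORS_PER_CYL=16065     # 255 heads * 63 sectors/track, as reported by fdisk
SECTORS_PER_TRACK=63

# part_blocks START_CYL END_CYL   (hypothetical helper)
# Prints the size in 1KB blocks, with a trailing "+" when a 512-byte
# sector is left over, mirroring the fdisk Blocks column.
part_blocks() {
    start=$1 end=$2
    sectors=$(( (end - start + 1) * SECTORS_PER_CYL ))
    # Assumption: the first track of the disk is reserved for the partition table.
    if [ "$start" -eq 1 ]; then
        sectors=$(( sectors - SECTORS_PER_TRACK ))
    fi
    blocks=$(( sectors / 2 ))
    if [ $(( sectors % 2 )) -eq 1 ]; then
        echo "${blocks}+"
    else
        echo "$blocks"
    fi
}

part_blocks 1 123          # /dev/sda1
part_blocks 124 6203       # /dev/sda2
part_blocks 12284 24442    # /dev/sda4
```

The three calls print 987966, 48837600 and 97667167+, matching the listing for /dev/sda1, /dev/sda2 and /dev/sda4.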

For development and testing only; production deployments will not be supported!

11. Configure the Linux Servers for Oracle

Perform the following configuration procedures on all nodes in the cluster!

Several of the commands within this section will need to be performed on every node within the cluster every time the machine is booted. This section provides very detailed information about setting shared memory, semaphores, and file handle limits. Instructions for placing them in a startup script (/etc/sysctl.conf) are included in Section 14 ("All Startup Commands for Each RAC Node").

Overview

This section focuses on configuring both Linux servers: getting each one prepared for the Oracle RAC 10g installation. This includes verifying enough swap space, setting shared memory and semaphores, and finally how to set the maximum amount of file handles for the OS.

Throughout this section you will notice that there are several different ways to configure (set) these parameters. For the purpose of this article, I will be making all changes permanent (through reboots) by placing all commands in the /etc/sysctl.conf file.

Swap Space Considerations

Installing Oracle10g Release 2 requires a minimum of 512MB of memory. (Note: An inadequate amount of swap during the installation will cause the Oracle Universal Installer to either "hang" or "die".)

To check the amount of memory you have allocated, type:

    # cat /proc/meminfo | grep MemTotal
    MemTotal: 1034352 kB

If you have less than 512MB of memory (between your RAM and SWAP), you can add temporary swap space by creating a temporary swap file. This way you do not have to use a raw device or, even more drastic, rebuild your system.

As root, make a file that will act as additional swap space, let's say about 300MB:

    # dd if=/dev/zero of=tempswap bs=1k count=300000

Now we should change the file permissions:

    # chmod 600 tempswap

Finally we format the "partition" as swap and add it to the swap space:

    # mkswap tempswap
    # swapon tempswap
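The 512MB figure above is stated as RAM plus swap, so the two values from /proc/meminfo need to be added together. A minimal sketch, assuming the usual "MemTotal:" and "SwapTotal:" lines in kB; mem_plus_swap_kb is a hypothetical helper that takes the file as an argument so it can also be run against a saved copy.

```shell
#!/bin/sh
# mem_plus_swap_kb FILE   (hypothetical helper)
# Prints MemTotal + SwapTotal in kB from a /proc/meminfo-style file.
mem_plus_swap_kb() {
    awk '$1 == "MemTotal:" || $1 == "SwapTotal:" { sum += $2 }
         END { print sum + 0 }' "$1"
}

# Usage:
#   total=$(mem_plus_swap_kb /proc/meminfo)
#   [ "$total" -ge 524288 ] || echo "less than 512MB of RAM + swap"
```

With the values shown in this guide (1034352 kB of RAM and 2031608 kB of swap) the total is 3065960 kB, well above the minimum.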

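Setting Shared Memory

Shared memory allows processes to access common structures and data by placing them in a shared memory segment. This is the fastest form of inter-process communications (IPC) available, mainly due to the fact that no kernel involvement occurs when data is being passed between the processes. Data does not need to be copied between processes.

Oracle makes use of shared memory for its System Global Area (SGA), which is an area of memory that is shared by all Oracle background and foreground processes. Adequate sizing of the SGA is critical to Oracle performance because it is responsible for holding the database buffer cache, shared SQL, access paths, and so much more.

To determine all shared memory limits, use the following:

    # ipcs -lm

    ------ Shared Memory Limits --------
    max number of segments = 4096
    max seg size (kbytes) = 32768
    max total shared memory (kbytes) = 8388608
    min seg size (bytes) = 1

Setting SHMMAX

The SHMMAX parameter defines the maximum size (in bytes) for a shared memory segment. The Oracle SGA is comprised of shared memory and it is possible that incorrectly setting SHMMAX could limit the size of the SGA. When setting SHMMAX, keep in mind that the size of the SGA should fit within one shared memory segment. An inadequate SHMMAX setting could result in the following:

    ORA-27123: unable to attach to shared memory segment

You can determine the value of SHMMAX by performing the following:

    # cat /proc/sys/kernel/shmmax
    33554432

The default value for SHMMAX is 32MB. This size is often too small to configure the Oracle SGA. I generally set the SHMMAX parameter to 2GB using the following methods:

You can alter the default setting for SHMMAX without rebooting the machine by making the changes directly to the /proc file system (/proc/sys/kernel/shmmax) by using the following command:

    # sysctl -w kernel.shmmax=2147483648

You should then make this change permanent by inserting the kernel parameter in the /etc/sysctl.conf startup file:

    # echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf

Setting SHMMNI

We now look at the SHMMNI parameter. This kernel parameter is used to set the maximum number of shared memory segments system wide. The default value for this parameter is 4096.

You can determine the value of SHMMNI by performing the following:

    # cat /proc/sys/kernel/shmmni
    4096

The default setting for SHMMNI should be adequate for your Oracle RAC 10g Release 2 installation.

Setting SHMALL

Finally, we look at the SHMALL shared memory kernel parameter. This parameter controls the total amount of shared memory (in pages) that can be used at one time on the system. In short, the value of this parameter should always be at least:

    ceil(SHMMAX/PAGE_SIZE)

The default size of SHMALL is 2097152 and can be queried using the following command:

    # cat /proc/sys/kernel/shmall
    2097152

The default setting for SHMALL should be adequate for our Oracle RAC 10g Release 2 installation.

(Note: The page size in Red Hat Linux on the i386 platform is 4,096 bytes. You can, however, use bigpages which supports the configuration of larger memory page sizes.)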
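The ceil(SHMMAX/PAGE_SIZE) rule is easy to evaluate with integer arithmetic. The sketch below is illustrative only; min_shmall is a hypothetical helper, and it assumes a shell with 64-bit arithmetic so that a 2GB SHMMAX does not overflow.

```shell
#!/bin/sh
# min_shmall SHMMAX_BYTES PAGE_SIZE_BYTES   (hypothetical helper)
# Prints ceil(SHMMAX / PAGE_SIZE), the smallest SHMALL (in pages) that
# still allows one segment of SHMMAX bytes.
min_shmall() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

min_shmall 2147483648 4096

# On a live node:
#   min_shmall "$(cat /proc/sys/kernel/shmmax)" "$(getconf PAGE_SIZE)"
```

For the 2GB SHMMAX used in this guide and a 4,096-byte page, this prints 524288, which is below the default SHMALL of 2097152, so the default is indeed sufficient.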

Setting Semaphores

Now that you have configured your shared memory settings, it is time to configure your semaphores. The best way to describe a "semaphore" is as a counter that is used to provide synchronization between processes (or threads within a process) for shared resources like shared memory. Semaphore sets are supported in UNIX System V where each one is a counting semaphore. When an application requests semaphores, it does so using "sets".

To determine all semaphore limits, use the following:

    # ipcs -ls

    ------ Semaphore Limits --------
    max number of arrays = 128
    max semaphores per array = 250
    max semaphores system wide = 32000
    max ops per semop call = 32
    semaphore max value = 32767

You can also use the following command:

    # cat /proc/sys/kernel/sem
    250 32000 32 128

Setting SEMMSL

The SEMMSL kernel parameter is used to control the maximum number of semaphores per semaphore set.

Oracle recommends setting SEMMSL to the largest PROCESSES instance parameter setting in the init.ora file for all databases on the Linux system plus 10. Also, Oracle recommends setting the SEMMSL to a value of no less than 100.

Setting SEMMNI

The SEMMNI kernel parameter is used to control the maximum number of semaphore sets in the entire Linux system. Oracle recommends setting the SEMMNI to a value of no less than 100.


Setting SEMMNS

The SEMMNS kernel parameter is used to control the maximum number of semaphores (not semaphore sets) in the entire Linux system.

Oracle recommends setting the SEMMNS to the sum of the PROCESSES instance parameter setting for each database on the system, adding the largest PROCESSES twice, and then finally adding 10 for each Oracle database on the system.

Use the following calculation to determine the maximum number of semaphores that can be allocated on a Linux system. It will be the lesser of:

    SEMMNS  -or-  (SEMMSL * SEMMNI)

Setting SEMOPM

The SEMOPM kernel parameter is used to control the number of semaphore operations that can be performed per semop system call.

The semop system call (function) provides the ability to do operations for multiple semaphores with one semop system call. A semaphore set can have at most SEMMSL semaphores, so it is recommended to set SEMOPM equal to SEMMSL.

Oracle recommends setting the SEMOPM to a value of no less than 100.

Setting Semaphore Kernel Parameters

Finally, we see how to set all semaphore parameters using several methods. In the following, the only parameter I care about changing (raising) is SEMOPM. All other default settings should be sufficient for our example installation.

You can alter the default setting for all semaphore settings without rebooting the machine by making the changes directly to the /proc file system (/proc/sys/kernel/sem) by using the following command:

    # sysctl -w kernel.sem="250 32000 100 128"

You should then make this change permanent by inserting the kernel parameter in the /etc/sysctl.conf startup file:

    # echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
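The two semaphore rules above (the SEMMNS recommendation and the "lesser of" limit) are simple enough to compute in the shell. This is a sketch of the arithmetic only; both function names are hypothetical, and the PROCESSES values in the example are made up for illustration.

```shell
#!/bin/sh
# semmns_recommended PROCESSES...   (hypothetical helper)
# One argument per database: sum of all PROCESSES settings, plus the
# largest one twice, plus 10 per database.
semmns_recommended() {
    sum=0 max=0 count=0
    for p in "$@"; do
        sum=$(( sum + p ))
        if [ "$p" -gt "$max" ]; then max=$p; fi
        count=$(( count + 1 ))
    done
    echo $(( sum + 2 * max + 10 * count ))
}

# max_semaphores SEMMSL SEMMNS SEMMNI   (hypothetical helper)
# The lesser of SEMMNS and SEMMSL * SEMMNI.
max_semaphores() {
    by_sets=$(( $1 * $3 ))
    if [ "$2" -lt "$by_sets" ]; then echo "$2"; else echo "$by_sets"; fi
}

semmns_recommended 150 200      # two databases, PROCESSES=150 and 200 (example values)
max_semaphores 250 32000 128    # SEMMSL, SEMMNS, SEMMNI from the settings above
```

The first call prints 770 (350 + 400 + 20) and the second prints 32000, since 250 * 128 is exactly 32000.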
the only parameter I care about changing (raising) is SEMOPM. Usually for 2. the watchdog daemon has been deprecated by a Linux kernel module named hangcheck-timer which addresses availability and reliability problems much better. The hangcheck-timer.Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireW.. It will set a timer and check the timer after a certain amount of time. and maximum file handles that can be allocated. Oracle recommends setting the SEMMNS to the sum of the PROCESSES instance parameter setting for each database on the system. then make sure that the ulimit is set properly.2. Setting Semaphore Kernel Parameters Finally.20 it is set to unlimited. The setting for file handles denotes the number of open files that you can have on the Linux system. and then finally adding 10 for each Oracle database on the system.ko Module The hangcheck-timer module uses a kernel-based timer that periodically checks the system task scheduler to catch delays in order to determine the 3 of 20 1/4/2006 8:46 AM .com/technology/pub/articles/hunter_rac10gr2_2. we see how to set all semaphore parameters using several methods.0. adding the largest PROCESSES twice.oracle. In the following.sem="250 32000 100 128" You should then make this change permanent by inserting the kernel parameter in the /etc/sysctl.file-max=65536 You should then make this change permanent by inserting the kernel parameter in the /etc/sysctl. it is highly recommended by Oracle. (Note: If you need to increase the value in /proc/sys/fs/file-max. Starting with Oracle9i Release 2 (9.sem=250 32000 100 128" >> /etc/sysctl.0.2) (and still available in Oracle 10g Release 2). it is critical to ensure that the maximum number of file handles is sufficiently large. Configure the hangcheck-timer Kernel Module Perform the following configuration procedures on all nodes in the cluster! Oracle9i Release 1 (9. 
You can alter the default setting for all semaphore settings without rebooting the machine by making the changes directly to the /proc file system (/proc/sys/kernel/sem) by using the following command: # sysctl -w kernel. Although the hangcheck-timer module is not required for Oracle Clusterware (Cluster Manager) operation.

health of the system. If the system hangs or pauses, the timer resets the node. The hangcheck-timer module uses the Time Stamp Counter (TSC) CPU register, which is incremented at each clock signal. The TSC offers much more accurate time measurements because this register is updated by the hardware automatically.

Much more information about the hangcheck-timer project can be found here.

Installing the hangcheck-timer.ko Module

The hangcheck-timer was originally shipped only by Oracle; however, this module is now included with Red Hat Linux starting with kernel versions 2.4.9-e.12 and higher. If you followed the steps in Section 8 ("Obtain & Install New Linux Kernel / FireWire Modules"), then the hangcheck-timer is already included for you. Use the following to confirm:

# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.9-11.0.0.10.3.EL/kernel/drivers/char/hangcheck-timer.ko
/lib/modules/2.6.9-22.EL/kernel/drivers/char/hangcheck-timer.ko

In the above output, we care about the hangcheck timer object (hangcheck-timer.ko) in the /lib/modules/2.6.9-11.0.0.10.3.EL/kernel/drivers/char directory.

Configuring and Loading the hangcheck-timer Module

There are two key parameters to the hangcheck-timer module:

hangcheck-tick: This parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.

hangcheck-margin: This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds; Oracle recommends setting it to 180 seconds.

NOTE: The two hangcheck-timer module parameters indicate how long a RAC node must hang before it will reset the system. A node reset will occur when the following is true:

system hang time > (hangcheck_tick + hangcheck_margin)

Configuring Hangcheck Kernel Module Parameters

Each time the hangcheck-timer kernel module is loaded (manually or by Oracle), it needs to know what value to use for each of the two parameters we just discussed (hangcheck-tick and hangcheck-margin).

These values need to be available after each reboot of the Linux server. To do that, make an entry with the correct values to the /etc/modprobe.conf file as follows:

# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modprobe.conf

Each time the hangcheck-timer kernel module gets loaded, it will use the values defined by the entry I made in the /etc/modprobe.conf file.

Manually Loading the Hangcheck Kernel Module for Testing

Oracle is responsible for loading the hangcheck-timer kernel module when required. For that reason, it is not required to perform a modprobe or insmod of the hangcheck-timer kernel module in any of the startup files (i.e. /etc/rc.local).

It is only out of pure habit that I continue to include a modprobe of the hangcheck-timer kernel module in the /etc/rc.local file. Someday I will get over it, but realize that it does not hurt to include a modprobe of the hangcheck-timer kernel module during startup.

So to keep myself sane and able to sleep at night, I always configure the loading of the hangcheck-timer kernel module on each startup as follows:

# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local

(Note: You don't have to manually load the hangcheck-timer kernel module using modprobe or insmod after each reboot. The hangcheck-timer module will be loaded by Oracle automatically when needed.)

Now, to test the hangcheck-timer kernel module to verify it is picking up the correct parameters we defined in the /etc/modprobe.conf file, use the modprobe command. Although you could load the hangcheck-timer kernel module by passing it the appropriate parameters (e.g. insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180), we want to verify that it is picking up the options we set in the /etc/modprobe.conf file.

To manually load the hangcheck-timer kernel module and verify it is using the correct values defined in the /etc/modprobe.conf file, run the following command:

# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Sep 27 23:11:51 linux2 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds)

13. Configure RAC Nodes for Remote Access

Perform the following configuration procedures on all nodes in the cluster!

When running the Oracle Universal Installer on a RAC node, it will use the rsh (or ssh) command to copy the Oracle software to all other nodes within the RAC cluster. The oracle UNIX account on the node running the Oracle Installer (runInstaller) must be trusted by all other nodes in your RAC

cluster. Therefore you should be able to run r* commands like rsh, rcp, and rlogin on the Linux server you will be running the Oracle installer from, against all other Linux servers in the cluster without a password. The rsh daemon validates users using the /etc/hosts.equiv file or the .rhosts file found in the user's (oracle's) home directory. (The use of rcp and rsh is not required for normal RAC operation. However rcp and rsh should be enabled for RAC and patchset installation.)

Oracle added support in Oracle RAC 10g Release 1 for using the Secure Shell (SSH) tool suite for setting up user equivalence. This article, however, uses the older method of rcp for copying the Oracle software to the other nodes in the cluster. When using the SSH tool suite, the scp (as opposed to the rcp) command would be used to copy the software in a very secure manner.

First, let's make sure that we have the rsh RPMs installed on each node in the RAC cluster:

# rpm -q rsh rsh-server
rsh-0.17-25.3
rsh-server-0.17-25.3

From the above, we can see that we have the rsh and rsh-server installed. Were rsh not installed, we would run the following command from the CD where the RPM is located:

# su -
# rpm -ivh rsh-0.17-25.3.i386.rpm rsh-server-0.17-25.3.i386.rpm

To enable the "rsh" and "rlogin" services, the "disable" attribute in the /etc/xinetd.d/rsh file must be set to "no" and xinetd must be reloaded. Do that by running the following commands on all nodes in the cluster:

# su -
# chkconfig rsh on
# chkconfig rlogin on
# service xinetd reload
Reloading configuration: [ OK ]

To allow the "oracle" UNIX user account to be trusted among the RAC nodes, create the /etc/hosts.equiv file on all nodes in the cluster:

# su -
# touch /etc/hosts.equiv
# chmod 600 /etc/hosts.equiv
# chown root.root /etc/hosts.equiv

Now add all RAC nodes to the /etc/hosts.equiv file similar to the following example for all nodes in the cluster:

# cat /etc/hosts.equiv
+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle

Note: In the above example, the second field permits only the oracle user account to run rsh commands on the specified nodes. For security reasons, the /etc/hosts.equiv file should be owned by root and the permissions should be set to 600. In fact, some systems will only honor the content of this file if the owner is root and the permissions are set to 600.

Before attempting to test your rsh command, ensure that you are using the correct version of rsh. By default, Red Hat Linux puts /usr/kerberos/sbin at the head of the $PATH variable. This will cause the Kerberos version of rsh to be executed.

I will typically rename the Kerberos version of rsh so that the normal rsh command is being used. Use the following:

# su -
# which rsh
/usr/kerberos/bin/rsh
# mv /usr/kerberos/bin/rsh /usr/kerberos/bin/rsh.original
# mv /usr/kerberos/bin/rcp /usr/kerberos/bin/rcp.original
# mv /usr/kerberos/bin/rlogin /usr/kerberos/bin/rlogin.original
# which rsh
/usr/bin/rsh

You should now test your connections and run the rsh command from the node that will be performing the Oracle Clusterware and 10g RAC installation. I will be using the node linux1 to perform all installs so this is where I will run the following commands from:

# su - oracle
$ rsh linux1 ls -l /etc/hosts.equiv
-rw------- 1 root root 68 Sep 27 23:37 /etc/hosts.equiv
$ rsh int-linux1 ls -l /etc/hosts.equiv
-rw------- 1 root root 68 Sep 27 23:37 /etc/hosts.equiv
$ rsh linux2 ls -l /etc/hosts.equiv
-rw------- 1 root root 68 Sep 27 23:37 /etc/hosts.equiv

$ rsh int-linux2 ls -l /etc/hosts.equiv
-rw------- 1 root root 68 Sep 27 23:37 /etc/hosts.equiv

14. All Startup Commands for Each RAC Node

Verify that the following startup commands are included on all nodes in the cluster!

Up to this point, you have read in great detail about the parameters and resources that need to be configured on all nodes for the Oracle10g RAC configuration. This section will let you "take a deep breath" and recap those parameters, commands, and entries (in previous sections of this document) that need to happen on each node when the machine is booted.

For each of the startup files below, entries in gray should be included in each startup file.

/etc/modprobe.conf

(All parameters and values to be used by kernel modules.)

alias eth0 b44
alias eth1 tulip
alias snd-card-0 snd-intel8x0
options snd-card-0 index=0
alias usb-controller ehci-hcd
alias usb-controller1 uhci-hcd
options sbp2 exclusive_login=0
alias scsi_hostadapter sbp2
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

/etc/sysctl.conf

(We wanted to adjust the default and maximum send buffer size as well as the default and maximum receive buffer size for the interconnect. This file also contains those parameters responsible for configuring shared memory, semaphores, and file handles for use by the Oracle instance.)

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144

# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+
kernel.shmmax=2147483648

# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128

# +---------------------------------------------------------+
# | FILE HANDLES |
# ----------------------------------------------------------+
fs.file-max=65536

/etc/hosts

(All machine/IP entries for nodes in our RAC cluster.)

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost

# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2

# Private Interconnect - (eth1)
192.168.2.100 int-linux1
192.168.2.101 int-linux2

# Public Virtual IP (VIP) addresses for - (eth0)
192.168.1.200 vip-linux1
192.168.1.201 vip-linux2

192.168.1.106 melody
192.168.1.102 alex
192.168.1.105 bartman

/etc/hosts.equiv

(Allow logins to each node as the oracle user account without the need for a password.)

+linux1 oracle
+linux2 oracle
+int-linux1 oracle
+int-linux2 oracle

/etc/rc.local

(Loading the hangcheck-timer kernel module.)

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# +---------------------------------------------------------+
# | HANGCHECK TIMER |
# | (I do not believe this is required, but doesn't hurt) |
# ----------------------------------------------------------+

/sbin/modprobe hangcheck-timer

15. Check RPM Packages for Oracle 10g Release 2

Perform the following checks on all nodes in the cluster!

When installing the Linux O/S (CentOS Enterprise Linux or RHEL4), you should verify that all required RPMs are installed. If you followed the instructions I used for installing Linux, you would have installed everything, in which case you will have all the required RPM packages. However, if you performed another installation type (i.e. Advanced Server), you may have some packages missing and will need to install them. All of the required RPMs are on the Linux CDs/ISOs.

Check Required RPMs

The following packages (or higher versions) must be installed:

make-3.80-5
glibc-2.3.4-2.9
glibc-devel-2.3.4-2.9
glibc-headers-2.3.4-2.9
glibc-kernheaders-2.4-9.1.87
cpp-3.4.3-22.1
compat-db-4.1.25-9
compat-gcc-32-3.2.3-47.3
compat-gcc-32-c++-3.2.3-47.3
compat-libstdc++-33-3.2.3-47.3

compat-libstdc++-296-2.96-132.7.2
openmotif-2.2.3-9.RHEL4.1
setarch-1.6-1

To query package information (gcc and glibc-devel for example), use the "rpm -q <PackageName> [, <PackageName>]" command as follows:

# rpm -q gcc glibc-devel
gcc-3.4.3-22.1
glibc-devel-2.3.4-2.9

If you need to install any of the above packages, use "rpm -Uvh <PackageName.rpm>". For example, to install the GCC 3.2.3-24 package, use:

# rpm -Uvh gcc-3.2.3-24.i386.rpm

Reboot the System

If you made any changes to the O/S, reboot all nodes in the cluster before attempting to install any of the Oracle components!!!

# init 6

16. Install & Configure OCFS2

Most of the configuration procedures in this section should be performed on all nodes in the cluster! Creating the OCFS2 filesystem, however, should be executed on only one node in the cluster.

It is now time to install OCFS2. OCFS2 is a cluster filesystem that allows all nodes in a cluster to concurrently access a device via the standard filesystem interface. This allows for easy management of applications that need to run across a cluster.

OCFS Release 1 was released in 2002 to enable Oracle RAC users to run the clustered database without having to deal with RAW devices. The filesystem was designed to store database related files, such as data files, control files, redo logs, archive logs, etc. OCFS2, in contrast, has been designed as a general-purpose cluster filesystem. With it, one can store not only database related files on a shared disk, but also store Oracle binaries and configuration files (shared Oracle Home) making management of RAC even easier.

In this guide, you will be using OCFS2 to store the two files that are required to be shared by the Oracle Clusterware software. (Along with these two files, you will also be using this space to store the shared SPFILE for all Oracle RAC instances.)

See this page for more information on OCFS2 (including Installation Notes) for Linux.

Download OCFS

First, download the OCFS2 distribution. The OCFS2 distribution comprises two sets of RPMs; namely, the kernel module and the tools. The kernel module is available for download from http://oss.oracle.com/projects/ocfs2/files/ and the tools from http://oss.oracle.com/projects/ocfs2-tools/files/.

Download the appropriate RPMs starting with the key OCFS2 kernel module (the driver). From the three available kernel modules (below), download the one that matches the distribution, platform, kernel version and the kernel flavor (smp, hugemem, psmp, etc).

ocfs2-2.6.9-11.0.0.10.3.EL-1.0.4-1.i686.rpm - (for single processor)
or
ocfs2-2.6.9-11.0.0.10.3.ELsmp-1.0.4-1.i686.rpm - (for multiple processors)
or
ocfs2-2.6.9-11.0.0.10.3.ELhugemem-1.0.4-1.i686.rpm - (for hugemem)

For the tools, simply match the platform and distribution. You should download both the OCFS2 tools and the OCFS2 console applications.

ocfs2-tools-1.0.2-1.i386.rpm - (OCFS2 tools)
ocfs2console-1.0.2-1.i386.rpm - (OCFS2 console)

The OCFS2 Console is optional but highly recommended. The ocfs2console application requires e2fsprogs, glib2 2.2.3 or later, vte 0.11.10 or later, pygtk2 (EL4) or python-gtk (SLES9) 1.99.16 or later, python 2.3 or later and ocfs2-tools.

If you were curious as to which OCFS2 driver release you need, use the OCFS2 release that matches your kernel version. To determine your kernel release:

$ uname -a
Linux linux1 2.6.9-11.0.0.10.3.EL #1 Tue Jul 5 12:20:09 PDT 2005 i686 i686 i386 GNU/Linux

Install OCFS2

I will be installing the OCFS2 files onto two single-processor machines. The installation process is simply a matter of running the following command on all nodes in the cluster as the root user account:

$ su -
# rpm -Uvh ocfs2-2.6.9-11.0.0.10.3.EL-1.0.4-1.i686.rpm \
ocfs2console-1.0.2-1.i386.rpm \
ocfs2-tools-1.0.2-1.i386.rpm
Preparing... ########################################### [100%]

1:ocfs2-tools ########################################### [ 33%]
2:ocfs2-2.6.9-11.0.0.10.3########################################### [ 67%]
3:ocfs2console ########################################### [100%]

Disable SELinux (RHEL4 U2 Only)

RHEL4 U2 users (CentOS 4.2 is based on RHEL4 U2) are advised that OCFS2 currently does not work with SELinux enabled. If you are using RHEL4 U2 (which includes you, since you are using CentOS 4.2 here) you will need to disable SELinux (using tool system-config-securitylevel) to get the O2CB service to execute.

To disable SELinux, run the "Security Level Configuration" GUI utility:

# /usr/bin/system-config-securitylevel &

This will bring up the following screen:

Figure 6 Security Level Configuration Opening Screen

Now, click the SELinux tab and clear the "Enabled" checkbox. After clicking [OK], you will be presented with a warning dialog. Simply acknowledge this warning by clicking "Yes". Your screen should now look like the following after disabling the SELinux option:

Figure 7: SELinux Disabled

After making this change on both nodes in the cluster, each node will need to be rebooted to implement the change:

# init 6

Configure OCFS2

The next step is to generate and configure the /etc/ocfs2/cluster.conf file on each node in the cluster. The easiest way to accomplish this is to run the GUI tool ocfs2console. In this section, we will not only create and configure the /etc/ocfs2/cluster.conf file using ocfs2console, but will also create and start the cluster stack O2CB. When the /etc/ocfs2/cluster.conf file is not present (as will be the case in our example), the ocfs2console tool will create this file along with a new cluster stack service (O2CB) with a default cluster name of ocfs2. This will need to be done on all nodes in the cluster as the root user account:

$ su -
# ocfs2console &

This will bring up the GUI as shown below:

Figure 8 ocfs2console GUI

Using the ocfs2console GUI tool, perform the following steps:

1. Select [Cluster] -> [Configure Nodes...]. This will start the OCFS Cluster Stack (Figure 9) and bring up the "Node Configuration" dialog.
2. On the "Node Configuration" dialog, click the [Add] button. This will bring up the "Add Node" dialog. In the "Add Node" dialog, enter the Host name and IP address for the first node in the cluster. Leave the IP Port set to its default value of 7777. In my example, I added both nodes using linux1 / 192.168.1.100 for the first node and linux2 / 192.168.1.101 for the second node. Click [Apply] on the "Node Configuration" dialog. All nodes should now be "Active" as shown in Figure 10. Click [Close] on the "Node Configuration" dialog.
3. After verifying all values are correct, exit the application using [File] -> [Quit]. This needs to be performed on all nodes in the cluster.

Figure 9. Starting the OCFS2 Cluster Stack

The following dialog shows the OCFS2 settings I used for the nodes linux1 and linux2:

Figure 10 Configuring Nodes for OCFS2

After exiting the ocfs2console, you will have a /etc/ocfs2/cluster.conf similar to the following. This process needs to be completed on all nodes in the cluster and the OCFS2 configuration file should be exactly the same for all of the nodes:

node:
    ip_port = 7777
    ip_address = 192.168.1.100
    number = 0
    name = linux1
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 192.168.1.101
    number = 1
    name = linux2
    cluster = ocfs2

cluster:
    node_count = 2
    name = ocfs2

O2CB Cluster Service

Before we can do anything with OCFS2 like formatting or mounting the file system, we need to first have OCFS2's cluster stack, O2CB, running (which it will be as a result of the configuration process performed above). The stack includes the following services:

NM: Node Manager that keeps track of all the nodes in the cluster.conf
HB: Heart beat service that issues up/down notifications when nodes join or leave the cluster
TCP: Handles communication between the nodes
DLM: Distributed lock manager that keeps track of all locks, its owners and status
CONFIGFS: User space driven configuration file system mounted at /config
DLMFS: User space interface to the kernel space DLM

All of the above cluster services have been packaged in the o2cb system service (/etc/init.d/o2cb). Here is a short listing of some of the more useful commands and options for the o2cb system service.

/etc/init.d/o2cb status

Module "configfs": Not loaded
Filesystem "configfs": Not mounted
Module "ocfs2_nodemanager": Not loaded
Module "ocfs2_dlm": Not loaded
Module "ocfs2_dlmfs": Not loaded
Filesystem "ocfs2_dlmfs": Not mounted

Note that with this example, all of the services are not loaded. I did an "unload" right before executing the "status" option. If you were to check the status of the o2cb service immediately after configuring OCFS using the ocfs2console utility, they would all be loaded.
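The status listing above can also be checked from a script. The following is a minimal sketch, assuming only that the status text keeps the "Not loaded" / "Not mounted" wording shown in the sample; the function name o2cb_stack_ready is hypothetical, and the wording printed for components that are loaded is deliberately not relied upon.

```shell
#!/bin/sh
# o2cb_stack_ready: hypothetical helper, not part of ocfs2-tools.
# Reads the text of an "o2cb status" run on stdin.
# Returns 0 if no component is reported as "Not loaded"/"Not mounted",
#         1 if at least one component is reported that way,
#         2 if there was no input at all.
o2cb_stack_ready() {
    status_text=$(cat)
    [ -n "$status_text" ] || return 2
    if printf '%s\n' "$status_text" | grep -Eq ': Not (loaded|mounted)'; then
        return 1
    fi
    return 0
}

# Demo using two lines from the sample status output in this guide.
sample='Module "configfs": Not loaded
Filesystem "configfs": Not mounted'
if printf '%s\n' "$sample" | o2cb_stack_ready; then
    echo "O2CB stack looks ready"
else
    echo "O2CB stack is not fully loaded"
fi
```

On a cluster node the function would be fed live output instead of the sample, for example: /etc/init.d/o2cb status | o2cb_stack_ready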

/etc/init.d/o2cb load

Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK

Loads all OCFS modules.

/etc/init.d/o2cb online ocfs2

Starting cluster ocfs2: OK

The above command will online the cluster we created, ocfs2.

/etc/init.d/o2cb offline ocfs2

Cleaning heartbeat on ocfs2: OK
Stopping cluster ocfs2: OK

The above command will offline the cluster we created, ocfs2.

/etc/init.d/o2cb unload

Unmounting ocfs2_dlmfs filesystem: OK
Unloading module "ocfs2_dlmfs": OK
Unmounting configfs filesystem: OK
Unloading module "configfs": OK

The above command will unload all OCFS modules.

Configure O2CB to Start on Boot

You now need to configure the on-boot properties of the O2CB driver so that the cluster stack services will start on each boot. All the tasks within this section will need to be performed on both nodes in the cluster.

Note: At the time of writing this guide, OCFS2 contains a bug wherein the driver does not get loaded on each boot even after configuring the on-boot properties to do so. After attempting to configure the on-boot properties to start on each boot according to the official OCFS2 documentation, you will still get the following error on each boot:

...
Mounting other filesystems:
mount.ocfs2: Unable to access cluster service
Cannot initialize cluster
mount.ocfs2: Unable to access cluster service
Cannot initialize cluster [FAILED]
...

Red Hat changed the way the service is registered between chkconfig-1.3.11.2-1 and chkconfig-1.3.13.2-1. The O2CB script used to work with the former.

Before attempting to configure the on-boot properties:

REMOVE the following lines in /etc/init.d/o2cb

### BEGIN INIT INFO
# Provides: o2cb
# Required-Start:
# Should-Start:
# Required-Stop:
# Default-Start: 2 3 5
# Default-Stop:
# Description: Load O2CB cluster services at system boot.
### END INIT INFO

Re-register the o2cb service.

# chkconfig --del o2cb
# chkconfig --add o2cb
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off

# ll /etc/rc3.d/*o2cb*
lrwxrwxrwx 1 root root 14 Sep 29 11:56 /etc/rc3.d/S24o2cb -> ../init.d/o2cb

The service should be S24o2cb in the default runlevel.

After resolving this bug, you can continue to set the on-boot properties as follows:

# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Configuring the O2CB driver.

This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.

Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting cluster ocfs2: OK

Format the OCFS2 Filesystem

You can now start to make use of the partitions created in the section Create Partitions on the Shared FireWire Storage Device. (Well, at least the first partition!)

If the O2CB cluster is offline, start it. The format operation needs the cluster to be online, as it needs to ensure that the volume is not mounted on some node in the cluster.

Earlier in this document, we created the directory /u02/oradata/orcl under the section Create Mount Point for OCFS / Clusterware. This section contains the commands to create and mount the file system to be used for the Cluster Manager - /u02/oradata/orcl.

Create the OCFS2 Filesystem

Unlike the other tasks in this section, creating the OCFS2 filesystem should only be executed on one node in the RAC cluster. You will be executing all commands in this section from linux1 only.

Note that it is possible to create and mount the OCFS2 file system using either the GUI tool ocfs2console or the command-line tool mkfs.ocfs2. From the ocfs2console utility, use the menu [Tasks] - [Format].

See the instructions below on how to create the OCFS2 file system using the command-line tool mkfs.ocfs2.

To create the filesystem, use the Oracle executable mkfs.ocfs2. For the purpose of this example, I run the following command only from linux1 as the root user account:

$ su -
# mkfs.ocfs2 -b 4K -C 32K -N 4 -L oradatafiles /dev/sda1
mkfs.ocfs2 1.0.2
Filesystem label=oradatafiles
Block size=4096 (bits=12)
Cluster size=32768 (bits=15)
Volume size=1011675136 (30873 clusters) (246984 blocks)
1 cluster groups (tail covers 30873 clusters, rest cover 30873 clusters)
Journal size=16777216
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing lost+found: done
mkfs.ocfs2 successful

Mount the OCFS2 Filesystem

Now that the file system is created, you can mount it. Let's first do it using the command-line, then I'll show how to include it in the /etc/fstab to have it mount on each boot.

Mounting the filesystem will need to be performed on all nodes in the Oracle RAC cluster as the root user account.

First, here is how to manually mount the OCFS2 file system from the command line. Remember, this needs to be performed as the root user account:

$ su -
# mount -t ocfs2 -o datavolume /dev/sda1 /u02/oradata/orcl

If the mount was successful, you will simply get your prompt back. You should, however, run the following checks to ensure the file system is mounted correctly.

Let's use the mount command to ensure that the new filesystem is really mounted. This should be performed on all nodes in the RAC cluster:

# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
none on /proc type proc (rw)

    none on /sys type sysfs (rw)
    none on /dev/pts type devpts (rw,gid=5,mode=620)
    usbfs on /proc/bus/usb type usbfs (rw)
    /dev/hda1 on /boot type ext3 (rw)
    none on /dev/shm type tmpfs (rw)
    none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
    sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
    cartman:SHARE2 on /cartman type nfs (rw,addr=192.168.1.120)
    configfs on /config type configfs (rw)
    ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
    /dev/sda1 on /u02/oradata/orcl type ocfs2 (rw,_netdev,datavolume)

Note: You are using the datavolume option to mount the new filesystem here. Oracle database users must mount any volume that will contain the Voting Disk file, Cluster Registry (OCR), Data files, Redo logs, Archive logs, and Control files with the datavolume mount option so as to ensure that the Oracle processes open the files with the o_direct flag.

Any other type of volume, including an Oracle home (not used in this guide), should not be mounted with this mount option.

The volume will mount after a short delay, usually around five seconds. It does so to let the heartbeat thread stabilize. In a future release, Oracle plans to add support for a global heartbeat, which will make most mounts instantaneous.

Configure OCFS to Mount Automatically at Startup

Let's review what you've done so far. You downloaded and installed OCFS2, which will be used to store the files needed by Cluster Manager. After going through the install, you loaded the OCFS2 module into the kernel and then formatted the clustered filesystem. Finally, you mounted the newly created filesystem. This section walks through the steps responsible for mounting the new OCFS2 file system each time the machine(s) are booted.

Start by adding the following line to the /etc/fstab file on all nodes in the RAC cluster:

    /dev/sda1    /u02/oradata/orcl    ocfs2    _netdev,datavolume    0 0

Notice the _netdev option for mounting this filesystem. The _netdev mount option is a must for OCFS2 volumes; it indicates that the volume is to be mounted after the network is started and dismounted before the network is shut down.

Now, let's make sure that the ocfs2.ko kernel module is being loaded and that the file system will be mounted during the boot process.

If you have been following along with the examples in this article, the actions to load the kernel module and mount the OCFS2 file system should already be enabled. However, you should still check those options by running the following on all nodes in the RAC cluster as the root user account:

    $ su -
    # chkconfig --list o2cb
    o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off

The flags that I have marked in bold should be set to "on".

Check Permissions on New OCFS2 Filesystem

Use the ls command to check ownership. The permissions should be set to 0775 with owner "oracle" and group "dba". If this is not the case for all nodes in the cluster (which was the case for me), then it is very possible that the "oracle" UID (175 in this example) and/or the "dba" GID (115 in this example) are not the same across all nodes.

Let's first check the permissions:

    # ls -ld /u02/oradata/orcl
    drwxr-xr-x 3 root root 4096 Sep 29 12:11 /u02/oradata/orcl

As you can see from the listing above, the oracle user account (and the dba group) will not be able to write to this directory. Let's fix that:

    # chown oracle.dba /u02/oradata/orcl
    # chmod 775 /u02/oradata/orcl

Let's now go back and re-check that the permissions are correct for each node in the cluster:

    # ls -ld /u02/oradata/orcl
    drwxrwxr-x 3 oracle dba 4096 Sep 29 12:11 /u02/oradata/orcl

Adjust the O2CB Heartbeat Threshold

This is a very important section when configuring OCFS2 for use by Oracle Clusterware's two shared files on our FireWire drive. During testing, I was able to install and configure OCFS2, format the new volume, and finally install Oracle Clusterware (with its two required shared files, the voting disk and OCR file), located on the new OCFS2 volume. I was able to install Oracle Clusterware and see the shared drive. However, during my evaluation I was receiving many lock-ups and hanging after about 15 minutes when the Clusterware software was running on both nodes. It always varied on which node would hang (either linux1 or linux2 in my example). It also didn't matter whether there was a high I/O load or none at all for it to crash (hang).

Keep in mind that the configuration you are creating is a rather low-end setup being configured with slow disk access with regards to the FireWire drive. This is by no means a high-end setup and is susceptible to bogus timeouts.

After looking through the trace files for OCFS2, it was apparent that access to the voting disk was too slow (exceeding the O2CB heartbeat threshold) and causing the Oracle Clusterware software (and the node) to crash.

The solution I used was to simply increase the O2CB heartbeat threshold from its default setting of 7 to 301 (and in some cases as high as 900).

This is a configurable parameter that is used to compute the time it takes for a node to "fence" itself.

First, let's see how to determine what the O2CB heartbeat threshold is currently set to. This can be done by querying the /proc file system as follows:

    # cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
    7

The value is 7, but what does this value represent? Well, it is used in the formula below to determine the fence time (in seconds):

    [fence time in seconds] = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2

So, with a O2CB heartbeat threshold of 7, you would have a fence time of:

    (7 - 1) * 2 = 12 seconds

You need a much longer fence time (600 seconds to be exact) given your slower FireWire disks. For 600 seconds, you will want a O2CB_HEARTBEAT_THRESHOLD of 301 as shown below:

    (301 - 1) * 2 = 600 seconds

Let's see now how to increase the O2CB heartbeat threshold from 7 to 301. This will need to be performed on both nodes in the cluster. You first need to modify the file /etc/sysconfig/o2cb and set O2CB_HEARTBEAT_THRESHOLD to 301:

    # O2CB_ENABLED: 'true' means to load the driver on boot.
    O2CB_ENABLED=true
    # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
    O2CB_BOOTCLUSTER=ocfs2
    # O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
    O2CB_HEARTBEAT_THRESHOLD=301

After modifying the file /etc/sysconfig/o2cb, you need to alter the o2cb configuration. Again, this should be performed on all nodes in the cluster.

    # umount /u02/oradata/orcl/
    # /etc/init.d/o2cb unload
    # /etc/init.d/o2cb configure
    Load O2CB driver on boot (y/n) [y]: y
    Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
    Writing O2CB configuration: OK
    Loading module "configfs": OK
    Mounting configfs filesystem at /config: OK
    Loading module "ocfs2_nodemanager": OK
    Loading module "ocfs2_dlm": OK
    Loading module "ocfs2_dlmfs": OK
    Mounting ocfs2_dlmfs filesystem at /dlm: OK
    Starting cluster ocfs2: OK

You can now check again to make sure the settings took effect for the o2cb cluster stack:

    # cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
    301

Important Note: The value of 301 used for the O2CB heartbeat threshold will not work for all the FireWire drives listed in this guide. Use the following chart to determine the O2CB heartbeat threshold value that should be used.

    FireWire Drive                                                                   O2CB Heartbeat Threshold Value
    Maxtor OneTouch II 300GB USB 2.0 / IEEE 1394a External Hard Drive - (E01G300)    301
    Maxtor OneTouch II 250GB USB 2.0 / IEEE 1394a External Hard Drive - (E01G250)    301
    Maxtor OneTouch II 200GB USB 2.0 / IEEE 1394a External Hard Drive - (E01A200)    301
    LaCie Hard Drive, Design by F.A. Porsche 250GB, FireWire 400 - (300703U)         600
    LaCie Hard Drive, Design by F.A. Porsche 160GB, FireWire 400 - (300702U)         600
    LaCie Hard Drive, Design by F.A. Porsche 80GB, FireWire 400 - (300699U)          600
    Dual Link Drive Kit, FireWire Enclosure, ADS Technologies - (DLX185)             901
    Maxtor OneTouch 250GB USB 2.0 / IEEE 1394a External Hard Drive - (A01A250)       600
    Maxtor OneTouch 200GB USB 2.0 / IEEE 1394a External Hard Drive - (A01A200)       600

Reboot Both Nodes

Before starting the next section, this would be a good place to reboot all of the nodes in the RAC cluster. When the machines come up, use the mount command to ensure that the cluster stack services are being loaded and the new OCFS2 file system is being mounted.
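The fence-time formula from the heartbeat discussion above is simple enough to script, which helps when trying the different threshold values from the drive chart. The following is only a sketch: `fence_time` and `threshold_for` are made-up helper names, not part of the o2cb tooling, and they do nothing more than evaluate (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 and its inverse.

```shell
#!/bin/sh
# Fence time in seconds for a given O2CB_HEARTBEAT_THRESHOLD,
# using the formula from this section: (threshold - 1) * 2
fence_time() {
    echo $(( ($1 - 1) * 2 ))
}

# Hypothetical inverse helper: the smallest threshold whose fence time
# is at least the requested number of seconds.
threshold_for() {
    echo $(( ($1 + 1) / 2 + 1 ))
}

# Show the fence time for the default and for the values used in the chart.
for t in 7 301 600 901; do
    echo "threshold $t -> $(fence_time "$t") seconds"
done
```

For example, `threshold_for 600` prints 301, the value written to /etc/sysconfig/o2cb in this guide.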

    # mount
    /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
    none on /proc type proc (rw)
    none on /sys type sysfs (rw)
    none on /dev/pts type devpts (rw,gid=5,mode=620)
    usbfs on /proc/bus/usb type usbfs (rw)
    /dev/hda1 on /boot type ext3 (rw)
    none on /dev/shm type tmpfs (rw)
    none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
    sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
    cartman:SHARE2 on /cartman type nfs (rw,addr=192.168.1.120)
    configfs on /config type configfs (rw)
    ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
    /dev/sda1 on /u02/oradata/orcl type ocfs2 (rw,_netdev,datavolume)

You should also verify that the O2CB heartbeat threshold is set correctly (to our new value of 301):

    # cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
    301

How to Determine OCFS2 Version

To determine which version of OCFS2 is running, use:

    # cat /proc/fs/ocfs2/version
    OCFS2 1.0.4 Fri Aug 26 12:31:58 PDT 2005 (build 0a22e88ab648dc8d2a1f9d7796ad101c)

17. Install & Configure Automatic Storage Management (ASMLib 2.0)

Most of the installation and configuration procedures should be performed on all nodes. Creating the ASM disks, however, will only need to be performed on a single node within the cluster.

In this section, you will configure ASM to be used as the filesystem / volume manager for all Oracle physical database files (data, online redo logs, control files, archived redo logs) and a Flash Recovery Area.

The ASM feature was introduced in Oracle Database 10g Release 1 and is used to relieve the DBA of having to manage individual files and drives. ASM is built into the Oracle kernel and provides the DBA with a way to manage thousands of disk drives 24x7 for both single and clustered instances of Oracle. All the files and directories to be used for Oracle will be contained in a disk group. ASM automatically performs load balancing in parallel across all available disk drives to prevent hot spots and maximize performance, even with rapidly changing data usage patterns.

I start this section by first discussing the ASMLib 2.0 libraries and its associated driver for Linux plus other methods for configuring ASM with Linux. Next, I will provide instructions for downloading the ASM drivers (ASMLib Release 2.0) specific to your Linux kernel. Last, you will install and configure the ASMLib 2.0 drivers while finishing off the section with a demonstration of how to create the ASM disks.

If you would like to learn more about the ASMLib, visit www.oracle.com/technology/tech/linux/asmlib/install.html.

Methods for Configuring ASM with Linux (For Reference Only)

When I first started this guide, I wanted to focus on using ASM for all database files. I was curious to see how well ASM works with this test RAC configuration with regard to load balancing and fault tolerance.

There are two different methods to configure ASM on Linux:

ASM with ASMLib I/O: This method creates all Oracle database files on raw block devices managed by ASM using ASMLib calls. Raw devices are not required with this method as ASMLib works with block devices.

ASM with Standard Linux I/O: This method creates all Oracle database files on raw character devices managed by ASM using standard Linux I/O system calls. You will be required to create raw devices for all disk partitions used by ASM.

We will examine the "ASM with ASMLib I/O" method here.

Before discussing the installation and configuration details of ASMLib, however, I thought it would be interesting to talk briefly about the second method, "ASM with Standard Linux I/O." If you were to use this method (which is a perfectly valid solution, just not the method we will be implementing here), you should be aware that Linux does not use raw devices by default. Every Linux raw device you want to use must be bound to the corresponding block device using the raw driver. For example, if you wanted to use the partitions we've created (/dev/sda2, /dev/sda3, and /dev/sda4), you would need to perform the following tasks:

1. Edit the file /etc/sysconfig/rawdevices as follows:

    # raw device bindings
    # format:  <rawdev> <major> <minor>
    #          <rawdev> <blockdev>
    # example: /dev/raw/raw1 /dev/sda1
    #          /dev/raw/raw2 8 5
    /dev/raw/raw2 /dev/sda2
    /dev/raw/raw3 /dev/sda3
    /dev/raw/raw4 /dev/sda4

   The raw device bindings will be created on each reboot.

2. You would then want to change ownership of all raw devices to the "oracle" user account:

    # chown oracle:dba /dev/raw/raw2; chmod 660 /dev/raw/raw2
    # chown oracle:dba /dev/raw/raw3; chmod 660 /dev/raw/raw3
    # chown oracle:dba /dev/raw/raw4; chmod 660 /dev/raw/raw4

3. The last step is to reboot the server to bind the devices or simply restart the rawdevices service:

    # service rawdevices restart

As I mentioned earlier, the above example was just to demonstrate that there is more than one method for using ASM with Linux. Now let's move on to the method that will be used for this article, "ASM with ASMLib I/O."

Download the ASMLib 2.0 Packages

First download the ASMLib 2.0 libraries (from OTN) and the driver (from my web site). Like OCFS, you need to download the version for the Linux kernel and number of processors on the machine. You are using kernel 2.6.9-11.0.0.10.3.EL #1 on single-processor machines:

    # uname -a
    Linux linux1 2.6.9-11.0.0.10.3.EL #1 Tue Jul 5 12:20:09 PDT 2005 i686 i686 i386 GNU/Linux

Oracle ASMLib Downloads for Red Hat Enterprise Linux 4 AS

    oracleasm-2.6.9-11.0.0.10.3.EL-2.0.0-1.i686.rpm - (Driver for "up" kernels)
    -OR-
    oracleasm-2.6.9-11.0.0.10.3.ELsmp-2.0.0-1.i686.rpm - (Driver for "smp" kernels)
    oracleasmlib-2.0.0-1.i386.rpm - (Userspace library)
    oracleasm-support-2.0.0-1.i386.rpm - (Driver support files)

Install ASMLib 2.0 Packages

This installation needs to be performed on all nodes as the root user account:

    $ su -
    # rpm -Uvh oracleasm-2.6.9-11.0.0.10.3.EL-2.0.0-1.i686.rpm \
          oracleasmlib-2.0.0-1.i386.rpm \
          oracleasm-support-2.0.0-1.i386.rpm
    Preparing...                ########################################### [100%]
       1:oracleasm-support      ########################################### [ 33%]
       2:oracleasm-2.6.9-11.0.0.########################################### [ 67%]
       3:oracleasmlib           ########################################### [100%]

Configuring and Loading the ASMLib 2.0 Packages

Now that you downloaded and installed the ASMLib Packages for Linux, you need to configure and load the ASM kernel module. This task needs to be run on all nodes as root:

    $ su -
    # /etc/init.d/oracleasm configure
    Configuring the Oracle ASM library driver.

    This will configure the on-boot properties of the Oracle ASM library
    driver. The following questions will determine whether the driver is
    loaded on boot and what permissions it will have. The current values
    will be shown in brackets ('[]'). Hitting <ENTER> without typing an
    answer will keep that current value. Ctrl-C will abort.

    Default user to own the driver interface []: oracle
    Default group to own the driver interface []: dba
    Start Oracle ASM library driver on boot (y/n) [n]: y
    Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
    Writing Oracle ASM library driver configuration: [ OK ]
    Creating /dev/oracleasm mount point: [ OK ]
    Loading module "oracleasm": [ OK ]
    Mounting ASMlib driver filesystem: [ OK ]
    Scanning system for ASM disks: [ OK ]

Create ASM Disks for Oracle

In Section 10, you created three Linux partitions to be used for storing Oracle database files like online redo logs, database files, control files, archived redo log files, and a flash recovery area.

Here is a list of those partitions we created for use by ASM:

Oracle ASM Partitions Created

    Filesystem Type   Partition   Size    Mount Point   File Types
    ASM               /dev/sda2   50GB    ORCL:VOL1     Oracle Database Files
    ASM               /dev/sda3   50GB    ORCL:VOL2     Oracle Database Files
    ASM               /dev/sda4   100GB   ORCL:VOL3     Flash Recovery Area
    Total                         200GB
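Choosing between the "up" and "smp" driver packages listed in the download step above depends only on the running kernel's release string. The following is a minimal sketch, assuming the package naming pattern used in this guide (`oracleasm-<kernel release>-2.0.0-1.i686.rpm`); `kernel_flavor` and `asm_driver_rpm` are illustrative helper names, not Oracle tools, and other ASMLib builds will carry different version suffixes.

```shell
#!/bin/sh
# Print "smp" or "up" for a kernel release string such as `uname -r` returns.
kernel_flavor() {
    case "$1" in
        *smp) echo smp ;;
        *)    echo up ;;
    esac
}

# Build the driver package file name for a kernel release.
# Assumption: the 2.0.0-1 / i686 suffix matches the packages named in this guide.
asm_driver_rpm() {
    echo "oracleasm-$1-2.0.0-1.i686.rpm"
}

# Example with the two kernel releases mentioned in this guide.
for rel in 2.6.9-11.0.0.10.3.EL 2.6.9-11.0.0.10.3.ELsmp; do
    echo "$(kernel_flavor "$rel"): $(asm_driver_rpm "$rel")"
done
```

On a live node the same helpers can be fed the real release, for example `asm_driver_rpm "$(uname -r)"`.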

The last task in this section is to create the ASM disks. Creating the ASM disks only needs to be done on one node as the root user account. I will be running these commands on linux1. On the other nodes, you will need to perform a scandisk to recognize the new volumes. When that is complete, you should then run the oracleasm listdisks command on all nodes to verify that all ASM disks were created and available.

    $ su -
    # /etc/init.d/oracleasm createdisk VOL1 /dev/sda2
    Marking disk "/dev/sda2" as an ASM disk [ OK ]
    # /etc/init.d/oracleasm createdisk VOL2 /dev/sda3
    Marking disk "/dev/sda3" as an ASM disk [ OK ]
    # /etc/init.d/oracleasm createdisk VOL3 /dev/sda4
    Marking disk "/dev/sda4" as an ASM disk [ OK ]

Note: If you are repeating this guide using the same hardware (actually, the same shared drive), you may get a failure when attempting to create the ASM disks. If you do receive a failure, try listing all ASM disks using:

    # /etc/init.d/oracleasm listdisks
    VOL1
    VOL2
    VOL3

As you can see, the results show that I have three volumes already defined. If you have the three volumes already defined from a previous run, go ahead and remove them using the following commands and then create them again using the above (oracleasm createdisk) commands:

    # /etc/init.d/oracleasm deletedisk VOL1
    Removing ASM disk "VOL1" [ OK ]
    # /etc/init.d/oracleasm deletedisk VOL2
    Removing ASM disk "VOL2" [ OK ]
    # /etc/init.d/oracleasm deletedisk VOL3
    Removing ASM disk "VOL3" [ OK ]

On all other nodes in the cluster, you must perform a scandisk to recognize the new volumes:

    # /etc/init.d/oracleasm scandisks
    Scanning system for ASM disks [ OK ]

You can now test that the ASM disks were successfully created by using the following command on all nodes as the root user account:

    # /etc/init.d/oracleasm listdisks
    VOL1
    VOL2
    VOL3

18. Download Oracle 10g RAC Software

The following download procedures only need to be performed on one node in the cluster!

The next logical step is to install Oracle Clusterware Release 2 (10.2.0.1.0), Oracle Database 10g Release 2 (10.2.0.1.0), and finally the Oracle Database 10g Companion CD Release 2 (10.2.0.1.0) for Linux x86 software. However, you must first download and extract the required Oracle software packages from OTN.

You will be downloading and extracting the required software from Oracle to only one of the Linux nodes in the cluster, namely linux1. You will perform all installs from this machine. The Oracle installer will copy the required software packages to all other nodes in the RAC configuration we set up in Section 13.

Login to one of the nodes in the Linux RAC cluster as the oracle user account. In this example, you will be downloading the required Oracle software to linux1 and saving it to /u01/app/oracle/orainstall.

Downloading and Extracting the Software

First, download the Oracle Clusterware Release 2 (10.2.0.1.0), Oracle Database 10g Release 2 (10.2.0.1.0), and Oracle Database 10g Companion CD Release 2 (10.2.0.1.0) software for Linux x86. All downloads are available from the same page.

As the oracle user account, extract the three packages you downloaded to a temporary directory. In this example, we will use /u01/app/oracle/orainstall.

Extract the Oracle Clusterware package as follows:

    # su - oracle
    $ cd ~oracle/orainstall
    $ unzip 10201_clusterware_linux32.zip

Then extract the Oracle Database Software:

.zip Finally. $ cd ~oracle/orainstall $ unzip 10201_database_linux32.com/technology/pub/articles/hunter_rac10gr2_2.Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireW...zip Page 1 Page 2 Page 3 20 of 20 1/4/2006 8:46 AM .. extract the Oracle Companion CD Software: $ cd ~oracle/orainstall $ unzip 10201_companion_linux32. http://www.oracle.html.

Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireWire (Continued)

For development and testing only; production deployments will not be supported!

19. Install Oracle 10g Clusterware Software

Perform the following installation procedures on only one node in the cluster! The Oracle Clusterware software will be installed to all other nodes in the cluster by the Oracle Universal Installer.

You are now ready to install the "cluster" part of the environment: the Oracle Clusterware. In the previous section, you downloaded and extracted the install files for Oracle Clusterware to linux1 in the directory /u01/app/oracle/orainstall/clusterware. This is the only node from which you need to perform the install.

During the installation of Oracle Clusterware, you will be asked for the nodes to include and configure in the RAC cluster. Once the actual installation starts, it will copy the required software to all nodes using the remote access we configured in Section 13 ("Configure RAC Nodes for Remote Access").

So, what exactly is the Oracle Clusterware responsible for? It contains all of the cluster and database configuration metadata along with several system management features for RAC. It allows the DBA to register and invite an Oracle instance (or instances) to the cluster. During normal operation, Oracle Clusterware will send messages (via a special ping operation) to all nodes configured in the cluster, often called the "heartbeat." If the heartbeat fails for any of the nodes, it checks with the Oracle Clusterware configuration files (on the shared disk) to distinguish between a real node failure and a network failure.

After installing Oracle Clusterware, the Oracle Universal Installer (OUI) used to install the Oracle 10g database software (next section) will automatically recognize these nodes. Like the Oracle Clusterware install you will be performing in this section, the Oracle Database 10g software install only needs to be run from one node. The OUI will copy the software packages to all nodes configured in the RAC cluster.

Oracle Clusterware Shared Files

The two shared files used by Oracle Clusterware will be stored on the OCFS2 filesystem we created earlier. The two shared Oracle Clusterware files are:

    Oracle Cluster Registry (OCR)
        Location: /u02/oradata/orcl/OCRFile
        Size: ~ 100MB
    CRS Voting Disk
        Location: /u02/oradata/orcl/CSSFile
        Size: ~ 20MB

Note: For our installation here, it is not possible to use ASM for the two Oracle Clusterware files (OCR or CRS Voting Disk). The problem is that these files need to be in place and accessible before any Oracle instances can be started. For ASM to be available, the ASM instance would need to be run first. The two shared files could be stored on the OCFS2, shared RAW devices, or another vendor's clustered file system.

Verifying Environment Variables

Before starting the OUI, you should first run the xhost command as root from the console to allow X Server connections. Then unset the ORACLE_HOME variable and verify that each of the nodes in the RAC cluster defines a unique ORACLE_SID. We also should verify that we are logged in as the oracle user account:

Login as oracle

    # xhost +
    access control disabled, clients can connect from any host
    # su - oracle

Unset ORACLE_HOME

    $ unset ORA_CRS_HOME
    $ unset ORACLE_HOME
    $ unset ORA_NLS10
    $ unset TNS_ADMIN

Verify Environment Variables on linux1

    $ env | grep ORA

    ORACLE_SID=orcl1
    ORACLE_BASE=/u01/app/oracle
    ORACLE_TERM=xterm

Verify Environment Variables on linux2

    $ env | grep ORA
    ORACLE_SID=orcl2
    ORACLE_BASE=/u01/app/oracle
    ORACLE_TERM=xterm

Installing Cluster Ready Services

Note: CSS Timeout Computation in Oracle RAC 10g 10.1.0.3

Please note that after the Oracle Clusterware software is installed, you will need to modify the CSS timeout value for Clusterware. This is especially true for 10.1.0.3 and later as the CSS timeout is computed differently than with 10.1.0.2. Several problems have been documented as a result of the CSS daemon timing out starting with Oracle 10.1.0.3 on the Linux platform (including IA32, IA64, and x86-64). This has been a big problem for me in the past, especially during database creation (DBCA). During database creation, for example, it was not uncommon for the process to fail with the error: ORA-03113: end-of-file on communication channel. The key error was reported in the log file $ORA_CRS_HOME/css/log/ocssd1.log as:

    clssnmDiskPingMonitorThread: voting device access hanging (45010 miliseconds)

The problem is essentially slow disks and the default value for CSS misscount. The CSS misscount value is the number of heartbeats missed before CSS evicts a node. CSS uses this number to calculate the time after which an I/O to the voting disk should be considered timed out, at which point it terminates itself to prevent split brain conditions. The default value for CSS misscount on Linux for Oracle 10.1.0.2 and higher is 60. The formula for calculating the timeout value (in seconds), however, did change from release 10.1.0.2 to 10.1.0.3.

With 10.1.0.2, the timeout value was calculated as follows:

    time_in_secs > CSS misscount, then EXIT

With the default value of 60, for example, the timeout period would be 60 seconds.

Starting with 10.1.0.3, the formula was changed to:

    disktimeout_in_secs = MAX((3 * CSS misscount)/4, CSS misscount - 15)

Again, using the default CSS misscount value of 60, this would result in a timeout of 45 seconds.

This change was motivated mainly in order to allow for a faster cluster reconfiguration in case of node failure. With the default CSS misscount value of 60 in 10.1.0.2, we would have to wait at least 60 seconds for a timeout, where this same default value of 60 can be shaved by 15 seconds to 45 seconds starting with 10.1.0.3.

OK, so why all the talk about CSS misscount? As I mentioned earlier, I would often have the database creation process fail (or other high I/O loads to the system) from the Oracle Clusterware crashing. The high I/O would cause lengthy timeouts for CSS while attempting to query the voting disk. When the calculated timeout was exceeded, Oracle Clusterware crashed. This has been common with this article as the FireWire drives we are using are not the fastest. The slower the drive, the more often this will occur.

Well, the good news is that you can modify the CSS misscount value from its default value of 60 (for Linux) to allow for lengthier timeouts. For the drives you have been using with this article, you can get away with a CSS misscount value of 360. Although I haven't been able to verify this, I believe the CSS misscount can be set as large as 600.

So how do you modify the default value for CSS misscount? Well, there are several ways. The easiest way is to modify the root.sh for Oracle Clusterware before running it on each node in the cluster. (The instructions for modifying the root.sh script for Oracle Clusterware can be found in the "Execute Configuration Scripts" step of the installation below.)

If Oracle Clusterware is already installed, you can still modify the CSS misscount value using the $ORA_CRS_HOME/bin/crsctl command. (The instructions for verifying and modifying the CSS misscount using crsctl can be found in the section "Verify Oracle Clusterware / CSS misscount value".)

Perform the following tasks to install the Oracle Clusterware:

    $ cd ~oracle
    $ /u01/app/oracle/orainstall/clusterware/runInstaller -ignoreSysPrereqs

Screen Name / Response

Welcome Screen
    Click Next.

Specify Inventory directory and credentials
    Accept the default values:
    Inventory directory: /u01/app/oracle/oraInventory
    Operating System group name: dba

Specify Home Details
    Leave the default value for the Source directory. Set the destination for the ORACLE_HOME name (actually the $ORA_CRS_HOME that I will be using in this article) and location as follows:
    Name: OraCrs10g_home
    Location: /u01/app/oracle/product/crs

Product-Specific Prerequisite Checks
    The installer will run through a series of checks to determine if the node meets the minimum requirements for installing and configuring the Oracle Clusterware software. If any of the checks fail, you will need to manually verify the check that failed by clicking on the checkbox. For my installation, all checks passed with no problems.
    Click Next to continue.

Specify Cluster Configuration
    Cluster Name: crs

    Public Node Name   Private Node Name   Virtual Node Name
    linux1             int-linux1          vip-linux1
    linux2             int-linux2          vip-linux2

Specify Network Interface Usage
    Interface Name   Subnet        Interface Type
    eth0             192.168.1.0   Public
    eth1             192.168.2.0   Private

Specify OCR Location
    Starting with Oracle Database 10g Release 2 (10.2) with RAC, Oracle Clusterware provides for the creation of a mirrored OCR file, enhancing cluster reliability. For the purpose of this example, I did choose to mirror the OCR file by keeping the default option of "Normal Redundancy":
    Specify OCR Location: /u02/oradata/orcl/OCRFile
    Specify OCR Mirror Location: /u02/oradata/orcl/OCRFile_mirror

Specify Voting Disk Location
    Starting with Oracle Database 10g Release 2 (10.2) with RAC, CSS has been modified to allow you to configure CSS with multiple voting disks. In Release 1 (10.1), you could configure only one voting disk. By enabling multiple voting disk configuration, the redundant voting disks allow you to configure a RAC database with multiple voting disks on independent shared physical disks. This option facilitates the use of the iSCSI network protocol, and other Network Attached Storage (NAS) storage solutions. Note that to take advantage of the benefits of multiple voting disks, you must configure at least three voting disks. For the purpose of this example, I did choose to mirror the voting disk by keeping the default option of "Normal Redundancy":
    Voting Disk Location: /u02/oradata/orcl/CSSFile
    Additional Voting Disk 1 Location: /u02/oradata/orcl/CSSFile_mirror1
    Additional Voting Disk 2 Location: /u02/oradata/orcl/CSSFile_mirror2

Summary
    For some reason, the OUI fails to create the directory "$ORA_CRS_HOME/log" before starting the installation. You should manually create this directory before clicking the "Install" button. For this installation, manually create the directory /u01/app/oracle/product/crs/log on all nodes in the cluster. The OUI will log all errors to a log file in this directory only if it exists.
    Click Install to start the installation!

Execute Configuration Scripts
    After the installation has completed, you will be prompted to run the orainstRoot.sh and root.sh scripts. Open a new console window on each node in the RAC cluster (starting with the node you are performing the install from), as the "root" user account.

    Navigate to the /u01/app/oracle/oraInventory directory and run orainstRoot.sh ON ALL NODES in the RAC cluster.

    Within the same new console window on each node in the RAC cluster (starting with the node you are performing the install from), stay logged in as the "root" user account.

    As mentioned earlier in the "CSS Timeout Computation in 10g RAC 10.1.0.3" section, you should modify the entry for CSS misscount from 60 to 360 in the file $ORA_CRS_HOME/install/rootconfig as follows (on each node in the cluster). Change the following entry that can be found on line 356:
    CLSCFG_MISCNT="-misscount 60"
    to
    CLSCFG_MISCNT="-misscount 360"

    Now, navigate to the /u01/app/oracle/product/crs directory and locate the root.sh file for each node in the cluster (starting with the node you are performing the install from). Run the root.sh file ON ALL NODES in the RAC cluster ONE AT A TIME.
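To see what raising the misscount from 60 to 360 buys, the 10.1.0.3-and-later timeout formula from the "CSS Timeout Computation" note can be evaluated directly, and the rootconfig change can be written as a one-line filter. This is a rough sketch only: it assumes the second argument of MAX in that formula is `CSS misscount - 15` and that the rootconfig entry appears exactly as quoted above. `css_disk_timeout` and `set_misscount` are illustrative names, not Oracle utilities.

```shell
#!/bin/sh
# Voting-disk I/O timeout in seconds for a given CSS misscount, assuming:
#   disktimeout_in_secs = MAX((3 * misscount) / 4, misscount - 15)
css_disk_timeout() {
    a=$(( (3 * $1) / 4 ))
    b=$(( $1 - 15 ))
    if [ "$a" -gt "$b" ]; then echo "$a"; else echo "$b"; fi
}

# Filter that rewrites the misscount entry to the value given as $1.
# Reads rootconfig-style text on stdin and writes the result to stdout.
set_misscount() {
    sed "s/^CLSCFG_MISCNT=\"-misscount [0-9]*\"/CLSCFG_MISCNT=\"-misscount $1\"/"
}

echo "misscount 60  -> $(css_disk_timeout 60) seconds"
echo "misscount 360 -> $(css_disk_timeout 360) seconds"
echo 'CLSCFG_MISCNT="-misscount 60"' | set_misscount 360
```

Against the real file, writing the filtered output to a copy (for example `set_misscount 360 < rootconfig > rootconfig.new`) and reviewing it before swapping it in is safer than editing in place.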

    You will receive several warnings while running the root.sh script on all nodes. These warnings can be safely ignored.

    The root.sh may take a while to run. When running the root.sh on the last node, you will receive a critical error and the output should look like:

    ...
    Expecting the CRS daemons to be up within 600 seconds.
    CSS is active on these nodes.
        linux1
        linux2
    CSS is active on all nodes.
    Waiting for the Oracle CRSD and EVMD to start
    Oracle CRS stack installed and running under init(1M)
    Running vipca(silent) for configuring nodeapps
    The given interface(s), "eth0" is not public. Public interfaces should be used to configure virtual IPs.

    This issue is specific to Oracle 10.2.0.1 (noted in bug 4437727) and needs to be resolved before continuing. The easiest workaround is to re-run vipca (GUI) manually as root from the last node in which the error occurred. Please keep in mind that vipca is a GUI, so you will need to set your DISPLAY variable according to your X server:

    # $ORA_CRS_HOME/bin/vipca

    When the "VIP Configuration Assistant" appears, this is how I answered the screen prompts:

    Welcome: Click Next
    Network interfaces: Select both interfaces - eth0 and eth1
    Virtual IPs for cluster nodes:
        Node Name: linux1
        IP Alias Name: vip-linux1
        IP Address: 192.168.1.200
        Subnet Mask: 255.255.255.0

        Node Name: linux2
        IP Alias Name: vip-linux2
        IP Address: 192.168.1.201
        Subnet Mask: 255.255.255.0
    Summary: Click Finish
    Configuration Assistant Progress Dialog: Click OK after configuration is complete.
    Configuration Results: Click Exit

    Go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.

End of installation
    At the end of the installation, exit from the OUI.

Verify Oracle Clusterware / CSS misscount value

In the section "CSS Timeout Computation in 10g RAC 10.1.0.3", I mentioned the need to modify the CSS misscount value from its default value of 60 to 360 (or higher). Within that section I explained how to accomplish that by modifying the root.sh script before running it on each node in the cluster. If you were not able to modify the CSS misscount value within the root.sh script, you can still perform this action by using the $ORA_CRS_HOME/bin/crsctl program. For example, to obtain the current value for CSS misscount, use the following:

    $ORA_CRS_HOME/bin/crsctl get css misscount
    360

If you get back a value of 60, you will want to modify it to 360 as follows:

    Start only one node in the cluster. For my example, I would shut down linux2 and start up only linux1.
    From the one node (linux1), login as the root user account and type:
        $ORA_CRS_HOME/bin/crsctl set css misscount 360
    Reboot the single node (linux1).
    Start all other nodes in the cluster.

Verify Oracle Clusterware Installation

After the installation of Oracle Clusterware, we can run through several tests to verify the install was successful. Run the following commands on all nodes in the RAC cluster.

Check cluster nodes

    $ /u01/app/oracle/product/crs/bin/olsnodes -n
    linux1 1
    linux2 2

Check Oracle Clusterware Auto-Start Scripts

    $ ls -l /etc/init.d/init.*
    -r-xr-xr-x 1 root root  1951 Oct 4 14:21 /etc/init.d/init.crs*
    -r-xr-xr-x 1 root root  4714 Oct 4 14:21 /etc/init.d/init.crsd*
    -r-xr-xr-x 1 root root 35394 Oct 4 14:21 /etc/init.d/init.cssd*
    -r-xr-xr-x 1 root root  3190 Oct 4 14:21 /etc/init.d/init.evmd*

20. Install Oracle Database 10g Software

Perform the following installation procedures on only one node in the cluster! The Oracle database software will be installed to all other nodes in the cluster by the Oracle Universal Installer.

After successfully installing the Oracle Clusterware software, the next step is to install the Oracle Database 10g Release 2 (10.2.0.1.0) with RAC.

For the purpose of this example, you will forgo the "Create Database" option when installing the software. You will, instead, create the database using the Database Configuration Assistant (DBCA) after the install.

Verify Environment Variables

Before starting the OUI, you should first run the xhost command as root from the console to allow X Server connections. Then unset the ORACLE_HOME variable and verify that each of the nodes in the RAC cluster defines a unique ORACLE_SID. We also should verify that we are logged in as the oracle user account:

Login as oracle

    # xhost +
    access control disabled, clients can connect from any host
    # su - oracle

Unset ORACLE_HOME

    $ unset ORA_CRS_HOME
    $ unset ORACLE_HOME
    $ unset ORA_NLS10
    $ unset TNS_ADMIN

Verify Environment Variables on linux1

    $ env | grep ORA
    ORACLE_SID=orcl1
    ORACLE_BASE=/u01/app/oracle
    ORACLE_TERM=xterm

Verify Environment Variables on linux2

    $ env | grep ORA
    ORACLE_SID=orcl2
    ORACLE_BASE=/u01/app/oracle
    ORACLE_TERM=xterm

Install Oracle Database 10g Release 2 Software

Install the Oracle Database 10g Release 2 software with the following:

    $ cd ~oracle
    $ /u01/app/oracle/orainstall/database/runInstaller -ignoreSysPrereqs

Screen Name: Response
Welcome Screen: Click Next
Select Installation Type: I selected the Enterprise Edition option.

Specify Home Details: Set the destination for the ORACLE_HOME name and location as follows:
    Name: OraDb10g_home1
    Location: /u01/app/oracle/product/10.2.0/db_1
Specify Hardware Cluster Installation Mode: Select the Cluster Installation option then select all nodes available. Click Select All to select all servers: linux1 and linux2. If the installation stops here and the status of any of the RAC nodes is "Node not reachable", perform the following checks:
    Ensure Oracle Clusterware is running on the node in question.
    Ensure you are able to reach the node in question from the node you are performing the installation from.
Product-Specific Prerequisite Checks: The installer will run through a series of checks to determine if the node meets the minimum requirements for installing and configuring the Oracle database software. If any of the checks fail, you will need to manually verify the check that failed by clicking on the checkbox. For my installation, I had only one of the checks fail:
    Checking for ip_local_port_range=1024 - 65000; found ip_local_port_range=32768 - 61000. Failed
Simply click the check-box for "Checking kernel parameters" then click Next to continue.
Select Database Configuration: Select the option to "Install database software only." Remember that we will create the clustered database as a separate step using DBCA.
Summary: For some reason, the OUI fails to create a $ORACLE_HOME/log directory for the installation before starting the installation. You should manually create this directory first. The OUI will log all errors to a log file in this directory only if it exists. For this installation, manually create the directory /u01/app/oracle/product/10.2.0/db_1/log on the node you are performing the installation from. For me, this was "linux1". Click on Install to start the installation!
Root Script Window - Run root.sh: After the installation has completed, you will be prompted to run the root.sh script. It is important to keep in mind that the root.sh script will need to be run on all nodes in the RAC cluster one at a time starting with the node you are running the database installation from. First, open a new console window on the node you are installing the Oracle 10g database software from as the root user account. Navigate to the /u01/app/oracle/product/10.2.0/db_1 directory and run root.sh. After running the root.sh script on all nodes in the cluster, go back to the OUI and acknowledge the "Execute Configuration scripts" dialog window.
End of installation: At the end of the installation, exit from the OUI.

21. Create TNS Listener Process

Perform the following configuration procedures from only one node in the cluster! The Network Configuration Assistant (NETCA) will set up the TNS listener in a clustered configuration on all nodes in the cluster.

The DBCA requires the Oracle TNS Listener process to be configured and running on all nodes in the RAC cluster before it can create the clustered database.

The process of creating the TNS listener only needs to be performed on one node in the cluster. All changes will be made and replicated to all nodes in the cluster. On one of the nodes (I will be using linux1) bring up the NETCA and run through the process of creating a new TNS listener process and also configure the node for local access.

Before running the NETCA, make sure to re-login as the oracle user and verify that the $ORACLE_HOME environment variable is set to the proper location.
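The ORACLE_HOME check described above can be scripted before launching netca. The following is only a minimal sketch and not part of the original procedure: the function name check_oracle_home is hypothetical (it is not an Oracle utility), and the expected location is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle utility): confirm that ORACLE_HOME
# is set and matches the expected directory before starting netca.
check_oracle_home() {
    expected=$1
    # An unset or empty ORACLE_HOME is what you get when reusing the
    # console window in which the variable was unset earlier.
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set"
        return 1
    fi
    if [ "$ORACLE_HOME" != "$expected" ]; then
        echo "ORACLE_HOME is $ORACLE_HOME, expected $expected"
        return 1
    fi
    echo "ORACLE_HOME is set to $ORACLE_HOME"
}
```

With the home used in this guide, one way to use it would be to run check_oracle_home /u01/app/oracle/product/10.2.0/db_1 and start netca only when it succeeds.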

If you attempt to use the console window used in the previous section, remember that we unset the $ORACLE_HOME environment variable. This will result in a failure when attempting to run netca.

To start the NETCA, run the following GUI utility as the oracle user account:

    # su - oracle
    $ netca &

The following screens walk you through the process of creating a new Oracle listener for our RAC environment.

Screen Name: Response
Select the Type of Oracle Net Services Configuration: Select Cluster Configuration
Select the nodes to configure: Select all of the nodes: linux1 and linux2.
Type of Configuration: Select Listener configuration.
Listener Configuration - Next 6 Screens: The following screens are now like any other normal listener configuration. You can simply accept the default parameters for the next six screens:
    What do you want to do: Add
    Listener name: LISTENER
    Selected protocols: TCP
    Port number: 1521
    Configure another listener: No
    Listener configuration complete! [ Next ]
You will be returned to this Welcome (Type of Configuration) Screen.
Type of Configuration: Select Naming Methods configuration.
Naming Methods Configuration: The following screens are:
    Selected Naming Methods: Local Naming
    Naming Methods configuration complete! [ Next ]
You will be returned to this Welcome (Type of Configuration) Screen.
Type of Configuration: Click Finish to exit the NETCA.

The Oracle TNS listener process should now be running on all nodes in the RAC cluster:

    $ hostname
    linux1
    $ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
    LISTENER_LINUX1
    =====================
    $ hostname
    linux2
    $ ps -ef | grep lsnr | grep -v 'grep' | grep -v 'ocfs' | awk '{print $9}'
    LISTENER_LINUX2

22. Install Oracle Database 10g Companion CD Software

Perform the following installation procedures from only one node in the cluster! The Oracle Database 10g Companion CD software will be installed to all other nodes in the cluster by the Oracle Universal Installer.

After successfully installing the Oracle Database software, the next step is to install the Oracle Database 10g Release 2 Companion CD software (10.2.0.1.0).

Please keep in mind that this is an optional step. For the purpose of this guide, my testing database will often make use of the Java Virtual Machine (Java VM) and Oracle interMedia and therefore will require the installation of the Oracle Database 10g Companion CD. The type of installation to perform will be the Oracle Database 10g Products installation type.

This installation type includes the Natively Compiled Java Libraries (NCOMP) files to improve Java performance. If you do not install the NCOMP files, the ORA-29558:JAccelerator (NCOMP) not installed error occurs when a database that uses Java VM is upgraded to the patch release.

Install Companion CD Software

Install the Companion CD software with the following:

    $ cd ~oracle
    $ /u01/app/oracle/orainstall/companion/runInstaller -ignoreSysPrereqs

Screen Name: Response
Welcome Screen: Click Next
Select a Product to Install: Select the "Oracle Database 10g Products 10.2.0.1.0" option.
Specify Home Details: Set the destination for the ORACLE_HOME name and location to that of the previous Oracle10g Database software install as follows:
    Name: OraDb10g_home1
    Location: /u01/app/oracle/product/10.2.0/db_1
Specify Hardware Cluster Installation Mode: The Cluster Installation option will be selected along with all of the available nodes in the cluster by default. Stay with these default options and click Next to continue. If the installation stops here and the status of any of the RAC nodes is "Node not reachable", perform the following checks:
    Ensure Oracle Clusterware is running on the node in question.
    Ensure you are able to reach the node in question from the node you are performing the installation from.
Product-Specific Prerequisite Checks: The installer will run through a series of checks to determine if the node meets the minimum requirements for installing and configuring the Companion CD Software. If any of the checks fail, you will need to manually verify the check that failed by clicking on the checkbox. For my installation, all checks passed with no problems. Click on Next to continue.
Summary: On the Summary screen, click Install to start the installation!
End of installation: At the end of the installation, exit from the OUI.

23. Create the Oracle Cluster Database

The database creation process should only be performed from one node in the cluster!

We will use the DBCA to create the clustered database.

Before executing the DBCA, make sure that $ORACLE_HOME and $PATH are set appropriately for the $ORACLE_BASE/product/10.2.0/db_1 environment.

You should also verify that all services we have installed up to this point (Oracle TNS listener, Oracle Clusterware processes, etc.) are running before attempting to start the clustered database creation process.

Create the Clustered Database

To start the database creation process, run the following:

    # xhost +
    access control disabled, clients can connect from any host
    # su - oracle
    $ dbca &

Screen Name: Response
Welcome Screen: Select "Oracle Real Application Clusters database."
Operations: Select Create a Database.
Node Selection: Click on the Select All button to select all servers: linux1 and linux2.
Database Templates: Select Custom Database.

Database Identification: Select:
    Global Database Name: orcl.idevelopment.info
    SID Prefix: orcl
I used idevelopment.info for the database domain. You may use any domain. Keep in mind that this domain does not have to be a valid DNS domain.
Management Option: Leave the default options here, which is to "Configure the Database with Enterprise Manager / Use Database Control for Database Management."
Database Credentials: I selected to Use the Same Password for All Accounts. Enter the password (twice) and make sure the password does not start with a digit.
Storage Options: For this guide, we will select to use ASM.
Create ASM Instance: Supply the SYS password to use for the new ASM instance. Also, starting with Release 2, the ASM instance server parameter file (SPFILE) needs to be on a shared disk. You will need to modify the default entry for "Create server parameter file (SPFILE)" to reside on the OCFS2 partition as follows: /u02/oradata/orcl/dbs/spfile+ASM.ora. All other options can stay at their defaults. You will then be prompted with a dialog box asking if you want to create and start the ASM instance. Select the OK button to acknowledge this dialog. The DBCA will now create and start the ASM instance on all nodes in the RAC cluster.
ASM Disk Groups: To start, click the Create New button. This will bring up the "Create Disk Group" window with the three volumes we configured earlier using ASMLib. If the volumes we created earlier in this article (ORCL:VOL1, ORCL:VOL2, and ORCL:VOL3) do not show up in the "Select Member Disks" window, then click on the "Change Disk Discovery Path" button and input "ORCL:VOL*".
For the first "Disk Group Name", I used the string "ORCL_DATA1". Select the first two ASM volumes (ORCL:VOL1 and ORCL:VOL2) in the "Select Member Disks" window. Keep the "Redundancy" setting to "Normal". After verifying all values in this window are correct, click the [OK] button. This will present the "ASM Disk Group Creation" dialog. These two volumes should now have a status of "PROVISIONED". When the ASM Disk Group Creation process is finished, you will be returned to the "ASM Disk Groups" window.
Click the Create New button again. For the second "Disk Group Name", I used the string FLASH_RECOVERY_AREA. Select the last ASM volume (ORCL:VOL3) in the "Select Member Disks" window. Set the "Redundancy" option to "External". After verifying all values in this window are correct, click on the [OK] button. This will present the "ASM Disk Group Creation" dialog. This final volume will also be changed to a status of "PROVISIONED". When the ASM Disk Group Creation process is finished, you will be returned to the "ASM Disk Groups" window with two disk groups created and selected.
Select only one of the disk groups by using the checkbox next to the newly created Disk Group Name ORCL_DATA1 (ensure that the disk group for FLASH_RECOVERY_AREA is not selected) and click [Next] to continue.
Database File Locations: I selected to use the default, which is to use Oracle Managed Files:
    Database Area: +ORCL_DATA1
Recovery Configuration: Check the option for "Specify Flash Recovery Area". For the Flash Recovery Area, use the disk group name +FLASH_RECOVERY_AREA. My disk group has a size of about 100GB. I used a Flash Recovery Area Size of 90GB (91136 MB).
Database Content: I left all of the Database Components (and destination tablespaces) set to their default value, although it is perfectly OK to select the Example Schemas. This option is available since we installed the Oracle Companion CD software.
Database Services: For this test configuration, click Add, and enter orcltest as the "Service Name." Leave both instances set to Preferred and for the "TAF Policy" select Basic.
Initialization Parameters: Change any parameters for your environment. I left them all at their default settings.

Database Storage: Change any parameters for your environment. I left them all at their default settings.
Creation Options: Keep the default option Create Database selected and click Finish to start the database creation process. Click OK on the "Summary" screen.
End of Database Creation: At the end of the database creation, exit from the DBCA. When exiting the DBCA, another dialog will come up indicating that it is starting all Oracle instances and HA service "orcltest". This may take several minutes to complete. When finished, all windows and dialog boxes will disappear.

When the DBCA has completed, you will have a fully functional Oracle RAC cluster running!

Create the orcltest Service

During the creation of the Oracle clustered database, you added a service named orcltest that will be used to connect to the database with TAF enabled. During several of my installs, the service was added to the tnsnames.ora, but was never updated as a service for each Oracle instance.

Use the following to verify the orcltest service was successfully added:

    SQL> show parameter service

    NAME                 TYPE        VALUE
    -------------------- ----------- --------------------------------
    service_names        string      orcl.idevelopment.info, orcltest

If the only service defined was for orcl.idevelopment.info, then you will need to manually add the service to both instances:

    SQL> show parameter service

    NAME                 TYPE        VALUE
    -------------------- ----------- --------------------------
    service_names        string      orcl.idevelopment.info

    SQL> alter system set service_names =
      2  'orcl.idevelopment.info, orcltest.idevelopment.info' scope=both;

24. Verify TNS Networking Files

Ensure that the TNS networking files are configured on all nodes in the cluster!

listener.ora

We already covered how to create a TNS listener configuration file (listener.ora) for a clustered environment in Section 21. The listener.ora file should be properly configured and no modifications should be needed.

For clarity, I have included a copy of the listener.ora file from my node linux1 in this guide's support files. I've also included a copy of my tnsnames.ora file that was configured by Oracle and can be used for testing the Transparent Application Failover (TAF). This file should already be configured on each node in the RAC cluster.

You can include any of these entries on other client machines that need access to the clustered database.

Connecting to Clustered Database From an External Client

This is an optional step, but I like to perform it in order to verify my TNS files are configured correctly. Use another machine (e.g., a Windows machine connected to the network) that has Oracle installed (either 9i or 10g) and add the TNS entries (in the tnsnames.ora) from either of the nodes in the cluster that were created for the clustered database.

Then try to connect to the clustered database using all available service names defined in the tnsnames.ora file:

    C:\> sqlplus system/manager@orcl2
    C:\> sqlplus system/manager@orcl1
    C:\> sqlplus system/manager@orcltest
    C:\> sqlplus system/manager@orcl
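On a Linux or UNIX client, the same connectivity check can be looped over every alias instead of being typed four times. The sketch below is illustrative only: check_aliases is a hypothetical function name, and it takes the checker command as its first argument so that it makes no assumption about which Oracle tool is used.

```shell
#!/bin/sh
# Hypothetical helper: run a checker command once per TNS alias and
# report which aliases pass. Returns non-zero if any alias fails.
check_aliases() {
    checker=$1
    shift
    failed=0
    for tns_alias in "$@"; do
        # The checker's own output is discarded; only its exit status counts.
        if "$checker" "$tns_alias" >/dev/null 2>&1; then
            echo "OK   $tns_alias"
        else
            echo "FAIL $tns_alias"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, check_aliases tnsping orcl1 orcl2 orcltest orcl would test the four aliases used above, on the assumption that tnsping is on the PATH and exits with a non-zero status when an alias cannot be reached; verify that behavior on your client before relying on it.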

25. Create / Alter Tablespaces

When creating the clustered database, we left all tablespaces set to their default size. If you are using a large drive for the shared storage, you may want to make a sizable testing database.

Below are several optional SQL commands for modifying and creating all tablespaces for the test database. Please keep in mind that the database file names (OMF files) used in this example may differ from what Oracle creates for your environment. The following query can be used to determine the file names for your environment:

    SQL> select tablespace_name, file_name
      2  from dba_data_files
      3  union
      4  select tablespace_name, file_name
      5  from dba_temp_files;

    TABLESPACE_NAME FILE_NAME
    --------------- --------------------------------------------------
    EXAMPLE         +ORCL_DATA1/orcl/datafile/example.257.570913311
    INDX            +ORCL_DATA1/orcl/datafile/indx.270.570920045
    SYSAUX          +ORCL_DATA1/orcl/datafile/sysaux.260.570913287
    SYSTEM          +ORCL_DATA1/orcl/datafile/system.262.570913215
    TEMP            +ORCL_DATA1/orcl/tempfile/temp.258.570913303
    UNDOTBS1        +ORCL_DATA1/orcl/datafile/undotbs1.261.570913263
    UNDOTBS2        +ORCL_DATA1/orcl/datafile/undotbs2.265.570913331
    USERS           +ORCL_DATA1/orcl/datafile/users.264.570913355

    $ sqlplus "/ as sysdba"

    SQL> create user scott identified by tiger default tablespace users;
    SQL> grant dba, resource, connect to scott;

    SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/users.264.570913355' resize 1024m;
    SQL> alter tablespace users add datafile '+ORCL_DATA1' size 1024m autoextend off;

    SQL> create tablespace indx datafile '+ORCL_DATA1' size 1024m
      2  autoextend on next 50m maxsize unlimited
      3  extent management local autoallocate
      4  segment space management auto;

    SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/system.262.570913215' resize 800m;

    SQL> alter database datafile '+ORCL_DATA1/orcl/datafile/sysaux.260.570913287' resize 500m;

    SQL> alter tablespace undotbs1 add datafile '+ORCL_DATA1' size 1024m
      2  autoextend on next 50m maxsize 2048m;

    SQL> alter tablespace undotbs2 add datafile '+ORCL_DATA1' size 1024m
      2  autoextend on next 50m maxsize 2048m;

    SQL> alter database tempfile '+ORCL_DATA1/orcl/tempfile/temp.258.570913303' resize 1024m;

Here is a snapshot of the tablespaces I have defined for my test database environment:

    Status    Tablespace Name TS Type      Ext. Mgt.  Seg. Mgt.
    --------- --------------- ------------ ---------- ---------
    ONLINE    UNDOTBS1        UNDO         LOCAL      MANUAL
    ONLINE    SYSAUX          PERMANENT    LOCAL      AUTO
    ONLINE    USERS           PERMANENT    LOCAL      AUTO
    ONLINE    SYSTEM          PERMANENT    LOCAL      MANUAL
    ONLINE    EXAMPLE         PERMANENT    LOCAL      AUTO
    ONLINE    INDX            PERMANENT    LOCAL      AUTO
    ONLINE    UNDOTBS2        UNDO         LOCAL      MANUAL
    ONLINE    TEMP            TEMPORARY    LOCAL      MANUAL

    8 rows selected.

26. Verify the RAC Cluster & Database Configuration

The following RAC verification checks should be performed on all nodes in the cluster! For this guide, we will perform these checks only from linux1.

This section provides several srvctl commands and SQL queries you can use to validate your Oracle RAC 10g configuration.

There are five node-level tasks defined for SRVCTL:

Adding and deleting node-level applications
Setting and unsetting the environment for node-level applications
Administering node applications
Administering ASM instances
Starting and stopping a group of programs that includes virtual IP addresses, listeners, Oracle Notification Services, and Oracle Enterprise Manager agents (for maintenance purposes).

Status of all instances and services

    $ srvctl status database -d orcl
    Instance orcl1 is running on node linux1
    Instance orcl2 is running on node linux2

Status of a single instance

    $ srvctl status instance -d orcl -i orcl2
    Instance orcl2 is running on node linux2

Status of a named service globally across the database

    $ srvctl status service -d orcl -s orcltest
    Service orcltest is running on instance(s) orcl2, orcl1

Status of node applications on a particular node

    $ srvctl status nodeapps -n linux1
    VIP is running on node: linux1
    GSD is running on node: linux1
    Listener is running on node: linux1
    ONS daemon is running on node: linux1

Status of an ASM instance

    $ srvctl status asm -n linux1
    ASM instance +ASM1 is running on node linux1.

List all configured databases

    $ srvctl config database
    orcl

Display configuration for our RAC database

    $ srvctl config database -d orcl
    linux1 orcl1 /u01/app/oracle/product/10.2.0/db_1
    linux2 orcl2 /u01/app/oracle/product/10.2.0/db_1

Display all services for the specified cluster database

    $ srvctl config service -d orcl
    orcltest PREF: orcl2 orcl1 AVAIL:

Display the configuration for node applications - (VIP, GSD, ONS, Listener)

    $ srvctl config nodeapps -n linux1 -a -g -s -l
    VIP exists.: /vip-linux1/192.168.1.200/255.255.255.0/eth0:eth1
    GSD exists.
    ONS daemon exists.
    Listener exists.

Display the configuration for the ASM instance(s)

    $ srvctl config asm -n linux1
    +ASM1 /u01/app/oracle/product/10.2.0/db_1
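When these status checks are run routinely, it can help to reduce the srvctl output to just the instances that are down. The following is a small sketch under stated assumptions: down_instances is a hypothetical function name, and it assumes the "Instance NAME is not running on node HOST" line format shown elsewhere in this guide.

```shell
#!/bin/sh
# Hypothetical helper: read "srvctl status database" output on stdin and
# print the name of every instance reported as not running.
down_instances() {
    # Field 2 of a matching line is the instance name, e.g. orcl1.
    awk '/^Instance .* is not running on node / { print $2 }'
}
```

A typical invocation would be srvctl status database -d orcl | down_instances, which prints nothing when every instance is up.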

All running instances in the cluster

    SELECT
        inst_id
      , instance_number inst_no
      , instance_name inst_name
      , parallel
      , status
      , database_status db_status
      , active_state state
      , host_name host
    FROM gv$instance
    ORDER BY inst_id;

     INST_ID  INST_NO INST_NAME  PAR STATUS  DB_STATUS    STATE     HOST
    -------- -------- ---------- --- ------- ------------ --------- -------
           1        1 orcl1      YES OPEN    ACTIVE       NORMAL    linux1
           2        2 orcl2      YES OPEN    ACTIVE       NORMAL    linux2

All data files which are in the disk group

    select name from v$datafile
    union
    select member from v$logfile
    union
    select name from v$controlfile
    union
    select name from v$tempfile;

    NAME
    -------------------------------------------
    +FLASH_RECOVERY_AREA/orcl/controlfile/current.258.570913191
    +FLASH_RECOVERY_AREA/orcl/onlinelog/group_1.257.570913201
    +FLASH_RECOVERY_AREA/orcl/onlinelog/group_2.256.570913211
    +FLASH_RECOVERY_AREA/orcl/onlinelog/group_3.259.570918285
    +FLASH_RECOVERY_AREA/orcl/onlinelog/group_4.260.570918295
    +ORCL_DATA1/orcl/controlfile/current.259.570913189
    +ORCL_DATA1/orcl/datafile/example.257.570913311
    +ORCL_DATA1/orcl/datafile/indx.270.570920045
    +ORCL_DATA1/orcl/datafile/sysaux.260.570913287
    +ORCL_DATA1/orcl/datafile/system.262.570913215
    +ORCL_DATA1/orcl/datafile/undotbs1.261.570913263
    +ORCL_DATA1/orcl/datafile/undotbs1.271.570920865
    +ORCL_DATA1/orcl/datafile/undotbs2.265.570913331
    +ORCL_DATA1/orcl/datafile/undotbs2.272.570921065
    +ORCL_DATA1/orcl/datafile/users.264.570913355
    +ORCL_DATA1/orcl/datafile/users.269.570919829
    +ORCL_DATA1/orcl/onlinelog/group_1.256.570913195
    +ORCL_DATA1/orcl/onlinelog/group_2.263.570913205
    +ORCL_DATA1/orcl/onlinelog/group_3.266.570918279
    +ORCL_DATA1/orcl/onlinelog/group_4.267.570918289
    +ORCL_DATA1/orcl/tempfile/temp.258.570913303

    21 rows selected.

All ASM disks that belong to the 'ORCL_DATA1' disk group

    SELECT path
    FROM v$asm_disk
    WHERE group_number IN (select group_number
                           from v$asm_diskgroup
                           where name = 'ORCL_DATA1');

    PATH
    ----------------------------------
    ORCL:VOL1
    ORCL:VOL2

27. Starting / Stopping the Cluster

At this point, we've installed and configured Oracle RAC 10g entirely and have a fully functional clustered database.

After all the work done up to this point, you may well ask, "OK, so how do I start and stop services?" If you have followed the instructions in this guide, all services (including Oracle Clusterware, all Oracle instances, Enterprise Manager Database Console, and so on) should start automatically on each reboot of the Linux nodes.

start automatically on each reboot of the Linux nodes.

There are times, however, when you might want to shut down a node and manually start it back up. Or you may find that Enterprise Manager is not running and need to start it. This section provides the commands (using SRVCTL) responsible for starting and stopping the cluster environment.

Ensure that you are logged in as the oracle UNIX user. We will run all commands in this section from linux1:

# su - oracle
$ hostname
linux1

Stopping the Oracle RAC 10g Environment

The first step is to stop the Oracle instance. When the instance (and related services) is down, then bring down the ASM instance. Finally, shut down the node applications (Virtual IP, GSD, TNS Listener, and ONS).

$ export ORACLE_SID=orcl1
$ emctl stop dbconsole
$ srvctl stop instance -d orcl -i orcl1
$ srvctl stop asm -n linux1
$ srvctl stop nodeapps -n linux1

Starting the Oracle RAC 10g Environment

The first step is to start the node applications (Virtual IP, GSD, TNS Listener, and ONS). When the node applications are successfully started, then bring up the ASM instance. Finally, bring up the Oracle instance (and related services) and the Enterprise Manager Database console.

$ export ORACLE_SID=orcl1
$ srvctl start nodeapps -n linux1
$ srvctl start asm -n linux1
$ srvctl start instance -d orcl -i orcl1
$ emctl start dbconsole

Start/Stop All Instances with SRVCTL

Start/stop all the instances and their enabled services. I have included this step just for fun as a way to bring down all instances!

$ srvctl start database -d orcl
$ srvctl stop database -d orcl

28. Transparent Application Failover - (TAF)

It is not uncommon for businesses to demand 99.99% (or even 99.999%) availability for their enterprise applications. Think about what it would take to ensure no more than about an hour of downtime, or even no downtime at all, during the year: 99.99% availability allows roughly 53 minutes of downtime per year, and 99.999% allows roughly 5 minutes. To answer many of these high-availability requirements, businesses are investing in mechanisms that provide for automatic failover when one participating system fails. When considering the availability of the Oracle database, Oracle RAC 10g provides a superior solution with its advanced failover mechanisms. Oracle RAC 10g includes the required components that all work within a clustered configuration responsible for providing continuous availability; when one of the participating systems fails within the cluster, the users are automatically migrated to the other available systems.

A major component of Oracle RAC 10g that is responsible for failover processing is the Transparent Application Failover (TAF) option. All database connections (and processes) that lose connections are reconnected to another node within the cluster. The failover is completely transparent to the user.

This final section provides a short demonstration on how TAF works in Oracle RAC 10g. Please note that a complete discussion of failover in Oracle RAC 10g would require an article in itself; my intention here is to present only a brief overview.

One important note is that TAF happens automatically within the OCI libraries. Thus your application (client) code does not need to change in order to take advantage of TAF. Certain configuration steps, however, will need to be done on the Oracle TNS file tnsnames.ora. (Keep in mind that as of this writing, the Java thin client will not be able to participate in TAF because it never reads tnsnames.ora.)

Set Up the tnsnames.ora File

Before demonstrating TAF, we need to verify that a valid entry exists in the tnsnames.ora file on a non-RAC client machine (if you have a Windows machine lying around). Ensure that you have the Oracle RDBMS software installed. (Actually, you only need a client install of the Oracle software.)

During the creation of the clustered database in this guide, we created a new service that will be used for testing TAF named ORCLTEST.
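The ORCLTEST entry used for the TAF test sets FAILOVER_MODE with RETRIES = 180 and DELAY = 5. As a rough illustration of what those two values imply, here is a minimal shell sketch. It assumes that RETRIES is the number of reconnect attempts made after a failure and that DELAY is the pause, in seconds, between attempts. The variable names are hypothetical and are not part of any Oracle tool.

```shell
#!/bin/sh
# Values copied from the FAILOVER_MODE clause of the ORCLTEST entry.
# Assumption: RETRIES = reconnect attempts, DELAY = seconds between attempts.
RETRIES=180
DELAY=5

# Approximate upper bound on the time a client spends waiting between
# reconnect attempts. This ignores the time each attempt itself takes.
WINDOW_SECONDS=$((RETRIES * DELAY))
WINDOW_MINUTES=$((WINDOW_SECONDS / 60))

echo "retry window: ${WINDOW_SECONDS} seconds (${WINDOW_MINUTES} minutes)"
```

Under these assumptions a client keeps trying to reconnect for about 15 minutes before giving up, which is far longer than the failover in this guide's demonstration should need. Smaller values would make a client fail faster when no instance is available.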

The tnsnames.ora entry for ORCLTEST provides all the necessary configuration parameters for load balancing and failover. You can copy the contents of this entry to the %ORACLE_HOME%\network\admin\tnsnames.ora file on the client machine (my Windows laptop is being used in this example) in order to connect to the new Oracle clustered database:

...
ORCLTEST =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux1)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = vip-linux2)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcltest.idevelopment.info)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )
...

SQL Query to Check the Session's Failover Information

The following SQL query can be used to check a session's failover type, failover method, and whether a failover has occurred. We will be using this query throughout this example.

COLUMN instance_name    FORMAT a13
COLUMN host_name        FORMAT a9
COLUMN failover_method  FORMAT a15
COLUMN failed_over      FORMAT a11

SELECT
    instance_name
  , host_name
  , NULL AS failover_type
  , NULL AS failover_method
  , NULL AS failed_over
FROM v$instance
UNION
SELECT
    NULL
  , NULL
  , failover_type
  , failover_method
  , failed_over
FROM v$session
WHERE username = 'SYSTEM';

TAF Demo

From a Windows machine (or other non-RAC client machine), log in to the clustered database using the orcltest service as the SYSTEM user:

C:\> sqlplus system/manager@orcltest

COLUMN instance_name    FORMAT a13
COLUMN host_name        FORMAT a9
COLUMN failover_method  FORMAT a15
COLUMN failed_over      FORMAT a11

SELECT
    instance_name
  , host_name
  , NULL AS failover_type
  , NULL AS failover_method
  , NULL AS failed_over
FROM v$instance
UNION
SELECT
    NULL
  , NULL
  , failover_type
  , failover_method
  , failed_over
FROM v$session
WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl1         linux1    SELECT        BASIC           NO

DO NOT log out of the above SQL*Plus session!

Now that we have run the query (above), we should shut down the instance orcl1 on linux1 using the abort option. To perform this operation, we can use the srvctl command-line utility as follows:

# su - oracle
$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

$ srvctl stop instance -d orcl -i orcl1 -o abort

$ srvctl status database -d orcl
Instance orcl1 is not running on node linux1
Instance orcl2 is running on node linux2

Now let's go back to our SQL session and rerun the SQL statement in the buffer:

COLUMN instance_name    FORMAT a13
COLUMN host_name        FORMAT a9
COLUMN failover_method  FORMAT a15
COLUMN failed_over      FORMAT a11

SELECT
    instance_name
  , host_name
  , NULL AS failover_type
  , NULL AS failover_method
  , NULL AS failed_over
FROM v$instance
UNION
SELECT
    NULL
  , NULL
  , failover_type
  , failover_method
  , failed_over
FROM v$session
WHERE username = 'SYSTEM';

INSTANCE_NAME HOST_NAME FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER
------------- --------- ------------- --------------- -----------
orcl2         linux2    SELECT        BASIC           YES

SQL> exit

From the above demonstration, we can see that the session has now been failed over to instance orcl2 on linux2.

29. Conclusion

Ideally this guide has provided an economical solution for setting up and configuring an Oracle RAC 10g Release 2 cluster using CentOS 4.2 Enterprise Linux (or RHEL4) and FireWire technology. The RAC solution presented here can be put together for around US$1,800 and will provide the DBA with a fully functional development Oracle RAC cluster. Remember, although this solution should be stable enough for testing and development, it should never be considered for a production environment.

30. Acknowledgements

An article of this magnitude and complexity is generally not the work of one person alone. Although I was able to author and successfully demonstrate the validity of the components that make up this configuration, there are several other individuals that deserve credit in making this article a success.

First, I would like to thank Werner Puschitz for his outstanding work on "Installing Oracle Database 10g with Real Application Cluster (RAC) on Red Hat Enterprise Linux Advanced Server 3". This article, along with several others of his, provided information on Oracle RAC 10g that could not be found in any other Oracle documentation. Without his hard work and research into issues like configuring and installing the hangcheck-timer kernel module, properly configuring UNIX shared memory, and configuring ASMLib, this article may have never come to fruition. If you are interested in examining technical articles on Linux internals and in-depth Oracle configurations written by Werner Puschitz, please visit his excellent website at www.puschitz.com.

I would next like to thank Wim Coekaerts, Joel Becker, Manish Singh and the entire team at Oracle's Linux Projects Development Group. The professionals in this group made the job of upgrading the Linux kernel to support IEEE1394 devices with multiple logins (and several other significant modifications) a seamless task. The group provides a pre-compiled kernel for Red Hat Enterprise Linux 4.2 (which also works with CentOS Linux) along with many other useful tools and documentation at oss.oracle.com.

Jeffrey Hunter (www.idevelopment.info) has been a senior DBA and software engineer for over 11 years. He is an Oracle Certified Professional, Java Development Certified Professional, and author and currently works for The DBA Zone, Inc. Jeff's work includes advanced performance tuning, Java programming, capacity planning, database security, and physical/logical database design in Unix, Linux, and Windows NT environments. Jeff's other interests include mathematical encryption theory, programming language processors (compilers and interpreters) in Java and C, LDAP, writing web-based database administration tools, and of course Linux.
