
Creating Partitions for Raw Devices

If you want to use raw devices, see Creating Partitions for Raw Devices for more information. This article does not cover raw devices.

Installing and Configuring Oracle Cluster File Systems (OCFS)

Note that OCFS is not required for 10g RAC. In fact, I never use OCFS for RAC systems. However, this article covers OCFS since some people want to know how to configure and use it.

The Oracle Cluster File System (OCFS) was developed by Oracle to overcome the limits of raw devices and partitions. It also eases administration of database files because it looks and feels just like a regular file system. At the time of this writing, OCFS supports only Oracle datafiles and a few other files:
- Redo log files
- Archive log files
- Control files
- Database datafiles
- Shared quorum disk file for the cluster manager
- Shared init file (srv)

Oracle says that they will support shared Oracle Home installs in the future. So don't install the Oracle software on OCFS yet. See Oracle Cluster File System for more information. In this article I'm creating a separate, individual ORACLE_HOME directory on local server storage for each RAC node.

NOTE: If files on the OCFS file system need to be moved, copied, tar'd, etc., or if directories need to be created on OCFS, then the standard file system commands (mv, cp, tar, ...) that come with the OS should not be used. These OS commands can have a major performance impact when they are used on an OCFS file system. Therefore, Oracle's patched file system commands should be used instead. It is also important to note that some third-party backup tools make use of standard OS commands like tar.

Installing OCFS

NOTE: In my example I will use OCFS only for the cluster manager files since I will use ASM for datafiles.

Download the OCFS RPMs (drivers, tools) for RHEL3 from http://oss.oracle.com/projects/ocfs/files/RedHat/RHEL3/i386/ (you can use the same RPMs for FireWire shared disks). To find out which OCFS driver you need for your server, run:

$ uname -a
Linux rac1pub 2.4.21-9.ELsmp #1 Thu Jan 8 17:24:12 EST 2004 i686 i686 i386 GNU/Linux
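Before picking an RPM, it can also help to double-check the kernel type and what is already installed. A minimal sketch; the grep pattern is an assumption based on the "smp" suffix Red Hat uses in kernel release strings:

if uname -r | grep -q smp; then
    echo "SMP kernel - use the ocfs-...-smp driver RPM"    # e.g. ocfs-2.4.21-EL-smp-1.0.12-1
else
    echo "UP kernel  - use the ocfs-...-EL driver RPM"     # e.g. ocfs-2.4.21-EL-1.0.12-1
fi
rpm -qa | grep -i ocfs    # lists any OCFS packages already installed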

To install the OCFS RPMs for SMP kernels (including FireWire SMP kernels), execute:

su - root
rpm -Uvh ocfs-2.4.21-EL-smp-1.0.12-1.i686.rpm \
         ocfs-tools-1.0.10-1.i386.rpm \
         ocfs-support-1.0.10-1.i386.rpm

To install the OCFS RPMs for uniprocessor kernels (including FireWire UP kernels), execute:

su - root
rpm -Uvh ocfs-2.4.21-EL-1.0.12-1.i686.rpm \
         ocfs-tools-1.0.10-1.i386.rpm \
         ocfs-support-1.0.10-1.i386.rpm

Configuring and Loading OCFS

To generate the /etc/ocfs.conf file, you can run the ocfstool tool:

su - root
ocfstool
- Select "Task"
- Select "Generate Config"
- Select the interconnect interface (private network interface). In my example for rac1pub I selected: eth1, rac1prv
- Confirm the values displayed and exit

The generated /etc/ocfs.conf file will appear similar to the following example:

$ cat /etc/ocfs.conf
#
# ocfs config
# Ensure this file exists in /etc
#
node_name = rac1prv
ip_address = 192.168.2.1
ip_port = 7000
comm_voting = 1
guid = 84D43BC8FB7A2C1B88C3000D8821CC2C

The guid entry is the unique group user ID. This ID has to be unique for each node. You can also create the above file without the ocfstool tool by editing /etc/ocfs.conf manually and running ocfs_uid_gen -c to assign/update the guid value in this file.

To load the ocfs.o kernel module, execute:

su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac1prv ip_address=192.168.2.1 cs=1795 guid=84D43BC8FB7A2C1B88C3000D8821CC2C comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
#

To verify that the ocfs module was loaded, execute:

# /sbin/lsmod | grep ocfs
ocfs                  305920   0  (unused)

Note that the load_ocfs command does not have to be executed again once everything has been set up for the OCFS filesystems; see Configuring the OCFS File Systems to Mount Automatically at Startup.
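On the remaining RAC nodes you can skip the GUI and create /etc/ocfs.conf by hand. Here is a minimal sketch for rac2prv; the node name and IP address are the ones assumed in this article, so adjust them for your cluster:

su - root
cat > /etc/ocfs.conf << EOF
#
# ocfs config
# Ensure this file exists in /etc
#
node_name = rac2prv
ip_address = 192.168.2.2
ip_port = 7000
comm_voting = 1
EOF
# Generate/update the unique guid entry for this node
ocfs_uid_gen -c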

If you run load_ocfs on a system with the experimental FireWire kernel, you might get the following error message:

su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac1prv ip_address=192.168.2.1 cs=1843 guid=AA12637FAABFB354371C000D8821CC2C comm_voting=1 ip_port=7000
insmod: ocfs: no module by that name found
load_ocfs: insmod failed
#

The ocfs.o module for the "FireWire kernel" can be found here:

su - root
# rpm -ql ocfs-2.4.21-EL-1.0.12-1
/lib/modules/2.4.21-EL-ABI/ocfs
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
#

So for the experimental kernel for FireWire drives, I manually created a link for the ocfs.o module file:

su - root
mkdir /lib/modules/`uname -r`/kernel/drivers/addon/ocfs
ln -s `rpm -qa | grep ocfs-2 | xargs rpm -ql | grep "/ocfs.o$"` \
      /lib/modules/`uname -r`/kernel/drivers/addon/ocfs/ocfs.o

Now you should be able to load the OCFS module using the "FireWire kernel", and the output should look similar to this example:

su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac1prv ip_address=192.168.2.1 cs=1843 guid=AA12637FAABFB354371C000D8821CC2C comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
Warning: kernel-module version mismatch
        /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o was compiled for kernel version 2.4.21-4.EL
        while this kernel is version 2.4.21-15.ELorafw1
Warning: loading /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o will taint the kernel: forced load
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module ocfs loaded, with warnings
#

I would not worry about the above warning. However, if you get the following error, then you have to upgrade the modutils RPM:

su - root
# /sbin/load_ocfs
/sbin/insmod ocfs node_name=rac2prv ip_address=192.168.2.2 cs=1761 guid=1815F1C57530339EA00E000D8825B058 comm_voting=1 ip_port=7000
Using /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
/lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o: kernel-module version mismatch
        /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o was compiled for kernel version 2.4.21-4.EL
        while this kernel is version 2.4.21-15.ELorafw1.
#

To remedy the loading problem, download the latest modutils RPM and install it, e.g.:

rpm -Uvh modutils-2.4.25-11.EL.i386.rpm

To verify that the ocfs module was loaded, enter:

# /sbin/lsmod | grep ocfs
ocfs                  305920   0  (unused)

Note that the load_ocfs command does not have to be executed again once everything has been set up for the OCFS filesystems; see Configuring the OCFS File Systems to Mount Automatically at Startup.

Creating OCFS File Systems

Before you continue with the next steps, make sure you've created all needed partitions on your shared storage. Under Creating Oracle Directories I created the /u02/oradata/orcl mount directory for the cluster manager files. In the following example I will create one OCFS filesystem and mount it on /u02/oradata/orcl.

The following steps for creating the OCFS filesystem(s) should only be executed on one RAC node!

To create the OCFS filesystems, you can use the ocfstool:

su - root
ocfstool
- Select "Task"
- Select "Format"

Alternatively, you can execute the mkfs.ocfs command to create the OCFS filesystems:

su - root
mkfs.ocfs -F -b 128 -L /u02/oradata/orcl -m /u02/oradata/orcl \
          -u `id -u oracle` -g `id -g oracle` -p 0775 <device_name>
Cleared volume header sectors
Cleared node config sectors
Cleared publish sectors
Cleared vote sectors
Cleared bitmap sectors
Cleared data block
Wrote volume header
#

For SCSI disks (including FireWire disks), <device_name> stands for devices like /dev/sda, /dev/sdb, /dev/sdc, /dev/sdd, etc. Be careful to use the right device name! For this article I created an OCFS filesystem on /dev/sda1.

mkfs.ocfs options:
-F  Force format of an existing OCFS volume
-b  Block size in KB. The block size must be a multiple of the Oracle block size. Oracle recommends setting the block size for OCFS to 128.
-L  Volume label
-m  Mount point for the device (in this article "/u02/oradata/orcl")
-u  UID for the root directory (in this article "oracle")
-g  GID for the root directory (in this article "oinstall")
-p  Permissions for the root directory

Mounting OCFS File Systems

As I mentioned previously, for this article I created one large OCFS filesystem on /dev/sda1. To mount the OCFS filesystem, I executed:

su - root
# mount -t ocfs /dev/sda1 /u02/oradata/orcl
or
# mount -t ocfs -L /u02/oradata/orcl /u02/oradata/orcl
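The OCFS filesystem only has to be created once, but it must be mounted on every RAC node. A minimal sketch of the per-node steps, using the label and mount point from this article:

# Run as root on each RAC node (rac1pub, rac2pub, rac3pub):
mount -t ocfs -L /u02/oradata/orcl /u02/oradata/orcl

# Quick local verification of the mount:
mount | grep ocfs
df /u02/oradata/orcl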

Now run the ls command on all RAC nodes to check the ownership:

# ls -ld /u02/oradata/orcl
drwxrwxr-x  1 oracle oinstall 131072 Jul  4 23:25 /u02/oradata/orcl
#

NOTE: If the above ls command does not display the same ownership (oracle:oinstall) on all RAC nodes, then the "oracle" UID and the "oinstall" GID are not the same across the RAC nodes; see Creating Oracle User Accounts for more information.

Configuring the OCFS File Systems to Mount Automatically at Startup

To ensure the OCFS filesystems are mounted automatically during reboots, the OCFS mount points need to be added to the /etc/fstab file. Add lines similar to the following example to the /etc/fstab file:

/dev/sda1    /u02/oradata/orcl    ocfs    _netdev    0 0

The "_netdev" option prevents the OCFS filesystem from being mounted until the network, which provides access to the storage device, has been enabled on the system (see mount(8)).

To make sure the ocfs.o kernel module is loaded and the OCFS file systems are mounted during the boot process, enter:

su - root
# chkconfig --list ocfs
ocfs   0:off  1:off  2:off  3:on  4:on  5:on  6:off

If the flags for runlevels 3, 4, and 5 are not set to "on" as shown above, run the following command:

su - root
# chkconfig ocfs on

You can also start the "ocfs" service manually by running:

su - root
# service ocfs start

When you run this command, it will not only load the ocfs.o kernel module but also mount the OCFS filesystems as configured in /etc/fstab.

At this point you might want to reboot all RAC nodes to ensure that the OCFS filesystems are mounted automatically after reboots:

su - root
reboot
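After a node is back up, a minimal sanity check (run as root on each node; the module and mount point names are the ones used in this article):

/sbin/lsmod | grep ocfs      # the ocfs module should have been loaded by the ocfs service
mount | grep ocfs            # the OCFS filesystem should have been mounted from /etc/fstab
df /u02/oradata/orcl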

Installing and Configuring Automatic Storage Management (ASM) and Disks

General

For information about what Automatic Storage Management is, see Configuring and Using Automatic Storage Management. See also Installing Oracle ASMLib for Linux.

Installing ASM

Download the latest Oracle ASM RPMs from http://otn.oracle.com/tech/linux/asmlib/index.html. Make sure that you download the right ASM driver for your kernel (UP or SMP).

To install the ASM RPMs on a UP server, run:

su - root
rpm -Uvh oracleasm-2.4.21-EL-1.0.0-1.i686.rpm \
         oracleasm-support-1.0.2-1.i386.rpm \
         oracleasmlib-1.0.0-1.i386.rpm

To install the ASM RPMs on an SMP server, run:

su - root
rpm -Uvh oracleasm-2.4.21-EL-smp-1.0.0-1.i686.rpm \
         oracleasm-support-1.0.2-1.i386.rpm \
         oracleasmlib-1.0.0-1.i386.rpm

Configuring and Loading ASM

To load the ASM driver oracleasm.o and to mount the ASM driver filesystem, enter:

su - root
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library driver. The following questions will determine whether the driver is loaded on boot and what permissions it will have. The current values will be shown in brackets ('[]'). Hitting <ENTER> without typing an answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration            [  OK  ]
Creating /dev/oracleasm mount point                        [  OK  ]
Loading module "oracleasm"                                 [  OK  ]
Mounting ASMlib driver filesystem                          [  OK  ]
Scanning system for ASM disks                              [  OK  ]
#

Creating ASM Disks

NOTE: Creating ASM disks is done on one RAC node! The following commands should only be executed on one RAC node!

I executed the following commands to create my ASM disks (make sure to change the device names!). In this example I used partitions (/dev/sda2, /dev/sda3, /dev/sda5) instead of whole disks (/dev/sda, /dev/sdb, /dev/sdc, ...). Replace "<sd??>" with the name of your device; I used /dev/sda2, /dev/sda3, and /dev/sda5.

su - root
# /etc/init.d/oracleasm createdisk VOL1 /dev/<sd??>
Marking disk "/dev/sda2" as an ASM disk                    [  OK  ]
# /etc/init.d/oracleasm createdisk VOL2 /dev/<sd??>
Marking disk "/dev/sda3" as an ASM disk                    [  OK  ]
# /etc/init.d/oracleasm createdisk VOL3 /dev/<sd??>
Marking disk "/dev/sda5" as an ASM disk                    [  OK  ]
#

To list all ASM disks, enter:

# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
#

On all other RAC nodes, you just need to notify the system about the new ASM disks:


su - root
# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks                              [  OK  ]
#
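Here is a minimal sketch of what to run on each of the remaining RAC nodes (rac2pub and rac3pub in this article) to pick up and verify the new ASM disks:

su - root
# Probe the shared storage for the disks that were marked on the first node
/etc/init.d/oracleasm scandisks
# VOL1, VOL2 and VOL3 should now show up on this node as well
/etc/init.d/oracleasm listdisks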


Configuring the "hangcheck-timer" Kernel Module Oracle uses the Linux kernel module hangcheck-timer to monitor the system health of the cluster and to reset a RAC node in case of failures. The hangcheck-timer module uses a kernel-based timer to periodically check the system task scheduler. This timer resets the node when the system hangs or pauses. This module uses the Time Stamp Counter (TSC) CPU register which is a counter that is incremented at each clock signal. The TCS offers very accurate time measurements since this register is updated by the hardware automatically. The hangcheck-timer module comes now with the kernel: find /lib/modules -name "hangcheck-timer.o" The hangcheck-timer module has the following two parameters: hangcheck_tick This parameter defines the period of time between checks of system health. The default value is 60 seconds. Oracle recommends to set it to 30 seconds. hangcheck_margin This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds. Oracle recommends to set it to 180 seconds. These two parameters indicate how long a RAC node must hang before the hangcheck-timer module will reset the system. A node reset will occur when the following is true: system hang time > (hangcheck_tick + hangcheck_margin) To load the module with the right parameter settings, make entries to the /etc/modules.conf file. To do that, add the following line to the /etc/modules.conf file: # su - root # echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modules.conf Now you can run modprobe to load the module with the configured parameters in /etc/modules.conf: # su - root # modprobe hangcheck-timer # grep Hangcheck /var/log/messages |tail -2 Jul 5 00:46:09 rac1pub kernel: Hangcheck: starting hangcheck timer 0.8.0 (tick is 180 seconds, margin is 60 seconds). Jul 5 00:46:09 rac1pub kernel: Hangcheck: Using TSC. # Note: To ensure the hangcheck-timer module is loaded after each reboot, add the modprobe command to the /etc/rc.local file. Setting up RAC Nodes for Remote Access When you run the Oracle Installer on a RAC node, it will use ssh to copy Oracle software and data to other RAC nodes. Therefore, the oracle user on the RAC node where Oracle Installer is launched must be able to login to other RAC nodes without having to provide a password or passphrase.

Setting up RAC Nodes for Remote Access

When you run the Oracle Installer on a RAC node, it will use ssh to copy Oracle software and data to the other RAC nodes. Therefore, the oracle user on the RAC node where the Oracle Installer is launched must be able to log in to the other RAC nodes without having to provide a password or passphrase. The following procedure shows how ssh can be configured so that no password is requested for oracle ssh logins.

To create an authentication key for oracle, enter the following command on all RAC nodes (the ~/.ssh directory will be created automatically if it doesn't exist yet):

su - oracle
$ ssh-keygen -t dsa -b 1024
Generating public/private dsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_dsa):   <Press ENTER>
Created directory '/home/oracle/.ssh'.
Enter passphrase (empty for no passphrase):                       <Enter a passphrase>
Enter same passphrase again:                                      <Enter the passphrase again>
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
e0:71:b1:5b:31:b8:46:d3:a9:ae:df:6a:70:98:26:82 oracle@rac1pub

Copy the public key for oracle from each RAC node to all other RAC nodes. For example, run the following commands on all RAC nodes:

su - oracle
ssh rac1pub cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh rac2pub cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh rac3pub cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Now verify that oracle on each RAC node can log in to all other RAC nodes without a password. Make sure that ssh only asks for the passphrase. Note, however, that the first time you ssh to another server you will get a message stating that the authenticity of the host cannot be established; enter "yes" at the prompt to continue the connection. For example, run the following commands on all RAC nodes to verify that no password is requested:

su - oracle
ssh rac1pub hostname
ssh rac1prv hostname
ssh rac2pub hostname
ssh rac2prv hostname
ssh rac3pub hostname
ssh rac3prv hostname

And later, before runInstaller is launched, I will show how ssh can be configured so that no passphrase has to be entered for oracle ssh logins.
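The per-node commands above can also be wrapped in a small loop. A sketch assuming the host names used in this article (run it as oracle on every RAC node):

# Append every node's public key to this node's authorized_keys file
for node in rac1pub rac2pub rac3pub; do
    ssh $node cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
done

# Verify that each public and private host name answers without a password prompt
for node in rac1pub rac1prv rac2pub rac2prv rac3pub rac3prv; do
    ssh $node hostname
done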

Checking Packages (RPMs)

Some packages will be missing if you selected the installation type "Advanced Server" during the Red Hat Advanced Server installation. The following additional RPMs are required:

rpm -q gcc glibc-devel glibc-headers glibc-kernheaders cpp compat-libstdc++

To install these RPMs, run:

su - root
rpm -ivh gcc-3.2.3-24.i386.rpm \
         glibc-devel-2.3.2-95.6.i386.rpm \
         glibc-headers-2.3.2-95.6.i386.rpm \
         glibc-kernheaders-2.4-8.34.i386.rpm \
         cpp-3.2.3-24.i386.rpm \
         compat-libstdc++-7.3-2.96.123.i386.rpm

The openmotif RPM is also required, otherwise you won't pass Oracle's recommended operating system packages test. If it's not installed on your system, run:

su - root
rpm -ivh openmotif-2.2.2-16.i386.rpm

I recommend using the latest RPM version.

Adjusting Network Settings

Oracle now uses UDP as the default protocol on Linux for interprocess communication, such as cache fusion buffer transfers between the instances. It is strongly suggested to adjust the default and maximum send buffer size (SO_SNDBUF socket option) to 256 KB, and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256 KB. The receive buffers are used by TCP and UDP to hold received data until it is read by the application. For TCP the receive buffer cannot overflow because the peer is not allowed to send data beyond the buffer size window; for UDP, however, datagrams that don't fit in the socket receive buffer are discarded, so a sender can overwhelm the receiver.

The default and maximum window size can be changed in the proc file system without a reboot:

su - root
sysctl -w net.core.rmem_default=262144   # Default setting in bytes of the socket receive buffer
sysctl -w net.core.wmem_default=262144   # Default setting in bytes of the socket send buffer
sysctl -w net.core.rmem_max=262144       # Maximum socket receive buffer size which may be set by using the SO_RCVBUF socket option
sysctl -w net.core.wmem_max=262144       # Maximum socket send buffer size which may be set by using the SO_SNDBUF socket option

To make the change permanent, add the following lines to the /etc/sysctl.conf file, which is used during the boot process:

net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
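A minimal sketch that appends these settings to /etc/sysctl.conf and applies them immediately (run as root on every RAC node; the values are the ones recommended above):

cat >> /etc/sysctl.conf << EOF
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144
EOF
# Apply the settings from /etc/sysctl.conf now and display the resulting values
sysctl -p
sysctl net.core.rmem_default net.core.wmem_default net.core.rmem_max net.core.wmem_max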

Sizing Swap Space

It is important to follow the steps as outlined in Sizing Swap Space.

Setting Shared Memory

It is important to follow the steps as outlined in Setting Shared Memory.

Checking /tmp Space

It is important to follow the steps as outlined in Checking /tmp Space.

Setting Semaphores

It is recommended to follow the steps as outlined in Setting Semaphores.

Setting File Handles

It is recommended to follow the steps as outlined in Setting File Handles.

Installing Cluster Ready Services (CRS)

General

Cluster Ready Services (CRS) contains cluster and database configuration information for RAC, and it provides many system management features. CRS accepts registration of Oracle instances to the cluster, and it sends ping messages to the other RAC nodes. If the heartbeat fails, CRS uses the shared disk to distinguish between a node failure and a network failure.

Once CRS is running on all RAC nodes, OUI will automatically recognize all nodes in the cluster. This means that you can run OUI on one RAC node to install the Oracle software on all other RAC nodes.

Note that Automatic Storage Management (ASM) cannot be used for the "Oracle Cluster Registry (OCR)" file or for the "CRS Voting Disk" file. These files must be accessible before any Oracle instances are started, and for ASM to become available, the ASM instance needs to run first. In the following example I will use OCFS for the "Oracle Cluster Registry (OCR)" file and for the "CRS Voting Disk" file. The Oracle Cluster Registry file has a size of about 100 MB, and the CRS Voting Disk file has a size of about 20 MB. These files must reside on OCFS, on a shared raw device, or on any other clustered filesystem.

Automating Authentication for oracle ssh Logins

Make sure that the oracle user can ssh to all RAC nodes without ssh asking for a passphrase. This is very important because otherwise OUI won't be able to install the Oracle software on the other RAC nodes. The following example shows how ssh-agent can do the authentication for you when the oracle account logs in to other RAC nodes using ssh.

Open a new terminal for the RAC node where you will execute runInstaller and use this terminal to log in from your desktop using the following command:

$ ssh -X oracle@rac?pub

The "X11 forwarding" feature (-X option) of ssh will relink X to your local desktop. For more information, see Installing Oracle10g on a Remote Linux Server.

Now configure ssh-agent to handle the authentication for the oracle account:

oracle$ ssh-agent $SHELL
oracle$ ssh-add
Enter passphrase for /home/oracle/.ssh/id_dsa:   <Enter your passphrase>
Identity added: /home/oracle/.ssh/id_dsa (/home/oracle/.ssh/id_dsa)
oracle$

Now make sure the oracle user can ssh into each RAC node. It is very important that NO text is displayed and that you are not asked for a passphrase. Only the server name of the remote RAC node should be displayed:

oracle$ ssh rac1pub hostname
rac1pub
oracle$ ssh rac1prv hostname
rac1pub
oracle$ ssh rac2pub hostname
rac2pub
oracle$ ssh rac2prv hostname
rac2pub
oracle$ ssh rac3pub hostname
rac3pub
oracle$ ssh rac3prv hostname
rac3pub

NOTE: Keep this terminal open since this is the terminal that will be used for running runInstaller!

Checking OCFS and Oracle Environment Variables

Checking OCFS

Make sure the OCFS filesystem(s) are mounted on all RAC nodes:

oracle$ ssh rac1pub df | grep oradata
/dev/sda1             51205216     33888  51171328   1% /u02/oradata/orcl
oracle$ ssh rac2pub df | grep oradata
/dev/sda1             51205216     33888  51171328   1% /u02/oradata/orcl
oracle$ ssh rac3pub df | grep oradata
/dev/sda1             51205216     33888  51171328   1% /u02/oradata/orcl

Checking Oracle Environment Variables

Run the following command on all RAC nodes:

su - oracle
$ set | grep ORA
ORACLE_BASE=/u01/app/oracle
ORACLE_SID=orcl1
$

It is important that $ORACLE_SID is different on each RAC node! It is also recommended that $ORACLE_HOME is not set, so that OUI selects the home directory.
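Both checks can also be run from the prepared terminal in one pass. A small sketch assuming the node names used in this article and that the Oracle environment variables are set in ~oracle/.bash_profile:

for node in rac1pub rac2pub rac3pub; do
    echo "== $node =="
    ssh $node "df | grep oradata"                  # OCFS mount
    ssh $node ". ~/.bash_profile; set | grep ORA"  # ORACLE_BASE / ORACLE_SID
done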

Installing Oracle 10g Cluster Ready Services (CRS) R1 (10.1.0.2)

In order to install Cluster Ready Services (CRS) R1 (10.1.0.2) on all RAC nodes, OUI has to be launched on only one RAC node. In my example I will always run OUI on rac1pub.

To install CRS, insert the "Cluster Ready Services (CRS) R1 (10.1.0.2)" CD (downloaded image name: "ship.crs.cpio.gz") and mount it on e.g. rac1pub:

su - root
mount /mnt/cdrom

Use the oracle terminal that you prepared for ssh at Automating Authentication for oracle ssh Logins and execute runInstaller:

oracle$ /mnt/cdrom/runInstaller

- Welcome Screen: Click Next

- Inventory directory and credentials: Click Next
- Unix Group Name: Use "oinstall".
- Root Script Window: Open another window, log in as root, and run /tmp/orainstRoot.sh on the node where you launched runInstaller. After you've run the script, click Continue.
- File Locations: I used the recommended default values:
    Destination Name: OraCr10g_home1
    Destination Path: /u01/app/oracle/product/10.1.0/crs_1
  Click Next
- Language Selection: Click Next
- Cluster Configuration:
    Cluster Name: crs
    Cluster Nodes:
      Public Node Name: rac1pub   Private Node Name: rac1prv
      Public Node Name: rac2pub   Private Node Name: rac2prv
      Public Node Name: rac3pub   Private Node Name: rac3prv
  Click Next
- Private Interconnect Enforcement:
    Interface Name: eth0   Subnet: 192.168.1.0   Interface Type: Public
    Interface Name: eth1   Subnet: 192.168.2.0   Interface Type: Private
  Click Next
- Oracle Cluster Registry:
    OCR Location: /u02/oradata/orcl/OCRFile
  Click Next
- Voting Disk:
    Voting disk file name: /u02/oradata/orcl/CSSFile
  Click Next
- Root Script Window: Open another window, log in as root, and execute /u01/app/oracle/oraInventory/orainstRoot.sh on ALL RAC nodes!

  NOTE: For some reason Oracle does not create the log directory "/u01/app/oracle/product/10.1.0/crs_1/log". If there are problems with CRS, it will create log files in this directory, but only if it exists. Therefore make sure to create this directory as oracle:

  oracle$ mkdir /u01/app/oracle/product/10.1.0/crs_1/log

  After you've run the script, click Continue.

- Setup Privileges Script Window: Open another window, log in as root, and execute /u01/app/oracle/product/10.1.0/crs_1/root.sh on ALL RAC nodes, one by one! Note that this can take a while. On the last RAC node, the output of the script was as follows:

    ...
    CSS is active on these nodes.
      rac1pub
      rac2pub
      rac3pub
    CSS is active on all nodes.
    Oracle CRS stack installed and running under init(1M)

  Click OK
- Summary: Click Install
- When the installation is completed, click Exit.

One way to verify the CRS installation is to display all the nodes where CRS was installed:

oracle$ /u01/app/oracle/product/10.1.0/crs_1/bin/olsnodes -n
rac1pub 1
rac2pub 2
rac3pub 3
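As an additional, optional check you can look for the CRS background processes on each node. A sketch assuming the node names used in this article; crsd, ocssd and evmd are the usual 10g CRS daemons:

for node in rac1pub rac2pub rac3pub; do
    echo "== $node =="
    ssh $node "ps -ef | grep -E 'crsd|ocssd|evmd' | grep -v grep"
done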

Installing Oracle Database 10g Software with Real Application Clusters (RAC)

General

The following procedure shows the installation of the software for Oracle Database 10g R1 (10.1.0.2) with Real Application Clusters (RAC). Note that the Oracle Database 10g R1 (10.1) OUI will not be able to discover disks that are marked as Linux ASMLib. Therefore it is recommended to complete the software installation first and then use dbca to create the database; see http://otn.oracle.com/tech/linux/asmlib/install.html#10gr1 for more information.

Automating Authentication for oracle ssh Logins

Before you install the Oracle Database 10g Software with Real Application Clusters (RAC) R1 (10.1.0.2), it is important that you followed the steps as outlined in Automating Authentication for oracle ssh Logins.

Checking Oracle Environment Variables

Run the following command on all RAC nodes:

su - oracle
$ set | grep ORA
ORACLE_BASE=/u01/app/oracle
ORACLE_SID=orcl1
$

It is important that $ORACLE_SID is different on each RAC node! It is also recommended that $ORACLE_HOME is not set, so that OUI selects the home directory.

Installing Oracle Database 10g Software R1 (10.1.0.2) with Real Application Clusters (RAC)

In order to install the Oracle Database 10g R1 (10.1.0.2) software with Real Application Clusters (RAC) on all RAC nodes, OUI has to be launched on only one RAC node. In my example I will run OUI on rac1pub.

To install the RAC database software, insert the Oracle Database 10g R1 (10.1.0.2) CD (downloaded image name: "ship.db.cpio.gz") and mount it on e.g. rac1pub:

su - root
mount /mnt/cdrom

Use the oracle terminal that you prepared for ssh at Automating Authentication for oracle ssh Logins, and execute runInstaller:

oracle$ /mnt/cdrom/runInstaller

- Welcome Screen: Click Next
- File Locations: I used the default values:
    Destination Name: OraDb10g_home1
    Destination Path: /u01/app/oracle/product/10.1.0/db_1
  Click Next
- Hardware Cluster Installation Mode:
    Select "Cluster Installation"
    Click "Select All" to select all servers: rac1pub, rac2pub, rac3pub
  Click Next

  NOTE: If it stops here and the status of a RAC node is "Node not reachable", then perform the following checks:
  - Check whether the node where you launched OUI is able to ssh without a passphrase to the RAC node whose status is set to "Node not reachable".
  - Check whether CRS is running on this RAC node.

- Installation Type: I selected "Enterprise Edition". Click Next
- Product-specific Prerequisite Checks: Make sure that the status of each check is set to "Succeeded". Click Next
- Database Configuration: I selected "Do not create a starter database" since we have to create the database with dbca: the Oracle Database 10g R1 (10.1) OUI will not be able to discover disks that are marked as Linux ASMLib. For more information, see http://otn.oracle.com/tech/linux/asmlib/install.html#10gr1
  Click Next
- Summary: Click Install
- Setup Privileges Window: Open another window, log in as root, and execute /u01/app/oracle/product/10.1.0/db_1/root.sh on ALL RAC nodes, one by one!

  NOTE: Also make sure that X is relinked to your local desktop, since this script will launch the "VIP Configuration Assistant" tool, which is a GUI-based utility!

  VIP Configuration Assistant Tool:
  (This Assistant tool will come up only once, when root.sh is executed the first time in your RAC cluster)

  - Welcome Screen: Click Next
  - Network Interfaces: I selected both interfaces, eth0 and eth1.

    Click Next
  - Virtual IPs for cluster nodes: (for the alias names and IP addresses, see Setting Up the /etc/hosts File)
      Node Name: rac1pub   IP Alias Name: rac1vip   IP address: 192.168.1.51   Subnet Mask: 255.255.255.0
      Node Name: rac2pub   IP Alias Name: rac2vip   IP address: 192.168.1.52   Subnet Mask: 255.255.255.0
      Node Name: rac3pub   IP Alias Name: rac3vip   IP address: 192.168.1.53   Subnet Mask: 255.255.255.0

    Click Next
  - Summary: Click Finish
  - Configuration Assistant Progress Dialog: Click OK after the configuration is complete.
  - Configuration Results: Click Exit

  Click OK to close the Setup Privileges Window.

- End of Installation: Click Exit

If OUI terminates abnormally (this happened to me several times), or if anything else goes wrong, remove the following files/directories and start over again:

su - oracle
rm -rf /u01/app/oracle/product/10.1.0/db_1
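Once the VIP Configuration Assistant has finished, you can quickly verify that the virtual IPs configured above answer on the network. A sketch using the alias names assumed in this article:

# Each VIP should now be plumbed on a public interface and reachable
for vip in rac1vip rac2vip rac3vip; do
    ping -c 1 $vip
done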

Installing Oracle Database 10g with Real Application Clusters (RAC)

General

The following steps show how to use dbca to create the database and its instances. Oracle recommends using dbca to create the RAC database since the preconfigured databases are optimized for ASM, the server parameter file, and automatic undo management. dbca also makes it much easier to create new ASM disk groups, etc.

Automating Authentication for oracle ssh Logins

Before you install a RAC database, it is important that you followed the steps as outlined in Automating Authentication for oracle ssh Logins.

Setting Oracle Environment Variables

Since the Oracle RAC software is already installed, $ORACLE_HOME can now be set to the home directory that was chosen by OUI. The following steps should now be performed on all RAC nodes! It is very important that these environment variables are set permanently for oracle on all RAC nodes!

To make sure $ORACLE_HOME and $PATH are set automatically each time oracle logs in, add these environment variables to the ~oracle/.bash_profile file, which is the user startup file for the Bash shell on Red Hat Linux. To do this you could simply copy/paste the following commands to make these settings permanent for your oracle Bash shell (the path might differ on your system!):

su - oracle
cat >> ~oracle/.bash_profile << EOF
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0/db_1
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
EOF
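A short sketch for checking that the settings really ended up in every node's ~oracle/.bash_profile (node names as assumed in this article):

for node in rac1pub rac2pub rac3pub; do
    echo "== $node =="
    ssh $node "grep ORACLE_HOME ~/.bash_profile"
done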

Installing Oracle Database 10g with Real Application Clusters (RAC)

To install the RAC database and the instances on all RAC nodes, dbca has to be launched on only one RAC node. In my example I will run it on rac1pub.

Use the oracle terminal that you prepared for ssh at Automating Authentication for oracle ssh Logins, and execute dbca. But before you execute dbca, make sure that $ORACLE_HOME and $PATH are set:

oracle$ . ~oracle/.bash_profile
oracle$ dbca

- Welcome Screen: Select "Oracle Real Application Clusters database". Click Next
- Operations: Select "Create Database". Click Next
- Node Selection: Click "Select All". Make sure all your RAC nodes show up and are selected! If dbca hangs here, then you probably didn't follow the steps as outlined at Automating Authentication for oracle ssh Logins.
  Click Next
- Database Templates: I selected "General Purpose". Click Next
- Database Identification:
    Global Database Name: orcl
    SID Prefix: orcl

- Management Option: Management".

Click Next I selected "Use Database Control for Database

  Click Next
- Database Credentials: I selected "Use the Same Password for All Accounts". Enter the password and make sure the password does not start with a digit. Click Next
- Storage Options: I selected "Automatic Storage Management (ASM)"; see Installing and Configuring Automatic Storage Management (ASM) and Disks.
  Click Next
- Create ASM Instance: Enter the SYS password for the ASM instance. I selected the default parameter file (IFILE): "{ORACLE_BASE}/admin/+ASM/pfile/init.ora"
  Click Next
  At this point DBCA will create and start the ASM instance on all RAC nodes. Click OK to create and start the ASM instance. An error will come up that oratab can't be copied to /tmp; I ignored this error. If you get "ORACLE server session terminated by fatal error", then you probably didn't follow the steps at Setting Up the /etc/hosts File.
- ASM Disk Groups:
  - Click "Create New"
  - Create Disk Group Window:
    - Click "Change Disk Discovery Path".
    - Enter "ORCL:VOL*" for Disk Discovery Path. The discovery string for finding ASM disks must be prefixed with "ORCL:", and in my example I called the ASM disks VOL1, VOL2, and VOL3.
    - I entered an arbitrary Disk Group Name: ORCL_DATA1
    - I checked the candidates "ORCL:VOL1" and "ORCL:VOL2", which together have about 60 GB of space in my configuration.
    - Click OK.
  - Check the newly created disk group "ORCL_DATA1".
  - Click Next
- Database File Locations: Select "Use Oracle-Managed Files"
    Database Area: +ORCL_DATA1
  Click Next

- Recovery Configuration: Using recovery options like the Flash Recovery Area is out of scope for this article, so I did not select any recovery options. Click Next
- Database Content: I did not select Sample Schemas or Custom Scripts. Click Next
- Database Services: Click "Add" and enter a Service Name: I entered "orcltest". I selected the TAF Policy "Basic". Click Next
- Initialization Parameters: Change settings as needed. Click Next
- Database Storage: Change settings as needed. Click Next
- Creation Options: Check "Create Database". Click Finish
- Summary: Click OK

Now the database is being created. The following error message came up: "Unable to copy the file "rac2pub:/etc/oratab" to "/tmp/oratab.rac2pub"". I clicked "Ignore". I have to investigate this.

Your RAC cluster should now be up and running. To verify, try to connect to each instance from one of the RAC nodes:

$ sqlplus system@orcl1
$ sqlplus system@orcl2
$ sqlplus system@orcl3

After you have connected to an instance, enter the following SQL command to verify your connection:

SQL> select instance_name from v$instance;
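The per-instance check can also be scripted from one node. A minimal sketch, assuming the three net service names above resolve on this node; the SYSTEM password is read interactively:

read -s -p "SYSTEM password: " SYSPW; echo
for svc in orcl1 orcl2 orcl3; do
    echo "== $svc =="
    echo "select instance_name from v\$instance;" | sqlplus -s system/$SYSPW@$svc
done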

Post-Installation Steps

Transparent Application Failover (TAF)

Introduction

Transparent Application Failover (TAF) is controlled by processes external to the Oracle 10g RAC cluster. This means that the failover types and methods can be unique for each Oracle Net client. The re-connection happens automatically within the OCI library, which means that you do not need to change the client application to use TAF.

Setup

To test TAF on the newly installed RAC cluster, configure the tnsnames.ora file for TAF on a non-RAC server where you have either the Oracle database software or the Oracle client software installed. Here is what my /opt/oracle/product/9.2.0/network/admin/tnsnames.ora looks like:

ORCLTEST =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac1vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac2vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac3vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )

The following SQL statement can be used to check the session's failover type, failover method, and whether a failover has occurred:

select instance_name, host_name,
       NULL AS failover_type,
       NULL AS failover_method,
       NULL AS failed_over
  FROM v$instance
UNION
SELECT NULL, NULL, failover_type, failover_method, failed_over
  FROM v$session
 WHERE username = 'SYSTEM';

Example of a Transparent Application Failover (TAF)

Here is an example of a Transparent Application Failover:

su - oracle
$ sqlplus system@orcltest

SQL> select instance_name, host_name,
  2         NULL AS failover_type,
  3         NULL AS failover_method,
  4         NULL AS failed_over
  5    FROM v$instance
  6  UNION
  7  SELECT NULL, NULL, failover_type, failover_method, failed_over
  8    FROM v$session
  9   WHERE username = 'SYSTEM';

INSTANCE_NAME    HOST_NAME  FAILOVER_TYPE FAILOVER_M FAI
---------------- ---------- ------------- ---------- ---
orcl1            rac1pub    SELECT        BASIC      NO

SQL>

The above SQL statement shows that I'm connected to "rac1pub" for instance "orcl1". In this case, execute shutdown abort on "rac1pub" for instance "orcl1":

SQL> shutdown abort
ORACLE instance shut down.
SQL>

Now rerun the SQL statement:

SQL> select instance_name, host_name,
  2         NULL AS failover_type,
  3         NULL AS failover_method,
  4         NULL AS failed_over
  5    FROM v$instance
  6  UNION
  7  SELECT NULL, NULL, failover_type, failover_method, failed_over
  8    FROM v$session
  9   WHERE username = 'SYSTEM';

INSTANCE_NAME    HOST_NAME  FAILOVER_TYPE FAILOVER_M FAI
---------------- ---------- ------------- ---------- ---
orcl2            rac2pub    SELECT        BASIC      YES

SQL>

The SQL statement shows that the session has failed over to instance "orcl2". Note that this can take a few seconds.

Checking Automatic Storage Management (ASM)

Here are a couple of SQL statements to verify ASM.

Run the following command to see which data files are in which disk group:

SQL> select name from v$datafile
  2  union
  3  select name from v$controlfile
  4  union
  5  select member from v$logfile;

NAME
--------------------------------------------------------------------------------
+ORCL_DATA1/orcl/controlfile/current.260.3
+ORCL_DATA1/orcl/datafile/sysaux.257.1
+ORCL_DATA1/orcl/datafile/system.256.1
+ORCL_DATA1/orcl/datafile/undotbs1.258.1
+ORCL_DATA1/orcl/datafile/undotbs2.264.1
+ORCL_DATA1/orcl/datafile/users.259.1
+ORCL_DATA1/orcl/onlinelog/group_1.261.1
+ORCL_DATA1/orcl/onlinelog/group_2.262.1
+ORCL_DATA1/orcl/onlinelog/group_3.265.1
+ORCL_DATA1/orcl/onlinelog/group_4.266.1

10 rows selected.

SQL>

Run the following command to see which ASM disk(s) belong to the disk group 'ORCL_DATA1' (ORCL_DATA1 was specified in Installing Oracle Database 10g with Real Application Clusters):

SQL> select path from v$asm_disk where group_number in
  2  (select group_number from v$asm_diskgroup where name = 'ORCL_DATA1');

PATH
--------------------------------------------------------------------------------
ORCL:VOL1
ORCL:VOL2

SQL>

Oracle 10g RAC Issues, Problems and Errors

This section describes other issues, problems, and errors pertaining to installing Oracle 10g with RAC that have not been covered so far.

• Gtk-WARNING **: libgdk_pixbuf.so.2: cannot open shared object file: No such file or directory

  This error can come up when you run ocfstool. To fix this error, install the gdk-pixbuf RPM:

    rpm -ivh gdk-pixbuf-0.18.0-8.1.i386.rpm

• /u01/app/oracle/product/10.1.0/crs_1/bin/crs_stat.bin: error while loading shared libraries: libstdc++-libc6.2-2.so.3: cannot open shared object file: No such file or directory
  PRKR-1061 : Failed to run remote command to get node configuration for node rac1pup

  This error can come up when you run root.sh. To fix this error, install the compat-libstdc++ RPM and rerun root.sh:

    rpm -ivh compat-libstdc++-7.3-2.96.122.i386.rpm

• mount: fs type ocfs not supported by kernel

  The OCFS kernel module was not loaded. See Configuring and Loading OCFS for more information.

• ORA-00603: ORACLE server session terminated by fatal error
  or


  SQL> startup nomount
  ORA-29702: error occurred in Cluster Group Service operation

  If the trace file /u01/app/oracle/product/10.1.0/db_1/rdbms/log/orcl1_ora_7424.trc looks like this:

  ...
  kgefec: fatal error 0
  *** 2004-03-13 20:50:28.201
  ksedmp: internal or fatal error
  ORA-00603: ORACLE server session terminated by fatal error
  ORA-27504: IPC error creating OSD context
  ORA-27300: OS system dependent operation:gethostbyname failed with status: 0
  ORA-27301: OS failure message: Error 0
  ORA-27302: failure occurred at: sskgxpmyip4
  Current SQL information unavailable - no session.
  ----- Call Stack Trace -----
  calling              call     entry                argument values in hex
  location             type     point                (? means dubious value)
  -------------------- -------- -------------------- ----------------------------
  ksedmp()+493         call     ksedst()+0           0 ? 0 ? 0 ? 1 ? 0 ? 0 ?
  ksfdmp()+14          call     ksedmp()+0           3 ? BFFF783C ? A483593 ?
                                                     BF305C0 ? 3 ? BFFF8310 ?

  then make sure that the name of the RAC node is not listed for the loopback address in the /etc/hosts file, similar to this example:

    127.0.0.1    rac1pub localhost.localdomain localhost

  The entry should rather look like this:

    127.0.0.1    localhost.localdomain localhost
