
Oracle Real Application Clusters GFS

Oracle RAC GFS


Oracle Real Application Clusters GFS: Oracle RAC GFS
Copyright © 2006 Red Hat

This manual is a complete guide to installing and configuring Oracle RAC on Red Hat GFS.
Table of Contents
Introduction
   1. About This Guide
   2. Audience
   3. Software Versions
   4. Related Documentation
   5. Document Conventions
1. Sample Cluster
   1. Oracle RAC Cluster on CS4/GFS
      1.1. Sample 4-node Oracle RAC cluster
      1.2. Storage
      1.3. Fencing Topology
      1.4. Remote Lock Management using GULM
      1.5. Network fabrics
         1.5.1. Hostnames and Networks
         1.5.2. Hostnames and Physical Interfaces
2. Installation and Configuration of RHEL4
   1. Using RHEL4 Update 3
      1.1. Customizing the RHEL4 Installation
         1.1.1. Boot Disk Provisioning
         1.1.2. Network Interfaces
         1.1.3. Firewall and Security
         1.1.4. Selecting from the Custom subset
      1.2. Post Install Configuration Activities
         1.2.1. INIT[run level] options
         1.2.2. Configuring Cluster Clock synchronization
         1.2.3. Configuring HP iLO (Integrated Lights Out)
         1.2.4. Shared LUNs Requirement
3. Installation and Configuration of Cluster Suite4
   1. Installing ClusterSuite4 (CS4) components
      1.1. Installing CS4 RPMs
      1.2. Configuring CS4 Using the GUI Tool
         1.2.1. Verify X11 connectivity
      1.3. Configuring the 1st lock server
         1.3.1. Hostnames, Networks and Interfaces
         1.3.2. Configuring with the GUI tool
         1.3.3. After the GUI configuration
         1.3.4. Testing first GULM lock server
         1.3.5. Configuring the remaining GULM lock servers
         1.3.6. After the GUI configuration: for other lock servers
         1.3.7. Adding the Four RAC nodes and their Fence Devices
         1.3.8. Post GUI configuration for other lock servers
         1.3.9. Operational Considerations
4. Installing Clustered Logical Volume Manager (CLVM)
   1. Installing CLVM components
   2. Configuring CLVMD
   3. Start up CLVMD
   4. Repeat Installation and configuration for all nodes
5. Creating the Physical and Logical Volumes
   1. Physical Storage Allocation
   2. Initialize and Configure Volumes
      2.1. Verify X11 connectivity
      2.2. Initialize the Shared Home volume group
      2.3. Create the 1st redo volume group and logical volume
      2.4. Create the remaining redo groups and volumes
      2.5. Create the main datafiles logical volume
6. GFS
   1. Installing GFS components
   2. Create the GFS volumes
      2.1. Verify the logical volumes
      2.2. Create the filesystems
      2.3. /etc/fstab entries
7. Oracle 10gR2 Clusterware
   1. Installing Oracle 10gR2 Clusterware (formerly 10gR1 CRS)
      1.1. RHEL Preparation
         1.1.1. Map the shared raw partitions to RHEL rawdevices
         1.1.2. Configure /etc/sysctl.conf
         1.1.3. Create the oracle user
         1.1.4. Create a clean ssh connection environment
         1.1.5. Download Oracle Installers
         1.1.6. Create shared home directories
         1.1.7. Verify X11 connectivity
         1.1.8. Clusterware rootpre.sh
         1.1.9. Instantiating Clusterware
         1.1.10. Registering Clusterware resources with VIPCA
8. Installing Oracle 10gR2 Enterprise Edition Database
   1. RHEL Preparation
   2. Oracle 10gR2 RDBMS Installation
   3. Oracle SQL*Net Configuration
9. Creating a Database
   1. Database File Layout
      1.1. Oracle Datafiles
      1.2. Redo and Undo
   2. Setup and Scripts
      2.1. Environment Variables
      2.2. Installing a shared init.ora configuration
      2.3. Sample Init.ora
      2.4. Detailed Parameter Descriptions
         2.4.1. control_files (mandatory)
         2.4.2. db_name (mandatory)
         2.4.3. db_block_size (mandatory)
         2.4.4. sga_target (optionally mandatory)
         2.4.5. File I/O (optional)
         2.4.6. Network and Listeners (mandatory)
         2.4.7. Undo (mandatory)
         2.4.8. Foreground and Background dump destinations (mandatory)
         2.4.9. Cluster_database_instances (mandatory)
         2.4.10. Cluster_database (mandatory)
         2.4.11. RAC identification (mandatory)
         2.4.12. Create Database Script and Execution
         2.4.13. Registering the database and the instance with Oracle Clusterware
Index

Introduction
1. About This Guide
This manual provides a step-by-step installation of Oracle's 10gR2 Real Application Clusters (RAC) database on GFS6.1. A sample cluster is provided as a working example that incorporates some best practices to provide entry-level performance and stability.

2. Audience
Installing RAC on GFS typically requires the collaboration of database administrators (DBAs), storage administrators, and
system administrators in order to get the best setup. Installing RAC on GFS6/CS4 is an advanced activity for all three
groups.

3. Software Versions
Software Description

RHEL4 refers to RHEL4.3 and higher

GFS refers to GFS6.1

CS refers to CS4

RAC refers to Oracle 10gR2 RAC

Table 1. Software Versions

4. Related Documentation
This manual is intended to be a complete “cookbook” and therefore attempts to eliminate the need to read other Oracle or
RHEL installation manuals. All steps to successfully install and instantiate an Oracle 10gRAC cluster on GFS6/CS4 are
contained in this manual.

Oracle is a very complex product and many permutations on installation are possible. The description of the sample cluster provides a rationale for many of the best practices that are being deployed. Oracle RAC can be installed for HA, for scalability, or to realize the cost savings of using a commodity RHEL enterprise computing platform. This sample four-node cluster will provide some degree of high availability (HA) and scalability, but as always, the degree to which these are realized is highly dependent on the mid-tier application architecture.

Referring to other manuals should not be necessary because all the information you need is in this document. However, if
you would like to learn how to customize your installation, there are Notes and Tips throughout the document to provide
some detail into why certain decisions were made for this sample cluster. For further optional reading:

• Red Hat GFS6.1 Admin Guide

• Red Hat GFS6.1 Release Notes

• Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Installation Guide for 10g Release 2
(10.2) for Linux (Part Number B14203-05)

• Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide
10g Release 2 (10.2) (Part Number B14197-03)

• Oracle® SQL Reference


• Oracle® Administer

• Oracle® Network Administration

5. Document Conventions
Certain words in this manual are represented in different fonts, styles, and weights. This highlighting indicates that the
word is part of a specific category. The categories include the following:

Courier font
Courier font represents commands, file names and paths, and prompts.

When shown as below, it indicates computer output:

Desktop about.html logs paulwesterberg.png


Mail backupfiles mail reports

bold Courier font


Bold Courier font represents text that you are to type, such as: service jonas start

If you have to run a command as root, the root prompt (#) precedes the command:

# gconftool-2

italic Courier font


Italic Courier font represents a variable, such as an installation directory: install_dir/bin/

bold font
Bold font represents application programs and text found on a graphical interface.

When shown like this: OK , it indicates a button on a graphical application interface.

Additionally, the manual uses different strategies to draw your attention to pieces of information. In order of how critical
the information is to you, these items are marked as follows:

Note
A note is typically information that you need to understand the behavior of the system.

Tip
A tip is typically an alternative way of performing a task.

Important
Important information is necessary, but possibly unexpected, such as a configuration change that will not persist
after a reboot.

Caution
A caution indicates an act that would violate your support agreement, such as recompiling the kernel.

Warning
A warning indicates potential data loss, as may happen when tuning hardware for maximum performance.

Chapter 1. Sample Cluster
1. Oracle RAC Cluster on CS4/GFS
1.1. Sample 4-node Oracle RAC cluster
The sample cluster in this chapter represents a simple, yet effective RAC cluster deployment. Where necessary, a rationale
will be provided so that when your business requirements cause you to deviate from the sample cluster requirements, there
will be information to help you with these customizations.

Oracle 10gR2 RAC can provide both high availability and scalability using modern commodity servers running RHEL4. Oracle 10gR2 comes in both 32-bit and 64-bit versions. A typical modern four-node RAC cluster consists of high quality commodity servers that provide superior price/performance, reliability, and modularity to make Oracle commodity computing a viable alternative to large Enterprise-class UNIX mainframes.

This sample cluster, which consists of four identical nodes, is the most common deployment topology. It is called a Symmetrical RAC cluster as all server nodes are identically configured.


Figure 1.1. Sample four-node cluster

Note
Asymmetrical cluster topologies also make sense where there is a need to isolate application writes to one node. Spreading writes over several nodes in RAC can limit scalability and can complicate node failover, but this highlights how important application integration with RAC is. Asymmetrical RAC clusters slightly favor performance over availability.

1.2. Storage
This cluster uses a commodity storage platform (HP StorageWorks MSA1500), which is a conventional 2Gb/s FCP SAN. GFS is a blocks-based clustered filesystem and therefore can run over any FCP or iSCSI SAN. The storage array must be accessible from all nodes. Each node needs an FCP HBA. Storage vendors will be very particular about which HBAs and which supported drivers are required for use with their equipment. Typically, the vendor will sell an attach kit that contains the FCP HBA and the relevant RHEL4 drivers.

A minimum of one FCP switch is required, although many topologies are configured with two switches, which would then
require each node to have a dual-ported HBA.

Note
Like FCP, iSCSI is a blocks-based protocol that implements the T10 SCSI command set. It does this over a TCP/IP transport instead of a Fibre Channel transport. The terms SAN and NAS are no longer relevant in the modern storage world, but have historically been euphemisms for SCSI over FCP (SAN) and NFS over TCP/IP (NAS). What matters is whether the array speaks a SCSI blocks-based protocol or the NFS filesystem protocol. Often iSCSI is mistakenly referred to as NAS, when it really has much more in common with FCP, since the protocol is what matters, not the transport.

In order for Oracle to perform well, it requires spindles, not bandwidth. Customers often configure their storage array based on how much space the database might need or, if they do consider performance, how much bandwidth. Neither is an appropriate metric for sizing a storage array to run an Oracle database.

Database performance almost always depends on the number of IOPS (I/O operations per second) that a storage fabric can deliver, and this is inevitably a function of the number of spindles underneath a database table or index. The best strategy is to use the SAME (Stripe And Mirror Everything) methodology. This allows any GFS volume to have access to the maximum IOP rate the array supports without knowing the performance requirements of the application in advance.

When a SQL query performs an index range scan, thousands of IOPS may be needed. What determines the IOP rate of a given physical disk is how fast it spins (RPM), the location of the data, and the interface type. Here is an approximate sizing guide:

Interface/RPM        IOPS

SATA-I (7200 rpm)    50
SATA-II (10K rpm)    150
FCP (10K rpm)        150
FCP (15K rpm)        200

Table 1.1. Sizing Guide
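
As a rough worked example (illustrative numbers only): a two-shelf array with 28 payload spindles of 10K FCP disks can sustain approximately 28 x 150 = 4,200 random IOPS, and a RAID 10 layout that can service reads from both sides of the mirror roughly doubles the read IOP rate. A single 15K FCP spindle, by contrast, tops out near 200 IOPS, which is why spindle count dominates sizing.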

A 144GB 10K FCP drive can sometimes out-perform a 72GB 10K drive because most of the data might be located in fewer "tracks," causing the larger disk to seek less. However, the rotational latency is identical, and the track cache often does not help because the database typically reads thousands of random blocks per second. SATA-I drives are particularly bad because they do not support tagged queuing. Tagged queuing, an interface optimization found in SCSI, permits the disk to process more I/O transactions, but it increases the cost. A 7200-rpm Ultra-Wide SCSI disk often out-performs the equivalent SATA-I drive due to tagged queuing. SATA-I drives are very high capacity and cheap, but are very poor at random IOPS. SATA-II disks support tagged queuing.

A modern Oracle database should have at least two shelves (20-30 payload spindles) in order to ensure that there is a reasonable amount of performance. In this cluster, the RAID10 volumes are implemented in the storage array, which is now common practice. The extent allocation policy does influence performance, and this will be defined when the volume group is created with CLVM. CLVM will be presented with several physical LUNs that all have the same performance characteristics.

When adding performance capacity to the storage array, it is important that the array re-balance the existing allocated
LUNs over this larger set of spindles so the database objects on those existing GFS volumes benefit from increased IOP
rates.

Note
Payload spindles are the physical disks that contain only data, not parity. RAID 0+1 configurations allow the mirror to be utilized as payload, which doubles the IOP rate of a mirror. Some arrays that support this feature on conventional RAID1 mirrors also perform this optimization.

The Oracle Clusterware files (Voting and Registry) are not currently supported on GFS. For this cluster, two 256MB shared raw partitions located on LUN0 will be used. LUN0 is usually not susceptible to device scanning instability unless new SCSI adaptors are added to the node, so do not connect additional SCSI controllers to a RAC cluster node once Clusterware is installed. Since all candidate CLVM LUNs have the same performance characteristics, their size and number are determined by their usage requirements:

• One 6GB LUN (Shared Oracle Home)

• One 24GB LUN for datafiles and indexes

• Four 4GB LUNs for Redo logs and Undo tablespaces

Each node in the cluster gets a dedicated volume for its specific Redo and Undo. One single GFS volume will contain all datafiles and indexes. This normally will not cause a bottleneck on RHEL unless there is a requirement for more than 15,000 IOPS; this is an operational trade-off of performance and simplicity in a modest cluster. The number of spindles in the array should continue to be the bottleneck.

Caution
Choosing a RAID policy can be contentious. With databases, spindle count matters more than size, so using a simple RAID scheme such as RAID 1+0 (or RAID 10: mirrored, then striped) is often the best policy. It is faster on random I/O, yet not as space-efficient as RAID4 and RAID5. The system will typically have far more space than it needs because the array was correctly configured by spindle count, not bandwidth or capacity.

1.3. Fencing Topology


Fencing is a mechanism a cluster employs to prevent uncoordinated access to the shared storage array. It prevents "split-brain" clusters, where one set of nodes thinks it is the actual cluster and another set thinks the same thing. This usually destroys the integrity of the shared storage. Fencing can be implemented in a variety of ways in CS4, but this sample cluster uses server IPMI (Intelligent Platform Management Interface). HP servers call this feature iLO (Integrated Lights Out) and it is found on all ProLiant servers. This is a network-accessible management processor that accepts commands that can power the server on and off.

In CS4, the lock servers are responsible for maintaining quorum and determining if a member node is in such a state that it needs to be fenced in order to protect the integrity of the cluster. The master lock server will issue commands directly to the power management interface to power cycle the server node. This is a very reliable fencing mechanism. CS4 supports a variety of hardware interfaces that can effect power cycling on nodes that need to be fenced.

1.4. Remote Lock Management using GULM


The lock manager that is certified for use with Oracle 10gR2 is called GULM (Grand Unified Lock Manager). It is required that the lock managers be physically distinct servers; this is known as Remote Lock Management. The minimum requirement is to have at least three of them, with one master and two slaves. There are four RAC servers and three external GULM servers, and all of them must be capable of being fenced. The lock servers do not need to be as powerful as the database nodes. They should have at least 1GB of RAM and be GbE connected, but can run either 32-bit or 64-bit RHEL. The database nodes are HP DL385 dual-core, dual-socket four-way Opterons, and the lock servers could be less-equipped DL385s to simplify the order, if all seven nodes are being purchased together. The lock servers could even be previous-generation DL360s that meet the GbE and memory requirements.

1.5. Network fabrics


Several network fabrics are required for both CS4 and Oracle to communicate across the cluster. These networks could all have their own private physical interfaces and network fabrics, but this sample cluster has overlapped the GULM and Oracle RAC heartbeat networks. The RAC heartbeat network is also used to move database blocks between nodes. Depending on the nature of the database application, it might be necessary to further dedicate a physical network just to the RAC heartbeat. Only one GbE switch (HP ProCurve 2724) is used for both heartbeat fabrics and iLO.

Note
If more redundancy is required, adding another switch also requires adding two more GbE ports to each server in order to implement bonded interfaces. Just adding a second private switch dedicated to RAC does not help: if the other switch fails, the CS4 heartbeat would fail and take RAC down with it.

Note
Private unmanaged switches are sufficient as these are standalone, isolated network fabrics. Network Operations
staff may still prefer that the switch is managed, but it should remain physically isolated from production VLANs.

1.5.1. Hostnames and Networks


For configuring CS4 and Oracle RAC, a hostname convention was followed to make it easier to map hostnames to a network. All networks are required and must map to underlying physical interfaces. The type and number of physical interfaces used is a performance and reliability consideration.

Public   RAC vip    RAC heartbeat   CS4 heartbeat   iLO

RAC1     RAC1-vip   RAC1-priv       RAC1-priv       RAC1-iLO
RAC2     RAC2-vip   RAC2-priv       RAC2-priv       RAC2-iLO
RAC3     RAC3-vip   RAC3-priv       RAC3-priv       RAC3-iLO
RAC4     RAC4-vip   RAC4-priv       RAC4-priv       RAC4-iLO
Lock1    n/a        n/a             Lock1-gulm      Lock1-iLO
Lock2    n/a        n/a             Lock2-gulm      Lock2-iLO
Lock3    n/a        n/a             Lock3-gulm      Lock3-iLO

Table 1.2. Hostnames and Networks

1.5.2. Hostnames and Physical Interfaces

         ETH0              ETH1         iLO

RAC1     RAC1, RAC1-vip    RAC1-priv    RAC1-iLO
RAC2     RAC2, RAC2-vip    RAC2-priv    RAC2-iLO
RAC3     RAC3, RAC3-vip    RAC3-priv    RAC3-iLO
RAC4     RAC4, RAC4-vip    RAC4-priv    RAC4-iLO
Lock1    Lock1             Lock1-gulm   Lock1-iLO
Lock2    Lock2             Lock2-gulm   Lock2-iLO
Lock3    Lock3             Lock3-gulm   Lock3-iLO

Table 1.3. Hostnames and Physical Interfaces
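
The hostname convention is typically realized with /etc/hosts entries on every node. The following is a minimal sketch for one RAC node and one lock server; the private and iLO addresses for rac1-priv and lock1 match those shown later in this guide, while the remaining addresses are placeholders that must be adapted to your own subnets.

192.168.1.150   rac1          # public (placeholder)
192.168.1.250   rac1-vip      # RAC virtual IP (placeholder)
192.168.2.150   rac1-priv     # RAC/CS4 heartbeat
192.168.2.50    rac1-ilo      # iLO fence interface (placeholder)
192.168.1.154   lock1         # public
192.168.2.154   lock1-gulm    # GULM heartbeat
192.168.2.54    lock1-ilo     # iLO fence interface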

Chapter 2. Installation and Configuration
of RHEL4
1. Using RHEL4 Update 3
Each node that will become an Oracle RAC cluster node requires RHEL4 Update 3 or higher. Previous versions of RHEL4
are not recommended. The lock servers must also be installed with the same version of RHEL4, but can either be 32-bit or
64-bit. In our sample cluster, the RAC nodes and the external lock servers are all 64-bit Opteron servers running 64-bit
versions of both RHEL and Oracle.

1.1. Customizing the RHEL4 Installation


1.1.1. Boot Disk Provisioning
When installing RHEL4 on the RAC nodes, there is some basic minimal provisioning that should be followed. Although
most modern systems have very large system disks, all the software ever needed to install either a 10gRAC or CS4 lock
server would actually fit on a 9GB disk. Here are some typical minimums:

/boot 128MB
swap 4096MB # 10gR2 Installer expects at least 3932MB
/ 4096MB # A RAC/CS4-ready RHEL4 is about 2.5GB
/home 1024MB # Most of the ORACLE files are on GFS

Most customers will deploy with much larger drives, but this example helps explain what is being allocated. Oracle files are mostly installed on a GFS volume. The /home requirements are so minimal that they can be safely folded into the root volume and still not exceed a 4GB partition. The size of the RHEL install, including what is needed to recompile the kernel, will rarely exceed 4GB. Once installed, the only space growth would come from system logs or crash dumps.

1.1.2. Network Interfaces


The following network interfaces can be configured during the installation session:

eth0
192.168.1.100 (SQL*Net App Tier)

eth1
192.168.2.100 (Oracle RAC GCS/CS4-GULM)

eth2
192.168.3.100 (Optional to isolate RAC from CS4-GULM)

The first two interfaces are required; the optional third network interface could be deployed to further separate CS4 lock traffic from Oracle RAC traffic. In addition, NIC bonding (which is supported by Oracle RAC) is recommended for all interfaces if further hardening is required. For the sake of simplicity in this example, this cluster does not deploy bonded Ethernet interfaces.
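
If bonded interfaces are later required, the following RHEL4 fragments are a minimal sketch of an active-backup bond for the private fabric; the device names, addresses, and the use of eth1/eth2 as slaves are assumptions that must be adapted to your hardware.

# /etc/modprobe.conf
alias bond0 bonding
options bond0 mode=active-backup miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.2.100
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth1 (and similarly ifcfg-eth2)
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none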

DNS should be configured and enabled so that ntpd can locate the default clock servers. The ntpd process normally needs DNS to resolve the names of the default time servers. If ntpd will be configured to use raw IP addresses, then DNS will not be required. This sample cluster will configure DNS during the install and ntpd during post-install.

1.1.3. Firewall and Security


The firewall software and security features are also optional for Oracle. Normally, security is handled by placing the database and lock servers in an isolated VLAN that has been locked down using ACL (Access Control List) entries in the network switches. The Oracle SQL*Net listeners will likely require an ACL to be allocated and opened, because the datacenter has most or all ports disabled.
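
If a host firewall is left enabled on the RAC nodes instead, the SQL*Net listener port has to be opened explicitly. The following is an illustrative iptables rule only; 1521 is the Oracle default listener port and 192.168.1.0/24 stands in for the application-tier subnet, so adjust both to your environment.

rac1 # iptables -A INPUT -p tcp -s 192.168.1.0/24 --dport 1521 -j ACCEPT
rac1 # service iptables save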

1.1.4. Selecting from the Custom subset


During the installation of RHEL4, the installer will ask if you would like to do a standard or customized install. The minimum subset required to configure and install both CS4 and Oracle 10gRAC is:

• X Windows

• Server Config Tools

• Development Tools

• X Software Development

• Compatibility Software Development (32-bit)*

• Admin Tools

• System Tools

• Compatibility Arch Support (32-bit)*

Note
* Sub-systems that are only available on x64 installs

The compatibility subsets appear as options only during 64-bit install sessions. More subsets can of course be selected, but
it is recommended that an Oracle 10gRAC node not be provisioned to do anything other than run Oracle 10gRAC. The
same recommendation also applies to GULM lock servers.

1.2. Post Install Configuration Activities


1.2.1. INIT[run level] options
The system installs with a default run level of 5, although 3 can also be used. Level 5 is required only if the GUI tools are executed from the system console; if the GUI tools are executed over a remote X session, level 3 is sufficient. Installing CS4 and Oracle often requires the user to not be co-located with the console, so running remote X (even through a firewall) is very common, and this is how this cluster was configured. The run level can be changed in /etc/inittab:

# Default runlevel. The runlevels used by RHS are:


# 0 - halt (Do NOT set initdefault to this)
# 1 - Single user mode
# 2 - Multiuser, without NFS (The same as 3, if you do not have networking)
# 3 - Full multiuser mode
# 4 - unused
# 5 - X11
# 6 - reboot (Do NOT set initdefault to this)
#
id:2:initdefault:

1.2.2. Configuring Cluster Clock synchronization


The ntpd service must be run on all RAC and GULM servers and is not enabled by default. It can be enabled for both run levels 3 and 5:

lock1 $ sudo chkconfig --level 35 ntpd on

RAC clusters need their clocks to be within a few minutes of each other, but they do not need to be perfectly synchronized. Using ntpd should provide accuracy within a second or better, which is more than adequate. If the clocks are wildly off after the install, ntpd will only slowly slew them back into synchronization, and this will not happen quickly enough to be effective; use ntpdate for a one-time correction instead. In order to use ntpdate as a one-time operation, ntpd must not be running.
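
A minimal post-install sequence, assuming a reachable time server (pool.ntp.org is used here only as an example), might look like the following on each node:

lock1 $ sudo service ntpd stop        # ntpdate cannot run while ntpd holds UDP port 123
lock1 $ sudo ntpdate pool.ntp.org     # one-time step correction of the clock
lock1 $ sudo chkconfig --level 35 ntpd on
lock1 $ sudo service ntpd start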

1.2.3. Configuring HP iLO (Integrated Lights Out)


Once the system is installed, the iLO interface needs to be configured. The iLO configuration varies a bit from machine to machine, but the username, password, and IP address of the iLO interface are all that is required to configure fencing. The server's BIOS Advanced section is usually where this is configured. The version of iLO that appears on most DL1xx series boxes (i100) is not supported, as it does not support sshd.
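
Once the fence package from Chapter 3 is installed, the iLO credentials and address can be sanity-checked with a one-off status query. This is a sketch; the options shown are the conventional fence agent flags, so confirm them against the fence_ilo man page on your release.

lock1 $ fence_ilo -a rac1-ilo -l admin -p admin -o status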

1.2.4. Shared LUNs Requirement


The storage must be shared and visible on all nodes in the cluster, including the GULM lock servers. The FCP driver will present the SCSI LUNs to the operating system.

Here are the results of running dmesg | grep scsi:

lock1 $ dmesg | grep scsi

Attached scsi disk sda at scsi0, channel 0, id 0, lun 0


Attached scsi disk sdb at scsi0, channel 0, id 0, lun 2
Attached scsi disk sdc at scsi0, channel 0, id 0, lun 7
Attached scsi disk sdd at scsi0, channel 0, id 0, lun 1
Attached scsi disk sde at scsi0, channel 0, id 0, lun 6
Attached scsi disk sdf at scsi0, channel 0, id 0, lun 3
Attached scsi disk sdg at scsi0, channel 0, id 0, lun 4
Attached scsi disk sdh at scsi0, channel 0, id 0, lun 5
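
One quick, if informal, way to confirm that every node sees the same set of shared LUNs is to repeat the check over ssh (hostnames follow the convention used in this guide):

lock1 $ for h in lock1 lock2 lock3 rac1 rac2 rac3 rac4; do
          echo "== $h =="; ssh $h 'dmesg | grep "Attached scsi disk"'; done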

Chapter 3. Installation and Configuration
of Cluster Suite4
1. Installing ClusterSuite4 (CS4) components
1.1. Installing CS4 RPMs
ClusterSuite4 consists of the Cluster Configuration System (CCS), the lock manager, and other support utilities required by a GULM implementation. This cluster has four RAC nodes and three GULM lock server nodes. Unless stated otherwise in the notes below, the following software must be loaded on all seven nodes. This is the list for a 64-bit install:

ccs-1.0.3-0.x86_64.rpm
    Cluster configuration system.

gulm-1.0.6-0.x86_64.rpm
    Grand unified lock manager.

fence-1.32.18-0.x86_64.rpm
    Installs the fenced daemon, which is only used by the DLM, not GULM. Currently only GULM implementations are certified for use with 10gRAC, so fenced should be disabled:

    lock1 $ chkconfig --level 35 fenced off

perl-Net-Telnet-3.03-3.noarch.rpm
    Required by the fence_ilo agent that would be called by GULM in the event that any of the nodes needed fencing.

magma-1.0.4-0.x86_64.rpm
    n/a

magma-plugins-1.0.6-0.x86_64.rpm
    n/a

rgmanager-1.9.46-0.x86_64.rpm
    Installs the cluster status utility clustat and should be installed on all seven nodes.

system-config-cluster-1.0.25-1.0.noarch.rpm
    Only needs to be installed on the nodes where you choose to run the cluster configuration GUI. It was installed on lock1, which is the primary lock server, but it can be installed on all of the lock servers. It is not necessary to install this package on the DB nodes.

Table 3.1. List for 64-bit Install
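
A minimal installation sketch for one node follows; it assumes the RPMs listed in Table 3.1 have already been downloaded into the current directory.

lock1 $ sudo rpm -Uvh ccs-1.0.3-0.x86_64.rpm gulm-1.0.6-0.x86_64.rpm \
        fence-1.32.18-0.x86_64.rpm perl-Net-Telnet-3.03-3.noarch.rpm \
        magma-1.0.4-0.x86_64.rpm magma-plugins-1.0.6-0.x86_64.rpm \
        rgmanager-1.9.46-0.x86_64.rpm
lock1 $ sudo rpm -Uvh system-config-cluster-1.0.25-1.0.noarch.rpm   # GUI node(s) only
lock1 $ sudo chkconfig --level 35 fenced off                        # fenced is not used with GULM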

1.2. Configuring CS4 Using the GUI Tool


External GULM servers need to be created in sets of 1, 3, or 5 so that accurate quorum voting can occur. In this cluster, there will be three external lock servers. For initial testing, only the first lock server (hostname lock1) will be configured and started.

You can run system-config-cluster over X11 or from the system console on lock1. Normally, it is best to initially configure from run-level 2 (init [2]), so that services will not automatically start up at reboot before configuration testing is complete. Once a functioning cluster is verified, the system can be switched to either run-level 3 or 5. This configuration example will run the GUI tool remotely using X11.

1.2.1. Verify X11 connectivity


In this example, the remote host where the X windows will appear is called adminws. For X11, xhost + must be executed on adminws from any session running on that system. A shell window on adminws will log in to lock1 via ssh and must have the DISPLAY environment variable set prior to running any X11 application.

lock1 $ export DISPLAY=adminws:0.0

Run xclock to make sure that the X11 clock program appears on the adminws desktop.

Tip
Running X through a firewall often requires you to set the X-forwarding flag (-X) on the ssh command and possibly adjust the .ssh/config file so that ForwardX11 yes is included. Remember to disable this feature once you are preparing to run the 10gCRS installer, as it will need to execute ssh commands between nodes (such as ssh hostname date) that must return only the date string (in this case) and nothing else.
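
For example, the whole X11 check can be driven from adminws with ssh forwarding (a sketch; the Host entry is optional and only shown for persistence):

adminws $ ssh -X lock1 xclock

# ~/.ssh/config on adminws
Host lock1
    ForwardX11 yes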

1.3. Configuring the 1st lock server


1.3.1. Hostnames, Networks and Interfaces

Network Interface Hostname

192.168.1.154 lock1

192.168.2.154 lock1-gulm

192.168.2.54 lock1-ilo

Table 3.2. Network Interface and Hostnames

When using HP iLO power management, there is a fence device for every node in the cluster. This is different from having
one single device that fences all nodes (for example, a Brocade switch). When the node named lock1-gulm is created,
the corresponding fence device will be lock1-ILO. It is mandatory that the iLO network be accessible from the lock1
server. They share the same interface in this example, but this is not mandatory. Typically, there is only one iLO interface
port, so using the same network interface and switch as the CS4 heartbeat does not incur any further failure risk. Full
hardening of iLO is limited by the interface processor’s single port, but the fencing fabric could use bonded NICs on all
servers and a hardened production VLAN.

Although lock server hostnames and RAC node hostnames have a different naming convention, these services share the
same physical interface in this cluster. The hostname conventions are different to emphasize that it is possible to further
separate RAC and GULM networks if required for performance (not reliability). RAC-vip hostnames could also be defined
on a separate physical network so that redundant pathways to the application tiers can be also configured. Hardening is an
availability business requirement and this sample cluster emphasizes a cost-effective balance between availability and
complexity.

1.3.2. Configuring with the GUI tool

1. Run:

lock1 $ sudo /usr/bin/system-config-cluster

2. Click Create New Configuration.


Figure 3.1. Information dialog

3. Click the Grand Unified Lock Manager (GuLM) radio button, and then click OK.

Figure 3.2. Lock Method dialog

4. Highlight Cluster Nodes in the Cluster pane, which will present the Add a Cluster Node button in the Properties
pane. Clicking Add a Cluster Node presents the Node Properties window:


Figure 3.3. Cluster Configuration window

5. Set Quorum Votes to 1, select GuLM Lockserver and click OK.

Figure 3.4. Node Properties dialog

6. Select Fence Devices in the Cluster pane and then click Add a Fence Device:


Figure 3.5. Fence Device Configuration dialog

7. The username and password default for iLO systems is set at the factory to admin/admin. The hostname for the
iLO interface is lock1-ilo. The name of the fence device is lock1-ILO. Click OK.


Figure 3.6. Cluster Configuration window

8. The fence device needs to be linked to the node in order to activate fencing.


Figure 3.7. Fence Configuration window

9. Click Add a New Fence Level:


Figure 3.8. Fence Configuration window: Adding a new fence level

10. Click Add a New Fence to this Level.

Figure 3.9. Fence Properties dialog

11. After you click OK and close this window, the main window now shows that lock1-gulm is associated with the fence
level. The Properties pane should convey this by the message, "one fence in one fence level."


Figure 3.10. Fence Configuration window

12. Click Close.


Figure 3.11. Cluster Configuration window

13. Now save this configuration and then test that this single GULM lock server can start up. Save the file into /etc/cluster/cluster.conf using the File->Save menu option in the Cluster Configuration tool.


Figure 3.12. Cluster Configuration tool

Figure 3.13. Cluster Configuration tool Confirmation pop-up

The /etc/cluster/cluster.conf file now contains one master GULM lock server, and it is now possible to exit the GUI. Once the first lock server is running, restarting the GUI tool will permit the Cluster Management tab to be selected, and the server should appear in this display.

1.3.3. After the GUI configuration


There is an additional configuration file for GULM that needs to be changed to be consistent with the /etc/cluster/cluster.conf file that was just created. This is the /etc/sysconfig/cluster file.


#
# Node-private
#
GULM_OPTS="--name lock1-gulm --cluster alpha_cluster --use_ccs"

Make sure that the --name parameter is the same name as the cluster node that was chosen in the GUI tool. The default for this file is to use the server hostname, but this causes GULM to run over the public network interface. GULM traffic in this cluster will run over a private network that corresponds to the hostname lock1-gulm. If the cluster name was changed in the Cluster Properties pane in the GUI, then the --cluster parameter must be changed to match that value. If these values do not exactly match, then GULM will not start up successfully.
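
A quick consistency check (a sketch; the exact attribute layout of cluster.conf may differ on your release) is to compare the two files side by side:

lock1 $ grep name= /etc/cluster/cluster.conf | head -1
lock1 $ grep GULM_OPTS /etc/sysconfig/cluster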

1.3.4. Testing first GULM lock server


Because there is only one defined lock server in the configuration, quorum will be attained as soon as this lock server starts successfully. When the configuration is changed to support three external GULM lock servers, a minimum of two must start successfully before the cluster is quorate and operations on the cluster can proceed. The advantage of configuring the GULM servers first is that it ensures there will be a quorate cluster when the other Oracle RAC nodes are being defined and transitioned into the cluster.

Note
Quorate (or Inquorate) is the term used in the system log to indicate the presence (or absence) of a GULM lock
server quorum. Without quorum, no access to the storage is permitted.

Tip
Set the default run level in /etc/inittab to 2, so that when you transition to init 3 or init 5, you can do it from a system that is running and accessible (tail -f /var/log/messages for debugging).

1.3.4.1. Initial Test setup

1. Open two terminal windows on lock1: one for typing commands and one for running tail -f /var/log/messages.

Note
Making the /var/log/messages file visible to the user oracle or orainstall will make the procedure
easier. As these users need to read /var/log/messages much more frequently in a RAC environment,
providing group read permission is recommended, at least during the install.

2. Because lock1 is running in run-level 2, ccsd and lock_gulmd will not have been started accidentally (for instance, if you had to reboot the server after installing the RPMs). Start up the ccsd process:

lock1 $ sudo service ccsd start

lock1 ccsd[7960]: Starting ccsd 1.0.3:


lock1 ccsd[7960]: Built: Jan 25 2006 16:54:43
lock1 ccsd[7960]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
lock1 ccsd: startup succeeded

3. Start up the GULM lock process:

$ sudo service lock_gulmd start

lock1 lock_gulmd_main[5507]: Forked lock_gulmd_core.


lock1 lock_gulmd_core[5509]: Starting lock_gulmd_core 1.0.6. (built Feb 13 2006 15:08:25
lock1 lock_gulmd_core[5509]: I am running in Standard mode.
lock1 lock_gulmd_core[5509]: I am (lock1-gulm) with ip (::ffff:192.168.2.154)
lock1 lock_gulmd_core[5509]: This is cluster alpha_cluster
lock1 lock_gulmd_core[5509]: I see no Masters, So I am becoming the Master.
lock1 lock_gulmd_core[5509]: Could not send quorum update to slave lock1-gulm


lock1 lock_gulmd_core[5509]: New generation of server state. (1145002309185059)


lock1 lock_gulmd_core[5509]: EOF on xdr (Magma::5479 ::1 idx:1 fd:6)
lock1 lock_gulmd_main[5507]: Forked lock_gulmd_LT.
lock1 lock_gulmd_LT[5513]: Starting lock_gulmd_LT 1.0.6. (built Feb 13 2006 15:08:25) Co
lock1 lock_gulmd_LT[5513]: I am running in Standard mode.
lock1 lock_gulmd_LT[5513]: I am (lock1-gulm) with ip (::ffff:192.168.2.154)
lock1 lock_gulmd_LT[5513]: This is cluster alpha_cluster
lock1 lock_gulmd_core[5509]: EOF on xdr (Magma::5479 ::1 idx:2 fd:7)
lock1 lock_gulmd_main[5507]: Forked lock_gulmd_LTPX.
lock1 lock_gulmd_LTPX[5519]: Starting lock_gulmd_LTPX 1.0.6. (built Feb 13 2006 15:08:25
lock1 lock_gulmd_LTPX[5519]: I am running in Standard mode.
lock1 lock_gulmd_LTPX[5519]: I am (lock1-gulm) with ip (::ffff:192.168.2.154)
lock1 lock_gulmd_LTPX[5519]: This is cluster alpha_cluster
lock1 lock_gulmd_LTPX[5519]: New Master at lock1-gulm ::ffff:192.168.2.154
lock1 lock_gulmd_LT000[5513]: New Client: idx 2 fd 7 from lock1-gulm ::ffff:192.168.2.15
lock1 lock_gulmd_LTPX[5519]: Logged into LT000 at lock1-gulm ::ffff:192.168.2.154
lock1 lock_gulmd_LTPX[5519]: Finished resending to LT000
lock1 ccsd[5478]: Connected to cluster infrastructure via: GuLM Plugin v1.0.3
lock1 ccsd[5478]: Initial status:: Quorate
lock1 lock_gulmd: startup succeeded

Because rgmanager was installed on this node, the clustat utility can be used to verify the status of this lock manager:

$ sudo clustat

Member Name Status


------ ---- ------
lock1-gulm Online, Local, rgmanager
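
The GULM master can also be queried directly with gulm_tool, which ships in the gulm RPM (a sketch; check gulm_tool -h for the subcommands available on your release):

lock1 $ sudo gulm_tool nodelist lock1-gulm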

Restart the GUI:

lock1 $ sudo system-config-cluster

Click on the window tab Cluster Management:


Figure 3.14. Cluster Configuration window: Cluster Management

1.3.5. Configuring the remaining GULM lock servers


This involves repeating the steps for the first server and following the naming convention set up for the hostnames. When these steps are complete, the current configuration should show three nodes and three fence devices.


Figure 3.15. Cluster Configuration window

Save the new configuration!


It is critical that you save this new configuration in the GUI tool before returning to the shell session that you
used to start up the 1st lock server. The GUI tool can remain running.

1.3.6. After the GUI configuration: for other lock servers


The /etc/cluster/cluster.conf and /etc/sysconfig/cluster files need to be manually copied to the other lock servers during this bootstrap process. It will be easier to debug this step if both lock2 and lock3 are also at init [2] and all the CS4 components have been installed. If the RPM installation was not performed on these nodes (as per Section 1.1, “Installing CS4 RPMs”), then do this now.

1.3.6.1. Propagate and localize the configuration files


Manually copy the current /etc/cluster/cluster.conf:

# scp /etc/cluster/cluster.conf lock2:/etc/cluster


# scp /etc/cluster/cluster.conf lock3:/etc/cluster

Localize the /etc/sysconfig/cluster file on each respective node:

#
# Node-private
#
GULM_OPTS="--name lock2-gulm --cluster alpha_cluster --use_ccs"

#
# Node-private
#
GULM_OPTS="--name lock3-gulm --cluster alpha_cluster --use_ccs"

1.3.6.2. Test startup the lock servers


Repeating the steps on lock2 and lock3, both ccsd and lock_gulmd need to be started and checked. Each node should have a window with tail -f /var/log/messages. This example only shows the lock_gulmd progress. If ccsd does not start up, it is usually because of parsing errors in the /etc/cluster/cluster.conf file. The file was created correctly, but it could fail if the /etc/hosts file does not contain all the referenced hostnames. Copying the files from a working lock server (including the /etc/hosts file) reduces the risk of parsing errors.

$ sudo service ccsd start


$ sudo service lock_gulmd start

lock2 lock_gulmd_LTPX[3380]: I am running in Standard mode.


lock2 lock_gulmd_LTPX[3380]: I am (lock2-gulm) with ip (::ffff:192.168.2.155)
lock2 lock_gulmd_LTPX[3380]: This is cluster alpha_cluster
lock2 lock_gulmd_LTPX[3380]: New Master at lock1-gulm ::ffff:192.168.2.154
lock2 lock_gulmd_LTPX[3380]: Logged into LT000 at lock1-gulm ::ffff:192.168.2.154
lock2 lock_gulmd_LTPX[3380]: Finished resending to LT000
lock2 ccsd[3340]: Connected to cluster infrastructure via: GuLM Plugin v1.0.3

Verify the status of the cluster using clustat:

lock1 $ clustat

Member Status: Quorate


Member Name Status
------ ---- ------
lock1-gulm Online, Local, rgmanager
lock2-gulm Online, rgmanager

The GUI Cluster Management tab should be the same:


Figure 3.16. Cluster Configuration window: Cluster Management

Note
The Send to Cluster button in the upper right-hand corner of the Cluster Configuration tab will send the current configuration only to nodes that have cluster status shown above. Because each node is brought up one at a time during initial test and setup, this feature is not initially useful. Once the cluster is completely up and running and all the nodes are in the cluster, this option is an effective way to distribute changes to /etc/cluster/cluster.conf.

1.3.7. Adding the Four RAC nodes and their Fence Devices
The steps for adding the four RAC nodes are identical to those for adding the lock servers. The only difference is the hostname convention. GULM and RAC heartbeat share the same physical interface, but the hostname convention indicates that these networks could be physically separated. The -priv suffix is a RAC hostname convention. During the Add a Cluster Node step, use this hostname convention and do not check the GuLM Lockserver box.

Figure 3.17. Node Properties dialog

Fence devices follow the same naming convention and the only difference is that the rac1-ILO fence device is associated with rac1-priv.


Figure 3.18. Cluster Configuration window

The completed configuration should show all seven cluster nodes and fence devices.

Managed Resources
This section does not need to be configured, as it is the responsibility of the Oracle Clusterware to manage the
RAC database resources. This section is reserved for CS4 hot-standby non-RAC configurations.


1.3.8. Post GUI configuration for other lock servers


The /etc/cluster/cluster.conf and /etc/sysconfig/cluster files need to be manually copied to the other db nodes during this bootstrap process. It will be easier to debug this step if all db nodes are also at init [2] and all the CS4 components have been installed. If the RPM installation was not performed on these nodes (as per Section 1.1, “Installing CS4 RPMs”), then do this now.

1.3.8.1. Propagate and localize the configuration files

1. Manually copy the current /etc/cluster/cluster.conf:

# scp /etc/cluster/cluster.conf rac1:/etc/cluster


# scp /etc/cluster/cluster.conf rac2:/etc/cluster
# scp /etc/cluster/cluster.conf rac3:/etc/cluster
# scp /etc/cluster/cluster.conf rac4:/etc/cluster

2. Localize the /etc/sysconfig/cluster file on each respective node:

#
# Node-private
#
GULM_OPTS="--name rac1-priv --cluster alpha_cluster --use_ccs"

#
# Node-private
#
GULM_OPTS="--name rac2-priv --cluster alpha_cluster --use_ccs"

#
# Node-private
#
GULM_OPTS="--name rac3-priv --cluster alpha_cluster --use_ccs"

#
# Node-private
#
GULM_OPTS="--name rac4-priv --cluster alpha_cluster --use_ccs"

1.3.8.2. Test startup the lock servers


ccsd and lock_gulmd need to be started and verified on every node. Each node should have a window with tail -f /var/log/messages:

$ sudo service ccsd start

rac1 ccsd[2520]: Starting ccsd 1.0.3:


rac1 ccsd[2520]: Built: Jan 25 2006 16:54:43
rac1 ccsd[2520]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
rac1 ccsd: startup succeeded
rac1 ccsd[2520]: cluster.conf (cluster name = alpha_cluster, version = 10) found.
rac1 ccsd[2520]: Unable to perform sendto: Cannot assign requested address
rac1 ccsd[2520]: Remote copy of cluster.conf is from quorate node.
rac1 ccsd[2520]: Local version # : 10
rac1 ccsd[2520]: Remote version #: 10
rac1 ccsd[2520]: Connected to cluster infrastructure via: GuLM Plugin v1.0.3
rac1 ccsd[2520]: Initial status:: Quorate
rac1 ccsd[2520]: Cluster is quorate. Allowing connections.


$ sudo service lock_gulmd start

rac1 lock_gulmd_main[2573]: Forked lock_gulmd_LT.


rac1 lock_gulmd_LT[2583]: Starting lock_gulmd_LT 1.0.6. (built Feb 13 2006 15:08:25) Copyri
rac1 lock_gulmd_LT[2583]: I am running in Fail-over mode.
rac1 lock_gulmd_LT[2583]: I am (rac1-priv) with ip (::ffff:192.168.2.150)
rac1 lock_gulmd_LT[2583]: This is cluster alpha_cluster
rac1 lock_gulmd_LT000[2583]: Not serving locks from this node.
rac1 lock_gulmd_core[2579]: EOF on xdr (Magma::2521 ::1 idx:2 fd:7)
rac1 lock_gulmd_main[2573]: Forked lock_gulmd_LTPX.
rac1 lock_gulmd_LTPX[2588]: Starting lock_gulmd_LTPX 1.0.6. (built Feb 13 2006 15:08:25) Co
rac1 lock_gulmd_LTPX[2588]: I am running in Fail-over mode.
rac1 lock_gulmd_LTPX[2588]: I am (rac1-priv) with ip (::ffff:192.168.2.150)
rac1 lock_gulmd_LTPX[2588]: This is cluster alpha_cluster
rac1 lock_gulmd_LTPX[2588]: New Master at lock1-gulm ::ffff:192.168.2.154
rac1 lock_gulmd_LTPX[2588]: Logged into LT000 at lock1-gulm ::ffff:192.168.2.154
rac1 lock_gulmd_LTPX[2588]: Finished resending to LT000
rac1 lock_gulmd: startup succeeded

Verify that all nodes are in the cluster:

rac1 $ clustat

Member Status: Quorate


Member Name Status
------ ---- ------
rac1-priv Online, Local, rgmanager
rac3-priv Online, rgmanager
lock2-gulm Online, rgmanager
rac4-priv Online, rgmanager
rac2-priv Online, rgmanager
lock1-gulm Online, rgmanager

1.3.9. Operational Considerations


The system should now be capable of supporting LVM2 volumes. The next step is to install and configure the Cluster LVM2 software and to allocate the logical volumes for use by GFS. Once the GFS volumes are created and mounted, the Oracle installation can commence.

1.3.9.1. Clean Shutdown


A clean shutdown may be required for maintenance or other reasons. A GULM cluster will not shut down cleanly once it has lost quorum. The process of stopping the last two lock servers needs to be coordinated so that a clean shutdown of both of these remaining lock servers is possible.

Shutting down the last two nodes that hold quorum will cause the cluster to become inquorate, and all activity on the cluster is then blocked, including the ability to proceed with a normal shutdown. These steps will ensure a clean shutdown and only apply to the last two lock servers that are holding quorum. All other nodes should shut down normally.

Although clvmd has not been configured yet in this guide, it is a lock server client, so this procedure assumes it is running.

1. Remove existing lock server clients from each of the remaining nodes holding quorum:

lock2 $ sudo service rgmanager stop


lock2 $ sudo service clvmd stop
lock1 $ sudo service rgmanager stop
lock1 $ sudo service clvmd stop

2. Stop the GULM lock manager on lock2. This will cause the cluster to become inquorate at this point.


lock2 $ sudo service lock_gulmd stop

3. The remaining GULM lock manager must be shut down using the gulm_tool utility. The output in /var/log/messages shows that the core shuts down cleanly.

lock1 $ sudo gulm_tool shutdown lock1

lock1 lock_gulmd_core[3100]: Arbitrating Node Is Logging Out NOW!


lock1 lock_gulmd_LTPX[3110]: finished.
lock1 lock_gulmd_core[3100]: finished.
lock1 lock_gulmd_LT000[3105]: Core is shutting down.
lock1 lock_gulmd_LT000[3105]: finished.

Once this is complete, both servers can be shut down cleanly, since ccsd will then be able to terminate normally.
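
For reference, the same ordered sequence can be driven from an administrative host. The following is only a sketch; it assumes the node names used in this guide and that the administrative user can ssh to the lock servers and run these services through sudo:

# Sketch: clean shutdown of the last two GULM lock servers, in order
for node in lock2 lock1; do
    ssh $node "sudo service rgmanager stop; sudo service clvmd stop"
done
ssh lock2 "sudo service lock_gulmd stop"
ssh lock1 "sudo gulm_tool shutdown lock1"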

Chapter 4. Installing Clustered Logical
Volume Manager (CLVM)
1. Installing CLVM components
CLVM requires only one RPM, lvm2-cluster-2.02.01-1.2.RHEL4.x86_64.rpm, and this must be installed
on all seven nodes. The lock servers must be able to see the cluster logical volumes, but do not need to mount the GFS
volumes.
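
The installation itself is one rpm command per node; a minimal sketch using the RPM named above (repeat on each of the seven nodes):

rac1 $ sudo rpm -Uhv lvm2-cluster-2.02.01-1.2.RHEL4.x86_64.rpm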

Warning
ALL GULM lock servers must have access to all of the shared storage and must have CLVM installed, configured, and running.
2. Configuring CLVMD
The cluster volume manager process clvmd needs to be configured and started on every node in the cluster. The configura-
tion file for clvmd is /etc/lvm/lvm.conf.

The default configuration of this file should be set up for clustered operation. The locking_type and locking_library parameters should be verified. The library_dir parameter will be "/usr/lib" for a 32-bit installation and "/usr/lib64" for a 64-bit installation such as this one:

# Miscellaneous global LVM2 settings


global {
library_dir = "/usr/lib64"
locking_library = "liblvm2clusterlock.so"
… < text deleted > …
locking_type = 2
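
A quick way to confirm the three relevant parameters without opening an editor is a simple grep; this is only a convenience check:

rac1 $ grep -E 'library_dir|locking_library|locking_type' /etc/lvm/lvm.conf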

3. Start up CLVMD
Run:

lock1 $ sudo service clvmd start

lock1 clvmd: Activating VGs: succeeded

There are no physical volumes or volume groups (VGs) defined at this time; otherwise, the console output would list all activated volumes it could find.

4. Repeat Installation and Configuration for All Nodes
All lock servers and RAC node members of the cluster must have clvmd running. Repeat the previous steps on the remaining nodes.
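
If passwordless ssh is available for an administrative user, the remaining nodes can be started in one pass. This is only a sketch; the RAC node names are from this guide and the lock server hostnames are assumptions that should be adjusted to match your cluster:

# Sketch: start clvmd on the remaining cluster members
for node in rac2 rac3 rac4 lock1 lock2; do
    ssh $node sudo service clvmd start
done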

Chapter 5. Creating the Physical and
Logical Volumes
Logical volumes, once created and visible to members of the cluster, will appear to RHEL as block device entries. Logical volumes can optionally appear as raw devices using the rawdevices service in RHEL. Both block and raw logical volumes will be used to install RAC. Physical and logical volumes can be initialized and configured using the command line or the GUI-based system-config-lvm.

1. Physical Storage Allocation
The storage array was configured to present a series of physical LUNs that will be initialized and configured by CLVM.

• One 6GB LUN for Shared Oracle Home

• One 24GB LUN for datafiles and indexes

• Four 4GB LUNs for Redo logs and Undo tablespaces

Warning
The current version of the GUI tool will not prevent the initialization and addition of local physical volumes to
cluster volumes. In this sample cluster, all media handled by LVM2 must be on shared storage for Oracle RAC to
function. Do not configure any node-local storage from any of the 7 nodes until it is known that this is resolved in
subsequent releases of the GUI tool.

For example, /dev/hda is visible to CLVM and is a local physical device (the boot disk), but it remains uninitialized by CLVM. Only initialize the shared storage intended for use with Oracle RAC.

2. Initialize and Configure Volumes


All shared physical LUNs need to be initialized as physical volumes by CLVM before volume groups and volumes can be
created. The administration tool system-config-lvm can be executed from any node in the cluster that has success-
fully installed, configured and started clvmd.

2.1. Verify X11 connectivity


In this example, the remote hostname where the X windows will appear is called adminws. For X11, xhost + must be executed on adminws from any session running on this system. A shell window on adminws will log in to lock1 and must have the DISPLAY environment variable set either at login or in a profile.

1. Run:

lock1 $ export DISPLAY=adminws:0.0

2. Run xclock, to make sure that the X11 clock program appears on the adminws desktop.

Tip
Running X through a firewall often requires you to pass the -X flag to the ssh command and possibly adjust the ~/.ssh/config file so that ForwardX11 yes is included. Remember to disable this feature once you are preparing to run the 10g CRS installer, as it needs to execute ssh commands between nodes (such as ssh date) that return only the date string (in this case) and nothing else.

The GUI tool starts and the initial screen appears.

3. Run:

rac1 $ sudo system-config-lvm


Figure 5.1. Logical Volume Management window: Uninitialized disk

2.2. Initialize the Shared Home volume group

Figure 5.2. Logical Volume Management window: Unallocated physical volume


1. Identify and highlight the 6GB raw LUN that will be listed under the Uninitialized Entries and then click Initialize
Entry. The Properties pane can be used to verify the size and other characteristics.

2. Create a new volume group on this physical volume by selecting Partition 1 and then clicking Create new Volume Group. Although volume groups may consist of multiple physical LUNs, these LUNs were created on a storage array that implements both striping and mirroring. The physical extent size of 128k is a reasonable average extent size for most Oracle databases.

Figure 5.3. New Volume Group dialog

Note
Oracle has its own extent strategy, and the policies used on tablespaces factor more directly into performance than the LVM extent size. A good general practice for tablespaces that hold tables (using the create tablespace command) is a 1M extent, while indexes can match the LVM extent size of 128K.

Figure 5.4. Logical Volume Management window: Logical view

The volume group common is a 6GB volume that will contain the Oracle shared home installation of both Cluster-
ware and the RDBMS.


3. Highlight the Logical View for the volume group common and then click on Create New Logical Volume

Figure 5.5. Create New Logical Volume dialog

4. There are two ways to consume free space: click the Use remaining button or slide the allocation bar all the way to
the right.

Warning
Do NOT define any filesystems during logical volume creation. The GUI assumes that the number of nodes currently connected to the cluster is the number of lock journals that will be required. This cluster is being incrementally installed, so that value would be wrong. GFS volumes will be created later using the mkfs.gfs command.

5. Click OK and the main screen should look like this:


Figure 5.6. Logical Volume Management window: Logical view of ohome

...and the newly created logical volume will have a block device file name that will be used by the mkfs.gfs call and in /etc/fstab.

6. Run:

lock1 # ls -l /dev/common/

1 root root 24 May 3 18:11 ohome -> /dev/mapper/common-ohome
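
The same result can be obtained without the GUI. The following is only a command-line sketch; the device name /dev/sdX is a placeholder for the actual 6GB shared LUN, which must be substituted:

# Sketch: initialize the LUN, build the volume group with 128k extents, and use all free extents
rac1 $ sudo pvcreate /dev/sdX
rac1 $ sudo vgcreate -s 128k common /dev/sdX
rac1 $ sudo vgdisplay common | grep Free           # note the free extent count
rac1 $ sudo lvcreate -l <free_extent_count> -n ohome common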

2.3. Create the 1st redo volume group and logical volume
There are four 4GB physical LUNs that need to be initialized as physical volumes, volume groups, and then logical volumes. Each one corresponds to the redo/undo GFS volume for one RAC node. The steps used to create this logical volume show only the first redo volume; they need to be repeated for the remaining three 4GB physical volumes.

1. Initialize the physical LUN /dev/sdb. The physical LUN /dev/sda is reserved for the Oracle Clusterware files
and must not be initialized by CLVM.


Figure 5.7. Logical Volume Management window: Unallocated physical volume

2. Create the volume group redo1. Click on Create New Volume Group. Verify that the extent size is 128 KB.

Figure 5.8. New Volume Group dialog

3. Create the logical volume log1 by clicking on Logical View of redo1 and then clicking Create New Logical
Volume.


Figure 5.9. Logical Volume Management window: Creating a new logical volume

Use the entire volume when creating log1. The default unit is Extents; you can use this or set it to Gigabytes. Since this logical volume will consume the entire volume group, leaving the units as Extents and then clicking Use remaining is the easiest approach.


Figure 5.10. Create New Logical Volume dialog

4. Verify in the Properties pane that the pathname to this file is /dev/redo1/log1 and that the full 4GB was used to create this logical volume. Click OK.


Figure 5.11. Logical Volume Management window: Logical volume log1

2.4. Create the remaining redo groups and volumes


Repeat the process in the previous section for the remaining 4GB physical LUNs. Once complete, the main window should
show all redo logical volumes and the common volume that will be used for the Oracle shared home.

Figure 5.12. Logical Volume Management window


2.5. Create the main datafiles logical volume


All of the main tablespaces for the entire RAC database will reside on this logical volume. Each node has a dedicated
redo/undo volume to minimize any unnecessary sharing of cluster resources during normal operations.

In this view, the completed logical volume layout highlights the CRS, shared home and datafiles logical volumes

Figure 5.13. Logical Volume Management window

Chapter 6. GFS
1. Installing GFS components
Most installations (including this sample cluster) will have more than one CPU, so the SMP-compatible kernel modules are required. This system is a 64-bit SMP system, so the actual package names reflect the type of RHEL kernel that is running (RHEL4 Update 3). Install GFS on all four RAC nodes.

rac1 $ sudo rpm -Uhv GFS-6.1.5-0.x86_64.rpm GFS-kernel-smp-2.6.9-49.1.x86_64.rpm

Preparing... ########################################### [100%]


1:GFS-kernel-smp ########################################### [ 50%]
2:GFS ########################################### [100%]

Note
GFS does not need to be installed on the GULM lock server nodes.

This process installs only the GFS module required by the specific kernel that is being used (64-bit, SMP). All variants of the GFS kernel module may be installed, but only the version that matches the running RHEL kernel is required.

2. Create the GFS volumes


2.1. Verify the logical volumes
Because GFS is installed only on the RAC servers, these steps must be performed on one of them. It does not matter which
one, but if you are installing Oracle Clusterware and RDBMS, it is recommended that you choose node 1. Verify that the
node can see logical volumes.

Run:

rac1 $ sudo lvscan

ACTIVE '/dev/oradata/datafiles' [48.00 GB] inherit


ACTIVE '/dev/redo4/log4' [4.00 GB] inherit
ACTIVE '/dev/common/ohome' [5.50 GB] inherit
ACTIVE '/dev/redo3/log3' [4.00 GB] inherit
ACTIVE '/dev/redo2/log2' [4.00 GB] inherit
ACTIVE '/dev/redo1/log1' [4.00 GB] inherit

2.2. Create the filesystems


The command-line utility mkfs.gfs will be used to create each volume, since some non-default values are required to create suitable Oracle GFS volumes.

Run:

rac1 $ mkfs.gfs -h

mkfs.gfs [options] <device>


Options:
-b <bytes> Filesystem block size
-D Enable debugging code
-h Print this help, then exit
-J <MB> Size of journals
-j <num> Number of journals
-O Do not ask for confirmation
-p <name> Name of the locking protocol
-q Do not print anything
-r <MB> Resource Group Size
-s <blocks> Journal segment size
-t <name> Name of the lock table
-V Print program version information, then exit

Option Value Notes

-j 4 One for each RAC node. GULM lock servers do not run
GFS.

-J 32MB Oracle maintains the integrity of its filesystem with its own
journals or redo logs. All database files are opened
O_DIRECT (bypassing the RHEL buffer cache and the
need to use GFS journals). Additionally, redo logs are
opened O_SYNC.

-p lock_gulm The chosen locking protocol.

-t alpha_cluster:log1 The cluster name (from /etc/cluster/cluster.conf) and the logical volume being initialized.

<device> /dev/redo1/log1 The block-mode logical-device name.

Table 6.1. mkfs.gfs options used to create the Oracle GFS volumes

Run:

rac1 $ sudo mkfs.gfs -J 32 -j 4 -p lock_gulm -t alpha_cluster:log1 /dev/redo1/log1

This will destroy any data on /dev/redo1/log1.


Are you sure you want to proceed? [y/n] y
Device: /dev/redo1/log1
Blocksize: 4096
Filesystem Size: 1015600
Journals: 4
Resource Groups: 16
Locking Protocol: lock_gulm
Lock Table: alpha_cluster:log1
Syncing...
All Done

Run:

mkfs.gfs -J 32 -j 4 -p lock_gulm -t alpha_cluster:log2 /dev/redo2/log2


mkfs.gfs -J 32 -j 4 -p lock_gulm -t alpha_cluster:log3 /dev/redo3/log3
mkfs.gfs -J 32 -j 4 -p lock_gulm -t alpha_cluster:log4 /dev/redo4/log4
mkfs.gfs -J 32 -j 4 -p lock_gulm -t alpha_cluster:ohome /dev/common/ohome
mkfs.gfs -J 32 -j 4 -p lock_gulm -t alpha_cluster:datafiles /dev/oradata/datafiles

2.3. /etc/fstab entries


/dev/common/ohome /mnt/ohome gfs _netdev 0 0

/dev/oradata/datafiles /mnt/datafiles gfs _netdev 0 0


/dev/redo1/log1 /mnt/log1 gfs _netdev 0 0
/dev/redo2/log2 /mnt/log2 gfs _netdev 0 0
/dev/redo3/log3 /mnt/log3 gfs _netdev 0 0
/dev/redo4/log4 /mnt/log4 gfs _netdev 0 0

The _netdev option is also useful as it ensures the filesystems are unmounted before cluster services shut down. Copy this section of the /etc/fstab file to the other nodes in the cluster. These volumes are mounted under /mnt, and the corresponding mount directories need to be created on every node, as sketched below.
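
A minimal sketch of that per-node step, using the mount points from the fstab entries above:

rac1 $ sudo mkdir -p /mnt/ohome /mnt/datafiles /mnt/log1 /mnt/log2 /mnt/log3 /mnt/log4
rac1 $ sudo mount -a          # mounts the GFS entries listed in /etc/fstab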

Filesystem 1K-blocks Used Available Use% Mounted on


/dev/mapper/redo1-log1 4062624 20 4062604 1% /mnt/log1
/dev/mapper/redo2-log2 4062368 20 4062348 1% /mnt/log2
/dev/mapper/redo3-log3 4062624 20 4062604 1% /mnt/log3
/dev/mapper/redo4-log4 4062368 20 4062348 1% /mnt/log4
/dev/mapper/common-ohome 6159232 20 6159212 1% /mnt/ohome
/dev/mapper/oradata-datafiles 50193856 40 50193816 1% /mnt/datafiles

Chapter 7. Oracle 10gR2 Clusterware
1. Installing Oracle 10gR2 Clusterware (formerly
10gR1 CRS)
Although this is documented in Oracle install manuals, in metalink notes, and elsewhere, it is consolidated here, so that
this manual can be used as the main reference for a successful installation. A good supplementary Oracle article for doing
RAC installations can be found here:

http://www.oracle.com/technology/pub/articles/smiley_rac10g_install.html

1.1. RHEL Preparation


All four RAC nodes need to be up and running, and in the CS4 cluster. All GFS volumes that will be used for this Oracle
install should be mounted on all four nodes. At a minimum, the GFS volume (/mnt/ohome) that will contain the shared
installation must be mounted:

Filesystem 1K-blocks Used Available Use% Mounted on


/dev/mapper/redo1-log1 4062624 20 4062604 1% /mnt/log1
/dev/mapper/redo2-log2 4062368 20 4062348 1% /mnt/log2
/dev/mapper/redo3-log3 4062624 20 4062604 1% /mnt/log3
/dev/mapper/redo4-log4 4062368 20 4062348 1% /mnt/log4
/dev/mapper/common-ohome 6159232 20 6159212 1% /mnt/ohome
/dev/mapper/oradata-datafiles 50193856 40 50193816 1% /mnt/datafiles

1.1.1. Map the shared raw partitions to RHEL rawdevices


The certified version of Oracle 10g on GFS requires that the two Clusterware files be located on shared raw partitions and be visible to all RAC nodes in the cluster. The GULM lock server nodes do not need access to these files. These partitions are usually located on a small LUN that is not used for other purposes.

The LUN /dev/sda should be large enough to create two 256MB partitions. Using fdisk on /dev/sda, create two primary partitions:

rac1 # fdisk /dev/sda

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content will not be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): p
Disk /dev/sda: 536 MB, 536870912 bytes
17 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 1037 * 512 = 530944 bytes
Device Boot Start End Blocks Id System
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1011, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1011, default 1011): +256M
Command (m for help): p


Disk /dev/sda: 536 MB, 536870912 bytes


17 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 1037 * 512 = 530944 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 483 250405 83 Linux
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (484-1011, default 484):
Using default value 484
Last cylinder or +size or +sizeM or +sizeK (484-1011, default 1011):
Using default value 1011
Command (m for help): p
Disk /dev/sda: 536 MB, 536870912 bytes
17 heads, 61 sectors/track, 1011 cylinders
Units = cylinders of 1037 * 512 = 530944 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 483 250405 83 Linux
/dev/sda2 484 1011 273768 83 Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

If the other nodes were already up and running while you created these partitions, those nodes must re-read the partition table from disk (blockdev --rereadpt /dev/sda).

Make sure the rawdevices service is enabled on all four RAC nodes for the run level that will be used. This example enables it for run levels 3 and 5. Run:

rac1 # chkconfig --level 35 rawdevices on

The mapping occurs in the file /etc/sysconfig/rawdevices:

# raw device bindings


# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw1 /dev/sda1
/dev/raw/raw2 /dev/sda2

These raw device files must always be owned by the user that installs the Oracle software (oracle). A 10-second delay is needed to ensure that the rawdevices service has a chance to configure the /dev/raw directory. Add these lines to the /etc/rc.local file, which is symbolically linked to /etc/rc?.d/S99local.

echo "Sleep a bit first and then set the permissions on raw"
sleep 10
chown oracle:dba /dev/raw/raw?
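
The bindings and ownership can be spot-checked after restarting the service by hand; a short verification sketch:

rac1 $ sudo service rawdevices restart
rac1 $ sudo raw -qa                     # should show the /dev/raw/raw1 and /dev/raw/raw2 bindings
rac1 $ ls -l /dev/raw/                  # confirm oracle:dba ownership (re-run the chown if needed)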

Note
If, after you install Clusterware, you see a set of three /tmp/crsctl.<pid> trace files, then Clusterware did not start and there will be an error message in these files, usually complaining about permissions. Make sure the /dev/raw/raw? files are owned by the oracle user (in this example, oracle:dba).


1.1.2. Configure /etc/sysctl.conf


All four RAC nodes should have the same settings.

#
# Oracle specific settings
# x86 Huge Pages are 2MB
#
#vm.hugetlb_pool = 3000
#
kernel.shmmax = 4047483648
kernel.shmmni = 4096
kernel.shmall = 1051168
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 1024 65000
fs.file-max = 65536
#
# This is for Oracle RAC core GCS services
#
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 1048576
net.core.wmem_max = 1048576

The parameter that most often needs to be modified to support larger SGAs is the shared memory setting kernel.shmmax. Typically 75% of the memory in a node should be allocated to the SGA. This assumes a modest number of Oracle foreground processes, which consume physical memory for the PGA (Oracle Process Global Area). The PGA is typically used for sorting. On a 4GB system, a 3GB SGA is recommended. The amount of memory consumed by the SGA and the PGA is very workload-dependent.
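
After editing /etc/sysctl.conf, the values can be applied and spot-checked without a reboot; a quick sketch:

rac1 $ sudo sysctl -p                   # load the settings from /etc/sysctl.conf
rac1 $ sysctl kernel.shmmax             # confirm the new shared memory limit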

Note
The maximum size of the SGA on a 64-bit version of RHEL4 is currently slightly less than 128GB. The maximum size of the SGA on a 32-bit version of RHEL4 varies a bit. The standard size is 1.7GB. If the oracle binary is lower mapped, this maximum can be increased to 2.5GB on SMP kernels and 3.7GB on HUGEMEM kernels. Lower mapping is an Oracle-approved linking technique that changes the address where the SGA attaches in the user address space. When it is lowered, there is more space available for attaching a larger shared memory segment. See Metalink Doc 260152.1.

Another strategy for extending the SGA to 8GB and higher in a 32-bit environment is through the use of the /dev/shm filesystem, although this is not recommended. If you need this much SGA, then using the 64-bit versions of Oracle and RHEL4 is a better strategy.

The net.core.* parameters establish the UDP buffers that will be used by the Oracle Global Cache Services (GCS) for
heartbeats and inter-node communication (including the movement of Oracle buffers). For large SGAs (more than 16GB),
the use of HugeTLBs is recommended.

Tip
TLBs, or Translation Lookaside Buffers, are the working end of a Page Table Entry (PTE). The hardware speaks in physical addresses, whereas processes running in user mode use only process virtual addresses (PVAs), including for the SGA. These addresses have to be translated, and modern CPUs provide some TLB register space so that during memory loads the translation does not cause extra memory references.

By default, a page table entry on x86 hardware maps 4K. When configuring a large SGA (16GB or more), roughly 4,000,000 4K PTEs (or TLB slots) are required just to map the SGA into the user's process space. HugeTLBs are a mechanism in RHEL that permits the use of 2MB hardware pages, which reduces the number of PTEs required to map the SGA. The performance improvements increase with the size of the SGA and can be between 10% and 30%.

During RHEL installation, 4GB of swap was set up and the Oracle Installer will check for this minimum.

1.1.3. Create the oracle user


You have to create a user (typically oracle or oinstall). The user name is somewhat arbitrary, but the DBAs might insist that it be one of these two. However, the group must be dba. Configure the /etc/sudoers file so that oracle admin users can safely execute root commands, which is required during and after the install (a short user-creation sketch follows the sudoers example):


# User alias specification


User_Alias SYSAD=oracle, oinstall
User_Alias USERADM=oracle, oinstall
# User privilege specification
SYSAD ALL=(ALL) ALL
USERADM ALL=(root) NOPASSWD:/usr/local/etc/yanis.client
root ALL=(ALL) ALL
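
The account and group themselves can be created with the standard tools. This is only a sketch; the UID and GID values are arbitrary assumptions, but whatever values are chosen must be identical on all nodes because the Oracle home and admin files live on shared GFS:

# Sketch: create the dba group and the oracle user with matching IDs on every node
rac1 $ sudo groupadd -g 500 dba
rac1 $ sudo useradd -u 500 -g dba -m oracle
rac1 $ sudo passwd oracle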

1.1.4. Create a Clean ssh Connection Environment
You have to ensure that whenever Clusterware talks to other nodes in the cluster, the ssh commands proceed unimpeded and without extraneous session dialog. In order to ensure that all connection pathways are set up, run:

rac1 $ ssh rac2 date

Wed May 10 21:48:02 PDT 2006

The command should return only the date string and not any extra prompts or dialog, such as:

rac1 $ ssh rac2 date

oracle@rac2's password:
OR
The authenticity of host 'rac2 (192.168.1.151)' can't be established.
RSA key fingerprint is 48:e5:e0:84:63:62:03:84:c7:57:05:6b:58:7d:12:07.
Are you sure you want to continue connecting (yes/no)?

Create a ~/.ssh/authorized_keys file, distribute it to all four nodes, and then execute ssh hostname date to every host in the RAC cluster, in all combinations over both the primary and heartbeat interfaces. If you miss any one of them, the Oracle Clusterware installer will fail at the node verification step.

On rac1, log in as the oracle user and make sure $HOME/.ssh is empty. Do not supply a passphrase for the keygen command; just press Return. Run:

rac1 $ ssh-keygen -t dsa

Generating public/private dsa key pair.


Enter file in which to save the key (/home/oracle/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
9e:98:88:5c:17:bc:1f:dc:05:33:21:cf:04:99:23:e1 oracle@rac1

Repeat this step on all four RAC nodes (it is not required on the GULM lock servers), collect all the ~/.ssh/id_dsa.pub files into one ~/.ssh/authorized_keys file and distribute it to the other three nodes:

ssh rac2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


ssh rac3 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh rac4 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys rac2:~/.ssh
scp ~/.ssh/authorized_keys rac3:~/.ssh
scp ~/.ssh/authorized_keys rac4:~/.ssh


Run all combinations from all nodes for both PUBLIC and PRIVATE networks (including the node where you are currently executing):

rac1 $ ssh rac1 date


rac1 $ ssh rac1-priv date
rac1 $ ssh rac2 date
rac1 $ ssh rac2-priv date
rac1 $ ssh rac3 date
rac1 $ ssh rac3-priv date
rac1 $ ssh rac4 date
rac1 $ ssh rac4-priv date
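
A small loop run from each of the four RAC nodes exercises the same matrix and also seeds ~/.ssh/known_hosts; this is just a convenience sketch (answer yes to any first-time host key prompts so that later runs are clean):

for h in rac1 rac1-priv rac2 rac2-priv rac3 rac3-priv rac4 rac4-priv; do
    ssh $h date
done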

1.1.5. Download Oracle Installers


Download the Clusterware and Database installation materials from OTN (Oracle Technology Network), as this is where the current base releases for all platforms are located. These are gzipped cpio files. Create a local installer directory on node 1 (/home/oracle/inst) and then expand the archives:

gunzip -c 10201_clusterware_linux_x86_64.cpio.gz | cpio -ivdm &>log1 &

gunzip -c 10201_database_linux_x86_64.cpio.gz | cpio -ivdm &>log2 &

The installer can be run from any filesystem mounted on node1.

1.1.6. Create shared home directories


Clusterware can be installed on each node locally, or on the shared home. This is a production maintenance decision. A single shared Clusterware home is clearly less complex, but requires the entire cluster to shut down when you do a Clusterware upgrade. Node-local Clusterware gives you the ability to do rolling upgrades, but with some added maintenance costs. This sample cluster will perform a single shared Clusterware install, so the directories should be created and owned prior to running the installer:

rac1 $ sudo mkdir /mnt/ohome/oracle


rac1 $ sudo chown oracle:dba /mnt/ohome/oracle

1.1.7. Verify X11 connectivity


In this example, the remote hostname where the X windows will appear is called adminws. For X11, xhost + must be executed on adminws from any session running on this system. A shell window on adminws will log in to rac1 and must have the DISPLAY environment variable set either at login or in a profile:

rac1 $ export DISPLAY=adminws:0.0

Run xclock, to make sure that the X11 clock program appears on the adminws desktop.

Although you can have ORACLE_BASE and ORACLE_HOME pre-set in the oracle user profile prior to running the installer, it is not mandatory. In our case, they are set to point to the shared Oracle home location, which is a 6GB GFS volume. The installer will detect these values if they are set:

export ORACLE_BASE=/mnt/ohome/oracle/1010
export ORACLE_HOME=/mnt/ohome/oracle/1010/product/db

1.1.8. Clusterware rootpre.sh


The script /home/oracle/inst/clusterware/rootpre/rootpre.sh checks to see if a previous version of
Clusterware has been installed. Once this script executes successfully, then it is safe to start up the Clusterware installer:

/home/oracle/inst/clusterware/runInstaller


********************************************************************************
Please run the script rootpre.sh as root on all machines/nodes. The script can be found at
Answer 'y' if root has run 'rootpre.sh' so you can proceed with Oracle Clusterware installa
Answer 'n' to abort installation and then ask root to run 'rootpre.sh'.
********************************************************************************
Has 'rootpre.sh' been run by root? [y/n] (n)
y
Starting Oracle Universal Installer...

Figure 7.1. Oracle Universal Installer: Welcome window


Figure 7.2. Oracle Universal Installer: Specify Inventory Directory window

Verify that $ORACLE_BASE/oraInventory is located on the shared GFS volume (/mnt/ohome). If you want an inventory on each node for CRS or the RDBMS, you would need to type in a node-local directory (/opt/oracle/1010/oraInventory), but you have to ensure the directory is created and owned by the oracle user before you click Next.


Figure 7.3. Oracle Universal Installer: Specify Home Details window

This screen’s default path will need to be changed, as it wants to put the CRSHOME in ORACLE_HOME. This install is a
single, shared CRS install, so the path is on the shared GFS volume. The name was simplified to just crs. Click Next.

The prerequisite checks run, and since we have done our preparation work in the file /etc/sysctl.conf, we expect no errors or warnings.


Figure 7.4. Oracle Universal Installer: Prerequisite Checks window

Click Next.


Figure 7.5. Oracle Universal Installer: Specify Cluster Configuration window

Click Next.

Next, the other three nodes need to be added to the cluster configuration. All of these hosts must be defined in /
etc/hosts on all nodes.


Figure 7.6. Modify a Node dialog

Click OK.

The completed configuration screen should contain all four nodes.

Figure 7.7. Oracle Universal Installer: Specify Cluster Configuration window

Click Next.

This is the step that fails if any part of the ssh hostname date set up was not performed correctly.

If /etc/hosts, ~/.ssh/authorized_keys, and ~/.ssh/known_hosts are all properly set up, the installer should proceed to the next screen. Fully qualified hostnames can sometimes cause confusion, so the public network hostnames entered into the Clusterware installer must match the string returned by the hostname command. Otherwise, go back and verify the entire matrix of ssh hostname date calls to make sure all these paths are clean. Often the self-referential ones are missed, such as ssh rac1 date run from rac1 itself.


Figure 7.8. Oracle Universal Installer: Specify Network Interface Usage window

Edit the eth0 fabric and change the interface type to Public and click Next.


Figure 7.9. Edit Private Interconnect Type dialog

Click OK.


Figure 7.10. Oracle Universal Installer: Specify OCR Location window

Click Next.

Assign the quorum voting and registry files. The option external redundancy is chosen as the files reside on a storage array
that implements redundancy.


Figure 7.11. Oracle Universal Installer: Specify Voting Disk Location window

The quorum vote disk will be located on /dev/raw/raw2. Once again, external redundancy is chosen. Click Next.


Figure 7.12. Oracle Universal Installer: Summary window

The next screen is the Install Summary screen. Click Install.

The installer starts to install, link and copy. This process typically takes less than 10 minutes depending on the perform-
ance of the CPU and the filesystem.


Figure 7.13. Execute Configuration Scripts dialog

This screen prompts for 2 sets of scripts to be run on all 4 nodes. Run the orainstRoot.sh script first on each node, in
order.

rac1 $ sudo /mnt/ohome/oracle/1010/oraInventory/orainstRoot.sh

Password:
Changing permissions of /mnt/ohome/oracle/1010/oraInventory to 770.
Changing groupname of /mnt/ohome/oracle/1010/oraInventory to dba.
The execution of the script is complete

1.1.9. Instantiating Clusterware


The script /mnt/ohome/oracle/1010/product/crs/root.sh must be run on every node, one at a time, start-
ing with rac1. You must wait until this script completes successfully on a given node before you can execute it on the
next node. This script can take several minutes to complete per node, so be patient. This script will initialize the files, con-
figure RHEL to run the Oracle Clusterware kernel services (including appending services to /etc/inittab) and then
start these services up. Only the first execution of this script will initialize the registry and quorum disk files.

rac1 $ sudo /mnt/ohome/oracle/1010/product/crs/root.sh


WARNING: directory '/mnt/ohome/oracle/1010/product' is not owned by root


WARNING: directory '/mnt/ohome/oracle/1010' is not owned by root
WARNING: directory '/mnt/ohome/oracle is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/mnt/ohome/oracle/1010/product' is not owned by root
WARNING: directory '/mnt/ohome/oracle/1010' is not owned by root
WARNING: directory '/mnt/ohome/oracle is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node FIXME
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
node 3: rac3 rac3-priv rac3
node 4: rac4 rac4-priv rac4
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
CSS is inactive on these nodes.
rac2
rac3
rac4
Local node checking complete.

Run /mnt/ohome/oracle/1010/product/crs/root.sh on the remaining nodes. As this script executes on the other nodes, the last few lines should change to indicate that more nodes are active. These last few lines are from the command crsctl check install:

CSS is active on these nodes.


rac1
rac2
CSS is inactive on these nodes.
rac3
rac4
Local node checking complete.

If successful, the completion of the script on the fourth node should indicate that CSS is running on all nodes

CSS is active on these nodes.


rac1
rac2
rac3
rac4
CSS is active on all nodes.

Return to the main installer screen and click OK. Most of the verification and installation checks should pass.


Figure 7.14. Oracle Universal Installer: Configuration Assistants window

Figure 7.15. Warning dialog

If not, or if this pop-up occurs, it is likely that the CRS application registration failed to start up. This is usually caused by the tool not being found in the path, but it can be fixed by running the vipca utility from rac1 once you quit the installer. Click OK in the pop-up and Next on the Configuration Assistants screen.


Figure 7.16. Oracle Universal Installer: End of Installation window

The crs_stat command will display any registered CRS resources. There are currently none, so the vipca utility needs to be executed next.

rac1 $ crs_stat -t

CRS-0202: No resources are registered.

1.1.10. Registering Clusterware resources with VIPCA


The environment variable $CRS_HOME should be added to the oracle user profile, and vipca must be run as root.

rac1 $ export CRS_HOME=/mnt/ohome/oracle/1010/product/crs


rac1 $ sudo $CRS_HOME/bin/vipca


Figure 7.17. VIP Configuration Assistant: Welcome window

Click Next on this window and the next one. Then the hostnames mapping window appears:


Figure 7.18. VIP Configuration Assistant: Virtual IPs for Cluster Nodes window

Fill in the first IP Alias name and press Tab. The tool should fill in the rest.


Figure 7.19. VIP Configuration Assistant: Virtual IPs for Cluster Nodes window

Click Next and a summary screen appears. Click OK.


Figure 7.20. VIP Configuration Assistant: Progress dialog

The final window should be:


Figure 7.21. Configuration Results window

Click Exit and then rerun the status command.

rac1 $ crs_stat -t

Name Type Target State Host


------------------------------------------------------------
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora.rac3.gsd application ONLINE ONLINE rac3
ora.rac3.ons application ONLINE ONLINE rac3
ora.rac3.vip application ONLINE ONLINE rac3
ora.rac4.gsd application ONLINE ONLINE rac4


ora.rac4.ons application ONLINE ONLINE rac4


ora.rac4.vip application ONLINE ONLINE rac4

Chapter 8. Installing Oracle 10gR2
Enterprise Edition Database
Installing the database requires you to run the Oracle Installer once more for the database-specific components. These components include configuring and registering the Oracle SQL*Net tnslsnr process. This process needs to run on each RAC node and should be registered with Oracle Clusterware. The final step is actually creating a database from one node of the cluster.

1. RHEL Preparation
Oracle 10gR2 binaries are now shipped with a dependency on the RHEL asynchronous I/O library (libaio). This library needs to be installed prior to running the Installer, or the linking phase of the install process will fail. In this sample cluster it is not necessary to verify whether the library is present, because it was not part of the selected package set. However, if you want to check prior to the install whether it is already installed:

rac1 $ rpm -qa | grep libaio

libaio-0.3.105-2

Otherwise, download or locate the libaio package for your architecture (libaio-0.3.105-2 on this RHEL4 Update 3 system) and install it now:

rac1 $ sudo rpm -Uhv libaio-0.3.105-2.x86_64.rpm

2. Oracle 10gR2 RDBMS Installation


You probably unzipped the database installer’s files when you unzipped the Oracle Clusterware install directories. If not,
do that now:

gunzip -c 10201_database_linux_x86_64.cpio.gz | cpio -ivdm &>log2 &

This installer is also X-based, so if you choose to run it remotely, a correct setup can be verified using xclock. Setting these environment variables is optional prior to running the installer, but they become mandatory once the product is installed, so these two should be put into the appropriate shell profile for the user (in this case, oracle):

export ORACLE_BASE=/mnt/ohome/oracle/1010
export ORACLE_HOME=/mnt/ohome/oracle/1010/product/db

The installer is located at the top of the install tree. The database installer files are located in the same directory as the Or-
acle Clusterware install directories, which is /home/oracle/inst. To start the installer, execute this shell script:

/home/oracle/inst/database/runInstaller

Starting Oracle Universal Installer...


Checking installer requirements...
Checking operating system version: must be redhat-3, SuSE-9, redhat-4, UnitedLinux-1.0, asi
Passed
All installer requirements met.
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2006-03-26_08-28-09PM. P


Copyright (C) 1999, 2005, Oracle. All rights reserved.

Figure 8.1. Oracle Universal Installer: Welcome dialog

After an initial installer splash screen, this screen appears. Click Next.


Figure 8.2. Oracle Universal Installer: Select Installation dialog

Select Enterprise Edition (1.60GB) and click Next.


Figure 8.3. Oracle Universal Installer: Specify Home Details dialog

Choose a simple Name (as in the example above) and verify the Path. The installer should have extracted the path from the
environment variable:

export ORACLE_HOME=/mnt/ohome/oracle/1010/product/db

Click Next.


Figure 8.4. Oracle Universal Installer: Specify Hardware Cluster Installation Mode window

Because this is a Shared Home install, leave only rac1 checked in the Specify Hardware Cluster Installation Mode
window.

The Product-Specific Prerequisite Checks may fail due to some of the installer best-practice minimums not being met. It sometimes makes sense to at least review the Warnings to see whether they are a legitimate concern. Often they are not, as in this case.


Figure 8.5. Oracle Universal Installer: Product-Specific Prerequisite Checks window

In this case, there were zero requirements to be verified; this appears between the two panes in the window. Click Next.


Figure 8.6. Oracle Universal Installer: Select Configuration Option window

Select Install database Software only and then click Next.


Figure 8.7. Oracle Universal Installer: Summary window

This is the Summary screen. Only one node appears in the Cluster Nodes section because this is a shared home install. A lot of files need to be copied, processed, and linked, and it is during this process that you find out whether the libaio package was not installed. Click Install.


Figure 8.8. Oracle Universal Installer: Install window

This will take about 10 minutes—depending on the speed of your server.


Figure 8.9. Execute Configuration Scripts dialog

Once complete, a new window will open up on top of the Install screen asking for a script to be executed. Run:

$ sudo ./root.sh

Running Oracle10 root.sh script...


The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME= /mnt/ohome/oracle/1010/product/db
Enter the full pathname of the local bin directory: [/usr/local/bin]:
Copying dbhome to /usr/local/bin ...
Copying oraenv to /usr/local/bin ...
Copying coraenv to /usr/local/bin ...
Creating /etc/oratab file...Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created

Once this script has completed, click OK, which returns processing to the Install screen and eventually to the final End of
Installation screen. Click Exit.


Figure 8.10. Oracle Universal Installer: End of Installation window

3. Oracle SQL*Net Configuration


The SQL*Net configuration can be automatically set up using the netca utility that was installed in the previous step. If $ORACLE_HOME/bin was not added to the PATH in your profile, you must specify the absolute pathname:

/mnt/ohome/oracle/1010/product/db/bin/netca


Figure 8.11. Oracle Net Configuration Assistant: RAC Configuration window

Click Next.


Figure 8.12. Oracle Net Configuration Assistant: RAC Active Nodes window

Verify that these are the correct node names (they should be) and also that they are all selected before clicking Next.

Figure 8.13. Oracle Net Configuration Assistant: Welcome window

Click Next.


Figure 8.14. Oracle Net Configuration Assistant: Listener Configuration, Listener window

Click Next.


Figure 8.15. Oracle Net Configuration Assistant: Listener Configuration, Listener Name
window

Click Next.

Figure 8.16. Oracle Net Configuration Assistant: Listener Configuration, Select Protocols
window

Click Next.


Figure 8.17. Oracle Net Configuration Assistant: Listener Configuration, TCP/IP Protocol
window

Do not use the standard port number of 1521. Use another port number that is supplied to you by Network Operations. Access Control Lists (ACLs) in modern switches block most port numbers, so the value assigned must be on the switch's ACL or clients will not be able to connect to the database. Even if this is not the case, it is still best not to choose the default; choose your birth year, perhaps. It is rare, but some database applications still assume 1521 is the listener port and will not be able to connect. Check the application's network configuration documentation to determine how to set it up with the correct port number. Click Next.


Figure 8.18. Oracle Net Configuration Assistant: Listener Configuration Done window

Click Next and then Finish at the next window, which will exit the netca.

By checking the output of the Clusterware command crs_stat -t, you can verify that the listeners are now registered with Cluster services and will automatically restart when Clusterware restarts (which is usually when the machine is rebooted). Run:

crs_stat -t

Name Type Target State Host


------------------------------------------------------------
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora....C3.lsnr application ONLINE ONLINE rac3
ora.rac3.gsd application ONLINE ONLINE rac3
ora.rac3.ons application ONLINE ONLINE rac3
ora.rac3.vip application ONLINE ONLINE rac3
ora....C4.lsnr application ONLINE ONLINE rac4
ora.rac4.gsd application ONLINE ONLINE rac4
ora.rac4.ons application ONLINE ONLINE rac4
ora.rac4.vip application ONLINE ONLINE rac4

After the database is created, it can also be registered with Clusterware. Once the database and its four instances are registered, Clusterware will automatically start the database instance on a given node when that node reboots. This Clusterware registration step must be performed after the database has been successfully created, as sketched below.
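
Once the database described in the next chapter exists, the registration can be performed with the srvctl utility. This is only a sketch; it assumes the database name rhel and the instance names rhel1 through rhel4 used later in this guide:

# Sketch: register the database and its four instances with Clusterware, then start it
rac1 $ srvctl add database -d rhel -o /mnt/ohome/oracle/1010/product/db
rac1 $ srvctl add instance -d rhel -i rhel1 -n rac1
rac1 $ srvctl add instance -d rhel -i rhel2 -n rac2
rac1 $ srvctl add instance -d rhel -i rhel3 -n rac3
rac1 $ srvctl add instance -d rhel -i rhel4 -n rac4
rac1 $ srvctl start database -d rhel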

Chapter 9. Creating a Database
1. Database File Layout
A four-node Oracle RAC database is actually only one database, but has four instances, one running on each node. An Or-
acle instance consists of shared memory (Shared Global Area or SGA) and Oracle background processes. Oracle functions
like an operating system that has a transactional file system with a buffer cache and journaling (redo logs). Each instance
shares access to the database files, but maintains an instance-specific set of redo logs. Although these logs are instance-
private, they must also be shared and visible so that any other node can perform RAC instance recovery.

Oracle file I/O falls into two usage categories: database files and transaction files. Database files (and the sparse TEMP
files) contain database blocks, which hold the user’s data. The I/O profile of these files is either small random reads and
writes or large sequential reads and writes, depending on the SQL application.

Transaction files (redo and undo) are typically small, very low-latency sequential writes and reads. The transaction files
are instance-specific and have an I/O profile distinct from the datafiles.

1.1. Oracle Datafiles


The largest GFS volume in this configuration is dedicated to the main portion of the database for both datafiles and in-
dexes. This volume can hold a single tablespace for everything, several small tablespaces, or whatever arrangement meets
your needs.

All volumes in our sample cluster are evenly striped across all spindles. This is called the SAME (Stripe and Mirror Everything) strategy, which avoids most I/O tuning problems. The problem you are going to have is the one everyone faces: not enough IOPS or spindles. If you have to expand the array by adding more spindles, it must be capable of adding the performance capacity of these new spindles to existing GFS volumes. Most modern storage arrays do this, but it is usually considered an advanced activity for the storage administrator.

1.2. Redo and Undo


The redo logs and undo tablespaces for each instance are contained on separate GFS volumes. The I/O to this volume is al-
most always from one instance, and this strategy minimizes cluster-wide contention for these GFS volumes. Each instance
has three 512MB redo logs and their size remains static for the lifetime of the database (unless manually altered by a
DBA). These logs are large enough and numerous enough to ensure sufficient committed transaction throughput while
avoiding processing stalls due to hard checkpoints.

There is also an undo tablespace for each instance. For example, when a table's zip code column is updated, all the values as they existed prior to the UPDATE statement are stored in the undo tablespace. If the transaction issues a commit, the undo contents are discarded. If the transaction issues a rollback, the undo contents are retrieved and the table is put back to its original pre-update state. The undo tablespaces are initially 64MB, but are allowed to expand up to 2GB. In our example, a single UPDATE statement covering hundreds of millions of zip code rows could be supported with 2GB of available undo. It is unlikely that the redo logs would ever need to be expanded, but undo requirements may exceed 2GB. If that is the case, the redo/undo volumes could be created as 8GB instead of 4GB.

2. Setup and Scripts


Many customers have experience with the Oracle GUI Database Configuration Assistant; however, walking through a simple creation example sheds some light on how a basic database is created and why certain sizes are chosen when the defaults are overridden. Although Oracle is a very complex product, this example tries to be simple, yet useful. There is no reason you cannot use any other tool to create and manage the database once you have read (or skipped) this section.

2.1. Environment Variables


Define these variables for the oracle user on every node. Only the $ORACLE_INSTANCE variable needs to be changed
for a given node.

export ORACLE_BASE=/mnt/ohome/oracle/1010

export ORACLE_HOME=/mnt/ohome/oracle/1010/product/db

export ORACLE_BASE_SID=rhel


export ORACLE_INSTANCE=1

export ORACLE_SID=$ORACLE_BASE_SID$ORACLE_INSTANCE

export ADMHOME=/mnt/ohome/oracle/admin

export PATH=$ORACLE_HOME/bin:$CRS_HOME/bin:$PATH

export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH

$ADMHOME is a variable defined in addition to the standard list above; it refers to the location of all of the critical administration files for the RAC cluster. This admin directory must be on the shared home GFS volume. Some aliases you can't live without are listed below; they help you navigate around both $ADMHOME and $ORACLE_HOME, including a simple alias named dba that starts sqlplus and connects as sysdba with a minimum of typing:

alias dba="sqlplus '/ as sysdba'"


alias pfile='cd $ADMHOME/$ORACLE_BASE_SID/pfile'
alias create='cd $ADMHOME/$ORACLE_BASE_SID/create'
alias trc='cd $ADMHOME/$ORACLE_BASE_SID/bdump/$ORACLE_SID'
alias utrc='cd $ADMHOME/$ORACLE_BASE_SID/udump'
alias hm='cd $ORACLE_HOME'
alias lib='cd $ORACLE_HOME/lib'
alias bin='cd $ORACLE_HOME/bin'
alias dbs='cd $ORACLE_HOME/dbs'
alias rlib='cd $ORACLE_HOME/rdbms/lib'
alias adm='cd $ORACLE_HOME/rdbms/admin'
alias net='cd $ORACLE_HOME/network/admin'
alias lsn='lsnrctl'

2.2. Installing a shared init.ora configuration


The init.ora file is Oracle’s equivalent to the /etc/sysctl.conf file and must contain a minimum of settings.

Note
Do not recycle an init.ora from some previous released version of Oracle. Oracle releases change enough so
that old init.ora settings become irrelevant or even counter-productive.

Each instance can have its own private init.ora, but it is not required and creates more problems than it
solves. A single init.ora can be customized to contain instance-specific parameters.

Some DBAs will require the use of the SPFILE feature in Oracle, which stores a copy of the init.ora inside
the database. This is often a production policy preference, but this example uses text mode, despite the risk that
somebody could delete or corrupt it. Both the init.ora and controlfile should be regularly archived or backed
up with something as simple as a cronjob.

The actual init.ora must be kept in a common location in the shared home admin directory for parameter files ($ADMHOME/$ORACLE_BASE_SID/pfile). All instances must have access to it and to other admin-related files. The Oracle SQL*Plus startup command assumes that the instance's init.ora file is located in $ORACLE_HOME/dbs, so four symbolic links must be created. This example shows how the environment variables factor into locating the key files:

rac1 $ ln -s $ADMHOME/$ORACLE_BASE_SID/pfile/initrhel.ora initrhel1.ora

rac1 $ ln -s $ADMHOME/$ORACLE_BASE_SID/pfile/initrhel.ora initrhel2.ora

rac1 $ ln -s $ADMHOME/$ORACLE_BASE_SID/pfile/initrhel.ora initrhel3.ora

rac1 $ ln -s $ADMHOME/$ORACLE_BASE_SID/pfile/initrhel.ora initrhel4.ora

rac1 $ ls -l init*.ora

initrhel1.ora -> /mnt/ohome/oracle/admin/rhel/pfile/initrhel.ora


initrhel2.ora -> /mnt/ohome/oracle/admin/rhel/pfile/initrhel.ora
initrhel3.ora -> /mnt/ohome/oracle/admin/rhel/pfile/initrhel.ora
initrhel4.ora -> /mnt/ohome/oracle/admin/rhel/pfile/initrhel.ora


2.3. Sample Init.ora


This is the sample init.ora that will be used by all four instances to run this RAC cluster.

control_files='/mnt/ohome/oracle/admin/rhel/ctl/control01.ctl', '/mnt/oradata/oracle/ctl/co
*.db_name = 'rhel'
*.db_block_size = 8192
#
# SGA Sizing
*.sga_target = 3300M
#
# File I/O
filesystemio_options = setall
#
# Network and Listeners
rhel1.local_listener = listener_rac1
rhel2.local_listener = listener_rac2
rhel3.local_listener = listener_rac3
rhel4.local_listener = listener_rac4
#
# Undo
*.undo_management = 'AUTO'
rhel1.undo_tablespace = 'UNDOTBS1'
rhel2.undo_tablespace = 'UNDOTBS2'
rhel3.undo_tablespace = 'UNDOTBS3'
rhel4.undo_tablespace = 'UNDOTBS4'
#
# Foreground and Background Dump Destinations
rhel1.background_dump_dest ='/mnt/ohome/oracle/admin/rhel/bdump/rhel1'
rhel2.background_dump_dest ='/mnt/ohome/oracle/admin/rhel/bdump/rhel2'
rhel3.background_dump_dest ='/mnt/ohome/oracle/admin/rhel/bdump/rhel3'
rhel4.background_dump_dest ='/mnt/ohome/oracle/admin/rhel/bdump/rhel4'
*.core_dump_dest ='/mnt/ohome/oracle/admin/rhel/cdump'
*.user_dump_dest ='/mnt/ohome/oracle/admin/rhel/udump'
#
# RAC Identification
*.cluster_database_instances= 4
*.cluster_database = FALSE # FALSE ONLY for database create phase
rhel1.thread = 1
rhel2.thread = 2
rhel3.thread = 3
rhel4.thread = 4
rhel1.instance_name = rhel1
rhel2.instance_name = rhel2
rhel3.instance_name = rhel3
rhel4.instance_name = rhel4
rhel1.instance_number = 1
rhel2.instance_number = 2
rhel3.instance_number = 3
rhel4.instance_number = 4

2.4. Detailed Parameter Descriptions


When defaults are over-ridden, it is usually because they are mandatory parameters that need to be set and are specific to
this database or to these instances. A simple example of this is the db_name parameter; if it is not defined, no instances
will start up.

Some parameters are optional, but are considered best practice.

In a few cases, some parameters are "optionally mandatory". You do not have to set an optionally mandatory parameter, but if you do not, your system will not work effectively. An example of such parameters is setting the size of the SGA buffers and pools.

2.4.1. control_files (mandatory)


Warning
You can never have enough backup controlfiles; if you lose all copies, your database is effectively GONE. A single point of failure in an Oracle RAC database can be the loss of all controlfiles.

If the controlfiles are in the same directory as the datafiles, there can be a slight performance impact on create or extend
operations. This is easily avoided by putting the controlfiles in a sub-directory underneath the datafiles. All controlfiles
need to be on shared media for cluster recovery. Create a cron job that regularly backs up the controlfile contents and
emails them to a safe location.
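
One low-effort safeguard is a nightly cron job that dumps the controlfile creation SQL to a trace file and mails it off the
cluster. The following is only a sketch: the script name, schedule, and mail recipient are assumptions for illustration, and
the udump path is taken from the sample init.ora below.

#!/bin/sh
# backup_controlfile.sh (hypothetical name) -- run from cron on one node only
sqlplus -s /nolog << EOF
connect / as sysdba
alter database backup controlfile to trace;
exit
EOF
# mail the most recent trace file from the shared udump directory
TRC=$(ls -t /mnt/ohome/oracle/admin/rhel/udump/*.trc | head -1)
mail -s "rhel controlfile trace" dba@example.com < "$TRC"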

2.4.2. db_name (mandatory)


This must match the database name used in the CREATE DATABASE statement in the create database script described
below.

2.4.3. db_block_size (mandatory)


On modern systems, set this value to at least 8K. For 64-bit systems where you plan to have more than 8GB of SGA, set it
to 16K. Most 64-bit ports of Oracle support block sizes up to 32KB; that size has long been common for data warehouses
and is now becoming common in large-memory 64-bit deployments as well.

2.4.4. sga_target (optionally mandatory)


This parameter does a reasonable job of replacing db_cache_size, shared_pool_size, and
large_pool_size. Setting these to appropriate values is essential to the basic functionality of the instance. If you want
to set the individual parameters by hand, db_cache_size and shared_pool_size are the two most important. The
db_cache_size parameter controls the size of the DB block buffer cache (typically about 80% of the SGA). The
shared_pool_size parameter controls the size of the shared pool, which mostly contains parsed SQL statements (cursors)
and their execution plans.

The *. at the beginning of a parameter indicates that the value applies to all instances that use this init.ora. If the
instance on node rac4 could accommodate an SGA of roughly 5GB, then an instance-specific entry might look like:

rhel4.sga_target = 5072M

2.4.5. File I/O (optional)


*.filesystemio_options=setall (for AsyncIO and DirectIO)

*.filesystemio_options=directIO

*.filesystemio_options=asyncIO

This parameter enables DirectIO, AsyncIO, or both. DirectIO bypasses the GFS filesystem buffer cache, which prevents
data from being cached twice (once by Oracle and once by the filesystem) and provides near-raw performance. AsyncIO
increases I/O performance further still, but its benefit is usually only seen at very high I/O rates. The setall option (both
DirectIO and AsyncIO) is recommended for RHEL4.3 and higher.
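
Once an instance is running, the value actually in effect can be confirmed from SQL*Plus; for example (it should report
SETALL if the setall option was used):

rac1 $ sqlplus << EOF
connect / as sysdba
show parameter filesystemio_options
exit
EOF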

2.4.6. Network and Listeners (mandatory)


Each instance needs to have a SQL*Net listener (tnslsnr) defined. Running the netca utility defines all the corresponding
values in the listener.ora file. In order for applications to connect to this cluster, a tnsnames.ora alias should be
created that specifies a list of listener end-points. The optional load_balance=on parameter randomly assigns new
connections to any of the listed addresses. It is not very dynamic, so depending on the cluster architecture this might not
be the appropriate strategy; plan accordingly. failover=on is the default for address lists and only applies at connect
time. More sophisticated connection management requires the use of a new Oracle 10gR2 feature called FAN (Fast
Application Notification).

rhel =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (LOAD_BALANCE=OFF)
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac1)(PORT = 1921))
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac2)(PORT = 1921))
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac3)(PORT = 1921))
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac4)(PORT = 1921))
    )
    (CONNECT_DATA=(SERVICE_NAME=rhel))
  )
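
For comparison, a client-side load-balancing variant of the same alias might look like the following sketch. The alias
name rhel_lb is hypothetical; the hosts and port are taken from the entry above:

rhel_lb =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (LOAD_BALANCE=ON)
      (FAILOVER=ON)
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac1)(PORT = 1921))
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac2)(PORT = 1921))
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac3)(PORT = 1921))
      (ADDRESS = (PROTOCOL = tcp)(HOST = rac4)(PORT = 1921))
    )
    (CONNECT_DATA=(SERVICE_NAME=rhel))
  )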


If you forget this step, then you should see these error messages when attempting to start an instance:

ORA-00119: invalid specification for system parameter LOCAL_LISTENER


ORA-00132: syntax error or unresolved network name 'listener_rac1'
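
These errors mean the local_listener name could not be resolved. One way to resolve it, assuming the VIP hostnames and
port used elsewhere in this example, is a tnsnames.ora entry along these lines (with corresponding entries for
listener_rac2 through listener_rac4):

listener_rac1 =
  (ADDRESS = (PROTOCOL = tcp)(HOST = rac1-vip)(PORT = 1921))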

2.4.7. Undo (mandatory)


Each instance needs its own undo tablespace, and undo management is set to AUTO.

2.4.8. Foreground and Background dump destinations (mandatory)


This is where all the trace files and alert logs can be found. The background dump files go in the $ADMHOME/bdump
sub-directory, but in this example they are further separated by instance so that locating instance-specific trace files is
easier. The core dump and user trace file locations are common across all four nodes, but could also be separated into
instance-specific subdirectories. Make sure these directories are created before you attempt to start up an instance, or you
might see these errors:

ORA-00444: background process "LMD0" failed while starting


ORA-07446: sdnfy: bad value '' for parameter.
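
Because /mnt/ohome is shared by all four nodes in this example, the directories only need to be created once; a minimal
sketch using the paths from the sample init.ora:

rac1 $ mkdir -p /mnt/ohome/oracle/admin/rhel/bdump/rhel1 \
                /mnt/ohome/oracle/admin/rhel/bdump/rhel2 \
                /mnt/ohome/oracle/admin/rhel/bdump/rhel3 \
                /mnt/ohome/oracle/admin/rhel/bdump/rhel4 \
                /mnt/ohome/oracle/admin/rhel/cdump \
                /mnt/ohome/oracle/admin/rhel/udump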

2.4.9. Cluster_database_instances (mandatory)


Specifies the maximum number of instances that can mount this database. This value cannot exceed the MAXINSTANCES
parameter that was specified on the SQL CREATE DATABASE statement.

2.4.10. Cluster_database (mandatory)


During database creation, the parameter *.cluster_database needs to be set to FALSE. Once the creation steps are
complete, the first instance must be shut down and this parameter changed to TRUE to enable multi-node operation
before any of the instances can be started.
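
A minimal sketch of that switch-over, assuming the shared pfile location shown earlier (adjust the sed pattern if your
pfile uses different spacing):

rac1 $ sqlplus << EOF
connect / as sysdba
shutdown immediate
exit
EOF
rac1 $ sed -i 's/cluster_database = FALSE/cluster_database = TRUE/' \
        /mnt/ohome/oracle/admin/rhel/pfile/initrhel.ora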

2.4.11. RAC identification (mandatory)


Most of this is basic housekeeping: instance-specific parameters that must be set for each node to start up and be
identified as unique.

2.4.12. Create Database Script and Execution


The alert log is the Oracle equivalent of /var/log/messages and is located in the background dump destination. If
any problems occur that are not written to the console, they will appear in this file; it is the first place to look whenever
something seems wrong with the normal operation of the database or any of the instances. Each instance has its own
alert log, found in the instance-specific bdump directories specified in the init.ora (for example,
rhel1.background_dump_dest). The name is derived from the $ORACLE_SID, such as:

$ADMHOME/$ORACLE_BASE_SID/bdump/$ORACLE_SID/alert_<$ORACLE_SID>.log, for example:

/mnt/ohome/oracle/admin/rhel/bdump/rhel1/alert_rhel1.log
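
To watch an instance's alert log during startup and database creation, for example:

rac1 $ tail -f /mnt/ohome/oracle/admin/rhel/bdump/rhel1/alert_rhel1.log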

If you start up the instance on rac1 with the nomount option, it will create an SGA and start the background processes.
This verifies that it is possible to bring up an instance. Here is a simple script (saved as nomnt in this example), which
can easily be modified to bring the database down instead (shutdown immediate). The script assumes the default
location for the init.ora ($ORACLE_HOME/dbs):

#!/bin/sh
sqlplus << EOF
connect / as sysdba
startup nomount
exit
EOF

rac1 $ nomnt


SQL*Plus: Release 10.2.0.1.0 - Production on Wed Mar 29 12:42:28 2006


Copyright (c) 1982, 2005, Oracle. All rights reserved.
Enter user-name: Enter password:
Connected to an idle instance.
SQL> ORACLE instance started.
Total System Global Area 3221225472 bytes
Fixed Size 2024240 bytes
Variable Size 3154119888 bytes
Database Buffers 50331648 bytes
Redo Buffers 14749696 bytes
ORA-00205: error in identifying control file, check alert log for more info

Because no database has been created yet, there is no valid controlfile and this is a normal error at this early stage.
However, you have verified that an Oracle instance on rac1 can start up. The next step is to actually create the database
using the following script:

spool db_create
STARTUP NOMOUNT
CREATE DATABASE rhel CONTROLFILE REUSE
LOGFILE
GROUP 1 ('/mnt/log1/oracle/logs/redo11.log') SIZE 512M reuse,
GROUP 2 ('/mnt/log1/oracle/logs/redo12.log') SIZE 512M reuse,
GROUP 3 ('/mnt/log1/oracle/logs/redo13.log') SIZE 512M reuse
CHARACTER SET UTF8
NATIONAL CHARACTER SET UTF8
NOARCHIVELOG
MAXINSTANCES 4
MAXLOGFILES 128
MAXLOGMEMBERS 3
MAXLOGHISTORY 10240
MAXDATAFILES 256
DATAFILE '/mnt/oradata/oracle/sys.dbf'
SIZE 256M REUSE EXTENT MANAGEMENT LOCAL
SYSAUX
DATAFILE '/mnt/oradata/oracle/sysaux.dbf'
SIZE 256M REUSE AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED
UNDO TABLESPACE undotbs1
DATAFILE '/mnt/log1/oracle/undo1.dbf'
SIZE 64M REUSE AUTOEXTEND ON NEXT 64M MAXSIZE 2048M
DEFAULT TEMPORARY TABLESPACE temp
TEMPFILE '/mnt/oradata/oracle/temp.dbf'
SIZE 256M REUSE AUTOEXTEND ON NEXT 1024M MAXSIZE UNLIMITED;
rem Make sure that the basic create works and then either re-run
rem the whole thing or paste the rest of it into a sqlplus session
rem
exit;
CREATE UNDO TABLESPACE undotbs2
DATAFILE '/mnt/log2/oracle/undo2.dbf'
SIZE 64M REUSE AUTOEXTEND ON NEXT 64M MAXSIZE 2048M;
CREATE UNDO TABLESPACE undotbs3
DATAFILE '/mnt/log3/oracle/undo3.dbf'
SIZE 64M REUSE AUTOEXTEND ON NEXT 64M MAXSIZE 2048M;
CREATE UNDO TABLESPACE undotbs4
DATAFILE '/mnt/log4/oracle/undo4.dbf'
SIZE 64M REUSE AUTOEXTEND ON NEXT 64M MAXSIZE 2048M;
ALTER DATABASE ADD LOGFILE THREAD 2
GROUP 4 ( '/mnt/log2/oracle/logs/redo21.log' ) SIZE 512M reuse,
GROUP 5 ( '/mnt/log2/oracle/logs/redo22.log' ) SIZE 512M reuse,
GROUP 6 ( '/mnt/log2/oracle/logs/redo23.log' ) SIZE 512M reuse;
ALTER DATABASE ENABLE PUBLIC THREAD 2;
ALTER DATABASE ADD LOGFILE THREAD 3
GROUP 7 ( '/mnt/log3/oracle/logs/redo31.log' ) SIZE 512M reuse,
GROUP 8 ( '/mnt/log3/oracle/logs/redo32.log' ) SIZE 512M reuse,
GROUP 9 ( '/mnt/log3/oracle/logs/redo33.log' ) SIZE 512M reuse;
ALTER DATABASE ENABLE PUBLIC THREAD 3;
ALTER DATABASE ADD LOGFILE THREAD 4
GROUP 10 ( '/mnt/log4/oracle/logs/redo41.log' ) SIZE 512M reuse,
GROUP 11 ( '/mnt/log4/oracle/logs/redo42.log' ) SIZE 512M reuse,
GROUP 12 ( '/mnt/log4/oracle/logs/redo43.log' ) SIZE 512M reuse;
ALTER DATABASE ENABLE PUBLIC THREAD 4;

Run this script, but remember to first create all sub-directories for the logs and the controlfiles, as Oracle will not create
these sub-directories (see the sketch after the build script below). This simple script creates the database by calling the
create.sql script:

#!/bin/bash
set -x
# In sqlplus, @? expands to $ORACLE_HOME, so @?/rdbms/admin/catalog runs $ORACLE_HOME/rdbms/admin/catalog.sql
time sqlplus /nolog << EOF > bld.lst
connect / as sysdba
shutdown abort
@create
@?/rdbms/admin/catalog
EOF
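
As noted above, the directories referenced in create.sql must already exist; a minimal sketch using the paths from the
sample scripts (adjust to your own layout):

rac1 $ mkdir -p /mnt/log1/oracle/logs /mnt/log2/oracle/logs \
                /mnt/log3/oracle/logs /mnt/log4/oracle/logs \
                /mnt/oradata/oracle /mnt/oradata/oracle/ctl \
                /mnt/ohome/oracle/admin/rhel/ctl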

The catalog.sql is a master script that defines the Oracle data dictionary. If these steps are successful, then you have a
single node RAC database running in Exclusive mode. Shut down the database, change *.cluster_database =
TRUE and then start up rac1 again. (Remember, this is still all on rac1). If the listener is running on this node, then the
network status of this node can be verified using the command:

rac1 $ lsnrctl status listener_rac1

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 20-APR-2006 23:15:06


Copyright (c) 1991, 2005, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1921)(IP=FIRST)))
STATUS of the LISTENER
------------------------
Alias LISTENER_RAC1
Version TNSLSNR for Linux: Version 10.2.0.1.0 - Production
Start Date 20-APR-2006 23:12:12
Uptime 0 days 0 hr. 2 min. 54 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /mnt/ohome/oracle/1010/product/db/network/admin/listener.ora
Listener Log File /mnt/ohome/oracle/1010/product/db/network/log/listener_rac1.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.20)(PORT=1921)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.150)(PORT=1921)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "rhel" has 1 instance(s).
Instance "rhel1", status READY, has 3 handler(s) for this service...
Service "rhel_XPT" has 1 instance(s).
Instance "rhel1", status READY, has 3 handler(s) for this service...
The command completed successfully

2.4.13. Registering the database and the instances with Oracle Clusterware

Oracle Clusterware can register both the database and each instance so that it will automatically start an instance when it
detects that the instance is not running. The srvctl utility is used to register the instances for auto-start:


rac1 $ srvctl add database -d rhel -o $ORACLE_HOME


rac1 $ srvctl add instance -d rhel -i rhel1 -n rac1
rac1 $ srvctl add instance -d rhel -i rhel2 -n rac2
rac1 $ srvctl add instance -d rhel -i rhel3 -n rac3
rac1 $ srvctl add instance -d rhel -i rhel4 -n rac4

It also permits the use of srvctl to manually start and stop the instances from one node:

rac1 $ srvctl start instance -d rhel -i rhel1


rac1 $ srvctl start instance -d rhel -i rhel2
rac1 $ srvctl start instance -d rhel -i rhel3
rac1 $ srvctl start instance -d rhel -i rhel4
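
Stopping instances works the same way; for example, to stop a single instance or the whole database:

rac1 $ srvctl stop instance -d rhel -i rhel4
rac1 $ srvctl stop database -d rhel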

This command may be run from any node and it can retrieve the status of any node:

rac1 $ srvctl status nodeapps -n rac3

VIP is running on node: rac3


GSD is running on node: rac3
Listener is running on node: rac3
ONS daemon is running on node: rac3

To get a consolidated cluster-wide status, the $CRS_HOME/bin/crs_stat -t utility can be used. This example
shows all services and instances registered and online. The instances are listed at the end, and their online status
indicates that the database instance is registered with Oracle Clusterware and is running.

rac1 $ crs_stat -t

------------------------------------------------------------
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
ora....C3.lsnr application ONLINE ONLINE rac3
ora.rac3.gsd application ONLINE ONLINE rac3
ora.rac3.ons application ONLINE ONLINE rac3
ora.rac3.vip application ONLINE ONLINE rac3
ora....C4.lsnr application ONLINE ONLINE rac4
ora.rac4.gsd application ONLINE ONLINE rac4
ora.rac4.ons application ONLINE ONLINE rac4
ora.rac4.vip application ONLINE ONLINE rac4
ora.rhel.db application ONLINE ONLINE rac2
ora....l1.inst application ONLINE ONLINE rac1
ora....l2.inst application ONLINE ONLINE rac2
ora....l3.inst application ONLINE ONLINE rac3
ora....l4.inst application ONLINE ONLINE rac4

A corresponding network status inquiry from rac1 shows the presence of services for all four nodes:

rac1 $ lsnrctl status listener_rac1

LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 15-APR-2006 02:28:02


Copyright (c) 1991, 2005, Oracle. All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1921)(IP=FIRST)))
STATUS of the LISTENER
------------------------
Alias LISTENER_RAC1
Version TNSLSNR for Linux: Version 10.2.0.1.0 - Production
Start Date 14-APR-2006 17:08:13
Uptime 0 days 9 hr. 19 min. 49 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /mnt/ohome/oracle/1010/product/db/network/admin/listener.ora
Listener Log File /mnt/ohome/oracle/1010/product/db/network/log/listener_rac1.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.20)(PORT=1921)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.150)(PORT=1921)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
Service "rhel" has 4 instance(s).
Instance "rhel1", status READY, has 3 handler(s) for this service...
Instance "rhel2", status READY, has 2 handler(s) for this service...
Instance "rhel3", status READY, has 2 handler(s) for this service...
Instance "rhel4", status READY, has 2 handler(s) for this service...
Service "rhel_XPT" has 4 instance(s).
Instance "rhel1", status READY, has 3 handler(s) for this service...
Instance "rhel2", status READY, has 2 handler(s) for this service...
Instance "rhel3", status READY, has 2 handler(s) for this service...
Instance "rhel4", status READY, has 2 handler(s) for this service...
The command completed successfully
