
Cognizant 20-20 Insights

Virtualizing Oracle:
Oracle RAC on VMware vSphere
Executive Summary

While most database systems running in a VMware environment benefit from the increased reliability and recoverability inherent in virtual environments, achieving the highest possible availability still requires database clustering.

Yet, traditional clustering methods that use Raw Disk Mappings (RDMs) have generally achieved redundancy at the expense of the many benefits that result from running in virtual environments. Recent advances in the capabilities of VMware vSphere have opened the door to new clustering methods. These methods enable individual VMs in a database cluster to be migrated via VMware vMotion from one ESX host to another, creating an opportunity to synergistically combine the natural resiliency of database clusters with the high-availability and load-balancing properties of VMware virtual environments. The net result is a high-performance database system with greater reliability than what could otherwise be achieved through traditional clustering methods on either physical or virtual infrastructure.

We have delivered Oracle RAC on VMware vSphere to several large clients, including being first in the industry to do so in production environments on converged infrastructure and with vMotion enabled — something thought to be impossible at the time.

Database Virtualization: Getting Started

The fundamentals of running database systems such as Oracle in a virtual environment have become increasingly well established with newer releases of VMware vSphere. Advances in the capabilities and overall performance of VMware have put to rest the arguments about running high-performance applications as virtual machines (VMs).

However, the current release of VMware vSphere can provide continuous availability through VMware Fault Tolerance only for single-vCPU systems, and then only in limited configurations. vSphere is not yet able to provide fault tolerance for multi-CPU systems, which are often needed to meet the demands of high-performance databases and other Tier 1 platforms. Thus, concerns remain around enabling high availability on virtual machines with more than one virtual CPU, along with other properties that are not yet supported by VMware Fault Tolerance. Organizations with enterprise-class database platforms that require mission-critical availability or carrier-grade stability must find other ways to meet this need in a virtual environment.



As a result, traditional database clustering is still required for both mission-critical, high-availability and high-performance compute capacity. Yet, when using traditional methods, clustering virtual machines in VMware leads to another limitation. The individual nodes in a typical cluster – whether these nodes are running on physical, virtual or even mixed architectures – require access to shared data. These shared drives are used for storing information common to all systems, as well as for keeping all of the nodes in a given cluster coordinated (voting and quorum drives).

In VMware, traditional VM clustering methods have required the use of Raw Disk Mappings (RDMs) on a shared Fiber Channel or iSCSI storage system. When used in this way, RDMs introduce several limitations in virtual infrastructure environments:

• RDMs are often difficult to back up and restore using traditional VMware backup methods, particularly if they are physical as opposed to virtual RDMs (vRDM).
• When using backup methods designed to take advantage of special disk access (e.g., the vStorage API), RDMs are not always backed up in the same way as the other VMware storage, leading to more complex restore procedures.
• RDMs, when used for voting and quorum drives, require VMs to turn on a feature called SCSI Bus Sharing. This feature is incompatible with certain key VMware technologies, the most important of which is VMware vMotion, which enables a VM to be migrated from one ESX host to another with no downtime (aka live migration).

As a result, a VM that is used in traditional database clustering is always tied to a dedicated ESX host. It cannot be moved to another ESX host without incurring some amount of downtime. This lack of mobility makes other key features that rely on VMware vMotion technology, such as VMware Distributed Resource Scheduler (DRS), unavailable.

The end result is that workloads within a traditional, RDM-based VMware cluster are more difficult to load balance across a DRS cluster. Further, the primary method used to ensure high availability for a database cluster is to use multiple VMs in the cluster itself – just as multiple physical servers would do in a physical cluster. VMware is unable to contribute to or enhance this capability in any meaningful way, at least for the foreseeable future. While VMware High Availability (HA) can automatically restart a failed VM in a database cluster, it is unable to follow the additional load balancing rules provided by DRS as part of that process. Thus, the potential that system performance issues will arise in the event of an HA restart, due to either a VM or an ESX host failure, is increased.

Oracle Support

On November 8, 2010, Oracle announced a change to its support statements for all Oracle products when running on VMware. Prior to this announcement, Oracle would provide support on VMware only when an issue could first be duplicated on physical infrastructure. This effectively kept some companies from virtualizing Oracle products and applications, as many in the user community already knew that specific Oracle configurations worked well without it. More recently (but still prior to this announcement), Oracle changed its stance on supporting virtualized applications when running on its own hypervisor product. In all of these cases, Oracle RAC was expressly excluded from being supported.

The recent Oracle support statement changed things dramatically.1 The key portion of that change is as follows: "If a problem is a known Oracle issue, Oracle support will recommend the appropriate solution on the native OS. If that solution does not work in the VMware virtualized environment, the customer will be referred to VMware for support. When the customer can demonstrate that the Oracle solution does not work when running on the native OS, Oracle will resume support, including logging a bug with Oracle Development for investigation if required. If the problem is determined not to be a known Oracle issue, we (Oracle) will refer the customer to VMware for support. When the customer can demonstrate that the issue occurs when running on the native OS, Oracle will resume support, including logging a bug with Oracle Development for investigation if required."

NOTE: For Oracle RAC, Oracle will only accept Service Requests as described in this note for Oracle RAC version 11.2.0.2 and later. While earlier versions are known to work, Oracle still does not support versions of Oracle RAC prior to this. In making this statement, it is also clear that Oracle appropriately expects VMware to provide support for VMware when running on vSphere-based virtual infrastructure. As an added measure, VMware has created an Oracle Support center with highly skilled Oracle resources, and will also now take support calls for Oracle issues. As a result, there is arguably now a greater level of support for Oracle systems when running on VMware than when running on bare metal hardware.



The issue of certification is worth a specific note, as several organizations have expressed concern about it. Oracle has certified its products when running on its own hypervisor product, which is a cousin of the Xen hypervisor. This, however, appears to be more of a marketing effort than a technical issue. Certification is different from support, so confusing the two with each other should be avoided. Oracle also does not certify its products on any specific Intel hardware platforms (HP, IBM, Dell, etc.). As a practical matter, Oracle should not be expected to certify its products on vSphere, because vSphere operates at the same level in the overall stack, with respect to Oracle products, as physical hardware does. VMware vSphere is no different in this case than any other hardware vendor. Ironically, this is not necessarily the case for Oracle's hypervisor or for its close cousin, Xen. Thus, certifying on one hypervisor but not another, when none of the underlying hardware is certified even for bare-metal deployments, leads to an inconsistent and confusing message. Add to this that Oracle now enthusiastically advocates running its products on (its own) virtual infrastructure environments, and that vSphere is acknowledged in the industry as the leading and most advanced virtual infrastructure platform. Thus, the opportunity and incentive to migrate Oracle systems to virtualized infrastructure, including VMware vSphere, has never been greater.

Breaking Free of RDM

Overcoming this last barrier for most mission-critical applications is key. By eliminating the need for RDMs in VM clusters, the high availability of traditional database clusters can be combined synergistically with the built-in features of VMware vSphere to provide a high-performance database cluster environment with even greater resiliency than would be possible on either physical infrastructure or via traditional VMware high-availability methods. Thanks to the performance enhancements starting with VMware vSphere 4 and continuing with vSphere 5, this final barrier can now be broken with new options that take the place of RDMs. These include:

• Shared virtual disk files.
• iSCSI (or NFS) Gateway VM.
• iSCSI guest to SAN storage.

Each of these has individual advantages and drawbacks. Depending on your individual situation, one may be better than the others.

Shared VMDK

The shared VMDK file is the simplest to set up. vSphere now supports using a VMDK file that can be accessed as a shared disk by multiple VMs. This is the same technology that supports VMware Fault Tolerance, and is documented in VMware KB article KB1034165. The shared VMDK must have the following characteristics:

• Accessible by all VMs in the database cluster.
• Provisioned as Eager Zeroed Thick.
• Multi-writer flag enabled.

Once configured, each VM in the cluster can access the shared virtual disk. The primary limitations of this configuration include:

• Maximum of 8 ESX hosts per cluster.
• The shared VMDK is not uniquely associated with a VM, which can lead to more complex backup and restore issues.

A brief configuration sketch for this option appears below, after the three alternatives have been described.

iSCSI Gateway VM

The iSCSI Gateway VM is essentially an iSCSI SAN within the virtual infrastructure environment itself. As a virtualized SAN, it uses traditional shared disk and re-shares it via the iSCSI protocol to the other VMs in the database cluster. To enhance redundancy, the iSCSI Gateway VM can be configured to take advantage of VMware Fault Tolerance. It solves the limitations of the shared VMDK file because it keeps the shared VMDK file associated with a unique VM and is not subject to the maximum cluster size of 8 hosts. It is more difficult to set up than a shared VMDK, which is also its primary drawback.

iSCSI Guest to SAN Storage

The final option is to configure an iSCSI target on the primary SAN storage. This option provides similar flexibility to the iSCSI Gateway VM and can potentially deliver superior performance, depending on the type of storage system. The primary drawbacks of this option are:

• Additional LUNs must be set up for each database cluster.
• Backup and restore for the shared disk has the same issues as traditional RDMs.
• It is not supported by all SAN types, which is problematic for traditional Fiber Channel SAN storage systems.
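As a concrete illustration of the shared VMDK option, the following is a minimal sketch. The datastore path, disk size and SCSI slot are illustrative only; the multi-writer flag itself is the mechanism documented in VMware KB1034165.

# On the ESX host, create the shared disk eager-zeroed thick (path and size are examples)
vmkfstools -c 100G -d eagerzeroedthick /vmfs/volumes/datastore1/rac-shared/rac-shared.vmdk

# In each RAC node's .vmx file (or via the VM's advanced configuration parameters),
# attach the same disk and enable the multi-writer flag for that SCSI slot:
#   scsi1:1.fileName = "/vmfs/volumes/datastore1/rac-shared/rac-shared.vmdk"
#   scsi1:1.sharing  = "multi-writer"

The same flag must be set for the corresponding virtual disk on every VM that shares it.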



Figure 1: The iSCSI Gateway VM architecture. Database node VMs share an iSCSI disk presented by an iSCSI Gateway VM (protected by a VMware Fault Tolerance clone), with all storage held as VMDK files on the SAN. Highlights:

• All storage is VMDK on SAN.
• The iSCSI Gateway virtualizes and re-shares disk over the VM network (a virtual SAN on SAN).
• HA, DRS and FT work together.
• All systems can be vMotioned.
• Portable to any vSphere architecture.
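Whichever of the three approaches is chosen, each database node ultimately attaches to the shared LUN with the standard Linux open-iscsi initiator tools. A minimal sketch, assuming an illustrative portal address and target IQN (replace both with the values of the Gateway VM or SAN target):

# Discover the targets presented by the Gateway VM (or SAN) and log in
iscsiadm -m discovery -t sendtargets -p 192.168.100.10
iscsiadm -m node -T iqn.2011-10.local.gateway:rac.shared -p 192.168.100.10 --login

# Re-establish the session automatically at boot
iscsiadm -m node -T iqn.2011-10.local.gateway:rac.shared -p 192.168.100.10 \
  --op update -n node.startup -v automatic

The initiator and target tuning that makes this path perform well for RAC is covered in the iSCSI Tuning section below.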

Take Advantage of Improved vSphere Performance

VMware vSphere 4 ushered in key performance enhancements across the board for virtual machines. These enhancements continue with the introduction of vSphere 5 and allow vSphere to easily meet and exceed the compute capacity needed to run high-performance, Tier 1 applications. In particular, the enhancements to the iSCSI and networking stacks have delivered I/O and efficiency gains of as much as 50% and a factor of 10, respectively. As a result, both in-guest iSCSI and NFS can be used to access shared drives, as needed. In virtual infrastructure environments leveraging converged 10 Gb Ethernet networks, the options and benefits are significant. However, traditional Fiber Channel environments can also take advantage of these benefits through the use of an iSCSI Gateway VM.

When combining multiple systems with the sophistication of virtual infrastructure and Tier 1 database clusters, a significant amount of feature overlap can occur. Managing and eliminating performance bottlenecks requires a clear understanding of how these products interact with virtual infrastructure environments and with each other. While this can sometimes look complex, the task of understanding how and why certain components provide performance boosts can be broken into logical components.

As an analogy, a sound engineer's mixing board at a concert has dozens of control knobs, levers and switches, which can appear daunting to manage. But there is a logical flow to the system.

Sound from a microphone or instrument is first filtered into the top of the mixing board on one of several channels through a "trim" control. The sound is then "mixed" in a variety of ways (treble, bass, echo effects, etc.) as it travels down from the top to the bottom of the mixing board, where another lever, called a "fader," controls how much sound comes out on that instrument's channel. The processed sound from each channel is then sent to a master volume control, which is used to set the overall volume for all of the instruments and voices. Understanding this flow lets a sound engineer use his highly skilled ears to make the concert sound great.

There is a similar logical layout and flow to how physical infrastructure, VMware and Oracle database components interact. Knowing how and where data flows through the network, how CPU and memory are assigned and how storage is accessed provides a skilled architect or administrator with a similar framework for optimizing performance. Balancing these for maximum performance still requires skill and knowledge, but the concepts of what each component does and how it works can be easily understood.
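One practical way to build that understanding is to baseline the raw I/O path before the database is layered on top. A minimal sketch using dd against the shared iSCSI mount; the mount point is illustrative, and a purpose-built tool such as fio or Oracle ORION would give a more representative picture:

# Sequential write and read of a 4 GB test file, bypassing the guest page cache
dd if=/dev/zero of=/u01/iotest/ddfile bs=1M count=4096 oflag=direct
dd if=/u01/iotest/ddfile of=/dev/null bs=1M iflag=direct
rm -f /u01/iotest/ddfile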



Virtual Infrastructure Architecture

Virtual infrastructure environments that are based on converged network topologies such as 10 Gb Ethernet are especially friendly to virtualized Tier 1 applications such as Oracle RAC. This is due to the available network bandwidth and the use of IP-based storage protocols (iSCSI and NFS). These architectures allow the shared drives needed for VM clusters to be hosted directly from the physical storage system. As a result, they are able to take better advantage of the hardware infrastructure which supports the virtual environment.

However, this doesn't rule out the ability to also use traditional Fiber Channel storage systems. Here, an iSCSI Gateway VM (as described above) is used to share the FC SAN storage using the iSCSI protocol. While this particular method of sharing has an additional step and requires additional tuning to achieve optimum performance, it has the advantage that all of the storage for the VM clusters is kept in one or more virtual disk files stored on a VMware VMFS datastore. This provides for a consistent method of storage across all systems that can be backed up using the same virtual machine backup methods. The primary difference between Fiber Channel and IP-based storage solutions is solely that an iSCSI Gateway VM is required in FC SAN environments. While it provides clear benefits in all SAN storage solutions, the iSCSI Gateway VM is not absolutely required where iSCSI or NFS is already used as the primary storage system.

Since all of these configurations allow clustering without the need for SCSI bus sharing, all of the VMs — including iSCSI Gateway VMs — can be moved between the various ESX hosts in a DRS cluster via vMotion. This enables clusters to be freely configured such that the benefits of HA and DRS can be synergistically added to the failover capabilities inherent in Oracle RAC clusters.

Virtual Machine Architecture

The virtual machine configuration for the individual Oracle RAC nodes relies on in-guest iSCSI or in-guest NFS protocol for all shared drives. This means that each virtual machine connects directly to an iSCSI or NFS shared drive for all data that must be held in common. This connection uses the same protocols and security mechanisms that would be used if these VMs were instead servers in a purely physical environment.

On appropriate underlying infrastructure, iSCSI and NFS deliver similar performance, with unique benefits and drawbacks that are well known among storage administrators and architects. Which one to choose can be driven by available skills, layout of the underlying infrastructure, company security policies, and even personal tastes and preferences. As such, the examples used in this document are based on iSCSI, but can also be readily applied to NFS configurations.

Configuring a Virtualized Oracle System

Properly sizing the VMs which make up a RAC cluster, and the Gateway VM (if implemented), is critical to maximizing performance. An example VM configuration for Oracle RAC nodes might have the following characteristics:

• Four vCPUs.
• 12 GB RAM.
• 50 GB primary disk (can be thin provisioned).
• Two vNICs (vmxnet3 driver): one public and one private (see the interface sketch following these lists).
• Current Linux distribution. (CentOS, Ubuntu and Fedora have been successfully tested. Red Hat Enterprise Linux, SuSE Linux and Oracle Enterprise Linux have been used in other Oracle database solutions and should work as well.)

For those using an iSCSI Gateway VM, the configuration might look something like this:

• One vCPU.
• 4 GB RAM.
• VMware Fault Tolerance (optional).
• 10 GB primary disk (can be thin provisioned).
• 100 GB secondary disk, thick provisioned (to be shared via iSCSI).
• Two vNICs (vmxnet3 driver): one for administration and one for the iSCSI network.
• Current Linux distribution. (CentOS, Ubuntu and Fedora have been successfully tested. Red Hat Enterprise Linux, SuSE Linux and Oracle Enterprise Linux have been used in similar solutions and should work as well.)
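To make the RAC node's two vNICs concrete, a minimal sketch of the interface files on a Red Hat style distribution follows. The device names and addresses are illustrative, the use of eth1 for the private interconnect/iSCSI network is an assumption, and the MTU of 9000 assumes Jumbo Frames are enabled end to end, as recommended later in this document.

# /etc/sysconfig/network-scripts/ifcfg-eth0  (public network)
DEVICE=eth0
BOOTPROTO=static
IPADDR=10.0.10.21
NETMASK=255.255.255.0
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth1  (private interconnect / iSCSI network)
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.100.21
NETMASK=255.255.255.0
MTU=9000
ONBOOT=yes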



Further, the example VMs as configured in this document make use of a 10 Gb Ethernet converged network for both network and storage access. When configuring for Gig-E networks, additional, dedicated network ports and interfaces at the physical layer will be required.

The above example configuration is intended to support up to a medium-sized Oracle database for development, small-scale production, and secondary support for enterprise-class, large-scale database solutions such as Oracle Exadata. This configuration should be modified as necessary to support alternate use cases.

iSCSI Tuning

There are a variety of options for an appropriate iSCSI Gateway VM, almost all of which are some variant of Linux. These include Red Hat Enterprise Linux, Ubuntu, SuSE, Fedora and FreeNAS, to name a few. All have an iSCSI target capability built into them. The most common iSCSI target applications found on current Linux distributions are:

• iSCSI Enterprise Target.
• TGT.
• Open iSCSI.

The iSCSI Enterprise Target is the oldest and most mature of these alternatives, but new versions of Open iSCSI have newer features and are rapidly replacing IET. Open iSCSI is included "in the box" with current versions of Red Hat and its derivatives, whereas TGT is usually found in older versions. Both are more than capable as an iSCSI platform. However, the default settings for all iSCSI systems are generally too conservative for the level of performance needed by Oracle RAC. Tuning is required to achieve a desirable level of performance. There are several online resources for tuning and configuring an iSCSI target on Linux for Oracle.

The primary issue is that the default settings for iSCSI target servers in Linux do not allocate sufficient resources to handle the I/O needs of databases such as Oracle RAC. Tuning iSCSI to have larger memory caches and to handle larger chunks of data, as well as to spawn more threads to handle data requests more efficiently, can reap significant performance benefits. When combined with enabling Jumbo Frame support, iSCSI performance increases even more. Performance boosts of 30% to 40% have been reported by clients who enabled Jumbo Frames on 10 Gb Ethernet networks.

The following settings have been compiled from several community-based sources (online blogs, "man pages," etc.). They represent some of the more common settings and should provide adequate performance for most situations. A full explanation of these parameters can be found in the Linux man page for the iSCSI Enterprise Target configuration file (ietd.conf). Online explanations of each parameter can also be found at http://www.linuxcertif.com/man/5/ietd.conf, as well as other locations.

Configuring iSCSI Enterprise Target

On the target server, place the following in the /etc/ietd.conf file:

• MaxConnections 1
• InitialR2T No
• ImmediateData Yes
• MaxRecvDataSegmentLength 262144
• MaxXmitDataSegmentLength 262144
• MaxBurstLength 262144
• FirstBurstLength 262144
• MaxOutstandingR2T 16
• Wthreads 16
• DataDigest None
• HeaderDigest None

Next, adjust the amount of memory the iSCSI target system is configured to use. To do this, edit /etc/init.d/iscsitarget and change the MEM_SIZE variable to MEM_SIZE=1073741824, and then restart the iSCSI target server by issuing the command: /etc/init.d/iscsitarget restart.

Configuring iSCSI Targets with TGT

If configuring the iSCSI target Gateway VM using TGT, use the following commands:

• tgtadm --lld iscsi --mode target --op update --tid $tid --name MaxRecvDataSegmentLength --value 262144
• tgtadm --lld iscsi --mode target --op update --tid $tid --name MaxXmitDataSegmentLength --value 262144
• tgtadm --lld iscsi --mode target --op update --tid $tid --name HeaderDigest --value None
• tgtadm --lld iscsi --mode target --op update --tid $tid --name DataDigest --value None
• tgtadm --lld iscsi --mode target --op update --tid $tid --name InitialR2T --value No
• tgtadm --lld iscsi --mode target --op update --tid $tid --name MaxOutstandingR2T --value 16
• tgtadm --lld iscsi --mode target --op update --tid $tid --name ImmediateData --value Yes
• tgtadm --lld iscsi --mode target --op update --tid $tid --name FirstBurstLength --value 262144
• tgtadm --lld iscsi --mode target --op update --tid $tid --name MaxBurstLength --value 262144
networks.



Configuring iSCSI Initiators (on each RAC VM)

On each of the Oracle RAC VM nodes, the iSCSI initiator needs to be tuned. To do so, add the following to /etc/sysctl.conf:

• net.core.rmem_max = 1073741824
• net.core.wmem_max = 1073741824
• net.ipv4.tcp_rmem = 1048576 16777216 1073741824
• net.ipv4.tcp_wmem = 1048576 16777216 1073741824
• net.ipv4.tcp_mem = 1048576 16777216 1073741824

Reload the system parameters with the command:

• sysctl -p

Then, finally, back up and overwrite /etc/iscsi/iscsid.conf on each VM server so it contains:

• node.startup = automatic
• node.session.timeo.replacement_timeout = 120
• node.conn[0].timeo.login_timeout = 15
• node.conn[0].timeo.logout_timeout = 15
• node.conn[0].timeo.noop_out_interval = 10
• node.conn[0].timeo.noop_out_timeout = 15
• node.session.initial_login_retry_max = 4
• node.session.cmds_max = 128
• node.session.queue_depth = 128
• node.session.iscsi.InitialR2T = No
• node.session.iscsi.ImmediateData = Yes
• node.session.iscsi.FirstBurstLength = 262144
• node.session.iscsi.MaxBurstLength = 262144
• # the default is 131072
• node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
• # the default is 32768
• discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 262144
• node.conn[0].iscsi.HeaderDigest = None
• node.session.iscsi.FastAbort = No

Once this is done, restart the iSCSI daemon with the command:

• service iscsi restart

The max data length parameters are determined by the size of the kernel page (usually 4K) multiplied by 64 (4096 * 64 = 262144). You can experiment with this size for additional performance by doubling it. Depending on the iSCSI target used, the maximum size for these variables differs, and if the maximum allowed size is exceeded, the default size is assumed. As such, be certain to verify these parameters based on the iSCSI target and initiator used.

Oracle Automatic Storage Management

Oracle Automatic Storage Management (ASM) should be used in this configuration to provide storage for shared disk management in exactly the same way that it would be used in a traditional physical server deployment. The primary difference is that ASM and its components all operate from within the virtual infrastructure environment, but they access the shared iSCSI or NFS disk in exactly the same way. It makes no difference whether the iSCSI target is directly on the storage system or accessed through a "Gateway" VM. A brief example of creating a disk group in this configuration appears after the following list.

Key factors to keep in mind for any VM configuration include:

• All nodes in a given RAC cluster should have an identical virtual hardware configuration. Ideally, it's best to clone a properly configured RAC VM to create the other RAC nodes in a cluster.
• VM performance, especially CPU, RAM and CPU Ready parameters, should be closely monitored to ensure maximum performance and resource utilization efficiency.
• Make use of VMXNET3 and PVSCSI drivers in VMs whenever possible to ensure maximum network and disk performance.
• Enable Jumbo Frames on all network interfaces (suggested MTU = 9000).
• Disable unneeded services in Linux VMs.
• Tune the iSCSI initiators and targets, especially on Gateway VMs, for the performance needs of the VMs in the cluster using it. Multiple Gateway VMs should be considered when multiple database clusters are deployed.
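Because the virtual disks presented to the guest typically already carry RAID protection from the underlying storage (a point expanded on below), ASM disk groups in this configuration are commonly created with external redundancy. A minimal sketch of creating such a disk group on the shared LUN; the disk group name and device path are illustrative and assume the LUN has been prepared for ASM (for example via ASMLib or udev rules):

# Run as the Grid Infrastructure owner, with the ASM instance environment set
sqlplus / as sysasm <<'EOF'
CREATE DISKGROUP DATA EXTERNAL REDUNDANCY
  DISK '/dev/oracleasm/disks/RACDATA1';
EOF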



The configuration of the underlying host systems, network and storage in a virtual environment can have a significant impact on virtual machine performance. Oracle is particularly sensitive in this area. Be sure that the underlying hardware infrastructure is optimized to support Oracle just as if it were running directly on physical infrastructure.

It is also important to note that Oracle ASM is a sophisticated and robust database storage mechanism that is designed to make the most of physical storage systems with multiple disks. In a virtualized environment, the virtual storage system will normally already provide most of the performance and reliability that ASM would normally provide for itself. As a result, ASM configurations in VMware environments are usually much simpler to set up. Don't be misled! A redundant disk volume in VMware is normally presented to ASM as if it were a single disk drive. Just because ASM doesn't know that a disk volume it is using is redundant doesn't mean there is no redundancy. By the same token, ensure that you have built appropriate levels of data protection into your storage system.

Time Synchronization

Time synchronization in vSphere environments can be tricky, and applications which are sensitive to time require special attention. Oracle RAC is no exception. Each virtualized Oracle RAC node must be time-synchronized with the other nodes. There are two methods for keeping the cluster nodes in sync. Each has its benefits, and both work equally well:

• Cluster Time Synchronization Service: This is the easier of the two options to set up. Prior to beginning the installation of Oracle RAC, make sure that all Network Time Protocol (NTP) programs are disabled (and ideally uninstalled). The Oracle RAC installer then automatically installs and configures Oracle Cluster Time Synchronization Service.
• Enable NTP: The default NTP configuration must be modified only to allow the slew option (-x). This forces the NTP daemon to ensure that the clock on the individual nodes does not move backwards. This option is set in different places depending on the Linux distribution used. Please refer to the documentation for the specific Linux distribution chosen for additional details.

Because time synchronization in vSphere can be sensitive, best practices suggest using VMware Tools to synchronize each VM with the hardware clock of the ESX host on which it is running. Testing to date has shown this to be unnecessary; the two methods above have proven sufficient. Follow Oracle best practices with respect to time synchronization regardless of the platform.

High Availability and DRS Configuration

One of the primary drivers for deploying Oracle RAC is the high availability provided by a RAC cluster. This cluster failover carries forward into a VMware vSphere environment and — with cluster nodes that can be migrated via vMotion — can now be configured to take advantage of these capabilities. Remember that VMware HA will — in the event a physical ESX host fails — automatically restart all of the failed host's VMs on surviving ESX hosts. VMs do experience downtime when this happens. For this reason, allowing more than one virtualized RAC server node in a given RAC cluster to run on a single ESX host needlessly exposes the RAC cluster to failure scenarios from which it potentially may not recover gracefully.

As such, it is important to set a series of DRS anti-affinity policies between all nodes in a given RAC cluster. A typical virtualized Oracle RAC environment will consist of three server nodes. Since anti-affinity DRS policies can currently only be set between two specific VMs, multiple policies are required to keep three or more nodes in a RAC cluster properly separated. Be sure to name the DRS policies such that they can be easily identified and grouped together. Note that having multiple RAC nodes from different clusters running on the same host server is acceptable, subject to resource utilization and other resource management issues common to all virtual machines.

For optimal HA detection and monitoring, configure VM heartbeat monitoring for all nodes in the RAC cluster. This will ensure that, if a VM is powered on but not actually functioning, VMware HA will automatically restart the VM.

Database Clustering Advances

Thanks to the performance enhancements first introduced in VMware vSphere 4, it is now possible to cluster database systems reliably without the use of Raw Disk Mappings. This change enables individual nodes in a virtualized database cluster to migrate freely across ESX hosts in an HA/DRS cluster, and adds the benefits of database clustering to those provided by vSphere. When configured this way, vSphere HA and DRS work to complement the inherent HA capabilities of Oracle RAC clusters.



vSphere DRS will ensure that all virtual Oracle RAC nodes receive the resources they require by dynamically load-balancing the nodes across the vSphere HA/DRS cluster. In the event any ESX host in the cluster fails (or any RAC node, when HA heartbeat monitoring is used), vSphere HA will automatically restart all failed RAC nodes on another available ESX host. The process of restarting these nodes will follow all HA and DRS rules in place to ensure that the failed nodes are placed on a host where no other nodes in the same RAC cluster are running. With this combination, Oracle RAC will automatically manage the loss of a failed node from an application perspective, and vSphere will then automatically recover the failed RAC node, restoring the Oracle RAC cluster's state to normal. All of this occurs with no human intervention required.

The end result is that by using in-guest iSCSI (and/or NFS) storage for shared data, virtualized Oracle RAC database clusters can achieve improved levels of redundancy and — on appropriate hardware infrastructure — enhanced levels of performance that cannot be achieved on physical infrastructure alone.

Footnote
1. Support Position for Oracle Products Running on VMware Virtualized Environments [ID 249212.1], November 8, 2010.

About the Author


Christopher (Chris) Williams is a Senior Manager and Principal Architect in Consulting and Professional Services with Cognizant's IT Infrastructure Services business unit, where he serves as the Lead Consultant in the Virtualization practice. Chris holds a Bachelor of Science degree from Metropolitan State College of Denver, and an MBA with an information systems emphasis from the University of Colorado. He can be reached at chris.williams@cognizant.com.

About Cognizant

Cognizant (Nasdaq: CTSH) is a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world's leading companies build stronger businesses. Headquartered in Teaneck, N.J., Cognizant combines a passion for client satisfaction, technology innovation, deep industry and business process expertise and a global, collaborative workforce that embodies the future of work. With over 50 delivery centers worldwide and approximately 104,000 employees as of December 31, 2010, Cognizant is a member of the NASDAQ-100, the S&P 500, the Forbes Global 2000, and the Fortune 1000 and is ranked among the top performing and fastest growing companies in the world.

Visit us online at www.cognizant.com for more information.

World Headquarters
500 Frank W. Burr Blvd.
Teaneck, NJ 07666 USA
Phone: +1 201 801 0233
Fax: +1 201 801 0243
Toll Free: +1 888 937 3277
Email: inquiry@cognizant.com

European Headquarters
Haymarket House
28-29 Haymarket
London SW1Y 4SP UK
Phone: +44 (0) 20 7321 4888
Fax: +44 (0) 20 7321 4890
Email: infouk@cognizant.com

India Operations Headquarters
#5/535, Old Mahabalipuram Road
Okkiyam Pettai, Thoraipakkam
Chennai, 600 096 India
Phone: +91 (0) 44 4209 6000
Fax: +91 (0) 44 4209 6060
Email: inquiryindia@cognizant.com

© Copyright 2011, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is
subject to change without notice. All other trademarks mentioned herein are the property of their respective owners.
