
Highly available virtualization with KVM, iSCSI & Pacemaker
Florian Haas
Copyright © 2011 LINBIT HA-Solutions GmbH

Trademark notice
DRBD® and LINBIT® are trademarks or registered trademarks of LINBIT in Austria, the United States, and other countries. Other names
mentioned in this document may be trademarks or registered trademarks of their respective owners.

License information
The text and illustrations in this document are licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported
license ("CC BY-NC-ND").

• A summary of CC BY-NC-ND is available at http://creativecommons.org/licenses/by-nc-nd/3.0/.

• The full license text is available at http://creativecommons.org/licenses/by-nc-nd/3.0/legalcode.

• In accordance with CC BY-NC-ND, if you distribute this document, you must provide the URL for the original version.
Table of Contents
1. Introduction
2. Installation
3. Initial Configuration
   3.1. Configuring Heartbeat
   3.2. Creating a basic Pacemaker configuration
   3.3. Configuring an iSCSI storage pool
      3.3.1. iSCSI storage pool configuration with virt-manager
      3.3.2. Manual iSCSI storage pool configuration
      3.3.3. Enabling the storage pool
   3.4. Adding the libvirt & iSCSI control daemons to Pacemaker
4. Creating KVM domains
5. Adding domains to the Pacemaker configuration
6. STONITH resources
7. Using the virtualization cluster
   7.1. Stopping and starting domains
   7.2. Migrating domains
8. Special considerations
   8.1. Mixed 32/64-bit environment

Chapter 1. Introduction
KVM (Kernel-based Virtual Machine) is an in-kernel virtualization facility — a hypervisor — for
the Linux platform. KVM creates virtual machines — domains — which run in completely isolated
environments on the same physical hardware. KVM has been included in Linux since version
2.6.20, and is the default hypervisor in many Linux distributions. It depends on virtualization
support in hardware; the relevant CPU extensions are known as VT for Intel and SVM for AMD.
KVM supports live migration of domains between physical hosts, provided the hosts share
common storage.
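If you are unsure whether a given host provides the required hardware support, a quick check of
/proc/cpuinfo will tell you. The command below is a simple sketch; the vmx and svm flags
correspond to Intel VT and AMD SVM, respectively, and the extensions must also be enabled in
the BIOS for KVM to use them:

grep -E 'vmx|svm' /proc/cpuinfo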

The solution described in this technical guide applies to KVM with iSCSI-based storage. As such,
it can be deployed with any iSCSI-capable storage target or appliance. For a highly available
iSCSI target service to complement the configuration described in this guide, refer to the LINBIT®
Technical Guide "Highly available iSCSI storage with DRBD and Pacemaker".

Chapter 2. Installation
In order to create a highly available KVM environment, you will need to install the following
software packages:

• Pacemaker is a cluster resource management framework which you will use to automatically
start, stop, monitor, and migrate resources. Distributions typically bundle Pacemaker in a
package simply named pacemaker. This technical guide assumes that you are using at least
Pacemaker 1.0.9.

• Heartbeat is one of the cluster messaging layers that Pacemaker is capable of using. In
distributions, the Heartbeat package is usually named heartbeat. This guide assumes at least
Heartbeat version 3.0.3.

Note
The other cluster messaging layer supported by Pacemaker is Corosync, which
you may use in place of Heartbeat. Corosync is the only supported Pacemaker
messaging layer in Red Hat Enterprise Linux 6 and SUSE Linux Enterprise Server 11;
other distributions ship both Corosync and Heartbeat. For the sake of simplicity,
this technical guide presents the solution on only one messaging layer, which is
Heartbeat.

• KVM is the virtualization facility that manages domains. It requires a KVM-enabled Linux
kernel, a CPU with VT or SVM support enabled in the BIOS, and the Qemu hardware emulator
(usually named qemu, qemu-kvm, or similar).

• Libvirt is an abstraction and management layer for KVM and other hypervisors. The
configuration explained in this guide requires the libvirt library and the associated set of
binary tools. The corresponding packages are usually named libvirt and libvirt-bin,
respectively.

• The open-iscsi initiator is the default Linux iSCSI initiator. Its kernel components ship as part
of virtually any distribution's stock kernel. The corresponding userspace management daemons
(also required for proper iSCSI initiator functionality) are typically packaged as open-iscsi or similar.

Note
You may be required to install packages other than the above-mentioned ones due
to package dependencies. However, when using a package management utility such
as aptitude, yum, or zypper, these dependencies should be taken care of automatically.
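For illustration only, on a Debian- or Ubuntu-style system the installation might look like the
following; package names vary between distributions, so treat this as a sketch rather than an
exact recipe:

apt-get install pacemaker heartbeat qemu-kvm libvirt-bin open-iscsi virt-manager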

After you have installed the required packages, you should take care of a few settings that apply
to your boot process, using your distribution’s preferred utility for doing so (typically
chkconfig, insserv, or update-rc.d).

Make sure that:

• Heartbeat does start automatically on system boot. This will also start Pacemaker.

• The libvirt daemon init script (typically named /etc/init.d/libvirtd) does not start
automatically on system boot. Pacemaker will manage this service as a cluster resource.

• The local open-iscsi configuration file does not contain any iSCSI volumes serving as KVM virtual
block devices. Pacemaker will manage all of these, through libvirt, as storage pools.
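On a distribution that uses chkconfig, for example, the first two points might be addressed as
follows (service and utility names differ between distributions, so adjust accordingly):

chkconfig heartbeat on    # Heartbeat, and with it Pacemaker, starts at boot
chkconfig libvirtd off    # libvirtd is managed by Pacemaker, not by init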

Chapter 3. Initial Configuration
This section describes the initial configuration of a highly available Libvirt/KVM environment in
the context of the Pacemaker cluster manager.

3.1. Configuring Heartbeat
Configuring Heartbeat in a 2-node cluster and enabling Pacemaker is a straightforward process
that is well documented in the Linux-HA User’s Guide [http://www.linux-ha.org/doc/], in the
section called "Creating an initial Heartbeat configuration".
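For orientation only, a minimal /etc/ha.d/ha.cf enabling Pacemaker might resemble the sketch
below; the interface eth0 and the node names alice and bob are assumptions, and the Linux-HA
User’s Guide remains the authoritative reference (it also covers the required /etc/ha.d/authkeys
file):

# minimal Heartbeat configuration sketch -- adapt to your environment
autojoin none
bcast eth0          # cluster communication interface (assumption)
node alice
node bob
pacemaker respawn   # start and supervise Pacemaker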

3.2. Creating a basic Pacemaker configuration
In a highly available virtualization configuration that involves a 2-node cluster, you should

• disable STONITH;

• set Pacemaker’s "no quorum policy" to ignore loss of quorum;

• set the default resource stickiness to 200.

To do so, issue the following commands from the CRM shell:

crm(live)# configure
crm(live)configure# property stonith-enabled="false"
crm(live)configure# property no-quorum-policy="ignore"
crm(live)configure# rsc_defaults resource-stickiness="200"
crm(live)configure# commit

Warning
Disabling STONITH is a temporary measure, intended for testing only. Do not
move the cluster into production with STONITH disabled. Considerations for setting
up STONITH for the virtualization cluster are described in Chapter 6, STONITH resources.

3.3. Configuring an iSCSI storage pool
In a libvirt/KVM cluster configuration, libvirt keeps the domains' virtual block devices in a
storage pool — here, an iSCSI target managed by libvirtd.

To define a storage pool, you can use virt-manager, a graphical utility to manage virtual
domains, networks, storage pools, and other resources. You can also create the pool configuration
XML manually.

The examples below are for managing a storage pool corresponding to an iSCSI target
named iqn.2001-04.com.example:storage.example.iscsivg01, hosted on the
portal 10.9.9.180:3260.
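Before defining the pool, you may want to confirm that each virtualization host can reach the
target; a quick check with the open-iscsi administration utility, using the portal address assumed
above, might look like this:

iscsiadm -m discovery -t sendtargets -p 10.9.9.180:3260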

3.3.1. iSCSI storage pool configuration with virt-manager
To create a new storage pool with virt-manager, select your virtualization host, open the host
Details view, and select the Storage tab. Then, click (Add Pool) and follow the wizard.

Figure 3.1. Adding a storage pool with virt-manager (step 1)

Figure 3.2. Adding a storage pool with virt-manager (step 2)

Once the installation is complete, virt-manager will store the newly created pool configuration
file as /etc/libvirt/storage/iscsivg.xml. You must then copy this configuration file
to all nodes in the cluster.

3.3.2. Manual iSCSI storage pool configuration
You can also create an XML storage pool configuration file by hand. You would save this
configuration as a file named /etc/libvirt/storage/iscsivg01.xml:

Storage pool XML configuration. 

<pool type='iscsi'>
  <name>iscsivg01</name>
  <source>
    <host name='10.9.9.180'/>
    <device path='iqn.2001-04.com.example:storage.example.iscsivg01'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0700</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

Once created, you must copy this configuration file to all nodes in the cluster.

3.3.3. Enabling the storage pool
You can then define this storage pool, and have it started automatically on libvirtd startup:

virsh pool-define /etc/libvirt/storage/iscsivg01.xml
virsh pool-autostart iscsivg01

You must repeat this step on all cluster nodes.
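If you wish to verify the result, you can start the pool once by hand and list the pools and
volumes libvirt sees; this is merely a sanity check, not a required part of the procedure:

virsh pool-start iscsivg01
virsh pool-list --all
virsh vol-list iscsivg01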

3.4. Adding the libvirt & iSCSI control daemons to Pacemaker
You may proceed by adding the libvirt and open-iscsi control daemons to your Pacemaker
configuration. To that end, you simply add their LSB init scripts as a Pacemaker clone:

crm(live)# configure
crm(live)configure# primitive p_iscsid \
        lsb:iscsid \
        op monitor interval="30"
crm(live)configure# primitive p_libvirtd \
        lsb:libvirtd \
        op monitor interval="30"
crm(live)configure# group g_daemons \
        p_iscsid p_libvirtd
crm(live)configure# clone cl_daemons g_daemons
crm(live)configure# commit

Note
If your distribution uses a different name for the init scripts, you must
adjust your resource configuration accordingly. For example, if your distribution
installs the libvirtd init script as /etc/init.d/libvirt-bin, you would use
lsb:libvirt-bin as the resource type.
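After committing this configuration, you can confirm that the daemon clone is running on all
nodes by running crm_mon on any active cluster node, for example:

crm_mon -1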

Chapter 4. Creating KVM domains
As with storage pools, you can add new virtual domains either with virt-manager or manually.
The recommended and far less error-prone method is to use the installation wizard in virt-
manager.

As you create the domain, select existing volumes (Logical Units, LUs) in the iSCSI storage pool
to act as Virtual Block Devices (VBDs) for the domain.

Figure 4.1. Creating a new KVM domain

Figure 4.2. Selecting an installation image for the new domain

Figure 4.3. Setting CPU and memory parameters for the new domain

Figure 4.4. Selecting managed storage for the new domain

Figure 4.5. Selecting an iSCSI volume for domain storage

Figure 4.6. Translated storage volume path in domain configuration

Figure 4.7. Finished domain configuration

Figure 4.8. Running domain installer

Once the installation is complete, virt-manager will store the newly created domain
configuration file as /etc/libvirt/qemu/superfrobnicator.xml. You must then copy
this configuration file to all nodes in the cluster.
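As a sketch of that distribution step (the peer host name bob is an assumption), you might simply
copy the file to the same location on the other node:

scp /etc/libvirt/qemu/superfrobnicator.xml bob:/etc/libvirt/qemu/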

Note
virt-manager has no facility to create Logical Units in the iSCSI target from the
virtualization host. You must create these ahead of time, on your iSCSI target device
or server.

Chapter 5. Adding domains to the Pacemaker configuration
Once your domain is created, you may add it to the Pacemaker configuration. To do so, use the
ocf:heartbeat:VirtualDomain resource type:

crm(live)# configure
crm(live)configure# primitive p_virtdom_superfrobnicator \
        ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/superfrobnicator.xml" \
        meta allow-migrate="true" \
        op monitor interval="30"

Note
meta allow-migrate=true enables live resource migration with no service
interruption.

Before you commit this configuration, be sure to add Pacemaker constraints ensuring that
this domain is started only where, and after, the libvirt management service is available:

crm(live)configure# order o_daemons_before_virtdom_superfrobnicator \
        inf: cl_daemons p_virtdom_superfrobnicator
crm(live)configure# colocation c_virtdom_superfrobnicator_on_daemons \
        inf: p_virtdom_superfrobnicator cl_daemons
crm(live)configure# commit

After these changes have been committed, Pacemaker should start this domain on a machine
where the libvirt management service is running. You may verify this with crm_mon (executed on
any active cluster node) or virsh list (executed on the node where the domain is running).
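For example:

crm_mon -1     # cluster-wide resource status, run on any active node
virsh list     # running domains on the local node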

Chapter 6. STONITH resources
Before moving the cluster into production, you must enable STONITH (also known as node
fencing). KVM employs no cluster-wide safeguards against running a domain more than once
on different physical nodes. While Pacemaker nodes are communicating properly, this is of no
concern — Pacemaker ensures that any domain is started once, and only once, in the cluster.
However, if Pacemaker loses connectivity to a cluster node, or if a domain fails to shut down
properly, Pacemaker needs to forcibly shut down the node to make certain it no longer uses any
iSCSI volumes. This is what STONITH ensures.

Pacemaker supports a multitude of STONITH devices, and configuring STONITH is beyond the
scope of this guide. An excellent guide to configuring STONITH can be found on the Pacemaker
web site [http://www.clusterlabs.org/doc/crm_fencing.html].
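As a purely illustrative sketch (the plugin choice, IP address, and credentials below are
placeholders; substitute the STONITH plugin that matches your hardware), an IPMI-based fencing
resource for a node named alice might be configured like this:

crm(live)configure# primitive p_stonith_alice stonith:external/ipmi \
        params hostname="alice" ipaddr="10.9.9.101" userid="admin" passwd="secret" interface="lan" \
        op monitor interval="60"
crm(live)configure# location l_stonith_alice_not_on_alice p_stonith_alice -inf: alice

The location constraint keeps the fencing resource off the node it is meant to fence, so that a
healthy peer is always the one to carry out the fencing operation.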

Once you have configured STONITH devices properly, you must enable STONITH by setting the
stonith-enabled cluster property to true, using the crm shell:

crm(live)# configure
crm(live)configure# property stonith-enabled="true"
crm(live)configure# commit

Chapter 7. Using the virtualization cluster
7.1. Stopping and starting domains
If you want to shut down a domain, you may do so using the crm shell:

crm(live)# resource stop p_virtdom_superfrobnicator

To restart it, again use the crm shell:

crm(live)# resource start p_virtdom_superfrobnicator

Note
Unlike configure commands, the crm shell executes resource commands
immediately. There is no need to commit the changes.

7.2. Migrating domains
To seamlessly migrate a domain from one cluster node to another, use the resource
migrate command in the crm shell. For example, to migrate the resource named
p_virtdom_superfrobnicator to a node named alice, issue the following command:

crm(live)# resource migrate p_virtdom_superfrobnicator alice

Once you have entered this command, Pacemaker will live-migrate the domain to the target node,
using KVM’s intrinsic domain migration.

This process may take several seconds or even minutes depending on domain utilization, but
during this time the domain will remain running and fully usable.

Note
Whenever you migrate a resource to a specified node, Pacemaker creates a permanent
location constraint pinning the resource to that node. This is usually undesirable, so
you should revoke this constraint once the resource migration has completed.
To do so, issue the resource unmigrate <resourcename> command in the
crm shell.
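In this example, that would be:

crm(live)# resource unmigrate p_virtdom_superfrobnicator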

Chapter 8. Special considerations
8.1. Mixed 32/64-bit environment
If your Pacemaker cluster uses mixed hosts — some with 32-bit, some with 64-bit CPUs — you
must ensure that any 64-bit guests are configured to run on 64-bit hosts only. You may do so
by using Pacemaker location constraints and node attributes. For example, assume

• you run a 4 node cluster with hosts alice, bob, charlie, and daisy,

• only charlie and daisy have 64-bit CPUs,

• resource p_virtdom_superfrobnicator refers to a 64-bit domain.

Then, you would first tag your nodes with a cpubits attribute as follows:

crm(live)# configure
crm(live)configure# node alice attributes cpubits=32
crm(live)configure# node bob attributes cpubits=32
crm(live)configure# node charlie attributes cpubits=64
crm(live)configure# node daisy attributes cpubits=64

Now, you add a location constraint allowing the resource p_virtdom_superfrobnicator to
run on only those nodes where the cpubits attribute equals 64:

crm(live)configure# location l_superfrobnicator_on_64 p_virtdom_superfrobnicator \
        rule -inf: cpubits ne 64
crm(live)configure# commit

Pacemaker then ensures that p_virtdom_superfrobnicator only starts on either charlie or daisy.

As you add more nodes to the cluster, you simply tag the newly joined nodes with the cpubits
attribute set to 32 or 64 (as above), and Pacemaker will only consider the 64-bit machines eligible
for running the p_virtdom_superfrobnicator resource.
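For instance, if a new 64-bit host named erin were to join the cluster (the node name is
hypothetical), you would tag it in the same way:

crm(live)# configure
crm(live)configure# node erin attributes cpubits=64
crm(live)configure# commit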
