You are on page 1of 26

The�Linux-HA�User’s�Guide

Florian�Haas�<florian.haas@linbit.com>
The�Linux-HA�Project<linux-ha@lists.linux-ha.org>
The�Linux-HA�User’s�Guide
by Florian Haas
Copyright © 2009, 2010 LINBIT HA-Solutions GmbH, The Linux-HA Project

License information
The text of and illustrations in this document are licensed under a Creative Commons Attribution–Share Alike 3.0 Unported license
("CC-BY-SA").

• A summary of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/.

• The full license text is available at http://creativecommons.org/licenses/by-sa/3.0/legalcode.

• In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.
Preface ....................................................................................................................... iv
I. Introduction to Heartbeat .......................................................................................... 1
1. Heartbeat as a Cluster Messaging Layer ............................................................. 2
2. Components .................................................................................................... 3
2.1. Communication module .......................................................................... 3
2.2. Cluster Consensus Membership ............................................................... 3
2.3. Cluster Plumbing Library ........................................................................ 3
2.4. IPC Library ........................................................................................... 4
2.5. Non-blocking logging daemon ................................................................ 4
II. Installing Heartbeat .................................................................................................. 5
3. Building and installing from source ..................................................................... 6
3.1. Building and installing Cluster Glue from source ......................................... 6
3.1.1. Cluster Glue build prerequisites .................................................... 6
3.1.2. Downloading Cluster Glue sources ................................................ 6
3.1.3. Building Cluster Glue ................................................................... 7
3.1.4. Building Packages ....................................................................... 8
3.2. Building and installing Heartbeat from source ............................................ 8
3.2.1. Heartbeat build prerequisites ....................................................... 8
3.2.2. Downloading Heartbeat sources ................................................... 8
3.2.3. Building Heartbeat ...................................................................... 9
3.2.4. Building Packages ..................................................................... 10
4. Installing pre-built packages ............................................................................ 11
4.1. Debian and Ubuntu .............................................................................. 11
4.2. Fedora, RHEL and CentOS .................................................................... 11
4.3. OpenSUSE and SLES ............................................................................ 11
III. Administrative Tasks .............................................................................................. 12
5. Creating an initial Heartbeat configuration ........................................................ 13
5.1. The ha.cf file ................................................................................... 13
5.2. The authkeys file ............................................................................. 13
5.3. Propagating the cluster configuration to cluster nodes ............................. 14
5.4. Starting Heartbeat services .................................................................. 14
5.5. Where to go from here ........................................................................ 15
6. Upgrading from previous Heartbeat versions ..................................................... 16
6.1. Upgrading from Heartbeat 2.1 clusters not using the CRM ........................ 16
6.1.1. Stopping Heartbeat services ...................................................... 16
6.1.2. Upgrade software ..................................................................... 16
6.1.3. Enabling the Heartbeat cluster to use Pacemaker .......................... 16
6.1.4. Restarting Heartbeat ................................................................. 17
6.2. Upgrading from CRM-enabled Heartbeat 2.1 clusters .............................. 17
6.2.1. Placing the cluster in unmanaged mode ....................................... 17
6.2.2. Backing up the CIB ................................................................... 18
6.2.3. Stopping Heartbeat services ...................................................... 18
6.2.4. Wiping files related to the CRM .................................................. 18
6.2.5. Restoring the CIB ..................................................................... 18
6.2.6. Upgrading software .................................................................. 19
6.2.7. Restarting Heartbeat services .................................................... 19
6.2.8. Returning the cluster to managed mode ...................................... 19
6.2.9. Upgrading the CIB schema ......................................................... 19
IV. Getting Help and Helping Out ................................................................................. 20
7. Reporting problems ........................................................................................ 21
7.1. Mailing Lists ....................................................................................... 21
7.1.1. Linux-HA ................................................................................. 21
7.1.2. Linux-HA-dev .......................................................................... 21
7.2. Bug Tracking System ........................................................................... 21
7.3. IRC .................................................................................................... 21
8. Submitting patches ........................................................................................ 22

iii
Preface
This guide aims to be the definitive reference guide for users of the Heartbeat cluster messaging
layer, the Linux-HA cluster resource agents, and other clustering building blocks maintained by
the Linux-HA project.

We welcome and actively encourage any and all feedback about this document. Please use
the linux-ha@lists.linux-ha.org [mailto:linux-ha@lists.linux-ha.org] mailing list for comments,
corrections, and suggestions for improvement.

iv
Part I. Introduction�to�Heartbeat
Chapter 1. Heartbeat�as�a�Cluster
Messaging�Layer
Heartbeat is a daemon that provides cluster infrastructure (communication and membership)
services to its clients. This allows clients to know about the presence (or disappearance!) of peer
processes on other machines and to easily exchange messages with them.

In order to be useful to users, the Heartbeat daemon needs to be combined with a cluster
resource manager (CRM) which has the task of starting and stopping the services (IP addresses,
web servers, etc) that cluster will make highly available. The canonical cluster resource manager
typically associated with Heartbeat is Pacemaker [http://www.clusterlabs.org], a highly scalable
and feature-rich implementation that supports both Heartbeat and, alternatively, the Corosync
[http://www.corosync.org] cluster messaging layer.

Note
Up until Heartbeat release 2.1.3, Pacemaker was developed jointly with Heartbeat as
part of the Linux-HA umbrella project. After this release, the Pacemaker project was
spun off as a separate project and continues as such, while maintaining full support
for Heartbeat cluster messaging.

2
Chapter 2. Components
The Heartbeat messaging layer comprises several interlocking but distinct components.

2.1. Communication�module
The Heartbeat communication module provides strongly authenticated, locally-ordered multicast
messaging over basically any media, IP-based or not. Heartbeat supports cluster communications
over the following network link types:

• unicast UDP over IPv4;

• broadcast UDP over IPv4;

• multicast UDP over IPv4;

• serial link communications.

Warning
Please consider some important caveats with regard to serial link connectivity (see
the ha.cf man page, man ha.cf). As a general rule: when in doubt, serial links
should be avoided.

Heartbeat can detect node failure reliably in less than a half-second. It will register with the system
watchdog timer if configured to do so.

The heartbeat layer has an API which provides the following classes of services:

• intra-cluster communication - sending and receiving packets to cluster nodes

• configuration queries

• connectivity information (who can the current node hear packets from) - both for queries and
state change notifications

• basic group membership services

2.2. Cluster�Consensus�Membership
CCM provides strongly connected consensus cluster membership services. It ensures that every
node in a computed membership can talk to every other node in this same membership. CCM
Implements both an OCF draft membership API, and the SAF AIS membership API. Typically it
computes membership in sub-second time.

2.3. Cluster�Plumbing�Library
The Cluster plumbing library is a collection of very useful functions which provide a variety of
services used by many of our main components. A few of the major objects provided by this
library include:

• compression API (with underlying compression plugins)

• Non-blocking logging API

• memory management oriented to continuously running services

3
Components

• Hierarchical name-value pair messaging facility promoting portability and version upgrade
compatibility (also provides optional message compression facilities)

• Signal unification - allowing signals to appear as mainloop events

• Core dump management utilities - promoting capture of core dumps in a uniform way, and
under all circumstances

• timers (like glib mainloop timers - but they work even when the time of day clock jumps)

• child process management - death of children causes invocation of process object, with
configurable death-of-child messages

• Triggers (arbitrary events triggered by software)

• Realtime management - setting and unsetting high priorities, and locked into memory
attributes of processes.

• 64-bit HZ-granularity time manipulation (longclock_t)

• User id management for security purposes, for processes which need some root privileges.

• Mainloop integration for IPC, plain file descriptors, signals, etc. This means that all these
different event sources are managed and dispatched consistently.

2.4. IPC�Library
All interprocess communication is performed using a very general IPC library which provides non-
blocking access to IPC using a flexible queueing strategy, and includes integrated flow control. This
IPC API does not require sockets, but the currently available implementations use UNIX (Local)
Domain sockets.

This API also includes built-in authentication and authorization of peer processes, and is portable
to most POSIX-like OSes. Although use of Glib mainloop with these APIs is not required, Heartbeat
provides simple and convenient integration with mainloop.

2.5. Non-blocking�logging�daemon
logd is Heartbeat’s logging daemon, capable of logging to a syslog daemon, to files, or both.
logd never blocks, instead, it discards messages lagging too far behind.

Once it is capable of outputting messages again, logd prints a lost message count. Queue sizes
are controllable both overall, and on a per-application basis.

4
Part II. Installing�Heartbeat
Chapter 3. Building�and�installing�from
source
Building and installing the Heartbeat cluster messaging layer from source amounts to building the
following packages:

• heartbeat itself;

• the cluster-glue package containing the Heartbeat local resource manager (LRM) and
STONITH plugins.

Since heartbeat has a build dependency on cluster-glue, it is necessary to build and install
cluster-glue first, before one proceeds with building and installing heartbeat.

3.1. Building�and�installing�Cluster�Glue�from
source
3.1.1. Cluster�Glue�build�prerequisites
Building Cluster Glue requires the presence of the following tools and libraries on the build system:

• A C compiler (typically gcc) and associated C development libraries;

• the flex scanner generator and the bison parser compiler;

• net-snmp development headers, to enable SNMP related functionality;

• OpenIPMI development headers, to enable IPMI related functionality;

• Python (just the language interpreter, not library headers).

Note
This list applies to the default software configuration. If you configure the source
with non-standard options, other dependencies may apply.

3.1.2. Downloading�Cluster�Glue�sources
Several options are available for retrieving the Cluster Glue source code, for building locally on
a target system.

Downloading�a�release�tarball
Downloading a released version of Heartbeat as a compressed tarball is equivalent to fetching
a tagged snapshot from the Mercurial source code repository. Release tags follow the format
glue-x.y.z, where x.y.z is the released version of Cluster Glue you wish to download.

If, for example, one wants to download the 1.0.1 release, the correct sequence of commands
would be:

# wget http://hg.linux-ha.org/glue/archive/glue-1.0.1.tar.bz2
# tar -vxjf glue-1.0.1.tar.bz2

6
Building and installing from source

Downloading�the�latest�Mercurial�snapshot
The latest development code is always available in the Mercurial repository as the tip revision.

To download a tarball auto-generated from the tip, use this sequence of commands:

# wget http://hg.linux-ha.org/glue/archive/tip.tar.bz2
# tar -vxjf tip.tar.bz2

Checking�out�sources�from�Mercurial
This is the method you would apply if you have the Mercurial utilities locally installed. Checking
out the sources amounts to cloning the repository:

$ hg clone http://hg.linux-ha.org/glue cluster-glue


requesting all changes
adding changesets
adding manifests
adding file changes
added 12491 changesets with 34830 changes to 2632 files
updating working directory
356 files updated, 0 files merged, 0 files removed, 0 files unresolved

3.1.3. Building�Cluster�Glue
Building Cluster Glue is an automated process making extensive use of GNU Autotools. When
building and installing on the same machine, it usually amounts to just the following sequence of
commands:

$ ./autogen.sh
$ ./configure
$ make
$ sudo make install

Note
The autogen.sh script is a convenience wrapper around automake,
autoheader, autoconf, and libtool.

A number of configuration options are supported, and you may tweak some of them to
optimize Heartbeat for your system. To retrieve a list of configuration options, you may invoke
configure with the --help option. A customized build may thus comprise these steps:

$ ./autogen.sh
$ ./configure --help
$ ./configure configuration-options
$ make
$ sudo make install

Some typical configuration options that you may wish to set are --prefix, --sysconfdir,
and --localstatedir, as shown in this example:

$ ./autogen.sh
$ ./configure --help
$ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
$ make
$ sudo make install

7
Building and installing from source

3.1.4. Building�Packages
RPM�packages
RPM spec files are provided in the Cluster Glue source tree both for SuSE and Red Hat based
distributions:

• cluster-glue-suse.spec should be used for OpenSUSE and SLES installations.

• cluster-glue-fedora.spec is for Fedora, Red Hat Enterprise Linux, and CentOS.

Debian�packages
Debian packaging for Cluster glue is maintained in a Mercurial repository on
alioth.debian.org [http://hg.debian.org/hg/debian-ha/cluster-glue/]. Thus, rather than
cloning or downloading from the upstream Mercurial repository, use the one hosted on alioth.

Once you have checked out or unpacked the source tree from the alioth repository, simply
invoke dpkg-buildpackage from the top of the source tree — like you would with any other
Debian package.

3.2. Building�and�installing�Heartbeat�from
source
3.2.1. Heartbeat�build�prerequisites
Building Heartbeat requires the presence of the following tools and libraries on the build system:

• A C compiler (typically gcc) and associated C development libraries;

• the flex scanner generator and the bison parser compiler;

• net-snmp development headers, to enable SNMP related functionality;

• OpenIPMI development headers, to enable IPMI related functionality;

• Python (just the language interpreter, not library headers)

• the cluster-glue development headers. See Section 3.1, “Building and installing Cluster
Glue from source” [6] for details on how to build these from source.

Note
This list applies to the default software configuration. If you configure the source
with non-standard options, other dependencies may apply.

3.2.2. Downloading�Heartbeat�sources
Downloading�a�release�tarball
Downloading a released version of Heartbeat as a compressed tarball is equivalent to fetching
a tagged snapshot from the Mercurial source code repository. Release tags follow the format
STABLE-x.y.z, where x.y.z is the released version of Heartbeat you wish to download.

If, for example, one wants to download the 3.0.4 release, the correct sequence of commands
would be:

8
Building and installing from source

# wget http://hg.linux-ha.org/dev/archive/STABLE-3.0.4.tar.bz2
# tar -vxjf STABLE-3.0.4.tar.bz2

Downloading�the�latest�Mercurial�snapshot
The latest development code is always available in the Mercurial repository as the tip revision.

To download a tarball auto-generated from the tip, use this sequence of commands:

# wget http://hg.linux-ha.org/dev/archive/tip.tar.bz2
# tar -vxjf tip.tar.bz2

Checking�out�sources�from�Mercurial
This is the method you would apply if you have the Mercurial utilities locally installed. Checking
out the sources amounts to cloning the repository:

$ hg clone http://hg.linux-ha.org/dev heartbeat-dev


requesting all changes
adding changesets
adding manifests
adding file changes
added 12491 changesets with 34830 changes to 2632 files
updating working directory
356 files updated, 0 files merged, 0 files removed, 0 files unresolved

3.2.3. Building�Heartbeat
Building Heartbeat is an automated process making extensive use of GNU Autotools. When
building and installing on the same machine, it usually amounts to just the following sequence of
commands:

$ ./bootstrap
$ ./configure
$ make
$ sudo make install

Note
The bootstrap script is a convenience wrapper around automake, autoheader,
autoconf, and libtool. ConfigureMe is a convenience wrapper for the
autoconf-generated configure script.

A number of configuration options are supported, and you may tweak some of them to
optimize Heartbeat for your system. To retrieve a list of configuration options, you may invoke
configure with the --help option. A customized build may thus comprise these steps:

$ ./bootstrap
$ ./configure --help
$ ./configure <configuration-options>
$ make
$ sudo make install

Some typical configuration options that you may wish to set are --prefix, --sysconfdir,
and --localstatedir, as shown in this example:

$ ./bootstrap
$ ./configure --help
$ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var

9
Building and installing from source

$ make
$ sudo make install

3.2.4. Building�Packages
RPM�packages
RPM spec files are provided in the Heartbeat source tree both for SuSE and Red Hat based
distributions:

heartbeat-suse.spec should be used for OpenSUSE and SLES installations. heartbeat-


fedora.spec is for Fedora, Red Hat Enterprise Linux, and CentOS.

Debian�packages
Debian packaging for Heartbeat is maintained in a Mercurial repository on
alioth.debian.org [http://hg.debian.org/hg/debian-ha/heartbeat/]. Thus, rather than
cloning or downloading from the upstream Mercurial repository, use the one hosted on alioth.

Once you have checked out or unpacked the source tree from the alioth repository, simply
invoke dpkg-buildpackage from the top of the source tree — like you would with any other
Debian package.

10
Chapter 4. Installing�pre-built�packages
Cluster Glue and Heartbeat are available as pre-built binary packages on a number of platforms,
including

• Debian (fully included in squeeze and up, backports packages are available for lenny);

• Ubuntu (since lucid);

• Fedora (since release 12);

• OpenSUSE (since release 11).

Commercially supported enterprise packages for Red Hat Enterprise Linux and SUSE Linux
Enterprise server are available from LINBIT [http://www.linbit.com].

This section describes the steps necessary to install binary packages on these platforms.

4.1. Debian�and�Ubuntu
Installing the cluster-glue and heartbeat packages on Debian and Ubuntu is a
straightforward process. Assuming you have the correct package repositories configured for APT,
install the two packages with the following commands:

aptitude install heartbeat cluster-glue

Since you most likely will also want to install Pacemaker (beyond the scope of this manual), do
so by issuing the following commands as well:

aptitude install cluster-agents pacemaker

4.2. Fedora,�RHEL�and�CentOS
On Red Hat platforms, you install the cluster-glue and heartbeat packages with the YUM
package manager. Assuming you have the correct package repositories configured in +/etc/
yum.repos.d/, install the two packages with the following commands:

yum install heartbeat cluster-glue

Since you most likely will also want to install Pacemaker (beyond the scope of this manual), do
so by issuing the following commands as well:

yum install resource-agents pacemaker

4.3. OpenSUSE�and�SLES
On SUSE platforms, you install the cluster-glue and heartbeat packages with the Zypper
package manager. Assuming you have the correct package repositories configured, install the two
packages with the following commands:

zypper install heartbeat cluster-glue

Since you most likely will also want to install Pacemaker (beyond the scope of this manual), do
so by issuing the following commands as well:

zypper install resource-agents pacemaker

11
Part III. Administrative�Tasks
Chapter 5. Creating�an�initial
Heartbeat�configuration
For any Heartbeat cluster, the following configuration files must be available:

• /etc/ha.d/ha.cf — the global cluster configuration file.

• /etc/ha.d/authkeys — a file containing keys for mutual node authentication.

5.1. The�ha.cf�file
The following example is a small and simple ha.cf file:

autojoin none
mcast bond0 239.0.0.43 694 1 0
bcast eth2
warntime 5
deadtime 15
initdead 60
keepalive 2
node alice
node bob
pacemaker respawn

Setting autojoin to none disables cluster node auto-discovery and requires that cluster nodes
be listed explicitly, using the node options. This speeds up cluster start-up in clusters with a fixed
small number of nodes.

This example assumes that bond0 is the cluster’s interface to the shared network, and that eth2
is the interface dedicated for DRBD replication between both nodes. Thus, bond0 can be used for
Multicast heartbeat, whereas on eth2 broadcast is acceptable as eth2 is not a shared network.

The next options configure node failure detection. They set the time after which Heartbeat issues
a warning that a no longer available peer node may be dead (warntime), the time after which
Heartbeat considers a node confirmed dead (deadtime), and the maximum time it waits for
other nodes to check in at cluster startup (initdead). keepalive sets the interval at which
Heartbeat keep-alive packets are sent. All these options are given in seconds.

The node option identifies cluster members. The option values listed here must match the exact
host names of cluster nodes as given by uname -n.

pacemaker respawn enables the Pacemaker cluster manager, and ensures that Pacemaker is
automatically restarted in case of a failure.

Note
Prior to Heartbeat release 3.0.4, the pacemaker keyword was named crm. Newer
versions still retain the old name as a compatibility alias, but pacemaker is the
preferred syntax.

5.2. The�authkeys�file
/etc/ha.d/authkeys contains pre-shared secrets used for mutual cluster node
authentication. It should only be readable by root and follows this format:

auth <num>

13
Creating an initial
Heartbeat configuration

<num> <algorithm> <secret>

num is a simple key index, starting with 1. Usually, you will only have one key in your authkeys
file.

algorithm is the signature algorithm being used. You may use either md5 or sha1; the use of
crc (a simple cyclic redundancy check, not secure) is not recommended.

secret is the actual authentication key.

You may create an authkeys file, using a generated secret, with the following shell hack:

( echo -ne "auth 1\n1 sha1 "; \


dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
> /etc/ha.d/authkeys
chmod 0600 /etc/ha.d/authkeys

5.3. Propagating�the�cluster�configuration�to
cluster�nodes
In order to propagate the contents of the ha.cf and authkeys configuration files, you may use the
ha_propagate command, which you would invoke using either

/usr/lib/heartbeat/ha_propagate

or

/usr/lib64/heartbeat/ha_propagate

This utility will copy the configuration files over to any node listed in /etc/ha.d/ha.cf using
scp. It will afterwards also connect to the nodes using ssh and issue chkconfig heartbeat
on in order to enable Heartbeat services on system startup.

5.4. Starting�Heartbeat�services
The Heartbeat services are started just as you would any other system service on your machine.
Depending on your system platform, you might be using one of the following commands:

/etc/init.d/heartbeat start

service heartbeat start

rcheartbeat start

Through the pacemaker entry in your ha.cf [13], Heartbeat will now start the Pacemaker
daemon (named crmd for historical reasons) along with the rest of its services. After a few
seconds, you should be able to detect Heartbeat processes in your process table:

# ps -AHfww | grep heartbeat


root 2772 1639 0 14:27 pts/0 00:00:00 grep heartbeat
root 4175 1 0 Nov08 ? 00:37:57 heartbeat: master control proc
root 4224 4175 0 Nov08 ? 00:01:13 heartbeat: FIFO reader
root 4227 4175 0 Nov08 ? 00:01:28 heartbeat: write: bcast eth2
root 4228 4175 0 Nov08 ? 00:01:29 heartbeat: read: bcast eth2
root 4229 4175 0 Nov08 ? 00:01:35 heartbeat: write: mcast bond
root 4230 4175 0 Nov08 ? 00:01:32 heartbeat: read: mcast bond0
102 4233 4175 0 Nov08 ? 00:03:37 /usr/lib/heartbeat/ccm
102 4234 4175 0 Nov08 ? 00:15:02 /usr/lib/heartbeat/cib

14
Creating an initial
Heartbeat configuration

root 4235 4175 0 Nov08 ? 00:17:14 /usr/lib/heartbeat/lrmd -r


root 4236 4175 0 Nov08 ? 00:02:48 /usr/lib/heartbeat/stonithd
102 4237 4175 0 Nov08 ? 00:00:54 /usr/lib/heartbeat/attrd
102 4238 4175 0 Nov08 ? 00:08:32 /usr/lib/heartbeat/crmd
102 5724 4238 0 Nov08 ? 00:04:47 /usr/lib/heartbeat/pengine

Finally, you should also be able to confirm that the cluster is working, via the Pacemaker crm_mon
command:

# crm_mon -1
============
Last updated: Mon Dec 13 14:29:36 2010
Stack: Heartbeat
Current DC: alice (083146b9-6e26-4ac8-a705-317095d0ba57) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
24 Resources configured.
============

Online: [ alice bob ]

5.5. Where�to�go�from�here
Now that you have a working Heartbeat configuration, you will want to continue with configuring
Pacemaker and adding cluster resources.

The following documents are available for your perusal:

• Clusters From Scratch [http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/


Clusters_from_Scratch/] is an excellent guide for configuring Pacemaker clusters. The
document primarily covers Pacemaker on Corosync, but starting with its chapter "Using
Pacemaker Tools" applies identically to Heartbeat clusters.

• The DRBD User’s Guide [http://www.drbd.org/users-guide/] has a chapter dedicated to


integrating DRBD with Pacemaker clusters.

• Several Technical Guides covering Heartbeat/Pacemaker clusters for a variety of applications


are available from the LINBIT web site [http://www.linbit.com/en/education/tech-guides].

15
Chapter 6. Upgrading�from�previous
Heartbeat�versions
6.1. Upgrading�from�Heartbeat�2.1�clusters�not
using�the�CRM
For Heartbeat 2.1 clusters not using the CRM (i.e., clusters configured with the haresources
file, an upgrade to 3.0 involves converting the current configuration to one that is suitable for
Pacemaker.

Note
This upgrade procedure does incur application down time. However, when the
upgrade is properly planned, tested, and executed, this down time amounts to
minutes, possibly even seconds (depending on configuration).

6.1.1. Stopping�Heartbeat�services
You should commence the upgrade procedure on your current standby node, that is, the cluster
node currently not running any resources. If your cluster is using an active-active configuration
(both nodes running resources), select one and issue the following command to transfer all
resources to the peer node:

# hb_standby

Then, and on that node only, stop Heartbeat services:

# /etc/init.d/heartbeat stop

6.1.2. Upgrade�software
While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been split
up into modular parts. Thus you will replace Heartbeat with three individual pieces of software:
Cluster Glue, Pacemaker, and Heartbeat 3 which comprises just the cluster messaging layer.

• Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run make
uninstall. Then, install Cluster Glue and Heartbeat.

• Upgrading using locally built packages: When installing packages manually, uninstall the
heartbeat package first. Then install cluster-glue, the version 3 heartbeat package,
resource-agents, and pacemaker.

• Upgrading using a package repository: When upgrading using an APT, YUM, or Zypper
repository, you should just be able to run the install command for heartbeat version 3 and
pacemaker, and the dependencies will be resolved automatically.

Do not restart Heartbeat services at this point.

6.1.3. Enabling�the�Heartbeat�cluster�to�use�Pacemaker
The cluster messaging layer must now be instructed to start Pacemaker on cluster start-up. To
do so, add

16
Upgrading from previous
Heartbeat versions

crm respawn

to your ha.cf configuration file.

Important
At this point, you should also check your ha.cf file against the ha.cf(5) manual
page, and remove any deprecated options.

When your ha.cf modifications are complete, copy the file to the peer node.

6.1.4. Restarting�Heartbeat
Your cluster is now ready to be restarted in Pacemaker-enabled mode. To do so:

1. Run /etc/init.d/heartbeat stop on your still-active node. This will shutdown your
cluster resources.

2. Run /etc/init.d/heartbeat start on your standby node (the one where you created
your CIB). This will start the local Heartbeat instance and Pacemaker, and wait for other cluster
nodes to check in.

3. Run /etc/init.d/heartbeat start on your the other node. This will start the local
Heartbeat instance and Pacemaker, fetch the CIB automatically, and start applications.

6.2. Upgrading�from�CRM-enabled�Heartbeat
2.1�clusters
This section outlines the steps necessary to upgrade a Heartbeat 2.1 cluster with the built-in
CRM enabled, to Heartbeat 3.0 with Pacemaker.

Note
When properly planned and executed, the upgrade procedure can be completed in
a manner of minutes, with no application down time. It is strongly recommended
to read and understand the steps outlined in this section before attempting a
production cluster uqpgrade. All commands stated in this section must be run as
root. Do not perform the individual steps in parallel on all of your cluster nodes.
Instead, complete the procedure on each node before continuing with the next.

6.2.1. Placing�the�cluster�in�unmanaged�mode
With this step, the cluster temporarily relinquishes control of its resources. This means that the
cluster no longer monitors its nodes or resources for the duration of the upgrade, and will not
rectify and application or node failures during this time. Currently running resources, however,
will continue to run.

# crm_attribute -t crm_config -n is_managed_default -v false

In most configurations, individual resources do not set the is_managed attribute individually,
and hence the cluster-wide attribute is_managed_default applies to all of them.

If in your specific configuration you do have resources that have this attribute set, you should
remove it to make sure the default applies:

# crm_resource -t primitive -r <resource-id> -p is_managed -D

17
Upgrading from previous
Heartbeat versions

6.2.2. Backing�up�the�CIB
At this point, is is important to save a copy of the Cluster Information Base (CIB). The CIB to save
is stored in a file named cib.xml, normally located in /var/lib/heartbeat/crm.

# cp /var/lib/heartbeat/crm/cib.xml ~

You need to perform this step on only one node currently connected to the cluster. Do not delete
this file, it will be restored later.

6.2.3. Stopping�Heartbeat�services
You may now stop Heartbeat with /etc/init.d/heartbeat stop or the preferred
command to stop a system service on your distribution (service heartbeat stop,
rcheartbeat stop, etc.).

In case you are running a legacy version of Heartbeat affected by a shutdown bug, then graceful
crmd shutdown will not work properly in unmanaged mode.

In this case, after you have initiated a graceful service shutdown with the above command, kill
the crmd process manually:

• use ps -AHfww to retrieve the process ID of crmd;

• kill crmd with a TERM signal.

6.2.4. Wiping�files�related�to�the�CRM
Warning
Before proceeding with this section, verify that you have created a backup copy of
the CIB on one of your cluster nodes, as outlined in Section 6.2.2, “Backing up the
CIB” [18].

You should now wipe local CRM-related files from your node. To do so, remove all files from the
directory where the CRM stores CIB information, commonly /var/lib/heartbeat/crm.

# rm /var/lib/heartbeat/crm/*

6.2.5. Restoring�the�CIB
Note
You should only proceed with this step if Heartbeat is still stopped on all cluster
nodes, and all cluster nodes have had their CIB contents wiped. If you still have
remaining nodes that have a residual CIB configuration, proceed as outlined in
Section 6.2.4, “Wiping files related to the CRM” [18].

Restoring the CIB means copying the CIB backup described in Section  6.2.2, “Backing up the
CIB” [18] to the /var/lib/heartbeat/crm directory.

# cp ~/cib.xml /var/lib/heartbeat/crm/cib.xml
# chown hacluster:haclient /var/lib/heartbeat/crm/cib.xml
# chmod 0600 /var/lib/heartbeat/crm/cib.xml

You must perform this step on one node only, namely the first node on which you are about to
upgrade the cluster software. On all other nodes, the /var/lib/heartbeat/crm directory
must remain empty — Pacemaker distributes the CIB automatically.

18
Upgrading from previous
Heartbeat versions

6.2.6. Upgrading�software
While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been split
up into modular parts. Thus you will replace Heartbeat with three individual pieces of software:
Cluster Glue, Pacemaker, and Heartbeat 3 which comprises just the cluster messaging layer.

• Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run make
uninstall. Then, install Cluster Glue and Heartbeat.

• Upgrading using locally built packages: When installing packages manually, uninstall the
heartbeat package first. Then install cluster-glue, the version 3 heartbeat package,
resource-agents, and pacemaker.

• Upgrading using a package repository: When upgrading using an APT, YUM, or Zypper
repository, you should just be able to run the install command for heartbeat version 3 and
pacemaker, and the dependencies will be resolved automatically.

If this is the last node to be upgraded in your cluster, and your package management system
did not restart Heartbeat services after the software upgrade, you should now proceed to
Section  6.2.7, “Restarting Heartbeat services” [19]. Otherwise, you should move to the
next node and proceed as outlined in Section  6.1.1, “Stopping Heartbeat services” [16]
Section 6.2.6, “Upgrading software” [19].

6.2.7. Restarting�Heartbeat�services
Note
This step may be omitted in case your package management system automatically
restarts Heartbeat services during post-install.

First, restart Heartbeat on the node where you restored the CIB (see Section 6.2.5, “Restoring
the CIB” [18]) with /etc/init.d/heartbeat start. Then, repeat this command on the
remaining cluster nodes. At this time,

• the cluster is still in unmanaged mode (meaning it does not start, stop, or monitor any
resources),

• the cluster redistributes the old CIB among its nodes, and

• the cluster is still using the pre-upgrade CIB schema.

6.2.8. Returning�the�cluster�to�managed�mode
Once the cluster software has been upgraded, it is recommended to return the cluster into
managed mode:

# crm_attribute -t crm_config -n is_managed_default -v true

6.2.9. Upgrading�the�CIB�schema
Although an upgraded cluster can theoretically operate on the pre-upgrade CIB schema
indefinitely, it is strongly recommended to upgrade the CIB to the current schema. To do so,
run the following command after cluster communications between all nodes have been re-
established:

# cibadmin --upgrade --force

19
Part IV. Getting�Help�and�Helping�Out
Chapter 7. Reporting�problems
This chapter describes the appropriate channels for dealing with bugs and issues releated to the
software.

7.1. Mailing�Lists
The Linux-HA project maintains two mailing lists. Both are subscriber-only lists, hosted at
lists.linux-ha.org.

7.1.1. Linux-HA
This is the general-purpose community mailing list. Post here if you have a general question about
Heartbeat or the resource agents. The post address for this list is linux-ha@lists.linux-ha.org
[mailto:linux-ha@lists.linux-ha.org].

7.1.2. Linux-HA-dev
This is the developer mailing list. Post here if you have enhancement suggestions or think you
have found a bug. The post address for this list is linux-ha-dev@lists.linux-ha.org [mailto:linux-
ha-dev@lists.linux-ha.org].

7.2. Bug�Tracking�System
The Linux-HA project uses a public Bugzilla installation run by the Linux Foundation at http://
developerbugs.linuxfoundation.org. To report bugs or comment on existing entries, you must
create an account, which requires a valid email address.

7.3. IRC
Linux-HA developers can usually be found on the irc.freenode.net server, in the #linux-
ha channel.

21
Chapter 8. Submitting�patches
If you have found a problem in Cluster Glue or Heartbeat, and you are able to fix it, the developers
would love to see a patch from you. To do so, please follow the steps outlined in this section.

Create a working copy (a Mercurial clone) of the upstream repository with the following
command:

hg clone http://hg.linux-ha.org/dev heartbeat

Create a new Mercurial queue, and a new patchset:

cd heartbeat
hg qinit
hg qnew --edit --force fix-superfrobnication

Note
The --force option rolls your local modications into the patch set you are just
creating.

In your patch message, be sure to include a meaningful description, for example:

High: IPC: fix superfrobnication on 63-bit platforms

In the IPC layer, superfrobnication breaks on 63-bit middle-endian


Linux platforms when ACPI is disabled. Fencepost align
foobar_create_reqqueue() and guard against memory spinlock starvation
to fix this.

Now the patch set is good for review on the mailing list:

hg email --to=linux-ha-dev@lists.linux-ha.org fix-superfrobnication

Once your patch has been accepted for merging, one of the upstream developers will push it into
the repository. At that point, you can update your checkout from upstream, and remove your
own patch set.

hg qpop -a
hg pull --update
hg qdelete fix-superfrobnication

22