Implementation
Chapter 1 General Configuration
    Configure the cluster
    Configure the shared disks
Create the message broker - HACMP/ServiceGuard/VCS/Linux-HA
Place the broker under cluster control
    HACMP/ServiceGuard
    VCS
    Linux-HA Control
Create the UNS Group
    Create and Configure the UNS Queue Manager
    Create the Clustered UNS
Message Broker
    Create the Message Broker Group
    Configure the Message Brokers Queue Manager
    Configure the DB2 instance and database
    Configure the Message Broker
    WMB Toolkit
Comments
Concepts
Chapter 1 Introduction
Concepts
High availability cluster software is available from a number of vendors for various platforms, as listed below:

HACMP for AIX 5.2 & 5.3
ServiceGuard for HP-UX 11.11i
MSCS for Windows Server 2003 Enterprise Edition
Linux-HA
Veritas Cluster Server (available for many platforms; Solaris is used in this document)

Clustering servers enables applications to be distributed across, and moved between, a number of physical servers, thus providing redundancy and fault resilience for business-critical applications. WebSphere Message Broker (WMB) provides services based on message brokers to route, transform, store, modify and publish messages from WebSphere MQ and other transports. By using WebSphere MQ, WMB and HA software together, it is possible to further enhance the availability of a WMB Configuration Manager, broker or User Name Server (UNS). With a suitably configured cluster, failures of power supplies, nodes, disks, disk controllers, networks, network adapters or critical processes can be detected and automatically trigger recovery procedures to bring an affected service back online as quickly as possible.

This SupportPac provides notes and example scripts to assist with the installation and configuration of WebSphere Message Broker in an HA environment. Used in conjunction with the documentation for WebSphere MQ, DB2 and WMB, it shows how to create and configure a WMB Configuration Manager, brokers and a UNS such that they are amenable to operation within an HA cluster. This SupportPac provides only brief descriptions of how to configure queue managers or database instances for HA operation. WebSphere MQ SupportPac MC91 (High Availability for WebSphere MQ on UNIX platforms) and the DB2 product documentation should be consulted for more detailed assistance relating to queue managers and database instances.

A WMB broker requires a database to store runtime data. This database can be DB2, Oracle or Sybase. The testing of this SupportPac used DB2 for the broker repository. With suitable conversion or substitution it would be possible to apply the instructions contained in this SupportPac to other suitable database management systems, but this is beyond the scope of this document. The instructions could also be used for databases other than the broker database, such as the NEON Repository.

This SupportPac does not include details of how to configure redundant power supplies, redundant disk controllers, disk mirroring or multiple network or adapter configurations. The reader is referred to the HA software documentation for assistance with these topics.
WebSphere MQ queue manager clusters can also improve the availability of messaging, because following a failure of a queue manager, messaging applications can still access surviving instances of a cluster queue. Whilst WebSphere MQ can be configured to provide automatic detection of queue manager failure and automatic triggering of local restart, it does not include the ability to fail a queue manager over to a surviving node. HA clusters provide monitoring and local restart and also support failover. The two types of cluster can be used together to good effect.
Cluster Configurations
An HA cluster contains multiple machines (nodes), which host resource/service groups. A resource group is a means of binding together related resources which must be co-located on the same node, but which can be moved collectively from one node to another. It is possible to construct a number of cluster topologies, including simple clustered pairs, rings, or "N+1" standby topologies. The ability to move resource groups from one node to another allows you to run highly available workload on multiple nodes simultaneously. It is also possible to create configurations in which one or more nodes act as standby nodes, which may be running other workload if desired. This SupportPac can be used to help set up either standby or takeover configurations, including mutual takeover where all cluster nodes are running WMB workload.

A standby configuration is the most basic cluster configuration, in which one node performs work whilst the other node acts only as standby. The standby node does not perform work and is referred to as idle; this configuration is sometimes called "cold standby". Such a configuration requires a high degree of hardware redundancy. To economise on hardware, it is possible to extend this configuration to have multiple worker nodes with a single standby node, the idea being that the standby node can take over the work of any worker node. This is still referred to as a standby configuration and sometimes as an "N+1" configuration.

A takeover configuration is a more advanced configuration in which all nodes perform some kind of work and critical work can be taken over in the event of a node failure. A "one-sided takeover" configuration is one in which a standby node performs some additional, non-critical and non-movable work. This is rather like a standby configuration but with (non-critical) work being performed by the standby node. A "mutual takeover" configuration is one in which all nodes are performing highly available (movable) work. This type of cluster configuration is sometimes referred to as "Active/Active" to indicate that all nodes are actively processing critical workload.

With the extended standby configuration or either of the takeover configurations, it is important to consider the peak load which may be placed on any node which can take over the work of other nodes. Such a node must possess sufficient capacity to maintain an acceptable level of performance.

HACMP, VCS, ServiceGuard, Heartbeat and MSCS all use a "shared nothing" clustering architecture. A shared nothing cluster has no concurrently shared resources, and works by transferring ownership of resources from one node to another, to work around failures or in response to operator commands. Resources are things like disks, network addresses, or critical processes. Critical data is stored on external disks which can be owned by either of the nodes in the cluster. Such disks are sometimes referred to as "shared disks", but because each disk can be owned by only one node at a time, clusters which use such an architecture are strictly "shared nothing" clusters. Figure 1 shows a generic shared nothing cluster.
[Figure 1 (diagram): two cluster nodes, each with internal disks and a virtual IP address, host critical processes; the critical data is held on shared disks, and remote clients and servers connect via the virtual IP addresses.]
Figure 1 - A shared nothing cluster

HACMP

HACMP provides the ability to group resources, so that they can be kept together and can be either restarted on the same node that they were running on, or failed over to another node. Depending on the type of resource group chosen, it is also possible to control whether a group will move back to its former node when that node is restarted. Whilst resource groups define which resources must be co-located, it is application servers and application monitors that allow HACMP to control and monitor services such as queue managers, database instances, message brokers or a UNS.

This SupportPac contains scripts that can be used to configure one or more application servers that each contain a WMB broker or UNS. The SupportPac also contains scripts that enable each application server to be monitored. The scripts can be used as shipped, or as a basis from which to develop your own scripts. Whether the scripts are being used to manage brokers, configuration managers or a UNS, they rely on the scripts from WebSphere MQ SupportPac MC91 to manage queue managers, and when managing a broker they also rely on the use of HACMP scripts to manage the database instance containing the broker database. HACMP scripts to manage database instances are available separately. For example, IBM DB2 for AIX includes a set of example scripts for HACMP operations.

The example scripts supplied with this SupportPac contain the logic necessary to allow one or more brokers, one or more configuration managers and one UNS to be run within a cluster. It is possible to configure multiple resource groups. This, combined with the ability to move resource groups between nodes, allows you to simultaneously run highly available brokers on the cluster nodes, subject only to the constraint that each broker can only run on a node that can access the necessary shared disks. This constraint is enforced by the resource group, which has a list of which nodes it can run on and the volume groups it owns. In the case of the UNS, you can only configure one UNS per cluster. This is because the name "UserNameServer" is fixed, which would cause different UNSs to interfere if you tried to configure them simultaneously on one node [1]. The script used for creating an HA compliant UNS enforces this constraint.

This SupportPac also includes application monitor scripts which will allow HACMP to monitor the health of brokers (and their queue managers and database instances), configuration managers and the UNS (and its queue manager) and initiate recovery actions that you configure, including the ability to attempt to restart these components. HACMP only permits one application server in a resource group to have an application monitor. If you wished to configure a resource group which contained multiple brokers, or a combination of a broker and a UNS, then you could combine elements of the example scripts to enable you to run a more complex application server. You would also need to combine elements of the application monitor scripts. This approach is not recommended; it is preferable to use separate resource groups, as described in Architectural guidelines.
VCS

Whilst service groups define which resources must be co-located, it is agents that allow VCS to monitor and control the running of services such as queue managers, database instances and brokers. This SupportPac contains three example agents:

MQSIBroker - manages WMB brokers
MQSIConfigMgr - manages WMB Configuration Managers
MQSIUNS - manages a WMB User Name Server (UNS)

The agents can be used as shipped, or as a basis from which to develop your own agents. Both the MQSIBroker and MQSIUNS agents rely on the use of the MQM agent available in WebSphere MQ SupportPac MC91, and the MQSIBroker agent also relies on the use of an appropriate agent to manage the database instance containing the broker database. Database agents are available from VERITAS or the database vendor. IBM provides an agent which supports DB2 version 7.2 onwards.

The example agents contain the logic necessary to allow one or more brokers, one or more configuration managers and one UNS to be run within a service group. The agents can also be used by multiple service groups. This, combined with the ability to move service groups between systems, allows you to simultaneously run highly available brokers on the systems within the cluster, subject only to the constraint that each broker can only run on a system that can access the shared disks which contain the files used by the queue manager and database instance used by the broker. This constraint is enforced by VCS.

HP ServiceGuard

This SupportPac contains scripts that can be used to configure one or more application servers that each contain a broker, configuration manager or UNS. The SupportPac also contains scripts that enable each application server to be monitored. The scripts can be used as shipped, or as a basis from which to develop your own scripts. Whether the scripts are being used to manage brokers, configuration managers or a UNS, they rely on the scripts from WebSphere MQ SupportPac MC91 to manage queue managers, and when managing a broker they also rely on the use of MC/ServiceGuard scripts to manage the database instance containing the broker database. MC/ServiceGuard scripts to manage database instances are available separately.

The example scripts supplied with this SupportPac contain the logic necessary to allow one or more brokers, one or more configuration managers and one UNS to be run within a cluster. This is described fully in Chapter 4. It is possible to configure multiple resource groups. This, combined with the ability to move resource groups between nodes, allows you to simultaneously run highly available brokers on the cluster nodes, subject only to the constraint that each broker can only run on a node that can access the necessary shared disks. This constraint is enforced by the package, which has a list of which nodes it can run on and the volume groups it owns. In the case of the UNS, you can only configure one UNS per cluster. This is because the name "UserNameServer" is fixed, which would cause different UNSs to interfere if you tried to configure them simultaneously on one node [1]. The script used for creating an HA compliant UNS enforces this constraint.

[1] Technically, you could partition the cluster into non-overlapping sets of nodes and configure multiple resource groups, each confined to one of these sets. Each resource group could then support a UNS. However, this is not recommended.

Linux-HA (Heartbeat)

Heartbeat provides the ability to group resources, so that they can be kept together and can be either restarted on the same node that they were running on, or failed over to another node. Depending on the type of resource group chosen, it is also possible to control whether a group will move back to its former node when that node is restarted. Whilst resource groups define which resources must be co-located, it is agents and application monitors that allow Heartbeat to control and monitor services such as queue managers, database instances, message brokers, configuration managers or a UNS. These scripts do not rely on any other SupportPacs. They do not currently include scripts which allow DB2 to be placed under HA control. If you would like to put DB2 under the control of Heartbeat, please refer to the DB2 documentation, for example, Open Source Linux High Availability for IBM DB2 Universal Database - Implementation Guide [2]. For the rest of this document it is presumed that the DB2 instance is located on a machine outside of Heartbeat's control.

[2] This is a whitepaper available from the DB2 for Linux web site - http://www306.ibm.com/software/data/db2/linux/papers.html

The example scripts supplied with this SupportPac contain the logic necessary to allow one or more components to be run within a cluster. This is described fully in Chapter 4. It is possible to configure multiple resource groups. This, combined with the ability to move resource groups between nodes, allows you to simultaneously run highly available brokers on the cluster nodes, subject only to the constraint that each broker can only run on a node that can access the necessary shared disks. This constraint is enforced by the resource group, which has a list of which nodes it can run on. In the case of the UNS, you can only configure one UNS per cluster. This is because the name "UserNameServer" is fixed, which would cause different UNSs to interfere if you tried to configure them simultaneously on one node [1]. The script used for creating an HA compliant UNS enforces this constraint.

If you are using Heartbeat V2 then this SupportPac also includes application monitor scripts and agents which will allow Heartbeat to monitor the health of components (and their queue managers and database instances) and initiate recovery actions that you configure, including the ability to attempt to restart these components. Heartbeat provides the ability to run a number of monitors within a single resource group. If you wished to configure a resource group which contained multiple brokers, or a combination of a broker and a UNS, then you could combine elements of the example scripts to enable you to run a more complex agent. You would also need to combine elements of the application monitor scripts.
This approach is not recommended; it is preferable to use separate resource groups, as described in Chapter 4.

Microsoft Cluster Services

Microsoft Cluster Services provides the ability to group resources, so that they can be kept together and can be either restarted on the same node that they were running on, or failed over to another node. Depending on the type of MSCS group chosen, it is also possible to control whether a group will move back to its former node when that node is restarted.
Whilst MSCS groups define which resources must be co-located, it is service resources that allow MSCS to control and monitor services such as queue managers, database instances, message brokers, configuration managers or a UNS. There are no scripts supplied with this SupportPac for MSCS; all the actions are GUI driven within MSCS. This SupportPac contains the instructions necessary to allow one or more brokers, one or more configuration managers and one UNS to be run within a cluster. This is described fully in Chapter 4. It is possible to configure multiple resource groups. This, combined with the ability to move resource groups between nodes, allows you to simultaneously run highly available brokers on the cluster nodes, subject only to the constraint that each broker can only run on a node that can access the necessary shared disks. This constraint is enforced by the resource group, which has a list of which nodes it can run on and the volume groups it owns. In the case of the UNS, you can only configure one UNS per cluster. This is because the name "UserNameServer" is fixed, which would cause different UNSs to interfere if you tried to configure them simultaneously on one node.
Network connections
Clustered services, such as WebSphere MQ queue managers, are configured to use virtual IP addresses which are under cluster control. When a clustered service moves from one cluster node to the other, it takes its virtual IP address with it. The virtual IP address is different to the stationary physical IP address that is assigned to a cluster node. Remote clients and servers which need to communicate with clustered services must be configured to connect to the virtual IP address, and must be written such that they can tolerate a broken connection by repeatedly trying to reconnect.
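As a minimal sketch of such a retry loop, the following uses the amqsputc sample shipped with the WebSphere MQ client; the virtual IP 192.168.1.11, listener port 1414, the default SVRCONN channel, queue manager ha.csq1 and queue TEST.Q are all illustrative assumptions:

  # Point the MQ client at the service's virtual IP address, not a node address
  export MQSERVER='SYSTEM.DEF.SVRCONN/TCP/192.168.1.11(1414)'
  # Keep retrying until the queue manager is reachable again after a failover
  until echo "ping" | amqsputc TEST.Q ha.csq1
  do
      echo "Connection failed - retrying in 10 seconds" >&2
      sleep 10
  done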
Chapter 2 Requirements
Software Requirements
WebSphere Message Broker V6 FixPack 1 or higher
WebSphere MQ V5.3 FixPack 8 or higher
DB2 V8.1 or above
WebSphere MQ SupportPac MC91: High Availability for WebSphere MQ on UNIX Platforms (not required for Linux-HA)
Networks
The SupportPac has been tested with TCP/IP public networks, and serial private networks. TCP/IP networks could be used for both public and private networks. It would also be possible to configure HACMP to handle SNA networks.
VCS
Sun Solaris 8 or 9
VCS 4.0 or above
HP/ServiceGuard
HP-UX Version 11i
MC/ServiceGuard Version 11.1
Linux-HA
SLES 9
Heartbeat V2.0.5
MSCS
Windows Server 2003 Enterprise Edition
Microsoft Cluster Server (MSCS) components
VCS agents are installed in subdirectories of /opt/VRTSvcs/bin, one per resource type, alongside the bundled agents. This layout of directories is assumed by the example agent methods. You could use a different layout if you wanted to, but you would need to change some of the example scripts.

Download the SupportPac onto each of the cluster systems into the /opt/VRTSvcs/bin directory and uncompress and untar it. This will create peer directories called MQSIBroker, MQSIConfigMgr and MQSIUNS. Ensure that the necessary scripts in each of these subdirectories are executable, by issuing:

cd /opt/VRTSvcs/bin/MQSIBroker
chmod +x clean explain ha* monitor offline online
cd /opt/VRTSvcs/bin/MQSIConfigMgr
chmod +x clean explain ha* monitor offline online
cd /opt/VRTSvcs/bin/MQSIUNS
chmod +x clean explain ha* monitor offline online

The agent methods are written in Perl. You need to copy or link the ScriptAgent binary (supplied as part of VCS) into each of the MQSIBroker, MQSIConfigMgr and MQSIUNS directories, as follows:

cd /opt/VRTSvcs/bin
cp ScriptAgent MQSIBroker/MQSIBrokerAgent
cp ScriptAgent MQSIConfigMgr/MQSIConfigMgrAgent
cp ScriptAgent MQSIUNS/MQSIUNSAgent

The MQSIBroker, MQSIConfigMgr and MQSIUNS resource types need to be added to the cluster configuration file. This can be done using the VCS GUI or ha* commands while the cluster is running, or by editing the types.cf file with the cluster stopped. If you choose to do this by editing the types.cf file, stop the cluster and edit the /etc/VRTSvcs/conf/config/types.cf file by appending the MQSIBroker, MQSIConfigMgr and MQSIUNS type definitions. For convenience, these definitions can be copied directly from the types.MQSIBroker, types.MQSIConfigMgr and types.MQSIUNS files, in each of the agent directories. The default parameter settings provide suggested values for the OnlineWaitLimit, OfflineTimeout and LogLevel attributes of the resource types. See Appendix A for more details.

Configure and restart the cluster and check that the new resource types are recognized correctly by issuing the following commands:

hatype -display MQSIBroker
hatype -display MQSIConfigMgr
hatype -display MQSIUNS
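For reference, a VCS resource type definition in types.cf takes the following general form. This is a minimal sketch only: the attribute names (BrokerName, QMgrName, UserName) and defaults are illustrative assumptions, not the shipped definition; use the types.MQSIBroker file from the SupportPac as the authoritative source:

  type MQSIBroker (
      static str ArgList[] = { BrokerName, QMgrName, UserName }
      static int OnlineWaitLimit = 3
      static int OfflineTimeout = 120
      str BrokerName
      str QMgrName
      str UserName
      str LogLevel = "info"
  )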
Next copy the files from the <temp directory>/resource.d directory to the /etc/ha.d/resource.d directory. All of the copied scripts need to have executable permission. The easiest way to do this is to change to that directory and run:

chmod +x mqsi* mqm*
[Figure 2 (diagram): the Message Broker, Configuration Manager and Broker Database components and their recommended placement relative to the cluster.]

Figure 2 - Inclusion of WMB components in the cluster

Figure 2 shows the relationships between these components and their recommended placement relative to the cluster. The double boxes show which components have an extra instance if an additional broker is configured. Before proceeding, the reader is advised to study the diagram and consider which components of their solution architecture they wish to place on the cluster nodes and which components will be run remotely. The components which you decide to run on the cluster nodes need to be put under HA control. Components which are run outside the cluster may also need to be made highly available. They could, for example, be put into a separate cluster. The contents of this document could be used in conjunction with the product documentation as a basis for how to do that. Alternatively the remote components could be made highly available in other ways, but it is outside the scope of this document to describe this.

The unit of failover is a resource group in HACMP, a package in ServiceGuard, a service group in VCS, a resource group in Heartbeat and a group in MSCS. For the remainder of this document the term resource group will be used to represent a unit of failover. A resource group is a collection of resources (e.g. disks, IP addresses, processes) needed to deliver a highly available service. Ideally a group should contain only those processes and resources needed for a particular instance of a service. This approach maximises the independence of each resource group, providing flexibility and minimising disruption during a failure or planned maintenance.

The remainder of this chapter describes the Message Broker, Configuration Manager and UNS components in more detail and then describes the steps needed to configure the cluster.
[Figure (diagram): the ConfigMgr group. Each cluster node (Node A, Node B) can host a ConfigMgr group containing the Configuration Manager (with its internal DB), its queue manager, an IP address and shared disks.]
Broker component
A WMB broker relies on a WebSphere MQ queue manager and a database, referred to as the broker database, which in turn depend on other lower level resources, such as disks and IP addresses. The broker database might be run in a database instance on the same node as the message broker, in which case the database instance and its lower level dependencies must be failed over with the message broker. Alternatively the broker database might be run in a remote instance accessed using a remote ODBC connection, in which case it is necessary to ensure that the database is accessible from either cluster node, so that the message broker can operate correctly on either node.

The smallest unit of failover of a WMB message broker is the broker service together with the WebSphere MQ queue manager upon which it depends and the broker database, if run locally. The optimal configuration of brokers and groups is to place each broker in a separate group, with the resources upon which it depends. Additional brokers should be configured to use separate database instances and be placed in separate groups. The UNS and Configuration Manager should also be placed in separate groups. A broker group must contain the message broker, the broker's queue manager and the queue manager's shared disks and IP address. If the broker database is run locally then the broker group also needs to contain the database instance and its shared disks. The resource groups that might therefore be constructed, depending on your placement of broker databases, are as shown in Figure 4 and Figure 5.

You could put multiple message brokers (and associated resources) into a group, but if you did they would all have to fail over to another node together, even if the problem causing the failover were confined to one message broker or its dependencies. This would cause unnecessary disruption to other message brokers in the same group. HACMP users who wish to use application monitoring should also note the restriction that only one application server in a resource group can be monitored. If you wanted to monitor multiple brokers in the same group, you would need to edit the example scripts so that the brokers were in the same application server.

On UNIX platforms, a message broker makes use of the /var/mqsi/components/<broker> directory and the /var/mqsi/registry/<broker> directory, so these directories need to be on shared disks in addition to the directories used by the queue manager and broker database instance. A broker runs as a pair of processes, called bipservice and bipbroker. The latter in turn creates the execution groups that run message flows. It is this collection of processes which is managed by the HA software. The configuration steps described in later chapters show how to use the example broker scripts to place a broker under HA control.
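For illustration only - the broker-related commands described later set this up for you - moving a broker's directories to shared disk and linking back might look like the following sketch, assuming a broker named ha1brk and a shared queue manager data filesystem /MQHA/ha.csq1/data (both names are illustrative):

  # Move the broker-specific directories to the shared filesystem...
  mkdir -p /MQHA/ha.csq1/data/mqsi/components /MQHA/ha.csq1/data/mqsi/registry
  mv /var/mqsi/components/ha1brk /MQHA/ha.csq1/data/mqsi/components/
  mv /var/mqsi/registry/ha1brk /MQHA/ha.csq1/data/mqsi/registry/
  # ...and leave symlinks on internal disks so the broker still finds them
  ln -s /MQHA/ha.csq1/data/mqsi/components/ha1brk /var/mqsi/components/ha1brk
  ln -s /MQHA/ha.csq1/data/mqsi/registry/ha1brk /var/mqsi/registry/ha1brk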
[Figure 4 (diagram): broker group with a local broker database. Each cluster node (Node A, Node B) can host a broker group containing the Message Broker, its queue manager, the broker database, an IP address and shared disks.]

[Figure 5 (diagram): broker group with a remote broker database. Each cluster node (Node A, Node B) can host a broker group containing the Message Broker, its queue manager, an IP address and shared disks; the broker database runs outside the group.]
The UNS relies on a WebSphere MQ queue manager, and the queue manager will require an IP address and shared disk resources. The IP address and shared disk are separate from those described earlier for the broker. Similar to the rationale for placing each message broker in a separate resource group, the most flexible configuration for the UNS is for it to be placed in a resource group of its own. This resource group must contain the UNS, its queue manager and the queue manager's shared disks and IP address. Figure 6 shows the contents of a resource group containing the UNS.
[Figure 6 (diagram): the UNS group. Each cluster node (Node A, Node B) can host the UNS group containing the User Name Server, its queue manager, an IP address and shared disks.]
You could put the UNS in the same group as one or more brokers, and it could share its queue manager with a broker, but this would bind the UNS permanently to that broker. For flexibility it is better to separate them. HACMP users who wish to use application monitoring should also note the restriction that only one application server in a resource group can be monitored. If you were to place the UNS in the same group as one or more brokers and wanted to monitor all of them, you would need to edit the example scripts to put multiple components in an application server.

On AIX, the UNS makes use of the /var/mqsi/components/UserNameServer directory and the /var/mqsi/registry/UserNameServer directory, so these directories need to be stored on shared disks in addition to the directories used by the queue manager. The UNS runs as a pair of processes, bipservice and bipuns. It is this collection of processes which is managed by the HA software. The configuration steps described in later chapters show how to use the example scripts to place the UNS under HA control.
Architectural guidelines
From the preceding discussion it is apparent that there are a number of choices as to which components your architecture will contain and which of them will be clustered. It is advisable to decide and record at this stage what the architecture will be. The following list provides a few suggested guidelines which may help in this. You don't have to adhere to them, but they may help. Figure 7 shows one possible architecture which implements the guidelines.

Each Configuration Manager should be in a separate resource group.
Each Configuration Manager must have its own queue manager. The queue manager must be in the same resource group as the Configuration Manager.

Each broker should be in a separate resource group dedicated to that broker. Each broker must have its own queue manager, not used by any other brokers. The queue manager must be in the same resource group as the broker.

Each broker should use a separate database instance. The broker database instance can be remote, in which case it should be run on a machine outside the cluster. Alternatively, the broker database instance can be run on the same machine as the broker, although it does not need to be in the same resource group as the broker.

The UNS, if you have one, should be run in the cluster and should be in a separate resource group. The UNS should have its own queue manager, which must be in the same resource group as the UNS.
[Figure 7 (diagram): example cluster architecture. An HACMP cluster of Cluster Nodes A, B and C: resource group B1 (Broker1 with its queue manager and database) and the UNS resource group (UNS with its queue manager) on Node B, resource group B2 (Broker2 with its queue manager and database) on Node C, and resource group C1 (Configuration Manager with its queue manager) on Node A.]
Implementation
Configure the cluster, cluster nodes and adapters in the HA software as usual. Synchronise the cluster topology.
4. Now would be a good time to create and configure the user accounts that will be used to run the database instances, brokers and UNS. Home directories, (numeric) user IDs, passwords, profiles and group memberships should be the same on all cluster nodes (a sketch of suitable commands follows the next step).
Test the operation of the cluster by creating an instance of a Generic Application (such as the System Clock) and ensure that it fails back and forth between the nodes in accordance with the resource parameters you set for it. Also test the operation of a shared drive placed under HA control, making sure that it fails back and forth correctly.
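As a minimal sketch of step 4, assuming a group mqbrkrs and a user mqsi with illustrative numeric IDs (on AIX the equivalent mkgroup/mkuser commands apply), the same commands would be run identically on every cluster node:

  # Fixed numeric IDs ensure UIDs/GIDs match across all nodes
  groupadd -g 30001 mqbrkrs
  useradd -u 30010 -g mqbrkrs -d /home/mqsi -m -s /bin/ksh mqsi
  passwd mqsi   # set the same password on each node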
For each broker that you wish to run in the cluster, you also need to specify where the broker queue manager, the broker database instance and the broker itself should store their files. The Configuration Manager and UNS are similar to a broker except that they don't use a database. You need to create a volume group for each resource group. Within each volume group, you can decide where to place the files for the queue manager, database instance and broker, or queue manager and UNS, but the following is recommended:

Queue manager - refer to SupportPac MC91 for details of how the queue manager's filesystems should be organised. For Linux-HA refer to Chapter 2.

Database instance - you can locate the database instance home directory for each broker's database instance under the data directory for the corresponding queue manager, in the same filesystem.

Broker - a broker stores some information on disks. When you install WMB on a node, it creates the /var/mqsi directory within the existing /var filesystem on internal disks. There is only one such directory per node, regardless of the number of brokers that the node may host. Within the /var/mqsi directory, there are broker specific directories which need to be on shared disk so that they can be moved to a surviving node in the event of a failure. The remainder of /var/mqsi, those parts which are not specific to one particular broker, should be on internal
disks. The broker specific data stored on shared disk can be placed in the filesystem used for the queue manager data files, below the directory specified by the MQHAFSDATA environment variable used in SupportPac MC91. The split between which broker directories are local and which are shared is shown in Figure 8, which should be read in conjunction with the similar diagram from SupportPac MC91. The broker-related commands described later in this chapter set this up for you.

Configuration Manager - the directories should be organised in the same way as the broker directories, with the exception that a Configuration Manager has no database.

UNS - the UNS directories should be organised in the same way as the broker directories, with the exception that a UNS has no database. The UNS-related commands described later in this chapter set up the UNS directories for you.
Filesystem organisation
This diagram shows the filesystem organisation for a single broker, called imb1, using queue manager ha.csq1 and database instance db2inst1.
[Figure 8 (diagram): on each of Node A and Node B, the /var filesystem on internal disks holds /var/mqm (ipc, qmgrs/ha!csq1 - as for MC91) and /var/mqsi (components, registry, locks, bin); the broker-specific subdirectories are symlinks into the shared-disk filesystems /MQHA/ha.csq1/data and /MQHA/ha.csq1/log.]
Figure 8 - Division of broker directories between local disks and shared disks

Actions:
1. For the Configuration Manager, create a volume group that will be used for the queue manager's data and log files and the Configuration Manager directories under /var/mqsi.
2. For each broker, create a volume group that will be used for the queue manager's data and log files, the database instance and the broker-specific directories under /var/mqsi.
3. For the UNS, create a volume group that will be used for the queue manager's data and log files and the UNS directories under /var/mqsi.
4. For each volume group, create the data and log filesystems as described in SupportPac MC91, Step 2. When choosing the path names for the filesystems you may prefer to use the name of the broker, configuration manager or the name "UNS" instead of using the name of the queue manager.
5. For each node in turn, ensure that the filesystems can be mounted, then unmount the filesystems.
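On AIX, for example, steps 2 and 5 might look like the following minimal sketch, assuming a broker queue manager ha.csq1, a volume group havg1 and shared disk hdisk2 (all names and sizes are illustrative; SupportPac MC91 remains the authoritative layout):

  mkvg -y havg1 hdisk2
  # -A no: don't mount at system restart - the cluster controls mounting
  crfs -v jfs2 -g havg1 -m /MQHA/ha.csq1/data -a size=1G -A no
  crfs -v jfs2 -g havg1 -m /MQHA/ha.csq1/log -a size=512M -A no
  # Check the filesystems mount on this node, then unmount them again
  mount /MQHA/ha.csq1/data && umount /MQHA/ha.csq1/data
  mount /MQHA/ha.csq1/log && umount /MQHA/ha.csq1/log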
<operations> <op id="IPaddr_1_mon" interval="5s" name="monitor" timeout="5s"/> </operations> <instance_attributes id="IPaddr_1_inst_attr"> <attributes> <nvpair id="IPaddr_1_attr_0" name="ip" value="192.168.1.11"/> </attributes> </instance_attributes> </primitive> <primitive class="ocf" id="Filesystem_2" provider="heartbeat" type="Filesystem"> <operations> <op id="Filesystem_2_mon" interval="120s" name="monitor" timeout="60s"/> </operations> <instance_attributes id="Filesystem_2_inst_attr"> <attributes> <nvpair id="Filesystem_2_attr_0" name="device" value="/dev/sdb1"/> <nvpair id="Filesystem_2_attr_1" name="directory" value="/MQHA/cm1qm"/> <nvpair id="Filesystem_2_attr_2" name="fstype" value="ext2"/> </attributes> </instance_attributes> </primitive> </group> # New Code Ends Here </resources> <constraints> # New Code Starts Here <rsc_location id="rsc_location_group_1" rsc="group_1"> <rule id="prefered_location_group_1" score="100"> <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="ha-node1"/> </rule> </rsc_location> # New Code Ends Here </constraints> </configuration> </cib>
uname="ha-node1" - the node on which the resource should normally run.
value="192.168.1.11" - the IP address for the resource group.
nvpair id="Filesystem_2_attr_0" name="device" value="/dev/sdb1" - the shared disk which holds the filesystem for the queue manager.
nvpair id="Filesystem_2_attr_1" name="directory" value="/MQHA/cm1qm" - the location where you want to mount the shared disk.
An example cib.xml file can be found in the SupportPac. For further information on the definitions of the tags please refer to the Linux-HA documentation (http://linuxha.org/v2_2fExamples_2fSimple). Make sure that all nodes have been updated with the new ha.cf before restarting Heartbeat on the node where the queue manager is based.
2. Make sure the queue manager is stopped, by issuing the endmqm command.
3. On the node which currently has the queue manager shared disks and has the queue manager filesystems mounted, run the hadltmqm script provided in the SupportPac.
4. You can now destroy the filesystems /MQHA/<qmgr>/data and /MQHA/<qmgr>/log.
5. On each of the other nodes in the cluster:
   a. Run the hadltmqm command as above, which will clean up the subdirectories related to the queue manager.
   b. Manually remove the queue manager stanza from the /var/mqm/mqs.ini file.
The queue manager has now been completely removed from the cluster and the nodes.
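As a minimal sketch of steps 2 and 3, assuming a queue manager named ha.csq1 and the SupportPac scripts installed under /MQHA/bin (both illustrative):

  endmqm -i ha.csq1            # stop the queue manager (immediate stop)
  /MQHA/bin/hadltmqm ha.csq1   # remove the queue manager from this node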
For HACMP

During the creation of the UNS queue manager, you will create the resource group as described in SupportPac MC91. The resource group can be either cascading or rotating. Whichever you choose, bear the following points in mind:

The resource group will use the IP address as the service label. This is the address which clients and channels will use to connect to the queue manager.

If you choose cascading, it is recommended that you consider disabling the automatic fallback facility by setting Cascading Without Fallback to true. This is to avoid the interruption to the UNS which would be caused by the reintegration of the top priority node after a failure. Unless you have a specific requirement which would make automatic fallback desirable in your configuration, then it is probably better to manually move the resource group back to the preferred node when it will cause minimum disruption.
For VCS
The service group in which the UNS and its queue manager will run is created during the creation of the queue manager, as described in SupportPac MC91. The queue manager will use the IP address managed by the service group, rather than an IP address statically assigned to a system. The logical address is the address which clients and channels will use to connect to the queue manager.
For ServiceGuard
During the creation of the UNS queue manager, you will create the package as described in SupportPac MC91. The UNS queue manager needs to be configured so that the UNS can communicate with brokers and the Configuration Manager. Such configuration is described assuming that the UNS and broker are using separate queue managers. If they are sharing a queue manager then you can omit the creation of the transmission queues and channels. The only difference between the clustered UNS and non-clustered UNS configurations is that in the clustered case
you need to use a virtual IP address for channels sending to the UNS queue manager rather than the machine IP address.
Actions:

1. On one node, create a clustered queue manager as described in SupportPac MC91, using the hacrtmqm command. Use the volume group that you created for the UNS and place the volume group and queue manager into a resource group to which the UNS will be added. Don't configure the application server or application monitor described in SupportPac MC91 - you will create an application server that covers both the UNS and the queue manager.
2. Set up queues and channels between the UNS queue manager and the Configuration Manager queue manager (an MQSC sketch follows this list):
   a. On the Configuration Manager queue manager create a transmission queue for communication to the UNS queue manager. Ensure that the queue is given the same name and case as the UNS queue manager. The transmission queue should be set to trigger the sender channel.
   b. On the Configuration Manager queue manager create a sender and receiver channel for communication with the UNS queue manager. The sender channel should use the service address of the UNS resource group and the UNS queue manager's port number.
   c. On the UNS queue manager create a transmission queue for communication to the Configuration Manager queue manager. Ensure that the queue is given the same name and case as the Configuration Manager queue manager. The transmission queue should be set to trigger the sender channel.
   d. On the UNS queue manager create sender and receiver channels to match those just created on the Configuration Manager queue manager. The sender channel should use the IP address of the machine where the Configuration Manager queue manager runs, and the corresponding listener port number.
3. Test that the above queue managers can communicate regardless of which node owns the resource group for the UNS.
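A minimal MQSC sketch of step 2, assuming a Configuration Manager queue manager cm1qm on host cfghost port 1414, a UNS queue manager ha.csq1, and a UNS service address of 192.168.1.11 port 1414 (all names and addresses are illustrative):

  runmqsc cm1qm <<'EOF'
  DEFINE QLOCAL('ha.csq1') USAGE(XMITQ) TRIGGER TRIGTYPE(FIRST) TRIGDATA('CM1QM.TO.HA.CSQ1') INITQ('SYSTEM.CHANNEL.INITQ')
  DEFINE CHANNEL('CM1QM.TO.HA.CSQ1') CHLTYPE(SDR) TRPTYPE(TCP) CONNAME('192.168.1.11(1414)') XMITQ('ha.csq1')
  DEFINE CHANNEL('HA.CSQ1.TO.CM1QM') CHLTYPE(RCVR) TRPTYPE(TCP)
  EOF

  runmqsc ha.csq1 <<'EOF'
  DEFINE QLOCAL('cm1qm') USAGE(XMITQ) TRIGGER TRIGTYPE(FIRST) TRIGDATA('HA.CSQ1.TO.CM1QM') INITQ('SYSTEM.CHANNEL.INITQ')
  DEFINE CHANNEL('HA.CSQ1.TO.CM1QM') CHLTYPE(SDR) TRPTYPE(TCP) CONNAME('cfghost(1414)') XMITQ('cm1qm')
  DEFINE CHANNEL('CM1QM.TO.HA.CSQ1') CHLTYPE(RCVR) TRPTYPE(TCP)
  EOF

Note that the sender channel on the Configuration Manager side points at the virtual IP address of the UNS resource group, not at the physical address of either node.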
Linux-HA
Create a queue manager as described in Chapter 2. During the creation of the UNS queue manager, you will create the resource group as described in Chapter 2.
1. Create the UNS on the node hosting the resource group using the hamqsicreateusernameserver command.
hamqsicreateusernameserver command
The hamqsicreateusernameserver command will create the UNS and will ensure that its directories are arranged to allow for HA operation. The hamqsicreateusernameserver command puts the UNS directories under the same path used for the data associated with the queue manager which the UNS uses. It parses the /var/mqm/mqs.ini file to locate this path information. The invocation of the hamqsicreateusernameserver command uses exactly the same parameters that you would normally use for mqsicreateusernameserver. You must be root to run this command.
Syntax:

hamqsicreateusernameserver <creation parameters>

Parameters:
creation parameters - are exactly the same as for the regular WMB mqsicreateusernameserver command
Example: hamqsicreateusernameserver -i mqsi -a mqsi -q ha.csq1
2. Ensure that you can start and stop the UNS manually using the mqsistart and mqsistop commands. You will need to login as the user id under which the UNS runs to test this.
3. On any other nodes in the resource group's nodelist (i.e. excluding the one on which you
just created the UNS), run the hamqsiaddunsstandby command to create the information needed by these nodes to enable them to host the UNS.
hamqsiaddunsstandby
The hamqsiaddunsstandby command will create the information required for a cluster node to act as a standby for the UNS. This command does not create the UNS - which is created as described earlier. This command defines symbolic links within subdirectories under /var/mqsi on the standby node which allow the UNS to move to that node. The hamqsiaddunsstandby command expects the UNS directories to have been created by the hamqsicreateusernameserver command under the same path used for the data associated with the queue manager which the UNS uses. It parses the /var/mqm/mqs.ini file to locate this path information. You must be root to run this command.
Syntax:

hamqsiaddunsstandby <qm name> <userid>

Parameters:
qm name - the name of the queue manager used by the UNS userid - the account under which the UNS service runs
Example: hamqsiaddunsstandby ha.csq1 mqsi
The hamqsi_start_uns_as and hamqsi_stop_uns_as scripts are the ones you configure as the start and stop methods for the application server. They invoke methods supplied by MC91 to control the queue manager and the hamqsi_start_uns and hamqsi_stop_uns methods to control the UNS. You can configure an application monitor which will monitor the health of the UNS and its queue manager. This will monitor the UNS and its queue manager and trigger recovery actions as a result of failures. Recovery actions include the ability to perform local restarts of the UNS and queue manager (see below) or to cause a failover of the resource group to another node. In HACMP you can only configure one application monitor per resource group. When you ran hamqsicreateusernameserver an application monitor was created. This application monitor is specifically for monitoring an application server containing a UNS and queue manager and is called:
hamqsi_applmon.UNS
If you use the application monitor, the HA software will call it periodically to check that the UNS processes and queue manager are running. The example application monitor checks that the bipservice process is running. The bipservice process monitors and restarts the bipuns process.
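The following minimal sketch illustrates the kind of check the monitor makes; the process name pattern is an assumption for illustration, and the shipped hamqsi_monitor_uns_as script (together with the MC91 queue manager checks it calls) remains the authoritative version:

  # Exit 0 if the UNS administration process is running, non-zero otherwise
  if ps -ef | grep -v grep | grep 'bipservice UserNameServer' > /dev/null
  then
      exit 0   # healthy
  else
      exit 1   # HA software takes the configured recovery action
  fi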
hamqsi_start_uns_as
The example start script is called hamqsi_start_uns_as. This script is robust in that it does not assume anything about the state of the UNS or queue manager on entry. It accepts two command line parameters which are the queue manager name and the userid under which the UNS runs, so when you define the start command in HA, include the parameters.
Example:
"/MQHA/bin/hamqsi_start_uns_as ha.csq1 mqsi"
hamqsi_stop_uns_as
The example stop script is called hamqsi_stop_uns_as. The stop script accepts three command line parameters, the first is the queue manager name, the second parameter is the UNS userid and the third is the timeout (in seconds) to use on each of the levels of severity of stop. When you define the stop command in HA you should include the parameters.
Example
"/MQHA/bin/hamqsi_stop_uns_as ha.csq1 mqsi 10" The stop command will use the timeout parameter as the time to allow either the queue manager or UNS to respond to an attempt to stop it. If a stop attempt times out, then a more severe stop is performed. The stop command has to ensure that the UNS and queue manager are both fully stopped by the time the command completes.
The fault monitoring interval is configured in the HA software, which also includes the parameters that control whether a failure of either component of the application server will trigger a restart. It is recommended that the restart count is set to 1 so that one restart is attempted, and that the time period is set to a small multiple of the expected start time for the components of the UNS group. With these settings, if successive restarts fail without a significant period of stability between, then the resource group will fail over to a different node. Attempting more restarts on a node on which a restart has just failed is unlikely to succeed.
Actions:

1. Create an application server which will run the UNS and its queue manager using the example scripts provided in this SupportPac. The example scripts are called hamqsi_start_uns_as and hamqsi_stop_uns_as and are described in the frames above.
2. You can also specify an application monitor using the hamqsi_applmon.UNS script created by hamqsicreateusernameserver. An application monitor script cannot be passed parameters, so just specify the name of the monitor script. Also configure the other application monitor parameters, including the monitoring interval and the restart parameters you require. The example application monitor script provided in this SupportPac is described in the following frame:
hamqsi_applmon.UNS
The hamqsi_applmon.UNS script created for you by hamqsicreateusernameserver will be called at the polling frequency you specify. It is a parameter-less wrapper script that calls hamqsi_monitor_uns_as, which checks the state of the UNS and queue manager. Success of both tests causes the application monitor to return a zero exit code, indicating that the UNS and queue manager are working properly. Failure of either test will result in the application monitor returning a non-zero exit code. HACMP/ServiceGuard will then take whatever action has been configured. If you wish to use the example application monitor then supply its name when configuring the monitor for the UNS application server. The monitoring script accepts no parameters.
Example "/MQHA/bin/hamqsi_applmon.UNS"
The example application monitor is tolerant if it finds that the queue manager is starting because this may be due to the stabilisation interval being too short.
3. Synchronise the cluster resources.
4. Ensure that the UNS and its queue manager are stopped, and start the application server.
5. Check that the UNS and queue manager started, and test that the resource group can be moved from one node to the other and that the UNS runs correctly on each node.
6. Ensure that stopping the application server stops the UNS and its queue manager.
7. With the application server started, verify that the local restart capability is working as configured. During this testing a convenient way to cause failures is to identify the bipservice process for the UserNameServer and kill it (see the sketch below).
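A minimal sketch of the failure injection in step 7 (the process name pattern is illustrative):

  # Kill the UNS's bipservice process, then watch the HA software restart it
  kill $(ps -ef | grep 'bipservice UserNameServer' | grep -v grep | awk '{print $2}')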
The SupportPac contains the definition of the MQSIUNS resource type, which can be found in the types.MQSIUNS file. The UNS is put under cluster control by creating a resource of this type. Because the UNS needs to be co-located with its queue manager, the UNS resource needs to have a resource dependency on the corresponding queue manager resource (of resource type MQM, provided by SupportPac MC91). This dependency tells VCS that the queue manager needs to be started before the UNS and that the UNS needs to be stopped before the queue manager.

Actions:

1. Create the MQSIUNS resource type. This can be performed using the VCS GUI or by editing the main.cf or types.cf files, either by including the types.MQSIUNS file in the main.cf for the cluster, or by copying the content of the types.MQSIUNS file into the existing types.cf file. If you opt to edit the files, then ensure that the configuration is read-write and use hacf to verify the changes.
2. Add a resource of type MQSIUNS, either using the VCS GUI or by editing the main.cf file. The resource needs to be in the same service group as the queue manager upon which the UNS depends and it should have a requires statement to record the dependency. The sample main.cf included in Appendix A can be used as a guide; a minimal sketch also follows this list. Verify and enable the changes.
3. Ensure that the UNS is stopped, and bring the service group online. The queue manager and UNS should start.
4. Check that the UNS started and test that the service group can be switched from one system to another and that the UNS runs correctly on each system.
5. Ensure that taking the service group offline stops the UNS.
6. With the service group online, verify that the monitor correctly detects failures and configure the restart attributes to your desired values. During this testing a convenient way to cause failures is to identify the bipservice for the UserNameServer and kill it.
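A minimal main.cf sketch of step 2. The resource and attribute names (QMName, UserName) are illustrative assumptions; take the real ones from the sample main.cf in Appendix A:

  group UNS_group (
      SystemList = { sysa = 0, sysb = 1 }
      AutoStartList = { sysa }
  )

      MQM uns_qm (
          QMName = "ha.csq1"
      )

      MQSIUNS uns (
          QMName = "ha.csq1"
          UserName = "mqsi"
      )

      // the queue manager must start before the UNS
      uns requires uns_qm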
Linux-HA
The UNS is managed by the mqsiuns agent supplied in this SupportPac. The agent contains the following methods:
start - starts the UNS
stop - stops the UNS
status - tests the health of the UNS
monitor - tests the health of the UNS
The mqsiuns script is the one you configure with Heartbeat; it calls hamqsi_start_uns_as and hamqsi_stop_uns_as, which in turn invoke the supplied scripts to control the queue manager and the hamqsi_start_uns and hamqsi_stop_uns methods to control the UNS. You can configure an application monitor which will monitor the health of the UNS and its queue manager and trigger recovery actions as a result of failures. Recovery actions include the ability to perform local restarts of the UNS and queue manager (see below) or to cause a failover of the resource group to another node. With Heartbeat it is possible to configure many monitors for a single resource group; for this SupportPac we will only use one.

To place the User Name Server under the control of Heartbeat, the /var/lib/heartbeat/crm/cib.xml file must be updated to include the UNS as a component in the resource group. If you have followed the instructions in Chapter 2 you should already have a resource group for the UNS. An extra primitive is required for the User Name Server, as shown below (marked by the "New section" comments):
<cib> <configuration> <crm_config> <cluster_property_set id="default"> <attributes> <nvpair id="is_managed_default" name="is_managed_default" value="true"/> </attributes> </cluster_property_set> </crm_config> <nodes> <node id="3928ccf5-63d2-4fbd-b4ed-e9d6163afbc9" uname="ha-node1" type="normal"/> </nodes> <resources> <group id="group_1"> <primitive class="ocf" id="IPaddr_1" provider="heartbeat" type="IPaddr"> <operations> <op id="IPaddr_1_mon" interval="5s" name="monitor" timeout="5s"/> </operations> <instance_attributes id="IPaddr_1_inst_attr"> <attributes> <nvpair id="IPaddr_1_attr_0" name="ip" value="192.168.1.11"/> </attributes> </instance_attributes> </primitive> <primitive class="ocf" id="Filesystem_2" provider="heartbeat" type="Filesystem"> <operations> <op id="Filesystem_2_mon" interval="120s" name="monitor" timeout="60s"/> </operations> <instance_attributes id="Filesystem_2_inst_attr"> <attributes> <nvpair id="Filesystem_2_attr_0" name="device" value="/dev/sdb1"/> <nvpair id="Filesystem_2_attr_1" name="directory" value="/MQHA/cm1qm"/> <nvpair id="Filesystem_2_attr_2" name="fstype" value="ext2"/> </attributes> </instance_attributes> </primitive> # New section starts here <primitive class="heartbeat" id="mqsiuns_4" provider="heartbeat" type="mqsiuns"> <operations>
<op id="mqsiuns_4_mon" interval="120s" name="monitor" timeout="60s"/> </operations> <instance_attributes id="mqsiuns_4_inst_attr"> <attributes> <nvpair id=" mqsiuns_4_attr_1" name="1" value="usnqm"/> <nvpair id=" mqsiuns_4_attr_2" name="2" value="mqm"/> <nvpair id=" mqsiuns_4_attr_3" name="3" value="argostr"/> </attributes> </instance_attributes> </primitive> # New section ends here </group> </resources> <constraints> <rsc_location id="rsc_location_group_1" rsc="group_1"> <rule id="prefered_location_group_1" score="100"> <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="ha-node1"/> </rule> </rsc_location> </constraints> </configuration> </cib>
<nvpair id=" mqsiuns_4_attr_1" name="1" value="usnqm"/> - the name of the queue manager the UNS runs on.
<nvpair id=" mqsiuns_4_attr_2" name="2" value="mqm"/> - the name of the MQ user ID.
<nvpair id=" mqsiuns_4_attr_3" name="3" value="argostr"/> - the name of the UNS user ID.
Once you have made this change, make sure you synchronise the cib.xml file to all other machines within the cluster.
hamqsi_start_uns_as
The example start script is called hamqsi_start_uns_as. This script is robust in that it does not assume anything about the state of the UNS or queue manager on entry. It accepts two command line parameters which are the queue manager name and the userid under which the UNS runs, so when you define the start command in HA, include the parameters.
Example:
"/MQHA/bin/hamqsi_start_uns_as ha.csq1 mqsi"
hamqsi_stop_uns_as
The example stop script is called hamqsi_stop_uns_as. The stop script accepts three command line parameters, the first is the queue manager name, the second parameter is the UNS userid and the third is the timeout (in seconds) to use on each of the levels of severity of stop. When you define the stop command in HA you should include the parameters.
Example
"/MQHA/bin/hamqsi_stop_uns_as ha.csq1 mqsi 10" The stop command will use the timeout parameter as the time to allow either the queue manager or UNS to respond to an attempt to stop it. If a stop attempt times out, then a more severe stop is performed. The stop command has to ensure that the UNS and queue manager are both fully stopped by the time the command completes.
During the creation of the configuration manager queue manager, you will create the resource group as described in SupportPac MC91. The resource group can be either cascading or rotating. Whichever you choose, bear the following points in mind:

The resource group will use the IP address as the service label. This is the address which clients and channels will use to connect to the queue manager.

If you choose cascading, it is recommended that you consider disabling the automatic fallback facility by setting Cascading Without Fallback to true. This avoids the interruption to the configuration manager which would be caused by the reintegration of the top priority node after a failure. Unless you have a specific requirement which would make automatic fallback desirable in your configuration, it is probably better to manually move the resource group back to the preferred node at a time when it will cause minimum disruption.
For VCS
The service group in which the configuration manager and its queue manager will run is created during the creation of the queue manager, as described in SupportPac MC91. The queue manager will use the IP address managed by the service group, rather than an IP address statically assigned to a system. The logical address is the address which clients and channels will use to connect to the queue manager.
For ServiceGuard
During the creation of the configuration manager queue manager, you will create the package as described in SupportPac MC91. The configuration manager queue manager needs to be configured so that the configuration manager can communicate with brokers. Such configuration is described assuming that the configuration manager and broker are using separate queue managers. If they are sharing a queue manager then you can omit the creation of the transmission queues and channels. The only difference between the clustered and non-clustered configuration manager configurations is that in the clustered case you need to use a virtual IP address for channels sending to the configuration manager queue manager rather than the machine IP address.
Actions:
1. On one node, create a clustered queue manager as described in SupportPac MC91, using the hacrtmqm command. Use the volume group that you created for the configuration manager and place the volume group and queue manager into a resource group to which the configuration manager will be added. Don't configure the application server or application monitor described in SupportPac MC91 - you will create an application server that covers both the configuration manager and the queue manager.

2. Set up queues and channels between the configuration manager queue manager and the brokers (an illustrative MQSC sketch follows this list):

a. On the Configuration Manager queue manager create a transmission queue for communication to the broker's queue manager. Ensure that the queue is given the same name and case as the broker's queue manager. The transmission queue should be set to trigger the sender channel.

b. On the Configuration Manager queue manager create a sender and receiver channel for communication with the broker's queue manager. The sender channel should use the service address of the broker's resource group and the broker queue manager's port number.

c. On the broker's queue manager create a transmission queue for communication to the Configuration Manager queue manager. Ensure that the queue is given the same name and case as the Configuration Manager queue manager. The transmission queue should be set to trigger the sender channel.

d. On the broker's queue manager create sender and receiver channels to match those just created on the Configuration Manager queue manager. Because the configuration manager queue manager is clustered, the sender channel should use its virtual IP address (service address) and the corresponding listener port number.

3. Test that the above queue managers can communicate regardless of which node owns the resource group for the brokers.
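As an illustration of steps 2a and 2b, the definitions on the Configuration Manager queue manager might look like the following MQSC. The queue manager names (CFG1QM, BRK1QM), channel names, service address and port are all assumptions; substitute your own values:

# Run as the mqm user against the Configuration Manager queue manager
runmqsc CFG1QM <<'EOF'
* Transmission queue named after the broker queue manager, triggering the sender
DEFINE QLOCAL('BRK1QM') USAGE(XMITQ) TRIGGER TRIGTYPE(FIRST) +
       INITQ('SYSTEM.CHANNEL.INITQ') TRIGDATA('CFG1QM.TO.BRK1QM')
* Sender channel pointing at the service address of the broker resource group
DEFINE CHANNEL('CFG1QM.TO.BRK1QM') CHLTYPE(SDR) TRPTYPE(TCP) +
       CONNAME('192.168.1.12(1414)') XMITQ('BRK1QM')
* Matching receiver for the sender channel defined on the broker queue manager
DEFINE CHANNEL('BRK1QM.TO.CFG1QM') CHLTYPE(RCVR) TRPTYPE(TCP)
EOF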
Linux-HA
Create a queue manager as described in Chapter 2 above. During the creation of the configuration manager's queue manager you will also have created the resource group, as described in Chapter 2.
4. Create the configuration manager on the node hosting the resource group using the hamqsicreatecfgmgr command.
hamqsicreatecfgmgr command
The hamqsicreatecfgmgr command will create the configuration manager and will ensure that its directories are arranged to allow for HA operation. The hamqsicreatecfgmgr command puts the configuration manager's directories under the same path used for the data associated with the queue manager which the configuration manager uses. It parses the /var/mqm/mqs.ini file to locate this path information. The invocation of the hamqsicreatecfgmgr command uses exactly the same parameters that you would normally use for mqsicreateconfigmgr. You must be root to run this command.
Syntax hamqsicreatecfgmgr <creation parameters> Parameters
creation parameters - are exactly the same as for the regular WMB mqsicreateconfigmgr command
Example: hamqsicreatecfgmgr MyCfgMgr -i mqsi -a mqsi -q ha.csq1
5. Ensure that you can start and stop the configuration manager manually using the mqsistart and mqsistop commands. You will need to log in as the user id under which the configuration manager runs to test this.
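For example, assuming the userid mqsi and the configuration manager MyCfgMgr used elsewhere in this chapter:

su - mqsi -c "mqsistart MyCfgMgr"    # start the configuration manager
su - mqsi -c "mqsistop MyCfgMgr"     # stop it again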
6. On any other nodes in the resource group's nodelist (i.e. excluding the one on which you just created the configuration manager), run the hamqsiaddcfgmgrstandby command to create the information needed by these nodes to enable them to host the configuration manager.
hamqsiaddcfgmgrstandby
The hamqsiaddcfgmgrstandby command will create the information required for a cluster node to act as a standby for the configuration manager. This command does not create the configuration manager - which is created as described earlier. This command defines symbolic links within subdirectories under /var/mqsi on the standby node which allow the configuration manager to move to that node. The hamqsiaddcfgmgrstandby command expects the configuration manager directories to have been created by the hamqsicreatecfgmgr command under the same path used for the data associated with the queue manager which the configuration manager uses. It parses the /var/mqm/mqs.ini file to locate this path information. You must be root to run this command.
Syntax hamqsiaddcfgmgrstandby <cfgmgr> <qmname> <userid> Parameters
cfgmgr - the name of the configuration manager
qmname - the name of the queue manager used by the configuration manager
userid - the account under which the configuration manager service runs
Example: hamqsiaddcfgmgrstandby MyCfgMgr ha.csq1 mqsi
The hamqsi_start_cfgmgr_as and hamqsi_stop_cfgmgr_as scripts are the ones you configure as the start and stop methods for the application server. They invoke methods supplied by MC91 to control the queue manager and the hamqsi_start_cfgmgr and hamqsi_stop_cfgmgr methods to control the configuration manager. You can configure an application monitor which will check the health of the configuration manager and its queue manager and trigger recovery actions as a result of failures. Recovery actions include the ability to perform local restarts of the configuration manager and queue manager (see below) or to cause a failover of the resource group to another node. In HACMP you can only configure one application monitor per resource group. When you ran hamqsicreatecfgmgr an application monitor was created. This application monitor is specifically for monitoring an application server containing a configuration manager and queue manager and is called:
hamqsi_applmon.<cfgmgr>
If you use the application monitor, HACMP/ServiceGuard will call it periodically to check that the configuration manager processes and queue manager are running. The example application monitor checks that the bipservice process is running. The bipservice process monitors and restarts the bipconfigmgr process.
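Conceptually the check is little more than a process test, along these lines (a sketch, not the supplied script; the configuration manager name is an assumption):

#!/bin/sh
# Sketch: return 0 if bipservice is running for MyCfgMgr, non-zero otherwise.
# The "[b]" pattern stops grep from matching its own command line.
if ps -ef | grep "[b]ipservice MyCfgMgr" > /dev/null
then
    exit 0    # healthy - no action taken
else
    exit 1    # failure - triggers the configured recovery action
fi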
hamqsi_start_cfgmgr_as
The example start script is called hamqsi_start_cfgmgr_as. This script is robust in that it does not assume anything about the state of the configuration manager or queue manager on entry. It accepts three command line parameters: the configuration manager name, the queue manager name and the userid under which the configuration manager runs. When you define the start command in HA, include the parameters.
Example
"/MQHA/bin/hamqsi_start_cfgmgr_as MyCfgMgr ha.csq1 mqsi" (illustrative, matching the stop example below)
hamqsi_stop_cfgmgr_as
The example stop script is called hamqsi_stop_cfgmgr_as. The stop script accepts four command line parameters: the first is the configuration manager name, the second is the queue manager name, the third is the configuration manager userid and the fourth is the timeout (in seconds) to use on each of the levels of severity of stop. When you define the stop command in HA you should include the parameters.
Example
"/MQHA/bin/hamqsi_stop_cfgmgr_as My CfgMgr ha.csq1 mqsi 10" The stop command will use the timeout parameter as the time to allow either the queue manager or configuration manager to respond to an attempt to stop it. If a stop attempt times out, then a more severe stop is performed. The stop command has to ensure that the configuration manager and queue manager are both fully stopped by the time the command completes.
The fault monitoring interval is configured in the HA Software, which also includes the parameters that control whether a failure of either component of the application server will trigger a restart. It is recommended that the restart count is set to 1 so that one restart is attempted, and that the time period is set to a small multiple of the expected start time for the components of the configuration manager group. With these settings, if successive restarts fail without a significant period of stability between them, the resource group will fail over to a different node. Attempting more restarts on a node on which a restart has just failed is unlikely to succeed.
Actions:

8. Create an application server which will run the configuration manager and its queue manager using the example scripts provided in this SupportPac. The example scripts are called hamqsi_start_cfgmgr_as and hamqsi_stop_cfgmgr_as and are described in the frames above.

9. You can also specify an application monitor using the hamqsi_applmon.<cfgmgr> script created by hamqsicreatecfgmgr. An application monitor script cannot be passed parameters, so just specify the name of the monitor script. Also configure the other application monitor parameters, including the monitoring interval and the restart parameters you require. The example application monitor script provided in this SupportPac is described in the following frame:
hamqsi_applmon.<cfgmgr>
The hamqsi_applmon.<cfgmgr> script created for you by hamqsicreatecfgmgr will be called at the polling frequency you specify. It is a parameter-less wrapper script that calls hamqsi_monitor_cfgmgr_as, which checks the state of the configuration manager and queue manager. Success of both tests causes the application monitor to return a zero exit code, indicating that the configuration manager and queue manager are working properly. Failure of either test will result in the application monitor returning a non-zero exit code. HACMP/ServiceGuard will then take whatever action has been configured. If you wish to use the example application monitor then supply its name to the HA Software for the configuration manager application server. The monitoring script accepts no parameters.
Example "/MQHA/bin/hamqsi_applmon.MyCfgMgr"
The example application monitor is tolerant if it finds that the queue manager is starting because this may be due to the stabilisation interval being too short.
10. Synchronise the cluster resources.
11. Ensure that the configuration manager and its queue manager are stopped, and start the application server.
12. Check that the configuration manager and queue manager started and test that the resource group can be moved from one node to the other and that the configuration manager runs correctly on each node.
13. Ensure that stopping the application server stops the configuration manager and its queue manager.
14. With the application server started, verify that the local restart capability is working as configured. During this testing a convenient way to cause failures is to identify the bipservice for the configuration manager and kill it, as shown below.
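For example (the configuration manager name is an assumption):

# Simulate a failure by killing the bipservice process for MyCfgMgr
kill $(ps -ef | awk '/[b]ipservice MyCfgMgr/ {print $2}')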
The SupportPac contains the definition of the MQSIConfigMgr resource type, which can be found in the types.MQSIConfigMgr file. The configuration manager is put under cluster control by creating a resource of this type. Because the configuration manager needs to be co-located with its queue manager, the configuration manager resource needs to have a resource dependency on the corresponding queue manager resource (of resource type MQM, provided by SupportPac MC91). This dependency tells VCS that the queue manager needs to be started before the configuration manager and that the configuration manager needs to be stopped before the queue manager.

Actions:

7. Create the MQSIConfigMgr resource type. This can be performed using the VCS GUI or by editing the main.cf or types.cf files, either by including the types.MQSIConfigMgr file in the main.cf for the cluster, or by copying the content of the types.MQSIConfigMgr file into the existing types.cf file. If you opt to edit the files, then ensure that the configuration is read-write and use hacf to verify the changes.

8. Add a resource of type MQSIConfigMgr, either using the VCS GUI or by editing the main.cf file. The resource needs to be in the same service group as the queue manager upon which the configuration manager depends and it should have a requires statement to record the dependency. The sample main.cf included in Appendix A can be used as a guide. Verify and enable the changes.

9. Ensure that the configuration manager is stopped, and bring the service group online. The queue manager and configuration manager should start.

10. Check that the configuration manager started and test that the service group can be switched from one system to another and that the configuration manager runs correctly on each system.

11. Ensure that taking the service group offline stops the configuration manager.

12. With the service group online, verify that the monitor correctly detects failures and configure the restart attributes to your desired values. During this testing a convenient way to cause failures is to identify the bipservice for the configuration manager and kill it.
Linux-HA
The configuration manager is managed by the mqsicfgmgr agent supplied in this SupportPac. The agent contains the following methods:

start - starts the configuration manager
stop - stops the configuration manager
status - tests the health of the configuration manager
monitor - tests the health of the configuration manager
The mqsicfgmgr script is the one you configure with Heartbeat; it calls hamqsi_start_cfgmgr_as and hamqsi_stop_cfgmgr_as. These in turn invoke the methods supplied by the SupportPac MC91 scripts to control the queue manager, and the hamqsi_start_cfgmgr and hamqsi_stop_cfgmgr methods to control the configuration manager. You can configure an application monitor which will check the health of the configuration manager and its queue manager and trigger recovery actions as a result of failures. Recovery actions include the ability to perform local restarts of the configuration manager and queue manager (see below) or to cause a failover of the resource group to another node. With Heartbeat it is possible to configure many monitors for a single resource group; for this SupportPac we will only use one. To place the configuration manager under the control of Heartbeat, the /var/lib/heartbeat/crm/cib.xml file must be updated to include the configuration manager as a component in the resource group. If you have followed the instructions in Chapter 2 above you should already have a resource group for the configuration manager. An extra primitive is required for the configuration manager, as shown in the marked section of the example below.
<cib>
 <configuration>
  <crm_config>
   <cluster_property_set id="default">
    <attributes>
     <nvpair id="is_managed_default" name="is_managed_default" value="true"/>
    </attributes>
   </cluster_property_set>
  </crm_config>
  <nodes>
   <node id="3928ccf5-63d2-4fbd-b4ed-e9d6163afbc9" uname="ha-node1" type="normal"/>
  </nodes>
  <resources>
   <group id="group_1">
    <primitive class="ocf" id="IPaddr_1" provider="heartbeat" type="IPaddr">
     <operations>
      <op id="IPaddr_1_mon" interval="5s" name="monitor" timeout="5s"/>
     </operations>
     <instance_attributes id="IPaddr_1_inst_attr">
      <attributes>
       <nvpair id="IPaddr_1_attr_0" name="ip" value="192.168.1.11"/>
      </attributes>
     </instance_attributes>
    </primitive>
    <primitive class="ocf" id="Filesystem_2" provider="heartbeat" type="Filesystem">
     <operations>
      <op id="Filesystem_2_mon" interval="120s" name="monitor" timeout="60s"/>
     </operations>
     <instance_attributes id="Filesystem_2_inst_attr">
      <attributes>
       <nvpair id="Filesystem_2_attr_0" name="device" value="/dev/sdb1"/>
       <nvpair id="Filesystem_2_attr_1" name="directory" value="/MQHA/cm1qm"/>
       <nvpair id="Filesystem_2_attr_2" name="fstype" value="ext2"/>
      </attributes>
     </instance_attributes>
    </primitive>
    <!-- New section starts here -->
    <primitive class="heartbeat" id="mqsicfgmgr_4" provider="heartbeat" type="mqsicfgmgr">
     <operations>
      <op id="mqsicfgmgr_4_mon" interval="120s" name="monitor" timeout="60s"/>
     </operations>
     <instance_attributes id="mqsicfgmgr_4_inst_attr">
      <attributes>
       <nvpair id="mqsicfgmgr_4_attr_1" name="1" value="cfgqm"/>
       <nvpair id="mqsicfgmgr_4_attr_2" name="2" value="mqm"/>
       <nvpair id="mqsicfgmgr_4_attr_3" name="3" value="argostr"/>
      </attributes>
     </instance_attributes>
    </primitive>
    <!-- New section ends here -->
   </group>
  </resources>
  <constraints>
   <rsc_location id="rsc_location_group_1" rsc="group_1">
    <rule id="prefered_location_group_1" score="100">
     <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="ha-node1"/>
    </rule>
   </rsc_location>
  </constraints>
 </configuration>
</cib>
<nvpair id=" mqsicfgmgr_4_attr_1" name="1" value="cfgqm"/> Name of the queue manager the configuration manager runs on. <nvpair id=" mqsicfgmgr _4_attr_2" name="2" value="mqm"/> Name of the MQ userID. <nvpair id=" mqsicfgmgr _4_attr_3" name="3" value="argostr"/> Name of the configuration manager userID.
Once you have made this change, make sure you synchronise the cib.xml file to all other machines within the cluster.
hamqsi_start_cfgmgr_as
The example start script is called hamqsi_start_cfgmgr_as. This script is robust in that it does not assume anything about the state of the configuration manager or queue manager on entry. It accepts three command line parameters: the configuration manager name, the queue manager name and the userid under which the configuration manager runs. When you define the start command in HA, include the parameters.
Example
"/MQHA/bin/hamqsi_start_cfgmgr_as MyCfgMgr ha.csq1 mqsi" (illustrative, matching the stop example below)
hamqsi_stop_cfgmgr_as
The example stop script is called hamqsi_stop_cfgmgr_as. The stop script accepts four command line parameters: the first is the configuration manager name, the second is the queue manager name, the third is the configuration manager userid and the fourth is the timeout (in seconds) to use on each of the levels of severity of stop. When you define the stop command in HA you should include the parameters.
Example
"/MQHA/bin/hamqsi_stop_cfgmgr_as MyCfgMgr ha.csq1 mqsi 10" The stop command will use the timeout parameter as the time to allow either the queue manager or configuration manager to respond to an attempt to stop it. If a stop attempt times out, then a more severe stop is performed. The stop command has to ensure that the configuration manager and queue manager are both fully stopped by the time the command completes.
During the creation of the UNS queue manager, you will create the resource group as described in SupportPac MC91. The resource group can be either cascading or rotating. Whichever you choose, bear the following points in mind:

The resource group will use the IP address as the service label. This is the address which clients and channels will use to connect to the queue manager.

If you choose cascading, it is recommended that you consider disabling the automatic fallback facility by setting Cascading Without Fallback to true. This avoids the interruption to the UNS which would be caused by the reintegration of the top priority node after a failure. Unless you have a specific requirement which would make automatic fallback desirable in your configuration, it is probably better to manually move the resource group back to the preferred node at a time when it will cause minimum disruption.
For VCS
The service group in which the UNS and its queue manager will run is created during the creation of the queue manager, as described in SupportPac MC91. The queue manager will use the IP address managed by the service group, rather than an IP address statically assigned to a system. The logical address is the address which clients and channels will use to connect to the queue manager.
For ServiceGuard
During the creation of the broker queue manager, you will create the package as described in SupportPac MC91. The broker queue manager needs to be configured so that the broker can communicate with the Configuration Manager and UNS. The following actions are written assuming that the UNS is not sharing the broker queue manager. If the broker is sharing its queue manager with the UNS, then you can omit the creation of the relevant transmission queues and channels. If the broker is running in a collective then it will also need to communicate with other brokers, and you should configure additional queues and channels for broker to broker communication. Remember that because the broker queue manager is clustered you need to use the service address for channels sending to the broker queue manager rather than the machine IP address.
Actions:

1. On one node, create a clustered queue manager as described in SupportPac MC91, using the hacrtmqm command. Use the volume group that you created for the broker and place the volume group and queue manager into a resource group to which the broker will be added. Don't configure the application server or application monitor described in SupportPac MC91 - you will create an application server that covers the broker, queue manager and broker database instance.

2. Set up queues and channels between the broker queue manager and the Configuration Manager queue manager:

a. On the Configuration Manager queue manager create a transmission queue for communication to the broker queue manager. Ensure that the queue is given the same name and case as the broker queue manager. The transmission queue should be set to trigger the sender channel.

b. On the Configuration Manager queue manager create a sender and receiver channel for communication with the broker queue manager. The sender channel should use the service address of the broker resource group and the broker queue manager's port number.

c. On the broker queue manager create a transmission queue for communication to the Configuration Manager queue manager. Ensure that the queue is given the same name and case as the Configuration Manager queue manager. The transmission queue should be set to trigger the sender channel.

d. On the broker queue manager create sender and receiver channels to match those just created on the Configuration Manager queue manager. The sender channel should use the IP address of the machine where the Configuration Manager queue manager runs, and the corresponding listener port number.

3. If you are using a UNS, set up queues and channels between the broker queue manager and the UNS queue manager:

a. On the broker queue manager create a transmission queue for communication to the UNS queue manager. Ensure that the queue is given the same name and case as the UNS queue manager. The transmission queue should be set to trigger the sender channel.

b. On the broker queue manager create a sender and receiver channel for communication with the UNS queue manager. If the UNS is clustered, the sender channel should use the service address of the UNS resource group and the UNS queue manager's port number.

c. On the UNS queue manager create a transmission queue for communication to the broker queue manager. Ensure that the queue is given the same name and case as the broker queue manager. The transmission queue should be set to trigger the sender channel.

d. On the UNS queue manager create a sender and receiver channel for communication with the broker queue manager, with the same names as the receiver and sender channels just created on the broker queue manager. The sender channel should use the service address of the broker resource group and the broker queue manager's port number.

4. Test that the above queue managers can communicate regardless of which node owns the resource groups they belong to.

For Linux-HA
The service group in which the UNS and its queue manager will run is created during the creation of the queue manager, as described in Chapter 6. The queue manager will use the IP address managed by the service group, rather than an IP address statically assigned to a system. The logical address is the address which clients and channels will use to connect to the queue manager.
For HACMP/ServiceGuard
The database instance is made highly available by invoking its HA scripts from within the scripts you configure for the broker resource group's application server. If you are using DB2, then HACMP scripts are supplied with DB2 and are described in the DB2 Administration Guide. ServiceGuard scripts can be found on the DB2 web site, in appendix 5 of "IBM DB2 EE v.7.1 Implementation and Certification with MC/ServiceGuard High Availability Software". If you are using a different database manager then follow the instructions provided with that database manager. The example scripts supplied in this SupportPac for the broker application server include calls to the DB2 Version 8.2 scripts. If you are using a different database manager then edit the scripts accordingly.
Actions:

1. Create a database instance home directory in the volume group owned by the resource group. As portrayed in Figure 7, this can be in the queue manager's data path in the "databases" directory.

2. Create a database instance owner user, specifying the home directory just created.

3. Create the database instance.

4. Start the instance and create and configure the broker database as described in the WMB documentation, including creation of an ODBC data source for it, defined on all cluster nodes that may host the resource group (a DB2 sketch follows this list).

5. Don't create an application server for the database instance - it will be included in the application server which you will create.

6. Ensure that the database instance runs correctly on each node the resource group can move to. You will need to manually start and stop the database instance to test this.
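A sketch of steps 2 to 4 with DB2 V8 might look like the following. The instance name, fenced userid, home directory, database name and db2icrt path are all assumptions and vary by platform, so treat the DB2 documentation as authoritative:

# As root: create the instance owner (home on the shared volume group) and
# a fenced userid, then create the instance (db2icrt path varies by platform)
useradd -m -d /MQHA/ha.csq1/data/databases/db2inst1 db2inst1
useradd -m db2fenc1
/opt/IBM/db2/V8.1/instance/db2icrt -u db2fenc1 db2inst1

# As the instance owner: start the instance and create the broker database
su - db2inst1 -c "db2start"
su - db2inst1 -c "db2 create database IMB1DB"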
VCS
The database instance is the unit of failover of the database manager. The database instance is made highly available by using an appropriate agent. If you are using DB2, then appropriate agents are available either from IBM or VERITAS.
Actions:
1. Create a database instance home directory in the disk group owned by the service group.
2. Create a database instance owner user, specifying the home directory just created.
3. Create the database instance.
4. Start the instance and create and configure the broker database as described in the WMB documentation, including creation of an ODBC data source for it.
5. Place the database instance under VCS control by configuring the database agent and modifying the cluster configuration.
6. Ensure that the database agent can start and stop the database instance as the service group is placed online and offline.
7. Ensure that the database instance runs correctly on each system in the service group's systemlist.
Linux-HA
For the Linux-HA section it is presumed that the database is made highly available as its own entity. This may be achieved using another HA application or Linux-HA. Documentation on how to achieve this is provided by IBM, such as the RedPaper "Open Source Linux High Availability for IBM DB2 Universal Database": ftp://ftp.software.ibm.com/software/data/pubs/papers/db2halinux.pdf
1. Create the broker on the node hosting the resource group using the hamqsicreatebroker command.
hamqsicreatebroker command
The hamqsicreatebroker command will create the broker and will ensure that its directories are arranged to allow for HA operation. The hamqsicreatebroker command puts the broker directories under the same path used for the data associated with the queue manager which the broker uses. It parses the /var/mqm/mqs.ini file to locate this path information. The invocation of the hamqsicreatebroker command uses exactly the same parameters that you would normally use for mqsicreatebroker. You must be root to run this command.
Syntax hamqsicreatebroker <creation parameters> Parameters
creation parameters - are exactly the same as for the regular WMB mqsicreatebroker command
Example: hamqsicreatebroker imb1 -i mqsi -a mqsi -q ha.csq1 -n IMB1DB
2. Ensure that you can start and stop the broker manually using the mqsistart and mqsistop commands. You will need to log in as the user id under which the broker runs to test this.
3. On any other nodes in the resource group's nodelist (i.e. excluding the one on which you just created the broker), run the hamqsiaddbrokerstandby command to create the information needed by these nodes to enable them to host the broker.

hamqsiaddbrokerstandby
The hamqsiaddbrokerstandby command will create the information required for a cluster node to act as a standby for the broker. This command does not create the broker - which is created as described earlier. This command defines symbolic links within subdirectories under /var/mqsi on the standby node which allow the broker to move to that node. The hamqsiaddbrokerstandby command expects the broker directories to have been created by the hamqsicreatebroker command under the same path used for the data associated with the queue manager which the broker uses. It parses the /var/mqm/mqs.ini file to locate this path information. You must be root to run this command.
Syntax hamqsiaddbrokerstandby <broker> <qm> <userid> Parameters
broker - the name of the broker qm - the name of the queue manager used by the broker userid - the account under which the broker service runs
Example: hamqsiaddbrokerstandby imb1 ha.csq1 mqsi
The hamqsi_start_broker_as and hamqsi_stop_broker_as scripts are the ones you configure as the start and stop methods for the application server. They invoke methods supplied by SupportPac MC91 to control the queue manager and invoke the database HA scripts to control the database instance. In addition they invoke the hamqsi_start_broker and hamqsi_stop_broker methods to control the broker. You can configure an application monitor which will monitor the health of the broker, queue manager and database instance. The application monitor will assess the state of these components and trigger recovery actions as a result of failures. Recovery actions include the ability to perform local restarts of the broker and its dependencies (see below) or to cause a failover of the resource group to another node. You can only configure one application monitor per resource group. When you ran hamqsicreatebroker an application monitor was created. This application monitor is specifically for monitoring an application server containing a broker, a queue manager and a database instance and is called:
hamqsi_applmon.<broker>
where <broker> is the name of the broker. If you configure the application monitor, the HA software will call it periodically to check that the broker, queue manager and database instance are running. The example application monitor checks that the bipservice process is running. The bipservice process monitors and restarts the bipbroker process. The bipbroker process monitors and restarts DataFlowEngines. The application monitor does not check for DataFlowEngines because you may have none deployed. If you wish to monitor for these as well, then customise the example hamqsi_monitor_broker_as script (a sketch follows below), but remember that depending on how you customise it, you may have to suspend monitoring if you wish to deploy a removal of a DataFlowEngine, otherwise the removal would be classed as a failure.

The fault monitoring interval is configured in the HA panels, which also include the parameters that control whether a failure of either component of the application server will trigger a restart. It is recommended that the restart count is set to 1 so that one restart is attempted, and that the time period is set to a small multiple of the expected start time for the components of the broker group. With these settings, if successive restarts fail without a significant period of stability between them, the resource group will fail over to a different node. Attempting more restarts on a node on which a restart has just failed is unlikely to succeed.
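Such a customisation might add a check along the following lines. It is illustrative only, assumes the broker name imb1, and presumes that at least one execution group is always deployed:

# Additional check: treat the absence of any DataFlowEngine for broker imb1
# as a failure (only sensible if execution groups are always deployed)
if ! ps -ef | grep "[D]ataFlowEngine imb1" > /dev/null
then
    exit 1
fi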
Actions:
1. Create an application server which will run the broker, its queue manager and the database instance, using the example scripts provided in this SupportPac. The example scripts are called hamqsi_start_broker_as and hamqsi_stop_broker_as and are described in the following frames.
hamqsi_start_broker_as
The example start script is called hamqsi_start_broker_as. This script is robust in that it does not assume anything about the state of the broker, queue manager or database instance on entry. It accepts command line parameters which provide the name of the broker, the queue manager, the userid under which the broker runs and the names of the database instance and database. When you define the start command in HACMP, include the parameters.
Example
"/MQHA/bin/hamqsi_start_broker_as imb1 ha.csq1 mqsi db2inst1 IMB1DB" (illustrative, using the names from the hamqsicreatebroker example and the stop example below)
hamqsi_stop_broker_as
The example stop script is called hamqsi_stop_broker_as. The stop script accepts command line parameters that provide the name of the broker, the queue manager name, the broker userid, the database instance name and a timeout (in seconds) to use on each of the levels of severity of stop. When you define the stop command you should include the parameters.
Example
"/MQHA/bin/hamqsi_stop_broker_as imb1 ha.csq1 mqsi db2inst1 10" The stop command will use the timeout parameter as the time to allow either the queue manager or broker to respond to an attempt to stop it. If a stop attempt times out, then a more severe stop is performed. The stop command has to ensure that the broker and queue manager are both fully stopped by the time the command completes.
2. You can also specify an application monitor using the hamqsi_applmon.<broker> script created by hamqsicreatebroker. An application monitor script cannot be passed parameters, so just specify the name of the monitor script. Also configure the other application monitor parameters, including the monitoring interval and the restart parameters you require. The example application monitor script provided in this SupportPac is described in the following frame:
hamqsi_applmon.<broker>
The hamqsi_applmon.<broker> script created for you by hamqsicreatebroker will be called at the polling frequency you specify. It is a parameter-less wrapper script that calls hamqsi_monitor_broker_as, which checks the state of the broker, the queue manager and the database instance. A successful test of all three components causes the application monitor to return a zero exit code, indicating that the components are working properly. Failure of any component test will result in the application monitor returning a non-zero exit code. If you wish to use the example application monitor then supply its name to the HA Software. The monitoring script accepts no parameters.
Example "/MQHA/bin/hamqsi_applmon.<broker>"
The example application monitor is tolerant if it finds that the queue manager is starting because this may be due to the stabilisation interval being too short.
3. Synchronise the cluster resources.
4. Ensure that the broker, queue manager and database instance are stopped, and start the application server.
5. Check that the components started and test that the resource group can be moved from one node to the other and that they run correctly on each node.
6. Ensure that stopping the application server stops the components.
7. With the application server started, verify that the HACMP local restart capability is working as configured. During this testing a convenient way to cause failures is to identify the bipservice for the broker and kill it.
VCS
The broker is managed by the MQSIBroker agent supplied in this SupportPac. The agent contains the following methods:
online - starts the broker
offline - stops the broker
monitor - tests the health of the broker
clean - forcibly terminates the broker
The SupportPac contains the definition of the MQSIBroker resource type, which can be found in the types.MQSIBroker file. A broker is put under cluster control by creating a resource of this type. Because the broker needs to be co-located with its queue manager, the broker resource needs to be configured to depend upon the queue manager resource which manages the queue manager (of resource type MQM, provided by SupportPac MC91) and the database resource which manages the database instance in which the broker database runs. These dependencies tell VCS that the queue manager and database instance need to be started before the broker and that the broker needs to be stopped before the queue manager and database instance. An illustrative main.cf fragment follows.
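In main.cf terms the dependencies might look like the following fragment. The group, resource and attribute names here are illustrative assumptions (the database is shown using the VERITAS Db2udb agent type); the sample main.cf in Appendix A remains the authoritative reference:

group broker_group (
    SystemList = { sysA = 0, sysB = 1 }
    )

    MQM broker_qm (
        QMName = "ha.csq1"
        )

    Db2udb broker_db (
        DB2InstOwner = "db2inst1"
        )

    MQSIBroker broker_imb1 (
        BrokerName = "imb1"
        )

    broker_imb1 requires broker_qm
    broker_imb1 requires broker_db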
The monitor method checks that the bipservice process is running and either restarts it or moves the service group to another system, depending on how you configure it. The bipservice process monitors and restarts the bipbroker process. The bipbroker process monitors and restarts DataFlowEngines. The probe does not check for DataFlowEngines because you may have none deployed. If you wish to monitor for these as well, then customise the monitor method.
Actions:
1. Create the MQSIBroker resource type. This can be performed using the VCS GUI or by editing the main.cf or types.cf files, either by including the types.MQSIBroker file in the main.cf for the cluster, or by copying the content of the types.MQSIBroker file into the existing types.cf file. If you opt to edit the files, then ensure that the configuration is read-write and use hacf to verify the changes.
2. Add a resource of type MQSIBroker, either using the VCS GUI or by editing the main.cf file. The resource needs to be in the same service group as the queue manager and database instance upon which the broker depends and it should have requires statements to record these dependencies. The sample main.cf included in Appendix A can be used as a guide. Verify and enable the changes.
3. Ensure that the broker is stopped, and bring the service group online. The queue manager, database and broker should start.
4. Check that the broker started and test that the service group can be switched from one system to another and that the broker runs correctly on each system.
5. Ensure that taking the service group offline stops the broker.
6. With the service group online, verify that the monitor correctly detects failures and configure the restart attributes to your desired values. During this testing a convenient way to cause failures is to identify the bipservice for the broker and kill it.
Linux-HA Control
The broker is managed by the mqsibroker agent supplied in this SupportPac. The agent contains the following methods:
start - starts the Broker
stop - stops the Broker
status - tests the health of the Broker
monitor - tests the health of the Broker
The mqsibroker script is the one you configure with Heartbeat; it calls hamqsi_start_broker_as and hamqsi_stop_broker_as. These invoke methods to control the queue manager and the hamqsi_start_broker and hamqsi_stop_broker methods to control the broker. You can configure an application monitor which will check the health of the broker and its queue manager and trigger recovery actions as a result of failures. Recovery actions include the ability to perform local restarts of the broker and queue manager or to cause a failover of the resource group to another node. With Heartbeat it is possible to configure many monitors for a single resource group; for this SupportPac we will only use one. To place the broker under the control of Heartbeat, the /var/lib/heartbeat/crm/cib.xml file must be updated to include the broker as a component in the resource group. If you have followed the instructions in Chapter 2 above you should already have a resource group for the broker. An extra primitive is required for the broker, as shown in the marked section of the example below.
<cib>
 <configuration>
  <crm_config>
   <cluster_property_set id="default">
    <attributes>
     <nvpair id="is_managed_default" name="is_managed_default" value="true"/>
    </attributes>
   </cluster_property_set>
  </crm_config>
  <nodes>
   <node id="3928ccf5-63d2-4fbd-b4ed-e9d6163afbc9" uname="ha-node1" type="normal"/>
  </nodes>
  <resources>
   <group id="group_1">
    <primitive class="ocf" id="IPaddr_1" provider="heartbeat" type="IPaddr">
     <operations>
      <op id="IPaddr_1_mon" interval="5s" name="monitor" timeout="5s"/>
     </operations>
     <instance_attributes id="IPaddr_1_inst_attr">
      <attributes>
       <nvpair id="IPaddr_1_attr_0" name="ip" value="192.168.1.11"/>
      </attributes>
     </instance_attributes>
    </primitive>
    <primitive class="ocf" id="Filesystem_1" provider="heartbeat" type="Filesystem">
     <operations>
      <op id="Filesystem_1_mon" interval="120s" name="monitor" timeout="60s"/>
     </operations>
     <instance_attributes id="Filesystem_1_inst_attr">
      <attributes>
       <nvpair id="Filesystem_1_attr_0" name="device" value="/dev/sdb1"/>
       <nvpair id="Filesystem_1_attr_1" name="directory" value="/MQHA/cm1qm"/>
       <nvpair id="Filesystem_1_attr_2" name="fstype" value="ext2"/>
      </attributes>
     </instance_attributes>
    </primitive>
    <!-- New section starts here -->
    <primitive class="heartbeat" id="mqsibroker_1" provider="heartbeat" type="mqsibroker">
     <operations>
      <op id="mqsibroker_1_mon" interval="120s" name="monitor" timeout="60s"/>
     </operations>
     <instance_attributes id="mqsibroker_1_inst_attr">
      <attributes>
       <nvpair id="mqsibroker_1_attr_1" name="1" value="br1"/>
       <nvpair id="mqsibroker_1_attr_2" name="2" value="br1qm"/>
       <nvpair id="mqsibroker_1_attr_3" name="3" value="mqm"/>
       <nvpair id="mqsibroker_1_attr_4" name="4" value="argostr"/>
      </attributes>
     </instance_attributes>
    </primitive>
    <!-- New section ends here -->
   </group>
  </resources>
  <constraints>
   <rsc_location id="rsc_location_group_1" rsc="group_1">
    <rule id="prefered_location_group_1" score="100">
     <expression attribute="#uname" id="prefered_location_group_1_expr" operation="eq" value="ha-node1"/>
    </rule>
   </rsc_location>
  </constraints>
 </configuration>
</cib>
<nvpair id="mqsibroker_1_attr_1" name="1" value="br1"/> Name of the Broker. <nvpair id="mqsibroker_1_attr_2" name="2" value="br1qm"/> Name of the queue manager the broker runs on.
33
<nvpair id="mqsibroker_1_attr_3" name="3" value="mqm"/> Name of the MQ userID. <nvpair id="mqsibroker_1_attr_4" name="4" value="argostr"/> Name of the userID the broker runs under.
Once you have made this change, make sure you synchronise the cib.xml file to all other machines within the cluster. Once these updates are complete you are ready to restart Heartbeat to pick up the new resource.
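For example, on each node (the init script location may vary by distribution):

/etc/init.d/heartbeat restart    # restart Heartbeat so it manages the new primitive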
1. Stop the application server which runs the UNS.
2. Delete the application monitor.
3. Delete the application server. If you wish to retain the queue manager under cluster control then you may decide to replace the application server with one that uses the scripts from SupportPac MC91.
4. If you have no further use for them, remove the filesystem, service label and volume group resources from the resource group and delete the group.
5. Synchronise the cluster resources configuration.
VCS Actions:
1. Take the service group offline. This stops the resources managed by the service group, unmounts the filesystems on the disk groups it contains and deports the disk groups.
2. Modify the cluster configuration, either using the VCS GUI or by editing the configuration files.
3. Verify the changes and, if desired, bring the service group back online.
Linux-HA Actions:
1. Stop the resource being monitored by removing it from the CRM. This is achieved by running crm_resource -D -r <resource id> -t primitive. You can find out the resource id by looking at your cib.xml file or by running crm_resource -L, which lists the resources with their resource ids, e.g. crm_resource -D -r mqsiuns_1 -t primitive
2. Once the resource is no longer managed you need to stop it manually using the mqsiuns script, e.g. /etc/ha.d/resource.d/mqsiuns unsqm mqm argostr stop
3. Verify that the user name server is stopped by checking that there is no bipuns process.
1. Ensure that the UNS has been removed from cluster control and identify which node it is running on.
2. Identify which other nodes have standby information on them. If you are not sure whether a node has standby information then look for a symbolic link called /var/mqsi/components/Currentversion/UserNameServer. If a node has such a link, then it is a standby node.
3. On the standby nodes, run the hamqsiremoveunsstandby command.
hamqsiremoveunsstandby command
The hamqsiremoveunsstandby command will remove the standby information from standby nodes for the UNS. This will remove the symlinks for the UNS from the subdirectories under the /var/mqsi directory. You must be root to run this command.
Syntax hamqsiremoveunsstandby Parameters
none
1. Delete the UNS from the Configuration Manager and deploy the changes as described in the WMB Administration Guide.
2. Identify the node on which the UNS is defined and on that node ensure that the UNS is stopped. If it is not, then issue the mqsistop UserNameServer command to stop it.
3. On the same node, as root, run the hamqsideleteusernameserver command.
hamqsideleteusernameserver command
The hamqsideleteusernameserver command will delete the UNS. This will destroy its control files and remove the definition of the UNS from the /var/mqsi directory. This is similar to the behaviour of the mqsideleteusernameserver command, which the hamqsideleteusernameserver command uses internally. You must be root to run this command.
Syntax hamqsideleteusernameserver <userid> Parameters
userid - the userid under which the UNS service runs
1. Stop the application server which runs the configuration manager.
2. Delete the application monitor.
3. Delete the application server. If you wish to retain the queue manager under cluster control then you may decide to replace the application server with one that uses the scripts from SupportPac MC91.
4. If you have no further use for them, remove the filesystem, service label and volume group resources from the resource group and delete the group.
5. Synchronise the cluster resources configuration.
VCS Actions:
1. Take the service group offline. This stops the resources managed by the service group, unmounts the filesystems on the disk groups it contains and deports the disk groups. 2. Modify the cluster configuration, either using the VCS GUI or by editing the configuration files. 3. Verify the changes and, if desired, bring the service group back online.
Linux-HA Actions:
1. Stop the resource being monitored by removing it from the CRM. This is achieved by running crm_resource -D -r <resource id> -t primitive. You can find out the resource id by looking at your cib.xml file or by running crm_resource -L, which lists the resources with their resource ids, e.g. crm_resource -D -r mqsicfgmgr_1 -t primitive
2. Once the resource is no longer managed you need to stop it manually using the mqsicfgmgr script, e.g. /etc/ha.d/resource.d/mqsicfgmgr cfgmgr cfg1qm mqm argostr stop
3. Verify that the configuration manager is stopped by checking that there is no bipconfigmgr process.
hamqsiremoveconfigmgrstandby command
The hamqsiremoveconfigmgrstandby command will remove the standby information from standby nodes for a configuration manager. This will remove the symlinks for the configuration manager from the subdirectories under the /var/mqsi directory. You must be root to run this command.
Syntax hamqsiremoveconfigmgrstandby <config mgr name> Parameters
config mgr name - the name of the configuration manager
1. Identify the node on which the configuration manager is defined and on that node ensure that the configuration manager is stopped. If it is not, then issue the mqsistop <configmgr> command to stop it, where <configmgr> is the name of the configuration manager.
2. On the same node, as root, run the hamqsideleteconfigmgr command.
hamqsideleteconfigmgr command
The hamqsideleteconfigmgr command will delete a configuration manager. This will destroy its control files and remove the definition of the configuration manager from the /var/mqsi directory. This is similar to the behaviour of the mqsideleteconfigmgr command, which the hamqsideleteconfigmgr command uses internally. You must be root to run this command.
Syntax hamqsideleteconfigmgr <config mgr name> <userid> Parameters
config mgr name - the name of the configuration manager to be deleted
userid - the userid under which the configuration manager service runs
6. Stop the application server which runs the broker.
7. Delete the application monitor.
8. Delete the application server. If you wish to retain the queue manager or database instance under cluster control then you may decide to replace the application server with one that uses the scripts from SupportPac MC91 or the database HACMP scripts.
9. If you have no further use for them, remove the filesystem, service label and volume group resources from the resource group and delete the group.
10. Synchronise the cluster resources configuration.
VCS Actions:
4. Take the service group offline. This stops the resources managed by the service group, unmounts the filesystems on the disk groups it contains and deports the disk groups.
5. Modify the cluster configuration, either using the VCS GUI or by editing the configuration files.
6. Verify the changes and, if desired, bring the service group back online.
Linux-HA Actions:
4. Stop the resource being monitored by removing it from the CRM. This is achieved by running crm_resource -D -r <resource id> -t primitive. You can find out the resource id by looking at your cib.xml file or by running crm_resource -L, which lists the resources with their resource ids, e.g. crm_resource -D -r mqsibroker_1 -t primitive
5. Once the resource is no longer managed you need to stop it manually using the mqsibroker script, e.g. /etc/ha.d/resource.d/mqsibroker br1 br1qm mqm argostr stop
6. Verify that the broker is stopped by checking that there is no bipbroker process.
hamqsiremovebrokerstandby command
The hamqsiremovebrokerstandby command will remove the standby information from standby nodes for a broker. This will remove the symlinks for the broker from the subdirectories under the /var/mqsi directory. You must be root to run this command.
Syntax hamqsiremovebrokerstandby <broker name> Parameters
broker name - the name of the broker
Delete a broker
When a broker has been removed from cluster control, it is possible to delete it. The broker was created by the hamqsicreatebroker command, in such a way that it is amenable to HA operation. There is a companion command called hamqsideletebroker which is aware of the changes made for HA operation and which should always be used to delete the broker. This companion command reverses the HA changes and then deletes the broker. This destroys the broker.
Actions: 7. Delete the broker from the Configuration Manager and deploy the changes as described in the WMB Administration Guide.
8. Identify the node on which the broker is defined and on that node ensure that the broker is stopped. If it is not, then issue the mqsistop <broker> command to stop it, where <broker> is the name of the broker.
9. On the same node, as root, run the hamqsideletebroker command.
hamqsideletebroker command
The hamqsideletebroker command will delete a broker. This will destroy its control files and remove the definition of the broker from the /var/mqsi directory. This is similar to the behaviour of the mqsideletebroker command, which the hamqsideletebroker command uses internally. You must be root to run this command.
Syntax hamqsideletebroker <broker name> <userid> Parameters
broker name - the name of the broker to be deleted userid - the userid under which the broker service runs
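By analogy with the earlier examples, an invocation might be (illustrative, using the broker name and userid from this chapter):

hamqsideletebroker imb1 mqsi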
Optionally, test failover of the Queue Manager to the other cluster node. Ensure the Configuration Manager group is on the primary node.
5. Create the Configuration Manager on the primary node using the mqsicreateconfigmgr command, specifying the option defining a location on the shared drive used by the Configuration Manager Queue Manager. For example: S:\WorkPath. Do not start the component.
6. Move the Configuration Manager group to the secondary node and ensure that the Queue Manager is online.
7. Re-execute the mqsicreateconfigmgr command on the secondary node. Ensure that the parameters match those used when the Configuration Manager was created on the primary node. Do not start the component.
8. Move the Configuration Manager group back to the primary node in the cluster. Ensure that the Queue Manager is online.
9. On the primary node, create an MSCS Generic Service resource for the Configuration Manager service. Specify the dependency on the Configuration Manager Queue Manager. The name of the service is MQSeriesBroker<ConfigurationManager>. When asked for any registry keys to monitor, add the following key: SOFTWARE\IBM\WebSphereMQIntegrator\2\<ConfigurationManager> where <ConfigurationManager> is the name of the component created.
10. Leave the start-up policy for the Configuration Manager service as its default setting of manual on both nodes. MSCS will be solely responsible for starting and stopping this service.
11. Bring the Configuration Manager generic service resource online.
12. Test that the Configuration Manager group can be moved from one node to the other and that the Configuration Manager runs correctly on either node.
Create the Configuration Manager using the mqsicreateconfigmgr command. Start the Configuration Manager and ensure that it has started correctly.
Start the Cluster Administrator and create a new MSCS group. It is recommended that you leave the group failback policy at the default value of disabled.
e. On the UNS Queue Manager create a sender and receiver channel for the associated Configuration Manager transmission queue. If the Configuration Manager is clustered, the sender channel should use the virtual IP address of the Configuration Manager Queue Manager and its associated listener port number. 3. Test that the Queue Managers for the Configuration Manager and the UNS can communicate.
separate nodes. This would also require that the UNS have a separate Queue Manager from any Queue Managers used by the Message Broker(s) which is reliant on its own shared disk(s). If you were to decide to configure the UNS to use a Queue Manager that will also be used by a Message Broker you would have to configure the UNS and Message Broker(s) to be in the same MSCS group. The following instructions are written assuming that the UNS is being put in a separate group. The following descriptions refer to the nodes as primary and secondary. The choice of which node is the primary is initially arbitrary but once made should be adhered to.
1. The Queue Manager and its dependencies (shared disks and IP address) should be put in the MSCS group created for the User Name Server.
2. Bring the Queue Manager resource online and test that it starts the Queue Manager, and that the Queue Manager functions correctly.
3. Optionally, test failover of the Queue Manager to the other cluster node.
4. Ensure the UNS group is on the primary node.
5. Create the UNS on the primary node using the mqsicreateusernameserver command, specifying the option defining a location on the shared drive used by the User Name Server's Queue Manager. For example: S:\WorkPath. Do not start the UNS.
6. Move the UNS group to the secondary node and ensure that the Queue Manager is online.
7. Re-execute the mqsicreateusernameserver command on the secondary node. Ensure that the parameters match those used when the UNS was created on the primary node. Do not start the UNS.
8. Move the UNS group back to the primary node in the cluster. Ensure that the Queue Manager is online.
9. On the primary node, create an MSCS Generic Service resource for the UNS service. Specify the dependency on the UNS Queue Manager. The name of the service is MQSeriesBrokerUserNameServer. When asked for any registry keys to monitor, add the following key: SOFTWARE\IBM\WebSphereMQIntegrator\2\UserNameServer
10. Leave the start-up policy for the UNS service as its default setting of manual on both nodes. MSCS will be solely responsible for starting and stopping this service.
11. Bring the UNS generic service resource online.
12. Test that the UNS group can be moved from one node to the other and that the UNS runs correctly on either node.
Test failover of the Queue Manager to the other cluster node, ensuring correct operation.
5. Now that the Message Broker Queue Manager has been created, it is necessary to set up queues and channels between it and the Configuration Manager Queue Manager, which is assumed to already exist.
   a. On the Configuration Manager Queue Manager, create a transmission queue for communication to the Message Broker Queue Manager. Ensure that the queue is given the same name and case as the Message Broker Queue Manager. The transmission queue should be set to trigger the sender channel.
   b. On the Configuration Manager Queue Manager, create a sender and receiver channel for the associated Message Broker transmission queue. Use the virtual IP address of the Message Broker Queue Manager and its associated listener port number.
   c. On the Message Broker Queue Manager, create a transmission queue for communication to the Configuration Manager Queue Manager. Ensure that the queue is given the same name and case as the Configuration Manager Queue Manager. The transmission queue should be set to trigger the sender channel.
   d. On the Message Broker Queue Manager, create sender and receiver channels to match those just created on the Configuration Manager. If the Configuration Manager is clustered, the sender channel should use the virtual IP address of the Configuration Manager Queue Manager and its associated listener port number.
6. If you are using a UNS, set up queues and channels between the broker Queue Manager and the UNS Queue Manager (an MQSC sketch of these definitions follows the list):
   a. On the UNS Queue Manager, create a transmission queue for communication to the Message Broker Queue Manager. Ensure that the queue is given the same name and case as the Message Broker Queue Manager. The transmission queue should be set to trigger the sender channel.
   b. On the UNS Queue Manager, create a sender and receiver channel for the associated Message Broker transmission queue. Use the virtual IP address of the Message Broker Queue Manager and its associated listener port number.
   c. On the Message Broker Queue Manager, create a transmission queue for communication to the UNS Queue Manager. Ensure that the queue is given the same name and case as the UNS Queue Manager. The transmission queue should be set to trigger the sender channel.
   d. On the Message Broker Queue Manager, create sender and receiver channels to match those just created on the UNS. If the UNS is clustered, the sender channel should use the virtual IP address of the UNS Queue Manager and its associated listener port number.
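As an illustration of the pattern in steps 5 and 6, the following MQSC commands, run in runmqsc against the Message Broker Queue Manager, define a triggered transmission queue and a channel pair towards a Configuration Manager Queue Manager. The names BRKQM and CFGQM, the channel names and the address 9.20.21.22(1414) are hypothetical placeholders:

   * transmission queue named after the remote queue manager, triggering the sender channel
   DEFINE QLOCAL('CFGQM') USAGE(XMITQ) TRIGGER TRIGTYPE(FIRST) INITQ('SYSTEM.CHANNEL.INITQ') TRIGDATA('BRKQM.TO.CFGQM')
   * sender channel pointing at the virtual IP address and listener port of the remote queue manager
   DEFINE CHANNEL('BRKQM.TO.CFGQM') CHLTYPE(SDR) TRPTYPE(TCP) CONNAME('9.20.21.22(1414)') XMITQ('CFGQM')
   * matching receiver channel for traffic flowing back from the Configuration Manager
   DEFINE CHANNEL('CFGQM.TO.BRKQM') CHLTYPE(RCVR) TRPTYPE(TCP)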
6. Move the Message Broker group to the secondary node in the cluster. Check that the Message Broker Queue Manager is available and functioning correctly and that the DB2 instance has successfully migrated to the secondary node.
7. On the secondary node(s), use the DB2 Control Center to add the instance (if it is not visible already) and create a duplicate ODBC connection as defined on the primary node.
8. Move the Message Broker group back to the primary node.
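If you prefer a command line to the DB2 Control Center for step 7, the system ODBC data source can be registered with the DB2 CLP. A sketch only; HADB1 is the example broker database name used elsewhere in this document:

   rem register a system ODBC data source for the broker database on this node
   db2 catalog system odbc data source HADB1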
2. Create the Message Broker on the primary node using the mqsicreatebroker command, specifying the option that defines a location on the shared drive used by the Message Broker Queue Manager, for example S:\WorkPath. Do not start the broker at this stage. (A sketch of this command follows the list.)
3. Move the Message Broker group to the secondary node in the cluster and ensure that the Queue Manager is online.
4. Re-execute the mqsicreatebroker command on the secondary node. Ensure that the same parameters are used as when the broker was initially created on the primary node.
5. Move the Message Broker group back to the primary node in the cluster.
6. On the primary node, create an MSCS Generic Service resource for the Message Broker service.
   a. The name of the service is MQSeriesBroker<BrokerName>, where <BrokerName> should be replaced with the actual name of your Message Broker.
   b. Create a dependency on the Message Broker Queue Manager.
   c. Create a dependency on the Message Broker database.
   d. When asked for any registry keys to monitor, add the following key: SOFTWARE\IBM\WebSphereMQIntegrator\2\<BrokerName>, where <BrokerName> should be replaced with the actual name of your Message Broker.
7. Bring the Message Broker generic service resource online.
8. Test that the Message Broker and its dependencies can move from one cluster node to the other.
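As a sketch of the creation command referred to in steps 2 and 4, assuming a hypothetical broker BRK1, Queue Manager BRKQM, ODBC data source HADB1 and illustrative service and database credentials; check the flags against the mqsicreatebroker documentation for your release:

   rem run with identical parameters on both nodes; do not start the broker
   mqsicreatebroker BRK1 -i mqsiuser -a mqsipass -q BRKQM -n HADB1 -u dbuser -p dbpass -w S:\WorkPath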
WMB Toolkit
To connect the WMB Toolkit to a highly available Configuration Manager, use the virtual IP address defined for the Configuration Manager; this ensures that the tooling can connect to the Configuration Manager no matter which machine the component is currently running on. It may be necessary to use the mqsicreateaclentry command of WMB or the MCAUSER parameter of MQ channels to allow communication between the Toolkit and the Configuration Manager when the Configuration Manager is remote. For example, the MCAUSER parameter of SYSTEM.BKR.CONFIG could be set to <user>@<domain>, if necessary. Refer to the product documentation for more detailed descriptions of these functions.
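As an illustration of the MCAUSER approach, the channel can be altered from a runmqsc session on the Configuration Manager Queue Manager; the user and domain shown are placeholders:

   * run Toolkit connections on this server-connection channel under a known user
   ALTER CHANNEL('SYSTEM.BKR.CONFIG') CHLTYPE(SVRCONN) MCAUSER('wmbuser@MYDOMAIN')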
Configuration Manager
Ensure that the Configuration Manager is stopped.
Start the Cluster Administrator and create a new MSCS group. It is recommended that you leave the group failback policy at the default value of disabled.
a. sc \\<NodeName> create MQSeriesBroker<ConfigurationManagerName> binPath= "C:\Program Files\IBM\MQSI\6.0\bin\bipservice.exe" depend= MQSeriesServices DisplayName= "IBM WebSphere Message Broker component <ConfigurationManagerName>" obj= <UserName> password= <Password>
b.
c. The UserName value should start with .\ for local users or be a valid Domain user. The user must be defined and valid on all nodes in the MSCS cluster.
6. On the primary node, create an MSCS Generic Service resource for the Configuration Manager service. Specify the dependency on the Configuration Manager Queue Manager. The name of the service is MQSeriesBroker<ConfigurationManager>. When asked for any registry keys to monitor, add the following key: SOFTWARE\IBM\WebSphereMQIntegrator\2\<ConfigurationManager>, where <ConfigurationManager> is the name of the component created.
7. Leave the start-up policy for the Configuration Manager service at its default setting of manual on both nodes. MSCS will be solely responsible for starting and stopping this service.
8. Bring the Configuration Manager generic service resource online.
9. Move the MSCS group to the secondary node in the cluster and check that all resources are available and running correctly.
Start the Cluster Administrator and create a new MSCS group. It is recommended that you leave the group failback policy at the default value of disabled.
a. sc \\<NodeName> create MQSeriesBrokerUserNameServer binPath= "C:\Program Files\IBM\MQSI\6.0\bin\bipservice.exe" depend= MQSeriesServices DisplayName= "IBM WebSphere Message Broker component UserNameServer" obj= <UserName> password= <Password>
b.
c. The UserName value should start with .\ for local users or be a valid Domain user. The user must be defined and valid on all nodes in the MSCS cluster.
6. On the primary node, create an MSCS Generic Service resource for the User Name Server service. Specify the dependency on the User Name Server's Queue Manager. The name of the service is MQSeriesBrokerUserNameServer. When asked for any registry keys to monitor, add the following key: SOFTWARE\IBM\WebSphereMQIntegrator\2\UserNameServer.
7. Leave the start-up policy for the User Name Server service at its default setting of manual on both nodes. MSCS will be solely responsible for starting and stopping this service.
8. Bring the UNS generic service resource online.
9. Move the MSCS group to the secondary node in the cluster and check that all resources are available and running correctly.
Message Broker
Ensure that the Message Broker is stopped.
Start the Cluster Administrator and create a new MSCS group. It is recommended that you leave the group failback policy at the default value of disabled.
4. Create a new instance using db2icrt:
   a. db2icrt <new instance name>
5. Update the command prompt to use this new instance:
   a. set DB2INSTANCE=<new instance name>
6. Start the new instance:
   a. db2start
7. Create a file (relocate.cfg, for example) to define the relocation parameters:
   a. DB_NAME=<database name>
      DB_PATH=<disk on which the database is located>
      INSTANCE=<original instance>,<new instance name>
      NODENUM=0
8. Relocate the database:
   a. db2relocatedb -f relocate.cfg
9. Delete the old database:
   a. set DB2INSTANCE=DB2
   b. db2 drop db HADB1
10. Move the new instance to MSCS control.
11. Follow the instructions for using a local Message Broker database in the Create and Configure the Message Broker section of this document.
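Before handing the new instance to MSCS, the relocation can be checked from the DB2 command line; the instance and database names are those used in the steps above:

   rem confirm the database is now catalogued under the new instance and is connectable
   set DB2INSTANCE=<new instance name>
   db2 list db directory
   db2 connect to <database name>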
c. The UserName value should start with .\ for local users or be a valid Domain user. The user must be defined and valid on all nodes in the MSCS cluster.
17. On the primary node, create an MSCS Generic Service resource for the Message Broker service.
   a. The name of the service is MQSeriesBroker<BrokerName>, where <BrokerName> should be replaced with the actual name of your Message Broker.
   b. Create a dependency on the Message Broker's Queue Manager.
   c. Create a dependency on the Message Broker's database.
   d. When asked for any registry keys to monitor, add the following key: SOFTWARE\IBM\WebSphereMQIntegrator\2\<BrokerName>, where <BrokerName> should be replaced with the actual name of your Message Broker.
18. Bring the Message Broker generic service resource online.
19. Move the MSCS group to the secondary node in the cluster and check that all resources are available and running correctly.
WMB Toolkit
The .configmgr file(s) created by the Toolkit provide location information for Configuration Manager(s). Update this file using the virtual IP address defined in the MSCS group so that the Configuration Manager can be located no matter which node it is running on.
1. Ensure the Configuration Manager group is on the primary node.
2. In the Cluster Administrator, delete the Configuration Manager resource instance.
3. Move the Configuration Manager's resources to the local machine (a reg.exe sketch of the registry update follows this procedure):
   a. Copy all files from S:\WorkPath to C:\WorkPath (for example).
   b. Using regedit, update the value of the Configuration Manager's work path, changing SOFTWARE\IBM\WebSphereMQIntegrator\2\<ConfigurationManager>\CurrentVersion\WorkPath to the new location of the files (C:\WorkPath from above).
   c. Delete from the C:\WorkPath\components directory all components that are still managed by MSCS.
   d. Delete S:\WorkPath\components\<ConfigurationManager>.
4. Remove the Configuration Manager Queue Manager from the Configuration Manager group as described in the MQ product documentation.
5. You may want to reconfigure any channels which refer to the Configuration Manager's Queue Manager so that they use the physical IP address rather than the virtual IP address.
6. Delete the resource instances for lower-level resources, such as the disks and IP addresses used by the Configuration Manager and its Queue Manager.
7. Delete the Configuration Manager group.
8. On the secondary node:
9. Use the Windows 2003 command sc to uninstall the service:
   a. sc \\<NodeName> delete MQSeriesBroker<ConfigurationManager>
   b. Also delete the following registry tree entry, if present: SOFTWARE\IBM\WebSphereMQIntegrator\2\<ConfigurationManager>
   c. Use hadltmqm to delete the Configuration Manager's Queue Manager from the secondary node.
   d. Optionally, reboot the secondary node. This will clear the active control set and ensure that the node is completely clean.
10. The result should be an operational Configuration Manager which is fixed on the primary node. Although the resources have been deleted from the cluster configuration, the real services that they represented should continue to function correctly, but under manual control.
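For step 3b, the registry update can also be scripted with reg.exe instead of being made interactively in regedit; the component name and paths are the examples used above:

   rem point the component's WorkPath value at the local copy of the files
   reg add "HKLM\SOFTWARE\IBM\WebSphereMQIntegrator\2\<ConfigurationManager>\CurrentVersion" /v WorkPath /t REG_SZ /d "C:\WorkPath" /f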
1. Ensure the UNS group is on the primary node.
2. In the Cluster Administrator, delete the UNS resource instance.
3. Move the UNS resources to the local machine:
   a. Copy all files from S:\WorkPath to C:\WorkPath (for example).
   b. Using regedit, update the value of the UNS work path, changing SOFTWARE\IBM\WebSphereMQIntegrator\2\UserNameServer\CurrentVersion\WorkPath to the new location of the files (C:\WorkPath from above).
   c. Delete from the C:\WorkPath\components directory all components that are still managed by MSCS.
   d. Delete S:\WorkPath\components\UserNameServer.
4. Remove the UNS Queue Manager from the UNS group as described in the MQ product documentation.
5. You may want to reconfigure any channels which refer to the UNS Queue Manager so that they use the physical IP address rather than the virtual IP address.
6. Delete the resource instances for lower-level resources, such as the disks and IP addresses used by the UNS and its Queue Manager.
7. Delete the UNS group.
8. On the secondary node:
   a. Use the Windows 2003 command sc to uninstall the service: sc \\<NodeName> delete MQSeriesBrokerUserNameServer
   b. Also delete the following registry tree entry, if present: SOFTWARE\IBM\WebSphereMQIntegrator\2\UserNameServer
   c. Use hadltmqm to delete the UNS Queue Manager from the secondary node.
   d. Optionally, reboot the secondary node. This will clear the active control set and ensure that the node is completely clean.
9. The result should be an operational UNS which is fixed on the primary node. Although the resources have been deleted from the cluster configuration, the real services that they represented should continue to function correctly, but under manual control.
1. Ensure the Message Broker group is on the primary node.
2. In the Cluster Administrator, delete the Message Broker's resource instance.
3. Move the Message Broker's resources to the local machine:
   a. Copy all files from S:\WorkPath to C:\WorkPath (for example).
   b. Using regedit, update the value of the Message Broker's work path, changing SOFTWARE\IBM\WebSphereMQIntegrator\2\<BrokerName>\CurrentVersion\WorkPath to the new location of the files (C:\WorkPath from above).
   c. Delete from the C:\WorkPath\components directory all components that are still managed by MSCS.
   d. Delete S:\WorkPath\components\<BrokerName>.
4. Remove the Message Broker's Queue Manager from the Message Broker group as described in the MQ product documentation.
5. You may also want to reconfigure any channels which refer to the Queue Manager so that they use the physical IP address rather than the virtual IP address assigned in MSCS.
6. Remove the broker database from the Message Broker group as described in the documentation for the db2mscs command.
7. Optionally, delete the resource instances for lower-level resources, such as the disks and IP addresses used by the broker and its dependencies.
8. Optionally, delete the Message Broker group.
9. On the secondary node:
   a. Use the Windows 2003 command sc to uninstall the service:
   b. sc \\<NodeName> delete MQSeriesBroker<BrokerName>, where <BrokerName> is the name of the Message Broker.
   c. Also delete the following registry tree entry, if present: SOFTWARE\IBM\WebSphereMQIntegrator\2\<BrokerName>, where <BrokerName> should be replaced with the actual name of your Message Broker.
   d. Also delete the ODBC data source set up for the broker's database.
   e. Use hadltmqm to delete the Message Broker's Queue Manager from the secondary node.
   f. Optionally, reboot the secondary node. This will clear the active control set and ensure that the node is completely clean.
10. The result should be an operational Message Broker which is fixed on the primary node. Although the resources have been deleted from the cluster configuration, the real services that they represented should continue to function normally, but under manual control.
type MQSIConfigMgr (
    static int OfflineTimeout = 60
    static str LogLevel = error
    static str ArgList[] = { ConfigMgrName, UserID }
    NameRule = resource.ConfigMgrName
    str ConfigMgrName
    str UserID
)

type MQSIBroker (
    static int OfflineTimeout = 60
    static str LogLevel = error
    static str ArgList[] = { BrokerName, UserID }
    NameRule = resource.BrokerName
    str BrokerName
    str UserID
)

type MQSIUNS (
    static int OfflineTimeout = 60
    static str LogLevel = error
    static str ArgList[] = { UserID }
    NameRule = "UserNameServer"
    str UserID
)
As well as creating the MQSIConfigMgr, MQSIBroker and MQSIUNS resource types, this also sets the values of the following resource type attributes:

OfflineTimeout
The VCS default of 300 seconds is quite long for an MQSI broker or UNS, so the suggested value for this attribute is 60 seconds. You can adjust this attribute to suit your own configuration, but it is recommended that you do not set it any shorter than approximately 15 seconds.

OnlineWaitLimit
It is recommended that you configure the OnlineWaitLimit for the resource types. The default setting is 2, but to accelerate detection of start failures this attribute should be set to 0.

LogLevel
It is recommended that you run the MQSIBroker and MQSIUNS agents with LogLevel set to error. This will display any serious error conditions in the VCS log. If you want more detail of what either agent is doing, you can increase the LogLevel to debug or all, but this will produce far more messages and is not recommended for regular operation.
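These attributes can also be tuned after the types have been loaded, using the standard VCS command line; a minimal sketch, assuming the type names defined above:

   # open the configuration, adjust the type attributes, then save and close it
   haconf -makerw
   hatype -modify MQSIBroker OnlineWaitLimit 0
   hatype -modify MQSIUNS OnlineWaitLimit 0
   hatype -modify MQSIConfigMgr OnlineWaitLimit 0
   haconf -dump -makero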
main.cf
Resources of types MQSIConfigMgr, MQSIBroker and MQSIUNS can be defined by adding resource entries to /etc/VRTSvcs/conf/config/main.cf. The following is a main.cf for a simple cluster (called Kona) with two systems (sunph1, sunph2) and one service group (vxg1) which includes resources for a Configuration Manager (ConfigMgr), a broker (BRK1) and a User Name Server (UserNameServer), all of which use a queue manager (VXQM1). There is also a database instance for the broker database (VXBDB1), used by the BRK1 broker. The group uses an IP address (resource name vxip1) and filesystems managed by Mount resources (vxmnt1, vxmnt2) and a DiskGroup (resource name vxdg1). Only the first part of the file is reproduced below.
include "types.cf" cluster Kona ( UserNames = { admin = "cDRpdxPmHpzS." } CounterInterval = 5 Factor = { runque = 5, memory = 1, disk = 10, cpu = 25, network = 5 } MaxFactor = { runque = 100, memory = 10, disk = 100, cpu = 100, network = 100 } ) system sunph1 system sunph2 snmp vcs ( TrapList = { 1 = "A new system has joined the VCS Cluster", 2 = "An existing system has changed its state", 3 = "A service group has changed its state", 4 = "One or more heartbeat links has gone down", 5 = "An HA service has done a manual restart", 6 = "An HA service has been manually idled", 7 = "An HA service has been successfully started" } ) group vxg1 ( SystemList = { sunph1, sunph2 } ) MQSIConfigMgr ConfigMgr { ConfigMgrName @snetterton = ConfigMgr UserID @snetterton = argostr } MQSIUNS UserNameServer { UserID @snetterton = argostr } MQSIBroker BRK1 { BrokerName @snetterton = BRK1 UserID @snetterton = argostr } Db2udb VXBDB1 ( DB2InstOwner = vxdb1 DB2InstHome = "/MQHA/VXQM1/data/databases/vxdb1" ) MQM VXQM1 ( QMName = VXQM1 ) DiskGroup vxdg1 ( DiskGroup = vxdg1 ) IP vxip1 ( Device = hme0 Address = "9.20.110.248"
Comments
If you have any comments on this SupportPac, please send them to:

Email:
convery@uk.ibm.com
coxsteph@uk.ibm.com
aford@uk.ibm.com

Post:
Rob Convery, MailPoint 211, IBM UK Laboratories Ltd, Hursley Park, Winchester, SO21 2JN, UK
Stephen Cox, MailPoint 154, IBM UK Laboratories Ltd, Hursley Park, Winchester, SO21 2JN, UK