HACMP Overview
[Diagram: HACMP cluster - Node A and Node B connected over network(s) to a client, with a shared disk between the nodes]
Hardware
pSeries Servers / POWER
- No integrated serial ports for heartbeat on POWER5 servers
- Announcement letter for limitations - #5765-F62
Network/SAN
- 2-port Async RS-232 - FC 5723
- 2 Gb FC PCI-X - FC 5716
- 10 Gb - FC 5718 and FC 5719
- IBM, Cisco, McData, Brocade, etc.
- TotalStorage SAN Volume Controller

Software
- Stand-alone tool for planning a cluster
- Can be used to configure a cluster with the cl_opsconfig utility
- Does not monitor or manage a cluster
- Usable on AIX or Windows 2000
Installation requirements:
- Java Runtime Environment version 1.3.0 or higher
- AIX already includes this JRE level; Windows needs it: www.ibm.com/developerworks/webservices/sdk
[Diagram: cl_opsconfig applies the planning worksheet to an HACMP cluster, producing the new HACMP cluster configuration]
IBM Corporation 2008
- /usr/es/sbin/cluster/wsm/README tells how to install Apache from RPMs
- Fileset: cluster.es.client.wsm
- Optional documentation filesets: cluster.doc.en_US.es.pdf, cluster.doc.en_US.es.html
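As an illustrative sketch (assuming the filesets are available on installation media at /dev/cd0, a hypothetical source here), the WebSMIT fileset and the optional documentation filesets could be installed with installp:

```shell
# -a apply, -X expand filesystems if needed, -d install source
installp -aX -d /dev/cd0 cluster.es.client.wsm \
    cluster.doc.en_US.es.pdf cluster.doc.en_US.es.html

# Verify what was installed
lslpp -l cluster.es.client.wsm
```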
HACMP Components
[Diagram: HACMP components - nodes communicating through the RMC subsystem over an IP network, an RS-232 link, and disk heartbeat paths]
Other requirements:
- Persistent IP addresses
Which IPAT?
[Diagram: a client connected to cluster nodes sysa and sysb]
Do not place intelligent network equipment that does not transparently pass UDP broadcasts and other packets through to all cluster nodes in the paths between cluster nodes and clients. If such equipment must be in those paths, use a PING_CLIENT_LIST (in clinfo.rc).
Persistent Labels
- An IP alias that is always available while a service or boot interface is active
- Intended to provide administrators access to a node
- Only one persistent label per node per network is allowed
- Once synchronized, they are always available
- Can be used for the HATivoli oserv process IP
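As a hedged sketch of how a persistent label is defined and then observed (menu wording varies slightly by HACMP 5.x level):

```shell
# Define the persistent label through the extended SMIT path:
#   smitty hacmp
#     -> Extended Configuration
#       -> Extended Topology Configuration
#         -> Configure HACMP Persistent Node IP Label/Addresses
#
# After synchronization, the persistent alias appears as an
# additional address on the interface:
netstat -in
```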
- A point-to-point, non-IP serial network, usually implemented with an async adapter and a null-modem cable connection
- On some pSeries servers that have 3 or 4 built-in serial ports, ports 2, 3, or 4 can be used for this connection
- Built-in serial port 1 is not supported for HACMP
- Requires an enhanced concurrent VG
- Configured via the "Extended Configuration" path
- Uses a disk sector formerly reserved for clvmd
- May not be a good alternative for a disk with heavy I/O
[Diagram: three disk heartbeat networks - disknet1, disknet2, disknet3]
clcomd is managed by the SRC and started by init; the inittab entry is:
clcomdES:2:once:startsrc -s clcomdES > /dev/console 2>&1
Source addresses are checked against:
- /usr/sbin/cluster/etc/rhosts
- HACMPadapter ODM
- HACMPnode ODM (communication paths)
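Because clcomd runs under the SRC, it can be checked and restarted with the standard SRC commands (a sketch; exact output varies by AIX level):

```shell
lssrc -s clcomdES      # show subsystem status (active/inoperative)
stopsrc -s clcomdES    # stop the daemon
startsrc -s clcomdES   # start it again, as the inittab entry does at boot
```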
- The default is autodiscovery
- AIX cluster security - CtSec
- Use a VPN tunnel
To use a VPN tunnel:
- Set up persistent IP labels on the same subnet
- Use chssys to add the -p flag to clcomd, specifying port 6191 (the clcomd entry in /etc/services)
- Use the extended VPN configuration screen to secure traffic for other cluster services
Log Files
- /var/hacmp/clcomd/clcomd.log[.0] - up to 1 MB each
- /var/hacmp/clcomd/clcomddiag.log[.0] - up to 9 MB each
- Uses RSCT for communication
- HACMP coordinates activity between nodes (active vs. passive varyon, etc.)
- Requires bos.clvm.enh
- If migrating, shared VGs must be converted: use System Management (C-SPOC), which is recommended, or chvg -C on ALL cluster nodes
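A minimal sketch of the manual conversion route (the volume group name datavg is hypothetical; C-SPOC is the recommended method):

```shell
# Run on ALL cluster nodes, per the migration note above:
chvg -C datavg       # make the VG enhanced concurrent capable

lsvg datavg          # confirm the VG characteristics afterwards
```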
Management feature to simplify the configuration of LVM mirroring between two sites.
- Provides automatic LVM mirror synchronization after a disk failure, once the node/disk becomes available again on the SAN
- Maintains data integrity by eliminating manual LVM operations
- Cluster verification enhancements help ensure data high availability
- Keeping the data in different locations eliminates the possibility of data loss from a disk-block failure
- For high availability, each mirror copy should be located on a separate physical disk, in a separate disk enclosure, at a separate physical location
- LVM mirroring allows up to three data copies
- Mirror synchronization is required for stale partitions
[Diagram: cross-site LVM mirroring - nodes B, C, and D at two sites attached through FC Switch 1 and FC Switch 2 to mirrored physical volumes PV1-PV6]
[Diagram: AIX1 and AIX2 partitions accessing AIX1VG and AIX2VG on a SAN storage subsystem as hdisk1 and hdisk2, through a VIO Server, Ethernet, and the hypervisor]
- The VIOS owns the physical disk resources (LVM-based storage on the VIO Server)
- LPARs see disks as vSCSI (Virtual SCSI) devices
- Virtual SCSI devices are added to a partition via the HMC
- LUNs on the VIOS are accessed as vSCSI disks
A single VIO Server configuration has exposures:
- The VIO Server partition is shut down or fails
- Network connectivity through the VIO Server
- Disk failure
- System failure
[Diagram: redundant VIO Servers - AIX1 and AIX2 partitions with vSCSI and vLAN paths through two VIO Servers and Shared Ethernet Adapters, on VLAN 1 and VLAN 2, out to external servers]
Virtual Ethernet
- Partition-to-partition communication
- Requires AIX 5L V5.3 and POWER5
VLAN (Virtual LAN)
- Provides the ability for an adapter to be on multiple subnets
- Provides isolation of communication to VLAN members
- Allows a single adapter to support multiple subnets
IEEE VLANs
- Up to 4096 VLANs
- Up to 65533 vENET adapters
- 21 VLANs per vENET adapter
Shared Ethernet Adapter
- Provides access to the outside world
- Uses a physical adapter in the Virtual I/O Server
[Diagram: VIO Server bridging virtual Ethernet and vSCSI disks hdisk1 and hdisk2 through the hypervisor]
A collection of resources is a resource group. Resources can be:
- Applications
- Volume groups, disks, filesystems
- IP addresses
Custom resource groups:
- Users explicitly specify the desired startup, fallover, and fallback behaviors
- Can be configured using the standard or extended path
- Settling and fallback timers provide further granularity
- Dynamic node priority can provide even further granularity in a multi-node cluster
Fallover Preferences
- Fallover To Next Priority Node In The List (FNPN)
- Fallover Using Dynamic Node Priority (FUDNP)
- Bring Offline (On Error Node Only) (BO)
Resource Distribution Policies
- All policies are exercised by cluster event scripts: acquire_service_addr, acquire_takeover_addr, cl_configure_persistent_address
- Policies: collocation with persistent, anti-collocation with persistent
This feature is available in all versions of HACMP V5: HA 5.1 requires APAR IY63515, and HA 5.2 requires APAR IY63516.
[Diagram: resource group dependencies - Resource Group B and Resource Group C shown as parent resource groups]
Custom Resources
Three-node cluster with one resource group configured for Online on Home Node Only priority at startup. sysa is the current owner of the resource group.
[Diagrams: resource group GROUPA (a_svc 1.1.1.1, dbvg, dbapp) shown across a sequence of states on nodes sysa, sysb, and sysc, with service, boot, and standby labels (a_svc, a_stdby, b_svc, b_boot, b_stdby, c_svc, c_boot, c_stdby) as the group falls over and falls back]
SMIT entry fields include:
* Settling Time (in seconds)
Delayed Fallback Timer entry fields:
* Name of the Fallback Policy
* YEAR
* MONTH (Jan - Dec)
* Day of Month (1 - 31)
* HOUR (0 - 23)
* MINUTES (0 - 59)
Application Monitoring
- HACMP supports multiple monitors per application server
- Configured via SMIT (Extended Configuration)
# smitty hacmp
  Extended Configuration
    Extended Resource Configuration
      HACMP Extended Resources Configuration
        Configure HACMP Applications
          Add an Application Server
Add Application Server
                                        [Entry Fields]
* Server Name                           [appsrv]
* Start Script                          [/app/startserver]
* Stop Script                           [/app/stopserver]
  Application Monitor Name(s)           monitor1 monitor2     +
Application Monitoring
Add a Process Application Monitor
Type or select values in entry fields. Press Enter AFTER making all desired changes.
                                        [Entry Fields]
* Monitor Name                          []
* Application Server(s) to Monitor      []                     +
* Monitor Mode                          [Long-running monitori> +
* Processes to Monitor                  []
* Process Owner                         []
  Instance Count                        []                     #
* Stabilization Interval                []                     #
* Restart Count                         []                     #
  Restart Interval                      []                     #
* Action on Application Failure         [notify]               +
  Notify Method                         []
  Cleanup Method                        []
  Restart Method                        []
User Interface
SMIT flow in HACMP
"Standard" configuration path allows users to easily configure most common options
- IPAT via aliasing networks
- Shared service IP labels
- Volume groups and filesystems
- Application servers
easy as pie
"Extended" path is used for fine tuning a configuration and configuring less common features
- Configure all network types
- Configure all resource types
- Less common options
- Site support
- Application monitoring
- Performance tuning parameters
User Interface
Topology configuration in the "Standard" path is carried out automatically:
- Configuration discovery is automatic
- Node names are set by discovering the host names
- IP network topology is set based on physical connectivity and netmasks
Why do I need the "Extended" path?
- Specify sites, global networks, specific network attributes
- Application monitoring
- Tape resources
- Custom disk methods and resource recovery
- Extended event configuration
- Extended performance tuning
- Security and users
- Snapshot configuration
User Interface
# smitty hacmp

HACMP for AIX
Move cursor to desired item and press Enter.
  Initialization and Standard Configuration
  Extended Configuration
  System Management (C-SPOC)
  Problem Determination Tools

F1=Help     F2=Refresh   F3=Cancel   F8=Image
F9=Shell    F10=Exit     Enter=Do
Automatically configures a simple two node cluster based on the following input:
- Communication path to the remote node
- Application server name
- Application start/stop scripts
- Service IP label
It will automatically copy the start/stop scripts to the remote node.
Two-Node Cluster Configuration Assistant
Type or select values in entry fields. Press Enter AFTER making all desired changes.
                                             [Entry Fields]
* Communication Path to Takeover Node        []              +
* Application Server Name                    []
* Application Server Start Script            []
* Application Server Stop Script             []
* Service IP Label                           []              +
Activity is logged to /var/hacmp/log/clconfigassist. The assistant will synchronize and verify; clverify can be set to auto-correct (the default is no).
Cluster Start
# smitty clstart
  System Management (C-SPOC)
    Manage HACMP Services
      Start Cluster Services
Start Cluster Services Type or select values in entry fields. Press Enter AFTER making all desired changes.
* Start now, on system restart or both                         +
  Start Cluster Services on these nodes                        +
  BROADCAST message at startup?                                +
  Startup Cluster Information Daemon
  Reacquire after forced down
  Ignore verification errors?                     false
  Automatically correct errors found during       Interactively
  Cluster start?
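Once cluster services are up, state can be checked from the command line (a sketch; paths assume a standard HACMP 5.x installation):

```shell
lssrc -ls clstrmgrES                      # cluster manager internal state
/usr/es/sbin/cluster/clstat -a            # ASCII cluster status display
/usr/es/sbin/cluster/utilities/clRGinfo   # resource group locations/states
```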
clverify log files
- clverify collects and archives the data
- /var/hacmp/clverify/current/ stores data used during the current verification attempt; this directory should not exist unless verification is running or was aborted
Configure Nodes to an HACMP Cluster (standard)
Type or select values in entry fields. Press Enter AFTER making all desired changes.
* Cluster Name                                   []
  New Nodes (via selected communication paths)   []
  Currently Configured Node(s)
Configure Resources to Make Highly Available
Move cursor to desired item and press Enter.
  Configure Service IP Labels/Addresses
  Configure Application Servers
  Configure Volume Groups, Logical Volumes, and Filesystems
  Configure Concurrent Volume Groups and Logical Volumes
- Log file: /var/hacmp/log/cl_testtool.log
- Custom test plans can be created
- The cluster test tool runs the following tests by default:
  NODE_UP - Start one or more nodes
  NODE_DOWN_FORCED - Stop a node forced
  NODE_DOWN_GRACEFUL - Stop one or more nodes gracefully
  NODE_DOWN_TAKEOVER - Stop a node with takeover
  CLSTRMGR_KILL - Catastrophic failure
  NETWORK_DOWN_LOCAL - Stop a network on a node
  NETWORK_UP_LOCAL - Restart a network on a node
  SERVER_DOWN - Stop an application server
Allows alerts of cluster events to be sent to cell phones and pagers Easily customizable using SMIT Messages may be sent through an SMS gateway
1-555-444-9999@sms.verizon.com andrew_cell@cingular.com
Smart Assists
- WebSphere 6.0 standalone and ND: N+1 and hot standby
- Oracle "cold failover cluster" (CFC) and Oracle Application Server 10g (9.0.4) (AS 10g): two-node hot standby
- DB2 UDB Enterprise Server Edition (v8.1 and 8.2): N+1 and hot standby; the DB2 software must not be installed on the shared storage
- HACMP/XD for eRCMF (Enterprise Remote Copy Management Facility): requires ESS and eRCMF version 2.0; uses site support
- GLVM: supports cross-site data replication with no distance limitation; synchronous; a maximum of 2 sites
[Diagram: HACMP/XD with ESS PPRC - Shark 1 at Site 1 and Shark 2 at Site 2 replicating over a WAN]
[Diagram: GLVM - Node A in Boston and Node B in Austin; physical volumes PV1-PV4 mapped to local hdisks and remote RPV hdisks on each node; each site holds a real copy and a virtual copy of the data]
One volume group actually spans both sites. Each site contains a copy of mission-critical data. Instead of extremely long disk cables, a TCP/IP network and the RPV device driver are used for remote disk access.
More Information
HACMP System Administration I: Planning and Implementation
HACMP System Administration II: Administration and Problem Determination
HACMP System Administration III: Virtualization and Disaster Recovery
HACMP Problem Determination and Recovery
HACMP Certification Workshop (2 days)
AM050 HACMP High Availability Products for pSeries Overview (2 days)
hafeedbk@us.ibm.com - comments and questions about HACMP
www-1.ibm.com/servers/eserver/pseries/ha
http://www.ibm.com/servers/aix/library http://www-1.ibm.com/servers/eserver/pseries/library/hacmp_docs.html