HACMP Overview
[Diagram: HACMP cluster - Node A and Node B connected over network(s) to a client, with a shared disk between the nodes]
Hardware
pSeries Servers / POWER
- No integrated serial ports for heartbeat on POWER5 servers
- Announcement letter for limitations - #5765-F62
Network/SAN
- 2-port Async RS-232 - FC 5723
- 2 Gb FC PCI-X - FC 5716
- 10 Gb - FC 5718 and FC 5719
- IBM, Cisco, McData, Brocade, etc.
- TotalStorage SAN Volume Controller

Software
- Stand-alone tool for planning a cluster
- Can be used to configure a cluster with the cl_opsconfig utility
- Does not monitor or manage a cluster
- Usable on AIX or Windows 2000
Installation requirements:
- Java Runtime Environment version 1.3.0 or higher
- AIX already includes this JRE level; Windows needs it: www.ibm.com/developerworks/webservices/sdk
[Diagram: cl_opsconfig applies the planning worksheet to an HACMP cluster, producing the new HACMP cluster configuration]
IBM Corporation 2008
- /usr/es/sbin/cluster/wsm/README tells how to install Apache from RPMs
- Fileset: cluster.es.client.wsm
- Optional documentation filesets: cluster.doc.en_US.es.pdf, cluster.doc.en_US.es.html
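As an illustrative sketch (assuming the filesets are available on installation media at /dev/cd0, a hypothetical source here), the WebSMIT fileset and the optional documentation filesets could be installed with installp:

```shell
# -a apply, -X expand filesystems if needed, -d install source
installp -aX -d /dev/cd0 cluster.es.client.wsm \
    cluster.doc.en_US.es.pdf cluster.doc.en_US.es.html

# Verify what was installed
lslpp -l cluster.es.client.wsm
```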
HACMP Components
[Diagram: HACMP components - nodes communicating through the RMC subsystem over an IP network, an RS-232 link, and disk heartbeat paths]
Other requirements:
- Persistent IP addresses
Which IPAT?
[Diagram: a client connected to cluster nodes sysa and sysb]
Do not place intelligent network equipment that does not transparently pass UDP broadcasts and other packets through to all cluster nodes in the paths between cluster nodes and clients. If such equipment must be in those paths, use a PING_CLIENT_LIST (in clinfo.rc).
Persistent Labels
- An IP alias that is always available while a service or boot interface is active
- Intended to provide administrators access to a node
- Only one persistent label per node per network is allowed
- Once synchronized, they are always available
- Can be used for the HATivoli oserv process IP
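As a hedged sketch of how a persistent label is defined and then observed (menu wording varies slightly by HACMP 5.x level):

```shell
# Define the persistent label through the extended SMIT path:
#   smitty hacmp
#     -> Extended Configuration
#       -> Extended Topology Configuration
#         -> Configure HACMP Persistent Node IP Label/Addresses
#
# After synchronization, the persistent alias appears as an
# additional address on the interface:
netstat -in
```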
- A point-to-point, non-IP serial network, usually implemented with an async adapter and a null-modem cable connection
- On some pSeries servers that have 3 or 4 built-in serial ports, ports 2, 3, or 4 can be used for this connection
- Built-in serial port 1 is not supported for HACMP
- Requires an enhanced concurrent VG
- Configured via the "Extended Configuration" path
- Uses a disk sector formerly reserved for clvmd
- May not be a good alternative for a disk with heavy I/O
[Diagram: three disk heartbeat networks - disknet1, disknet2, disknet3]
clcomd is managed by the SRC and started by init; the inittab entry is:
clcomdES:2:once:startsrc -s clcomdES > /dev/console 2>&1
Source addresses are checked against:
- /usr/sbin/cluster/etc/rhosts
- HACMPadapter ODM
- HACMPnode ODM (communication paths)
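Because clcomd runs under the SRC, it can be checked and restarted with the standard SRC commands (a sketch; exact output varies by AIX level):

```shell
lssrc -s clcomdES      # show subsystem status (active/inoperative)
stopsrc -s clcomdES    # stop the daemon
startsrc -s clcomdES   # start it again, as the inittab entry does at boot
```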
- The default is autodiscovery
- AIX cluster security - CtSec
- Use a VPN tunnel
To use a VPN tunnel:
- Set up persistent IP labels on the same subnet
- Use chssys to add the -p flag to clcomd, specifying port 6191 (the clcomd entry in /etc/services)
- Use the extended VPN configuration screen to secure traffic for other cluster services
Log Files
- /var/hacmp/clcomd/clcomd.log[.0] - up to 1 MB each
- /var/hacmp/clcomd/clcomddiag.log[.0] - up to 9 MB each
- Uses RSCT for communication
- HACMP coordinates activity between nodes (active vs. passive varyon, etc.)
- Requires bos.clvm.enh
- If migrating, shared VGs must be converted: use System Management (C-SPOC), which is recommended, or chvg -C on ALL cluster nodes
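A minimal sketch of the manual conversion route (the volume group name datavg is hypothetical; C-SPOC is the recommended method):

```shell
# Run on ALL cluster nodes, per the migration note above:
chvg -C datavg       # make the VG enhanced concurrent capable

lsvg datavg          # confirm the VG characteristics afterwards
```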
Management feature to simplify the configuration of LVM mirroring between two sites.
- Provides automatic LVM mirror synchronization after a disk failure, once the node/disk becomes available again on the SAN
- Maintains data integrity by eliminating manual LVM operations
- Cluster verification enhancements help ensure data high availability
- Keeping the data in different locations eliminates the possibility of data loss from a disk-block failure
- For high availability, each mirror copy should be located on a separate physical disk, in a separate disk enclosure, at a separate physical location
- LVM mirroring allows up to three data copies
- Mirror synchronization is required for stale partitions
[Diagram: cross-site LVM mirroring - nodes B, C, and D at two sites attached through FC Switch 1 and FC Switch 2 to mirrored physical volumes PV1-PV6]
[Diagram: AIX1 and AIX2 partitions accessing AIX1VG and AIX2VG on a SAN storage subsystem as hdisk1 and hdisk2, through a VIO Server, Ethernet, and the hypervisor]
- The VIOS owns the physical disk resources (LVM-based storage on the VIO Server)
- LPARs see disks as vSCSI (Virtual SCSI) devices
- Virtual SCSI devices are added to a partition via the HMC
- LUNs on the VIOS are accessed as vSCSI disks
A single VIO Server configuration has exposures:
- The VIO Server partition is shut down or fails
- Network connectivity through the VIO Server
- Disk failure
- System failure
[Diagram: redundant VIO Servers - AIX1 and AIX2 partitions with vSCSI and vLAN paths through two VIO Servers and Shared Ethernet Adapters, on VLAN 1 and VLAN 2, out to external servers]
Virtual Ethernet
- Partition-to-partition communication
- Requires AIX 5L V5.3 and POWER5
VLAN (Virtual LAN)
- Provides the ability for an adapter to be on multiple subnets
- Provides isolation of communication to VLAN members
- Allows a single adapter to support multiple subnets
IEEE VLANs
- Up to 4096 VLANs
- Up to 65533 vENET adapters
- 21 VLANs per vENET adapter
Shared Ethernet Adapter
- Provides access to the outside world
- Uses a physical adapter in the Virtual I/O Server
[Diagram: VIO Server bridging virtual Ethernet and vSCSI disks hdisk1 and hdisk2 through the hypervisor]
A collection of resources is a resource group. Resources can be:
- Applications
- Volume groups, disks, filesystems
- IP addresses
Custom resource groups:
- Users explicitly specify the desired startup, fallover, and fallback behaviors
- Can be configured using the standard or extended path
- Settling and fallback timers provide further granularity
- Dynamic node priority can provide even further granularity in a multi-node cluster
Fallover Preferences
- Fallover To Next Priority Node In The List (FNPN)
- Fallover Using Dynamic Node Priority (FUDNP)
- Bring Offline (On Error Node Only) (BO)
Resource Distribution Policies
- All policies are exercised by cluster event scripts: acquire_service_addr, acquire_takeover_addr, cl_configure_persistent_address
- Policies: collocation with persistent, anti-collocation with persistent
This feature is available in all versions of HACMP V5: HA 5.1 requires APAR IY63515, and HA 5.2 requires APAR IY63516.
[Diagram: resource group dependencies - Resource Group B and Resource Group C shown as parent resource groups]
Custom Resources
Three-node cluster with one resource group configured for Online on Home Node Only priority at startup. sysa is the current owner of the resource group.
[Diagrams: resource group GROUPA (a_svc 1.1.1.1, dbvg, dbapp) shown across a sequence of states on nodes sysa, sysb, and sysc, with service, boot, and standby labels (a_svc, a_stdby, b_svc, b_boot, b_stdby, c_svc, c_boot, c_stdby) as the group falls over and falls back]
SMIT entry fields include:
* Settling Time (in seconds)
Delayed Fallback Timer entry fields:
* Name of the Fallback Policy
* YEAR
* MONTH (Jan - Dec)
* Day of Month (1 - 31)
* HOUR (0 - 23)
* MINUTES (0 - 59)
Application Monitoring
- HACMP supports multiple monitors per application server
- Configured via SMIT (Extended Configuration)
# smitty hacmp
  Extended Configuration
    Extended Resource Configuration
      HACMP Extended Resources Configuration
        Configure HACMP Applications
          Add an Application Server
Add Application Server
                                        [Entry Fields]
* Server Name                           [appsrv]
* Start Script                          [/app/startserver]
* Stop Script                           [/app/stopserver]
  Application Monitor Name(s)           monitor1 monitor2     +
Application Monitoring
Add a Process Application Monitor
Type or select values in entry fields. Press Enter AFTER making all desired changes.
                                        [Entry Fields]
* Monitor Name                          []
* Application Server(s) to Monitor      []                     +
* Monitor Mode                          [Long-running monitori> +
* Processes to Monitor                  []
* Process Owner                         []
  Instance Count                        []                     #
* Stabilization Interval                []                     #
* Restart Count                         []                     #
  Restart Interval                      []                     #
* Action on Application Failure         [notify]               +
  Notify Method                         []
  Cleanup Method                        []
  Restart Method                        []
User Interface
SMIT flow in HACMP
"Standard" configuration path allows users to easily configure most common options
- IPAT via aliasing networks
- Shared service IP labels
- Volume groups and filesystems
- Application servers
easy as pie
"Extended" path is used for fine tuning a configuration and configuring less common features
- Configure all network types
- Configure all resource types
- Less common options
- Site support
- Application monitoring
- Performance tuning parameters
User Interface
Topology configuration in the "Standard" path is carried out automatically:
- Configuration discovery is automatic
- Node names are set by discovering the host names
- IP network topology is set based on physical connectivity and netmasks
Why do I need the "Extended" path?
- Specify sites, global networks, specific network attributes
- Application monitoring
- Tape resources
- Custom disk methods and resource recovery
- Extended event configuration
- Extended performance tuning
- Security and users
- Snapshot configuration
User Interface
# smitty hacmp

HACMP for AIX
Move cursor to desired item and press Enter.
  Initialization and Standard Configuration
  Extended Configuration
  System Management (C-SPOC)
  Problem Determination Tools

F1=Help     F2=Refresh   F3=Cancel   F8=Image
F9=Shell    F10=Exit     Enter=Do
Automatically configures a simple two node cluster based on the following input:
- Communication path to the remote node
- Application server name
- Application start/stop scripts
- Service IP label
It will automatically copy the start/stop scripts to the remote node.
Two-Node Cluster Configuration Assistant
Type or select values in entry fields. Press Enter AFTER making all desired changes.
                                             [Entry Fields]
* Communication Path to Takeover Node        []              +
* Application Server Name                    []
* Application Server Start Script            []
* Application Server Stop Script             []
* Service IP Label                           []              +
Activity is logged to /var/hacmp/log/clconfigassist. The assistant will synchronize and verify; clverify can be set to auto-correct (the default is no).
Cluster Start
# smitty clstart
  System Management (C-SPOC)
    Manage HACMP Services
      Start Cluster Services
Start Cluster Services Type or select values in entry fields. Press Enter AFTER making all desired changes.
* Start now, on system restart or both                         +
  Start Cluster Services on these nodes                        +
  BROADCAST message at startup?                                +
  Startup Cluster Information Daemon
  Reacquire after forced down
  Ignore verification errors?                     false
  Automatically correct errors found during       Interactively
  Cluster start?
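Once cluster services are up, state can be checked from the command line (a sketch; paths assume a standard HACMP 5.x installation):

```shell
lssrc -ls clstrmgrES                      # cluster manager internal state
/usr/es/sbin/cluster/clstat -a            # ASCII cluster status display
/usr/es/sbin/cluster/utilities/clRGinfo   # resource group locations/states
```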
clverify log files
- clverify collects and archives the data
- /var/hacmp/clverify/current/ stores data used during the current verification attempt; this directory should not exist unless verification is running or was aborted
Configure Nodes to an HACMP Cluster (standard)
Type or select values in entry fields. Press Enter AFTER making all desired changes.
* Cluster Name                                   []
  New Nodes (via selected communication paths)   []
  Currently Configured Node(s)
Configure Resources to Make Highly Available
Move cursor to desired item and press Enter.
  Configure Service IP Labels/Addresses
  Configure Application Servers
  Configure Volume Groups, Logical Volumes, and Filesystems
  Configure Concurrent Volume Groups and Logical Volumes
- Log file: /var/hacmp/log/cl_testtool.log
- Custom test plans can be created
- The cluster test tool runs the following tests by default:
  NODE_UP - Start one or more nodes
  NODE_DOWN_FORCED - Stop a node forced
  NODE_DOWN_GRACEFUL - Stop one or more nodes gracefully
  NODE_DOWN_TAKEOVER - Stop a node with takeover
  CLSTRMGR_KILL - Catastrophic failure
  NETWORK_DOWN_LOCAL - Stop a network on a node
  NETWORK_UP_LOCAL - Restart a network on a node
  SERVER_DOWN - Stop an application server
Allows alerts of cluster events to be sent to cell phones and pagers Easily customizable using SMIT Messages may be sent through an SMS gateway
1-555-444-9999@sms.verizon.com andrew_cell@cingular.com
Smart Assists
- WebSphere 6.0 standalone and ND: N+1 and hot standby
- Oracle "cold failover cluster" (CFC) and Oracle Application Server 10g (9.0.4) (AS 10g): two-node hot standby
- DB2 UDB Enterprise Server Edition (v8.1 and 8.2): N+1 and hot standby; the DB2 software must not be installed on the shared storage
- HACMP/XD for eRCMF (Enterprise Remote Copy Management Facility): requires ESS and eRCMF version 2.0; uses site support
- GLVM: supports cross-site data replication with no distance limitation; synchronous; a maximum of 2 sites
[Diagram: HACMP/XD with ESS PPRC - Shark 1 at Site 1 and Shark 2 at Site 2 replicating over a WAN]
[Diagram: GLVM - Node A in Boston and Node B in Austin; physical volumes PV1-PV4 mapped to local hdisks and remote RPV hdisks on each node; each site holds a real copy and a virtual copy of the data]
One volume group actually spans both sites. Each site contains a copy of mission-critical data. Instead of extremely long disk cables, a TCP/IP network and the RPV device driver are used for remote disk access.
More Information
HACMP System Administration I: Planning and Implementation
HACMP System Administration II: Administration and Problem Determination
HACMP System Administration III: Virtualization and Disaster Recovery
HACMP Problem Determination and Recovery
HACMP Certification Workshop (2 days)
AM050 HACMP High Availability Products for pSeries Overview (2 days)
hafeedbk@us.ibm.com - comments and questions about HACMP
www-1.ibm.com/servers/eserver/pseries/ha
http://www.ibm.com/servers/aix/library http://www-1.ibm.com/servers/eserver/pseries/library/hacmp_docs.html