
Questions during AIX Virtual User Group Webinar January 26, 2012

PowerHA Session 1

Q: Can you communicate via the SAN in 7.1 using NPIV adapters over SVC?
A: Yes, you can use SANCOM via NPIV. Note that TME would need to be enabled on the HBAs on the VIO
servers. There is also another requirement for a virtual interface on VLAN 3358 on the clients in order
for them to register the traffic. Whether you use SVC or some other storage makes no difference.

Q: Does an HACMP cluster require the same HW model for both servers (primary and HA node)?
A: No. Shawn said you could mix hardware models. You should keep AIX levels the same though.
Q: How do we configure a parent resource group and a child resource group?
A: The panels are within the Extended Configuration options within the HA SMIT panels. We
documented it pretty well in the 5.3 HA redbook with scenarios. Otherwise the software pubs have this
outlined in detail. Offhand I am not sure whether it is in the Installation and Planning Guide or the
Admin Guide.
Q: Will there be any such sessions on GPFS?
A: I can put it on the list. We haven't had one in a while.
Q: More to the point, using GPFS as a NFS replacement in a PowerHA environ.
A: Yep, GPFS would typically provide a more robust NFS type configuration

Q: How do you write an error when rootvg is gone?
A: We are running at the kernel level now with CAA when using 7.1 or 7.1.1. That is a significant change
from previous releases. In prior releases you had no access to the runtimes. It also didn't help that the
heartbeats were cached in memory, so the fallover could take longer than the traditional failure
detection rate.
Q: What will the failover time be in 7.1?
A: Seconds. Typically faster than any 6.1 version. Note that the application dictates a lot of the fallover
time, which HA has no control over. Also, version 7.1 requires the use of ECM VGs for all your volume
groups, hence if you were comparing to an older version that was using standard VGs you would see a
significant gain. Otherwise you are only talking a difference of seconds from older to new. Note that
the newer versions would be sitting on a more resilient infrastructure since they leverage CAA
communication at the kernel level.
Q: So is NFS (the old HANFS) part of the "standard Edition"?
A: Yes, HA will handle NFS crossmounts now with the Standard Edition for NFSV3 and NFSV4
Q: For multicast network communication, if you are connected to 3 networks and only 1 or 2 allow
multicast traffic, is this supported?
A: sure, but you would probably want to restrict the interfaces that are on the non-multicast capable
networks from heartbeating, via the ifrestrict file
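As a rough illustration of the restriction described above, the following Python sketch filters a node's
interfaces against a restrict list. The answer does not give the file's location or syntax, so the
one-interface-name-per-line format, the function names, and the interface names (en0, en1, en2) are all
assumptions made for illustration. This is not how CAA itself reads the file.

```python
def parse_restricted(text):
    """Collect interface names from restrict-list text.

    Assumption: one interface name per line; blank lines are ignored.
    """
    names = set()
    for line in text.splitlines():
        name = line.strip()
        if name:
            names.add(name)
    return names


def heartbeat_interfaces(all_interfaces, restricted):
    """Return the interfaces that remain eligible for heartbeating, in order."""
    return [iface for iface in all_interfaces if iface not in restricted]


# Hypothetical node with three interfaces; en2 sits on the non-multicast network.
restricted = parse_restricted("en2\n")
eligible = heartbeat_interfaces(["en0", "en1", "en2"], restricted)
print(eligible)
```

With the hypothetical input above, only the two multicast-capable interfaces remain in the list.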
Q: If you have a path failure, the I/Os to rootvg could take a while to time out (say 45 sec for
PowerPath) before moving to the still good path. Can you account for this in rootvg failure detection?
A: Good question. I am not sure offhand what the timeout is if there is a path failure and the swap to the
new path has not taken place. I would need to follow up on that item. Shoot one of us or Joe an email and
we will try to get you a response.
Q: Are there best practices/requirements for zoning NPIV for use with SAN cluster communication? I.e.,
do you zone all the WWNs into one big zone? Do you pick an HBA from each and create a zone for those
two HBAs? Repeat for all HBAs in the cluster?
A: This one is kind of up to the practices that you follow in house. Let's say you have a choice: a
dedicated zone for the WWPNs on both sides, or one big zone. As long as they can see each other the
SANCOM will work.
Q: 1) Is the Cluster Repository Disk used the way the DiskHB was in previous versions?
A: It is used for heartbeating, yes, but it does not use the same protocol, nor is it a point-to-point
network like it was in the past. A network does not even need to be defined for it in order for the
cluster to use it for heartbeating.
Q: three questions:
A: --unanswered--
Q: 2) Are App servers going away in V7, being replaced by Resource Type and groups?
A: not going away. App servers were renamed to be called application controllers, same exact concept -
just new name
Q: 3) Where is the best location to get more info about using CoD w/HA?
A: the new HA 6.1 DR redbook had a section that covers some of it in detail (I wrote it), otherwise shoot
us an email.

Q: What are the main *implementation* differences between PowerHA 6.1 & 7.1?
A: this should be covered in Shawn's presentation.
Q: What are the main *implementation* differences between PowerHA 6.1 & 7.1?
A: There are different fileset and minimum requirements that will probably be covered in the session.
There are also some new requirements like the need to have IP multicasting enabled on the 7.1 releases
in order for HA to be able to heartbeat

Q: How does this work with Oracle Clusterware? Would it just be a replacement?
A: behind on the questions so can you please repost with the details about the presentation that apply
to your question

Q: Does 6.1 come with the Cluster Test Tool or is this a 7.1 feature?
A: Yes, it came in 5.X so 6.1 has it
Q: Which version of AIX supports CAA?
A: AIX 6.1 TL6 and AIX 7.1. The newest 7.1.1 release needs the higher levels of AIX/CAA which are AIX 6.1
TL7 SP2 and AIX 7.1 TL1 SP2
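The minimum levels quoted above for the 7.1.1 release can be checked with a short Python sketch. It
assumes the level is supplied as a string in the form printed by `oslevel -s` (for example
7100-01-02-1150); the function names and the lookup table are hypothetical and only encode the two
levels named in the answer.

```python
import re

# Minimum (technology level, service pack) per AIX release for PowerHA 7.1.1,
# taken from the answer above: AIX 6.1 TL7 SP2 and AIX 7.1 TL1 SP2.
MIN_LEVELS_711 = {
    "6100": (7, 2),
    "7100": (1, 2),
}


def parse_oslevel(text):
    """Split an `oslevel -s` style string into (release, tl, sp)."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})(?:-\d{4})?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised oslevel string: {text!r}")
    release, tl, sp = match.groups()
    return release, int(tl), int(sp)


def meets_711_minimum(text):
    """True if the level is at or above the quoted minimum for its release."""
    release, tl, sp = parse_oslevel(text)
    minimum = MIN_LEVELS_711.get(release)
    if minimum is None:
        return False  # release not listed in the answer
    return (tl, sp) >= minimum


print(meets_711_minimum("7100-01-02-1150"))
print(meets_711_minimum("6100-06-05-1115"))
```

The tuple comparison treats a higher technology level as sufficient regardless of service pack, which
is a simplification.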

Q: Do we need a new license for HA 7.1 if we migrate from HA 6.1 to HA 7.1?


A: HA 6.1 changed the packaging from HACMP and HACMP/XD to standard edition and enterprise
edition. As long as your SWMA is current and you are updating from local clustering to the standard
Edition you are good to go, upgrades should be free. Going from standard edition to enterprise edition
would cost $

Q: moving RG based on energy (Exhaust temp) of nodes is available through Director's VMC..it is not RG
but moving the partitions..
A: HA used to have a bunch of options using dynamic node priority, which typically would make sense in
clusters larger than 2 nodes. Note that VMC is handling the relocation from the outside. Just another
option available for the management of the power systems environment, but you are right not available
from within PowerHA SystemMirror
Q: (Director)command line is not clmgr it is "smcli" which uses clmgr internally for PowerHA...fyi..
A: thanks
Q: I understand that 7.1 does have enterprise support like sites, Metro Mirror..? Is that right?
A: No. There is no Enterprise Edition available for the 7.1 releases. Expect something this coming Fall
2012. So you could potentially have a stretch cluster (with a stretched out SAN and LVM mirroring or
VDISK mirroring), but currently there is no integration in HA 7.x with the IP or storage level
replication offerings.
Q: The licensing question was actually about On/Off CoD rather than DLPAR, but the answer did provide
some good information. Do you happen to know the answer to the specific question as to whether or
not temporary On/Off CoD charges include temporary usage of PowerHA?
A: If you have it incorporated into the fallover as I mentioned, I would say yes. If you are just activating
resources on the fly to boost horsepower, then I would say no.
Q: Let me phrase the question another way. Are there ooCoD daily use feature codes/charges for
PowerHA 6 or PowerHA 7?
A: Not that we are aware of. Note that we don't have keys enforcing the CPU counts that you are
licensed for so if it was only for a short test it would be pretty tough to detect.
Q: If a system licensed for PowerHA employs On/Off CoD to temporarily activate additional processors,
are there any additional temporary usage license charges for PowerHA?
A: You can leverage DLPAR to manipulate the CPU and/or memory counts on fallover. The integrated
panels do allow you to specify to use on/off. Since the application would be leaving the source box you
wouldn't have temp HA licenses, the existing licenses would carry over. So if you implement it properly
you could typically get away with licensing 5 on prod and 1 on the fallover system and only having to
license 6 CPUs, rather than 10. That is assuming that the standby side is not running anything that
would consume the CPUs.
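The arithmetic behind the 6-versus-10 figure can be written out as a small Python sketch. The function
names are hypothetical, and the model assumes, as the answer does, that the standby side runs nothing
else and only grows (via DLPAR and On/Off CoD) after the application has left the production box.

```python
def licenses_with_dlpar(prod_cpus, standby_idle_cpus):
    """CPUs to license when the standby only grows at fallover time.

    The production licenses carry over with the application, so only the
    CPUs active on both sides during normal operation are counted.
    """
    return prod_cpus + standby_idle_cpus


def licenses_fully_sized(prod_cpus):
    """CPUs to license when the standby is permanently sized like production."""
    return prod_cpus * 2


print(licenses_with_dlpar(5, 1))   # the 5 + 1 case from the answer
print(licenses_fully_sized(5))     # the comparison case from the answer
```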
Q: If all systems in a cluster need access to a shared disk, then is GPFS a requirement?
A: HA can provide concurrent support for raw lvs from multiple systems - ie Oracle RAC. To provide
access to a concurrent filesystem the only thing that HA can do is manage NFS crossmounts. To truly
have access to a global namespace via network or SAN connectivity GPFS might be the way to go
depending on the requirements
Q: Why is HA not implemented at the VIOS level, similar to vSphere? It would take the complexity away
from the normal AIX LPARs.
A: Probably because they originated from different development areas within IBM. HA has been around
since way before the VIO and not all implementations are virtualized today, even though that is the
ongoing trend.
Q: This is true, but vSphere handles it in great way. IBM might want to think about it
A: agreed, any feedback we ever receive typically gets passed to development on our monthly calls.
Note that you can also email the team at hafeedbk@us.ibm.com with your suggestions
Q: ty. I will send them an email. Let's hope they already work on it.
A: --unanswered--
Q: I could never invoke a failover with a failed application disk unless I mirrored an LV. The mirroring is
handled at the SAN level in our environment.
A: It would work for the loss of quorum. I believe that changes were made a while back to
accommodate for non cross site LVM configurations as well. And if I recall correctly it was still based on
the logging of the LVM_SA_QUORCLOSE event in the error report
Q: What is considered losing the rootvg? Is FS corruption considered?
A: Not sure how it would react to actual corruption. The rootvg loss scenario would be most common
when you are booting off the SAN and you lose access to the storage, and hence to the runtimes.
I.e., failed cables, ports, storage.
Q: Then if you boot from the SAN and lose access, the SAN should also be clustered, with a
way to sync the disks like FlashCopy or BCV... is this the case?
A: You have to be careful with the replication of rootvg (especially in an HA environment), not sure if
you were considering that as well. You can certainly do your data volumes, as many customers do.
Q: At a company I support we went through a situation where a FS in the rootvg was corrupted,
and even though there were errors reported in the errpt, the cluster did not fail over to the other node.
A: As unlikely as that scenario is, rootvg corruption would be tough to accommodate since you
cannot guarantee what portions of the rootvg would work or not work.
Q: Generic XD Replicated Resources: Is that a 7.1-only feature, or does 6.1 have it as well?
A: The cluster test tool came in the 5.X releases, so yes there for 6.1 or 7.1 for sure
Q: Any hope of getting non-IP heartbeat on XD (Metro Mirror) clusters?
A: There is the ability to define xd-rs232 networks and disk heartbeating for extended clusters. We
documented this ability in the HA 6.1 Enterprise Redbook that came out a couple of years ago,
SG24-7841 if I recall correctly.
Q: Is this feature applicable to SVC PPRC relationship?
A: Playing catchup on the responses. Can you repost the question with the feature you are referring to?
Q: On the Multicast IP polling, can that be over a WAN to some remote server?
A: In a stretch cluster configuration yes, but there are other considerations like the repository disk
requirements. Note that there is also currently no site support in the 7.1 releases. We should have an
Enterprise Edition available for 7.1 this Fall 2012.

Q: How do you find which node is the home node?
A: Home node for what piece? clRGinfo will show the location where the RG is being hosted. If you
mean the resource group policies, it would be based on the node list within the resource group,
i.e. Node A, Node B or Node B, Node A. The order would matter depending on the policies that you
specified for Startup, Fallover and Fallback.
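To make the role of node list order concrete, here is a simplified Python model. It assumes the home
node is the first entry in the resource group's node list and that fallover picks the next available
node in list order. Both function names are hypothetical, and the real behavior depends on the Startup,
Fallover and Fallback policies chosen.

```python
def home_node(node_list):
    """Treat the first node in the ordered list as the home node (assumption)."""
    if not node_list:
        raise ValueError("resource group has an empty node list")
    return node_list[0]


def next_priority_node(node_list, current, available):
    """Return the first available node in list order other than `current`.

    Returns None when no other node in the list is available.
    """
    for node in node_list:
        if node != current and node in available:
            return node
    return None


# The same two nodes listed in a different order give a different home node.
print(home_node(["NodeA", "NodeB"]))
print(home_node(["NodeB", "NodeA"]))
```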
Q: How do you extend a cluster FS or VG?
A: If you are referring to the CAA repository disk I would not do that (they really want you guys to not
manipulate that volume). Note that its actual usage is minimal and the requirements are pretty tiny.
Q: HI ..Link to the Presentation Material?
A: The presentation materials are on the wiki:
https://www.ibm.com/developerworks/wikis/display/WikiPtype/AIX+Virtual+User+Group+-+USA

Audience Question
A: --unanswered--
Q: I have done an offline migration of 5.5 to 6.1.
A: Same thing. To complete it, start up HA on all nodes, then proceed with business as usual: verify /
sync and other operations.
Q: After upgrading 5.5 to 6.1 I could not verify & sync the cluster until I started the cluster. Is that normal behaviour?
A: It really depends on how you did the migration. Note that for a rolling migration the last node has to
be integrated into the cluster for the migration to complete. Hence, once that step was done then verify
and sync would work (it is typically not advised to do a verify/sync mid migration - if that is what you
did)
Q: sorry
A: --unanswered--
Q: After upgrading 5.5 to 6.1 I could not verify & sync the cluster until I started the cluster. Is that
normal behaviour? I have done an offline migration of 5.5 to 5.1.
A: --unanswered--
Q: How do you change VG attributes (big, scalable) under PowerHA?
A: The options were not there in prior releases. I believe that the 7.1 versions have finally caught up to
include those options in the CSPOC panels.
Q: Can HA7.1 be at all configured without using the Central repository disk needed by CAA ? Assuming
that all other heartbeat requirement is met and keeping the HA config local on nodes is acceptable ?
A: No. The CAA repo disk is a set requirement for now. The 7.1.1 version will be providing the ability to
swap your repository disk to a new one in SP1 coming out in the near future
Q: In HA 7.1, with a node having multiple network adapters, can a specific adapter be kept outside of
cluster control ?
A: Yes, there is now a way to restrict the adapters that send heartbeat transmissions. Not sure if Shawn
has it documented in the slides, but there is an ifrestrict file that you can populate with the interfaces
not to use for HA communication.
Q: in HA 7.1, with multiple adapters, 2 adapter are connected to multicast supported networks and 3rd
one does not support multicast. 3rd interface has been added to ifrestrict file. Can we still have a service
IP float across this 3rd interface ?
A: I don't believe so since the interface will not show up in the CAA status and we could not monitor its
status. Hence things like selective fallover recovery behaviors would not be able to work appropriately
Q: Is CAA covered in AQ100, or is there a good course to learn CAA inside out?
A: I believe the instructors updated the new courses to include HA 7.1 now, where they do cover CAA in
some detail. I have yet to see anything that will really make you a true expert.
Q: Is NPIV supported for TME?
A: Yes, a couple of extra steps are required to get the clients to register the traffic, i.e. an
additional virtual Ethernet adapter on VLAN 3358. TME needs to be enabled on the VIO servers.

Q: that's an IBM only url. What's the BP url?


A: I will get this from Shawn and post on the wiki
Q: please post it on the Wiki
A: --unanswered--

Q: Can you please align the presentation to fit the screen? It seems it's not showing the complete
presentation. ty
A: Hmmm. It is showing the complete slide on my screen. You should be seeing the same thing. Can you
see the page number in the lower right?
Q: can u please paste the wiki link again ..ty
A: https://www.ibm.com/developerworks/wikis/display/WikiPtype/AIX+Virtual+User+Group+-+USA
Q: thanks
A: --unanswered--
Q: nope
A: you might try resizing. Mike and I both see the full slide. No other people have said anything. You
can download them from the wiki if you can't get it to work.
Q: On which disk do we need to keep the heartbeat?
A: For which version would be my question. On 6.1 and earlier you specify a disk to be used for disk
heartbeating. It would become part of a point-to-point network. In the 7.1 releases the disk now
required for the CAA cluster repository is used for heartbeating, and the older diskhb functions have
been discontinued. Heartbeating over the HBAs now takes place on the 4Gb and 8Gb FC adapters,
including NPIV.
Q: When is end-of-life for PowerHA 5.4.1?
A: Sept 2011, not sure what the extended support ($) options are for it offhand
Q: did I understand this correctly, that on the SAN network now in 7.1 the cluster nodes can
communicate directly FC adapter to FC adapter?
A: Yes, via the supported adapters we can leverage TME (the target mode setting). Note that it is disabled
by default. Zoning may also need to be set up properly depending on how your environments are set up.
Q: Would you be so kind to check the pdf on the site, it's missing 2 or 3 links... Could not write down the
links from the webinar window...
A: Joe will get it updated

Q: If we have dual VIO servers, why do we need an HA environment?
A: That only provides a more resilient infrastructure. HA would cover you for a crash. Or if you are
monitoring the application from within HA you could also have it send you notifications or initiate
takeover actions. The key things here would be a) to automate fallover and b) to expedite the fallover
process and help meet SLAs.

Q: What should be the ideal timeout value for a PowerPath device on a system with an HA cluster?
A: Not sure that I would have a recommendation offhand for this. A follow-up at the end of the session
or an email might be advised.
Q: Why not use dual SAN arrays?
A: For what piece? We have a lot of customers already doing cross-site LVM mirroring and VDISK
mirroring behind the scenes. I believe that EMC also published a white paper from their testing and
qualification support of the VPLEX environment. So definitely.
Q: Is there still a framework of cluster daemons? Does PowerHA leverage scripts for operation, or
binaries?
A: I suspect you mean with the 7.X releases. The daemons are pretty close to the same with a few
additions. Also the newest 7.X releases no longer use the topology services daemon. However, RSCT is
still used heavily to orchestrate things. As far as the scripts it uses both, there are various binaries along
with various ksh scripts within the product. You can also code in your own custom events and have
them invoked as pre or post event scripts to the various operations within the SW
Q: In HACMP you used to be able to stop all cluster services without impacting the VGs, network
adapters, etc. Can you still do that with PowerHA?
A: Yes, you still have the UNMANAGE option, which was the follow-on to the FORCED stop option in the
cluster stop options. Using that option would leave the resources online but stop the cluster monitoring
of the resources.
Q: The RG Options span multiple nodes in the cluster, correct?
A: The RG options are specific to the nodes defined in the node list. So yes, that could be some or all
nodes in the cluster
Q: How does Multi Channel Health Mgmt account for performance problems with a node. Failover just
moves the problem. right?
A: Do you mean for multipathing? Depending on the timeout values set and the triggers from our side
that is potentially what would happen. Selective Fallover on VG loss could potentially invoke an RG
move and if the problem persisted on the target side we could potentially see the problem move
Q: What describes a "complete" failure? Do you have any additional logic to avoid split brain as a result
of application performance problems?
A: Yes, the low level heartbeating in the 7.1 release makes it pretty difficult to become partitioned. The
product will now heartbeat across all interfaces, not just the ones in the HA networks. The SAN based
communication and the heartbeating over the repository disk make it even harder to become
partitioned. Note that if you only use a portion of the heartbeating interfaces, selective fallover behavior
will still work. The nice thing now is that the low level heartbeating at the kernel level is not susceptible
to topservices becoming starved for resources like in previous releases. Hence if the box is thrashing
you could still communicate with the other boxes. There are also new disk fencing enhancements to
further secure the ECM VGs in the 7.1 releases.
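The reason multiple heartbeat paths make a partition unlikely can be shown with a toy Python model. It
assumes three independent channels (IP multicast, SAN communication, repository disk) and treats a peer
as unreachable only when every channel is silent. The channel names, the state labels, and the function
are hypothetical and do not reflect the actual CAA implementation.

```python
# Heartbeat paths mentioned in the answer above (labels are illustrative).
CHANNELS = ("ip_multicast", "san", "repository_disk")


def peer_state(heartbeats_seen):
    """Classify a peer from a mapping of channel name -> heartbeat received.

    Channels missing from the mapping are treated as silent.
    """
    seen = [c for c in CHANNELS if heartbeats_seen.get(c, False)]
    if len(seen) == len(CHANNELS):
        return "up"
    if seen:
        return "degraded"  # still reachable over at least one path
    return "unreachable"   # every path is silent


# Losing the IP network alone does not isolate the peer in this model.
print(peer_state({"ip_multicast": False, "san": True, "repository_disk": True}))
```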
Q: Is there a POWERHA cluster test tool that can be run in a windows env that would allow creation of
testable cluster(s) and possibly the cfg files that can be ported to AIX platform?
A: hmm. Today no. If you are interested submit feedback to hafeedbk@us.ibm.com

Q: Are there any locking issues for Cluster repository?


A: I have not personally seen any. In the initial CAA rev the VG is varied on all the cluster members.
Each node in a two node cluster would have its own filesystem mounted. They have since changed that
in the new CAA updates (by removing the filesystems in the CAA VG), but access would still be allowed
from the multiple cluster members