Fixes for a PowerHA Issue

April 10, 2018

I received this information from Chris Gibson a few weeks ago. If you use PowerHA, I recommend checking to see if you're on affected levels of AIX:

High Impact / Highly Pervasive APAR
IJ02843 - PowerHA node halt during ip changes

“USERS AFFECTED:
    Systems running PowerHA System Mirror on the
    AIX 7100-05 Technology Level or
    AIX 7200-02 Technology Level with
    rsct.basic.rte at 3.2.3.0.

PROBLEM DESCRIPTION:
    An improvement in obtaining adapter state information from
    AHAFS event responses introduced some errors in handling
    internal tracking of monitored IP addresses.

    This can result in a core dump of the hagsd process any time
    an IP change occurs at the OS layer. This means the failure
    cannot happen while a cluster is running stable with no
    changes occurring, but it is a risk during startup, shutdown,
    or a failover scenario, and cannot be predicted beyond that.

Problem summary
    A flaw in handling of monitored IP changes during some
    adapter state improvements in RSCT 3.2.3.0 has led to the
    risk of a hagsd core dump in a couple code paths.

Problem conclusion
    Transition of IP lists during a monitoring change has been
    corrected.”


Here's some additional information:

In a PowerHA cluster, if an IP address changes on one of the AIX nodes, the hagsd process may core dump and cause that node to reboot unexpectedly. This can happen when a Service IP is configured during normal PowerHA startup, shutdown, or failover, or during any other operation that results in an IP change.

Affected AIX Levels and Recommended Fixes
Minimum Affected Level                Maximum Affected Level                Fixing Level   Interim Fix
7100-05-00 (rsct.basic.rte 3.2.3.0)   7100-05-02 (rsct.basic.rte 3.2.3.0)   7100-05-03     IJ02843 iFix
7200-02-00 (rsct.basic.rte 3.2.3.0)   7200-02-02 (rsct.basic.rte 3.2.3.0)   7200-02-03     IJ02843 iFix

Note: PowerHA must be stopped on the node before the iFix is applied.
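
If you have a lot of nodes to check, the affected levels listed above can be expressed as a small lookup. What follows is only a rough sketch in Python, not anything supplied by IBM. It assumes you have already collected each node's service pack level (in the form printed by `oslevel -s`, for example 7100-05-02-1810) and its rsct.basic.rte fileset level (as reported by `lslpp`). The function names and the data layout are my own, and the sketch treats only an exact rsct.basic.rte level of 3.2.3.0 as affected, which is how I read the APAR text. It does not detect whether the iFix is already installed.

```python
import re

# Assumption: only this exact rsct.basic.rte level is affected, per the APAR text.
AFFECTED_RSCT = "3.2.3.0"

# Technology level -> (lowest affected SP, highest affected SP, fixing SP),
# transcribed from the "Affected AIX Levels and Recommended Fixes" list.
AFFECTED_RANGES = {
    "7100-05": (0, 2, 3),
    "7200-02": (0, 2, 3),
}


def parse_oslevel(oslevel):
    """Split '7100-05-02-1810' (or '7100-05-02') into ('7100-05', 2)."""
    match = re.match(r"^(\d{4}-\d{2})-(\d{2})(?:-\d{4})?$", oslevel.strip())
    if not match:
        raise ValueError(f"unrecognized oslevel string: {oslevel!r}")
    return match.group(1), int(match.group(2))


def is_affected(oslevel, rsct_basic_level):
    """Return True if the level pair falls inside the affected range.

    Hypothetical helper: it does not check for an installed iFix.
    """
    tl, sp = parse_oslevel(oslevel)
    if tl not in AFFECTED_RANGES:
        return False
    min_sp, max_sp, _fix_sp = AFFECTED_RANGES[tl]
    in_sp_range = min_sp <= sp <= max_sp
    return in_sp_range and rsct_basic_level.strip() == AFFECTED_RSCT


if __name__ == "__main__":
    # Example level strings, made up for illustration.
    for level, rsct in [("7100-05-02-1810", "3.2.3.0"),
                        ("7200-02-03-1832", "3.2.3.0")]:
        print(level, rsct, "affected" if is_affected(level, rsct) else "not affected")
```

A node that shows up as affected still needs a manual check to see whether the IJ02843 iFix has been applied.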

