Conduct an End-of-Year AIX Systems Health Check
AIX expert Jaqui Lynch explains how and why your organization should conduct an annual review.
By Jaqui Lynch, 12/09/2019
A good healthcare strategy for your systems is to review hardware life expectancy, firmware levels, OS levels, Java, SSH and SSL levels, HMC levels and so on. You then need to have a strategy for updating these in a timely manner. This also applies to applications such as PowerHA and Spectrum Scale.
This is where IBM’s Fix Central, FLRT (fix level recommendation tool) and Entitled Software websites come in very useful. You can use these to determine current recommended levels, recommended additional patches and to download the recommended fixes.
Step 1: Inventory Using HMCScanner
The first step is to take an inventory of the equipment and what’s running on it. If you use hardware management consoles (HMCs), the quickest way to get much of this information is to run the HMCScanner tool against each of them. I do this before and after every change I make to a server. HMCScanner also works with IVM, FSM (Flex manager) and SDMC. From the output you’ll get a list of all the servers that the HMC can see, including model and serial number, the server firmware levels, the HMC software level and the OS level for the LPARs and VIO servers. The latest level for HMCScanner is v42.
HMCScanner is a Java-based tool that connects to your HMC and documents everything that the HMC can see. It’s easy to install and can be run from Windows or AIX. On Windows I change into the directory and type in:
hmcScanner.bat hmcname hscroot -p password
Substitute your HMC name for hmcname, and use the correct username and password.
The subsequent spreadsheet fully documents whatever the HMC can see, including virtual ethernets and the SEA (shared ethernet adapter).
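Running the tool against each HMC is easy to script. The following is a minimal sketch, assuming Python is available on the workstation; it only builds the same argument list shown above for each HMC and prints it as a dry run. The HMC names hmc1 and hmc2 and the function name build_scanner_commands are hypothetical.

```python
from typing import List


def build_scanner_commands(hmcs: List[str], user: str, password: str,
                           script: str = "hmcScanner.bat") -> List[List[str]]:
    """Return one argument list per HMC: script hmcname user -p password."""
    return [[script, hmc, user, "-p", password] for hmc in hmcs]


if __name__ == "__main__":
    # Hypothetical HMC names; substitute your own, plus real credentials.
    for cmd in build_scanner_commands(["hmc1", "hmc2"], "hscroot", "password"):
        # Dry run: print the command line instead of executing it.
        print(" ".join(cmd))
```

To actually run the scans, each argument list could be passed to subprocess.run(cmd, check=True) from the HMCScanner directory.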
Once the HMCScanner report is run, you can then review firmware and maintenance levels and check for withdrawal dates, which will then allow you to prepare a plan for updates and a budget plan for replacements. The next step is to determine what’s currently out of service or about to be.
Step 2: Use FLRT to Check for Updates
FLRT (fix level recommendation tool) is used to determine whether the levels you are running are still supported, and what the recommended upgrades are. It provides guidance for software and firmware maintenance. The most recent version provides you with the release date for the versions you are running and their end of service date (if announced). It also provides the same dates for the recommended update levels and provides links to the readme files for the recommended updates. FLRT can be used to determine updates for the HMC, VIO servers, Linux, AIX, IBM i, PowerHA, PowerVC, PowerVP and Spectrum Scale. It will also show if a server will no longer receive firmware updates; this is an indication that the server has been withdrawn and is now out of service.
The FLRT home page provides multiple tools under icons such as report tools, data tables, scripting tools and APAR tools.
Step 3: Plan to Replace Out of Service Hardware
Each server, HMC and other piece of hardware has four dates associated with it: the announcement date, the general availability date, the withdrawal from marketing date and the service discontinued date. As of March 31, 2019, service was discontinued for all POWER6 servers, and as of September 30, 2019, for most POWER7 servers. The last few POWER7 servers will have their service discontinued effective December 31, 2020. IBM has also started to withdraw some of the POWER8 models. This means that you’ll no longer be able to get service as of the withdrawal date unless you sign an extended service contract, which is typically very expensive.
If you’re using any of these servers, now is a good time to start budgeting to migrate to POWER9 to get to a fully supported, better performing system with less expensive support. It’s important that you build into that plan any minimum operating system or VIO levels that are required. These are well documented in the associated Redbook for the server and the announcement letter.
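The dates above can be kept in a small lookup so that a planning script flags what is already out of service or will be within a chosen window. This is a sketch only: the group labels are illustrative rather than IBM machine types, the function name is hypothetical, and the dates are the ones quoted in this article, which should be confirmed per model with IBM.

```python
from datetime import date, timedelta
from typing import Dict, List

# Service-discontinued dates as quoted in this article. The labels are
# illustrative groupings, not IBM machine types; confirm each model with IBM.
SERVICE_DISCONTINUED: Dict[str, date] = {
    "POWER6": date(2019, 3, 31),
    "POWER7 (most models)": date(2019, 9, 30),
    "POWER7 (last models)": date(2020, 12, 31),
}


def out_of_service(today: date, horizon_days: int = 0) -> List[str]:
    """List groups whose service ends on or before today + horizon_days."""
    cutoff = today + timedelta(days=horizon_days)
    return [name for name, end in SERVICE_DISCONTINUED.items() if end <= cutoff]


if __name__ == "__main__":
    # Anything ending service within the next year belongs in this budget cycle.
    print(out_of_service(date.today(), horizon_days=365))
```

Passing a horizon of a year or more is one way to surface hardware that needs to be in the next budget cycle, not just hardware that has already lapsed.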
I also recommend requesting an installation planning meeting with whoever you plan to purchase from. In this meeting you’ll get information to plan for power and cooling requirements, as well as any network and SAN requirements. I see issues with many installations where this planning was skipped, and the result is a challenging installation. You’ll need to know how many ports you want for the network and SAN, whether the network ports are copper or fiber, whether you have enough switch ports and licenses, and so on. You’ll also need to confirm whether the current power distribution units (PDUs) have enough available amps to support the new hardware and whether they have the correct C13 or C19 connectivity. It’s much better to find this out well in advance so you can plan for an electrician and any other additional purchases prior to bringing in any new hardware.
Plan to Upgrade Firmware
On any servers that you plan to keep around, this is the time to plan the next year’s firmware updates. Firmware updates should include the HMC, server firmware and adapter firmware. Effective August 31, 2019, all HMC software prior to v9 is out of service. The x86 HMC only requires an update to the HMC software itself. The new POWER HMC requires the software update, but it also has a BMC (baseboard management controller) that requires updating. There’s a BMC and a PNOR update and they go together. You’ll find those updates in Fix Central: Select POWER, Power hardware management console, 7063-cr1. The HMC software is found under Power System management consoles, and then you pick the 7042 or 7063 for the software.
As of November 25, 2019 the latest HMC software is:
HMC v9r1m940 (MH01837)
BMC 3.13 and PNOR 3.08
lshmc -b shows the updated BMC and PNOR as:
The 20190708 is the release date of the code that shows on Fix Central
Read the description or readme before applying these updates to confirm there are no prerequisites.
Once the HMC is up to date, you should look at server firmware and adapter firmware. Unfortunately, most server firmware updates include either deferred or disruptive updates. It’s wise to plan for a total power off of the box every 6 months if possible, but at least once a year. The updates can be found on Fix Central along with the readme/description file.
Additionally, if you’re planning any updates and have older HMCs, then it’s time to consider replacing them with POWER HMCs. IBM has said that v9.1 is the last HMC version that will support x86 HMCs and the minimum x86 it supports is a 7042-CR7. The POWER HMC is fully supported. V9 requires the use of enhanced mode instead of classic mode in the GUI and it does not support any server prior to POWER7. This may be another reason to plan to migrate those older servers.
Update the VIO Servers and Adapters
I build my adapter firmware updates into my VIO server maintenance window because the VIO server usually owns all the adapters. Since I’ll be rebooting the VIO server after I update it, I’ll update the adapters at that time. It’s really important to keep adapters up to date, and this is often overlooked. The high-speed adapters, in particular, have regular mandatory updates, and these impact reliability and performance.
All of the above updates provide an update history in the description/readme file or in a separate firmware history file linked from the readme.
I keep my VIO servers as current as possible. As of September 30, 2019 all VIO servers prior to v2.2.6 have been withdrawn from support. Effective September 30, 2020 all v2.2.6 VIO servers will be withdrawn from support—so now is the time to plan to migrate to v3.1.
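These withdrawal rules are simple enough to encode for an inventory report. The sketch below assumes dotted VIO server levels such as those reported by ioslevel. It treats v3.x as supported only because this article gives no withdrawal date for it; the function names are hypothetical.

```python
from datetime import date
from typing import Tuple


def parse_level(level: str) -> Tuple[int, ...]:
    """Turn a dotted level such as '2.2.6.41' into a tuple of integers."""
    return tuple(int(part) for part in level.strip().split("."))


def vios_supported(level: str, today: date) -> bool:
    """Apply the withdrawal dates quoted in this article to a VIO level."""
    version = parse_level(level)
    if version < (2, 2, 6):
        # Everything prior to v2.2.6 was withdrawn as of September 30, 2019.
        return today < date(2019, 9, 30)
    if version[:3] == (2, 2, 6):
        # v2.2.6 is withdrawn effective September 30, 2020.
        return today < date(2020, 9, 30)
    # Assumption: no withdrawal date is given here for newer levels.
    return True
```

Checking each level against a future date, such as the end of the coming year, shows which VIO servers need a migration slot in next year’s plan.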
As of November 15, 2019, the latest level for the VIO servers is 184.108.40.206. If your VIO servers are redundant and correctly configured, then you should be able to upgrade one VIO and then, once all the paths are back, upgrade the second VIO without taking an outage. This depends on your network and SAN being set up correctly to be redundant. If you’re not running dual VIO servers, then any VIO update will be disruptive as it will require a reboot.
Additionally, you’ll want to run FLRTVC on the VIO (and any AIX LPARS you update later) to check for any required patches. I always update Java, SSH and SSL during maintenance updates as they usually have security holes until they are patched. VIO 3.1.1 does not require Java5 or Java6 so I remove those.
FLRTVC (FLRT vulnerability checker) is a script that you install on the VIO or AIX LPAR to be checked. It uses wget or curl to try to download a file called apar.csv from IBM and then checks known issues against your software levels. The most common things it finds are back levels of SSH, SSL and Java. If your server is unable to download the file, you can download it yourself from the site. Then just edit the script and change SKIPDOWNLOAD=0 to SKIPDOWNLOAD=1. It will then read the local file.
I then run it as follows:
The .txt file can be downloaded and opened in Excel. That file identifies the efixes and ifixes that need to go on, provides links to the readmes, and also provides links to the actual download where possible.
Update Your OSes
Once again, you would go to Fix Central to obtain the latest levels and patches. If you’re migrating to new hardware, you may need to upgrade prior to the migration in order to reach the minimum required level. New base levels (e.g., 7.2.0) are obtained from the IBM Entitled Software site, while technology levels and service packs are obtained from Fix Central.
The first system I update is my NIM server. As of December 2019 the latest AIX level is AIX 7200-04-01-1939. My NIM server is now running that level. You should include adapter firmware updates here as the NIM server usually is standalone and owns its own adapters. You should update NIM and create the new SPOT and LPP source prior to upgrading any other systems including the VIO servers. NIM has to be at the highest level.
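Because NIM has to be at the highest level, it can be worth checking the planned levels before scheduling the LPAR updates. The following is a rough sketch that compares level strings in the format shown above (as reported by oslevel -s); the function names and the LPAR names in the usage are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_oslevel(level: str) -> Tuple[int, int, int, int]:
    """Split a level such as '7200-04-01-1939' into release, TL, SP and build."""
    release, tl, sp, build = level.strip().split("-")
    return int(release), int(tl), int(sp), int(build)


def clients_newer_than_nim(nim_level: str,
                           client_levels: Dict[str, str]) -> List[str]:
    """Return the clients whose level is higher than the NIM server's level."""
    nim = parse_oslevel(nim_level)
    return sorted(name for name, level in client_levels.items()
                  if parse_oslevel(level) > nim)


if __name__ == "__main__":
    # Hypothetical LPAR names and target levels for the coming updates.
    planned = {"lpar1": "7200-04-01-1939", "lpar2": "7200-04-01-1939"}
    print(clients_newer_than_nim("7200-04-01-1939", planned))
```

An empty list means the NIM server is at or above every client level in the plan; any name returned is a system that should wait until NIM has been updated.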
Finally, you can update your LPARs. For the VIO, NIM and AIX LPAR updates you should run FLRTVC when you are finished to ensure you have installed all necessary security patches. You should always assume there will be Java, SSH and SSL updates beyond what is in the IBM fixpack.
Java, SSH and SSL
I have mentioned Java, SSH and SSL updates in several places. To get the Java patches you go to Fix Central, then click on find product and type in Java in the product selector prompt. Then select Runtimes for Java Technology. After that it will give you choices regarding version, including 32-bit or 64-bit, which must be downloaded separately. These updates are installed with either install (AIX) or updateios (VIO server).
To get the latest SSH and SSL patches you will need to go to this website (which requires an IBM login). You select either OpenSSH or OpenSSL and then continue and it takes you to the latest downloads.
The end of the year is a great time to take a step back and do some serious planning. This is the time to evaluate your systems and to determine which ones you will keep and which ones you will upgrade to newer hardware. It is also the time that you should be planning updates to firmware, HMCs, VIO servers and other operating systems. Having a well-structured annual refresh plan is critical to having a healthy infrastructure to run your environment, and it will lead to better-performing, more reliable systems. The last couple of months of the year are a great time to do this, as that is normally when many companies have their annual change freeze.
Jaqui Lynch has over 38 years of experience working with projects and OSes across vendor platforms, including IBM Z, UNIX systems and more.